Sample records for rule mining technique

  1. CARIBIAM: constrained Association Rules using Interactive Biological IncrementAl Mining.

    PubMed

    Rahal, Imad; Rahhal, Riad; Wang, Baoying; Perrizo, William

    2008-01-01

    This paper analyses annotated genome data by applying a very central data-mining technique known as Association Rule Mining (ARM) with the aim of discovering rules and hypotheses capable of yielding deeper insights into this type of data. In the literature, ARM has been noted for producing an overwhelming number of rules. This work proposes a new technique capable of using domain knowledge in the form of queries in order to efficiently mine only the subset of the associations that are of interest to investigators in an incremental and interactive manner.

  2. Analysis of Occupational Accidents in Underground and Surface Mining in Spain Using Data-Mining Techniques.

    PubMed

    Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M; Anticoi, Hernán Francisco; Guash, Eduard

    2018-03-07

    An analysis of occupational accidents in the mining sector was conducted using data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data were processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining their characteristics and context. This study presents the 20 most important association rules in the sector (either surface or underground mining), based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident differ between the two scenarios. Data-mining techniques were chosen as a useful tool to find the root cause of the accidents.

  3. Analysis of Occupational Accidents in Underground and Surface Mining in Spain Using Data-Mining Techniques

    PubMed Central

    Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M.; Anticoi, Hernán Francisco; Guash, Eduard

    2018-01-01

    An analysis of occupational accidents in the mining sector was conducted using data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data were processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining their characteristics and context. This study presents the 20 most important association rules in the sector (either surface or underground mining), based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident differ between the two scenarios. Data-mining techniques were chosen as a useful tool to find the root cause of the accidents. PMID:29518921

  4. A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules

    PubMed Central

    Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos

    2015-01-01

    Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods. PMID:25938136

  5. A Bayesian Scoring Technique for Mining Predictive and Non-Spurious Rules.

    PubMed

    Batal, Iyad; Cooper, Gregory; Hauskrecht, Milos

    Rule mining is an important class of data mining methods for discovering interesting patterns in data. The success of a rule mining method heavily depends on the evaluation function that is used to assess the quality of the rules. In this work, we propose a new rule evaluation score - the Predictive and Non-Spurious Rules (PNSR) score. This score relies on Bayesian inference to evaluate the quality of the rules and considers the structure of the rules to filter out spurious rules. We present an efficient algorithm for finding rules with high PNSR scores. The experiments demonstrate that our method is able to cover and explain the data with a much smaller rule set than existing methods.

  6. Data mining and visualization techniques

    DOEpatents

    Wong, Pak Chung [Richland, WA]; Whitney, Paul [Richland, WA]; Thomas, Jim [Richland, WA]

    2004-03-23

    Disclosed are association rule identification and visualization methods, systems, and apparatus. An association rule in data mining is an implication of the form X → Y, where X is a set of antecedent items and Y is the consequent item. A unique visualization technique that provides multiple antecedent, consequent, confidence, and support information is disclosed to facilitate better presentation of large quantities of complex association rules.
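
The support and confidence quantities mentioned in this record can be illustrated with a minimal Python sketch. It uses the standard textbook definitions (support as the fraction of transactions containing an itemset, confidence as support of X ∪ Y divided by support of X); the function names `support` and `confidence` and the toy `TRANSACTIONS` data are illustrative assumptions and are not taken from the patent.

```python
from typing import FrozenSet, Iterable, List

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """Confidence of the rule X -> Y: support(X union Y) / support(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0  # the rule never fires, so report zero confidence
    return support(xy, transactions) / support_x


if __name__ == "__main__":
    print("support({bread, milk}) =", support({"bread", "milk"}, TRANSACTIONS))
    print("confidence(bread -> milk) =", confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

In this toy data, two of the four transactions contain both bread and milk, and three contain bread, so the rule bread → milk has support 2/4 and confidence 2/3.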

  7. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    PubMed Central

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets using an integrated approach of statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate the insignificant, low-significance, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data are then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts from the same post-discretized data matrix. Finally, we have also included an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on gene expression level. PMID:25830807

  8. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    PubMed

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets using an integrated approach of statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate the insignificant, low-significance, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data are then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules is then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts from the same post-discretized data matrix. Finally, we have also included an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on gene expression level.

  9. Inferring Intra-Community Microbial Interaction Patterns from Metagenomic Datasets Using Associative Rule Mining Techniques

    PubMed Central

    Mande, Sharmila S.

    2016-01-01

    The nature of inter-microbial metabolic interactions defines the stability of microbial communities residing in any ecological niche. Deciphering these interaction patterns is crucial for understanding the mode/mechanism(s) through which an individual microbial community transitions from one state to another (e.g., from a healthy to a diseased state). Statistical correlation techniques have traditionally been employed for mining microbial interaction patterns from taxonomic abundance data corresponding to a given microbial community. In spite of their efficiency, these correlation techniques can capture only 'pair-wise interactions'. Moreover, their emphasis on statistical significance can potentially result in missing several interactions that are relevant from a biological standpoint. This study explores the applicability of one of the earliest association rule mining algorithms, i.e., the 'Apriori algorithm', for deriving 'microbial association rules' from the taxonomic profile of a given microbial community. The classical Apriori approach derives association rules by analysing patterns of co-occurrence/co-exclusion between various '(subsets of) features/items' across various samples. Using real-world microbiome data, the efficiency/utility of this rule mining approach in deciphering multiple (biologically meaningful) association patterns between 'subsets/subgroups' of microbes (constituting microbiome samples) is demonstrated. As an example, association rules derived from publicly available gut microbiome datasets indicate an association between a group of microbes (Faecalibacterium, Dorea, and Blautia) that are known to have mutualistic metabolic associations among themselves. Application of the rule mining approach on gut microbiomes (sourced from the Human Microbiome Project) further indicated similar microbial association patterns in gut microbiomes irrespective of the gender of the subjects. A Linux implementation of the Association Rule Mining (ARM) software (customised for deriving 'microbial association rules' from microbiome data) is freely available for download from the following link: http://metagenomics.atc.tcs.com/arm. PMID:27124399

  10. Inferring Intra-Community Microbial Interaction Patterns from Metagenomic Datasets Using Associative Rule Mining Techniques.

    PubMed

    Tandon, Disha; Haque, Mohammed Monzoorul; Mande, Sharmila S

    2016-01-01

    The nature of inter-microbial metabolic interactions defines the stability of microbial communities residing in any ecological niche. Deciphering these interaction patterns is crucial for understanding the mode/mechanism(s) through which an individual microbial community transitions from one state to another (e.g., from a healthy to a diseased state). Statistical correlation techniques have traditionally been employed for mining microbial interaction patterns from taxonomic abundance data corresponding to a given microbial community. In spite of their efficiency, these correlation techniques can capture only 'pair-wise interactions'. Moreover, their emphasis on statistical significance can potentially result in missing several interactions that are relevant from a biological standpoint. This study explores the applicability of one of the earliest association rule mining algorithms, i.e., the 'Apriori algorithm', for deriving 'microbial association rules' from the taxonomic profile of a given microbial community. The classical Apriori approach derives association rules by analysing patterns of co-occurrence/co-exclusion between various '(subsets of) features/items' across various samples. Using real-world microbiome data, the efficiency/utility of this rule mining approach in deciphering multiple (biologically meaningful) association patterns between 'subsets/subgroups' of microbes (constituting microbiome samples) is demonstrated. As an example, association rules derived from publicly available gut microbiome datasets indicate an association between a group of microbes (Faecalibacterium, Dorea, and Blautia) that are known to have mutualistic metabolic associations among themselves. Application of the rule mining approach on gut microbiomes (sourced from the Human Microbiome Project) further indicated similar microbial association patterns in gut microbiomes irrespective of the gender of the subjects. A Linux implementation of the Association Rule Mining (ARM) software (customised for deriving 'microbial association rules' from microbiome data) is freely available for download from the following link: http://metagenomics.atc.tcs.com/arm.

  11. RANWAR: rank-based weighted association rule mining from gene expression and methylation data.

    PubMed

    Mallik, Saurav; Mukhopadhyay, Anirban; Maulik, Ujjwal

    2015-01-01

    Ranking of association rules is currently an interesting topic in data mining and bioinformatics. The huge number of rules over items (or genes) evolved by association rule mining (ARM) algorithms confuses the decision maker. In this article, we propose a weighted rule-mining technique (RANWAR, or rank-based weighted association rule mining) that ranks the rules using two novel rule-interestingness measures, viz., rank-based weighted condensed support (wcs) and weighted condensed confidence (wcc), to bypass the problem. These measures depend on the rank of the items (genes). Using the rank, we assign a weight to each item. RANWAR generates far fewer frequent itemsets than state-of-the-art association rule mining algorithms and thus saves execution time. We run RANWAR on gene expression and methylation datasets. The genes of the top rules are biologically validated by Gene Ontology (GO) and KEGG pathway analyses. Many top-ranked rules extracted by RANWAR that hold poor ranks in traditional Apriori are highly biologically significant to the related diseases. Finally, the top rules evolved by RANWAR that are not in Apriori are reported.

  12. Data Mining for Financial Applications

    NASA Astrophysics Data System (ADS)

    Kovalerchuk, Boris; Vityaev, Evgenii

    This chapter describes Data Mining in finance by discussing financial tasks, specifics of methodologies and techniques in this Data Mining area. It includes time dependence, data selection, forecast horizon, measures of success, quality of patterns, hypothesis evaluation, problem ID, method profile, attribute-based and relational methodologies. The second part of the chapter discusses Data Mining models and practice in finance. It covers use of neural networks in portfolio management, design of interpretable trading rules and discovering money laundering schemes using decision rules and relational Data Mining methodology.

  13. Using an improved association rules mining optimization algorithm in web-based mobile-learning system

    NASA Astrophysics Data System (ADS)

    Huang, Yin; Chen, Jianhua; Xiong, Shaojun

    2009-07-01

    Mobile-Learning (M-learning) gives many learners the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rule explosion is a serious problem that causes great concern, as conventional mining algorithms often produce too many rules for decision makers to digest. Since a Web-based Mobile-Learning System collects vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners, and so on. Therefore, this paper focuses on a new data-mining algorithm, combining the advantages of the genetic algorithm and the simulated annealing algorithm, called ARGSA (Association rules based on an improved Genetic Simulated Annealing Algorithm), to mine the association rules. This paper first takes advantage of the Parallel Genetic Algorithm and Simulated Annealing Algorithm designed specifically for discovering association rules. Moreover, analysis and experiments are presented to show that the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.

  14. Analysis of mesenchymal stem cell differentiation in vitro using classification association rule mining.

    PubMed

    Wang, Weiqi; Wang, Yanbo Justin; Bañares-Alcántara, René; Coenen, Frans; Cui, Zhanfeng

    2009-12-01

    In this paper, data mining is used to analyze the data on the differentiation of mammalian Mesenchymal Stem Cells (MSCs), aiming at discovering known and hidden rules governing MSC differentiation, following the establishment of a web-based public database containing experimental data on the MSC proliferation and differentiation. To this effect, a web-based public interactive database comprising the key parameters which influence the fate and destiny of mammalian MSCs has been constructed and analyzed using Classification Association Rule Mining (CARM) as a data-mining technique. The results show that the proposed approach is technically feasible and performs well with respect to the accuracy of (classification) prediction. Key rules mined from the constructed MSC database are consistent with experimental observations, indicating the validity of the method developed and the first step in the application of data mining to the study of MSCs.

  15. The association rules search of Indonesian university graduate’s data using FP-growth algorithm

    NASA Astrophysics Data System (ADS)

    Faza, S.; Rahmat, R. F.; Nababan, E. B.; Arisandi, D.; Effendi, S.

    2018-02-01

    The variety of attributes in university graduate data has frustrated the institution's efforts to find the combinations of attributes that emerge often and have high integration between attributes. Association rule mining is a data mining technique for determining the integration of the data, or the way one data set affects another. In other words, it makes it possible to find the integration of data on a large scale. The Frequent Pattern-Growth (FP-Growth) algorithm is an association rule mining technique that determines frequent itemsets using an FP-Tree. From the research on the search for association rules in university graduate data, it can be concluded that the most common attributes with high integration between them are the combination of a state-owned high school outside Medan, the regular university entrance exam, a GPA of 3.00 to 3.49, and a study duration of over 4 years.

  16. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    PubMed

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  17. A fuzzy hill-climbing algorithm for the development of a compact associative classifier

    NASA Astrophysics Data System (ADS)

    Mitra, Soumyaroop; Lam, Sarah S.

    2012-02-01

    Classification, a data mining technique, has widespread applications including medical diagnosis, targeted marketing, and others. Knowledge discovery from databases in the form of association rules is one of the important data mining tasks. An integrated approach, classification based on association rules, has drawn the attention of the data mining community over the last decade. While attention has mainly focused on increasing classifier accuracy, not much effort has been devoted to building interpretable and less complex models. This paper discusses the development of a compact associative classification model using a hill-climbing approach and fuzzy sets. The proposed methodology builds the rule base by selecting rules that contribute towards increasing training accuracy, thus balancing classification accuracy with the number of classification association rules. The results indicated that the proposed associative classification model can achieve competitive accuracies on benchmark datasets with continuous attributes and lend better interpretability when compared with other rule-based systems.

  18. Rule Mining Techniques to Predict Prokaryotic Metabolic Pathways.

    PubMed

    Saidi, Rabie; Boudellioua, Imane; Martin, Maria J; Solovyev, Victor

    2017-01-01

    It is becoming more evident that computational methods are needed for the identification and the mapping of pathways in new genomes. We introduce an automatic annotation system (ARBA4Path: Association Rule-Based Annotator for Pathways) that utilizes rule mining techniques to predict metabolic pathways across a wide range of prokaryotes. It was demonstrated that specific combinations of protein domains (recorded in our rules) strongly determine the pathways in which proteins are involved and thus provide information that lets us very accurately assign pathway membership (with precision of 0.999 and recall of 0.966) to proteins of a given prokaryotic taxon. Our system can be used to enhance the quality of automatically generated annotations as well as to annotate proteins with unknown function. The prediction models are represented in the form of human-readable rules, and they can be used effectively to add absent pathway information to many proteins in the UniProtKB/TrEMBL database.

  19. Temporal data mining for the quality assessment of hemodialysis services.

    PubMed

    Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto

    2005-05-01

    This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for the assessment of the clinical performance of hemodialysis (HD) services, on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge on the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. Such methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters and the measured variables were examined by the domain experts, who were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians to improve their understanding of the patients' behavior.

  20. The Weather Forecast Using Data Mining Research Based on Cloud Computing.

    NASA Astrophysics Data System (ADS)

    Wang, ZhanJie; Mazharul Mujib, A. B. M.

    2017-10-01

    Weather forecasting has been an important application in meteorology and one of the most scientifically and technologically challenging problems around the world. In this study, we have analyzed the use of data mining techniques in forecasting weather. This paper proposes a modern method for developing a service-oriented architecture for weather information systems that forecast weather using these data mining techniques. This can be carried out by using Artificial Neural Network and Decision Tree algorithms and meteorological data collected over a specific time. The algorithm presented the best results in generating classification rules for the mean weather variables. The results showed that these data mining techniques can be sufficient for weather forecasting.

  1. Predicting missing values in a home care database using an adaptive uncertainty rule method.

    PubMed

    Konias, S; Gogou, G; Bamidis, P D; Vlahavas, I; Maglaveras, N

    2005-01-01

    Contemporary literature illustrates an abundance of adaptive algorithms for mining association rules. However, most literature is unable to deal with the peculiarities, such as missing values and dynamic data creation, that are frequently encountered in fields like medicine. This paper proposes an uncertainty rule method that uses an adaptive threshold for filling missing values in newly added records. A new approach for mining uncertainty rules and filling missing values is proposed, which is in turn particularly suitable for dynamic databases, like the ones used in home care systems. In this study, a new data mining method named FiMV (Filling Missing Values) is illustrated based on the mined uncertainty rules. Uncertainty rules have quite a similar structure to association rules and are extracted by an algorithm proposed in previous work, namely AURG (Adaptive Uncertainty Rule Generation). The main target was to implement an appropriate method for recovering missing values in a dynamic database, where new records are continuously added, without needing to specify any kind of thresholds beforehand. The method was applied to a home care monitoring system database. Randomly, multiple missing values for each record's attributes (rate 5-20% by 5% increments) were introduced in the initial dataset. FiMV demonstrated 100% completion rates with over 90% success in each case, while usual approaches, where all records with missing values are ignored or thresholds are required, experienced significantly reduced completion and success rates. It is concluded that the proposed method is appropriate for the data-cleaning step of the Knowledge Discovery process in databases. The latter, containing much significance for the output efficiency of any data mining technique, can improve the quality of the mined information.

  2. A primer to frequent itemset mining for bioinformatics

    PubMed Central

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart

    2015-01-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
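
The shopping-basket example in this record can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration, not code from the primer: it uses a simple Apriori-style level-wise search (one of several possible algorithms), and the function name `frequent_itemsets`, the `BASKETS` data, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

# Hypothetical supermarket baskets, for illustration only.
BASKETS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk", "butter"}),
    frozenset({"bread", "milk"}),
]


def frequent_itemsets(
    transactions: List[FrozenSet[str]], min_support: float
) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least `min_support`.

    Apriori-style search: frequent k-itemsets are joined into (k+1)-item
    candidates, and a candidate is kept only if all of its k-item subsets
    are themselves frequent (the downward-closure property).
    """
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result: Dict[FrozenSet[str], float] = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        level = {}
        for cand in candidates:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        result.update(level)
        k += 1
        # Join step: unions of two frequent itemsets that have exactly k items.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune step: drop candidates that have an infrequent (k-1)-subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return result


if __name__ == "__main__":
    for itemset, sup in sorted(
        frequent_itemsets(BASKETS, 0.5).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))
    ):
        print(sorted(itemset), round(sup, 2))
```

With these five baskets and a minimum support of 0.5, the frequent itemsets are the three single items plus the pair {bread, milk}; the other pairs appear in only two of five baskets and are discarded.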

  3. Rules of meridians and acupoints selection in treatment of Parkinson's disease based on data mining techniques.

    PubMed

    Li, Zhe; Hu, Ying-Yu; Zheng, Chun-Ye; Su, Qiao-Zhen; An, Chang; Luo, Xiao-Dong; Liu, Mao-Cai

    2018-01-15

    To help select appropriate meridians and acupoints in clinical practice and experimental study for Parkinson's disease (PD), the rules of meridian and acupoint selection in acupuncture and moxibustion were analyzed in domestic and foreign clinical treatment for PD based on data mining techniques. Literature about PD treated by acupuncture and moxibustion in China and abroad was searched and selected from China National Knowledge Infrastructure and MEDLINE. Then the data from all eligible articles were extracted to establish the database of acupuncture-moxibustion for PD. The association rules of data mining techniques were used to analyze the rules of meridian and acupoint selection. In total, 168 eligible articles were included and 184 acupoints were applied. The total frequency of acupoint application was 1,090 times. Those acupoints were mainly distributed in the head and neck and the extremities. Among all, Taichong (LR 3), Baihui (DU 20), Fengchi (GB 20), Hegu (LI 4) and Chorea-tremor Controlled Zone were the top five acupoints that had been used. Superior-inferior acupoint matching was utilized the most. As to involved meridians, Du Meridian, Dan (Gallbladder) Meridian, Dachang (Large Intestine) Meridian, and Gan (Liver) Meridian were the most popular meridians. The application of meridians and acupoints for PD treatment lays emphasis on the acupoints on the head, attaches importance to extinguishing Gan wind, tonifying qi and blood, and nourishing sinews, and makes good use of superior-inferior acupoint matching.

  4. Application of text mining for customer evaluations in commercial banking

    NASA Astrophysics Data System (ADS)

    Tan, Jing; Du, Xiaojiang; Hao, Pengpeng; Wang, Yanbo J.

    2015-07-01

    Customer attrition is an increasingly serious problem in commercial banks. To combat this problem comprehensively, mining customer evaluation texts is as important as mining structured customer data. In order to extract hidden information from customer evaluations, Textual Feature Selection, Classification and Association Rule Mining are necessary techniques. This paper presents all three techniques by using Chinese Word Segmentation, C5.0 and Apriori, and a set of experiments was run on a collection of real textual data that includes 823 customer evaluations taken from a Chinese commercial bank. Results, consequent solutions, and some advice for the commercial bank are given in this paper.

  5. Data Mining Methods for Recommender Systems

    NASA Astrophysics Data System (ADS)

    Amatriain, Xavier; Jaimes, Alejandro; Oliver, Nuria; Pujol, Josep M.

    In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.

  6. Promoter Sequences Prediction Using Relational Association Rule Mining

    PubMed Central

    Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely

    2012-01-01

    In this paper we are approaching, from a computational perspective, the problem of promoter sequences prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still developed to approach the problem of promoter identification in the DNA. We are proposing a classification model based on relational association rules mining. Relational association rules are a particular type of association rules and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting if a DNA sequence contains or not a promoter region. An experimental evaluation of the proposed model and comparison with similar existing approaches is provided. The obtained results show that our classifier overperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233

  7. Identifying Engineering Students' English Sentence Reading Comprehension Errors: Applying a Data Mining Technique

    ERIC Educational Resources Information Center

    Tsai, Yea-Ru; Ouyang, Chen-Sen; Chang, Yukon

    2016-01-01

    The purpose of this study is to propose a diagnostic approach to identify engineering students' English reading comprehension errors. Student data were collected during the process of reading texts of English for science and technology on a web-based cumulative sentence analysis system. For the analysis, the association-rule, data mining technique…

  8. Rule-based statistical data mining agents for an e-commerce application

    NASA Astrophysics Data System (ADS)

    Qin, Yi; Zhang, Yan-Qing; King, K. N.; Sunderraman, Rajshekhar

    2003-03-01

    Intelligent data mining techniques have useful e-Business applications. Because an e-Commerce application is related to multiple domains such as statistical analysis, market competition, price comparison, profit improvement and personal preferences, this paper presents a hybrid knowledge-based e-Commerce system fusing intelligent techniques, statistical data mining, and personal information to enhance QoS (Quality of Service) of e-Commerce. A Web-based e-Commerce application software system, eDVD Web Shopping Center, is successfully implemented using Java servlets and an Oracle8i database server. Simulation results have shown that the hybrid intelligent e-Commerce system is able to make smart decisions for different customers.

  9. An application of data mining in district heating substations for improving energy performance

    NASA Astrophysics Data System (ADS)

    Xue, Puning; Zhou, Zhigang; Chen, Xin; Liu, Jing

    2017-11-01

    Automatic meter reading system is capable of collecting and storing a huge number of district heating (DH) data. However, the data obtained are rarely fully utilized. Data mining is a promising technology to discover potential interesting knowledge from vast data. This paper applies data mining methods to analyse the massive data for improving energy performance of DH substation. The technical approach contains three steps: data selection, cluster analysis and association rule mining (ARM). Two-heating-season data of a substation are used for case study. Cluster analysis identifies six distinct heating patterns based on the primary heat of the substation. ARM reveals that secondary pressure difference and secondary flow rate have a strong correlation. Using the discovered rules, a fault occurring in remote flow meter installed at secondary network is detected accurately. The application demonstrates that data mining techniques can effectively extrapolate potential useful knowledge to better understand substation operation strategies and improve substation energy performance.

  10. Techniques of Acceleration for Association Rule Induction with Pseudo Artificial Life Algorithm

    NASA Astrophysics Data System (ADS)

    Kanakubo, Masaaki; Hagiwara, Masafumi

    Frequent pattern mining is one of the important problems in data mining. Generally, the number of potential rules grows rapidly as the size of the database increases. It is therefore hard for a user to extract the association rules. To avoid this difficulty, we propose a new method for association rule induction with a pseudo artificial life approach. The proposed method decides whether there exists an item set which contains N or more items in two transactions. If it exists, a series of item sets which are contained in the part of transactions will be recorded. The iteration of this step contributes to the extraction of association rules. It is not necessary to calculate the huge number of candidate rules. In the evaluation test, we compared the association rules extracted using our method with the rules extracted using other algorithms such as the Apriori algorithm. In the evaluation using huge retail market basket data, our method is approximately 10 and 20 times faster than the Apriori algorithm and many of its variants.

  11. Mining Student Data Captured from a Web-Based Tutoring Tool: Initial Exploration and Results

    ERIC Educational Resources Information Center

    Merceron, Agathe; Yacef, Kalina

    2004-01-01

    In this article we describe the initial investigations that we have conducted on student data collected from a web-based tutoring tool. We have used some data mining techniques such as association rule and symbolic data analysis, as well as traditional SQL queries to gain further insight on the students' learning and deduce information to improve…

  12. The Application of Data Mining Techniques to Create Promotion Strategy for Mobile Phone Shop

    NASA Astrophysics Data System (ADS)

    Khasanah, A. U.; Wibowo, K. S.; Dewantoro, H. F.

    2017-12-01

    The number of mobile phone shops is growing very fast in various regions of Indonesia, including Yogyakarta, due to the increasing demand for mobile phones. This leads to high competition among mobile phone shops. Under these conditions, a mobile phone shop, especially a small one, needs a good promotion strategy in order to survive the competition. To create an attractive promotion strategy, companies and shops should know their customer segmentation and the buying patterns of their target market. This kind of analysis can be done using data mining techniques. This study aims to segment customers using Agglomerative Hierarchical Clustering and to identify customer buying patterns using Association Rule Mining. The research was conducted in a mobile phone shop in Sleman, Yogyakarta. The clustering result shows that the biggest customer segment of the shop was male university students who come on weekends, and from association rule mining it can be concluded that tempered glass and smart phone “x”, as well as action camera and waterproof monopod and power bank, have strong relationships. These results are used to create promotion strategies, which are presented at the end of the study.

  13. Data Mining and Privacy of Social Network Sites' Users: Implications of the Data Mining Problem.

    PubMed

    Al-Saggaf, Yeslam; Islam, Md Zahidul

    2015-08-01

    This paper explores the potential of data mining as a technique that could be used by malicious data miners to threaten the privacy of social network sites (SNS) users. It applies a data mining algorithm to a real dataset to provide empirically-based evidence of the ease with which characteristics about the SNS users can be discovered and used in a way that could invade their privacy. One major contribution of this article is the use of the decision forest data mining algorithm (SysFor) to the context of SNS, which does not only build a decision tree but rather a forest allowing the exploration of more logic rules from a dataset. One logic rule that SysFor built in this study, for example, revealed that anyone having a profile picture showing just the face or a picture showing a family is less likely to be lonely. Another contribution of this article is the discussion of the implications of the data mining problem for governments, businesses, developers and the SNS users themselves.

  14. Safety rules and regulations on mine sites - the problem and a solution.

    PubMed

    Laurence, David

    2005-01-01

    Many accidents and incidents on mine sites have a causal factor in the rules and regulations that supposedly are in place to prevent the incident from occurring. The causes involve a lack of awareness or understanding, ignorance, or deliberate violations. The issue of mine rules, procedures, and regulations is a central focus of this paper, highlighted by this recent comment - "very few people have accidents for which there is no procedure in place..." An attitudinal survey was conducted at 33 mines throughout NSW, Queensland and international mine sites involving almost 500 mineworkers. The survey was in the form of a self-completing questionnaire, consisting of approximately 65 questions. It aimed to seek the opinions of the mining workforce on safety rules and regulations generally, as well as how they apply to their specific jobs on a mine site. The research also aimed to investigate: (a) the level of awareness and understanding of mine rules and procedures such as manager's rules and safe work procedures (SWPs); (b) the level of awareness and understanding of mine safety regulations and legislation; (c) the extent of communication of and commitment to rules and regulations; (d) the extent of compliance with rules and regulations; and (e) attitudes regarding errors, risk-taking, and accidents and their interaction with rules and regulations. The sample consisted of a random selection of underground and open pit mines, extracting coal, metals, or industrial minerals. The insights provided by the mineworkers enabled a set of principles to be developed to guide mine management and regulators in the development of more effective rules and regulations. CONCLUSIONS AND IMPACT ON THE MINING INDUSTRY: (a) Management and regulators should not continue to produce more and more rules and regulations to cover every aspect of mining. 
(b) Detailed prescriptive regulations, detailed safe work procedures, and voluminous safety management plans will not "connect" with a miner. (c) Achieving more effective rules and regulations is not the only answer to a safer workplace.

  15. Analysis of North Atlantic tropical cyclone intensify change using data mining

    NASA Astrophysics Data System (ADS)

    Tang, Jiang

    Tropical cyclones (TC), especially when their intensity reaches hurricane scale, can become a costly natural hazard. Accurate prediction of tropical cyclone intensity is very difficult because of inadequate observations on TC structures, poor understanding of physical processes, coarse model resolution and inaccurate initial conditions, etc. This study aims to tackle two factors that account for the underperformance of current TC intensity forecasts: (1) inadequate observations of TC structures, and (2) deficient understanding of the underlying physical processes governing TC intensification. To tackle the problem of inadequate observations of TC structures, efforts have been made to extract vertical and horizontal structural parameters of latent heat release from Tropical Rainfall Measuring Mission (TRMM) Precipitation Radar (PR) data products. A case study of Hurricane Isabel (2003) was conducted first to explore the feasibility of using the 3D TC structure information in predicting TC intensification. Afterwards, several structural parameters were extracted from 53 TRMM PR 2A25 observations on 25 North Atlantic TCs during the period of 1998 to 2003. A new generation of multi-correlation data mining algorithm (Apriori and its variations) was applied to find roles of the latent heat release structure in TC intensification. The results showed that the buildup of TC energy is indicated by the height of the convective tower, and the relative low latent heat release at the core area and around the outer band. Adverse conditions which prevent TC intensification include the following: (1) TC entering a higher latitude area where the underlying sea is relative cold, (2) TC moving too fast to absorb the thermal energy from the underlying sea, or (3) strong energy loss at the outer band. When adverse conditions and amicable conditions reached equilibrium status, tropical cyclone intensity would remain stable. 
The dataset from Statistical Hurricane Intensity Prediction Scheme (SHIPS) covering the period of 1982-2003 and the Apriori-based association rule mining algorithm were used to study the associations of underlying geophysical characteristics with the intensity change of tropical cyclones. The data have been stratified into 6 TC categories from tropical depression to category 4 hurricanes based on their strength. The result showed that the persistence of intensity change in the past and the strength of vertical shear in the environment are the most prevalent factors for all of the 6 TC categories. Hyper-edge searching had found 3 sets of parameters which showed strong intramural binds. Most of the parameters used in SHIPS model have a consistent "I-W" relation over different TC categories, indicating a consistent function of those parameters in TC development. However, the "I-W" relations of the relative momentum flux and the meridional motion change from tropical storm stage to hurricane stage, indicating a change in the role of those two parameters in TC development. Because rapid intensification (RI) is a major source of errors when predicting hurricane intensity, the association rule mining algorithm was performed on RI versus non-RI tropical cyclone cases using the same SHIPS dataset. The results had been compared with those from the traditional statistical analysis conducted by Kaplan and DeMaria (2003). The rapid intensification rule with 5 RI conditions proposed by the traditional statistical analysis was found by the association rule mining in this study as well. However, further analysis showed that the 5 RI conditions can be replaced by another association rule using fewer conditions but with a higher RI probability (RIP). This means that the rule with all 5 constraints found by Kaplan and DeMaria is not optimal, and the association rule mining technique can find a rule with fewer constraints yet fits more RI cases. 
The further analysis with the highest RIPs over different numbers of conditions has demonstrated that the interactions among multiple factors are responsible for the RI process of TCs. However, the influence of factors saturates at certain numbers. This study has shown successful data mining examples in studying tropical cyclone intensification using association rules. The higher RI probability with fewer conditions found by association rule technique is significant. This work demonstrated that data mining techniques can be used as an efficient exploration method to generate hypotheses, and that statistical analysis should be performed to confirm the hypotheses, as is generally expected for data mining applications.

  16. Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data.

    PubMed

    Peng, Mingkai; Sundararajan, Vijaya; Williamson, Tyler; Minty, Evan P; Smith, Tony C; Doktorchik, Chelsea T A; Quan, Hude

    2018-03-01

    Data quality assessment is a challenging facet for research using coded administrative health data. Current assessment approaches are time and resource intensive. We explored whether association rule mining (ARM) can be used to develop rules for assessing data quality. We extracted 2013 and 2014 records from the hospital discharge abstract database (DAD) for patients between the ages of 55 and 65 from five acute care hospitals in Alberta, Canada. The ARM was conducted using the 2013 DAD to extract rules with support ≥0.0019 and confidence ≥0.5 using the bootstrap technique, and tested in the 2014 DAD. The rules were compared against the method of coding frequency and assessed for their ability to detect error introduced by two kinds of data manipulation: random permutation and random deletion. The association rules generally had clear clinical meanings. Comparing 2014 data to 2013 data (both original), there were 3 rules with a confidence difference >0.1, while coding frequency difference of codes in the right hand of rules was less than 0.004. After random permutation of 50% of codes in the 2014 data, average rule confidence dropped from 0.72 to 0.27 while coding frequency remained unchanged. Rule confidence decreased with the increase of coding deletion, as expected. Rule confidence was more sensitive to code deletion compared to coding frequency, with slope of change ranging from 1.7 to 184.9 with a median of 9.1. The ARM is a promising technique to assess data quality. It offers a systematic way to derive coding association rules hidden in data, and potentially provides a sensitive and efficient method of assessing data quality compared to standard methods. Copyright © 2018 Elsevier Inc. All rights reserved.
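
    The comparison described above rests on two standard rule measures: support (the share of records containing both sides of a rule) and confidence (the share of antecedent-containing records that also contain the consequent). The sketch below is a hedged illustration of computing them for one rule in a reference year and a test year and flagging a confidence difference above 0.1. The code labels "A", "B", "C", the toy records, and the helper name `rule_metrics` are hypothetical; the study's bootstrap procedure is not reproduced.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(records)
    # Records containing the antecedent, and those containing both sides.
    n_a = sum(1 for r in records if antecedent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    return support, confidence

# Hypothetical coded records: each record is the set of codes on one abstract.
reference_year = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"B"}]
test_year = [{"A", "C"}, {"A", "B"}, {"A"}, {"B"}]

ref_support, ref_conf = rule_metrics(reference_year, {"A"}, {"B"})
new_support, new_conf = rule_metrics(test_year, {"A"}, {"B"})

# Flag the rule when its confidence differs by more than 0.1 between years
# (0.1 is the difference level quoted in the abstract).
flagged = abs(ref_conf - new_conf) > 0.1
```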

  17. DTFP-Growth: Dynamic Threshold-Based FP-Growth Rule Mining Algorithm Through Integrating Gene Expression, Methylation, and Protein-Protein Interaction Profiles.

    PubMed

    Mallik, Saurav; Bhadra, Tapas; Mukherji, Ayan

    2018-04-01

    Association rule mining is an important technique for identifying interesting relationships between gene pairs in a biological data set. Earlier methods basically work for a single biological data set, and, in most cases, a single minimum support cutoff can be applied globally, i.e., across all genesets/itemsets. To overcome this limitation, in this paper, we propose a dynamic threshold-based FP-growth rule mining algorithm that integrates gene expression, methylation and protein-protein interaction profiles based on weighted shortest distance to find novel associations among different pairs of genes in multi-view data sets. For this purpose, we introduce three new thresholds, namely, Distance-based Variable/Dynamic Supports (DVS), Distance-based Variable Confidences (DVC), and Distance-based Variable Lifts (DVL) for each rule by integrating the co-expression, co-methylation, and protein-protein interactions existing in the multi-omics data set. We develop the proposed algorithm utilizing these three novel multiple threshold measures. In the proposed algorithm, the values of DVS, DVC, and DVL are computed for each rule separately, and subsequently it is verified whether the support, confidence, and lift of each evolved rule are greater than or equal to the corresponding individual DVS, DVC, and DVL values, respectively. If all three conditions hold for a rule, the rule is treated as a resultant rule. One of the major advantages of the proposed method compared with other related state-of-the-art methods is that it considers both the quantitative and interactive significance among all pairwise genes belonging to each rule. Moreover, the proposed method generates fewer rules, takes less running time, and provides greater biological significance for the resultant top-ranking rules compared to previous methods.
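
    The acceptance test in this abstract (each rule's support, confidence, and lift compared against that rule's own DVS, DVC, and DVL thresholds) can be sketched as follows. This is only an illustration of the final filtering step: the abstract does not say how the distance-based thresholds are derived, so they are passed in as given numbers, and the gene names, scores, and the names `RuleScores` and `is_resultant` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class RuleScores:
    support: float
    confidence: float
    lift: float

def is_resultant(scores, dvs, dvc, dvl):
    """Keep a rule only if all three measures meet its own per-rule thresholds."""
    return (scores.support >= dvs
            and scores.confidence >= dvc
            and scores.lift >= dvl)

# Hypothetical rules, each paired with hypothetical (DVS, DVC, DVL) thresholds.
candidates = {
    "geneA -> geneB": (RuleScores(0.30, 0.80, 1.6), (0.25, 0.70, 1.2)),
    "geneC -> geneD": (RuleScores(0.30, 0.80, 1.6), (0.35, 0.70, 1.2)),
}
resultant = [name for name, (scores, (dvs, dvc, dvl)) in candidates.items()
             if is_resultant(scores, dvs, dvc, dvl)]
```

    The two hypothetical rules have identical scores, but only the first survives because the second has a stricter support threshold, which is the point of using per-rule rather than global cutoffs.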

  18. [The method and application to construct experience recommendation platform of acupuncture ancient books based on data mining technology].

    PubMed

    Chen, Chuyun; Hong, Jiaming; Zhou, Weilin; Lin, Guohua; Wang, Zhengfei; Zhang, Qufei; Lu, Cuina; Lu, Lihong

    2017-07-12

    To construct a knowledge platform of acupuncture ancient books based on data mining technology, and to provide retrieval service for users. The Oracle 10 g database was applied and JAVA was selected as development language; based on the standard library and ancient books database established by manual entry, a variety of data mining technologies, including word segmentation, speech tagging, dependency analysis, rule extraction, similarity calculation, ambiguity analysis, supervised classification technology were applied to achieve text automatic extraction of ancient books; in the last, through association mining and decision analysis, the comprehensive and intelligent analysis of disease and symptom, meridians, acupoints, rules of acupuncture and moxibustion in acupuncture ancient books were realized, and retrieval service was provided for users through structure of browser/server (B/S). The platform realized full-text retrieval, word frequency analysis and association analysis; when diseases or acupoints were searched, the frequencies of meridian, acupoints (diseases) and techniques were presented from high to low, meanwhile the support degree and confidence coefficient between disease and acupoints (special acupoint), acupoints and acupoints in prescription, disease or acupoints and technique were presented. The experience platform of acupuncture ancient books based on data mining technology could be used as a reference for selection of disease, meridian and acupoint in clinical treatment and education of acupuncture and moxibustion.

  19. Effect of Temporal Relationships in Associative Rule Mining for Web Log Data

    PubMed Central

    Mohd Khairudin, Nazli; Mustapha, Aida

    2014-01-01

    The advent of web-based applications and services has created such diverse and voluminous web log data stored in web servers, proxy servers, client machines, or organizational databases. This paper attempts to investigate the effect of temporal attribute in relational rule mining for web log data. We incorporated the characteristics of time in the rule mining process and analysed the effect of various temporal parameters. The rules generated from temporal relational rule mining are then compared against the rules generated from the classical rule mining approach such as the Apriori and FP-Growth algorithms. The results showed that by incorporating the temporal attribute via time, the number of rules generated is subsequently smaller but is comparable in terms of quality. PMID:24587757

  20. A comprehensive review on privacy preserving data mining.

    PubMed

    Aldeen, Yousra Abdul Alsahib S; Salleh, Mazleena; Razzaque, Mohammad Abdur

    2015-01-01

    Preservation of privacy in data mining has emerged as an absolute prerequisite for exchanging confidential information in terms of data analysis, validation, and publishing. Ever-escalating internet phishing has posed a severe threat to the widespread propagation of sensitive information over the web. Conversely, the doubts and contentions of various information providers about how reliably data are protected from disclosure often result in outright refusal to share data or in the sharing of incorrect information. This article provides a panoramic overview of new perspectives and a systematic interpretation of a list of published literature via its meticulous organization into subcategories. The fundamental notions of the existing privacy preserving data mining methods, their merits, and shortcomings are presented. The current privacy preserving data mining techniques are classified based on distortion, association rule, hide association rule, taxonomy, clustering, associative classification, outsourced data mining, distributed, and k-anonymity, where their notable advantages and disadvantages are emphasized. This careful scrutiny reveals the past development, present research challenges, future trends, and the gaps and weaknesses. Further significant enhancements for more robust privacy protection and preservation are affirmed to be mandatory.

  1. A Swarm Optimization approach for clinical knowledge mining.

    PubMed

    Christopher, J Jabez; Nehemiah, H Khanna; Kannan, A

    2015-10-01

    Rule-based classification is a typical data mining task that is being used in several medical diagnosis and decision support systems. The rules stored in the rule base have an impact on classification efficiency. Rule sets that are extracted with data mining tools and techniques are optimized using heuristic or meta-heuristic approaches in order to improve the quality of the rule base. In this work, a meta-heuristic approach called Wind-driven Swarm Optimization (WSO) is used. The uniqueness of this work lies in the biological inspiration that underlies the algorithm. WSO uses Jval, a new metric, to evaluate the efficiency of a rule-based classifier. Rules are extracted from decision trees. WSO is used to obtain different permutations and combinations of rules whereby the optimal ruleset that satisfies the requirement of the developer is used for predicting the test data. The performance of various extensions of decision trees, namely, RIPPER, PART, FURIA and Decision Tables, is analyzed. The efficiency of WSO is also compared with the traditional Particle Swarm Optimization (PSO). Experiments were carried out with six benchmark medical datasets. The traditional C4.5 algorithm yields 62.89% accuracy with 43 rules for the liver disorders dataset, whereas WSO yields 64.60% with 19 rules. For the Heart disease dataset, C4.5 is 68.64% accurate with 98 rules, whereas WSO is 77.8% accurate with 34 rules. The normalized standard deviations for accuracy of PSO and WSO are 0.5921 and 0.5846 respectively. WSO provides accurate and concise rulesets. PSO yields results similar to those of WSO, but the novelty of WSO lies in its biological motivation and its customization for rule base optimization. The trade-off between the prediction accuracy and the size of the rule base is optimized during the design and development of a rule-based clinical decision support system. The efficiency of a decision support system relies on the content of the rule base and classification accuracy. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  2. On Interestingness Measures for Mining Statistically Significant and Novel Clinical Associations from EMRs

    PubMed Central

    Abar, Orhan; Charnigo, Richard J.; Rayapati, Abner

    2017-01-01

    Association rule mining has received significant attention from both the data mining and machine learning communities. While data mining researchers focus more on designing efficient algorithms to mine rules from large datasets, the learning community has explored applications of rule mining to classification. A major problem with rule mining algorithms is the explosion of rules even for moderate sized datasets making it very difficult for end users to identify both statistically significant and potentially novel rules that could lead to interesting new insights and hypotheses. Researchers have proposed many domain independent interestingness measures using which, one can rank the rules and potentially glean useful rules from the top ranked ones. However, these measures have not been fully explored for rule mining in clinical datasets owing to the relatively large sizes of the datasets often encountered in healthcare and also due to limited access to domain experts for review/analysis. In this paper, using an electronic medical record (EMR) dataset of diagnoses and medications from over three million patient visits to the University of Kentucky medical center and affiliated clinics, we conduct a thorough evaluation of dozens of interestingness measures proposed in data mining literature, including some new composite measures. Using cumulative relevance metrics from information retrieval, we compare these interestingness measures against human judgments obtained from a practicing psychiatrist for association rules involving the depressive disorders class as the consequent. Our results not only surface new interesting associations for depressive disorders but also indicate classes of interestingness measures that weight rule novelty and statistical strength in contrasting ways, offering new insights for end users in identifying interesting rules. PMID:28736771

  3. 76 FR 63238 - Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-12

    ... Detection Systems for Continuous Mining Machines in Underground Coal Mines AGENCY: Mine Safety and Health... Agency's proposed rule addressing Proximity Detection Systems for Continuous Mining Machines in... proposed rule for Proximity Detection Systems on Continuous Mining Machines in Underground Coal Mines. Due...

  4. Integrated mined-area reclamation and land-use planning. Volume 3C. A case study of surface mining and reclamation planning: Georgia Kaolin Company Clay Mines, Washington County, Georgia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guernsey, J L; Brown, L A; Perry, A O

    1978-02-01

    This case study examines the reclamation practices of the Georgia Kaolin's American Industrial Clay Company Division, a kaolin producer centered in Twiggs, Washington, and Wilkinson Counties, Georgia. The State of Georgia accounts for more than one-fourth of the world's kaolin production and about three-fourths of U.S. kaolin output. The mining of kaolin in Georgia illustrates the effects of mining and reclaiming lands disturbed by area surface mining. The disturbed areas are reclaimed under the rules and regulations of the Georgia Surface Mining Act of 1968. The natural conditions influencing the reclamation methodologies and techniques are markedly unique from those of other mining operations. The environmental disturbances and procedures used in reclaiming the kaolin mined lands are reviewed and implications for planners are noted.

  5. Personalized Privacy-Preserving Frequent Itemset Mining Using Randomized Response

    PubMed Central

    Sun, Chongjing; Fu, Yan; Zhou, Junlin; Gao, Hui

    2014-01-01

    Frequent itemset mining is the important first step of association rule mining, which discovers interesting patterns from the massive data. There are increasing concerns about the privacy problem in the frequent itemset mining. Some works have been proposed to handle this kind of problem. In this paper, we introduce a personalized privacy problem, in which different attributes may need different privacy levels protection. To solve this problem, we give a personalized privacy-preserving method by using the randomized response technique. By providing different privacy levels for different attributes, this method can get a higher accuracy on frequent itemset mining than the traditional method providing the same privacy level. Finally, our experimental results show that our method can have better results on the frequent itemset mining while preserving personalized privacy. PMID:25143989
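
    As a rough illustration of the randomized response idea mentioned in the abstract above, the following Python sketch perturbs each binary attribute with its own keep probability and then corrects the observed frequency to estimate the true support of a single item. It uses the classical bit-flipping form of randomized response, which is an assumption and not necessarily the scheme used by the authors; the names `randomize`, `estimate_support`, and `keep_prob` are hypothetical.

```python
import random


def randomize(transactions, keep_prob, rng):
    """Perturb binary transactions (hypothetical helper).

    Each bit in column j is kept with probability keep_prob[j]
    and flipped otherwise, so columns can have different privacy levels.
    """
    return [
        [bit if rng.random() < keep_prob[j] else 1 - bit
         for j, bit in enumerate(row)]
        for row in transactions
    ]


def estimate_support(randomized, j, keep_prob):
    """Estimate the true support of item j from perturbed data.

    With keep probability p and true support s, the expected observed
    frequency is p*s + (1-p)*(1-s); solving for s gives the estimator.
    Requires p != 0.5.
    """
    p = keep_prob[j]
    observed = sum(row[j] for row in randomized) / len(randomized)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy data: item 0 has true support 0.3, item 1 has true support 0.5.
    data = [[1 if i % 10 < 3 else 0, i % 2] for i in range(20000)]
    keep_prob = [0.9, 0.6]  # item 1 gets stronger privacy than item 0
    noisy = randomize(data, keep_prob, rng)
    print(estimate_support(noisy, 0, keep_prob))
    print(estimate_support(noisy, 1, keep_prob))
```

    A keep probability closer to 0.5 gives stronger privacy for that attribute but a noisier estimate, which is the trade-off that a per-attribute privacy level controls in this sketch.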

  6. Personalized privacy-preserving frequent itemset mining using randomized response.

    PubMed

    Sun, Chongjing; Fu, Yan; Zhou, Junlin; Gao, Hui

    2014-01-01

    Frequent itemset mining is the important first step of association rule mining, which discovers interesting patterns from the massive data. There are increasing concerns about the privacy problem in the frequent itemset mining. Some works have been proposed to handle this kind of problem. In this paper, we introduce a personalized privacy problem, in which different attributes may need different privacy levels protection. To solve this problem, we give a personalized privacy-preserving method by using the randomized response technique. By providing different privacy levels for different attributes, this method can get a higher accuracy on frequent itemset mining than the traditional method providing the same privacy level. Finally, our experimental results show that our method can have better results on the frequent itemset mining while preserving personalized privacy.

  7. Visualization of usability and functionality of a professional website through web-mining.

    PubMed

    Jones, Josette F; Mahoui, Malika; Gopa, Venkata Devi Pragna

    2007-10-11

    Functional interface design requires understanding of the information system structure and the user. Web logs record user interactions with the interface, and thus provide some insight into user search behavior and efficiency of the search process. The present study uses a data-mining approach with techniques such as association rules, clustering and classification, to visualize the usability and functionality of a digital library through in depth analyses of web logs.

  8. Boosting association rule mining in large datasets via Gibbs sampling.

    PubMed

    Qian, Guoqi; Rao, Calyampudi Radhakrishna; Sun, Xiaoying; Wu, Yuehua

    2016-05-03

    Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred transaction items, if no action is taken to constrain the search space. In this paper, we develop a Gibbs-sampling-induced stochastic search procedure to randomly sample association rules from the itemset space, and perform rule mining from the reduced transaction dataset generated by the sample. Also a general rule importance measure is proposed to direct the stochastic search so that, as a result of the randomly generated association rules constituting an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. In the simulation study and a real genomic data example, we show how to boost association rule mining by an integrated use of the stochastic search and the Apriori algorithm.

  9. Quantifying Associations between Environmental and Social Stressors

    EPA Science Inventory

    Introduction: Association rule mining (ARM) has been widely used to identify associations between various entities in many fields. Although some studies have utilized it to analyze the relationship between chemicals and human effects, fewer have used this technique to identify an...

  10. 26 CFR 1.611-2 - Rules applicable to mines, oil and gas wells, and other natural deposits.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 26 Internal Revenue 7 2013-04-01 2013-04-01 false Rules applicable to mines, oil and gas wells....611-2 Rules applicable to mines, oil and gas wells, and other natural deposits. (a) Computation of cost depletion of mines, oil and gas wells, and other natural deposits. (1) The basis upon which cost...

  11. 26 CFR 1.611-2 - Rules applicable to mines, oil and gas wells, and other natural deposits.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 26 Internal Revenue 7 2012-04-01 2012-04-01 false Rules applicable to mines, oil and gas wells....611-2 Rules applicable to mines, oil and gas wells, and other natural deposits. (a) Computation of cost depletion of mines, oil and gas wells, and other natural deposits. (1) The basis upon which cost...

  12. Evaluation of the mining techniques in constructing a traditional Chinese-language nursing recording system.

    PubMed

    Liao, Pei-Hung; Chu, William; Chu, Woei-Chyn

    2014-05-01

    In 2009, the Department of Health, part of Taiwan's Executive Yuan, announced the advent of electronic medical records to reduce medical expenses and facilitate the international exchange of medical record information. An information technology platform for nursing records in medical institutions was then quickly established, which improved nursing information systems and electronic databases. The purpose of the present study was to explore the usability of the data mining techniques to enhance completeness and ensure consistency of nursing records in the database system. First, the study used a Chinese word-segmenting system on common and special terms often used by the nursing staff. We also used text-mining techniques to collect keywords and create a keyword lexicon. We then used an association rule and artificial neural network to measure the correlation and forecasting capability for keywords. Finally, nursing staff members were provided with an on-screen pop-up menu to use when establishing nursing records. Our study found that by using mining techniques we were able to create a powerful keyword lexicon and establish a forecasting model for nursing diagnoses, ensuring the consistency of nursing terminology and improving the nursing staff's work efficiency and productivity.

  13. [Analysis of the characteristics of the older adults with depression using data mining decision tree analysis].

    PubMed

    Park, Myonghwa; Choi, Sora; Shin, A Mi; Koo, Chul Hoi

    2013-02-01

    The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. A large dataset from the 2008 Korean Elderly Survey was used and data of 14,970 elderly people were analyzed. Target variable was depression and 53 input variables were general characteristics, family & social relationship, economic status, health status, health behavior, functional status, leisure & social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique using SPSS Window 19.0 and Clementine 12.0 programs. The decision trees were classified into five different rules to define the characteristics of older adults with depression. Classification & Regression Tree (C&RT) showed the best prediction with an accuracy of 80.81% among data mining models. Factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation for basic or instrumental daily activities, number of chronic diseases and daily activity difficulty due to disease. The different rules classified by the decision tree model in this study should contribute as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.

  14. A Collaborative Educational Association Rule Mining Tool

    ERIC Educational Resources Information Center

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; de Castro, Carlos

    2011-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the ongoing improvement of e-learning courses and allowing teachers with similar course profiles to share and score the discovered information. The mining tool is oriented to be used by non-expert instructors in data mining so its internal…

  15. Exploration of the association rules mining technique for the signal detection of adverse drug events in spontaneous reporting systems.

    PubMed

    Wang, Chao; Guo, Xiao-Jing; Xu, Jin-Fang; Wu, Cheng; Sun, Ya-Lin; Ye, Xiao-Fei; Qian, Wei; Ma, Xiu-Qiang; Du, Wen-Min; He, Jia

    2012-01-01

    The detection of signals of adverse drug events (ADEs) has increased because of the use of data mining algorithms in spontaneous reporting systems (SRSs). However, different data mining algorithms have different traits and conditions for application. The objective of our study was to explore the application of association rule (AR) mining in ADE signal detection and to compare its performance with that of other algorithms. Monte Carlo simulation was applied to generate drug-ADE reports randomly according to the characteristics of SRS datasets. One thousand simulated datasets were mined by AR and other algorithms. On average, 108,337 reports were generated by the Monte Carlo simulation. Based on the predefined criterion that 10% of the drug-ADE combinations were true signals, with RR equal to 10, 4.9, 1.5, and 1.2, AR detected, on average, 284 suspected associations with a minimum support of 3 and a minimum lift of 1.2. The area under the receiver operating characteristic (ROC) curve of the AR was 0.788, which was equivalent to that shown for other algorithms. Additionally, AR was applied to reports submitted to the Shanghai SRS in 2009. Five hundred seventy combinations were detected using AR from 24,297 SRS reports, and they were compared with recognized ADEs identified by clinical experts and various other sources. AR appears to be an effective method for ADE signal detection, both in simulated and real SRS datasets. The limitations of this method exposed in our study, i.e., non-uniform threshold settings and redundant rules, require further research.
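
    The thresholding step described in the abstract above (a minimum support of 3 and a minimum lift of 1.2) can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not the study's implementation: each report is assumed to be a single (drug, event) pair, support is taken as the raw count of reports for the pair, and the function name `detect_signals` is hypothetical.

```python
from collections import Counter


def detect_signals(reports, min_support=3, min_lift=1.2):
    """Return drug-event pairs whose count and lift pass both thresholds.

    reports: list of (drug, event) tuples, one per report (an assumption).
    lift = P(drug, event) / (P(drug) * P(event)).
    """
    n = len(reports)
    pair_counts = Counter(reports)
    drug_counts = Counter(drug for drug, _ in reports)
    event_counts = Counter(event for _, event in reports)

    signals = {}
    for (drug, event), count in pair_counts.items():
        lift = count * n / (drug_counts[drug] * event_counts[event])
        if count >= min_support and lift >= min_lift:
            signals[(drug, event)] = (count, lift)
    return signals


if __name__ == "__main__":
    # Toy data with made-up drug and event labels.
    toy = ([("A", "rash")] * 4 + [("A", "nausea")]
           + [("B", "nausea")] * 4 + [("B", "rash")])
    for pair, (count, lift) in detect_signals(toy).items():
        print(pair, count, round(lift, 2))
```

    A lift above 1 means the pair is reported together more often than independence of drug and event would predict, so the lift threshold filters out pairs that are frequent only because both the drug and the event are common.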

  16. Association-rule-based tuberculosis disease diagnosis

    NASA Astrophysics Data System (ADS)

    Asha, T.; Natarajan, S.; Murthy, K. N. B.

    2010-02-01

    Tuberculosis (TB) is a disease caused by bacteria called Mycobacterium tuberculosis. It usually spreads through the air and attacks low immune bodies such as patients with Human Immunodeficiency Virus (HIV). This work focuses on finding close association rules, a promising technique in Data Mining, within TB data. The proposed method first normalizes raw data from medical records, which include categorical, nominal and continuous attributes, and then determines Association Rules from the normalized data with different support and confidence. Association rules are applied on a real data set containing medical records of patients with TB obtained from a state hospital. The rules determined describe close associations between one symptom and another; as an example, the likelihood that an occurrence of sputum is closely associated with blood cough and HIV.

  17. Associations between socio-demographic characteristics and chemical concentrations contributing to cumulative exposures in the United States

    EPA Science Inventory

    Association rule mining (ARM) has been widely used to identify associations between various entities in many fields. Although some studies have utilized it to analyze the relationship between chemicals and human health effects, fewer have used this technique to identify and quant...

  18. Eastern Adult, Continuing, and Distance Education Research Conference Proceedings (University Park, Pennsylvania, October 24-26, 1996).

    ERIC Educational Resources Information Center

    Pennsylvania State Univ., University Park. Coll. of Education.

    Includes the following among 52 papers: "Accelerated Degree Programs" (Anderson et al.); "Basic Skills in the Workplace" (Askov); "Breaking All the Rules" (Baird); "Data Mining for Factors Affecting the Implementation of Interactive, Computer-Mediated Instructional Techniques for Students at a Distance"…

  19. Mining algorithm for association rules in big data based on Hadoop

    NASA Astrophysics Data System (ADS)

    Fu, Chunhua; Wang, Xiaojing; Zhang, Lijun; Qiao, Liying

    2018-04-01

    To address the inability of traditional association rule mining algorithms to meet the efficiency and scalability demands of large data volumes, this work takes FP-Growth as an example and parallelizes it on the Hadoop framework using the MapReduce model. On this basis, the algorithm is improved with a transaction reduction method to further enhance its mining efficiency. Experiments on a Hadoop cluster verify the parallel mining results, compare the efficiency of serial and parallel execution, and examine how mining time varies with the number of nodes and with the amount of data. The experiments show that the parallelized FP-Growth implementation accurately mines frequent itemsets, with better performance and scalability. It can better meet the requirements of big data mining and efficiently mine frequent itemsets and association rules from large datasets.

  20. Long-range prediction of Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    H, Vathsala; Koolagudi, Shashidhar G.

    2017-10-01

    This paper presents a hybrid model to better predict Indian summer monsoon rainfall. The algorithm considers suitable techniques for processing dense datasets. The proposed three-step algorithm comprises closed itemset generation-based association rule mining for feature selection, cluster membership for dimensionality reduction, and simple logistic function for prediction. The application of predicting rainfall into flood, excess, normal, deficit, and drought based on 36 predictors consisting of land and ocean variables is presented. Results show good accuracy in the considered study period of 37 years (1969-2005).

  1. 76 FR 2617 - Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust Monitors

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-14

    ... 1219-AB64 Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust... comment period on the proposed rule addressing Lowering Miners' Exposure to Respirable Coal Mine Dust...), MSHA published a proposed rule, Lowering Miners' Exposure to Respirable Coal Mine Dust, Including...

  2. Occupancy schedules learning process through a data mining framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D'Oca, Simona; Hong, Tianzhen

    Building occupancy is a paramount factor in building energy simulations. Specifically, lighting, plug loads, HVAC equipment utilization, fresh air requirements and internal heat gain or loss greatly depend on the level of occupancy within a building. Developing the appropriate methodologies to describe and reproduce the intricate network responsible for human-building interactions is needed. Extrapolation of patterns from big data streams is a powerful analysis technique which will allow for a better understanding of energy usage in buildings. A three-step data mining framework is applied to discover occupancy patterns in office spaces. First, a data set of 16 offices with 10 minute interval occupancy data, over a two year period, is mined through a decision tree model which predicts the occupancy presence. Then a rule induction algorithm is used to learn a pruned set of rules on the results from the decision tree model. Finally, a cluster analysis is employed in order to obtain consistent patterns of occupancy schedules. Furthermore, the identified occupancy rules and schedules are representative as four archetypal working profiles that can be used as input to current building energy modeling programs, such as EnergyPlus or IDA-ICE, to investigate impact of occupant presence on design, operation and energy use in office buildings.

  3. Validation of an association rule mining-based method to infer associations between medications and problems.

    PubMed

    Wright, A; McCoy, A; Henkin, S; Flaherty, M; Sittig, D

    2013-01-01

    In a prior study, we developed methods for automatically identifying associations between medications and problems using association rule mining on a large clinical data warehouse and validated these methods at a single site which used a self-developed electronic health record. Our objective was to demonstrate the generalizability of these methods by validating them at an external site. We received data on medications and problems for 263,597 patients from the University of Texas Health Science Center at Houston Faculty Practice, an ambulatory practice that uses the Allscripts Enterprise commercial electronic health record product. We then conducted association rule mining to identify associated pairs of medications and problems, characterized these associations with five measures of interestingness (support, confidence, chi-square, interest and conviction), and compared the top-ranked pairs to a gold standard. 25,088 medication-problem pairs were identified that exceeded our confidence and support thresholds. An analysis of the top 500 pairs according to each measure of interestingness showed a high degree of accuracy for highly-ranked pairs. The same technique was successfully employed at the University of Texas and accuracy was comparable to our previous results. Top associations included many medications that are highly specific for a particular problem as well as a large number of common, accurate medication-problem pairs that reflect practice patterns.
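
    The five interestingness measures named in the abstract above can be computed for a single rule "medication -> problem" from four counts, as in the Python sketch below. The formulas are the standard textbook definitions and are an assumption about how the study computed them; in particular, "interest" is taken here to mean lift. The function name `rule_measures` and its parameters are hypothetical, and the sketch assumes 0 < n_a < n and 0 < n_b < n so that no expected cell count is zero.

```python
import math


def rule_measures(n_ab, n_a, n_b, n):
    """Interestingness measures for a rule A -> B (hypothetical helper).

    n_ab: patients with both A and B; n_a: patients with A;
    n_b: patients with B; n: all patients.
    """
    support = n_ab / n
    confidence = n_ab / n_a
    interest = confidence / (n_b / n)  # also known as lift
    # Conviction is unbounded when the rule never fails.
    conviction = (math.inf if confidence == 1.0
                  else (1 - n_b / n) / (1 - confidence))

    # Pearson chi-square on the 2x2 contingency table of A versus B.
    observed = [n_ab, n_a - n_ab, n_b - n_ab, n - n_a - n_b + n_ab]
    expected = [n_a * n_b / n, n_a * (n - n_b) / n,
                (n - n_a) * n_b / n, (n - n_a) * (n - n_b) / n]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    return {"support": support, "confidence": confidence,
            "interest": interest, "conviction": conviction,
            "chi_square": chi_square}


if __name__ == "__main__":
    # Made-up counts: 100 patients, 20 on the medication,
    # 25 with the problem, 15 with both.
    for name, value in rule_measures(15, 20, 25, 100).items():
        print(f"{name}: {value:.4f}")
```

    Ranking candidate pairs by each of these values in turn gives one ordering per measure, which is the kind of per-measure top-ranked list the abstract describes comparing against a gold standard.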

  4. Occupancy schedules learning process through a data mining framework

    DOE PAGES

    D'Oca, Simona; Hong, Tianzhen

    2014-12-17

    Building occupancy is a paramount factor in building energy simulations. Specifically, lighting, plug loads, HVAC equipment utilization, fresh air requirements and internal heat gain or loss greatly depend on the level of occupancy within a building. Developing the appropriate methodologies to describe and reproduce the intricate network responsible for human-building interactions is needed. Extrapolation of patterns from big data streams is a powerful analysis technique which will allow for a better understanding of energy usage in buildings. A three-step data mining framework is applied to discover occupancy patterns in office spaces. First, a data set of 16 offices with 10 minute interval occupancy data, over a two year period, is mined through a decision tree model which predicts the occupancy presence. Then a rule induction algorithm is used to learn a pruned set of rules on the results from the decision tree model. Finally, a cluster analysis is employed in order to obtain consistent patterns of occupancy schedules. Furthermore, the identified occupancy rules and schedules are representative as four archetypal working profiles that can be used as input to current building energy modeling programs, such as EnergyPlus or IDA-ICE, to investigate impact of occupant presence on design, operation and energy use in office buildings.

  5. Spatio-Temporal Pattern Mining on Trajectory Data Using Arm

    NASA Astrophysics Data System (ADS)

    Khoshahval, S.; Farnaghi, M.; Taleai, M.

    2017-09-01

    Initially, the mobile phone was regarded as a device for making human connections easier. Today, its use has evolved into a platform for gaming, web surfing and GPS-enabled applications. Embedding GPS in handheld devices has turned them into significant facilities for gathering trajectory data. Raw GPS trajectory data is a series of points that contains hidden information, and trajectory data analysis is needed to reveal it. Among the most useful information concealed in trajectory data are user activity patterns. Each pattern contains multiple stops and moves, which identify the places a user visited and the tasks performed. This paper proposes an approach to discover users' daily activity patterns from GPS trajectories using association rules. Finding user patterns requires extracting the user's visited places from the stops and moves of GPS trajectories. To locate stops and moves, we implemented a place recognition algorithm. After extraction of the visited points, an advanced association rule mining algorithm called Apriori was used to extract user activity patterns. This study shows that each trajectory contains useful patterns that can be extracted from raw GPS data using association rule mining techniques in order to learn about the behaviour of multiple users in a system, and that these patterns can be utilized in various location-based applications.
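
    For readers unfamiliar with Apriori, the following Python sketch shows the level-wise frequent itemset search that the algorithm is known for, applied to toy "visited place" transactions. It is a minimal illustration, not the paper's implementation; the function name `apriori`, the place labels, and the choice of one transaction per user-day are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets.

    Level k candidates are built by joining frequent (k-1)-itemsets and
    are pruned unless every (k-1)-subset is itself frequent.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    level = {frozenset([item]) for item in items}
    level = {s for s in level if support(s) >= min_support}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent
                   for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


if __name__ == "__main__":
    # Each transaction is the set of places one user visited in one day.
    days = [{"home", "work"}, {"home", "work", "gym"},
            {"home", "gym"}, {"work"}]
    for itemset, sup in sorted(apriori(days, 0.5).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), sup)
```

    Association rules such as "gym -> home" would then be derived from the frequent itemsets by dividing the support of the combined itemset by the support of the antecedent and keeping rules above a confidence threshold.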

  6. Application of a hybrid association rules/decision tree model for drought monitoring

    NASA Astrophysics Data System (ADS)

    Nourani, Vahid; Molajou, Amir

    2017-12-01

    Previous research has shown that incorporating oceanic-atmospheric climate phenomena such as Sea Surface Temperature (SST) into hydro-climatic models can provide important predictive information about hydro-climatic variability. In this paper, a hybrid application of two data mining techniques (decision tree and association rules) is proposed to discover the relationship between drought at the Tabriz and Kermanshah synoptic stations (located in Iran) and de-trended SSTs of the Black, Mediterranean and Red Seas. The two major steps of the proposed model were the classification of de-trended SST data with selection of the most effective groups, and the extraction of hidden information contained in the data. Decision tree techniques, which can identify the useful traits in a data set for classification purposes, were used for classification and for selecting the most effective groups, and association rules were employed to extract the hidden predictive information from the large observed data. To examine the accuracy of the rules, confidence and Heidke Skill Score (HSS) measures were calculated and compared for the different lag times considered. The computed measures confirm reliable performance of the proposed hybrid data mining method in forecasting drought, and the results show a relative correlation between the Mediterranean, Black and Red Sea de-trended SSTs and drought at the Tabriz and Kermanshah synoptic stations, such that the confidence between the monthly Standardized Precipitation Index (SPI) values and the de-trended SST of the seas is higher than 70 and 80% respectively for the Tabriz and Kermanshah synoptic stations.

  7. Discovering Sentinel Rules for Business Intelligence

    NASA Astrophysics Data System (ADS)

    Middelfart, Morten; Pedersen, Torben Bach

    This paper proposes the concept of sentinel rules for multi-dimensional data that warns users when measure data concerning the external environment changes. For instance, a surge in negative blogging about a company could trigger a sentinel rule warning that revenue will decrease within two months, so a new course of action can be taken. Hereby, we expand the window of opportunity for organizations and facilitate successful navigation even though the world behaves chaotically. Since sentinel rules are at the schema level as opposed to the data level, and operate on data changes as opposed to absolute data values, we are able to discover strong and useful sentinel rules that would otherwise be hidden when using sequential pattern mining or correlation techniques. We present a method for sentinel rule discovery and an implementation of this method that scales linearly on large data volumes.

  8. 77 FR 34894 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; withdrawal. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are announcing the withdrawal of a proposed rule...

  9. Data mining for the identification of metabolic syndrome status

    PubMed Central

    Worachartcheewan, Apilak; Schaduangrat, Nalini; Prachayasittikul, Virapong; Nantasenamat, Chanin

    2018-01-01

    Metabolic syndrome (MS) is a condition associated with metabolic abnormalities that are characterized by central obesity (e.g. waist circumference or body mass index), hypertension (e.g. systolic or diastolic blood pressure), hyperglycemia (e.g. fasting plasma glucose) and dyslipidemia (e.g. triglyceride and high-density lipoprotein cholesterol). It is also associated with the development of diabetes mellitus (DM) type 2 and cardiovascular disease (CVD). Therefore, the rapid identification of MS is required to prevent the occurrence of such diseases. Herein, we review the utilization of data mining approaches for MS identification. Furthermore, the concept of quantitative population-health relationship (QPHR) is also presented, which can be defined as the elucidation/understanding of the relationship that exists between health parameters and health status. The QPHR modeling uses data mining techniques such as artificial neural network (ANN), support vector machine (SVM), principal component analysis (PCA), decision tree (DT), random forest (RF) and association analysis (AA) for modeling and construction of predictive models for MS characterization. The DT method has been found to outperform other data mining techniques in the identification of MS status. Moreover, the AA technique has proved useful in the discovery of in-depth as well as frequently occurring health parameters that can be used for revealing the rules of MS development. This review presents the potential benefits on the applications of data mining as a rapid identification tool for classifying MS. PMID:29383020

  10. Data mining for the identification of metabolic syndrome status.

    PubMed

    Worachartcheewan, Apilak; Schaduangrat, Nalini; Prachayasittikul, Virapong; Nantasenamat, Chanin

    2018-01-01

    Metabolic syndrome (MS) is a condition associated with metabolic abnormalities that are characterized by central obesity (e.g. waist circumference or body mass index), hypertension (e.g. systolic or diastolic blood pressure), hyperglycemia (e.g. fasting plasma glucose) and dyslipidemia (e.g. triglyceride and high-density lipoprotein cholesterol). It is also associated with the development of diabetes mellitus (DM) type 2 and cardiovascular disease (CVD). Therefore, the rapid identification of MS is required to prevent the occurrence of such diseases. Herein, we review the utilization of data mining approaches for MS identification. Furthermore, the concept of quantitative population-health relationship (QPHR) is also presented, which can be defined as the elucidation/understanding of the relationship that exists between health parameters and health status. The QPHR modeling uses data mining techniques such as artificial neural network (ANN), support vector machine (SVM), principal component analysis (PCA), decision tree (DT), random forest (RF) and association analysis (AA) for modeling and construction of predictive models for MS characterization. The DT method has been found to outperform other data mining techniques in the identification of MS status. Moreover, the AA technique has proved useful in the discovery of in-depth as well as frequently occurring health parameters that can be used for revealing the rules of MS development. This review presents the potential benefits on the applications of data mining as a rapid identification tool for classifying MS.

  11. Target-Based Maintenance of Privacy Preserving Association Rules

    ERIC Educational Resources Information Center

    Ahluwalia, Madhu V.

    2011-01-01

    In the context of association rule mining, the state-of-the-art in privacy preserving data mining provides solutions for categorical and Boolean association rules but not for quantitative association rules. This research fills this gap by describing a method based on discrete wavelet transform (DWT) to protect input data privacy while preserving…

  12. A novel approach for incremental uncertainty rule generation from databases with missing values handling: application to dynamic medical databases.

    PubMed

    Konias, Sokratis; Chouvarda, Ioanna; Vlahavas, Ioannis; Maglaveras, Nicos

    2005-09-01

    Current approaches for mining association rules usually assume that the mining is performed in a static database, where the problem of missing attribute values does not practically exist. However, these assumptions are not preserved in some medical databases, like in a home care system. In this paper, a novel uncertainty rule algorithm is illustrated, namely URG-2 (Uncertainty Rule Generator), which addresses the problem of mining dynamic databases containing missing values. This algorithm requires only one pass from the initial dataset in order to generate the item set, while new metrics corresponding to the notion of Support and Confidence are used. URG-2 was evaluated over two medical databases, introducing randomly multiple missing values for each record's attribute (rate: 5-20% by 5% increments) in the initial dataset. Compared with the classical approach (records with missing values are ignored), the proposed algorithm was more robust in mining rules from datasets containing missing values. In all cases, the difference in preserving the initial rules ranged between 30% and 60% in favour of URG-2. Moreover, due to its incremental nature, URG-2 saved over 90% of the time required for thorough re-mining. Thus, the proposed algorithm can offer a preferable solution for mining in dynamic relational databases.

  13. 75 FR 22723 - Stream Protection Rule; Environmental Impact Statement

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Parts 780... of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; notice of intent to prepare an environmental impact statement. SUMMARY: We, the Office of Surface Mining Reclamation and...

  14. A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

    PubMed Central

    Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra

    2012-01-01

    Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need of negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, literature survey has been used for validation purpose to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed. PMID:22539940

  15. iADRs: towards online adverse drug reaction analysis.

    PubMed

    Lin, Wen-Yang; Li, He-Yi; Du, Jhih-Wei; Feng, Wen-Yu; Lo, Chiao-Feng; Soo, Von-Wun

    2012-12-01

    Adverse Drug Reaction (ADR) is one of the most important issues in the assessment of drug safety. In fact, many adverse drug reactions are not discovered during limited pre-marketing clinical trials; instead, they are only observed after long term post-marketing surveillance of drug usage. In light of this, the detection of adverse drug reactions, as early as possible, is an important topic of research for the pharmaceutical industry. Recently, large numbers of adverse events and the development of data mining technology have motivated the development of statistical and data mining methods for the detection of ADRs. These stand-alone methods, with no integration into knowledge discovery systems, are tedious and inconvenient for users and the processes for exploration are time-consuming. This paper proposes an interactive system platform for the detection of ADRs. By integrating an ADR data warehouse and innovative data mining techniques, the proposed system not only supports OLAP style multidimensional analysis of ADRs, but also allows the interactive discovery of associations between drugs and symptoms, called a drug-ADR association rule, which can be further developed using other factors of interest to the user, such as demographic information. The experiments indicate that interesting and valuable drug-ADR association rules can be efficiently mined.

  16. A New Data Mining Scheme Using Artificial Neural Networks

    PubMed Central

    Kamruzzaman, S. M.; Jehad Sarkar, A. M.

    2011-01-01

    Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are however often regarded as black boxes, i.e., their predictions cannot be explained. To enhance the explanation of ANNs, a novel algorithm to extract symbolic rules from ANNs has been proposed in this paper. ANN methods have not been effectively utilized for data mining tasks because how the classifications were made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise symbolic rules with high accuracy, that are easily explainable, can be extracted from the trained ANNs. Extracted rules are comparable with other methods in terms of number of rules, average number of conditions for a rule, and the accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems. PMID:22163866

  17. Procrastinating Behavior in Computer-Based Learning Environments to Predict Performance: A Case Study in Moodle

    PubMed Central

    Cerezo, Rebeca; Esteban, María; Sánchez-Santillán, Miguel; Núñez, José C.

    2017-01-01

    Introduction: Research about student performance has traditionally considered academic procrastination as a behavior that has negative effects on academic achievement. Although there is much evidence for this in class-based environments, there is a lack of research on Computer-Based Learning Environments (CBLEs). Therefore, the purpose of this study is to evaluate student behavior in a blended learning program and specifically procrastination behavior in relation to performance through Data Mining techniques. Materials and Methods: A sample of 140 undergraduate students participated in a blended learning experience implemented in a Moodle (Modular Object Oriented Developmental Learning Environment) Management System. Relevant interaction variables were selected for the study, taking into account student achievement and analyzing data by means of association rules, a mining technique. The association rules were arrived at and filtered through two selection criteria: 1, rules must have an accuracy over 0.8 and 2, they must be present in both sub-samples. Results: The findings of our study highlight the influence of time management in online learning environments, particularly on academic achievement, as there is an association between procrastination variables and student performance. Conclusion: Negative impact of procrastination in learning outcomes has been observed again but in virtual learning environments where practical implications, prevention of, and intervention in, are different from class-based learning. These aspects are discussed to help resolve student difficulties at various ages. PMID:28883801
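
    As a rough illustration of the two selection criteria mentioned in this abstract (accuracy over 0.8, and presence in both sub-samples), the Python sketch below filters two sets of mined rules. The rule names, accuracy values, and the function `filter_rules` are hypothetical and are not taken from the study. The sketch also assumes that a rule counts as "present" in a sub-sample only if it exceeds the threshold there, which the abstract does not state explicitly.

```python
# Hypothetical rule sets: each key is (antecedent items, consequent), each value an accuracy.
# These names and numbers are illustrative only; they do not come from the study.
rules_a = {
    (("late_submission",), "low_grade"): 0.85,
    (("early_access",), "high_grade"): 0.90,
    (("few_logins",), "low_grade"): 0.70,
}
rules_b = {
    (("late_submission",), "low_grade"): 0.82,
    (("few_logins",), "low_grade"): 0.88,
}


def filter_rules(sample_a, sample_b, min_accuracy=0.8):
    """Keep rules found in both sub-samples with accuracy above min_accuracy in each."""
    kept = {}
    for rule, acc_a in sample_a.items():
        acc_b = sample_b.get(rule)  # None if the rule was not mined in sub-sample B
        if acc_b is not None and acc_a > min_accuracy and acc_b > min_accuracy:
            kept[rule] = (acc_a, acc_b)
    return kept


selected = filter_rules(rules_a, rules_b)
for rule, accuracies in selected.items():
    print(rule, accuracies)
```

    With this toy data only the first rule survives: the second is missing from sub-sample B, and the third falls below the threshold in sub-sample A.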

  18. Procrastinating Behavior in Computer-Based Learning Environments to Predict Performance: A Case Study in Moodle.

    PubMed

    Cerezo, Rebeca; Esteban, María; Sánchez-Santillán, Miguel; Núñez, José C

    2017-01-01

    Introduction: Research about student performance has traditionally considered academic procrastination as a behavior that has negative effects on academic achievement. Although there is much evidence for this in class-based environments, there is a lack of research on Computer-Based Learning Environments (CBLEs). Therefore, the purpose of this study is to evaluate student behavior in a blended learning program and specifically procrastination behavior in relation to performance through Data Mining techniques. Materials and Methods: A sample of 140 undergraduate students participated in a blended learning experience implemented in a Moodle (Modular Object Oriented Developmental Learning Environment) Management System. Relevant interaction variables were selected for the study, taking into account student achievement and analyzing data by means of association rules, a mining technique. The association rules were arrived at and filtered through two selection criteria: 1, rules must have an accuracy over 0.8 and 2, they must be present in both sub-samples. Results: The findings of our study highlight the influence of time management in online learning environments, particularly on academic achievement, as there is an association between procrastination variables and student performance. Conclusion: Negative impact of procrastination in learning outcomes has been observed again but in virtual learning environments where practical implications, prevention of, and intervention in, are different from class-based learning. These aspects are discussed to help resolve student difficulties at various ages.

  19. Improve Data Mining and Knowledge Discovery Through the Use of MatLab

    NASA Technical Reports Server (NTRS)

    Shaykhian, Gholam Ali; Martin, Dawn (Elliott); Beil, Robert

    2011-01-01

    Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. There are various algorithms, techniques and methods used to mine data; including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods used to detect patterns in a dataset, have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of data stored is discovered. No one technique solves all data mining problems; challenges are to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; lacking exploratory data mining and discoveries of latent information. This presentation introduces MatLab(R) (MATrix LABoratory), an engineering and scientific data analyses tool to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing, visualization and its enormous availability of built-in functionalities and toolboxes make it suitable to perform numerical computations and simulations as well as a data mining tool. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.

  20. Improve Data Mining and Knowledge Discovery through the use of MatLab

    NASA Technical Reports Server (NTRS)

    Shaykhian, Gholam Ali; Martin, Dawn Elliott; Beil, Robert

    2011-01-01

    Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. There are various algorithms, techniques and methods used to mine data; including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods used to detect patterns in a dataset, have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of data stored is discovered. No one technique solves all data mining problems; challenges are to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; lacking exploratory data mining and discoveries of latent information. This presentation introduces MatLab(TM) (MATrix LABoratory), an engineering and scientific data analyses tool to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing, visualization and its enormous availability of built-in functionalities and toolboxes make it suitable to perform numerical computations and simulations as well as a data mining tool. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.

  1. Exploring Characterizations of Learning Object Repositories Using Data Mining Techniques

    NASA Astrophysics Data System (ADS)

    Segura, Alejandra; Vidal, Christian; Menendez, Victor; Zapata, Alfredo; Prieto, Manuel

    Learning object repositories provide a platform for the sharing of Web-based educational resources. As these repositories evolve independently, it is difficult for users to have a clear picture of the kind of contents they give access to. Metadata can be used to automatically extract a characterization of these resources by using machine learning techniques. This paper presents an exploratory study carried out in the contents of four public repositories that uses clustering and association rule mining algorithms to extract characterizations of repository contents. The results of the analysis include potential relationships between different attributes of learning objects that may be useful to gain an understanding of the kind of resources available and eventually develop search mechanisms that consider repository descriptions as a criterion in federated search.

  2. Efficient mining of association rules for the early diagnosis of Alzheimer's disease

    NASA Astrophysics Data System (ADS)

    Chaves, R.; Górriz, J. M.; Ramírez, J.; Illán, I. A.; Salas-Gonzalez, D.; Gómez-Río, M.

    2011-09-01

    In this paper, a novel technique based on association rules (ARs) is presented in order to find relations among activated brain areas in single photon emission computed tomography (SPECT) imaging. In this sense, the aim of this work is to discover associations among attributes which characterize the perfusion patterns of normal subjects and to make use of them for the early diagnosis of Alzheimer's disease (AD). Firstly, voxel-as-feature-based activation estimation methods are used to find the tridimensional activated brain regions of interest (ROIs) for each patient. Secondly, these ROIs serve as input to mine ARs with a minimum support and confidence among activation blocks by using a set of controls. In this context, support and confidence measures are related to the proportion of functional areas which are singularly and mutually activated across the brain. Finally, we perform image classification by comparing the number of ARs verified by each subject under test to a given threshold that depends on the number of previously mined rules. Several classification experiments were carried out in order to evaluate the proposed methods using a SPECT database that consists of 41 controls (NOR) and 56 AD patients labeled by trained physicians. The proposed methods were validated by means of the leave-one-out cross validation strategy, yielding up to 94.87% classification accuracy, thus outperforming recently developed methods for computer aided diagnosis of AD.
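
    The final classification step in this abstract (counting how many mined rules a subject verifies and comparing that count to a threshold tied to the number of rules) can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the region identifiers, the rule list, the `fraction` parameter, and the convention that a rule is "verified" when all regions in its antecedent and consequent are activated are all hypothetical and are not taken from the paper.

```python
# Hypothetical rules mined from controls: (antecedent regions, consequent regions).
rules = [
    (frozenset({"r1"}), frozenset({"r2"})),
    (frozenset({"r2"}), frozenset({"r3"})),
    (frozenset({"r1", "r3"}), frozenset({"r4"})),
]


def count_verified(rule_list, activated):
    """Count rules whose antecedent and consequent regions are all activated."""
    return sum(
        1 for antecedent, consequent in rule_list
        if antecedent <= activated and consequent <= activated
    )


def classify(rule_list, activated, fraction=0.5):
    """Label a subject NOR if it verifies at least fraction * (number of rules), else AD."""
    threshold = fraction * len(rule_list)  # assumed form of the rule-count-dependent threshold
    return "NOR" if count_verified(rule_list, activated) >= threshold else "AD"


subject_x = {"r1", "r2", "r3", "r4"}  # verifies all three rules
subject_y = {"r1", "r4"}              # verifies none of them

print(classify(rules, subject_x))
print(classify(rules, subject_y))
```

    Because the rules are mined from controls, a subject who verifies few of them departs from the normal perfusion pattern, which is why a low count maps to the AD label in this sketch.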

  3. Analysis 320 coal mine accidents using structural equation modeling with unsafe conditions of the rules and regulations as exogenous variables.

    PubMed

    Zhang, Yingyu; Shao, Wei; Zhang, Mengjia; Li, Hejun; Yin, Shijiu; Xu, Yingjun

    2016-07-01

    Mining has been historically considered as a naturally high-risk industry worldwide. Deaths caused by coal mine accidents are more than the sum of all other accidents in China. Statistics of 320 coal mine accidents in Shandong province show that all accidents contain indicators of "unsafe conditions of the rules and regulations" with a frequency of 1590, accounting for 74.3% of the total frequency of 2140. "Unsafe behaviors of the operator" is another important contributory factor, which mainly includes "operator error" and "venturing into dangerous places." A systems analysis approach was applied by using structural equation modeling (SEM) to examine the interactions between the contributory factors of coal mine accidents. The analysis of results leads to three conclusions. (i) "Unsafe conditions of the rules and regulations," affect the "unsafe behaviors of the operator," "unsafe conditions of the equipment," and "unsafe conditions of the environment." (ii) The three influencing factors of coal mine accidents (with the frequency of effect relation in descending order) are "lack of safety education and training," "rules and regulations of safety production responsibility," and "rules and regulations of supervision and inspection." (iii) The three influenced factors (with the frequency in descending order) of coal mine accidents are "venturing into dangerous places," "poor workplace environment," and "operator error." Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Highly scalable and robust rule learner: performance evaluation and comparison.

    PubMed

    Kurgan, Lukasz A; Cios, Krzysztof J; Dick, Scott

    2006-02-01

    Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug companies. In all cases, there needs to be an underlying data mining system, and this mining system must be highly scalable. To this end, we describe a new rule learner called DataSqueezer. The learner belongs to the family of inductive supervised rule extraction algorithms. DataSqueezer is a simple, greedy, rule builder that generates a set of production rules from labeled input data. In spite of its relative simplicity, DataSqueezer is a very effective learner. The rules generated by the algorithm are compact, comprehensible, and have accuracy comparable to rules generated by other state-of-the-art rule extraction algorithms. The main advantages of DataSqueezer are very high efficiency, and missing data resistance. DataSqueezer exhibits log-linear asymptotic complexity with the number of training examples, and it is faster than other state-of-the-art rule learners. The learner is also robust to large quantities of missing data, as verified by extensive experimental comparison with the other learners. DataSqueezer is thus well suited to modern data mining and business intelligence tasks, which commonly involve huge datasets with a large fraction of missing data.

  5. Prediction model for peninsular Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    Vathsala, H.; Koolagudi, Shashidhar G.

    2017-01-01

    In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the predictors from local conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good.

  6. PubMedMiner: Mining and Visualizing MeSH-based Associations in PubMed.

    PubMed

    Zhang, Yucan; Sarkar, Indra Neil; Chen, Elizabeth S

    2014-01-01

    The exponential growth of biomedical literature provides the opportunity to develop approaches for facilitating the identification of possible relationships between biomedical concepts. Indexing by Medical Subject Headings (MeSH) represent high-quality summaries of much of this literature that can be used to support hypothesis generation and knowledge discovery tasks using techniques such as association rule mining. Based on a survey of literature mining tools, a tool implemented using Ruby and R - PubMedMiner - was developed in this study for mining and visualizing MeSH-based associations for a set of MEDLINE articles. To demonstrate PubMedMiner's functionality, a case study was conducted that focused on identifying and comparing comorbidities for asthma in children and adults. Relative to the tools surveyed, the initial results suggest that PubMedMiner provides complementary functionality for summarizing and comparing topics as well as identifying potentially new knowledge.

  7. 5 CFR 5201.105 - Additional rules for Mine Safety and Health Administration employees.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... Health Administration employees. 5201.105 Section 5201.105 Administrative Personnel DEPARTMENT OF LABOR... for Mine Safety and Health Administration employees. The rules in this section apply to employees of the Mine Safety and Health Administration (MSHA) and are in addition to §§ 5201.101, 5201.102, and...

  8. From data mining rules to medical logical modules and medical advices.

    PubMed

    Gomoi, Valentin; Vida, Mihaela; Robu, Raul; Stoicu-Tivadar, Vasile; Bernad, Elena; Lupşe, Oana

    2013-01-01

    Using data mining in collaboration with Clinical Decision Support Systems adds new knowledge as support for medical diagnosis. The current work presents a tool which translates data mining rules supporting generation of medical advices to Arden Syntax formalism. The developed system was tested with data related to 2326 births that took place in 2010 at the Bega Obstetrics - Gynaecology Hospital, Timişoara. Based on processing these data, 14 medical rules regarding the Apgar score were generated and then translated in Arden Syntax language.

  9. Dietary patterns analysis using data mining method. An application to data from the CYKIDS study.

    PubMed

    Lazarou, Chrystalleni; Karaolis, Minas; Matalas, Antonia-Leda; Panagiotakos, Demosthenes B

    2012-11-01

    Data mining is a computational method that permits the extraction of patterns from large databases. We applied the data mining approach in data from 1140 children (9-13 years), in order to derive dietary habits related to children's obesity status. Rules that emerged from the data mining approach revealed the detrimental influence of the increased consumption of soft drinks, delicatessen meat, sweets, fried and junk food. For example, frequent (3-5 times/week) consumption of all these foods increases the risk for being obese by 75%, whereas in children who have a similar dietary pattern, but eat >2 times/week fish and seafood the risk for obesity is reduced by 33%. In conclusion, patterns revealed by the data mining technique refer to specific groups of children and demonstrate the effect on the risk associated with obesity status when a single dietary habit might be modified. Thus, a more individualized approach when translating public health messages could be achieved. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  10. In Brief: Coal mining regulations

    NASA Astrophysics Data System (ADS)

    Showstack, Randy

    2009-12-01

    The U.S. Department of the Interior (DOI) announced on 18 November measures to strengthen the oversight of state surface coal mining programs and to promulgate federal regulations to protect streams affected by surface coal mining operations. DOI's Office of Surface Mining Reclamation and Enforcement (OSM) is publishing an advance notice of a proposed rule about protecting streams from adverse impacts of surface coal mining operations. A rule issued by the Bush administration in December 2008 allows coal mine operators to place excess excavated materials into streams if they can show it is not reasonably possible to avoid doing so. “We are moving as quickly as possible under the law to gather public input for a new rule, based on sound science, that will govern how companies handle fill removed from mountaintop coal seams,” according to Wilma Lewis, assistant secretary for Land and Minerals Management at DOI.

  11. SCADA-based Operator Support System for Power Plant Equipment Fault Forecasting

    NASA Astrophysics Data System (ADS)

    Mayadevi, N.; Ushakumari, S. S.; Vinodchandra, S. S.

    2014-12-01

    Power plant equipment must be monitored closely to prevent failures from disrupting plant availability. Online monitoring technology integrated with hybrid forecasting techniques can be used to prevent plant equipment faults. A self-learning rule-based expert system is proposed in this paper for fault forecasting in power plants controlled by supervisory control and data acquisition (SCADA) system. Self-learning utilizes associative data mining algorithms on the SCADA history database to form new rules that can dynamically update the knowledge base of the rule-based expert system. In this study, a number of popular associative learning algorithms are considered for rule formation. Data mining results show that the Tertius algorithm is best suited for developing a learning engine for power plants. For real-time monitoring of the plant condition, graphical models are constructed by K-means clustering. To build a time-series forecasting model, a multilayer perceptron (MLP) is used. Once created, the models are updated in the model library to provide an adaptive environment for the proposed system. A graphical user interface (GUI) illustrates the variation of all sensor values affecting a particular alarm/fault, as well as the step-by-step procedure for avoiding critical situations and consequent plant shutdown. The forecasting performance is evaluated by computing the mean absolute error and root mean square error of the predictions.
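
    The two evaluation measures named at the end of this abstract, mean absolute error (MAE) and root mean square error (RMSE), can be computed as in the Python sketch below. The sensor readings and forecasts are made-up numbers for illustration only, and the function names are hypothetical; nothing here reproduces the paper's data or code.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more than MAE."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Illustrative values only (not from the study).
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 18.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
print(mae, rmse)
```

    For these values the absolute errors are 1, 0, 1 and 2, so the MAE is 1.0, while the RMSE is the square root of 1.5 because the single error of 2 is weighted more heavily.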

  12. Soil quality assessment using weighted fuzzy association rules

    USGS Publications Warehouse

    Xue, Yue-Ju; Liu, Shu-Guang; Hu, Yue-Ming; Yang, Jing-Feng

    2010-01-01

    Fuzzy association rules (FARs) can be powerful in assessing regional soil quality, a critical step prior to land planning and utilization; however, traditional FARs mined from soil quality database, ignoring the importance variability of the rules, can be redundant and far from optimal. In this study, we developed a method applying different weights to traditional FARs to improve accuracy of soil quality assessment. After the FARs for soil quality assessment were mined, redundant rules were eliminated according to whether the rules were significant or not in reducing the complexity of the soil quality assessment models and in improving the comprehensibility of FARs. The global weights, each representing the importance of a FAR in soil quality assessment, were then introduced and refined using a gradient descent optimization method. This method was applied to the assessment of soil resources conditions in Guangdong Province, China. The new approach had an accuracy of 87%, when 15 rules were mined, as compared with 76% from the traditional approach. The accuracy increased to 96% when 32 rules were mined, in contrast to 88% from the traditional approach. These results demonstrated an improved comprehensibility of FARs and a high accuracy of the proposed method.

  13. 77 FR 43721 - Examinations of Work Areas in Underground Coal Mines for Violations of Mandatory Health or Safety...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-26

    ... Examinations of Work Areas in Underground Coal Mines for Violations of Mandatory Health or Safety Standards... effectiveness of information collection requirements contained in the final rule on Examinations of Work Areas... requirements in MSHA's final rule on Examinations of Work Areas in Underground Coal Mines for Violations of...

  14. 75 FR 28227 - National Emission Standards for Hazardous Air Pollutants: Gold Mine Ore Processing and Production...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-20

    ... published a proposed rule for mercury emissions from the gold mine ore processing and production area source... proposed rule (75 FR 22470). Several parties requested that EPA extend the comment period. EPA has granted...-AP48 National Emission Standards for Hazardous Air Pollutants: Gold Mine Ore Processing and Production...

  15. Using association rule mining to identify risk factors for early childhood caries.

    PubMed

    Ivančević, Vladimir; Tušek, Ivan; Tušek, Jasmina; Knežević, Marko; Elheshk, Salaheddin; Luković, Ivan

    2015-11-01

    Early childhood caries (ECC) is a potentially severe disease affecting children all over the world. The available findings are mostly based on a logistic regression model, but data mining, in particular association rule mining, could be used to extract more information from the same data set. ECC data was collected in a cross-sectional analytical study of the 10% sample of preschool children in the South Bačka area (Vojvodina, Serbia). Association rules were extracted from the data by association rule mining. Risk factors were extracted from the highly ranked association rules. Discovered dominant risk factors include male gender, frequent breastfeeding (with other risk factors), high birth order, language, and low body weight at birth. Low health awareness of parents was significantly associated with ECC only in male children. The discovered risk factors are mostly confirmed by the literature, which corroborates the value of the methods. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Attribute index and uniform design based multiobjective association rule mining with evolutionary algorithm.

    PubMed

    Zhang, Jie; Wang, Yuping; Feng, Junhong

    2013-01-01

    In association rule mining, evaluating an association rule requires repeatedly scanning the database to compare the whole database with the antecedent of the rule, the consequent of the rule, and the whole rule. To decrease the number of comparisons and the time consumed, we present an attribute index strategy. It needs to scan the database only once to create the attribute index of each attribute. All metric values used to evaluate an association rule are then obtained from the attribute indices alone, without scanning the database any further. The paper views association rule mining as a multiobjective problem rather than a single-objective one. To make the acquired solutions scatter uniformly toward the Pareto frontier in the objective space, an elitism policy and uniform design are introduced. The paper presents the algorithm of attribute index and uniform design based multiobjective association rule mining with evolutionary algorithm, abbreviated as IUARMMEA. It no longer requires user-specified minimum support and minimum confidence, but uses a simple attribute index. It uses a well-designed real encoding so as to extend its application scope. Experiments performed on several databases demonstrate that the proposed algorithm has excellent performance and can significantly reduce the number of comparisons and the time consumed.
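
    The attribute-index idea in this abstract can be illustrated with a minimal sketch. This is not the authors' IUARMMEA implementation: the function names and the toy transactions are assumptions made for illustration only. The database is scanned once to record, for each item, the ids of the transactions that contain it; support and confidence are then computed by set intersection on that index, with no further scan.

```python
from collections import defaultdict

def build_attribute_index(transactions):
    """Single pass: map each item to the set of transaction ids containing it."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return index

def cover(index, itemset, n):
    """Ids of the transactions that contain every item in itemset."""
    tids = set(range(n))
    for item in itemset:
        tids &= index.get(item, set())
    return tids

def support_confidence(index, n, antecedent, consequent):
    """Support and confidence of antecedent -> consequent, from the index only."""
    a = cover(index, antecedent, n)
    both = a & cover(index, consequent, n)
    support = len(both) / n
    confidence = len(both) / len(a) if a else 0.0
    return support, confidence

# Hypothetical toy database, used only to exercise the sketch.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
index = build_attribute_index(transactions)
support, confidence = support_confidence(
    index, len(transactions), {"bread"}, {"milk"}
)
```

    Here the rule bread -> milk is evaluated from the index alone: two of the four transactions contain both items, and two of the three transactions containing bread also contain milk.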

  17. Attribute Index and Uniform Design Based Multiobjective Association Rule Mining with Evolutionary Algorithm

    PubMed Central

    Wang, Yuping; Feng, Junhong

    2013-01-01

    In association rule mining, evaluating an association rule requires repeatedly scanning the database to compare the whole database with the antecedent of the rule, the consequent of the rule, and the whole rule. To decrease the number of comparisons and the time consumed, we present an attribute index strategy. It needs to scan the database only once to create the attribute index of each attribute. All metric values used to evaluate an association rule are then obtained from the attribute indices alone, without scanning the database any further. The paper views association rule mining as a multiobjective problem rather than a single-objective one. To make the acquired solutions scatter uniformly toward the Pareto frontier in the objective space, an elitism policy and uniform design are introduced. The paper presents the algorithm of attribute index and uniform design based multiobjective association rule mining with evolutionary algorithm, abbreviated as IUARMMEA. It no longer requires user-specified minimum support and minimum confidence, but uses a simple attribute index. It uses a well-designed real encoding so as to extend its application scope. Experiments performed on several databases demonstrate that the proposed algorithm has excellent performance and can significantly reduce the number of comparisons and the time consumed. PMID:23766683

  18. 75 FR 34666 - Stream Protection Rule; Environmental Impact Statement

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-18

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Chapter VII RIN 1029-AC63 Stream Protection Rule; Environmental Impact Statement AGENCY: Office of Surface Mining... impact statement. [[Page 34667

  19. 26 CFR 1.611-2 - Rules applicable to mines, oil and gas wells, and other natural deposits.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... Rules applicable to mines, oil and gas wells, and other natural deposits. (a) Computation of cost depletion of mines, oil and gas wells, and other natural deposits. (1) The basis upon which cost depletion... for the taxable year, the cost depletion for that year shall be computed by dividing such amount by...

  20. Implementation of hospital examination reservation system using data mining technique.

    PubMed

    Cha, Hyo Soung; Yoon, Tae Sik; Ryu, Ki Chung; Shin, Il Won; Choe, Yang Hyo; Lee, Kyoung Yong; Lee, Jae Dong; Ryu, Keun Ho; Chung, Seung Hyun

    2015-04-01

    New methods for obtaining appropriate information for users have been attempted with the development of information technology and the Internet. Among such methods, the demand for systems and services that can improve patient satisfaction has increased in hospital care environments. In this paper, we proposed the Hospital Exam Reservation System (HERS), which uses the data mining method. First, we focused on carrying clinical exam data and finding the optimal schedule for generating rules using the multi-examination pattern-mining algorithm. Then, HERS was applied by a rule master and recommending system with an exam log. Finally, HERS was designed as a user-friendly interface. HERS has been applied at the National Cancer Center in Korea since June 2014. As the number of scheduled exams increased, the time required to schedule more than a single condition decreased (from 398.67% to 168.67% and from 448.49% to 188.49%; p < 0.0001). As the number of tests increased, the difference between HERS and non-HERS increased (from 0.18 days to 0.81 days). It was possible to expand the efficiency of HERS studies using mining technology in not only exam reservations, but also the medical environment. The proposed system based on doctor prescription removes exams that were not executed in order to improve recommendation accuracy. In addition, we expect HERS to become an effective system in various medical environments.

  1. Using data mining techniques to characterize participation in observational studies.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2016-12-01

    Data mining techniques are gaining in popularity among health researchers for an array of purposes, such as improving diagnostic accuracy, identifying high-risk patients and extracting concepts from unstructured data. In this paper, we describe how these techniques can be applied to another area in the health research domain: identifying characteristics of individuals who do and do not choose to participate in observational studies. In contrast to randomized studies where individuals have no control over their treatment assignment, participants in observational studies self-select into the treatment arm and therefore have the potential to differ in their characteristics from those who elect not to participate. These differences may explain part, or all, of the difference in the observed outcome, making it crucial to assess whether there is differential participation based on observed characteristics. As compared to traditional approaches to this assessment, data mining offers a more precise understanding of these differences. To describe and illustrate the application of data mining in this domain, we use data from a primary care-based medical home pilot programme and compare the performance of commonly used classification approaches - logistic regression, support vector machines, random forests and classification tree analysis (CTA) - in correctly classifying participants and non-participants. We find that CTA is substantially more accurate than the other models. Moreover, unlike the other models, CTA offers transparency in its computational approach, ease of interpretation via the decision rules produced and provides statistical results familiar to health researchers. Beyond their application to research, data mining techniques could help administrators to identify new candidates for participation who may most benefit from the intervention. © 2016 John Wiley & Sons, Ltd.

  2. 76 FR 70075 - Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-10

    ... Detection Systems for Continuous Mining Machines in Underground Coal Mines AGENCY: Mine Safety and Health... proposed rule addressing Proximity Detection Systems for Continuous Mining Machines in Underground Coal... Detection Systems for Continuous Mining Machines in Underground Coal Mines. MSHA conducted hearings on...

  3. H2RM: A Hybrid Rough Set Reasoning Model for Prediction and Management of Diabetes Mellitus.

    PubMed

    Ali, Rahman; Hussain, Jamil; Siddiqi, Muhammad Hameed; Hussain, Maqbool; Lee, Sungyoung

    2015-07-03

    Diabetes is a chronic disease characterized by high blood glucose level that results either from a deficiency of insulin produced by the body, or the body's resistance to the effects of insulin. Accurate and precise reasoning and prediction models greatly help physicians to improve diagnosis, prognosis and treatment procedures of different diseases. Though numerous models have been proposed to solve issues of diagnosis and management of diabetes, they have the following drawbacks: (1) they are restricted to one type of diabetes; (2) their techniques and decisions lack understandability and explanatory power; (3) they are limited either to prediction or to management over structured contents; and (4) they lack competence in handling the dimensionality and vagueness of patients' data. To overcome these issues, this paper proposes a novel hybrid rough set reasoning model (H2RM) that resolves problems of inaccurate prediction and management of type-1 diabetes mellitus (T1DM) and type-2 diabetes mellitus (T2DM). For verification of the proposed model, experimental data from fifty patients, acquired from a local hospital in semi-structured format, is used. First, the data is transformed into structured format and then used for mining prediction rules. Rough set theory (RST) based techniques and algorithms are used to mine the prediction rules. During the online execution phase of the model, these rules are used to predict T1DM and T2DM for new patients. Furthermore, the proposed model assists physicians to manage diabetes using knowledge extracted from online diabetes guidelines. Correlation-based trend analysis techniques are used to manage diabetic observations. Experimental results demonstrate that the proposed model outperforms the existing methods with 95.9% average and balanced accuracies.

  4. H2RM: A Hybrid Rough Set Reasoning Model for Prediction and Management of Diabetes Mellitus

    PubMed Central

    Ali, Rahman; Hussain, Jamil; Siddiqi, Muhammad Hameed; Hussain, Maqbool; Lee, Sungyoung

    2015-01-01

    Diabetes is a chronic disease characterized by high blood glucose level that results either from a deficiency of insulin produced by the body, or the body’s resistance to the effects of insulin. Accurate and precise reasoning and prediction models greatly help physicians to improve diagnosis, prognosis and treatment procedures of different diseases. Though numerous models have been proposed to solve issues of diagnosis and management of diabetes, they have the following drawbacks: (1) they are restricted to one type of diabetes; (2) their techniques and decisions lack understandability and explanatory power; (3) they are limited either to prediction or to management over structured contents; and (4) they lack competence in handling the dimensionality and vagueness of patients’ data. To overcome these issues, this paper proposes a novel hybrid rough set reasoning model (H2RM) that resolves problems of inaccurate prediction and management of type-1 diabetes mellitus (T1DM) and type-2 diabetes mellitus (T2DM). For verification of the proposed model, experimental data from fifty patients, acquired from a local hospital in semi-structured format, is used. First, the data is transformed into structured format and then used for mining prediction rules. Rough set theory (RST) based techniques and algorithms are used to mine the prediction rules. During the online execution phase of the model, these rules are used to predict T1DM and T2DM for new patients. Furthermore, the proposed model assists physicians to manage diabetes using knowledge extracted from online diabetes guidelines. Correlation-based trend analysis techniques are used to manage diabetic observations. Experimental results demonstrate that the proposed model outperforms the existing methods with 95.9% average and balanced accuracies. PMID:26151207

  5. Big data mining analysis method based on cloud computing

    NASA Astrophysics Data System (ADS)

    Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao

    2017-08-01

    In the era of information explosion, the very large, discrete and non-(semi-)structured character of big data has gone far beyond what traditional data management can handle. With the arrival of the cloud computing era, cloud computing provides a new technical approach to mining massive data, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs an association rule mining algorithm based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
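
    The abstract does not give the details of its MapReduce design, so the following is only a generic single-process sketch of how itemset counting is commonly split into a map step and a reduce step. The function names, the partitioning and the toy data are assumptions, and no real MapReduce framework is used: each partition is counted independently, and the partial counts are then summed.

```python
from collections import Counter
from itertools import combinations

def map_partition(partition, k):
    """Map step: count every k-itemset inside one data partition."""
    counts = Counter()
    for transaction in partition:
        for itemset in combinations(sorted(transaction), k):
            counts[itemset] += 1
    return counts

def reduce_counts(partials):
    """Reduce step: sum the partial counts produced by all partitions."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

# Hypothetical partitions; in a real cluster each would live on a separate worker.
partitions = [
    [{"a", "b"}, {"a", "c"}],
    [{"a", "b", "c"}, {"b"}],
]
pair_counts = reduce_counts(map_partition(p, 2) for p in partitions)

# Keep the pairs that reach an assumed minimum support count of 2.
frequent_pairs = {s for s, c in pair_counts.items() if c >= 2}
```

    Because the map step touches only its own partition, the partitions can be processed in parallel, which is the source of the speed-up that such designs aim for.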

  6. [Rule of Clinical Application of Auricular Acupuncture Based on Data Mining].

    PubMed

    Bao, Na; Wang, Qiong; Sun, Yan-Hui; Shi, Jing; Li, Xiao-Feng; Xu, Jing; Xing, Hai-Jiao; Zhang, Xuan-Ping; Zhang, Xin; Du, Yu-Zhu; Li, Jun-Lei; Yang, Qing-Qing; Feng, Xin-Xin; Jia, Chun-Sheng; Wang, Jian-Ling

    2017-02-25

    To explore the rule of clinical application of auricular acupuncture therapy by data mining in order to guide clinical practice. A database of single auricular acupuncture therapy for different clinical diseases was established by collection, sorting, screening, recording, collation, data extraction and statistical analysis of data samples from journals and academic theses and dissertations published over nearly 60 years. The application rules of auricular therapy, including its predominant diseases, stimulus modality, therapeutic effect, and angle of needling, were summarized by data mining technique. Auricular acupuncture therapy has been most widely used in the internal medicine department, accounting for 48.56%. Of stimulus modalities, auricular point paste and pressure is applied with the highest frequency, accounting for 64%. The highest effective rate is found in surgery department diseases (81.41%). Pressure is the most effective stimulus in the internal medicine department, bloodletting combined with paste and pressure in the surgery department, auricular point injection in the gynecology and pediatrics departments, bloodletting in the ophthalmology and otorhinolaryngology department, and auricular point incision in the dermatology department. Auricular point injection has remarkable effect. Bloodletting combined with paste and pressure has nearly the same effect as bloodletting in the same medical department, except in the dermatology department. The angle of needling, however, is rarely studied. Auricular therapy is widely used and has remarkable effect in treating diseases by using different stimulus modalities, whereas the angle of needling is rarely studied and future investigation is needed.

  7. Applying data mining techniques to determine important parameters in chronic kidney disease and the relations of these parameters to each other.

    PubMed

    Tahmasebian, Shahram; Ghazisaeedi, Marjan; Langarizadeh, Mostafa; Mokhtaran, Mehrshad; Mahdavi-Mazdeh, Mitra; Javadian, Parisa

    2017-01-01

    Introduction: Chronic kidney disease (CKD) includes a wide range of pathophysiological processes which will be observed along with abnormal function of kidneys and progressive decrease in glomerular filtration rate (GFR). According to the definition, the decrease in GFR must have been present for at least three months. CKD will eventually result in end-stage kidney disease. In this process different factors play a role, and finding the relations between effective parameters in this regard can help to prevent or slow progression of this disease. There are always a lot of data being collected from the patients' medical records. This huge array of data can be considered a valuable source for analyzing, exploring and discovering information. Objectives: Using data mining techniques, the present study tries to specify the effective parameters and also aims to determine their relations with each other in Iranian patients with CKD. Material and Methods: The study population includes 31996 patients with CKD. First, all of the data was registered in the database. Then data mining tools were used to find the hidden rules and relationships between parameters in the collected data. Results: After data cleaning based on the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology and running mining algorithms on the data in the database, the relationships between the effective parameters were specified. Conclusion: This study was done using the data mining method pertaining to the effective factors on patients with CKD.

  8. Applying data mining techniques to determine important parameters in chronic kidney disease and the relations of these parameters to each other

    PubMed Central

    Tahmasebian, Shahram; Ghazisaeedi, Marjan; Langarizadeh, Mostafa; Mokhtaran, Mehrshad; Mahdavi-Mazdeh, Mitra; Javadian, Parisa

    2017-01-01

    Introduction: Chronic kidney disease (CKD) includes a wide range of pathophysiological processes which will be observed along with abnormal function of kidneys and progressive decrease in glomerular filtration rate (GFR). According to the definition, the decrease in GFR must have been present for at least three months. CKD will eventually result in end-stage kidney disease. In this process different factors play a role, and finding the relations between effective parameters in this regard can help to prevent or slow progression of this disease. There are always a lot of data being collected from the patients’ medical records. This huge array of data can be considered a valuable source for analyzing, exploring and discovering information. Objectives: Using data mining techniques, the present study tries to specify the effective parameters and also aims to determine their relations with each other in Iranian patients with CKD. Material and Methods: The study population includes 31996 patients with CKD. First, all of the data was registered in the database. Then data mining tools were used to find the hidden rules and relationships between parameters in the collected data. Results: After data cleaning based on the CRISP-DM (Cross Industry Standard Process for Data Mining) methodology and running mining algorithms on the data in the database, the relationships between the effective parameters were specified. Conclusion: This study was done using the data mining method pertaining to the effective factors on patients with CKD. PMID:28497080

  9. An Incremental High-Utility Mining Algorithm with Transaction Insertion

    PubMed Central

    Gan, Wensheng; Zhang, Binbin

    2015-01-01

    Association-rule mining is commonly used to discover useful and meaningful patterns from a very large database. It only considers the occurrence frequencies of items to reveal the relationships among itemsets. Traditional association-rule mining is, however, not suitable in real-world applications, since the items purchased by a customer may involve various factors, such as profit or quantity. High-utility mining was designed to overcome the limitations of association-rule mining by considering both the quantity and profit measures. Most high-utility mining algorithms are designed to handle static databases. Fewer studies address dynamic high-utility mining with transaction insertion, which requires database rescans and suffers from the combinatorial explosion of the pattern-growth mechanism. In this paper, an efficient incremental algorithm with transaction insertion is designed to reduce computations without candidate generation, based on utility-list structures. The enumeration tree and the relationships between 2-itemsets are also adopted in the proposed algorithm to speed up the computations. Several experiments are conducted to show the performance of the proposed algorithm in terms of runtime, memory consumption, and number of generated patterns. PMID:25811038
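
    To make the difference between frequency and utility concrete, the sketch below computes the utility of an itemset as the sum, over the transactions that contain the whole itemset, of purchased quantity times unit profit. It illustrates only this basic definition; it is not the incremental utility-list algorithm of the paper, and the function name, profit table and transactions are hypothetical.

```python
def itemset_utility(transactions, profit, itemset):
    """Sum of quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * profit[item] for item in itemset)
    return total

# Hypothetical unit profits and purchase quantities per transaction.
profit = {"A": 5, "B": 2, "C": 1}
transactions = [
    {"A": 1, "B": 2},
    {"A": 2, "C": 3},
    {"B": 1, "C": 4},
]

utility_a = itemset_utility(transactions, profit, {"A"})
utility_ab = itemset_utility(transactions, profit, {"A", "B"})
utility_bc = itemset_utility(transactions, profit, {"B", "C"})
```

    In this toy data {A, B} and {B, C} each occur in exactly one transaction, yet their utilities differ, which is the kind of distinction that frequency-only mining cannot express.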

  10. A novel on-line spatial-temporal k-anonymity method for location privacy protection from sequence rules-based inference attacks.

    PubMed

    Zhang, Haitao; Wu, Chenxue; Chen, Zewei; Liu, Zhao; Zhu, Yunhong

    2017-01-01

    Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge. In response to this challenge, first we defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large scale anonymity datasets. Then we proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets to hide the privacy-sensitive rules progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from an original sequence database of anonymity datasets, and privacy-sensitive sequence rules are developed by correlating privacy-sensitive spatial regions with spatial grid cells among the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles to hide the privacy-sensitive sequence rules progressively from the extended sequence anonymity datasets database. We conducted extensive experiments to test the performance of the proposed method, and to explore the influence of the parameter K value. The results demonstrated that our proposed approach is faster and more effective for hiding privacy-sensitive sequence rules in terms of hiding sensitive rules ratios to eliminate inference attacks. Our method also had fewer side effects in terms of generating new sensitive rules ratios than the traditional spatial-temporal k-anonymity method, and had basically the same side effects in terms of non-sensitive rules variation ratios with the traditional spatial-temporal k-anonymity method. Furthermore, we also found the performance variation tendency from the parameter K value, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules.

  11. A novel on-line spatial-temporal k-anonymity method for location privacy protection from sequence rules-based inference attacks

    PubMed Central

    Wu, Chenxue; Liu, Zhao; Zhu, Yunhong

    2017-01-01

    Analyzing large-scale spatial-temporal k-anonymity datasets recorded in location-based service (LBS) application servers can benefit some LBS applications. However, such analyses can allow adversaries to make inference attacks that cannot be handled by spatial-temporal k-anonymity methods or other methods for protecting sensitive knowledge. In response to this challenge, first we defined a destination location prediction attack model based on privacy-sensitive sequence rules mined from large scale anonymity datasets. Then we proposed a novel on-line spatial-temporal k-anonymity method that can resist such inference attacks. Our anti-attack technique generates new anonymity datasets with awareness of privacy-sensitive sequence rules. The new datasets extend the original sequence database of anonymity datasets to hide the privacy-sensitive rules progressively. The process includes two phases: off-line analysis and on-line application. In the off-line phase, sequence rules are mined from an original sequence database of anonymity datasets, and privacy-sensitive sequence rules are developed by correlating privacy-sensitive spatial regions with spatial grid cells among the sequence rules. In the on-line phase, new anonymity datasets are generated upon LBS requests by adopting specific generalization and avoidance principles to hide the privacy-sensitive sequence rules progressively from the extended sequence anonymity datasets database. We conducted extensive experiments to test the performance of the proposed method, and to explore the influence of the parameter K value. The results demonstrated that our proposed approach is faster and more effective for hiding privacy-sensitive sequence rules in terms of hiding sensitive rules ratios to eliminate inference attacks. Our method also had fewer side effects in terms of generating new sensitive rules ratios than the traditional spatial-temporal k-anonymity method, and had basically the same side effects in terms of non-sensitive rules variation ratios with the traditional spatial-temporal k-anonymity method. Furthermore, we also found the performance variation tendency from the parameter K value, which can help achieve the goal of hiding the maximum number of original sensitive rules while generating a minimum of new sensitive rules and affecting a minimum number of non-sensitive rules. PMID:28767687

  12. Software tool for data mining and its applications

    NASA Astrophysics Data System (ADS)

    Yang, Jie; Ye, Chenzhou; Chen, Nianyi

    2002-03-01

    A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), and computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, and visualization. The principle and knowledge representation of some function models of data mining are described. The software tool is implemented in Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool has been satisfactorily applied to the prediction of regularities in the formation of ternary intermetallic compounds in alloy systems, and to the diagnosis of brain glioma.

  13. Negative and Positive Association Rules Mining from Text Using Frequent and Infrequent Itemsets

    PubMed Central

    Mahmood, Sajid; Shahbaz, Muhammad; Guergachi, Aziz

    2014-01-01

    Association rule mining research typically focuses on positive association rules (PARs), generated from frequently occurring itemsets. However, in recent years, significant research has focused on finding interesting infrequent itemsets, leading to the discovery of negative association rules (NARs). The discovery of infrequent itemsets is far more difficult than that of their counterparts, that is, frequent itemsets. The difficulties include the discovery of infrequent itemsets, the generation of accurate NARs, and their huge number as compared with positive association rules. In medical science, for example, one is interested in factors which can either adjudicate the presence of a disease or write off its possibility. The vivid positive symptoms are often obvious; however, negative symptoms are subtler and more difficult to recognize and diagnose. In this paper, we propose an algorithm for discovering positive and negative association rules among frequent and infrequent itemsets. We identify associations among medications, symptoms, and laboratory results using state-of-the-art data mining technology. PMID:24955429

  14. 77 FR 5740 - Tennessee Abandoned Mine Land Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 942... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; public comment period and... amendment to the Tennessee Abandoned Mine Land (AML) Reclamation Plan under the Surface Mining Control and...

  15. 30 CFR 784.200 - Interpretive rules related to General Performance Standards.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... RECLAMATION AND OPERATION PLAN § 784.200 Interpretive rules related to General Performance Standards. The... ENFORCEMENT, DEPARTMENT OF THE INTERIOR SURFACE COAL MINING AND RECLAMATION OPERATIONS PERMITS AND COAL... Surface Mining Reclamation and Enforcement. (a) Interpretation of § 784.15: Reclamation plan: Postmining...

  16. A Recommendation Algorithm for Automating Corollary Order Generation

    PubMed Central

    Klann, Jeffrey; Schadow, Gunther; McCoy, JM

    2009-01-01

    Manual development and maintenance of decision support content is time-consuming and expensive. We explore recommendation algorithms, e-commerce data-mining tools that use collective order history to suggest purchases, to assist with this. In particular, previous work shows corollary order suggestions are amenable to automated data-mining techniques. Here, an item-based collaborative filtering algorithm augmented with association rule interestingness measures mined suggestions from 866,445 orders made in an inpatient hospital in 2007, generating 584 potential corollary orders. Our expert physician panel evaluated the top 92 and agreed 75.3% were clinically meaningful. Also, at least one felt 47.9% would be directly relevant in guideline development. This automated generation of a rough-cut of corollary orders confirms prior indications about automated tools in building decision support content. It is an important step toward computerized augmentation to decision support development, which could increase development efficiency and content quality while automatically capturing local standards. PMID:20351875

  17. A recommendation algorithm for automating corollary order generation.

    PubMed

    Klann, Jeffrey; Schadow, Gunther; McCoy, J M

    2009-11-14

    Manual development and maintenance of decision support content is time-consuming and expensive. We explore recommendation algorithms, e-commerce data-mining tools that use collective order history to suggest purchases, to assist with this. In particular, previous work shows corollary order suggestions are amenable to automated data-mining techniques. Here, an item-based collaborative filtering algorithm augmented with association rule interestingness measures mined suggestions from 866,445 orders made in an inpatient hospital in 2007, generating 584 potential corollary orders. Our expert physician panel evaluated the top 92 and agreed 75.3% were clinically meaningful. Also, at least one felt 47.9% would be directly relevant in guideline development. This automated generation of a rough-cut of corollary orders confirms prior indications about automated tools in building decision support content. It is an important step toward computerized augmentation to decision support development, which could increase development efficiency and content quality while automatically capturing local standards.

  18. Collaborative Data Mining Tool for Education

    ERIC Educational Resources Information Center

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; Gea, Miguel; de Castro, Carlos

    2009-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the continuous improvement of e-learning courses allowing teachers with similar course's profile sharing and scoring the discovered information. This mining tool is oriented to be used by instructors non experts in data mining such that, its…

  19. 77 FR 44155 - Administration of Mining Claims and Sites

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-27

    ... 1004-AE27 Administration of Mining Claims and Sites AGENCY: Bureau of Land Management, Interior. ACTION... on locating, recording, and maintaining mining claims or sites. In this rule, the BLM amends its... placer mining claims. The law specifies that the holder of an unpatented placer mining claim must pay the...

  20. 43 CFR 3487.1 - Logical mining units.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Logical mining units. 3487.1 Section 3487..., DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES Logical Mining Unit § 3487.1 Logical mining units. (a) An LMU shall become effective only upon approval of the...

  1. Applying Data Mining Techniques to Extract Hidden Patterns about Breast Cancer Survival in an Iranian Cohort Study.

    PubMed

    Khalkhali, Hamid Reza; Lotfnezhad Afshar, Hadi; Esnaashari, Omid; Jabbari, Nasrollah

    2016-01-01

    Breast cancer survival has been analyzed with many standard data mining algorithms, some of which belong to the decision tree category. The ability of decision tree algorithms to visualize and formulate hidden patterns among study variables was the main reason for applying, in the current study, an algorithm from this category that had not previously been studied. The classification and regression trees (CART) algorithm was applied to a breast cancer database containing information on 569 patients from 2007-2010. Gini impurity, the measure used for categorical target variables, was utilized. The classification error, which is a function of tree size, was measured by 10-fold cross-validation experiments. The performance of the resulting model was evaluated by accuracy, sensitivity and specificity. The CART model produced a decision tree with 17 nodes, 9 of which were associated with a set of rules. The rules were clinically meaningful. They showed, in if-then format, that Stage was the most important variable for predicting breast cancer survival. The accuracy, sensitivity and specificity scores were 80.3%, 93.5% and 53%, respectively. The current model, the first one created by CART, was able to extract useful hidden rules from a relatively small dataset.
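
    Illustrative note (not part of the cited record): the Gini impurity mentioned above can be sketched as follows. This is a generic, minimal illustration of how a CART-style split could be scored, assuming categorical class labels; the function names are hypothetical and nothing here reproduces the study's data or model.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a binary split into `left` and `right`."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
```

    A tree builder of this kind would prefer the candidate split with the lowest weighted impurity; a split that separates the classes perfectly scores 0.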

  2. A hybrid, auto-adaptive and rule-based multi-agent approach using evolutionary algorithms for improved searching

    NASA Astrophysics Data System (ADS)

    Izquierdo, Joaquín; Montalvo, Idel; Campbell, Enrique; Pérez-García, Rafael

    2016-08-01

    Selecting the most appropriate heuristic for solving a specific problem is not easy, for many reasons. This article focuses on one of these reasons: traditionally, the solution search process has operated in a given manner regardless of the specific problem being solved, and the process has been the same regardless of the size, complexity and domain of the problem. To cope with this situation, search processes should mould the search into areas of the search space that are meaningful for the problem. This article builds on previous work in the development of a multi-agent paradigm using techniques derived from knowledge discovery (data-mining techniques) on databases of so-far visited solutions. The aim is to improve the search mechanisms, increase computational efficiency and use rules to enrich the formulation of optimization problems, while reducing the search space and catering to realistic problems.

  3. Health-Mining: a Disease Management Support Service based on Data Mining and Rule Extraction.

    PubMed

    Bei, Andrea; De Luca, Stefano; Ruscitti, Giancarlo; Salamon, Diego

    2005-01-01

    Disease management is the collection of processes aimed at controlling health care and improving its quality while reducing the overall cost of procedures. Our system, Health-Mining, is a Decision Support System with the objective of controlling the adequacy of hospitalization and therapies, determining the effective use of standard guidelines and eventually identifying better ones that emerge from medical practice (Evidence Based Medicine). In realizing the system, we aim to create a path toward the construction of admissions-appropriateness criteria that are valid at an international level. A main goal of the project is rule extraction and the identification of rules that are adequate in terms of efficacy, quality and cost reduction, especially in view of fast-changing technologies and medicines. We tested Health-Mining in a real test case for an Italian Region, Regione Veneto, on the installation of pacemakers and ICDs.

  4. Mining Rare Associations between Biological Ontologies

    PubMed Central

    Benites, Fernando; Simon, Svenja; Sapozhnikova, Elena

    2014-01-01

    The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations. PMID:24404165

  5. Mining rare associations between biological ontologies.

    PubMed

    Benites, Fernando; Simon, Svenja; Sapozhnikova, Elena

    2014-01-01

    The constantly increasing volume and complexity of available biological data requires new methods for their management and analysis. An important challenge is the integration of information from different sources in order to discover possible hidden relations between already known data. In this paper we introduce a data mining approach which relates biological ontologies by mining cross and intra-ontology pairwise generalized association rules. Its advantage is sensitivity to rare associations, for these are important for biologists. We propose a new class of interestingness measures designed for hierarchically organized rules. These measures allow one to select the most important rules and to take into account rare cases. They favor rules with an actual interestingness value that exceeds the expected value. The latter is calculated taking into account the parent rule. We demonstrate this approach by applying it to the analysis of data from Gene Ontology and GPCR databases. Our objective is to discover interesting relations between two different ontologies or parts of a single ontology. The association rules that are thus discovered can provide the user with new knowledge about underlying biological processes or help improve annotation consistency. The obtained results show that produced rules represent meaningful and quite reliable associations.

  6. 76 FR 35801 - Examinations of Work Areas in Underground Coal Mines and Pattern of Violations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-20

    ..., 1219-AB73 Examinations of Work Areas in Underground Coal Mines and Pattern of Violations AGENCY: Mine... public hearings on the Agency's proposed rules for Examinations of Work Areas in Underground Coal Mines... Underground Coal Mines' submissions, and with ``RIN 1219-AB73'' for Pattern of Violations' submissions...

  7. 78 FR 48591 - Refuge Alternatives for Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-08

    ... Administration 30 CFR Parts 7 and 75 Refuge Alternatives for Underground Coal Mines; Proposed Rules Federal... Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION: Limited reopening of the... for miners to deploy and use refuge alternatives in underground coal mines. The U.S. Court of Appeals...

  8. 75 FR 20918 - High-Voltage Continuous Mining Machine Standard for Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-22

    ... DEPARTMENT OF LABOR Mine Safety and Health Administration 30 CFR Parts 18 and 75 RIN 1219-AB34 High-Voltage Continuous Mining Machine Standard for Underground Coal Mines Correction In rule document 2010-7309 beginning on page 17529 in the issue of Tuesday, April 6, 2010, make the following correction...

  9. Location Prediction Based on Transition Probability Matrices Constructing from Sequential Rules for Spatial-Temporal K-Anonymity Dataset

    PubMed Central

    Liu, Zhao; Zhu, Yunhong; Wu, Chenxue

    2016-01-01

    Spatial-temporal k-anonymity has become a mainstream approach among techniques for protection of users’ privacy in location-based services (LBS) applications, and has been applied to several variants such as LBS snapshot queries and continuous queries. Analyzing large-scale spatial-temporal anonymity sets may benefit several LBS applications. In this paper, we propose two location prediction methods based on transition probability matrices constructed from sequential rules for spatial-temporal k-anonymity datasets. First, we define single-step sequential rules mined from sequential spatial-temporal k-anonymity datasets generated from continuous LBS queries for multiple users. We then construct transition probability matrices from the mined single-step sequential rules, and normalize the transition probabilities in the transition matrices. Next, we regard a mobility model for an LBS requester as a stationary stochastic process and compute the n-step transition probability matrices by raising the normalized transition probability matrices to the power n. Furthermore, we propose two location prediction methods: rough prediction and accurate prediction. The former obtains the probabilities of arriving at target locations along simple paths that include only current locations, target locations and transition steps. By iteratively combining the probabilities for simple paths with n steps and the probabilities for detailed paths with n-1 steps, the latter method calculates transition probabilities for detailed paths with n steps from current locations to target locations. Finally, we conduct extensive experiments, which verify the correctness and flexibility of our proposed algorithm. PMID:27508502
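
    Illustrative note (not part of the cited record): the normalization and n-step computation described above can be sketched in a few lines. This is a minimal sketch assuming a small matrix of hypothetical rule weights between three locations; the helper names and the numbers are invented for illustration and do not come from the paper.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1 so it can be read as transition probabilities."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def n_step(matrix, n):
    """Raise a transition matrix to the power n (n >= 0) by repeated multiplication."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, matrix)
    return result


# Hypothetical weights of single-step rules between three locations (rows: from, columns: to).
counts = [
    [0, 3, 1],
    [2, 0, 2],
    [1, 1, 0],
]
P = normalize_rows(counts)
P2 = n_step(P, 2)  # entry [i][j]: probability of being at j two steps after i
```

    Because each row of `P` sums to 1, each row of `P2` does as well, which gives a quick sanity check on the computation.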

  10. 43 CFR 3482.3 - Mining operations maps.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Mining operations maps. 3482.3 Section... MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES Exploration and Resource Recovery and Protection Plans § 3482.3 Mining operations maps. (a...

  11. Real-time intelligent decision making with data mining

    NASA Astrophysics Data System (ADS)

    Gupta, Deepak P.; Gopalakrishnan, Bhaskaran

    2004-03-01

    Database mining, widely known as knowledge discovery and data mining (KDD), has attracted a lot of attention in recent years. With the rapid growth of databases in commercial, industrial, administrative and other applications, it is necessary and interesting to extract knowledge automatically from huge amounts of data. Almost all organizations are generating data and information at an unprecedented rate and need to get useful information from this data. Data mining is the extraction of non-trivial, previously unknown and potentially useful patterns, trends, dependence and correlation, known as association rules, among data values in large databases. In the last ten to fifteen years, data mining has spread from one company to another to help them understand more about customers' aspect of quality and response and also distinguish the customers they want from those they do not. A credit-card company found that customers who complete their applications in pencil rather than pen are more likely to default. There is a program that identifies callers by purchase history. The bigger the spender, the quicker the call will be answered. If you feel your call is being answered in the order in which it was received, think again. Many algorithms assume that data is static in nature and mine the rules and relations in that data. But for a dynamic database, e.g. in most manufacturing industries, the rules and relations thus developed among the variables/items no longer hold true. A simple approach may be to mine the associations among the variables after every fixed period of time. But how long this period should be remains a question to be answered. The next problem with static data mining is that some of the relationships that might be of interest from one period to the other may be lost after a new set of data is used. To reflect the effect of a new data set and the current status of the association rules, where some of the strong rules might become weak and vice versa, there is a need to develop an efficient algorithm that adapts to the current patterns and associations. Some work has been done on developing association rules for incremental databases, but to the best of the author's knowledge no work has been done to do the same for periodic cause and effect analysis for online association rules in manufacturing industries. The present research attempts to answer these questions and develop an algorithm that can display the association rules online, find the periodic patterns in the data and detect the root cause of the problem.

  12. Effective Diagnosis of Alzheimer's Disease by Means of Association Rules

    NASA Astrophysics Data System (ADS)

    Chaves, R.; Ramírez, J.; Górriz, J. M.; López, M.; Salas-Gonzalez, D.; Illán, I.; Segovia, F.; Padilla, P.

    In this paper we present a novel method for classifying SPECT images for the early diagnosis of Alzheimer's disease (AD). The proposed method is based on Association Rules (ARs), aiming to discover interesting associations between attributes contained in the database. The system first uses voxel-as-features (VAF) and Activation Estimation (AE) to find three-dimensional activated brain regions of interest (ROIs) for each patient. These ROIs then serve as inputs for mining ARs between activated blocks for controls, with a specified minimum support and minimum confidence. ARs are mined in supervised mode, using information previously extracted from the most discriminant rules to focus on the relevant brain areas, reducing the computational requirements of the system. Finally, classification is performed according to the number of previously mined rules verified by each subject, yielding a classification accuracy of up to 95.87% and thus outperforming recently developed methods for AD diagnosis.

  13. 76 FR 12852 - Louisiana Regulatory Program/Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 918... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are...

  14. 75 FR 60373 - Louisiana Regulatory Program/Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 918... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule... of Surface Mining Reclamation and Enforcement (OSM), are announcing receipt of a proposed amendment...

  15. 26 CFR 1.614-3 - Rules relating to separate operating mineral interests in the case of mines.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 26 Internal Revenue 7 2011-04-01 2009-04-01 true Rules relating to separate operating mineral interests in the case of mines. 1.614-3 Section 1.614-3 Internal Revenue INTERNAL REVENUE SERVICE, DEPARTMENT OF THE TREASURY (CONTINUED) INCOME TAX (CONTINUED) INCOME TAXES (CONTINUED) Natural Resources § 1...

  16. 26 CFR 1.611-2 - Rules applicable to mines, oil and gas wells, and other natural deposits.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... other natural deposits. 1.611-2 Section 1.611-2 Internal Revenue INTERNAL REVENUE SERVICE, DEPARTMENT OF THE TREASURY (CONTINUED) INCOME TAX (CONTINUED) INCOME TAXES (CONTINUED) Natural Resources § 1.611-2 Rules applicable to mines, oil and gas wells, and other natural deposits. (a) Computation of cost...

  17. Revealing Significant Relations between Chemical/Biological Features and Activity: Associative Classification Mining for Drug Discovery

    ERIC Educational Resources Information Center

    Yu, Pulan

    2012-01-01

    Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…

  18. 30 CFR 49.60 - Requirements for a local mine rescue contest.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines § 49.60 Requirements... United States; (2) Uses MSHA-recognized rules; (3) Has a minimum of three mine rescue teams competing; (4) Has one or more problems conducted on one or more days with a determined winner; (5) Includes team...

  19. 30 CFR 49.60 - Requirements for a local mine rescue contest.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines § 49.60 Requirements... United States; (2) Uses MSHA-recognized rules; (3) Has a minimum of three mine rescue teams competing; (4) Has one or more problems conducted on one or more days with a determined winner; (5) Includes team...

  20. 30 CFR 49.60 - Requirements for a local mine rescue contest.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines § 49.60 Requirements... United States; (2) Uses MSHA-recognized rules; (3) Has a minimum of three mine rescue teams competing; (4) Has one or more problems conducted on one or more days with a determined winner; (5) Includes team...

  1. 30 CFR 49.60 - Requirements for a local mine rescue contest.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines § 49.60 Requirements... United States; (2) Uses MSHA-recognized rules; (3) Has a minimum of three mine rescue teams competing; (4) Has one or more problems conducted on one or more days with a determined winner; (5) Includes team...

  2. 30 CFR 49.60 - Requirements for a local mine rescue contest.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines § 49.60 Requirements... United States; (2) Uses MSHA-recognized rules; (3) Has a minimum of three mine rescue teams competing; (4) Has one or more problems conducted on one or more days with a determined winner; (5) Includes team...

  3. Implementation of Data Mining to Analyze Drug Cases Using C4.5 Decision Tree

    NASA Astrophysics Data System (ADS)

    Wahyuni, Sri

    2018-03-01

    Data mining is the process of finding useful information in large databases. Classification is one of the existing data mining techniques. The method used here was the decision tree, built with the C4.5 algorithm. The decision tree method transforms a very large set of facts into a decision tree that represents rules. Decision trees are useful for exploring data and for finding hidden relationships between a number of potential input variables and a target variable. The C4.5 decision tree is constructed in several stages: selecting an attribute as the root, creating a branch for each value, and dividing the cases among the branches. These stages are repeated for each branch until all the cases on the branch have the same class. The resulting decision tree yields a set of rules for the case at hand. In this study, the researcher classified data on prisoners at Labuhan Deli prison to identify the factors behind detainees committing drug crimes. Applying the C4.5 algorithm produced knowledge that can be used to help minimize drug crimes. The findings showed that the most influential factor in detainees committing drug crimes was the address variable.
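
    Illustrative note (not part of the cited record): the abstract above describes selecting an attribute as the root but does not state the selection criterion. C4.5 is generally described as choosing attributes by gain ratio, and the sketch below computes that quantity for one categorical attribute. It is a minimal sketch with hypothetical function names and toy data, not the study's implementation.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Expected entropy remaining after the split, weighted by branch size.
    remainder = sum(len(group) / n * entropy(group) for group in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalizes attributes with many distinct values
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one attribute column and the class column.
attribute = ["x", "x", "y", "y"]
target = ["yes", "yes", "no", "no"]
ratio = gain_ratio(attribute, target)
```

    In a tree builder of this kind, the attribute with the highest gain ratio would become the root, and the same selection would be repeated on each branch until the cases in a branch share one class.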

  4. 76 FR 76104 - Arkansas Regulatory Program and Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 904... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation and...

  5. 77 FR 55430 - Arkansas Regulatory Program and Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-10

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 904... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation and...

  6. Association Rule Mining from an Intelligent Tutor

    ERIC Educational Resources Information Center

    Dogan, Buket; Camurcu, A. Yilmaz

    2008-01-01

    Educational data mining is a very novel research area, offering fertile ground for many interesting data mining applications. Educational data mining can extract useful information from educational activities for better understanding and assessment of the student learning process. In this way, it is possible to explore how students learn topics in…

  7. Analysis on composition rules of Chinese patent drugs treating pain-related diseases based on data mining method.

    PubMed

    Tang, Shi-Huan; Shen, Dan; Yang, Hong-Jun

    2017-08-24

    To analyze the composition rules of oral prescriptions for the treatment of headache, stomachache and dysmenorrhea recorded in the National Standard for Chinese Patent Drugs (NSCPD) enacted by the Ministry of Public Health of China, and to compare them in order to better understand pain treatment in different regions of the human body. The NSCPD database was constructed in 2014. Prescriptions treating the three pain-related diseases were searched and screened from the database. Data mining methods such as association rules analysis and the complex system entropy method, integrated in the data mining software Traditional Chinese Medicine Inheritance Support System (TCMISS), were then applied to process the data. The top 25 drugs with high frequency in the treatment of each disease were selected, and 51, 33 and 22 core combinations treating headache, stomachache and dysmenorrhea, respectively, were mined as well. The composition rules of the oral prescriptions for treating headache, stomachache and dysmenorrhea recorded in NSCPD have been summarized. Although there were similarities between them, formulas varied according to the location of pain. The results can serve as evidence and reference for clinical treatment and new drug development.

  8. 76 FR 64047 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  9. 76 FR 36040 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  10. 78 FR 16204 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  11. 76 FR 80310 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-23

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  12. 76 FR 67635 - Alaska Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 902... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Alaska regulatory program (hereinafter, the ``Alaska program'') under the Surface Mining...

  13. 76 FR 64045 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  14. 76 FR 76111 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  15. 77 FR 25874 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Final rule; removal of required amendment... regulatory program (the ``Pennsylvania program'') regulations under the Surface Mining Control and...

  16. TSCA Chemical Data Reporting Fact Sheet: Reporting Manufactured Chemical Substances from Metal Mining and Related Activities

    EPA Pesticide Factsheets

    This fact sheet provides guidance on the Chemical Data Reporting (CDR) rule requirements related to the reporting of mined metals, intermediates, and byproducts manufactured during metal mining and related activities.

  17. 77 FR 1430 - Maryland Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-10

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 920... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; extension of the comment... the Maryland regulatory program (the ``Maryland program'') under the Surface Mining Control and...

  18. Data mining for multiagent rules, strategies, and fuzzy decision tree structure

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Rhyne, Robert D., II; Fisher, Kristin

    2002-03-01

    A fuzzy logic based resource manager (RM) has been developed that automatically allocates electronic attack resources in real-time over many dissimilar platforms. Two different data mining algorithms have been developed to determine rules, strategies, and fuzzy decision tree structure. The first data mining algorithm uses a genetic algorithm as a data mining function and is called from an electronic game. The game allows a human expert to play against the resource manager in a simulated battlespace with each of the defending platforms being exclusively directed by the fuzzy resource manager and the attacking platforms being controlled by the human expert or operating autonomously under their own logic. This approach automates the data mining problem. The game automatically creates a database reflecting the domain expert's knowledge. It calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information mined in the second step. The criterion for re-optimization is discussed as well as experimental results. Then a second data mining algorithm that uses a genetic program as a data mining function is introduced to automatically discover fuzzy decision tree structures. Finally, a fuzzy decision tree generated through this process is discussed.

  19. 75 FR 69617 - Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust Monitors

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-11-15

    ... 1219-AB64 Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust... hearings on the proposed rule addressing Lowering Miners' Exposure to Respirable Coal Mine Dust, Including... miners' exposure to respirable coal mine dust by revising the Agency's existing standards on miners...

  20. 76 FR 11187 - Examinations of Work Areas in Underground Coal Mines for Violations of Mandatory Health or Safety...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-01

    ... Examinations of Work Areas in Underground Coal Mines for Violations of Mandatory Health or Safety Standards... rule addressing Examinations of Work Areas in Underground Coal Mines for Violations of Mandatory Health..., and weekly examinations of underground coal mines. This extension gives commenters an additional 30...

  1. Validity of association rules extracted by healthcare-data-mining.

    PubMed

    Takeuchi, Hiroshi; Kodama, Naoki

    2014-01-01

    A personal healthcare system used with cloud computing has been developed. It enables a daily time-series of personal health and lifestyle data to be stored in the cloud through mobile devices. The cloud automatically extracts personally useful information, such as rules and patterns concerning the user's lifestyle and health condition embedded in their personal big data, by using healthcare-data-mining. This study has verified that the rules extracted from daily time-series data stored over a half-year by volunteer users of this system are valid.

  2. Chemical named entities recognition: a review on approaches and applications.

    PubMed

    Eltyeb, Safaa; Salim, Naomie

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to "text mine" these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.

  3. 77 FR 58056 - Mississippi Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 924... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM...

  4. 76 FR 36039 - Colorado Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 906... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Colorado proposes both additions...

  5. 77 FR 34890 - Oklahoma Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 936... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  6. 76 FR 50708 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-16

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation...

  7. 75 FR 60371 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  8. 77 FR 41680 - Indiana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-16

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 914... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving amendments to the Indiana...

  9. 77 FR 25949 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  10. 76 FR 76109 - Colorado Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 906... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public...'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Colorado...

  11. 77 FR 66574 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  12. 77 FR 18149 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public... receipt of Montana's response to the Office of Surface Mining Reclamation and Enforcement's (OSM) November...

  13. 77 FR 24661 - North Dakota Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 934... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). North Dakota proposes...

  14. 76 FR 23522 - Oklahoma Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 936... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM...

  15. 75 FR 21534 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  16. 77 FR 34892 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  17. 77 FR 18738 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-28

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  18. 76 FR 9700 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-22

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  19. 77 FR 40796 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-11

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Final rule. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are removing a disapproval codified in OSM regulations...

  20. 76 FR 12857 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment... the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Montana proposed...

  1. 78 FR 11617 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Surface Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; reopening of comment... regulatory program (the ``Pennsylvania program'') under the Surface Mining Control and Reclamation Act of...

  2. Empirical evaluation of interest-level criteria

    NASA Astrophysics Data System (ADS)

    Sahar, Sigal; Mansour, Yishay

    1999-02-01

    Efficient association rule mining algorithms already exist, however, as the size of databases increases, the number of patterns mined by the algorithms increases to such an extent that their manual evaluation becomes impractical. Automatic evaluation methods are, therefore, required in order to sift through the initial list of rules, which the datamining algorithm outputs. These evaluation methods, or criteria, rank the association rules mined from the dataset. We empirically examined several such statistical criteria: new criteria, as well as previously known ones. The empirical evaluation was conducted using several databases, including a large real-life dataset, acquired from an order-by-phone grocery store, a dataset composed from www proxy logs, and several datasets from the UCI repository. We were interested in discovering whether the ranking performed by the various criteria is similar or easily distinguishable. Our evaluation detected, when significant differences exist, three patterns of behavior in the eight criteria we examined. There is an obvious dilemma in determining how many association rules to choose (in accordance with support and confidence parameters). The tradeoff is between having stringent parameters and, therefore, few rules, or lenient parameters and, thus, a multitude of rules. In many cases, our empirical evaluation revealed that most of the rules found by the comparably strict parameters ranked highly according to the interestingness criteria, when using lax parameters (producing significantly more association rules). Finally, we discuss the association rules that ranked highest, explain why these results are sound, and how they direct future research.
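
    The support and confidence parameters mentioned above, and the idea of ranking rules by an interestingness criterion, can be illustrated with a minimal Python sketch. The abstract does not name the eight criteria that were examined, so lift is used here only as a familiar stand-in; the function names and the toy grocery transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a if s_a else 0.0
    # Lift is an assumed example criterion, not one confirmed by the abstract.
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Hypothetical toy data, not taken from the study.
transactions = [frozenset(t) for t in (
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
)]

metrics = rule_metrics(transactions, {"bread"}, {"milk"})
print(metrics)
```

    Raising the minimum support or confidence threshold shrinks the rule list; a ranking criterion such as the one sketched here can then be applied to whatever rules remain.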

  3. An Algorithm of Association Rule Mining for Microbial Energy Prospection

    PubMed Central

    Shaheen, Muhammad; Shahbaz, Muhammad

    2017-01-01

    The presence of hydrocarbons beneath earth’s surface produces some microbiological anomalies in soils and sediments. The detection of such microbial populations involves pure biochemical processes which are specialized, expensive and time consuming. This paper proposes a new algorithm of context based association rule mining on non-spatial data. The algorithm is a modified form of an existing algorithm that was designed for spatial databases only. The algorithm is applied to mine context based association rules on a microbial database to extract interesting and useful associations of microbial attributes with the existence of a hydrocarbon reserve. The surface and soil manifestations caused by the presence of hydrocarbon oxidizing microbes are selected from existing literature and stored in a shared database. The algorithm is applied on the said database to generate direct and indirect associations among the stored microbial indicators. These associations are then correlated with the probability of hydrocarbon’s existence. The numerical evaluation shows better accuracy for non-spatial data as compared to conventional algorithms at generating reliable and robust rules. PMID:28393846

  4. The application of data mining to explore association rules between metabolic syndrome and lifestyles.

    PubMed

    Huang, Yi Chao

    This study used an efficient data mining algorithm, called DCIP (the data cutting and inner product method), to explore association rules between the lifestyles of factory workers in Taiwan and the metabolic syndrome. A total of 1,216 workers in four companies completed a lifestyle questionnaire. Results of the questionnaire survey were integrated into the workers' health examination reports to form an attribute database of the metabolic syndrome. Among the association rules derived by DCIP, 80% of those on the list of the top 15 highest support counts are corroborated by medical literature or by healthcare professionals. These findings prove that data mining is a valid and effective research method, and that larger sample sizes will likely produce more accurate associations connecting the metabolic syndrome to specific lifestyles. The rules already verified can serve as a reference guide for the health management of factory workers. The remaining 20%, while still lacking hard evidence, provide fertile ground for future research.

  5. 78 FR 6062 - North Dakota Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-29

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 934... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). North Dakota intends to...

  6. 76 FR 4266 - New Mexico Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 931... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). New Mexico proposes revisions to...

  7. 76 FR 9642 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-22

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Alabama...

  8. 78 FR 13002 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Mining Reclamation and Enforcement (``OSM''), Interior. ACTION: Proposed rule; public comment period and... regulatory program under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or the ``Act...

  9. 78 FR 11579 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  10. 76 FR 40649 - Indiana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-11

    ... at 312 IAC 25-6-30 Surface mining; explosives; general requirements. The full text of the program... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 914... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed...

  11. 78 FR 10512 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment with certain... ``Wyoming program'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act...

  12. 77 FR 8144 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving three...

  13. 78 FR 9807 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We are approving an amendment to the Utah regulatory program (the ``Utah program'') under the Surface Mining...

  14. 76 FR 30008 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-24

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Alabama...

  15. 75 FR 43476 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public...'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Montana revised...

  16. 75 FR 81122 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  17. 77 FR 58025 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  18. 76 FR 25277 - Examinations of Work Areas in Underground Coal Mines and Pattern of Violations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-04

    ..., 1219-AB73 Examinations of Work Areas in Underground Coal Mines and Pattern of Violations AGENCY: Mine... four public hearings on the Agency's proposed rules for Examinations of Work Areas in Underground Coal... 1219-AB75'' for Examinations of Work Areas in Underground Coal Mines' submissions, and with ``RIN 1219...

  19. 78 FR 49079 - Lease Modifications, Lease and Logical Mining Unit Diligence, Advance Royalty, Royalty Rates, and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-12

    ... Management 43 CFR Parts 3000, 3400, 3430, et al. Lease Modifications, Lease and Logical Mining Unit Diligence... Lease Modifications, Lease and Logical Mining Unit Diligence, Advance Royalty, Royalty Rates, and Bonds... leases and logical mining units (LMUs). The proposed rule would implement Title IV, Subtitle D of the...

  20. Knowledge-guided mutation in classification rules for autism treatment efficacy.

    PubMed

    Engle, Kelley; Rada, Roy

    2017-03-01

    Data mining methods in biomedical research might benefit by combining genetic algorithms with domain-specific knowledge. The objective of this research is to show how the evolution of treatment rules for autism might be guided. The semantic distance between two concepts in the taxonomy is measured by the number of relationships separating the concepts in the taxonomy. The hypothesis is that replacing a concept in a treatment rule will change the accuracy of the rule in direct proportion to the semantic distance between the concepts. The method uses a patient database and autism taxonomies. Treatment rules are developed with an algorithm that exploits the taxonomies. The results support the hypothesis. This research should both advance the understanding of autism data mining in particular and of knowledge-guided evolutionary search in biomedicine in general.
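
    The semantic distance described above, the number of relationships separating two concepts in a taxonomy, can be sketched as a shortest-path search. This is a minimal illustration that treats taxonomy links as undirected edges; the function name, the edge-list representation and the toy taxonomy are assumptions, not details from the paper.

```python
from collections import deque


def semantic_distance(edges, a, b):
    """Number of taxonomy relationships on the shortest path between a and b.

    `edges` is a list of (parent, child) pairs. Returns None when the two
    concepts are not connected or not present in the taxonomy.
    """
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, set()).add(child)
        graph.setdefault(child, set()).add(parent)
    if a not in graph or b not in graph:
        return None
    seen = {a}
    queue = deque([(a, 0)])
    while queue:  # breadth-first search yields the shortest path length
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Hypothetical toy taxonomy, for illustration only.
taxonomy = [
    ("treatment", "behavioral"),
    ("treatment", "medication"),
    ("behavioral", "behavioral_a"),
    ("behavioral", "behavioral_b"),
    ("medication", "medication_a"),
]

print(semantic_distance(taxonomy, "behavioral_a", "behavioral_b"))
print(semantic_distance(taxonomy, "behavioral_a", "medication_a"))
```

    Under the stated hypothesis, swapping a concept for a sibling (small distance) would be expected to change rule accuracy less than swapping it for a concept in another branch (large distance).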

  1. Dynamic association rules for gene expression data analysis.

    PubMed

    Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung

    2015-10-14

    The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes corresponds to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve the gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm) which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis. Four gene expression datasets showed that the proposed DAR algorithm not only was able to identify a set of differentially expressed genes that largely agreed with that of other methods, but also provided an efficient and accurate way to find influential genes of a disease. In the paper, the well-established association rule mining technique from marketing has been successfully modified to determine the minimum support and minimum confidence based on the concept of confidence interval and hypothesis testing. It can be applied to gene expression data to mine significant association rules between gene regulation and phenotype. The proposed DAR algorithm provides an efficient way to find influential genes that underlie the phenotypic variance.
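
    The general idea of judging a rule with a one-sided confidence interval can be sketched as follows. This is not the DAR algorithm itself, whose details the abstract does not give; it is a generic normal-approximation lower bound on a rule's confidence, and the function names, the `baseline` threshold and the counts are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def confidence_lower_bound(n_antecedent, n_both, alpha=0.05):
    """One-sided (1 - alpha) lower bound on rule confidence.

    Uses the normal approximation to the binomial proportion; this is an
    assumed stand-in, not the interval construction used in the paper.
    """
    if n_antecedent <= 0:
        raise ValueError("the antecedent must occur at least once")
    p = n_both / n_antecedent
    z = NormalDist().inv_cdf(1 - alpha)
    return p - z * sqrt(p * (1 - p) / n_antecedent)


def is_meaningful(n_antecedent, n_both, baseline, alpha=0.05):
    """Accept the rule when the lower bound exceeds a baseline confidence."""
    return confidence_lower_bound(n_antecedent, n_both, alpha) > baseline


# Hypothetical counts: the same kind of rule observed on a large and a small sample.
print(is_meaningful(100, 80, baseline=0.5))
print(is_meaningful(10, 6, baseline=0.5))
```

    The design point is that a rule observed on few samples gets a wide interval and is rejected even when its point estimate of confidence is above the baseline.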

  2. Privacy Preserving Association Rule Mining Revisited: Privacy Enhancement and Resources Efficiency

    NASA Astrophysics Data System (ADS)

    Mohaisen, Abedelaziz; Jho, Nam-Su; Hong, Dowon; Nyang, Daehun

    Privacy preserving association rule mining algorithms have been designed for discovering the relations between variables in data while maintaining the data privacy. In this article we revise one of the recently introduced schemes for association rule mining using fake transactions (FS). In particular, our analysis shows that the FS scheme has exhaustive storage and high computation requirements for guaranteeing a reasonable level of privacy. We introduce a realistic definition of privacy that benefits from the average case privacy and motivates the study of a weakness in the structure of FS by fake transactions filtering. In order to overcome this problem, we improve the FS scheme by presenting a hybrid scheme that considers both privacy and resources as two concurrent guidelines. Analytical and empirical results show the efficiency and applicability of our proposed scheme.

  3. Educational Data Mining Application for Estimating Students Performance in Weka Environment

    NASA Astrophysics Data System (ADS)

    Gowri, G. Shiyamala; Thulasiram, Ramasamy; Amit Baburao, Mahindra

    2017-11-01

    Educational data mining (EDM) is a multi-disciplinary research area that examines artificial intelligence, statistical modeling and data mining with the data generated from an educational institution. EDM uses computational approaches to explicate educational information in order to examine educational questions. To make a country stand unique among the other nations of the world, the education system has to undergo a major transition by redesigning its framework. The concealed patterns and data from various information repositories can be extracted by adopting the techniques of data mining. In order to summarize the performance of students with their credentials, we scrutinize the exploitation of data mining in the field of academics. The Apriori algorithm is extensively applied to the database of students for a wider classification based on various categories. The K-means procedure is applied to the same set of databases in order to group them into specific categories. The Apriori algorithm mines rules in order to extract similar patterns, along with their associations, across various sets of records. The records can be extracted from academic information repositories. The parameters used in this study give more importance to psychological traits than academic features. Undesirable student conduct can be clearly witnessed if we make use of information mining frameworks. Thus, the algorithms prove efficient at profiling students in any educational environment. The ultimate objective of the study is to identify whether a student is prone to violence or not.
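
    The Apriori procedure referred to above can be sketched in a few lines of Python. This is a minimal textbook version of frequent-itemset mining, not the Weka implementation used in the study; the function name, the attribute labels and the threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        kept = {}
        for cand in candidates:
            sup = sum(cand <= t for t in transactions) / n
            if sup >= min_support:
                kept[cand] = sup
        return kept

    # Level 1: frequent single items.
    level = frequent({frozenset([item]) for t in transactions for item in t})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


# Hypothetical student records, for illustration only.
records = [
    {"low_attendance", "late_submissions", "low_grade"},
    {"low_attendance", "low_grade"},
    {"high_attendance", "high_grade"},
    {"low_attendance", "late_submissions", "low_grade"},
]

frequent_itemsets = apriori(records, min_support=0.5)
for itemset, sup in sorted(frequent_itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

    Association rules are then read off the frequent itemsets by splitting each one into an antecedent and a consequent and keeping the splits that meet a minimum confidence.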

  4. 76 FR 64048 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Surface Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; reopening and extension... Mining Control and Reclamation Act of 1977 (SMCRA or the Act) published on February 7, 2011. In response...

  5. 30 CFR 301.1 - Cross reference.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... within the jurisdiction of administrative law judges and the Interior Board of Surface Mining and... Resources BOARD OF SURFACE MINING AND RECLAMATION APPEALS, DEPARTMENT OF THE INTERIOR PROCEDURES UNDER SURFACE MINING CONTROL AND RECLAMATION ACT OF 1977 § 301.1 Cross reference. For special rules applicable...

  6. 75 FR 60271 - Technical Amendments 2010

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-29

    ... Part VI Department of the Interior Office of Surface Mining Reclamation and Enforcement 30 CFR... INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Parts 740, 761, 773, 795, 816, 817...: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule. SUMMARY: We, the...

  7. 30 CFR 921.700 - Massachusetts Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 921.700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE MASSACHUSETTS § 921.700 Massachusetts Federal program. (a) This part contains all rules that are applicable to surface coal mining...

  8. 77 FR 58053 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 917... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; Removal of Required Amendments... program'') under the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act). As a result of...

  9. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Federal program. (c) The rules in this part apply to all surface coal mining operations in Oregon... more stringent environmental control and regulation of surface coal mining operations than do the... extent they provide for regulation of surface coal mining and reclamation operations which are exempt...

  10. 30 CFR 912.700 - Idaho Federal program.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... seq. and Rules 1 through 20 promulgated thereunder pertaining to regulation of dredge mining. (6... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE IDAHO § 912.700 Idaho Federal...

  11. 75 FR 64411 - Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust Monitors

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-19

    ...The Mine Safety and Health Administration (MSHA) proposes to lower miners' exposure to respirable coal mine dust by revising the Agency's existing standards on miners' occupational exposure to respirable coal mine dust. The major provisions of the proposal would lower the existing exposure limit; provide for full-shift sampling; redefine the term ``normal production shift''; and add reexamination and decertification requirements for persons certified to sample, and maintain and calibrate sampling devices. In addition, the proposed rule would provide for single shift compliance sampling under the mine operator and MSHA's inspector sampling programs, and would establish sampling requirements for use of the Continuous Personal Dust Monitor (CPDM) and expanded requirements for medical surveillance. The proposed rule would significantly improve health protections for this Nation's coal miners by reducing their occupational exposure to respirable coal mine dust and lowering the risk that they will suffer material impairment of health or functional capacity over their working lives.

  12. Effective application of improved profit-mining algorithm for the interday trading model.

    PubMed

    Hsieh, Yu-Lung; Yang, Don-Lin; Wu, Jungpin

    2014-01-01

    Many real world applications of association rule mining from large databases help users make better decisions. However, they do not work well in financial markets at this time. In addition to a high profit, an investor also looks for a low risk trading with a better rate of winning. The traditional approach of using minimum confidence and support thresholds needs to be changed. Based on an interday model of trading, we proposed effective profit-mining algorithms which provide investors with profit rules including information about profit, risk, and winning rate. Since profit-mining in the financial market is still in its infant stage, it is important to detail the inner working of mining algorithms and illustrate the best way to apply them. In this paper we go into details of our improved profit-mining algorithm and showcase effective applications with experiments using real world trading data. The results show that our approach is practical and effective with good performance for various datasets.

  13. Effective Application of Improved Profit-Mining Algorithm for the Interday Trading Model

    PubMed Central

    Wu, Jungpin

    2014-01-01

    Many real world applications of association rule mining from large databases help users make better decisions. However, they do not work well in financial markets at this time. In addition to a high profit, an investor also looks for a low risk trading with a better rate of winning. The traditional approach of using minimum confidence and support thresholds needs to be changed. Based on an interday model of trading, we proposed effective profit-mining algorithms which provide investors with profit rules including information about profit, risk, and winning rate. Since profit-mining in the financial market is still in its infant stage, it is important to detail the inner working of mining algorithms and illustrate the best way to apply them. In this paper we go into details of our improved profit-mining algorithm and showcase effective applications with experiments using real world trading data. The results show that our approach is practical and effective with good performance for various datasets. PMID:24688442
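
    As a point of reference for the "minimum confidence and support thresholds" mentioned in the abstract above, the following is a minimal Python sketch of how those two classical measures are usually computed for a rule. It does not reproduce the authors' profit-mining algorithm, and the function names, the event labels, and the toy trading days are all hypothetical.

```python
from typing import AbstractSet, Iterable, Sequence


def support(transactions: Sequence[AbstractSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(transactions: Sequence[AbstractSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    a = frozenset(antecedent)
    base = support(transactions, a)
    if base == 0.0:
        return 0.0
    return support(transactions, a | frozenset(consequent)) / base


def passes_thresholds(transactions: Sequence[AbstractSet[str]],
                      antecedent: Iterable[str],
                      consequent: Iterable[str],
                      min_support: float,
                      min_confidence: float) -> bool:
    """Classical filter: keep a rule only if both thresholds are met."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)


# Hypothetical toy data: each set lists the events observed on one trading day.
DAYS = [
    frozenset({"gap_up", "high_volume", "close_up"}),
    frozenset({"gap_up", "close_up"}),
    frozenset({"gap_up", "high_volume"}),
    frozenset({"high_volume", "close_up"}),
]

if __name__ == "__main__":
    print(support(DAYS, {"gap_up"}))
    print(confidence(DAYS, {"gap_up"}, {"close_up"}))
```

    Neither measure says anything about profit, risk, or winning rate, which is consistent with the abstract's argument that these thresholds alone are a poor fit for trading rules.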

  14. Computer finds ore

    NASA Astrophysics Data System (ADS)

    Bell, Peter M.

    Artificial intelligence techniques are being used for the first time to evaluate geophysical, geochemical, and geologic data and theory in order to locate ore deposits. After several years of development, an intelligent computer code has been formulated and applied to the Mount Tolman area in Washington state. In a project funded by the United States Geological Survey and the National Science Foundation a set of computer programs, under the general title Prospector, was used successfully to locate a previously unknown ore-grade porphyry molybdenum deposit in the vicinity of Mount Tolman (Science, Sept. 3, 1982). The general area of the deposit had been known to contain exposures of porphyry mineralization. Between 1964 and 1978, exploration surveys had been run by the Bear Creek Mining Company, and later exploration was done in the area by the Amax Corporation. Some of the geophysical data and geochemical and other prospecting surveys were incorporated into the programs, and mine exploration specialists contributed to a set of rules for Prospector. The rules were encoded as ‘inference networks’ to form the ‘expert system’ on which the artificial intelligence codes were based. The molybdenum ore deposit discovered by the test is large, located subsurface, and has an areal extent of more than 18 km².

  15. Assessing Lightning and Wildfire Hazard by Land Properties and Cloud to Ground Lightning Data with Association Rule Mining in Alberta, Canada

    PubMed Central

    Cha, DongHwan; Wang, Xin; Kim, Jeong Woo

    2017-01-01

    Hotspot analysis was implemented to find regions in the province of Alberta (Canada) with high frequency Cloud to Ground (CG) lightning strikes clustered together. Generally, hotspot regions are located in the central, central east, and south central regions of the study region. About 94% of annual lightning occurred during warm months (June to August) and the daily lightning frequency was influenced by the diurnal heating cycle. The association rule mining technique was used to investigate frequent CG lightning patterns, which were verified by similarity measurement to check the patterns’ consistency. The similarity coefficient values indicated that there were high correlations throughout the entire study period. Most wildfires (about 93%) in Alberta occurred in forests, wetland forests, and wetland shrub areas. It was also found that lightning and wildfires occur in two distinct areas: frequent wildfire regions with a high frequency of lightning, and frequent wildfire regions with a low frequency of lightning. Further, the preference index (PI) revealed locations where the wildfires occurred more frequently than in other class regions. The wildfire hazard area was estimated with the CG lightning hazard map and specific land use types. PMID:29065564

  16. Assessing Lightning and Wildfire Hazard by Land Properties and Cloud to Ground Lightning Data with Association Rule Mining in Alberta, Canada.

    PubMed

    Cha, DongHwan; Wang, Xin; Kim, Jeong Woo

    2017-10-23

    Hotspot analysis was implemented to find regions in the province of Alberta (Canada) with high frequency Cloud to Ground (CG) lightning strikes clustered together. Generally, hotspot regions are located in the central, central east, and south central regions of the study region. About 94% of annual lightning occurred during warm months (June to August) and the daily lightning frequency was influenced by the diurnal heating cycle. The association rule mining technique was used to investigate frequent CG lightning patterns, which were verified by similarity measurement to check the patterns' consistency. The similarity coefficient values indicated that there were high correlations throughout the entire study period. Most wildfires (about 93%) in Alberta occurred in forests, wetland forests, and wetland shrub areas. It was also found that lightning and wildfires occur in two distinct areas: frequent wildfire regions with a high frequency of lightning, and frequent wildfire regions with a low frequency of lightning. Further, the preference index (PI) revealed locations where the wildfires occurred more frequently than in other class regions. The wildfire hazard area was estimated with the CG lightning hazard map and specific land use types.

  17. Extracting Cross-Ontology Weighted Association Rules from Gene Ontology Annotations.

    PubMed

    Agapito, Giuseppe; Milano, Marianna; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-01-01

    Gene Ontology (GO) is a structured repository of concepts (GO Terms) that are associated with one or more gene products through a process referred to as annotation. The analysis of annotated data is an important opportunity for bioinformatics. Among the different approaches to analysis, association rules (AR) provide useful knowledge by discovering previously unknown, biologically relevant associations between GO terms. In a previous work, we introduced GO-WAR (Gene Ontology-based Weighted Association Rules), a methodology for extracting weighted association rules from ontology-based annotated datasets. We here adapt the GO-WAR algorithm to mine cross-ontology association rules, i.e., rules that involve GO terms present in the three sub-ontologies of GO. We conduct a deep performance evaluation of GO-WAR by mining publicly available GO annotated datasets, showing how GO-WAR outperforms current state of the art approaches.

  18. 76 FR 41411 - West Virginia Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-14

    ... of Environmental Protection (WVDEP). The interim rule provided an opportunity for public comment and... 30 CFR Part 948 Intergovernmental relations, Surface mining, Underground mining. Dated: July 5, 2011...

  19. [Characteristics of acupoint selection of acupuncture-moxibustion for vertigo in history: a data mining research].

    PubMed

    Li, Xiang; Shou, Yi-Xia; Ren, Yu-Lan; Liang, Fan-Rong

    2014-05-01

    A data mining technique is adopted to analyze the characteristics and rules of acupoint and meridian selection in acupuncture-moxibustion treatment of vertigo across different periods of ancient times. The data are collected from literature on acupuncture-moxibustion from the pre-Qin period to the end of the Qing Dynasty, so as to establish a clinical literature database of ancient acupuncture-moxibustion for treatment of vertigo. The data mining method is applied to analyze the commonly used meridians, acupoints and special acupoints in different dynasties, and possible rules are also explored. A total of 82 acupuncture-moxibustion prescriptions for treatment of vertigo are included. Historically, the leading acupoints selected are Fengchi (GB 20), Hegu (LI 4), Shangxing (GV 23) and Jiexi (ST 41), while the meridians selected are mainly the three yang meridians of foot and the Governor Vessel; in particular, the acupoints on the Bladder Meridian of foot yangming had the highest utilization rate, accounting for 23.04%. The acupoint selection is characterized by special acupoints, accounting for 80.6%, among which the crossing points are the most common choice. Distal-proximal acupoint combination is the most frequent method. The results indicate that ancient acupuncture-moxibustion for treatment of vertigo focused on acupoints in the yang meridians, and that the specific acupoints play an essential role in prescription; the principle of syndrome differentiation and selecting acupoints along the meridians can also be seen.

  20. 30 CFR 912.700 - Idaho Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE IDAHO § 912.700 Idaho Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Idaho...

  1. 30 CFR 905.700 - California Federal Program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE CALIFORNIA § 905.700 California Federal Program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  2. 30 CFR 947.700 - Washington Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE WASHINGTON § 947.700 Washington Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  3. 30 CFR 922.700 - Michigan Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE MICHIGAN § 922.700 Michigan Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  4. 30 CFR 910.700 - Georgia Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE GEORGIA § 910.700 Georgia Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Georgia...

  5. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE OREGON § 937.700 Oregon Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Oregon...

  6. 30 CFR 942.700 - Tennessee Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE TENNESSEE § 942.700 Tennessee Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  7. Co-evolutionary data mining for fuzzy rules: automatic fitness function creation phase space, and experiments

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Blank, Joseph A.

    2003-03-01

    An approach is being explored that involves embedding a fuzzy logic based resource manager in an electronic game environment. Game agents can function under their own autonomous logic or human control. This approach automates the data mining problem. The game automatically creates a cleansed database reflecting the domain expert's knowledge, it calls a data mining function, a genetic algorithm, for data mining of the data base as required and allows easy evaluation of the information extracted. The co-evolutionary fitness functions, chromosomes and stopping criteria for ending the game are discussed. Genetic algorithm and genetic program based data mining procedures are discussed that automatically discover new fuzzy rules and strategies. The strategy tree concept and its relationship to co-evolutionary data mining are examined as well as the associated phase space representation of fuzzy concepts. The overlap of fuzzy concepts in phase space reduces the effective strategies available to adversaries. Co-evolutionary data mining alters the geometric properties of the overlap region known as the admissible region of phase space significantly enhancing the performance of the resource manager. Procedures for validation of the information data mined are discussed and significant experimental results provided.

  8. Object-Driven and Temporal Action Rules Mining

    ERIC Educational Resources Information Center

    Hajja, Ayman

    2013-01-01

    In this thesis, I present my complete research work in the field of action rules, more precisely object-driven and temporal action rules. The drive behind the introduction of object-driven and temporally based action rules is to bring forth an adapted approach to extract action rules from a subclass of systems that have a specific nature, in which…

  9. Mining Hesitation Information by Vague Association Rules

    NASA Astrophysics Data System (ADS)

    Lu, An; Ng, Wilfred

    In many online shopping applications, such as Amazon and eBay, traditional Association Rule (AR) mining has limitations as it only deals with the items that are sold but ignores the items that are almost sold (for example, those items that are put into the basket but not checked out). We say that those almost sold items carry hesitation information, since customers are hesitating to buy them. The hesitation information of items is valuable knowledge for the design of good selling strategies. However, there is no conceptual model that is able to capture different statuses of hesitation information. Herein, we apply and extend vague set theory in the context of AR mining. We define the concepts of attractiveness and hesitation of an item, which represent the overall information of a customer's intent on an item. Based on the two concepts, we propose the notion of Vague Association Rules (VARs). We devise an efficient algorithm to mine the VARs. Our experiments show that our algorithm is efficient and the VARs capture more specific and richer information than do the traditional ARs.

  10. Intertransaction Class Association Rule Mining Based on Genetic Network Programming and Its Application to Stock Market Prediction

    NASA Astrophysics Data System (ADS)

    Yang, Yuchen; Mabu, Shingo; Shimada, Kaoru; Hirasawa, Kotaro

    Intertransaction association rules have been reported to be useful in many fields such as stock market prediction, but still there are not so many efficient methods to dig them out from large data sets. Furthermore, how to use and measure these more complex rules should be considered carefully. In this paper, we propose a new intertransaction class association rule mining method based on Genetic Network Programming (GNP), which has the ability to overcome some shortages of Apriori-like based intertransaction association methods. Moreover, a general classifier model for intertransaction rules is also introduced. In experiments on the real world application of stock market prediction, the method shows its efficiency and ability to obtain good results and can bring more benefits with a suitable classifier considering larger interval span.

  11. Evolutionary Data Mining Approach to Creating Digital Logic

    DTIC Science & Technology

    2010-01-01

    To deal with this problem a genetic program (GP) based data mining (DM) procedure has been invented (Smith 2005). A genetic program is an algorithm...that can operate on the variables. When a GP was used as a DM function in the past to automatically create fuzzy decision trees, the Report...rules represents an approach to the determining the effect of linguistic imprecision, i.e., the inability of experts to provide crisp rules. The

  12. Genetic Algorithm Calibration of Probabilistic Cellular Automata for Modeling Mining Permit Activity

    USGS Publications Warehouse

    Louis, S.J.; Raines, G.L.

    2003-01-01

    We use a genetic algorithm to calibrate a spatially and temporally resolved cellular automaton to model mining activity on public land in Idaho and western Montana. The genetic algorithm searches through a space of transition rule parameters of a two-dimensional cellular automaton model to find rule parameters that fit observed mining activity data. Previous work by one of the authors in calibrating the cellular automaton took weeks; the genetic algorithm takes a day and produces rules leading to about the same (or better) fit to observed data. These preliminary results indicate that genetic algorithms are a viable tool in calibrating cellular automata for this application. Experience gained during the calibration of this cellular automaton suggests that mineral resource information is a critical factor in the quality of the results. With automated calibration, further refinements of how the mineral-resource information is provided to the cellular automaton will probably improve our model.

  13. Rescue complex for coal mines

    NASA Astrophysics Data System (ADS)

    Yungmeyster, D. A.; Urazbakhtin, R. Yu

    2017-10-01

    The mining industry has always been potentially dangerous; even with the use of modern equipment in mines, accidents continue to occur, including catastrophic ones. Accidents in mines are due to the presence of specific features in the conduct of mining operations. These include the inconsistency of mining and geological conditions, the contamination of the mine atmosphere due to the release of gases from minerals, and the presence of self-igniting coal strata, which creates the danger of underground fires and gas explosions. The main cause of accidents is the irresponsibility of both the manager and the personnel who violate the safety rules during mining operations.

  14. PKDE4J: Entity and relation extraction for public knowledge discovery.

    PubMed

    Song, Min; Kim, Won Chul; Lee, Dahee; Heo, Go Eun; Kang, Keun Young

    2015-10-01

    Due to an enormous number of scientific publications that cannot be handled manually, there is a rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have primarily focused on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with the Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also has fairly good performance in terms of accuracy as well as the ability to configure text-processing components. We demonstrate its competitive performance by evaluating it on many corpora and found that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. A Novel Hybrid Intelligent Indoor Location Method for Mobile Devices by Zones Using Wi-Fi Signals

    PubMed Central

    Castañón–Puga, Manuel; Salazar, Abby Stephanie; Aguilar, Leocundo; Gaxiola-Pacheco, Carelia; Licea, Guillermo

    2015-01-01

    The increasing use of mobile devices in indoor spaces brings challenges to location methods. This work presents a hybrid intelligent method based on data mining and Type-2 fuzzy logic to locate mobile devices in an indoor space by zones using Wi-Fi signals from selected access points (APs). This approach takes advantage of wireless local area networks (WLANs) over other types of architectures and implements the complete method in a mobile application using the developed tools. Besides, the proposed approach is validated by experimental data obtained from case studies and the cross-validation technique. For the purpose of generating the fuzzy rules that conform to the Takagi–Sugeno fuzzy system structure, a semi-supervised data mining technique called subtractive clustering is used. This algorithm finds centers of clusters from the radius map given by the collected signals from APs. Measurements of Wi-Fi signals can be noisy due to several factors mentioned in this work, so this method proposed the use of Type-2 fuzzy logic for modeling and dealing with such uncertain information. PMID:26633417

  16. A Novel Hybrid Intelligent Indoor Location Method for Mobile Devices by Zones Using Wi-Fi Signals.

    PubMed

    Castañón-Puga, Manuel; Salazar, Abby Stephanie; Aguilar, Leocundo; Gaxiola-Pacheco, Carelia; Licea, Guillermo

    2015-12-02

    The increasing use of mobile devices in indoor spaces brings challenges to location methods. This work presents a hybrid intelligent method based on data mining and Type-2 fuzzy logic to locate mobile devices in an indoor space by zones using Wi-Fi signals from selected access points (APs). This approach takes advantage of wireless local area networks (WLANs) over other types of architectures and implements the complete method in a mobile application using the developed tools. Besides, the proposed approach is validated by experimental data obtained from case studies and the cross-validation technique. For the purpose of generating the fuzzy rules that conform to the Takagi-Sugeno fuzzy system structure, a semi-supervised data mining technique called subtractive clustering is used. This algorithm finds centers of clusters from the radius map given by the collected signals from APs. Measurements of Wi-Fi signals can be noisy due to several factors mentioned in this work, so this method proposed the use of Type-2 fuzzy logic for modeling and dealing with such uncertain information.
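
    The subtractive clustering step named in the abstract above can be sketched as follows. This is a simplified illustration of the general technique (each point gets a density "potential", the highest peak becomes a center, and potential near that center is then subtracted), not the authors' implementation. The function name, the parameter names `ra`, `squash` and `accept_ratio`, the single-threshold stopping rule, and the sample points are assumptions made for this sketch; inputs are assumed to be scaled to a unit range beforehand.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _sqdist(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def subtractive_clustering(points: Sequence[Point],
                           ra: float = 0.5,
                           squash: float = 1.5,
                           accept_ratio: float = 0.15) -> List[Point]:
    """Return cluster centers chosen from `points`.

    ra           -- neighborhood radius used to measure density
    squash       -- the subtraction radius is squash * ra
    accept_ratio -- stop once the best remaining potential falls below
                    this fraction of the first peak (simplified rule)
    """
    if not points:
        return []
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2

    # Initial potential: how densely each point is surrounded by others.
    potential = [sum(math.exp(-alpha * _sqdist(p, q)) for q in points)
                 for p in points]
    first_peak = max(potential)
    centers: List[Point] = []

    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        if peak < accept_ratio * first_peak:
            break
        centers.append(points[k])
        # Reduce potential near the new center so the next peak lies elsewhere.
        potential = [pot - peak * math.exp(-beta * _sqdist(p, points[k]))
                     for p, pot in zip(points, potential)]
    return centers


if __name__ == "__main__":
    # Hypothetical normalized signal-strength pairs from two access points.
    samples = [(0.0, 0.0), (0.05, 0.0), (0.0, 0.05),
               (1.0, 1.0), (0.95, 1.0), (1.0, 0.95)]
    print(subtractive_clustering(samples))
```

    In a Takagi-Sugeno setting, each returned center would typically seed one fuzzy rule, so the radius `ra` indirectly controls how many rules are generated.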

  17. Java implementation of Class Association Rule algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tamura, Makio

    2007-08-30

    Java implementation of three Class Association Rule mining algorithms: NETCAR, CARapriori, and clustering-based rule mining. NETCAR is a novel algorithm developed by Makio Tamura. The algorithm is discussed in a paper, UCRL-JRNL-232466-DRAFT, which would be published in a peer-reviewed scientific journal. The software is used to extract combinations of genes relevant to a phenotype from a phylogenetic profile and a phenotype profile. The phylogenetic profile is represented by a binary matrix and the phenotype profile is represented by a binary vector. The present application of this software will be in genome analysis; however, it could be applied more generally.

  18. NIOSH comments to DOL on the Mine Safety and Health Administration's proposed rule on air quality, chemical substances, and respiratory protection standards by J. D. Millar, March 1, 1990

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    The testimony concerns the views of NIOSH regarding the Mine Safety and Health Administration (MSHA) proposed rule on permissible exposure limits; exposure monitoring; abrasive blasting; drill dust control; dangerous atmospheres; and prohibited areas for food and beverages. NIOSH continues to endorse the recommended exposure limit of 1 part per million (ppm) as a 15 minute short term exposure limit for nitrogen-dioxide (10102440). NIOSH supports MSHA in proposing an 8 hour time weighted average of 25ppm for nitric-oxide (10102439). NIOSH supports MSHA in proposing a limit of 35ppm as an 8 hour time weighted average (TWA) for carbon-monoxide (630080) and recommends that sulfur-dioxide (7446095) exposure be limited to 0.5ppm as an 8 hour TWA. NIOSH recommends that routine air monitoring be required on a periodic basis. NIOSH recommends that mine operators be required to establish a written exposure monitoring plan for each facility that outlines where area and personal samples should be taken, how many samples should be taken, and the implementation of the remaining portions of the proposed rule change. NIOSH supports the rules for abrasive blasting for both coal and metal/nonmetal mines and has identified several substitutive materials for silica sand that could be used in abrasive blasting.
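
    The 8 hour time weighted average (TWA) that these limits refer to is conventionally the sum of each measured concentration multiplied by its exposure duration, divided by the 8 hour reference period. The short Python sketch below works through one example. The function name and the sample readings are hypothetical; only the 35 ppm carbon monoxide figure is taken from the abstract.

```python
from typing import Iterable, Tuple


def time_weighted_average(samples: Iterable[Tuple[float, float]],
                          shift_hours: float = 8.0) -> float:
    """Return sum(concentration * hours) / shift_hours.

    `samples` holds (concentration_ppm, duration_hours) pairs. Time not
    covered by any sample counts as zero exposure.
    """
    if shift_hours <= 0:
        raise ValueError("shift_hours must be positive")
    total = 0.0
    for concentration_ppm, hours in samples:
        if concentration_ppm < 0 or hours < 0:
            raise ValueError("concentrations and durations must be non-negative")
        total += concentration_ppm * hours
    return total / shift_hours


CO_LIMIT_PPM = 35.0  # proposed 8 hour TWA for carbon monoxide, per the abstract

# Hypothetical readings: 2 h at 50 ppm, 4 h at 30 ppm, 2 h at 20 ppm.
readings = [(50.0, 2.0), (30.0, 4.0), (20.0, 2.0)]

if __name__ == "__main__":
    twa = time_weighted_average(readings)  # (100 + 120 + 40) / 8 = 32.5
    print(twa, "ppm; within limit:", twa <= CO_LIMIT_PPM)
```

    The example shows why a TWA limit differs from a short term limit: the 50 ppm reading exceeds 35 ppm for two hours, yet the full-shift average of 32.5 ppm stays under it.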

  19. 30 CFR 77.1600 - Loading and haulage; general.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... permitted on haulage roads and at loading or dumping locations. (b) Traffic rules, signals, and warning signs shall be standardized at each mine and posted. (c) Where side or overhead clearances on any haulage road or at any loading or dumping location at the mine are hazardous to mine workers, such areas...

  20. 30 CFR 77.1600 - Loading and haulage; general.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... permitted on haulage roads and at loading or dumping locations. (b) Traffic rules, signals, and warning signs shall be standardized at each mine and posted. (c) Where side or overhead clearances on any haulage road or at any loading or dumping location at the mine are hazardous to mine workers, such areas...

  1. 30 CFR 77.1600 - Loading and haulage; general.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... permitted on haulage roads and at loading or dumping locations. (b) Traffic rules, signals, and warning signs shall be standardized at each mine and posted. (c) Where side or overhead clearances on any haulage road or at any loading or dumping location at the mine are hazardous to mine workers, such areas...

  2. 30 CFR 77.1600 - Loading and haulage; general.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... permitted on haulage roads and at loading or dumping locations. (b) Traffic rules, signals, and warning signs shall be standardized at each mine and posted. (c) Where side or overhead clearances on any haulage road or at any loading or dumping location at the mine are hazardous to mine workers, such areas...

  3. 30 CFR 77.1600 - Loading and haulage; general.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... permitted on haulage roads and at loading or dumping locations. (b) Traffic rules, signals, and warning signs shall be standardized at each mine and posted. (c) Where side or overhead clearances on any haulage road or at any loading or dumping location at the mine are hazardous to mine workers, such areas...

  4. 30 CFR 944.30 - State-Federal Cooperative Agreement.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Division of Oil, Gas, and Mining (DOGM) will be responsible for administering this Agreement on behalf of..., Final Rules of the Board of Oil, Gas and Mining, UMC/SMC 700 et seq. [52 FR 7850, Mar. 13, 1987] ... INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE UTAH § 944.30 State...

  5. 30 CFR 944.30 - State-Federal Cooperative Agreement.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Division of Oil, Gas, and Mining (DOGM) will be responsible for administering this Agreement on behalf of..., Final Rules of the Board of Oil, Gas and Mining, UMC/SMC 700 et seq. [52 FR 7850, Mar. 13, 1987] ... INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE UTAH § 944.30 State...

  6. 30 CFR 944.30 - State-Federal Cooperative Agreement.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Division of Oil, Gas, and Mining (DOGM) will be responsible for administering this Agreement on behalf of..., Final Rules of the Board of Oil, Gas and Mining, UMC/SMC 700 et seq. [52 FR 7850, Mar. 13, 1987] ... INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE UTAH § 944.30 State...

  7. 30 CFR 944.30 - State-Federal Cooperative Agreement.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Division of Oil, Gas, and Mining (DOGM) will be responsible for administering this Agreement on behalf of..., Final Rules of the Board of Oil, Gas and Mining, UMC/SMC 700 et seq. [52 FR 7850, Mar. 13, 1987] ... INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE UTAH § 944.30 State...

  8. Constructing Compact Takagi-Sugeno Rule Systems: Identification of Complex Interactions in Epidemiological Data

    PubMed Central

    Zhou, Shang-Ming; Lyons, Ronan A.; Brophy, Sinead; Gravenor, Mike B.

    2012-01-01

    The Takagi-Sugeno (TS) fuzzy rule system is a widely used data mining technique, and is of particular use in the identification of non-linear interactions between variables. However the number of rules increases dramatically when applied to high dimensional data sets (the curse of dimensionality). Few robust methods are available to identify important rules while removing redundant ones, and this results in limited applicability in fields such as epidemiology or bioinformatics where the interaction of many variables must be considered. Here, we develop a new parsimonious TS rule system. We propose three statistics: R, L, and ω-values, to rank the importance of each TS rule, and a forward selection procedure to construct a final model. We use our method to predict how key components of childhood deprivation combine to influence educational achievement outcome. We show that a parsimonious TS model can be constructed, based on a small subset of rules, that provides an accurate description of the relationship between deprivation indices and educational outcomes. The selected rules shed light on the synergistic relationships between the variables, and reveal that the effect of targeting specific domains of deprivation is crucially dependent on the state of the other domains. Policy decisions need to incorporate these interactions, and deprivation indices should not be considered in isolation. The TS rule system provides a basis for such decision making, and has wide applicability for the identification of non-linear interactions in complex biomedical data. PMID:23272108

  9. Constructing compact Takagi-Sugeno rule systems: identification of complex interactions in epidemiological data.

    PubMed

    Zhou, Shang-Ming; Lyons, Ronan A; Brophy, Sinead; Gravenor, Mike B

    2012-01-01

    The Takagi-Sugeno (TS) fuzzy rule system is a widely used data mining technique, and is of particular use in the identification of non-linear interactions between variables. However the number of rules increases dramatically when applied to high dimensional data sets (the curse of dimensionality). Few robust methods are available to identify important rules while removing redundant ones, and this results in limited applicability in fields such as epidemiology or bioinformatics where the interaction of many variables must be considered. Here, we develop a new parsimonious TS rule system. We propose three statistics: R, L, and ω-values, to rank the importance of each TS rule, and a forward selection procedure to construct a final model. We use our method to predict how key components of childhood deprivation combine to influence educational achievement outcome. We show that a parsimonious TS model can be constructed, based on a small subset of rules, that provides an accurate description of the relationship between deprivation indices and educational outcomes. The selected rules shed light on the synergistic relationships between the variables, and reveal that the effect of targeting specific domains of deprivation is crucially dependent on the state of the other domains. Policy decisions need to incorporate these interactions, and deprivation indices should not be considered in isolation. The TS rule system provides a basis for such decision making, and has wide applicability for the identification of non-linear interactions in complex biomedical data.
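
    For readers unfamiliar with how a Takagi-Sugeno rule system produces a prediction, the following is a minimal Python sketch of first-order TS inference: each rule has a firing strength from its membership functions and a linear consequent, and the output is the firing-strength-weighted average of the consequents. Gaussian membership functions, the class and function names, and the two example rules are assumptions for illustration; the R, L, and ω-value statistics and the forward selection procedure described in the abstract are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TSRule:
    """IF x_j is Gaussian(centers[j], widths[j]) for all j
    THEN y = bias + sum_j coeffs[j] * x_j."""
    centers: Sequence[float]
    widths: Sequence[float]
    coeffs: Sequence[float]
    bias: float

    def firing_strength(self, x: Sequence[float]) -> float:
        # Product of per-input Gaussian membership degrees.
        return math.prod(
            math.exp(-0.5 * ((xi - c) / s) ** 2)
            for xi, c, s in zip(x, self.centers, self.widths)
        )

    def consequent(self, x: Sequence[float]) -> float:
        # Linear (first-order) rule output.
        return self.bias + sum(a * xi for a, xi in zip(self.coeffs, x))


def ts_predict(rules: Sequence[TSRule], x: Sequence[float]) -> float:
    """Weighted average of rule consequents, weighted by firing strength."""
    weights = [rule.firing_strength(x) for rule in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(w * rule.consequent(x) for w, rule in zip(weights, rules)) / total


if __name__ == "__main__":
    # Two hypothetical rules over one input scaled to [0, 1].
    rules = [
        TSRule(centers=[0.0], widths=[0.5], coeffs=[0.0], bias=0.0),
        TSRule(centers=[1.0], widths=[0.5], coeffs=[0.0], bias=10.0),
    ]
    print(ts_predict(rules, [0.5]))
```

    Because every input dimension multiplies into the firing strength, a full grid of rules grows exponentially with the number of inputs, which is the curse of dimensionality the abstract sets out to tame by keeping only a small subset of rules.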

  10. Visual exploration and analysis of human-robot interaction rules

    NASA Astrophysics Data System (ADS)

    Zhang, Hui; Boyles, Michael J.

    2013-01-01

    We present a novel interaction paradigm for the visual exploration, manipulation and analysis of human-robot interaction (HRI) rules; our development is implemented using a visual programming interface and exploits key techniques drawn from both information visualization and visual data mining to facilitate the interaction design and knowledge discovery process. HRI is often concerned with manipulations of multi-modal signals, events, and commands that form various kinds of interaction rules. Depicting, manipulating and sharing such design-level information is a compelling challenge. Furthermore, the closed loop between HRI programming and knowledge discovery from empirical data is a relatively long cycle. This, in turn, makes design-level verification nearly impossible to perform in an earlier phase. In our work, we exploit a drag-and-drop user interface and visual languages to support depicting responsive behaviors from social participants when they interact with their partners. For our principal test case of gaze-contingent HRI interfaces, this permits us to program and debug the robots' responsive behaviors through a graphical data-flow chart editor. We exploit additional program manipulation interfaces to provide still further improvement to our programming experience: by simulating the interaction dynamics between a human and a robot behavior model, we allow the researchers to generate, trace and study the perception-action dynamics with a social interaction simulation to verify and refine their designs. Finally, we extend our visual manipulation environment with a visual data-mining tool that allows the user to investigate interesting phenomena such as joint attention and sequential behavioral patterns from multiple multi-modal data streams. We have created instances of HRI interfaces to evaluate and refine our development paradigm. 
    As far as we are aware, this paper reports the first program manipulation paradigm that integrates visual programming interfaces, information visualization, and visual data mining methods to facilitate designing, comprehending, and evaluating HRI interfaces.

  11. 43 CFR 4.1272 - Interlocutory appeals.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... PROCEDURES Special Rules Applicable to Surface Coal Mining Hearings and Appeals Appeals to the Board from... modification of the administrative law judge's interlocutory ruling or order, the jurisdiction of the Board...

  12. Predicting mining activity with parallel genetic algorithms

    USGS Publications Warehouse

    Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.

    2005-01-01

    We explore several different techniques in our quest to improve the overall model performance of a genetic algorithm calibrated probabilistic cellular automata. We use the Kappa statistic to measure correlation between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.
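
    As an illustration of the agreement measure mentioned above, the following sketch computes Cohen's kappa between a ground-truth map and a predicted map, both flattened to label sequences. Treating the abstract's "Kappa statistic" as the standard Cohen's kappa is an assumption, and the function name cohen_kappa is hypothetical; the paper's spatially sensitive evaluation function is not described in the abstract and is not shown.

```python
from collections import Counter


def cohen_kappa(truth, predicted):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    # Fraction of cells where the model matches ground truth.
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    truth_counts = Counter(truth)
    pred_counts = Counter(predicted)
    # Agreement expected by chance from the two marginal distributions.
    expected = sum(truth_counts[c] * pred_counts[c] for c in truth_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences contain a single identical label
    return (observed - expected) / (1.0 - expected)
```

    A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why kappa is preferred to raw accuracy when one land class dominates the map.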

  13. Applying data mining techniques to explore factors contributing to occupational injuries in Taiwan's construction industry.

    PubMed

    Cheng, Ching-Wu; Leu, Sou-Sen; Cheng, Ying-Mei; Wu, Tsung-Chih; Lin, Chen-Chung

    2012-09-01

    Construction accident research involves the systematic sorting, classification, and encoding of comprehensive databases of injuries and fatalities. The present study explores the causes and distribution of occupational accidents in the Taiwan construction industry by analyzing such a database using the data mining method known as classification and regression tree (CART). Utilizing a database of 1542 accident cases during the period 2000-2009, the study seeks to establish potential cause-and-effect relationships regarding serious occupational accidents in the industry. The results of this study show that the occurrence rules for falls and collapses in both public and private project construction industries serve as key factors to predict the occurrence of occupational injuries. The results of the study provide a framework for improving the safety practices and training programs that are essential to protecting construction workers from occasional or unexpected accidents. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. [Exploring pharmacological principle of Artemisia carvifolia with textmining technology].

    PubMed

    Zhao, Yu-Ping; Wang, Hui; Yang, Guang; Qiu, Zhi-Dong; Qu, Xiao-Bo; Zhang, Xiao-Bo

    2016-08-01

    To explore the pharmacological principles of Artemisia carvifolia, a text mining technique was used. All references on A. carvifolia were collected from the PubMed database, and the rules of the main ingredients, related diseases, organs, tissues, proteins and metabolites were analyzed. Finally, a network was constructed. The main ingredients were found to include sesquiterpenoids, flavonoids, and volatile oils. Diseases such as malaria, cerebral malaria, falciparum malaria, visceral leishmaniasis and systemic lupus erythematosus were often treated with A. carvifolia. The associated organs were the liver, skin, trachea, lungs, and spleen. The associated tissues were mainly macrophages, T lymphocytes, blood vessels, and epithelial cells. The associated proteins included CYP450, PI3K, TNF-α, AASDPPT, DNA polymerase and so on. A comprehensive and systematic treatment principle of A. carvifolia was obtained by text mining, which was helpful for clinical application. Copyright© by the Chinese Pharmaceutical Association.

  15. Product Recommendation System Based on Personal Preference Model Using CAM

    NASA Astrophysics Data System (ADS)

    Murakami, Tomoko; Yoshioka, Nobukazu; Orihara, Ryohei; Furukawa, Koichi

    A product recommendation system is realized by applying business rules acquired by data mining techniques. Business rules, such as demographic patterns of purchase, can cover groups of users that tend to purchase products, but it is difficult to recommend products adapted to varied personal preferences using such rules alone. In addition, it is very costly to gather the large volume of high-quality survey data necessary for good recommendation based on a personal preference model. A method that collects kansei information automatically, without a questionnaire survey, is required. Constructing a personal preference model from little preference data is also necessary, since it is costly for the user to input such data. In this paper, we propose a product recommendation system based on kansei information extracted by text mining and a user preference model constructed by Category-guided Adaptive Modeling (CAM). CAM is a feature construction method that generates, from some labeled examples, new features spanning a space in which examples with the same label are close together and examples with different labels are far apart. It is possible to construct a personal preference model with CAM despite having little information about liked and disliked categories. In the system, a retrieval agent gathers the products' specifications and a user agent manages the preference model, i.e. the user's likes and dislikes. Kansei information about the products is obtained by applying a text mining technique to reputation documents about the products on the web site. We carry out experimental studies to confirm that the preference model obtained by our method performs effectively.

  16. A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

    ERIC Educational Resources Information Center

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2011-01-01

    Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…

  17. ChemBrowser: a flexible framework for mining chemical documents.

    PubMed

    Wu, Xian; Zhang, Li; Chen, Ying; Rhodes, James; Griffin, Thomas D; Boyer, Stephen K; Alba, Alfredo; Cai, Keke

    2010-01-01

    The ability to extract chemical and biological entities and relations from text documents automatically has great value to biochemical research and development activities. The growing maturity of text mining and artificial intelligence technologies shows promise in enabling such automatic chemical entity extraction capabilities (called "Chemical Annotation" in this paper). Many techniques have been reported in the literature, ranging from dictionary and rule-based techniques to machine learning approaches. In practice, we found that no single technique works well in all cases. A combinatorial approach that allows one to quickly compose different annotation techniques together for a given situation is most effective. In this paper, we describe the key challenges we face in real-world chemical annotation scenarios. We then present a solution called ChemBrowser which has a flexible framework for chemical annotation. ChemBrowser includes a suite of customizable processing units that might be utilized in a chemical annotator, a high-level language that describes the composition of various processing units that would form a chemical annotator, and an execution engine that translates the composition language to an actual annotator that can generate annotation results for a given set of documents. We demonstrate the impact of this approach by tailoring an annotator for extracting chemical names from patent documents and show how this annotator can be easily modified with simple configuration alone.

  18. Documents for SBAR Panel: CERCLA 108(b) Hard Rock Mining Financial Assurance Rule

    EPA Pesticide Factsheets

    SBAR panel documents for small business advocacy review panel on the financial responsibilities of the hard rock mining industry under Section 108(b) of the Comprehensive Environmental Response, Compensation, and Liability Act

  19. Preference Mining Using Neighborhood Rough Set Model on Two Universes.

    PubMed

    Zeng, Kai

    2016-01-01

    Preference mining plays an important role in e-commerce and video websites for enhancing user satisfaction and loyalty. Some classical methods are not available for the cold-start problem when the user or the item is new. In this paper, we propose a new model, called parametric neighborhood rough set on two universes (NRSTU), to describe the user and item data structures. Furthermore, the neighborhood lower approximation operator is used for defining the preference rules. Then, we provide the means for recommending items to users by using these rules. Finally, we give an experimental example to show the details of NRSTU-based preference mining for cold-start problem. The parameters of the model are also discussed. The experimental results show that the proposed method presents an effective solution for preference mining. In particular, NRSTU improves the recommendation accuracy by about 19% compared to the traditional method.

  20. Knowledge mining from clinical datasets using rough sets and backpropagation neural network.

    PubMed

    Nahato, Kindie Biredagn; Harichandran, Khanna Nehemiah; Arputharaj, Kannan

    2015-01-01

    The availability of clinical datasets and knowledge mining methodologies encourages the researchers to pursue research in extracting knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist the clinician in decision making. The objective of this research is to build a classifier that will predict the presence or absence of a disease by learning from the minimal set of attributes that has been extracted from the clinical dataset. In this work rough set indiscernibility relation method with backpropagation neural network (RS-BPNN) is used. This work has two stages. The first stage is handling of missing values to obtain a smooth data set and selection of appropriate attributes from the clinical dataset by indiscernibility relation method. The second stage is classification using backpropagation neural network on the selected reducts of the dataset. The classifier has been tested with hepatitis, Wisconsin breast cancer, and Statlog heart disease datasets obtained from the University of California at Irvine (UCI) machine learning repository. The accuracy obtained from the proposed method is 97.3%, 98.6%, and 90.4% for hepatitis, breast cancer, and heart disease, respectively. The proposed system provides an effective classification model for clinical datasets.

  1. Process Metallurgy an Enabler of Resource Efficiency: Linking Product Design to Metallurgy in Product Centric Recycling

    NASA Astrophysics Data System (ADS)

    Reuter, Markus; van Schaik, Antoinette

    In this paper the link between process metallurgy, classical minerals processing, product centric recycling and urban/landfill mining is discussed. The depth that has to be achieved in urban mining and recycling must draw on the wealth of theoretical knowledge and insight developed in the past in minerals and metallurgical processing. This background teaches that recycling demands a product centric approach, which considers simultaneously the multi-material interactions in man-made complex 'minerals'. Fast innovation in recycling and urban mining can be achieved by building on this well-developed basis and further evolving the techniques and tools that have been developed over the years. This basis has already been used for many years to design, operate and control industrial plants for metal production, and it has also been the basis for Design for Recycling rules for End-of-Life products. Using, among others, the UNEP Metal Recycling report as a basis (the authors are respectively Lead and Main authors of the report), it is demonstrated that a common theoretical basis as developed in metallurgy and minerals processing can do much to level the playing field between primary processing, secondary processing, recycling, urban/landfill mining and product design, hence enhancing resource efficiency. Thus various scales of detail link product design with metallurgical process design and its fundamentals.

  2. Underground Mining Method Selection Using WPM and PROMETHEE

    NASA Astrophysics Data System (ADS)

    Balusa, Bhanu Chander; Singam, Jayanthu

    2018-04-01

    The aim of this paper is to present a solution to the problem of selecting a suitable underground mining method for the mining industry. This is achieved using two multi-attribute decision making techniques: the weighted product method (WPM) and the preference ranking organization method for enrichment evaluation (PROMETHEE). The analytic hierarchy process is used to calculate the weights of the attributes (i.e. the parameters used in this paper). Mining method selection depends on physical, mechanical, economic and technical parameters. The WPM and PROMETHEE techniques are able to consider the relationship between the parameters and the mining methods. The proposed techniques give higher accuracy and faster computation than other decision making techniques. They are applied to determine the most effective mining method for a bauxite mine, and the results are compared with the methods used in earlier research. The results show that the conventional cut and fill method is the most suitable mining method.
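
    The weighted product method named above can be sketched in a few lines. This is a generic textbook form, not the authors' implementation: each alternative is scored as the product of its attribute values raised to the attribute weights, with a negative exponent for cost-type attributes. The function name wpm_scores, the attribute layout and the cost_criteria parameter are hypothetical, and the weights are taken as given rather than derived by the analytic hierarchy process.

```python
def wpm_scores(alternatives, weights, cost_criteria=()):
    """Score each alternative with the weighted product method.

    alternatives:  dict mapping a name to a tuple of positive attribute values
    weights:       one weight per attribute (assumed to be precomputed)
    cost_criteria: indices of attributes where smaller values are better
    """
    scores = {}
    for name, values in alternatives.items():
        score = 1.0
        for j, (value, weight) in enumerate(zip(values, weights)):
            # Cost attributes get a negative exponent so that large values
            # reduce the score instead of increasing it.
            exponent = -weight if j in cost_criteria else weight
            score *= value ** exponent
        scores[name] = score
    return scores
```

    The alternative with the largest score is preferred, for example max(scores, key=scores.get). Because the scores are products, attribute values must be strictly positive.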

  3. Weighted Association Rule Mining for Item Groups with Different Properties and Risk Assessment for Networked Systems

    NASA Astrophysics Data System (ADS)

    Kim, Jungja; Ceong, Heetaek; Won, Yonggwan

    In market-basket analysis, weighted association rule (WAR) discovery can mine the rules that include more beneficial information by reflecting item importance for special products. In the point-of-sale database, each transaction is composed of items with similar properties, and item weights are pre-defined and fixed by a factor such as the profit. However, when items are divided into more than one group and the item importance must be measured independently for each group, traditional weighted association rule discovery cannot be used. To solve this problem, we propose a new weighted association rule mining methodology. The items should be first divided into subgroups according to their properties, and the item importance, i.e. item weight, is defined or calculated only with the items included in the subgroup. Then, transaction weight is measured by appropriately summing the item weights from each subgroup, and the weighted support is computed as the fraction of the transaction weights that contains the candidate items relative to the weight of all transactions. As an example, our proposed methodology is applied to assess the vulnerability to threats of computer systems that provide networked services. Our algorithm provides both quantitative risk-level values and qualitative risk rules for the security assessment of networked computer systems using WAR discovery. Also, it can be widely used for new applications with many data sets in which the data items are distinctly separated.
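
    The weighted support defined above can be sketched as follows. The sketch assumes that item weights have already been computed within each subgroup and that "appropriately summing" means a plain sum, which the abstract does not confirm. The function names and the sample data are hypothetical.

```python
def transaction_weight(transaction, item_weights):
    """Weight of one transaction: the sum of its item weights (assumed rule)."""
    return sum(item_weights[item] for item in transaction)


def weighted_support(candidate, transactions, item_weights):
    """Weight of transactions containing the candidate / weight of all."""
    candidate = set(candidate)
    total = sum(transaction_weight(t, item_weights) for t in transactions)
    if total == 0:
        return 0.0
    covered = sum(
        transaction_weight(t, item_weights)
        for t in transactions
        if candidate <= set(t)  # the transaction contains every candidate item
    )
    return covered / total


# Hypothetical data: items "a" and "b" form one subgroup, "x" another.
ITEM_WEIGHTS = {"a": 0.5, "b": 0.5, "x": 1.0}
TRANSACTIONS = [{"a", "x"}, {"b"}, {"a", "b", "x"}]
```

    With this data the transaction weights are 1.5, 0.5 and 2.0, so weighted_support({"a", "x"}, TRANSACTIONS, ITEM_WEIGHTS) is 3.5 / 4.0 = 0.875, compared with an unweighted support of 2/3.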

  4. An efficient incremental learning mechanism for tracking concept drift in spam filtering

    PubMed Central

    Sheu, Jyh-Jian; Chu, Ko-Tsung; Li, Nien-Feng; Lee, Cheng-Chi

    2017-01-01

    This research conducts an in-depth analysis of knowledge about spam and proposes an efficient spam filtering method able to adapt to a dynamic environment. We focus on the analysis of email headers and apply a decision tree data mining technique to look for association rules about spam. We then propose an efficient systematic filtering method based on these association rules. Our method has the following major advantages: (1) It checks only the header sections of emails, unlike current spam filtering methods that have to analyze the email's content in full. Meanwhile, the email filtering accuracy is expected to be enhanced. (2) To address the problem of concept drift, we propose a window-based technique to estimate the condition of concept drift for each unknown email, which helps our filtering method recognize the occurrence of spam. (3) We propose an incremental learning mechanism for our filtering method to strengthen its ability to adapt to the dynamic environment. PMID:28182691

  5. A review of approaches to identifying patient phenotype cohorts using electronic health records

    PubMed Central

    Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric; Embi, Peter J; Elhadad, Noemie; Johnson, Stephen B; Lai, Albert M

    2014-01-01

    Objective: To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods: We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results: Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion: We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions: There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses. PMID:24201027

  6. [Research of bleeding volume and method in blood-letting acupuncture therapy based on data mining].

    PubMed

    Liu, Xin; Jia, Chun-Sheng; Wang, Jian-Ling; Du, Yu-Zhu; Zhang, Xiao-Xu; Shi, Jing; Li, Xiao-Feng; Sun, Yan-Hui; Zhang, Shen; Zhang, Xuan-Ping; Gang, Wei-Juan

    2014-03-01

    Using computer-based technology and data mining methods, with cases of blood-letting acupuncture therapy from the collected literature as sample data, association rules were applied. The data were entered into a self-built database platform, organized and summarized, and the required data were then extracted to mine the bleeding volume and method in blood-letting acupuncture therapy, summarizing its application rules and clinical value to better guide clinical practice. There were 9 kinds of blood-letting tools in the literature, among which the three-edge needle was the most frequent, accounting for 84.4% (1239/1468). The bleeding volume was classified into six levels, among which a small volume (less than 0.1 mL) had the highest frequency (401 times). According to the data mining results, blood-letting acupuncture therapy is widely applied in clinical acupuncture practice, with the three-edge needle and a small bleeding volume (less than 0.1 mL) being the most common; however, there was no central tendency in general.

  7. 20 CFR 410.681 - Change of ruling or legal precedent.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 20 Employees' Benefits 2 2011-04-01 2011-04-01 false Change of ruling or legal precedent. 410.681 Section 410.681 Employees' Benefits SOCIAL SECURITY ADMINISTRATION FEDERAL COAL MINE HEALTH AND SAFETY ACT..., Administrative Review, Finality of Decisions, and Representation of Parties § 410.681 Change of ruling or legal...

  8. 20 CFR 410.681 - Change of ruling or legal precedent.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Change of ruling or legal precedent. 410.681 Section 410.681 Employees' Benefits SOCIAL SECURITY ADMINISTRATION FEDERAL COAL MINE HEALTH AND SAFETY ACT..., Administrative Review, Finality of Decisions, and Representation of Parties § 410.681 Change of ruling or legal...

  9. Japanese Aggression in Asia (1895-1930). Japan’s Dream of ’Hakko Ichuo’ (Eight Corners of the World under Japanese Rule).

    DTIC Science & Technology

    1980-12-01

    the British Navy was also of significant value, for then Britannia still ruled the waves. The huge indemnity received from the Chinese played an...11 among the sons, the eldest took all and the second and third sons became either factory or mine workers or apprentices of a merchant. When...warehouses, spinning, paper and sugar mills, all based on the large profits which came from banking, mining and foreign trade. Mitsubishi had its

  10. CARSVM: a class association rule-based classification framework and its application to gene expression data.

    PubMed

    Kianmehr, Keivan; Alhajj, Reda

    2008-09-01

    In this study, we aim at building a classification framework, namely the CARSVM model, which integrates association rule mining and support vector machine (SVM). The goal is to benefit from advantages of both, the discriminative knowledge represented by class association rules and the classification power of the SVM algorithm, to construct an efficient and accurate classifier model that improves the interpretability problem of SVM as a traditional machine learning technique and overcomes the efficiency issues of associative classification algorithms. In our proposed framework: instead of using the original training set, a set of rule-based feature vectors, which are generated based on the discriminative ability of class association rules over the training samples, are presented to the learning component of the SVM algorithm. We show that rule-based feature vectors present a high-qualified source of discrimination knowledge that can impact substantially the prediction power of SVM and associative classification techniques. They provide users with more conveniences in terms of understandability and interpretability as well. We have used four datasets from UCI ML repository to evaluate the performance of the developed system in comparison with five well-known existing classification methods. Because of the importance and popularity of gene expression analysis as real world application of the classification model, we present an extension of CARSVM combined with feature selection to be applied to gene expression data. Then, we describe how this combination will provide biologists with an efficient and understandable classifier model. The reported test results and their biological interpretation demonstrate the applicability, efficiency and effectiveness of the proposed model. 
    From the results, it can be concluded that a considerable increase in classification accuracy can be obtained when the rule-based feature vectors are integrated in the learning process of the SVM algorithm. In the context of applicability, according to the results obtained from gene expression analysis, we can conclude that the CARSVM system can be utilized in a variety of real world applications with some adjustments.
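
    The idea of replacing the original training set with rule-based feature vectors can be sketched as below. This is a simplified illustration, not the CARSVM construction itself: the abstract says the vectors are generated from the discriminative ability of the class association rules but does not give the formula, so the sketch assumes a plain binary encoding (1 if the sample satisfies a rule's antecedent, else 0). The function name rule_features and the sample rules are hypothetical.

```python
def rule_features(sample_items, rules):
    """Encode a sample as one binary feature per class association rule.

    rules is a list of (antecedent_items, class_label) pairs; a feature is 1
    when every item of the rule's antecedent occurs in the sample.
    """
    items = set(sample_items)
    return [1 if set(antecedent) <= items else 0 for antecedent, _label in rules]


# Hypothetical rules over discretized gene expression levels.
RULES = [
    ({"geneA=high"}, "tumour"),
    ({"geneA=high", "geneB=low"}, "tumour"),
    ({"geneC=high"}, "normal"),
]
```

    The resulting vectors, one per training sample, could then be passed to any SVM implementation in place of the raw attributes.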

  11. British Defense Policy: A New Approach?

    DTIC Science & Technology

    1988-12-14

    inherent to their well-being, was also acknowledged by the remainder of the world in its attitude toward Britain. Is not "Rule Britannia , Britannia ...Castle Class 1 1 Island Class 7 43 Mine -Counter Minesweepers 2 2 Mine River Class 12 Ton Class 10 3 Hunt Class 12 1 Patrol Craft Bird Class 5 Coastal 15...submarine warfare carriers, assault ships, and mine -counter mine vessels. British naval aircraft is as depicted in Table 2. Table 2. Aircraft of the Royal

  12. 77 FR 54490 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-05

    ... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation... will follow for the public hearing, if one is requested. DATES: We will accept written comments on this...

  13. Quantifying Associations between Environmental Stressors and Demographic Factors

    EPA Science Inventory

    Association rule mining (ARM) [1-3], also known as frequent item set mining [4] or market basket analysis [1], has been widely applied in many different areas, such as business product portfolio planning [5], intrusion detection infrastructure design [6], gene expression analysis...

  14. 43 CFR 3420.1-4 - General requirements for land use planning.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... mining by other than underground mining techniques. (ii) For the purposes of this paragraph, any surface... techniques shall be deemed to have expressed a preference in favor of mining. Where a significant number of... underground mining techniques, that area shall be considered acceptable for further consideration only for...

  15. 43 CFR 3420.1-4 - General requirements for land use planning.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... mining by other than underground mining techniques. (ii) For the purposes of this paragraph, any surface... techniques shall be deemed to have expressed a preference in favor of mining. Where a significant number of... underground mining techniques, that area shall be considered acceptable for further consideration only for...

  16. 43 CFR 3420.1-4 - General requirements for land use planning.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... mining by other than underground mining techniques. (ii) For the purposes of this paragraph, any surface... techniques shall be deemed to have expressed a preference in favor of mining. Where a significant number of... underground mining techniques, that area shall be considered acceptable for further consideration only for...

  17. 43 CFR 3420.1-4 - General requirements for land use planning.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... mining by other than underground mining techniques. (ii) For the purposes of this paragraph, any surface... techniques shall be deemed to have expressed a preference in favor of mining. Where a significant number of... underground mining techniques, that area shall be considered acceptable for further consideration only for...

  18. Efficient discovery of risk patterns in medical data.

    PubMed

    Li, Jiuyong; Fu, Ada Wai-chee; Fahey, Paul

    2009-01-01

    This paper studies the problem of efficiently discovering risk patterns in medical data. Risk patterns are defined by a statistical metric, relative risk, which has been widely used in epidemiological research. To avoid fruitless search in the complete exploration of risk patterns, we define the optimal risk pattern set to exclude superfluous patterns, i.e. complicated patterns with lower relative risk than their corresponding simpler form patterns. We prove that mining optimal risk pattern sets conforms to an anti-monotone property that supports an efficient mining algorithm. We propose an efficient algorithm for mining optimal risk pattern sets based on this property. We also propose a hierarchical structure to present discovered patterns for easy perusal by domain experts. The proposed approach is compared with two well-known rule discovery methods, decision tree and association rule mining approaches, on benchmark data sets and applied to a real world application. The proposed method discovers more and better quality risk patterns than a decision tree approach. The decision tree method is not designed for such applications and is inadequate for pattern exploring. The proposed method does not discover a large number of uninteresting superfluous patterns as an association mining approach does. The proposed method is more efficient than an association rule mining method. A real world case study shows that the method reveals some interesting risk patterns to medical practitioners. The proposed method is an efficient approach to explore risk patterns. It quickly identifies cohorts of patients that are vulnerable to a risk outcome from a large data set. The proposed method is useful for exploratory study on large medical data to generate and refine hypotheses. The method is also useful for designing medical surveillance systems.
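
    The relative risk metric and the notion of a superfluous pattern can be illustrated with a short sketch. It uses the standard epidemiological definition of relative risk (risk among records matching the pattern divided by risk among the rest) and applies the abstract's description of superfluous patterns literally. The function names and the toy records are hypothetical, and the paper's efficient mining algorithm is not reproduced.

```python
def relative_risk(records, pattern):
    """Relative risk of the outcome for records matching the pattern.

    records is a list of (items, has_outcome) pairs, where items is a set.
    """
    pattern = set(pattern)
    exposed = [outcome for items, outcome in records if pattern <= items]
    unexposed = [outcome for items, outcome in records if not (pattern <= items)]
    if not exposed or not unexposed:
        raise ValueError("pattern must split the records into two groups")
    risk_exposed = sum(exposed) / len(exposed)
    risk_unexposed = sum(unexposed) / len(unexposed)
    if risk_unexposed == 0:
        return float("inf")
    return risk_exposed / risk_unexposed


def is_superfluous(records, pattern, simpler_pattern):
    """True if the longer pattern has lower relative risk than its sub-pattern."""
    return relative_risk(records, pattern) < relative_risk(records, simpler_pattern)


# Hypothetical toy data: (attribute set, outcome observed?).
RECORDS = [
    ({"smoker"}, True), ({"smoker"}, True),
    ({"smoker", "male"}, True), ({"smoker", "male"}, False),
    ({"male"}, False), ({"male"}, True),
    (set(), False), (set(), False),
]
```

    In this toy data the pattern {"smoker"} has relative risk 0.75 / 0.25 = 3.0, while {"smoker", "male"} has 0.5 / 0.5 = 1.0, so the longer pattern would be excluded as superfluous.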

  19. Medication regularity of pulmonary fibrosis treatment by contemporary traditional Chinese medicine experts based on data mining.

    PubMed

    Zhang, Suxian; Wu, Hao; Liu, Jie; Gu, Huihui; Li, Xiujuan; Zhang, Tiansong

    2018-03-01

    Treatment of pulmonary fibrosis by traditional Chinese medicine (TCM) has accumulated important experience. Our interest is in exploring the medication regularity of contemporary Chinese medical specialists treating pulmonary fibrosis. Through literature search, medical records from TCM experts who treat pulmonary fibrosis, which were published in Chinese and English medical journals, were selected for this study. As the object of study, a database was established after analysing the records. After data cleaning, the rules of medicine in the treatment of pulmonary fibrosis in medical records of TCM were explored by using data mining technologies such as frequency analysis, association rule analysis, and link analysis. A total of 124 medical records from 60 doctors were selected in this study; 263 types of medicinals were used a total of 5,455 times; the herbs that were used more than 30 times can be grouped into 53 species and were used a total of 3,681 times. Using main medicinals cluster analysis, medicinals were divided into qi-tonifying, yin-tonifying, blood-activating, phlegm-resolving, cough-suppressing, panting-calming, and ten other major medicinal categories. According to the set conditions, a total of 62 drug compatibility rules have been obtained, involving mainly qi-tonifying, yin-tonifying, blood-activating, phlegm-resolving, qi-descending, and panting-calming medicinals, as well as other medicinals used in combination. The results of data mining are consistent with clinical practice and it is feasible to explore the medical rules applicable to the treatment of pulmonary fibrosis in medical records of TCM by data mining.

  20. A novel artificial immune clonal selection classification and rule mining with swarm learning model

    NASA Astrophysics Data System (ADS)

    Al-Sheshtawi, Khaled A.; Abdul-Kader, Hatem M.; Elsisi, Ashraf B.

    2013-06-01

    Metaheuristic optimisation algorithms have become a popular choice for solving complex problems. By integrating Artificial Immune clonal selection algorithm (CSA) and particle swarm optimisation (PSO) algorithm, a novel hybrid Clonal Selection Classification and Rule Mining with Swarm Learning Algorithm (CS2) is proposed. The main goal of the approach is to exploit and explore the parallel computation merit of Clonal Selection and the speed and self-organisation merits of Particle Swarm by sharing information between clonal selection population and particle swarm. Hence, we employed the advantages of PSO to improve the mutation mechanism of the artificial immune CSA and to mine classification rules within datasets. Consequently, our proposed algorithm required less training time and memory cells in comparison to other AIS algorithms. In this paper, classification rule mining has been modelled as a multiobjective optimisation problem with predictive accuracy. The multiobjective approach is intended to allow the PSO algorithm to return an approximation to the accuracy and comprehensibility border, containing solutions that are spread across the border. We compared the classification accuracy of our proposed algorithm, CS2, with five commonly used CSAs, namely: AIRS1, AIRS2, AIRS-Parallel, CLONALG, and CSCA using eight benchmark datasets. We also compared the classification accuracy of CS2 with five other methods, namely: Naïve Bayes, SVM, MLP, CART, and RFB. The results show that the proposed algorithm is comparable to the 10 studied algorithms. As a result, the hybridisation, built of CSA and PSO, can develop respective merit, compensate opponent defect, and make search-optimal effect and speed better.

  1. Urinary metabolic profiling of asymptomatic acute intermittent porphyria using a rule-mining-based algorithm.

    PubMed

    Luck, Margaux; Schmitt, Caroline; Talbi, Neila; Gouya, Laurent; Caradeuc, Cédric; Puy, Hervé; Bertho, Gildas; Pallet, Nicolas

    2018-01-01

    Metabolomic profiling combines Nuclear Magnetic Resonance spectroscopy with supervised statistical analysis that might allow a better understanding of the mechanisms of a disease. In this study, the urinary metabolic profiling of individuals with porphyrias was performed to predict different types of disease, and to propose new pathophysiological hypotheses. Urine 1H-NMR spectra of 73 patients with asymptomatic acute intermittent porphyria (aAIP) and familial or sporadic porphyria cutanea tarda (f/sPCT) were compared using a supervised rule-mining algorithm. NMR spectrum buckets (bins), corresponding to rules, were extracted and a logistic regression was trained. The results generated by our rule-mining algorithm were consistent with those obtained using partial least square discriminant analysis (PLS-DA) and the predictive performance of the model was significant. Buckets that were identified by the algorithm corresponded to metabolites involved in glycolysis and energy-conversion pathways, notably acetate, citrate, and pyruvate, which were found in higher concentrations in the urines of aAIP compared with PCT patients. Metabolic profiling did not discriminate sPCT from fPCT patients. These results suggest that metabolic reprogramming occurs in aAIP individuals, even in the absence of overt symptoms, and support the relationship that occurs between heme synthesis and mitochondrial energetic metabolism.

  2. Mining Land Subsidence Monitoring Using SENTINEL-1 SAR Data

    NASA Astrophysics Data System (ADS)

    Yuan, W.; Wang, Q.; Fan, J.; Li, H.

    2017-09-01

    In this paper, the DInSAR technique was used to monitor land subsidence in a mining area. The study area was selected in the coal mine area located in Yuanbaoshan District, Chifeng City, and Sentinel-1 data were used to carry out the DInSAR technique. We analyzed the interferometric results from Sentinel-1 data from December 2015 to May 2016. Through the comparison of the results of the DInSAR technique and the location of the mine on the optical images, it is shown that the DInSAR technique can be used to effectively monitor the land subsidence caused by underground mining, and it is an effective tool for law enforcement of over-mining.

  3. 26 CFR 1.614-3 - Rules relating to separate operating mineral interests in the case of mines.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... method of mining the mineral, the location of the excavations or other workings in relation to the mineral deposit or deposits, and the topography of the area. The determination of the taxpayer as to the...

  4. 30 CFR 56.18006 - New employees.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... New employees. New employees shall be indoctrinated in safety rules and safe work procedures. ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false New employees. 56.18006 Section 56.18006 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE...

  5. Power System Transient Stability Based on Data Mining Theory

    NASA Astrophysics Data System (ADS)

    Cui, Zhen; Shi, Jia; Wu, Runsheng; Lu, Dan; Cui, Mingde

    2018-01-01

    In order to study the stability of power system, a power system transient stability based on data mining theory is designed. By introducing association rules analysis in data mining theory, an association classification method for transient stability assessment is presented. A mathematical model of transient stability assessment based on data mining technology is established. Meanwhile, combining rule reasoning with classification prediction, the method of association classification is proposed to perform transient stability assessment. The transient stability index is used to identify the samples that cannot be correctly classified in association classification. Then, according to the critical stability of each sample, the time domain simulation method is used to determine the state, so as to ensure the accuracy of the final results. The results show that this stability assessment system can improve the speed of operation under the premise that the analysis result is completely correct, and the improved algorithm can find out the inherent relation between the change of power system operation mode and the change of transient stability degree.

  6. 20 CFR 410.703 - Adjudicatory rules for determining entitlement to benefits.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... COAL MINE HEALTH AND SAFETY ACT OF 1969, TITLE IV-BLACK LUNG BENEFITS (1969- ) Rules for the Review of Denied and Pending Claims Under the Black Lung Benefits Reform Act (BLBRA) of 1977 § 410.703 Adjudicatory...

  7. 20 CFR 410.703 - Adjudicatory rules for determining entitlement to benefits.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... COAL MINE HEALTH AND SAFETY ACT OF 1969, TITLE IV-BLACK LUNG BENEFITS (1969- ) Rules for the Review of Denied and Pending Claims Under the Black Lung Benefits Reform Act (BLBRA) of 1977 § 410.703 Adjudicatory...

  8. 43 CFR 4.1383 - Hearing.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 43 Public Lands: Interior 1 2014-10-01 2014-10-01 false Hearing. 4.1383 Section 4.1383 Public Lands: Interior Office of the Secretary of the Interior DEPARTMENT HEARINGS AND APPEALS PROCEDURES Special Rules Applicable to Surface Coal Mining Hearings and Appeals Review of Office of Surface Mining...

  9. 30 CFR 48.6 - Experienced miner training.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    .... (b) Experienced miners must complete the training prescribed in this section before beginning work... to work environment. The course shall include a visit and tour of the mine. The methods of mining... responsibilities of such supervisors and miners' representatives; and an introduction to the operator's rules and...

  10. 43 CFR 3483.6 - Special logical mining unit rules.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... the LMU, of either Federal or non-Federal recoverable coal reserves or a combination thereof, shall be... Section 3483.6 Public Lands: Interior Regulations Relating to Public Lands (Continued) BUREAU OF LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS...

  11. 30 CFR 939.700 - Rhode Island Federal program.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Rhode Island Federal program. (a) This part contains all rules that are applicable to surface coal... to all surface coal mining and reclamation operations in Rhode Island conducted on non-Federal and... stringent environmental control and regulation of surface coal mining and reclamation operations than do the...

  12. 43 CFR 3483.6 - Special logical mining unit rules.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... the LMU, of either Federal or non-Federal recoverable coal reserves or a combination thereof, shall be... Section 3483.6 Public Lands: Interior Regulations Relating to Public Lands (Continued) BUREAU OF LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS...

  13. 43 CFR 3483.6 - Special logical mining unit rules.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... the LMU, of either Federal or non-Federal recoverable coal reserves or a combination thereof, shall be... Section 3483.6 Public Lands: Interior Regulations Relating to Public Lands (Continued) BUREAU OF LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS...

  14. 43 CFR 4.1351 - Preliminary finding by OSM.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... APPEALS PROCEDURES Special Rules Applicable to Surface Coal Mining Hearings and Appeals Request for...(c) of the Act, 30 U.S.C. 1260(c) (Federal Program; Federal Lands Program; Federal Program for Indian... or has controlled surface coal mining and reclamation operations with a demonstrated pattern of...

  15. 43 CFR 3483.6 - Special logical mining unit rules.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... the LMU, of either Federal or non-Federal recoverable coal reserves or a combination thereof, shall be... Section 3483.6 Public Lands: Interior Regulations Relating to Public Lands (Continued) BUREAU OF LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS...

  16. 43 CFR 4.1383 - Hearing.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Hearing. 4.1383 Section 4.1383 Public Lands: Interior Office of the Secretary of the Interior DEPARTMENT HEARINGS AND APPEALS PROCEDURES Special Rules Applicable to Surface Coal Mining Hearings and Appeals Review of Office of Surface Mining...

  17. Association rule mining on grid monitoring data to detect error sources

    NASA Astrophysics Data System (ADS)

    Maier, Gerhild; Schiffers, Michael; Kranzlmueller, Dieter; Gaidioz, Benjamin

    2010-04-01

    Error handling is a crucial task in an infrastructure as complex as a grid. There are several monitoring tools put in place, which report failing grid jobs including exit codes. However, the exit codes do not always denote the actual fault, which caused the job failure. Human time and knowledge is required to manually trace back errors to the real fault underlying an error. We perform association rule mining on grid job monitoring data to automatically retrieve knowledge about the grid components' behavior by taking dependencies between grid job characteristics into account. Therewith, problematic grid components are located automatically and this information - expressed by association rules - is visualized in a web interface. This work achieves a decrease in time for fault recovery and yields an improvement of a grid's reliability.

  18. An Automated Technique to Construct a Knowledge Base of Traditional Chinese Herbal Medicine for Cancers: An Exploratory Study for Breast Cancer.

    PubMed

    Nguyen, Phung Anh; Yang, Hsuan-Chia; Xu, Rong; Li, Yu-Chuan Jack

    2018-01-01

    Traditional Chinese Medicine (TCM) utilization has rapidly increased worldwide. However, few databases provide information on TCM herbs and diseases. The study aims to identify and evaluate the meaningful associations between TCM herbs and breast cancer by using association rule mining (ARM) techniques. We applied the ARM techniques to 19.9 million TCM prescriptions from the Taiwan National Health Insurance claim database from 1999 to 2013. 364 TCM herb-breast cancer associations were derived from those prescriptions and were then filtered by a support of 20. The resulting 296 associations were evaluated by comparison with a gold standard curated from Chinese Wikipedia using the following terms: cancer, tumor, malignant. All 14 TCM herb-breast cancer associations with a confidence of 1% were valid when compared to the gold standard. For other confidence levels, the statistical results showed consistently high precision. We thus succeeded in identifying the TCM herb-breast cancer associations with useful techniques.
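
    The support and confidence filtering mentioned in this abstract can be sketched as follows. This is only an illustration under assumptions: the prescription layout, the function name herb_disease_rules, the herb names, and the toy threshold are invented here, and the study's actual pipeline and thresholds are not reproduced.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

# Hypothetical layout: each prescription is (herbs prescribed, diagnoses).
Prescription = Tuple[Set[str], Set[str]]


def herb_disease_rules(
    prescriptions: Iterable[Prescription], disease: str, min_support: int
) -> Dict[str, Tuple[int, float]]:
    """Return {herb: (support, confidence)} for rules herb -> disease.

    support    = number of prescriptions containing both herb and disease
    confidence = support / number of prescriptions containing the herb
    Rules whose support is below min_support are dropped.
    """
    herb_count: Counter = Counter()
    joint_count: Counter = Counter()
    for herbs, diagnoses in prescriptions:
        for herb in herbs:
            herb_count[herb] += 1
            if disease in diagnoses:
                joint_count[herb] += 1
    return {
        herb: (support, support / herb_count[herb])
        for herb, support in joint_count.items()
        if support >= min_support
    }


# Toy data with placeholder herb names (not taken from the study).
prescriptions = [
    ({"herb_a", "herb_b"}, {"breast cancer"}),
    ({"herb_a"}, {"breast cancer"}),
    ({"herb_a", "herb_c"}, {"insomnia"}),
    ({"herb_b"}, {"insomnia"}),
    ({"herb_c"}, {"breast cancer"}),
]

rules = herb_disease_rules(prescriptions, "breast cancer", min_support=2)
```

    With the toy threshold of 2, only herb_a survives the support filter; its confidence is the share of herb_a prescriptions that carry the breast cancer diagnosis.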

  19. Evaluation of a rule-based method for epidemiological document classification towards the automation of systematic reviews.

    PubMed

    Karystianis, George; Thayer, Kristina; Wolfe, Mary; Tsafnat, Guy

    2017-06-01

    Most data extraction efforts in epidemiology are focused on obtaining targeted information from clinical trials. In contrast, limited research has been conducted on the identification of information from observational studies, a major source for human evidence in many fields, including environmental health. The recognition of key epidemiological information (e.g., exposures) through text mining techniques can assist in the automation of systematic reviews and other evidence summaries. We designed and applied a knowledge-driven, rule-based approach to identify targeted information (study design, participant population, exposure, outcome, confounding factors, and the country where the study was conducted) from abstracts of epidemiological studies included in several systematic reviews of environmental health exposures. The rules were based on common syntactical patterns observed in text and are thus not specific to any systematic review. To validate the general applicability of our approach, we compared the data extracted using our approach versus hand curation for 35 epidemiological study abstracts manually selected for inclusion in two systematic reviews. The returned F-score, precision, and recall ranged from 70% to 98%, 81% to 100%, and 54% to 97%, respectively. The highest precision was observed for exposure, outcome and population (100%) while recall was best for exposure and study design with 97% and 89%, respectively. The lowest recall was observed for the population (54%), which also had the lowest F-score (70%). The generated performance of our text-mining approach demonstrated encouraging results for the identification of targeted information from observational epidemiological study abstracts related to environmental exposures. 
    We have demonstrated that rules based on generic syntactic patterns in one corpus can be applied to other observational study designs by simply interchanging the dictionaries aiming to identify certain characteristics (i.e., outcomes, exposures). At the document level, the recognised information can assist in the selection and categorization of studies included in a systematic review. Copyright © 2017 Elsevier Inc. All rights reserved.
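
    The precision, recall, and F-score figures quoted in this abstract relate to one another as in the sketch below. It assumes the reported F-score is the balanced F1 measure (the abstract does not say), and the counts passed in are hypothetical values chosen only so that precision and recall match the percentages reported for the population field.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and balanced F-score from raw counts.

    tp: correctly extracted items, fp: spurious extractions,
    fn: items present in the hand-curated data but missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: no false positives, 54 of 100 true items found.
precision, recall, f1 = precision_recall_f1(tp=54, fp=0, fn=46)
```

    With 100% precision and 54% recall the balanced F-score comes out near 70%, which agrees with the values the abstract reports for population.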

  20. 75 FR 26828 - Self-Regulatory Organizations; Chicago Board Options Exchange, Incorporated; Notice of Filing and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-12

    ... amend [sic] its rules relating to the Penny Pilot Program. The text of the rule proposal is available on... proposed rule change. The text of those statements may be examined at the places specified in Item IV below... Technology Select Sector XME SPDR S&P Metals & Mining SPDR Fund. ETF. AKS AK Steel Holding Corp... KGC...

  1. Federal Register Notice for the Mining Waste Exclusion Final Rule, September 1, 1989

    EPA Pesticide Factsheets

    Final rule responding to a federal Appeals Court directive to narrow the exclusion of solid waste from the extraction, beneficiation, and processing of ores and minerals from regulation as hazardous waste as it applies to mineral processing wastes.

  2. 43 CFR 4.1109 - Service.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Special Rules Applicable to Surface Coal Mining Hearings and Appeals General Provisions § 4.1109 Service.... Department of the Interior, representing OSMRE in the state in which the mining operation at issue is located, and on any other statutory parties specified under § 4.1105 of this part. (2) The jurisdictions...

  3. 78 FR 37404 - Small Business Size Standards: Support Activities for Mining

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-20

    ... SMALL BUSINESS ADMINISTRATION 13 CFR Part 121 RIN 3245-AG44 Small Business Size Standards: Support Activities for Mining AGENCY: U.S. Small Business Administration. ACTION: Final rule. SUMMARY: The United States Small Business Administration (SBA) is increasing the small business size standards for three of...

  4. 26 CFR 1.611-5 - Depreciation of improvements.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... (CONTINUED) INCOME TAXES (CONTINUED) Natural Resources § 1.611-5 Depreciation of improvements. (a) In general. Section 611 provides in the case of mines, oil and gas wells, other natural deposits, and timber that...). (b) Special rules for mines, oil and gas wells, other natural deposits and timber. (1) For principles...

  5. 75 FR 21987 - Penalty Settlement Procedure

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-27

    ... and Health Act of 1977, or Mine Act. Hearings are held before the Commission's Administrative Law... settling civil penalties assessed under the Mine Act. DATES: The interim rule takes effect on May 27, 2010... Commission has explored is to simplify how it processes civil penalty settlements. Under section 110(k) of...

  6. 30 CFR 906.1 - Scope.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 3 2010-07-01 2010-07-01 false Scope. 906.1 Section 906.1 Mineral Resources... OF SURFACE MINING OPERATIONS WITHIN EACH STATE COLORADO § 906.1 Scope. This part contains all rules applicable only within Colorado that have been adopted under the Surface Mining Control and Reclamation Act...

  7. 75 FR 52980 - Submission for OMB Review; Comment Request

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-30

    .../maintaining): $303,512. Description: The Safety Standards for Underground Coal Mine Ventilation Belt Entry rule provides safety requirements for the use of the conveyor belt entry as a ventilation intake to... Underground Coal Mine Ventilation--Belt Entry Used as an Intake Air Course to Ventilate Working Sections and...

  8. Data Mining in Health and Medical Information.

    ERIC Educational Resources Information Center

    Bath, Peter A.

    2004-01-01

    Presents a literature review that covers the following topics related to data mining (DM) in health and medical information: the potential of DM in health and medicine; statistical methods; evaluation of methods; DM tools for health and medicine; inductive learning of symbolic rules; application of DM tools in diagnosis and prognosis; and…

  9. Army Needs to Identify Government Purchase Card High-Risk Transactions

    DTIC Science & Technology

    2012-01-20

    Purchase Card Program Data Mining Process Needs Improvement 11...Mining Process Needs Improvement The 17 transactions that were noncompliant occurred because cardholders ignored the GPC business rules so the...Scope and Methodology 16 Use of Computer- Processed Data 16 Use of Technical Assistance 17 Prior Coverage

  10. A Study of Pattern Prediction in the Monitoring Data of Earthen Ruins with the Internet of Things.

    PubMed

    Xiao, Yun; Wang, Xin; Eshragh, Faezeh; Wang, Xuanhong; Chen, Xiaojiang; Fang, Dingyi

    2017-05-11

    An understanding of the changes of the rammed earth temperature of earthen ruins is important for protection of such ruins. To predict the rammed earth temperature pattern using the air temperature pattern of the monitoring data of earthen ruins, a pattern prediction method based on interesting pattern mining and correlation, called PPER, is proposed in this paper. PPER first finds the interesting patterns in the air temperature sequence and the rammed earth temperature sequence. To reduce the processing time, two pruning rules and a new data structure based on an R-tree are also proposed. Correlation rules between the air temperature patterns and the rammed earth temperature patterns are then mined. The correlation rules are merged into predictive rules for the rammed earth temperature pattern. Experiments were conducted to show the accuracy of the presented method and the power of the pruning rules. Moreover, the Ming Dynasty Great Wall dataset was used to examine the algorithm, and six predictive rules from the air temperature to rammed earth temperature based on the interesting patterns were obtained, with the average hit rate reaching 89.8%. The PPER and predictive rules will be useful for rammed earth temperature prediction in protection of earthen ruins.

  11. A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain.

    PubMed

    Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K

    2013-08-12

    A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text. Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and compared the average rank of correctly identified rule definition or corresponding rule template using both our semantic-based approach and a standard term-based approach. We examined three separate scenarios: (1) the snippet of text contained a definition already in the knowledge base; (2) the snippet contained an alternative definition for a concept in the knowledge base; and (3) the snippet contained a definition not in the knowledge base. Our semantic-based approach had a higher average rank than the term-based approach for each of the three scenarios (scenario 1: 3.8 vs. 5.0; scenario 2: 2.8 vs. 4.9; and scenario 3: 4.5 vs. 6.2), with each comparison significant at the p-value of 0.05 using the Wilcoxon signed-rank test. Our work shows that leveraging existing domain knowledge in the information extraction of biomedical definitions significantly improves the correct identification of such knowledge within sentences. Our method can thus help researchers rapidly acquire knowledge about biomedical definitions that are specified and evolving within an ever-growing corpus of scientific publications.

  12. Privacy Is Become with, Data Perturbation

    NASA Astrophysics Data System (ADS)

    Singh, Er. Niranjan; Singhai, Niky

    2011-06-01

    Privacy is becoming an increasingly important issue in many data mining applications that deal with health care, security, finance, behavior and other types of sensitive data. It is becoming particularly important in counterterrorism and homeland security-related applications. We touch upon several techniques of masking the data, namely random distortion, including uniform and Gaussian noise, applied to the data in order to protect it. These perturbation schemes are equivalent to additive perturbation after the logarithmic transformation. Due to the large volume of research in deriving private information from additive noise perturbed data, the security of these perturbation schemes is questionable. Many artificial intelligence and statistical methods exist for data analysis and interpretation. Identifying and measuring the interestingness of patterns and rules discovered, or to be discovered, is essential for the evaluation of the mined knowledge and the KDD process as a whole. While some concrete measurements exist, assessing the interestingness of discovered knowledge is still an important research issue. As the tool for the algorithm implementations we chose MATLAB, the language of choice in the industrial world.
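
    Random distortion by additive noise, as named in this abstract, can be sketched in a few lines. The authors report using MATLAB; Python is used here purely for illustration, and the function name perturb, the noise scale, and the toy values are assumptions rather than anything taken from the paper.

```python
import random
from statistics import mean
from typing import List


def perturb(values: List[float], scale: float, rng: random.Random,
            kind: str = "gaussian") -> List[float]:
    """Mask each value by adding independent zero-mean random noise.

    kind="gaussian": noise drawn from N(0, scale^2)
    kind="uniform":  noise drawn uniformly from [-scale, scale]
    """
    if kind == "gaussian":
        return [v + rng.gauss(0.0, scale) for v in values]
    if kind == "uniform":
        return [v + rng.uniform(-scale, scale) for v in values]
    raise ValueError(f"unknown noise kind: {kind}")


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy sensitive attribute (invented values).
original = [50.0 + (i % 10) for i in range(1000)]
masked = perturb(original, scale=5.0, rng=rng)

# Individual values change, but zero-mean noise leaves the average
# approximately intact, which is what keeps aggregate mining feasible.
mean_shift = abs(mean(masked) - mean(original))
```

    The same property that preserves aggregates is what the abstract's caution refers to: because the noise is additive and its distribution is known, parts of the original data may be recoverable from the masked values.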

  13. Chemical named entities recognition: a review on approaches and applications

    PubMed Central

    2014-01-01

    The rapid increase in the flow rate of published digital information in all disciplines has resulted in a pressing need for techniques that can simplify the use of this information. The chemistry literature is very rich with information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature to “text mine” these extracted data and determine contextual relationships helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in the texts. In this review, the authors briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based and machine learning, as well as hybrid chemical named entity recognition approaches with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted. PMID:24834132

  14. Development of a GIService based on spatial data mining for location choice of convenience stores in Taipei City

    NASA Astrophysics Data System (ADS)

    Jung, Chinte; Sun, Chih-Hong

    2006-10-01

    Motivated by the increasing accessibility of technology, more and more spatial data are being made digitally available. How to extract the valuable knowledge from these large (spatial) databases is becoming increasingly important to businesses, as well. It is essential to be able to analyze and utilize these large datasets, convert them into useful knowledge, and transmit them through GIS-enabled instruments and the Internet, conveying the key information to business decision-makers effectively and benefiting business entities. In this research, we combine the techniques of GIS, spatial decision support system (SDSS), spatial data mining (SDM), and ArcGIS Server to achieve the following goals: (1) integrate databases from spatial and non-spatial datasets about the locations of businesses in Taipei, Taiwan; (2) use the association rules, one of the SDM methods, to extract the knowledge from the integrated databases; and (3) develop a Web-based SDSS GIService as a location-selection tool for business by the product of ArcGIS Server.

  15. Detecting Malicious Tweets in Twitter Using Runtime Monitoring With Hidden Information

    DTIC Science & Technology

    2016-06-01

    text mining using Twitter streaming API and python [Online]. Available: http://adilmoujahid.com/posts/2014/07/twitter-analytics/ [22] M. Singh, B...sites with 645,750,000 registered users [3] and has open source public tweets for data mining . 2. Malicious Users and Tweets In the modern world...want to data mine in Twitter, and presents the natural language assertions and corresponding rule patterns. It then describes the steps performed using

  16. 75 FR 73995 - Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust Monitors

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-11-30

    ... http://www.msha.gov/REGS/FEDREG/PROPOSED/2010PROP/2010-25249.pdf . The proposed rule would revise the.../PROPOSED/2010PROP/2010-25249.pdf . The following error in the preamble to the proposed rule is corrected to...

  17. 78 FR 5055 - Pattern of Violations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-23

    ...The Mine Safety and Health Administration (MSHA) is revising the Agency's existing regulation for pattern of violations (POV). MSHA has determined that the existing regulation does not adequately achieve the intent of the Federal Mine Safety and Health Act of 1977 (Mine Act) that the POV provision be used to address mine operators who have demonstrated a disregard for the health and safety of miners. Congress included the POV provision in the Mine Act so that mine operators would manage health and safety conditions at mines and find and fix the root causes of significant and substantial (S&S) violations, protecting the health and safety of miners. The final rule simplifies the existing POV criteria, improves consistency in applying the POV criteria, and more effectively achieves the Mine Act's statutory intent. It also encourages chronic safety violators to comply with the Mine Act and MSHA's health and safety standards.

  18. Association rule mining in the US Vaccine Adverse Event Reporting System (VAERS).

    PubMed

    Wei, Lai; Scott, John

    2015-09-01

    Spontaneous adverse event reporting systems are critical tools for monitoring the safety of licensed medical products. Commonly used signal detection algorithms identify disproportionate product-adverse event pairs and may not be sensitive to more complex potential signals. We sought to develop a computationally tractable multivariate data-mining approach to identify product-multiple adverse event associations. We describe an application of stepwise association rule mining (Step-ARM) to detect potential vaccine-symptom group associations in the US Vaccine Adverse Event Reporting System. Step-ARM identifies strong associations between one vaccine and one or more adverse events. To reduce the number of redundant association rules found by Step-ARM, we also propose a clustering method for the post-processing of association rules. In sample applications to a trivalent intradermal inactivated influenza virus vaccine and to measles, mumps, rubella, and varicella (MMRV) vaccine and in simulation studies, we find that Step-ARM can detect a variety of medically coherent potential vaccine-symptom group signals efficiently. In the MMRV example, Step-ARM appears to outperform univariate methods in detecting a known safety signal. Our approach is sensitive to potentially complex signals, which may be particularly important when monitoring novel medical countermeasure products such as pandemic influenza vaccines. The post-processing clustering algorithm improves the applicability of the approach as a screening method to identify patterns that may merit further investigation. Copyright © 2015 John Wiley & Sons, Ltd.
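
    The association measures that rule mining of this kind relies on can be sketched as follows. This is a generic illustration of support, confidence and lift, not the Step-ARM procedure itself; the report contents and the names `support`, `confidence` and `lift` are assumptions made up for the example.

```python
def support(itemset, reports):
    """Fraction of reports that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for r in reports if itemset <= r) / len(reports)


def confidence(antecedent, consequent, reports):
    """Estimate of P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    s_a = support(antecedent, reports)
    if s_a == 0:
        return 0.0
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, reports) / s_a


def lift(antecedent, consequent, reports):
    """Confidence divided by the baseline frequency of the consequent."""
    s_c = support(consequent, reports)
    return confidence(antecedent, consequent, reports) / s_c if s_c else 0.0


# Hypothetical toy reports: one vaccine plus the reported adverse events.
reports = [
    frozenset({"vaccine_A", "fever", "rash"}),
    frozenset({"vaccine_A", "fever"}),
    frozenset({"vaccine_B", "headache"}),
    frozenset({"vaccine_A", "fever", "rash"}),
]

rule_conf = confidence({"vaccine_A"}, {"fever", "rash"}, reports)
rule_lift = lift({"vaccine_A"}, {"fever", "rash"}, reports)
```

    A lift above 1 suggests that the symptom group is reported more often with the vaccine than overall, which is the kind of disproportionality a screening method would flag for follow-up.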

  19. Association Rule Based Feature Extraction for Character Recognition

    NASA Astrophysics Data System (ADS)

    Dua, Sumeet; Singh, Harpreet

    Association rules that represent isomorphisms among data have gained importance in exploratory data analysis because they can find inherent, implicit, and interesting relationships among data. They are also commonly used in data mining to extract the conditions among attribute values that occur together frequently in a dataset [1]. These rules have a wide range of applications, namely in the financial and retail sectors of marketing, sales, and medicine.

  20. Association algorithm to mine the rules that govern enzyme definition and to classify protein sequences.

    PubMed

    Chiu, Shih-Hau; Chen, Chien-Chi; Yuan, Gwo-Fang; Lin, Thy-Hou

    2006-06-15

    The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation scheme is urgently needed to reduce the gap between the amount of new sequences produced and reliable functional annotation. This work proposes rules for automatically classifying the fungus genes. The approach involves elucidating the enzyme classifying rule that is hidden in the UniProt protein knowledgebase and then applying it for classification. The association algorithm, Apriori, is utilized to mine the relationship between the enzyme class and significant InterPro entries. The candidate rules are evaluated for their classificatory capacity. Five datasets were collected from Swiss-Prot for establishing the annotation rules. These were treated as the training sets. The TrEMBL entries were treated as the testing set. A correct enzyme classification rate of 70% was obtained for the prokaryote datasets and a similar rate of about 80% was obtained for the eukaryote datasets. The fungus training dataset, which lacks an enzyme class description, was also used to evaluate the fungus candidate rules. A total of 88 out of 5085 test entries were matched with the fungus rule set. These were otherwise poorly annotated using their functional descriptions. The feasibility of using the method presented here to classify enzyme classes based on the enzyme domain rules is evident. The rules may also be employed by protein annotators in manual annotation or implemented in an automatic annotation flowchart.
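
    The level-wise search that Apriori performs can be sketched as below. This is a minimal textbook version, not the authors' implementation; the item names (`IPR_a`, `EC_1`, and so on) are hypothetical stand-ins for InterPro entries and enzyme classes.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = [frozenset([i]) for i in {i for t in transactions for i in t}]
    frequent = {}
    k = 1
    while current:
        # Count the support of each candidate and keep the frequent ones.
        level = {}
        for cand in current:
            s = sum(1 for t in transactions if cand <= t) / n
            if s >= min_support:
                level[cand] = s
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


# Hypothetical annotated entries: domain signatures plus an enzyme class label.
entries = [
    {"IPR_a", "IPR_b", "EC_1"},
    {"IPR_a", "EC_1"},
    {"IPR_a", "IPR_b", "EC_1"},
    {"IPR_c", "EC_2"},
]
frequent_sets = apriori(entries, min_support=0.5)
```

    Candidate classification rules would then be read off frequent itemsets that contain a class label, for example `{IPR_a} -> EC_1`, and scored by confidence.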

  1. Application of data mining in science and technology management information system based on WebGIS

    NASA Astrophysics Data System (ADS)

    Wu, Xiaofang; Xu, Zhiyong; Bao, Shitai; Chen, Feixiang

    2009-10-01

    With the rapid development of science and technology and the rapid growth of information, a great deal of data has accumulated in science and technology management departments. Much knowledge and many rules are usually concealed in this data, so extracting and fully using that knowledge is very important for science and technology management. It helps to examine and approve science and technology projects more scientifically and makes it easier to turn research achievements into real productive forces. This paper therefore studies data mining technology and applies it to a science and technology management information system in order to find and extract that knowledge. After analyzing the disadvantages of traditional science and technology management information systems, the paper introduces database technology, data mining, and web geographic information systems (WebGIS) technology to develop a science and technology management information system based on WebGIS. Key problems such as data mining and statistical analysis are studied in detail. In addition, a prototype system is developed and validated on project data from the National Natural Science Foundation Committee. Spatial data mining is carried out along the axes of time, space, and other factors. A variety of knowledge and rules is then extracted using data mining technology, which provides effective support for decision-making.

  2. 76 FR 6110 - Mine Safety Disclosure

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-03

    ... Comments Use the Commission's Internet comment form ( http://www.sec.gov/rules/proposed.shtml ); Send an e... all comments on the Commission's Internet Web site ( http://www.sec.gov/rules/proposed.shtml... on the proposal to, among other things, allow for the collection of information and improve the...

  3. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 3 2012-07-01 2012-07-01 false Oregon Federal program. 937.700 Section 937.700... PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE OREGON § 937.700 Oregon Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Oregon...

  4. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 3 2011-07-01 2011-07-01 false Oregon Federal program. 937.700 Section 937.700... PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE OREGON § 937.700 Oregon Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Oregon...

  5. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 3 2014-07-01 2014-07-01 false Oregon Federal program. 937.700 Section 937.700... PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE OREGON § 937.700 Oregon Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Oregon...

  6. Frequent Itemset Hiding Algorithm Using Frequent Pattern Tree Approach

    ERIC Educational Resources Information Center

    Alnatsheh, Rami

    2012-01-01

    A problem that has been the focus of much recent research in privacy preserving data-mining is the frequent itemset hiding (FIH) problem. Identifying itemsets that appear together frequently in customer transactions is a common task in association rule mining. Organizations that share data with business partners may consider some of the frequent…

  7. 76 FR 12648 - Lowering Miners' Exposure to Respirable Coal Mine Dust, Including Continuous Personal Dust Monitors

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-08

    ... be appropriate to use on a short-term basis. 13. The proposed rule addresses (1) which occupations... for respirable coal mine dust, provide for full- shift sampling, redefine the term ``normal production... respect to their availability. If shorter or longer timeframes are recommended, please provide the...

  8. Mining Research on Vibration Signal Association Rules of Quayside Container Crane Hoisting Motor Based on Apriori Algorithm

    NASA Astrophysics Data System (ADS)

    Yang, Chencheng; Tang, Gang; Hu, Xiong

    2017-07-01

    In daily operation, a quayside container crane hoisting motor produces a large amount of vibration signal data. To analyze the correlations among the data and discover faults and potential safety hazards of the motor, the data are first discretized, and the Apriori algorithm is then used to mine the strong association rules among the data. The results show that day 1 and day 16 are the most closely related, which can guide staff to analyze the motor's operation on these two days in order to find and resolve faults and safety problems.
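
    The discretization step that precedes rule mining can be sketched as follows. Equal-width binning is assumed here because the abstract does not say which scheme was used; the feature names `rms` and `peak` and the three bin labels are likewise hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in [0, n_bins - 1] using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min() keeps the maximum value inside the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def to_transactions(columns, labels=("low", "mid", "high")):
    """Turn numeric feature columns into one 'feature=label' itemset per row."""
    binned = {name: equal_width_bins(vals, len(labels))
              for name, vals in columns.items()}
    n_rows = len(next(iter(columns.values())))
    return [frozenset(f"{name}={labels[binned[name][i]]}" for name in columns)
            for i in range(n_rows)]


# Hypothetical vibration features, one value per measurement window.
signals = {
    "rms": [0.0, 1.0, 2.0, 3.0],
    "peak": [10.0, 10.0, 40.0, 40.0],
}
transactions = to_transactions(signals)
```

    Each resulting itemset can be passed to any frequent-itemset miner such as Apriori; the choice of bin count directly affects which rules reach the support threshold.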

  9. 30 CFR 910.817 - Performance standards-underground mining activities.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4. [47 FR 36399, Aug. 19, 1982...

  10. 30 CFR 910.816 - Performance standards-surface mining activities.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... except in compliance with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4...

  11. 30 CFR 910.817 - Performance standards-underground mining activities.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4. [47 FR 36399, Aug. 19, 1982...

  12. 30 CFR 910.817 - Performance standards-underground mining activities.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4. [47 FR 36399, Aug. 19, 1982...

  13. 30 CFR 910.816 - Performance standards-surface mining activities.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... except in compliance with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4...

  14. 30 CFR 910.816 - Performance standards-surface mining activities.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... except in compliance with the Georgia Safe Dams Act and Rules for Safety of the Natural Resources, Environmental Protection Division; the Solid Waste Management Rules of the Georgia Department of Natural Resources, Environmental Protection Division, Chapter 391-3-4; and the Georgia Seed Laws and Regulation 4...

  15. Sustainability Activities In The Mining Sector: Current Status And Challenges Ahead Limestone Mining In Nusakambangan

    NASA Astrophysics Data System (ADS)

    Ayuningrum, Theresia Vika; Purnaweni, Hartuti

    2018-02-01

    The karst area in Nusakambangan plays an important role in maintaining the balance of nature, but mining activities inevitably change the environmental conditions there. So that resource utilization balances the interests of mining against the sustainability of the environment, every activity in the mining sector requires a variety of environmental studies. The purpose of this study is to analyze environmental management in relation to limestone mining activities in Nusakambangan, in order to identify management of the mining areas that is optimal, wise, based on ecological principles, and sustainable. The study uses qualitative research methods, with data analysis by descriptive percentage, and the data collected comprise primary data and secondary data.

  16. APPLYING DATA MINING APPROACHES TO FURTHER ...

    EPA Pesticide Factsheets

    This dataset will be used to illustrate various data mining techniques to biologically profile the chemical space.

  17. Clustering and Dimensionality Reduction to Discover Interesting Patterns in Binary Data

    NASA Astrophysics Data System (ADS)

    Palumbo, Francesco; D'Enza, Alfonso Iodice

    The attention towards binary data coding increased consistently in the last decade due to several reasons. The analysis of binary data characterizes several fields of application, such as market basket analysis, DNA microarray data, image mining, text mining and web-clickstream mining. The paper illustrates two different approaches exploiting a profitable combination of clustering and dimensionality reduction for the identification of non-trivial association structures in binary data. An application in the Association Rules framework supports the theory with the empirical evidence.

  18. Online breakage detection of multitooth tools using classifier ensembles for imbalanced data

    NASA Astrophysics Data System (ADS)

    Bustillo, Andrés; Rodríguez, Juan J.

    2014-12-01

    Cutting tool breakage detection is an important task, due to its economic impact on mass production lines in the automobile industry. This task presents a central limitation: real data-sets are extremely imbalanced because breakage occurs in very few cases compared with normal operation of the cutting process. In this paper, we present an analysis of different data-mining techniques applied to the detection of insert breakage in multitooth tools. The analysis applies only one experimental variable: the electrical power consumption of the tool drive. This restriction profiles real industrial conditions more accurately than other physical variables, such as acoustic or vibration signals, which are not so easily measured. Many efforts have been made to design a method that is able to identify breakages with a high degree of reliability within a short period of time. The solution is based on classifier ensembles for imbalanced data-sets. Classifier ensembles are combinations of classifiers, which in many situations are more accurate than individual classifiers. Six different base classifiers are tested: Decision Trees, Rules, Naïve Bayes, Nearest Neighbour, Multilayer Perceptrons and Logistic Regression. Three different balancing strategies are tested with each of the classifier ensembles and compared to their performance with the original data-set: Synthetic Minority Over-Sampling Technique (SMOTE), undersampling and a combination of SMOTE and undersampling. To identify the most suitable data-mining solution, Receiver Operating Characteristics (ROC) graph and Recall-precision graph are generated and discussed. The performance of logistic regression ensembles on the balanced data-set using the combination of SMOTE and undersampling turned out to be the most suitable technique. 
    Finally, a comparison using industrial performance measures is presented, which concludes that this technique is also better suited to this industrial problem than the other techniques presented in the literature.
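
    The two balancing strategies named in this abstract can be sketched in a few lines. This is a simplified illustration of SMOTE-style interpolation and random undersampling, not the authors' pipeline; the function names, the neighbour count `k`, and the toy points are assumptions. The minority set needs at least two points.

```python
import math
import random


def smote(minority, n_new, k=2, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(p, x))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def undersample(majority, n_keep, rng=None):
    """Keep a random subset of the majority class, without replacement."""
    rng = rng or random.Random(0)
    return rng.sample(majority, n_keep)


# Hypothetical feature vectors: few breakage cases, many normal cases.
breakage = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
normal = [(float(i), float(i % 7)) for i in range(100)]

balanced_minority = breakage + smote(breakage, n_new=17)
balanced_majority = undersample(normal, n_keep=20)
```

    Combining both, as in the last strategy the abstract lists, grows the minority class and shrinks the majority class until the two are of comparable size.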

  19. Learning Semantic Tags from Big Data for Clinical Text Representation.

    PubMed

    Li, Yanpeng; Liu, Hongfang

    2015-01-01

    In clinical text mining, one of the biggest challenges is to represent medical terminologies and n-gram terms in sparse medical reports using either supervised or unsupervised methods. Addressing this issue, we propose a novel method for word and n-gram representation at the semantic level. We first represent each word by its distance to a set of reference features calculated by a reference distance estimator (RDE) learned from labeled and unlabeled data, and then generate new features using simple techniques of discretization, random sampling and merging. The new features are a set of binary rules that can be interpreted as semantic tags derived from words and n-grams. We show that the new features significantly outperform classical bag-of-words and n-grams in the task of heart disease risk factor extraction in the i2b2 2014 challenge. It is promising to see that semantic tags can be used to replace the original text entirely with even better prediction performance, as well as to derive new rules beyond the lexical level.

  20. A systematic mapping study of process mining

    NASA Astrophysics Data System (ADS)

    Maita, Ana Rocío Cárdenas; Martins, Lucas Corrêa; López Paz, Carlos Ramón; Rafferty, Laura; Hung, Patrick C. K.; Peres, Sarajane Marques; Fantinato, Marcelo

    2018-05-01

    This study systematically assesses the process mining scenario from 2005 to 2014. The analysis of 705 papers evidenced 'discovery' (71%) as the main type of process mining addressed and 'categorical prediction' (25%) as the main mining task solved. The most applied traditional technique is the 'graph structure-based' one (38%). Specifically concerning computational intelligence and machine learning techniques, we concluded that little relevance has been given to them. The most applied are 'evolutionary computation' (9%) and 'decision tree' (6%), respectively. Process mining challenges, such as balancing among robustness, simplicity, accuracy and generalization, could benefit from a larger use of such techniques.

  1. Privacy Preserving Sequential Pattern Mining in Data Stream

    NASA Astrophysics Data System (ADS)

    Huang, Qin-Hua

    Research on privacy preserving data mining techniques has gained much attention in recent years. For data stream systems, wireless networks and mobile devices, research on the related stream data mining techniques is still at an early stage. In this paper, a data mining algorithm dealing with the privacy preserving problem in data streams is presented.

  2. Assessment of corn and banana leaves as potential standardized substrates for leaf decomposition in streams affected by mountaintop removal coal mining, West Virginia, USA

    EPA Science Inventory

    Mountaintop removal and valley filling is a method of coal mining that buries Central Appalachian headwater streams. A 2007 federal court ruling highlighted the need for measurement of both ecosystem structure and function when assessing streams for mitigation. Rapid functional as...

  3. Discovering amino acid patterns on binding sites in protein complexes

    PubMed Central

    Kuo, Huang-Cheng; Ong, Ping-Lin; Lin, Jung-Chang; Huang, Jen-Peng

    2011-01-01

    Discovering amino acid (AA) patterns on protein binding sites has recently become popular. We propose a method to discover the association relationship among AAs on binding sites. Such knowledge of binding sites is very helpful in predicting protein-protein interactions. In this paper, we focus on protein complexes which have protein-protein recognition. The association rule mining technique is used to discover geographically adjacent amino acids on a binding site of a protein complex. When mining, instead of treating all AAs of binding sites as a transaction, we geographically partition AAs of binding sites in a protein complex. AAs in a partition are treated as a transaction. For the partition process, AAs on a binding site are projected from three-dimensional to two-dimensional. And then, assisted with a circular grid, AAs on the binding site are placed into grid cells. A circular grid has ten rings: a central ring, the second ring with 6 sectors, the third ring with 12 sectors, and later rings are added to four sectors in order. As for the radius of each ring, we examined the complexes and found that 10Å is a suitable range, which can be set by the user. After placing these recognition complexes on the circular grid, we obtain mining records (i.e. transactions) from each sector. A sector is regarded as a record. Finally, we use the association rule to mine these records for frequent AA patterns. If the support of an AA pattern is larger than the predetermined minimum support (i.e. threshold), it is called a frequent pattern. With these discovered patterns, we offer the biologists a novel point of view, which will improve the prediction accuracy of protein-protein recognition. In our experiments, we produced the AA patterns by data mining. As a result, we found that arginine (arg) most frequently appears on the binding sites of two proteins in the recognition protein complexes, while cysteine (cys) appears the fewest. 
    In addition, if we further discriminate the shape of binding sites between concave and convex, we discover that the patterns {arg, glu, asp} and {arg, ser, asp} on the concave shape of binding sites in a protein more frequently (i.e. with higher probability) make contact with {lys} or {arg} on the convex shape of binding sites in another protein, with a confidence of at least 78%. On the other hand, {val, gly, lys} on the convex surface of binding sites in proteins is more frequently in contact with {asp} on the concave site of another protein, and the confidence achieved is over 81%. Applying data mining in biology can reveal more facts that may otherwise be ignored or not easily discovered by the naked eye. Furthermore, we can discover more relationships among AAs on binding sites by appropriately rotating these residues on binding sites from a three-dimensional to a two-dimensional perspective. We designed a circular grid to deposit the data, which total 463 records consisting of AAs. Then we used association rules to mine these records for discovering relationships. The method proposed in this paper provides an insight into the characteristics of binding sites for recognition complexes. PMID:21464838
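
    The partition step, placing projected residues into cells of a circular grid so that each sector becomes one transaction, can be sketched as below. Only the first three rings (1, 6 and 12 sectors) and the 10 Å ring width are taken from the abstract; the sector counts of the outer rings are not clear from it, so `sectors_per_ring` is left as a parameter, and the function names and coordinates are hypothetical.

```python
import math


def grid_cell(x, y, ring_width=10.0, sectors_per_ring=(1, 6, 12)):
    """Return (ring, sector) for a projected 2-D point, or None if it lies
    outside the rings described by sectors_per_ring."""
    ring = int(math.hypot(x, y) // ring_width)
    if ring >= len(sectors_per_ring):
        return None
    n = sectors_per_ring[ring]
    angle = math.atan2(y, x) % (2 * math.pi)  # normalise to [0, 2*pi)
    # min() guards against rounding that lands exactly on 2*pi.
    sector = min(int(angle / (2 * math.pi) * n), n - 1)
    return ring, sector


def sector_transactions(residues, **grid_options):
    """Group residue names by grid cell; each cell is one mining record."""
    cells = {}
    for name, x, y in residues:
        cell = grid_cell(x, y, **grid_options)
        if cell is not None:
            cells.setdefault(cell, set()).add(name)
    return cells


# Hypothetical residues with 2-D coordinates in angstroms.
residues = [("arg", 1.0, 0.0), ("glu", 2.0, 1.0), ("asp", 0.0, 15.0)]
records = sector_transactions(residues)
```

    The resulting sets can be mined for frequent AA patterns by comparing each pattern's support against the minimum support threshold.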

  4. Association algorithm to mine the rules that govern enzyme definition and to classify protein sequences

    PubMed Central

    Chiu, Shih-Hau; Chen, Chien-Chi; Yuan, Gwo-Fang; Lin, Thy-Hou

    2006-01-01

    Background The number of sequences compiled in many genome projects is growing exponentially, but most of them have not been characterized experimentally. An automatic annotation scheme is urgently needed to reduce the gap between the amount of new sequences produced and reliable functional annotation. This work proposes rules for automatically classifying the fungus genes. The approach involves elucidating the enzyme classifying rule that is hidden in the UniProt protein knowledgebase and then applying it for classification. The association algorithm, Apriori, is utilized to mine the relationship between the enzyme class and significant InterPro entries. The candidate rules are evaluated for their classificatory capacity. Results Five datasets were collected from Swiss-Prot for establishing the annotation rules. These were treated as the training sets. The TrEMBL entries were treated as the testing set. A correct enzyme classification rate of 70% was obtained for the prokaryote datasets and a similar rate of about 80% was obtained for the eukaryote datasets. The fungus training dataset, which lacks an enzyme class description, was also used to evaluate the fungus candidate rules. A total of 88 out of 5085 test entries were matched with the fungus rule set. These were otherwise poorly annotated using their functional descriptions. Conclusion The feasibility of using the method presented here to classify enzyme classes based on the enzyme domain rules is evident. The rules may also be employed by protein annotators in manual annotation or implemented in an automatic annotation flowchart. PMID:16776838

  5. The Royal Navy and British Security Policy.

    DTIC Science & Technology

    1983-12-01

    supremacy were embodied in that fleet. Britannia ruled the waves around the world. Sixty-six years later Rear Admiral Sandy Woodward went into battle off...already sold to Australia and just over a dozen destroyers and frigates. Britannia ruled the waves around those remote islands only with great difficulty...with the Americans, vulnerability to mining and the costs in manpower and money that a larger force would require, ruled out the non-nuclear-powered

  6. Comparative analysis of data mining techniques for business data

    NASA Astrophysics Data System (ADS)

    Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd

    2014-12-01

    Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict what products customers will buy and the frequency of purchase, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental results reveal that all the data mining techniques accomplish their goals perfectly, but each technique has its own characteristics and specifications that demonstrate its accuracy, proficiency and preference.

  7. 75 FR 17803 - Self-Regulatory Organizations; NASDAQ OMX PHLX, Inc.; Notice of Filing and Immediate...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-07

    .... The text of the proposed rule change is available on the Exchange's Web site at http://nasdaqtrader... and discussed any comments it received on the proposed rule change. The text of these statements may... Mining Corporation (``NEM''); Palm, Inc. (``PALM''); Pfizer, Inc. (``PFE''); ''); Potash Corp...

  8. 29 CFR 2700.55 - Powers of Judges.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 29 Labor 9 2013-07-01 2013-07-01 false Powers of Judges. 2700.55 Section 2700.55 Labor Regulations Relating to Labor (Continued) FEDERAL MINE SAFETY AND HEALTH REVIEW COMMISSION PROCEDURAL RULES Hearings § 2700.55 Powers of Judges. Subject to these rules, a Judge is empowered to: (a) Administer oaths and...

  9. 29 CFR 2700.55 - Powers of Judges.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 29 Labor 9 2012-07-01 2012-07-01 false Powers of Judges. 2700.55 Section 2700.55 Labor Regulations Relating to Labor (Continued) FEDERAL MINE SAFETY AND HEALTH REVIEW COMMISSION PROCEDURAL RULES Hearings § 2700.55 Powers of Judges. Subject to these rules, a Judge is empowered to: (a) Administer oaths and...

  10. 29 CFR 2700.55 - Powers of Judges.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 29 Labor 9 2014-07-01 2014-07-01 false Powers of Judges. 2700.55 Section 2700.55 Labor Regulations Relating to Labor (Continued) FEDERAL MINE SAFETY AND HEALTH REVIEW COMMISSION PROCEDURAL RULES Hearings § 2700.55 Powers of Judges. Subject to these rules, a Judge is empowered to: (a) Administer oaths and...

  11. 20 CFR 410.687 - Rules governing the representation and advising of claimants and parties.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 20 Employees' Benefits 2 2010-04-01 2010-04-01 false Rules governing the representation and advising of claimants and parties. 410.687 Section 410.687 Employees' Benefits SOCIAL SECURITY ADMINISTRATION FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, TITLE IV-BLACK LUNG BENEFITS (1969...

  12. Runtime support for parallelizing data mining algorithms

    NASA Astrophysics Data System (ADS)

    Jin, Ruoming; Agrawal, Gagan

    2002-03-01

    With recent technological advances, shared memory parallel machines have become more scalable, and offer large main memories and high bus bandwidths. They are emerging as good platforms for data warehousing and data mining. In this paper, we focus on shared memory parallelization of data mining algorithms. We have developed a series of techniques for parallelization of data mining algorithms, including full replication, full locking, fixed locking, optimized full locking, and cache-sensitive locking. Unlike previous work on shared memory parallelization of specific data mining algorithms, all of our techniques apply to a large number of common data mining algorithms. In addition, we propose a reduction-object based interface for specifying a data mining algorithm. We show how our runtime system can apply any of the techniques we have developed starting from a common specification of the algorithm.
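
    The idea usually meant by full replication, where each worker updates a private copy of the reduction object and the copies are merged at the end so that no locking is needed, can be sketched as below. This is an illustration of the pattern under that assumption, not the authors' runtime system; the function names are hypothetical, and Python threads are used only to show the structure, not to demonstrate a speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(chunk):
    """Worker: accumulate item counts into its own private reduction object."""
    local = Counter()
    for transaction in chunk:
        local.update(set(transaction))  # count each item once per transaction
    return local


def replicated_item_counts(transactions, n_workers=4):
    """Split the data, reduce each part independently, then merge the replicas."""
    chunks = [transactions[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_chunk, chunks))
    total = Counter()
    for partial in partials:  # merge step, done by a single thread
        total.update(partial)
    return total


baskets = [["a", "b"], ["a"], ["b", "c"], ["a", "c"], ["a", "a"]]
item_counts = replicated_item_counts(baskets, n_workers=2)
```

    The trade-off is memory: every worker holds a full copy of the reduction object, which is why locking-based alternatives become attractive when that object is large.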

  13. Data Mining Techniques Applied to Hydrogen Lactose Breath Test.

    PubMed

    Rubio-Escudero, Cristina; Valverde-Fernández, Justo; Nepomuceno-Chamorro, Isabel; Pontes-Balanza, Beatriz; Hernández-Mendoza, Yoedusvany; Rodríguez-Herrera, Alfonso

    2017-01-01

    The objective is to analyze a set of hydrogen breath test data using data mining tools and to identify new patterns of H2 production. K-means clustering was applied as the data mining technique to hydrogen breath test data sets from 2571 patients. Six different patterns have been extracted upon analysis of the hydrogen breath test data. We have also shown the relevance of each of the samples taken throughout the test. Analysis of the hydrogen breath test data sets using data mining techniques has identified new patterns of hydrogen generation upon lactose absorption. We can see the potential of application of data mining techniques to clinical data sets. These results offer promising data for future research on the relations between gut microbiota produced hydrogen and its link to clinical symptoms.
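
    K-means clustering of test curves can be sketched with the standard Lloyd iteration below. The curves are made-up values standing in for H2 readings at three time points; the function name, seed and number of clusters are assumptions for illustration only.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the centroids stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return centroids, labels


# Hypothetical curves: three flat profiles and three rising profiles.
curves = [(2, 3, 4), (3, 3, 5), (2, 4, 4),
          (5, 40, 80), (6, 45, 90), (4, 38, 85)]
centroids, labels = kmeans(curves, k=2)
```

    Each centroid is itself a curve, so it can be read as the typical hydrogen production pattern of its cluster.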

  14. Compass: a hybrid method for clinical and biobank data mining.

    PubMed

    Krysiak-Baltyn, K; Nordahl Petersen, T; Audouze, K; Jørgensen, Niels; Angquist, L; Brunak, S

    2014-02-01

    We describe a new method for identification of confident associations within large clinical data sets. The method is a hybrid of two existing methods; Self-Organizing Maps and Association Mining. We utilize Self-Organizing Maps as the initial step to reduce the search space, and then apply Association Mining in order to find association rules. We demonstrate that this procedure has a number of advantages compared to traditional Association Mining; it allows for handling numerical variables without a priori binning and is able to generate variable groups which act as "hotspots" for statistically significant associations. We showcase the method on infertility-related data from Danish military conscripts. The clinical data we analyzed contained both categorical type questionnaire data and continuous variables generated from biological measurements, including missing values. From this data set, we successfully generated a number of interesting association rules, which relate an observation with a specific consequence and the p-value for that finding. Additionally, we demonstrate that the method can be used on non-clinical data containing chemical-disease associations in order to find associations between different phenotypes, such as prostate cancer and breast cancer. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Quantum algorithm for association rules mining

    NASA Astrophysics Data System (ADS)

    Yu, Chao-Hua; Gao, Fei; Wang, Qing-Le; Wen, Qiao-Yan

    2016-10-01

    Association rules mining (ARM) is one of the most important problems in knowledge discovery and data mining. Given a transaction database that has a large number of transactions and items, the task of ARM is to acquire consumption habits of customers by discovering the relationships between itemsets (sets of items). In this paper, we address ARM in the quantum setting and propose a quantum algorithm for the key part of ARM, finding frequent itemsets from the candidate itemsets and acquiring their supports. Specifically, for the case in which there are $M_f^{(k)}$ frequent $k$-itemsets in the $M_c^{(k)}$ candidate $k$-itemsets ($M_f^{(k)} \le M_c^{(k)}$), our algorithm can efficiently mine these frequent $k$-itemsets and estimate their supports by using parallel amplitude estimation and amplitude amplification with complexity $O\left(k\sqrt{M_c^{(k)} M_f^{(k)}}/\epsilon\right)$, where $\epsilon$ is the error for estimating the supports. Compared with the classical counterpart, i.e., the classical sampling-based algorithm, whose complexity is $O\left(k M_c^{(k)}/\epsilon^2\right)$, our quantum algorithm quadratically improves the dependence on both $\epsilon$ and $M_c^{(k)}$ in the best case when $M_f^{(k)} \ll M_c^{(k)}$ and on $\epsilon$ alone in the worst case when $M_f^{(k)} \approx M_c^{(k)}$.
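
    The best-case and worst-case claims in this abstract can be checked numerically. The sketch below reads the two bounds as k·sqrt(Mc·Mf)/ε for the quantum algorithm and k·Mc/ε² for classical sampling, which is an interpretation of the abstract's flattened notation that matches its stated improvements. Constants hidden by the O-notation are ignored, and the function names and sample values are assumptions for illustration.

```python
import math

def quantum_cost(k, m_c, m_f, eps):
    """Quantum bound read as k * sqrt(M_c * M_f) / eps (constants dropped)."""
    return k * math.sqrt(m_c * m_f) / eps

def classical_cost(k, m_c, eps):
    """Classical sampling bound read as k * M_c / eps**2 (constants dropped)."""
    return k * m_c / eps ** 2

k, m_c, eps = 3, 10_000, 0.01

# Best case: very few frequent itemsets (M_f << M_c).
best = classical_cost(k, m_c, eps) / quantum_cost(k, m_c, 1, eps)
# Worst case: almost every candidate is frequent (M_f ~ M_c).
worst = classical_cost(k, m_c, eps) / quantum_cost(k, m_c, m_c, eps)

print(best)   # ratio is sqrt(M_c / M_f) / eps: gain in both M_c and eps
print(worst)  # ratio is 1 / eps: gain in eps only
```

    The ratio classical/quantum simplifies to sqrt(Mc/Mf)/ε, which is why the advantage shrinks to a factor of 1/ε when Mf approaches Mc.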

  16. Analyzing injury severity factors at highway railway grade crossing accidents involving vulnerable road users: A comparative study.

    PubMed

    Ghomi, Haniyeh; Bagheri, Morteza; Fu, Liping; Miranda-Moreno, Luis F

    2016-11-16

    The main objective of this study is to identify the main factors associated with injury severity of vulnerable road users (VRUs) involved in accidents at highway railroad grade crossings (HRGCs) using data mining techniques. This article applies an ordered probit model, association rules, and classification and regression tree (CART) algorithms to the U.S. Federal Railroad Administration's (FRA) HRGC accident database for the period 2007-2013 to identify VRU injury severity factors at HRGCs. The results show that train speed is a key factor influencing injury severity. Further analysis illustrated that the presence of illumination does not reduce the severity of accidents for high-speed trains. In addition, there is a greater propensity toward fatal accidents for elderly road users compared to younger individuals. Interestingly, at night, injury accidents involving female road users are more severe compared to those involving males. The ordered probit model was the primary technique, and CART and association rules act as the supporter and identifier of interactions between variables. All 3 algorithms' results consistently show that the most influential accident factors are train speed, VRU age, and gender. The findings of this research could be applied for identifying high-risk hotspots and developing cost-effective countermeasures targeting VRUs at HRGCs.

  17. A Review of Financial Accounting Fraud Detection based on Data Mining Techniques

    NASA Astrophysics Data System (ADS)

    Sharma, Anuj; Kumar Panigrahi, Prabin

    2012-02-01

    With the upsurge in financial accounting fraud experienced in the current economic scenario, financial accounting fraud detection (FAFD) has become an emerging topic of great importance for academia, research and industry. The failure of organizations' internal auditing systems to identify accounting fraud has led to the use of specialized procedures to detect financial accounting fraud, collectively known as forensic accounting. Data mining techniques are providing great aid in financial accounting fraud detection, since dealing with the large data volumes and complexities of financial data is a big challenge for forensic accounting. This paper presents a comprehensive review of the literature on the application of data mining techniques for the detection of financial accounting fraud and proposes a framework for data mining based accounting fraud detection. The systematic and comprehensive literature review of the data mining techniques applicable to financial accounting fraud detection may provide a foundation for future research in this field. The findings of this review show that data mining techniques like logistic models, neural networks, Bayesian belief networks, and decision trees have been applied most extensively to provide primary solutions to the problems inherent in the detection and classification of fraudulent data.

  18. Evaluating bump control techniques through convergence monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Campoli, A.A.

    1987-07-01

    A coal mine bump is the violent failure of a pillar or pillars due to overstress. Retreat coal mining concentrates stresses on the pillars directly outby gob areas, and the situation becomes critical when mining a coalbed encased in rigid associated strata. Bump control techniques employed by the Olga Mine, McDowell County, WV, were evaluated through convergence monitoring in a Bureau of Mines study. Olga uses a novel pillar splitting mining method to extract 55-ft by 70-ft chain pillars, under 1,100 to 1,550 ft of overburden. Three rows of pillars are mined simultaneously to soften the pillar line and reduce strain energy storage capacity. Localized stress reduction (destressing) techniques, auger drilling and shot firing, induced approximately 0.1 in. of roof-to-floor convergence in "high"-stress pillars near the gob line. Auger drilling of a "low"-stress pillar located between two barrier pillars produced no convergence effects.

  19. Decision Tree based Prediction and Rule Induction for Groundwater Trichloroethene (TCE) Pollution Vulnerability

    NASA Astrophysics Data System (ADS)

    Park, J.; Yoo, K.

    2013-12-01

    For groundwater resource conservation, it is important to accurately assess groundwater pollution sensitivity or vulnerability. In this work, we attempted to use a data mining approach to assess groundwater pollution vulnerability at a TCE (trichloroethylene) contaminated Korean industrial site. The conventional DRASTIC method failed to describe the TCE sensitivity data, showing a poor correlation with hydrogeological properties. Among the data mining methods compared, namely Artificial Neural Network (ANN), Multiple Logistic Regression (MLR), Case Base Reasoning (CBR), and Decision Tree (DT), the accuracy and consistency of DT were the best. According to the subsequent tree analyses with the optimal DT model, the failure of the conventional DRASTIC method to fit the TCE sensitivity data may be due to the use of inaccurate weight values of hydrogeological parameters for the study site. These findings provide a proof of concept that a DT-based data mining approach can be used for prediction and rule induction of groundwater TCE sensitivity without pre-existing information on the weights of hydrogeological properties.

  20. An Analysis Pipeline with Statistical and Visualization-Guided Knowledge Discovery for Michigan-Style Learning Classifier Systems

    PubMed Central

    Urbanowicz, Ryan J.; Granizo-Mackenzie, Ambrose; Moore, Jason H.

    2014-01-01

    Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms which distribute the learned solution over a sizable population of rules. However, their application to complex real-world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes call for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data. PMID:25431544

  1. Analysis of correlation between pediatric asthma exacerbation and exposure to pollutant mixtures with association rule mining.

    PubMed

    Toti, Giulia; Vilalta, Ricardo; Lindner, Peggy; Lefer, Barry; Macias, Charles; Price, Daniel

    2016-11-01

    Traditional studies on effects of outdoor pollution on asthma have been criticized for questionable statistical validity and inefficacy in exploring the effects of multiple air pollutants, alone and in combination. Association rule mining (ARM), a method easily interpretable and suitable for the analysis of the effects of multiple exposures, could be of use, but the traditional interest metrics of support and confidence need to be substituted with metrics that focus on risk variations caused by different exposures. We present an ARM-based methodology that produces rules associated with relevant odds ratios and limits the number of final rules even at very low support levels (0.5%), thanks to post-pruning criteria that limit rule redundancy and control for statistical significance. The methodology has been applied to a case-crossover study to explore the effects of multiple air pollutants on risk of asthma in pediatric subjects. We identified 27 rules with interesting odds ratios among more than 10,000 having the required support. The only rule including only one chemical is exposure to ozone on the previous day of the reported asthma attack (OR=1.14). 26 combinatory rules highlight the limitations of air quality policies based on single pollutant thresholds and suggest that exposure to mixtures of chemicals is more harmful, with odds ratios as high as 1.54 (associated with the combination day0 SO2, day0 NO, day0 NO2, day1 PM). The proposed method can be used to analyze risk variations caused by single and multiple exposures. The method is reliable and requires fewer assumptions on the data than parametric approaches. Rules including more than one pollutant highlight interactions that deserve further investigation, while helping to limit the search field. Copyright © 2016 Elsevier B.V. All rights reserved.
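
    To make the difference between the traditional metrics and a risk-oriented metric concrete, here is a minimal sketch that scores one exposure rule by support, confidence, and a crude 2x2 odds ratio. It is a simplification under stated assumptions: the records are hypothetical, the function name `rule_metrics` is invented for the example, and the unadjusted 2x2 odds ratio stands in for whatever estimator the authors used in their case-crossover design.

```python
def rule_metrics(records, antecedent):
    """Score the rule "antecedent exposures -> outcome".

    records: list of (set_of_exposures, outcome_bool)
    antecedent: set of exposures that must all be present
    Returns (support, confidence, odds_ratio).
    """
    a = b = c = d = 0  # 2x2 table: exposed/outcome, exposed/no, unexposed/outcome, unexposed/no
    for exposures, outcome in records:
        exposed = antecedent <= exposures  # subset test
        if exposed and outcome:
            a += 1
        elif exposed:
            b += 1
        elif outcome:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    support = a / n
    confidence = a / (a + b) if a + b else float("nan")
    odds_ratio = (a * d) / (b * c) if b and c else float("nan")
    return support, confidence, odds_ratio

# Hypothetical exposure records (pollutants above some threshold, asthma event yes/no).
records = (
    [({"SO2", "NO2"}, True)] * 3
    + [({"SO2", "NO2", "O3"}, False)]
    + [({"O3"}, True)] * 2
    + [(set(), False)] * 4
)
print(rule_metrics(records, {"SO2", "NO2"}))
```

    A rule can have very low support and still carry a large odds ratio, which is the situation the abstract describes when mining at support levels as low as 0.5%.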

  2. Study on the Rule of Super Strata Movement and Subsidence

    NASA Astrophysics Data System (ADS)

    Yao, Shunli; Yuan, Hongyong; Jiang, Fuxing; Chen, Tao; Wu, Peng

    2018-01-01

    The movement of key strata is related to the safety of the whole earth's surface for coal mining under super strata. Based on the key strata theory, the paper comprehensively analyzes the characteristics of the subsidence before and after the instability of the super strata by studying them through FLAC3D and microseismic dynamic monitoring of the surface rock movement observation. The stability of the super strata movement is analyzed according to the characteristic value of the subsidence. The subsidence law and quantitative indexes under the control of the super rock strata provide a basis for the prevention and control of surface risk, for optimizing the mining area and face layout, and for reasonably setting the mining boundary around the mining area. They also provide a basis for the even growth of mine safety production and regional public safety.

  3. A strategy for selecting data mining techniques in metabolomics.

    PubMed

    Banimustafa, Ahmed Hmaidan; Hardy, Nigel W

    2012-01-01

    There is a general agreement that the development of metabolomics depends not only on advances in chemical analysis techniques but also on advances in computing and data analysis methods. Metabolomics data usually requires intensive pre-processing, analysis, and mining procedures. Selecting and applying such procedures requires attention to issues including justification, traceability, and reproducibility. We describe a strategy for selecting data mining techniques which takes into consideration the goals of data mining techniques on the one hand, and the goals of metabolomics investigations and the nature of the data on the other. The strategy aims to ensure the validity and soundness of results and promote the achievement of the investigation goals.

  4. The Value of Data Mining in Music Education Research and Some Findings from Its Application to a Study of Instrumental Learning during Childhood

    ERIC Educational Resources Information Center

    Faulkner, Robert; Davidson, Jane W.; McPherson, Gary E.

    2010-01-01

    The use of data mining for the analysis of data collected in natural settings is increasingly recognized as a legitimate mode of enquiry. This rule-inductive paradigm is an effective means of discovering relationships within large datasets--especially in research that has limited experimental design--and for the subsequent formulation of…

  5. Data mining techniques for assisting the diagnosis of pressure ulcer development in surgical patients.

    PubMed

    Su, Chao-Ton; Wang, Pa-Chun; Chen, Yan-Cheng; Chen, Li-Fei

    2012-08-01

    Pressure ulcers are a serious problem during patient care processes. The high-risk factors in the development of pressure ulcers during long surgery remain unclear. Moreover, past preventive policies are hard to implement in a busy operating room. The objective of this study is to use data mining techniques to construct a prediction model for pressure ulcers. Four data mining techniques, namely, Mahalanobis Taguchi System (MTS), Support Vector Machines (SVMs), decision tree (DT), and logistic regression (LR), are used to select the important attributes from the data to predict the incidence of pressure ulcers. Measurements of sensitivity, specificity, F1, and g-means were used to compare the performance of the four classifiers on the pressure ulcer data set. The results show that data mining techniques obtain good results in predicting the incidence of pressure ulcers. We can conclude that data mining techniques can help identify the important factors and provide a feasible model to predict pressure ulcer development.
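
    The four comparison measures named in this abstract can all be derived from a binary confusion matrix. The sketch below assumes the usual definitions (F1 as the harmonic mean of precision and sensitivity, g-means as the geometric mean of sensitivity and specificity); the counts and the function name `classifier_metrics` are hypothetical and are not taken from the study.

```python
import math

def classifier_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, F1 and g-mean from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)        # true positive rate (recall)
    specificity = tn / (tn + fp)        # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "g_mean": g_mean,
    }

# Hypothetical imbalanced test set: 10 ulcer cases among 100 patients.
print(classifier_metrics(tp=8, fn=2, fp=10, tn=80))
```

    On imbalanced data such as rare adverse events, g-mean and F1 are often preferred over raw accuracy because a classifier that predicts "no event" for everyone scores zero on both.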

  6. Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

    ERIC Educational Resources Information Center

    Chen, Hsinchun

    2003-01-01

    Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)

  7. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to apply artificial intelligence models to a medical data set and compare them with logistic regression. ANN, GA, and logistic regression analyses were carried out on the data set of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules for predicting urinary stones. Rule 1 consisted of being male, pain not spreading to the back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for the four applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and for constructing clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
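
    The likelihood ratios reported in this abstract follow from the sensitivities and specificities via the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below recomputes them from the published percentages; the function name is an assumption, and small differences in the last digit are to be expected because the inputs are themselves rounded.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

# Sensitivity and specificity as reported in the abstract.
reported = {
    "ANN": (0.949, 0.784),
    "GA rule 1": (0.676, 0.7647),
    "GA rule 2": (0.568, 0.863),
    "Logistic regression": (0.955, 0.471),
}
for name, (sens, spec) in reported.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

    Read this way, logistic regression's high sensitivity comes with a low specificity, which is why its positive likelihood ratio is the weakest of the four.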

  8. The Usage of Association Rule Mining to Identify Influencing Factors on Deafness After Birth.

    PubMed

    Shahraki, Azimeh Danesh; Safdari, Reza; Gahfarokhi, Hamid Habibi; Tahmasebian, Shahram

    2015-12-01

    Providing complete, high-quality health care services plays a very important role in enabling people to understand the factors related to personal and social health and to choose suitable healthy behaviors in order to achieve a healthy life. For this reason, demographic and clinical data are collected on individuals, and this huge volume of data is a valuable resource for analyzing, exploring and discovering valuable information and communication. This study used association rule techniques from data mining to identify the factors affecting hearing loss after birth in Iran. The survey is a data-oriented study. The study population consisted of questionnaires from several provinces of the country. First, all questionnaire data were stored as information tables in SQL Server, with data entry performed using software written in C# .NET; then the Association algorithm in SQL Server Data Tools and the Clementine software were used to determine the rules and hidden patterns in the gathered data. Two factors, the number of deaf brothers and the degree of consanguinity of the parents, have a significant impact on the severity of deafness of individuals. Also, when the severity of hearing loss is greater than or equal to moderately severe hearing loss, people use hearing aids, and men are less interested in the use of hearing aids. In fact, it can be said that in families with consanguineous marriage of parents who are first-degree (girl/boy cousins) or second-degree relatives (girl/boy cousins), and especially first-degree relatives, the number of people with severe hearing loss or deafness is higher, and in the use of hearing aids the gender of the patient is more important than the severity of the hearing loss.

  9. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    DOE PAGES

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; ...

    2015-08-28

    Metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures.

  10. 26 CFR 1.367(a)-4T - Special rules applicable to specified transfers of property (temporary).

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... property (as defined in paragraph (b)(2) of this section) to a foreign corporation in an exchange described... subject to the rules of this paragraph (b) is any property that— (i) Is either mining property (as defined in section 617(f)(2)), section 1245 property (as defined in section 1245(a)(3)), section 1250...

  11. 40 CFR 52.1222 - Original Identification of plan section.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... between the State Pollution Control Agency and Erie Mining Company submitted by the State on February 20... 19, 1983, at 8 S.R. 1419 (text of rule starting at 8 S.R. 1420) and adopted as modified on April 16... Permits—Proposed and Published on December 19, 1983, at 8 S.R. 1419 (text of rule starting at 8 S.R. 1470...

  12. MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics.

    PubMed

    Jeffryes, James G; Colastani, Ricardo L; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D; Broadbelt, Linda J; Hanson, Andrew D; Fiehn, Oliver; Tyo, Keith E J; Henry, Christopher S

    2015-01-01

    In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. 
MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.

  13. Evaluation of Aster Images for Characterization and Mapping of Amethyst Mining Residues

    NASA Astrophysics Data System (ADS)

    Markoski, P. R.; Rolim, S. B. A.

    2012-07-01

    The objective of this work was to evaluate the potential of images from the VNIR (Visible and Near Infrared) and SWIR (Short Wave Infrared) subsystems of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) for discrimination and mapping of amethyst mining residues (basalt) in the Ametista do Sul region, Rio Grande do Sul State, Brazil. This region accounts for most of the world's amethyst mining. Basalt is extracted during the mining process and deposited outside the mine, so mounds of residue (basalt) build up. These mounds are many times smaller than the ASTER pixel size (VNIR: 15 meters; SWIR: 30 meters). Thus, each pixel becomes a mixture of various materials, hampering identification and mapping. To address this problem, the multispectral Maximum Likelihood (MaxVer) algorithm and the hyperspectral technique SAM (Spectral Angle Mapper) were used in this work. Images from the ASTER VNIR and SWIR subsystems were used to perform the classifications. The SAM technique produced better results than the MaxVer algorithm. The main error found with both techniques was confusion between the "shadow" and "mining residues/basalt" classes. With the SAM technique the confusion decreased because it employed the basalt spectral curve as a reference, while the multispectral technique employed groups of pixels that could have spectral mixture with other targets. The results showed that in tropical terrains such as the study area, ASTER data can be effective for the characterization of mining residues.
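
    The Spectral Angle Mapper mentioned in this abstract compares each pixel spectrum with a reference spectrum by the angle between them, which makes the comparison insensitive to overall brightness. The following is a minimal sketch of that idea; the band values and class names are hypothetical, the functions `spectral_angle` and `classify` are invented for the example, and spectra are assumed to be non-zero vectors.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors (assumed non-zero)."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = math.sqrt(sum(p * p for p in pixel)) * math.sqrt(sum(r * r for r in reference))
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_theta)

def classify(pixel, references):
    """Assign the pixel to the reference class with the smallest spectral angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical four-band reflectance spectra for two classes.
references = {
    "basalt": (0.10, 0.12, 0.15, 0.14),
    "shadow": (0.05, 0.04, 0.03, 0.02),
}
# A darker pixel with the same spectral shape as basalt.
dim_basalt = (0.05, 0.06, 0.075, 0.07)
print(classify(dim_basalt, references))
```

    Because only the shape of the spectrum matters, a dimly lit basalt pixel still matches the basalt reference, which is consistent with the abstract's observation that using the basalt spectral curve as a reference reduced confusion with shadow.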

  14. Mining knowledge from corpora: an application to retrieval and indexing.

    PubMed

    Soualmia, Lina F; Dahamna, Badisse; Darmoni, Stéfan

    2008-01-01

    The present work aims at discovering new associations between medical concepts to be exploited as input in retrieval and indexing. The association rules method is applied to documents. The process is carried out on three major document categories corresponding to e-health information consumers: health professionals, students and lay people. Association rule evaluation is founded on statistical measures combined with domain knowledge. Association rules represent existing relations between medical concepts (60.62%) and new knowledge (54.21%). Based on observations, 463 expert rules are defined by medical librarians for retrieval and indexing. Association rules bear out existing relations, produce new knowledge and support users and indexers in document retrieval and indexing.

  15. Microbial genotype-phenotype mapping by class association rule mining.

    PubMed

    Tamura, Makio; D'haeseleer, Patrik

    2008-07-01

    Microbial phenotypes are typically due to the concerted action of multiple gene functions, yet the presence of each gene may have only a weak correlation with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) instead of pairwise relations between a single gene and the phenotype. Here, we propose an efficient class association rule mining algorithm, netCAR, in order to extract sets of COGs (clusters of orthologous groups of proteins) associated with a phenotype from COG phylogenetic profiles and a phenotype profile. netCAR takes into account the phylogenetic co-occurrence graph between COGs to restrict the hypothesis space, and uses mutual information to evaluate the biconditional relation. We examined the mining capability of pairwise and multiple-to-one association by using netCAR to extract COGs relevant to six microbial phenotypes (aerobic, anaerobic, facultative, endospore, motility and Gram negative) from 11,969 unique COG profiles across 155 prokaryotic organisms. With the same level of false discovery rate, multiple-to-one association can extract about 10 times more relevant COGs than one-to-one association. We also reveal various topologies of association networks among COGs (modules) from extracted multiple-to-one correlation rules relevant to the six phenotypes, including a well-connected network for motility, a star-shaped network for aerobic and intermediate topologies for the other phenotypes. netCAR outperforms a standard CAR mining algorithm, CARapriori, while requiring several orders of magnitude less computational time for extracting 3-COG sets. Source code of the Java implementation is available as Supplementary Material at the Bioinformatics online website, or upon request to the author. Supplementary data are available at Bioinformatics online.
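
    The multiple-to-one idea in this abstract can be illustrated with a small mutual-information calculation: a set of genes is scored by how informative "all genes in the set are present" is about the phenotype. This is only a sketch of the scoring idea, not the netCAR algorithm itself; the presence/absence profiles and the names `cog_a` and `cog_b` are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def all_present(*profiles):
    """Per-organism indicator: 1 if every gene profile is 1 for that organism."""
    return [int(all(values)) for values in zip(*profiles)]

# Hypothetical presence/absence profiles across four organisms.
cog_a = [1, 1, 1, 0]
cog_b = [1, 1, 0, 1]
phenotype = [1, 1, 0, 0]

print(mutual_information(cog_a, phenotype))                      # single gene
print(mutual_information(all_present(cog_a, cog_b), phenotype))  # gene pair
```

    In this toy case each gene alone is only partly informative, while the pair matches the phenotype exactly, which mirrors the abstract's motivation for scoring gene sets rather than single genes.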

  16. Jackson Mills and Mine Falls Dams, Nashua, New Hampshire. Reconnaissance Report, Hydroelectric Feasibility. Volume 2. Mine Falls Dam.

    DTIC Science & Technology

    1980-01-01

    producers under a state law of 1978. Until the regulations under PURPA Title II (the National Energy Act of 1978) are promulgated and the PUC reviews this...hour (kWh); and it is FURTHER ORDERED, that the Commission will re-examine the PURPA issues in this proceeding upon the issuance of rules by the FERC

  17. Mining and Exploitation of Rare Earth Elements in Africa as an Engagement Strategy in US Africa Command

    DTIC Science & Technology

    2011-06-17

    rechargeable batteries, cell phones, catalytic converters, fluorescent lights, hybrid vehicle batteries, and other pollution control devices.21 Figure...79 Lee Yong-tim, “South China Villagers Slam Pollution from Rare Earth Mine,” February 22, 2008, http://www.rfa.org/english...writing and implementing new environmental standards. “The rules will limit pollutants allowed in waste water and emissions of radioactive elements

  18. 76 FR 51274 - Supplemental Nutrition Assistance Program: Major System Failures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-18

    ... data mining as necessary to determine if losses are occurring in the process of issuing benefits. It is... further by using data mining techniques on States' data or analyzing QC data for error patterns that may... conjunction with an additional sample of cases. Data mining techniques may be employed when QC data cannot...

  19. Assessment of practicality of remote sensing techniques for a study of the effects of strip mining in Alabama

    NASA Technical Reports Server (NTRS)

    Hughes, T. H.; Dillion, A. C., III; White, J. R., Jr.; Drummond, S. E., Jr.; Hooks, W. G.

    1975-01-01

    Because of the volume of coal produced by strip mining, the proximity of mining operations, and the diversity of mining methods (e.g. contour stripping, area stripping, multiple seam stripping, and augering, as well as underground mining), the Warrior Coal Basin seemed best suited for initial studies on the physical impact of strip mining in Alabama. Two test sites, (Cordova and Searles) representative of the various strip mining techniques and environmental problems, were chosen for intensive studies of the correlation between remote sensing and ground truth data. Efforts were eventually concentrated in the Searles Area, since it is more accessible and offers a better opportunity for study of erosional and depositional processes than the Cordova Area.

  20. Environmental characterisation of coal mine waste rock in the field: an example from New Zealand

    NASA Astrophysics Data System (ADS)

    Hughes, J.; Craw, D.; Peake, B.; Lindsay, P.; Weber, P.

    2007-08-01

    Characterisation of mine waste rock with respect to acid generation potential is a necessary part of routine mine operations, so that environmentally benign waste rock stacks can be constructed for permanent storage. Standard static characterisation techniques, such as acid neutralisation capacity (ANC), maximum potential acidity, and associated acid-base accounting, require laboratory tests that can be difficult to obtain rapidly at remote mine sites. We show that a combination of paste pH and a simple portable carbonate dissolution test, both techniques that can be done in the field in a 15 min time-frame, is useful for distinguishing rocks that are potentially acid-forming from those that are acid-neutralising. Use of these techniques could allow characterisation of mine wastes at the metre scale during mine excavation operations. Our application of these techniques to pyrite-bearing (total S = 1-4 wt%) but variably calcareous coal mine overburden shows that there is a strong correlation between the portable carbonate dissolution technique and laboratory-determined ANC measurements (range of 0-10 wt% calcite equivalent). Paste pH measurements on the same rocks are bimodal, with high-sulphur, low-calcite rocks yielding pH near 3 after 10 min, whereas high-ANC rocks yield paste pH of 7-8. In our coal mine example, the field tests were most effective when used in conjunction with stratigraphy. However, the same field tests have potential for routine use in any mine in which distinction of acid-generating rocks from acid-neutralising rocks is required. Calibration of field-based acid-base accounting characteristics of the rocks with laboratory-based static and/or kinetic tests is still necessary.

  1. Modeling and analysis of CSAMT field source effect and its characteristics

    NASA Astrophysics Data System (ADS)

    Da, Lei; Xiaoping, Wu; Qingyun, Di; Gang, Wang; Xiangrong, Lv; Ruo, Wang; Jun, Yang; Mingxin, Yue

    2016-02-01

    Controlled-source audio-frequency magnetotellurics (CSAMT) has been a highly successful geophysical tool used in a variety of geological exploration studies for many years. However, due to the artificial source used in the CSAMT technique, two important factors must be considered during interpretation: non-plane-wave or geometric effects and source overprint effects. Hence, in this paper we simulate the source overprint effects and analyze the rules and characteristics of their influence on CSAMT applications. Two-dimensional modeling was carried out using an adaptive unstructured finite element method to simulate several typical models. We also summarize the characteristics and rules of the source overprint effects and analyze their influence on data taken over several mining areas. The results obtained from the study show that the occurrence and strength of the source overprint effect depend on the location of the source dipole in relation to the receiver and the subsurface geology. In order to avoid source overprint effects, three principles are suggested for determining the best location for the grounded dipole source in the field.

  2. Research on preventive technologies for bed-separation water hazard in China coal mines

    NASA Astrophysics Data System (ADS)

    Gui, Herong; Tong, Shijie; Qiu, Weizhong; Lin, Manli

    2018-03-01

    Bed-separation water is one of the major water hazards in coal mines. Targeted research on preventive technologies is of paramount importance to safe mining. This article studies the restrictive effect of geological and mining factors, such as lithological properties of roof strata, coal seam inclination, water source to bed separations, roof management method, dimensions of the mining working face, and mining progress, on the formation of bed-separation water hazards. The key techniques to prevent bed-separation water-related accidents include interception, diversion, destruction of the buffer layer, grouting and backfilling, etc. The operation and efficiency of each technique are corroborated in field engineering cases. The results of this study will offer a reference for countries with similar mining conditions in research on bed-separation water bursts and hazard control in coal mines.

  3. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    PubMed

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  4. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  5. Integrated approach using data mining-based decision tree and object-based image analysis for high-resolution urban mapping of WorldView-2 satellite sensor data

    NASA Astrophysics Data System (ADS)

    Hamedianfar, Alireza; Shafri, Helmi Zulhaidi Mohd

    2016-04-01

    This paper integrates decision tree-based data mining (DM) and object-based image analysis (OBIA) to provide a transferable model for the detailed characterization of urban land-cover classes using WorldView-2 (WV-2) satellite images. Many articles have been published on OBIA in recent years based on DM for different applications. However, less attention has been paid to the generation of a transferable model for characterizing detailed urban land cover features. Three subsets of WV-2 images were used in this paper to generate transferable OBIA rule-sets. Many features were explored by using a DM algorithm, which created the classification rules as a decision tree (DT) structure from the first study area. The developed DT algorithm was applied to object-based classifications in the first study area. After this process, we validated the capability and transferability of the classification rules into second and third subsets. Detailed ground truth samples were collected to assess the classification results. The first, second, and third study areas achieved 88%, 85%, and 85% overall accuracies, respectively. Results from the investigation indicate that DM was an efficient method to provide the optimal and transferable classification rules for OBIA, which accelerates the rule-sets creation stage in the OBIA classification domain.

  6. Rules of co-occurring mutations characterize the antigenic evolution of human influenza A/H3N2, A/H1N1 and B viruses.

    PubMed

    Chen, Haifen; Zhou, Xinrui; Zheng, Jie; Kwoh, Chee-Keong

    2016-12-05

    The human influenza viruses undergo rapid evolution (especially in hemagglutinin (HA), a glycoprotein on the surface of the virus), which enables the virus population to constantly evade the human immune system. Therefore, the vaccine has to be updated every year to stay effective. There is a need to characterize the evolution of influenza viruses for better selection of vaccine candidates and the prediction of pandemic strains. Studies have shown that the influenza hemagglutinin evolution is driven by the simultaneous mutations at antigenic sites. Here, we analyze simultaneous or co-occurring mutations in the HA protein of human influenza A/H3N2, A/H1N1 and B viruses to predict potential mutations, characterizing the antigenic evolution. We obtain the rules of mutation co-occurrence using association rule mining after extracting HA1 sequences and detect co-mutation sites under strong selective pressure. Then we predict the potential drifts with specific mutations of the viruses based on the rules and compare the results with the "observed" mutations in different years. The sites under frequent mutations are in antigenic regions (epitopes) or receptor binding sites. Our study demonstrates the co-occurring site mutations obtained by rule mining can capture the evolution of influenza viruses, and confirms that cooperative interactions among sites of HA1 protein drive the influenza antigenic evolution.

  7. Traffic accident in Cuiabá-MT: an analysis through the data mining technology.

    PubMed

    Galvão, Noemi Dreyer; de Fátima Marin, Heimar

    2010-01-01

    Road traffic accidents (ATT) are non-intentional events of important magnitude worldwide, mainly in urban centers. This article aims to analyze data related to the victims of ATT recorded by the Justice Secretariat and Public Security (SEJUSP) and in hospital morbidity and mortality records in the city of Cuiabá-MT during 2006, using data mining technology. An observational, retrospective and exploratory study of the secondary databases was carried out. The three selected databases were linked using a probabilistic method, through the free software RecLink. One hundred and thirty-nine (139) real pairs of victims of ATT were obtained. Data mining technology was applied to this linked database with the software WEKA, using the Apriori algorithm. The analysis generated the 10 best rules, six of which were retained because, under the established parameters, they indicated useful and comprehensible knowledge for characterizing the victims of accidents in Cuiabá. Finally, the association rules showed peculiarities of the road traffic accident victims in Cuiabá and highlight the need for prevention measures against collision accidents for males.
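    To make the rule ranking mentioned above concrete, the following minimal sketch computes the support and confidence of a single association rule over a handful of records. The attribute names and records are invented for illustration and are not taken from the study; the code also does not reproduce WEKA's Apriori, only the two measures by which such rules are commonly filtered and ranked.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) over the transactions."""
    ante = support(transactions, antecedent)
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante if ante else 0.0


# Hypothetical accident records encoded as attribute=value items.
records = [
    {"sex=male", "type=collision", "outcome=hospitalized"},
    {"sex=male", "type=collision", "outcome=hospitalized"},
    {"sex=male", "type=collision", "outcome=discharged"},
    {"sex=female", "type=fall", "outcome=discharged"},
    {"sex=male", "type=fall", "outcome=hospitalized"},
]

antecedent = {"sex=male", "type=collision"}
consequent = {"outcome=hospitalized"}

rule_support = support(records, antecedent | consequent)
rule_confidence = confidence(records, antecedent, consequent)
print(rule_support, round(rule_confidence, 3))
```

    Apriori first enumerates itemsets whose support exceeds a minimum threshold and then keeps the rules whose confidence exceeds a second threshold; the sketch only evaluates one candidate rule.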

  8. Recommendation System Based On Association Rules For Distributed E-Learning Management Systems

    NASA Astrophysics Data System (ADS)

    Mihai, Gabroveanu

    2015-09-01

    Traditional Learning Management Systems are installed on a single server where learning materials and user data are kept. To increase performance, a Learning Management System can be installed on multiple servers, with learning materials and user data distributed across these servers to obtain a Distributed Learning Management System. This paper proposes the prototype of a recommendation system based on association rules for Distributed Learning Management Systems. Information from LMS databases is analyzed using distributed data mining algorithms in order to extract the association rules. The extracted rules are then used as inference rules to provide personalized recommendations. The quality of the recommendations is improved because the rules used to make the inferences are more accurate, since they aggregate knowledge from all e-Learning systems included in the Distributed Learning Management System.
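    The use of mined rules as inference rules can be sketched as follows. This is an illustrative assumption about how such a recommender might work, not the prototype described in the abstract: the rule list, the item names, and the `recommend` function are hypothetical, and the rules are taken as already mined and merged across servers.

```python
def recommend(rules, seen, top_n=3):
    """Recommend unseen items implied by rules whose antecedent the learner has covered.

    rules: iterable of (antecedent_set, consequent_set, confidence) tuples.
    seen:  set of learning items the learner has already accessed.
    """
    seen = set(seen)
    scores = {}
    for antecedent, consequent, conf in rules:
        if set(antecedent) <= seen:  # the rule fires for this learner
            for item in consequent:
                if item not in seen:
                    # Keep the strongest rule supporting each candidate item.
                    scores[item] = max(scores.get(item, 0.0), conf)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical rules, assumed to be aggregated from all LMS servers.
rules = [
    ({"intro_sql"}, {"joins"}, 0.8),
    ({"intro_sql", "joins"}, {"indexes"}, 0.7),
    ({"intro_sql"}, {"indexes"}, 0.5),
    ({"python_basics"}, {"pandas"}, 0.9),
]

print(recommend(rules, {"intro_sql"}))
print(recommend(rules, {"intro_sql", "joins"}))
```

    Ranking by the maximum confidence among firing rules is one simple design choice; summing or averaging the evidence from several rules would be equally plausible.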

  9. Implementing the Seapower Strategy

    DTIC Science & Technology

    2008-01-01

    between the two ends. Here is an example. When Britannia ruled the waves with a global navy to protect the empire, Sir Julian Corbett specified three...because torpedo boats, submarines, and mines threatened cheap kills.7 Upon the rise of the German High Seas Fleet in the decades before World War I...face swarms of small combatants are being developed with accompanying search and attack systems. We have reawakened to the threats from mines and quiet

  10. Comparing digital data processing techniques for surface mine and reclamation monitoring

    NASA Technical Reports Server (NTRS)

    Witt, R. G.; Bly, B. G.; Campbell, W. J.; Bloemer, H. H. L.; Brumfield, J. O.

    1982-01-01

    The results of three techniques used for processing Landsat digital data are compared for their utility in delineating areas of surface mining and subsequent reclamation. An unsupervised clustering algorithm (ISOCLS), a maximum-likelihood classifier (CLASFY), and a hybrid approach utilizing canonical analysis (ISOCLS/KLTRANS/ISOCLS) were compared by means of a detailed accuracy assessment with aerial photography at NASA's Goddard Space Flight Center. Results show that the hybrid approach was superior to the traditional techniques in distinguishing strip mined and reclaimed areas.

  11. FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.

    PubMed

    Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong

    2015-11-01

    Recently, the use of social media for health information exchange has been expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers have paid much attention to social media in medical education, patient-pharmacist communication, detection of adverse drug reactions, the impact of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to effectively extract useful knowledge from the huge amount of textual comments in social media. Therefore, this study proposes a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme that combines a Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rules (AR) discovery to extract knowledge from social media. In our FIR scheme, the Fuzzy ART network is first employed to segment comments. Next, for each customer segment, we use the LSI technique to retrieve important keywords. Then, in order to make the extracted keywords understandable, association rule mining is used to organize them into metadata. These extracted voices of customers are then transformed into design needs by using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques, which acquire too many keywords to identify key points, our FIR scheme can extract understandable metadata from social media.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chironis, N.P.

    This book contains a wealth of valuable information carefully selected and compiled from recent issues of Coal Age magazine. Much of the source material has been gathered by Coal Age Editors during their visits to coal mines, research establishments, universities and technical symposiums. Equally important are the articles and data contributed by over 50 top experts, many of whom are well known to the mining industry. Specifically, this easy-to-use handbook is divided into eleven key areas of underground mining. Here you will find the latest information on continuous mining techniques, longwall and shortwall methods and equipment, specialized mining and boring systems, continuous haulage techniques, improved roof control and ventilation methods, mine communications and instrumentation, power systems, fire control methods, and new mining regulations. There is also a section on engineering and management considerations, including the modern use of computer terminals, practical techniques for picking leaders and for encouraging more safety consciousness in employees, factors affecting absenteeism, and some highly important financial considerations. All of this valuable information has been thoroughly indexed to provide immediate access to the specific data needed by the reader.

  13. Data mining in pharma sector: benefits.

    PubMed

    Ranjan, Jayanthi

    2009-01-01

    The amount of data generated in any sector at present is enormous. The information flow in the pharma industry is huge. Pharma firms are progressing into increasingly technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms to discover patterns that improve the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making using pharmaceutical data. This conceptual paper is based on secondary study, research and observations from magazines, reports and notes. The author lists the types of patterns that can be discovered using data mining in pharma data. Although much work can be done on discovering knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data based on primary research using the available, well-known data mining tools. Research papers and conceptual papers related to data mining in the pharma industry are rare; this is the motivation for the paper.

  14. Data Mining in Child Welfare.

    ERIC Educational Resources Information Center

    Schoech, Dick; Quinn, Andrew; Rycraft, Joan R.

    2000-01-01

    Examines the historical and larger context of data mining and describes data mining processes, techniques, and tools. Illustrates these using a child welfare dataset concerning the employee turnover that is mined, using logistic regression and a Bayesian neural network. Discusses the data mining process, the resulting models, their predictive…

  15. 76 FR 67637 - West Virginia Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-02

    ... Surface Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule with public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We are announcing receipt of a...

  16. Text Mining in Biomedical Domain with Emphasis on Document Clustering

    PubMed Central

    2017-01-01

    Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048

  17. Association mining of dependency between time series

    NASA Astrophysics Data System (ADS)

    Hafez, Alaaeldin

    2001-03-01

    Time series analysis is considered a crucial component of strategic control over a broad variety of disciplines in business, science and engineering. Time series data is a sequence of observations collected over intervals of time. Each time series describes a phenomenon as a function of time. Analysis of time series data includes discovering trends (or patterns) in a time series sequence. In the last few years, data mining has emerged and been recognized as a new technology for data analysis. Data mining is the process of discovering potentially valuable patterns, associations, trends, sequences and dependencies in data. Data mining techniques can discover information that many traditional business analysis and statistical techniques fail to deliver. In this paper, we adapt and innovate data mining techniques to analyze time series data. By using data mining techniques, maximal frequent patterns are discovered and used in predicting future sequences or trends, where trends describe the behavior of a sequence. In order to include different types of time series (e.g. irregular and non-systematic), we consider past frequent patterns of the same time sequences (local patterns) and of other dependent time sequences (global patterns). We use the word 'dependent' instead of the word 'similar' to emphasize real-life time series where two time series sequences could be completely different (in values, shapes, etc.), but still react to the same conditions in a dependent way. In this paper, we propose the Dependence Mining Technique, which could be used in predicting time series sequences. The proposed technique consists of three phases: (a) for all time series sequences, generate their trend sequences, (b) discover maximal frequent trend patterns and generate pattern vectors (to keep information on frequent trend patterns), and (c) use trend pattern vectors to predict future time series sequences.
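    The three phases listed in the abstract above can be sketched in simplified form. Everything in the following block is an assumption made for illustration: the U/D/S trend alphabet, the use of fixed-length frequent substrings in place of maximal frequent patterns, and the function names. It also handles a single series only, so it covers local patterns but not the dependent (global) ones.

```python
from collections import Counter


def trend_sequence(values):
    """Phase (a): encode consecutive changes as 'U' (up), 'D' (down) or 'S' (steady)."""
    symbols = []
    for prev, cur in zip(values, values[1:]):
        symbols.append("U" if cur > prev else "D" if cur < prev else "S")
    return "".join(symbols)


def frequent_patterns(trends, length, min_count):
    """Phase (b), simplified: count fixed-length trend substrings and keep frequent ones."""
    counts = Counter(trends[i:i + length] for i in range(len(trends) - length + 1))
    return {p: c for p, c in counts.items() if c >= min_count}


def predict_next(trends, patterns, length):
    """Phase (c), simplified: extend the current suffix with the most frequent pattern.

    Requires length >= 2 so that the suffix is non-empty.
    """
    suffix = trends[-(length - 1):]
    candidates = {p: c for p, c in patterns.items() if p.startswith(suffix)}
    if not candidates:
        return None
    best_pattern, _ = max(candidates.items(), key=lambda kv: (kv[1], kv[0]))
    return best_pattern[-1]


series = [1, 2, 3, 2, 3, 4, 3, 4, 5, 4]  # hypothetical observations
trends = trend_sequence(series)
patterns = frequent_patterns(trends, length=3, min_count=2)
next_symbol = predict_next(trends, patterns, length=3)
print(trends, patterns, next_symbol)
```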

  18. Biclustering Learning of Trading Rules.

    PubMed

    Huang, Qinghua; Wang, Ting; Tao, Dacheng; Li, Xuelong

    2015-10-01

    Technical analysis with numerous indicators and patterns has been regarded as important evidence for making trading decisions in financial markets. However, it is extremely difficult for investors to find useful trading rules based on numerous technical indicators. This paper innovatively proposes the use of biclustering mining to discover effective technical trading patterns that contain a combination of indicators from historical financial data series. This is the first attempt to use a biclustering algorithm on trading data. The mined patterns are regarded as trading rules and can be classified as three trading actions (i.e., the buy, the sell, and no-action signals) with respect to the maximum support. A modified K nearest neighborhood (K-NN) method is applied to classification of trading days in the testing period. The proposed method [called biclustering algorithm and the K nearest neighbor (BIC-K-NN)] was implemented on four historical datasets and the average performance was compared with the conventional buy-and-hold strategy and three previously reported intelligent trading systems. Experimental results demonstrate that the proposed trading system outperforms its counterparts and will be useful for investment in various financial markets.

  19. The Hazards of Data Mining in Healthcare.

    PubMed

    Househ, Mowafa; Aldosari, Bakheet

    2017-01-01

    From the mid-1990s, data mining methods have been used to explore and find patterns and relationships in healthcare data. During the 1990s and early 2000s, data mining was a topic of great interest to healthcare researchers, as data mining showed some promise in the use of its predictive techniques to help model the healthcare system and improve the delivery of healthcare services. However, it was soon discovered that mining healthcare data had many challenges relating to the veracity of healthcare data and limitations around predictive modelling, leading to failures of data mining projects. As the Big Data movement has gained momentum over the past few years, there has been a reemergence of interest in the use of data mining techniques and methods to analyze healthcare generated Big Data. Much has been written on the positive impacts of data mining on healthcare practice relating to issues of best practice, fraud detection, chronic disease management, and general healthcare decision making. Little has been written about the limitations and challenges of data mining use in healthcare. In this review paper, we explore some of the limitations and challenges in the use of data mining techniques in healthcare. Our results show that the limitations of data mining in healthcare include reliability of medical data, data sharing between healthcare organizations, and inappropriate modelling leading to inaccurate predictions. We conclude that there are many pitfalls in the use of data mining in healthcare, that more work is needed to show evidence of its utility in facilitating healthcare decision-making for healthcare providers, managers, and policy makers, and that more evidence is needed on data mining's overall impact on healthcare services and patient care.

  20. Entity Bases: Large-Scale Knowledgebases for Intelligence Data

    DTIC Science & Technology

    2009-02-01

    declaratively expressed as Datalog rules. The EntityBase supports two query scenarios: • Free-Form Querying: A human analyst or a client program can pose...integration, Prometheus follows the Inverse Rules algorithm (Duschka 1997) with additional optimizations (Thakkar et al. 2005). We use the mediator...Discovery and Data Mining (PAKDD), Sydney, Australia. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., and Singer, Y. (2006). Online passive

  1. Analyzing Divisia Rules Extracted from a Feedforward Neural Network

    DTIC Science & Technology

    2006-03-01

    assumptions. (Barnett and work, Data Mining, Rule Generation Serletis give a detailed treatment of the theory of monetary aggregation [1].) However, 1... Serletis, A. (Eds.) (2000), The The- Switzerland, 1995. ory of Monetary Aggregation, North-Holland, Amsterdam, Chapter, pp.- [11] Vincent A. Schmidt and...gas, Nevada, 2002. sets. Macroeconomic Dynamics, 1:485-512, 1997. Reprinted in Barnett, W.A. [12] Vincent A. Schmidt and Jane M. Binner. and Serletis

  2. On-Demand Associative Cross-Language Information Retrieval

    NASA Astrophysics Data System (ADS)

    Geraldo, André Pinto; Moreira, Viviane P.; Gonçalves, Marcos A.

    This paper proposes the use of algorithms for mining association rules as an approach for Cross-Language Information Retrieval. These algorithms have been widely used to analyse market basket data. The idea is to map the problem of finding associations between sales items to the problem of finding term translations over a parallel corpus. The proposal was validated by means of experiments using queries in two distinct languages: Portuguese and Finnish to retrieve documents in English. The results show that the performance of our proposed approach is comparable to the performance of the monolingual baseline and to query translation via machine translation, even though these systems employ more complex Natural Language Processing techniques. The combination between machine translation and our approach yielded the best results, even outperforming the monolingual baseline.
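    The mapping from market-basket items to term translations can be illustrated with a small sketch. The approach below is an assumption about how such a mapping might look, not the authors' system: each aligned sentence pair is treated as one transaction, and a rule "source term -> target term" is scored by its confidence. The corpus, the `mine_translations` name, and the `min_support` parameter are hypothetical.

```python
from collections import Counter
from itertools import product


def mine_translations(pairs, min_support=2):
    """Return {source_term: (target_term, confidence)} for the best rule per source term.

    pairs: list of (source_sentence, target_sentence) from an aligned corpus.
    """
    src_count = Counter()
    pair_count = Counter()
    for src, tgt in pairs:
        src_terms, tgt_terms = set(src.split()), set(tgt.split())
        src_count.update(src_terms)
        pair_count.update(product(src_terms, tgt_terms))

    best = {}
    for (s, t), count in pair_count.items():
        if count < min_support:  # ignore co-occurrences that are too rare
            continue
        conf = count / src_count[s]  # confidence of the rule s -> t
        if conf > best.get(s, ("", 0.0))[1]:
            best[s] = (t, conf)
    return best


# Hypothetical Portuguese-English aligned sentences.
corpus = [
    ("casa azul", "blue house"),
    ("casa velha", "old house"),
    ("carro azul", "blue car"),
    ("carro velho", "old car"),
]

translations = mine_translations(corpus)
print(translations)
```

    Confidence alone tends to favour very frequent target words, so a practical system would likely add a measure such as lift or filter stop words; the sketch leaves that out.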

  3. Application of advanced computing techniques to the analysis and display of space science measurements

    NASA Technical Reports Server (NTRS)

    Klumpar, D. M.; Lapolla, M. V.; Horblit, B.

    1995-01-01

    A prototype system has been developed to aid the experimental space scientist in the display and analysis of spaceborne data acquired from direct measurement sensors in orbit. We explored the implementation of a rule-based environment for semi-automatic generation of visualizations that assist the domain scientist in exploring the data. The goal has been to enable rapid generation of visualizations which enhance the scientist's ability to thoroughly mine the data. Transferring the task of visualization generation from the human programmer to the computer produced a rapid prototyping environment for visualizations. The visualization and analysis environment has been tested against a set of data obtained from the Hot Plasma Composition Experiment on the AMPTE/CCE satellite, creating new visualizations which provided new insight into the data.

  4. Rule-guided human classification of Volunteered Geographic Information

    NASA Astrophysics Data System (ADS)

    Ali, Ahmed Loai; Falomir, Zoe; Schmid, Falko; Freksa, Christian

    2017-05-01

    During the last decade, web technologies and location sensing devices have evolved, generating a form of crowdsourcing known as Volunteered Geographic Information (VGI). VGI acts as a platform for spatial data collection, in particular when a group of public participants is involved in collaborative mapping activities: they work together to collect, share, and use information about geographic features. VGI exploits participants' local knowledge to produce rich data sources. However, the resulting data inherits problematic data classification. In VGI projects, the challenges of data classification are due to the following: (i) data is prone to subjective classification, (ii) remote contributions and flexible contribution mechanisms in most projects, and (iii) the uncertainty of spatial data and non-strict definitions of geographic features. These factors lead to various forms of problematic classification: inconsistent, incomplete, and imprecise data classification. This research addresses classification appropriateness. Whether the classification of an entity is appropriate or inappropriate is related to quantitative and/or qualitative observations. Small differences between observations may not be recognizable, particularly for non-expert participants. Hence, in this paper, the problem is tackled by developing a rule-guided classification approach. This approach exploits data mining techniques of Association Classification (AC) to extract descriptive (qualitative) rules of specific geographic features. The rules are extracted based on the investigation of qualitative topological relations between target features and their context. Afterwards, the extracted rules are used to develop a recommendation system able to guide participants to the most appropriate classification. The approach proposes two scenarios to guide participants towards enhancing the quality of data classification. An empirical study is conducted to investigate the classification of grass-related features like forest, garden, park, and meadow. The findings of this study indicate the feasibility of the proposed approach.

  5. A novel association rule mining approach using TID intermediate itemset.

    PubMed

    Aqra, Iyad; Herawan, Tutut; Abdul Ghani, Norjihan; Akhunzada, Adnan; Ali, Akhtar; Bin Razali, Ramdan; Ilahi, Manzoor; Raymond Choo, Kim-Kwang

    2018-01-01

    Designing an efficient association rule mining (ARM) algorithm for multilevel knowledge-based transactional databases that is appropriate for real-world deployments is of paramount concern. However, dynamic decision making that needs to modify the threshold, either to minimize or to maximize the output knowledge, requires existing state-of-the-art algorithms to rescan the entire database. Consequently, the process incurs heavy computation cost and is not feasible for real-time applications. The paper efficiently addresses the problem of dynamically updating the threshold for a given purpose. It contributes a novel ARM approach that creates an intermediate itemset and applies a threshold to extract categorical frequent itemsets with diverse threshold values, thus improving overall efficiency because the whole database no longer needs to be scanned. After the entire itemset is built, the real support can be obtained without rebuilding the itemset (e.g., the itemset list is intersected to obtain the actual support). Moreover, the algorithm supports extracting many frequent itemsets according to a pre-determined minimum support with an independent purpose. Additionally, the experimental results demonstrate that the proposed approach can be deployed in any mining system in a fully parallel mode, consequently increasing the efficiency of the real-time association rule discovery process. The proposed approach outperforms the extant state-of-the-art and shows promising results that reduce computation cost, increase accuracy, and produce all possible itemsets.
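
    The idea of intersecting transaction-ID (TID) lists to obtain support can be illustrated with a minimal sketch. This is a generic vertical-layout illustration, not the authors' algorithm: the function names, the toy transactions, and the `max_size` cap are all assumptions made for the example. The data is scanned once to build the TID lists, after which different minimum-support thresholds can be applied without touching the transactions again.

```python
from itertools import combinations


def build_tid_lists(transactions):
    """Scan the data once, mapping each item to the set of transaction IDs that contain it."""
    tid_lists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists.setdefault(item, set()).add(tid)
    return tid_lists


def support(itemset, tid_lists):
    """Support count of an itemset = size of the intersection of its items' TID sets."""
    sets = [tid_lists.get(item, set()) for item in itemset]
    return len(set.intersection(*sets)) if sets else 0


def frequent_itemsets(tid_lists, min_support, max_size=2):
    """Enumerate itemsets up to max_size (hypothetical cap) that meet min_support."""
    items = sorted(tid_lists)
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = support(candidate, tid_lists)
            if count >= min_support:
                result[candidate] = count
    return result


# Toy data, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

tid_lists = build_tid_lists(transactions)               # one pass over the data
loose = frequent_itemsets(tid_lists, min_support=2)     # threshold A
strict = frequent_itemsets(tid_lists, min_support=3)    # threshold B, no rescan needed
```

    Changing the threshold only re-filters the intersections of the stored TID sets, which is the property the abstract emphasizes; the brute-force candidate enumeration here is for clarity and would not scale to large item vocabularies.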

  6. A novel association rule mining approach using TID intermediate itemset

    PubMed Central

    Ali, Akhtar; Bin Razali, Ramdan; Ilahi, Manzoor; Raymond Choo, Kim-Kwang

    2018-01-01

    Designing an efficient association rule mining (ARM) algorithm for multilevel knowledge-based transactional databases that is appropriate for real-world deployments is of paramount concern. However, dynamic decision making that needs to modify the threshold, either to minimize or to maximize the output knowledge, requires existing state-of-the-art algorithms to rescan the entire database. Consequently, the process incurs heavy computation cost and is not feasible for real-time applications. The paper efficiently addresses the problem of dynamically updating the threshold for a given purpose. It contributes a novel ARM approach that creates an intermediate itemset and applies a threshold to extract categorical frequent itemsets with diverse threshold values, thus improving overall efficiency because the whole database no longer needs to be scanned. After the entire itemset is built, the real support can be obtained without rebuilding the itemset (e.g., the itemset list is intersected to obtain the actual support). Moreover, the algorithm supports extracting many frequent itemsets according to a pre-determined minimum support with an independent purpose. Additionally, the experimental results demonstrate that the proposed approach can be deployed in any mining system in a fully parallel mode, consequently increasing the efficiency of the real-time association rule discovery process. The proposed approach outperforms the extant state-of-the-art and shows promising results that reduce computation cost, increase accuracy, and produce all possible itemsets. PMID:29351287

  7. A case-based reasoning tool for breast cancer knowledge management with data mining concepts and techniques

    NASA Astrophysics Data System (ADS)

    Demigha, Souâd.

    2016-03-01

    The paper presents a Case-Based Reasoning Tool for Breast Cancer Knowledge Management to improve breast cancer screening. To develop this tool, we combine concepts and techniques of Case-Based Reasoning (CBR) and Data Mining (DM). Physicians and radiologists ground their diagnosis on their expertise (past experience) based on clinical cases. Case-Based Reasoning is the process of solving new problems based on the solutions of similar past problems, structured as cases. CBR is suitable for medical use. On the other hand, existing traditional hospital information systems (HIS), Radiological Information Systems (RIS) and Picture Archiving Information Systems (PACS) do not allow medical information to be managed efficiently because of its complexity and heterogeneity. Data Mining is the process of mining information from a data set and transforming it into an understandable structure for further use. Combining CBR with Data Mining techniques will facilitate the diagnosis and decision-making of medical experts.

  8. Proceedings: Fourth Workshop on Mining Scientific Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, C

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is matched only by the opportunities that await a practitioner.

  9. Machine Learning and Data Mining Methods in Diabetes Research.

    PubMed

    Kavakiotis, Ioannis; Tsave, Olga; Salifoglou, Athanasios; Maglaveras, Nicos; Vlahavas, Ioannis; Chouvarda, Ioanna

    2017-01-01

    The remarkable advances in biotechnology and health sciences have led to a significant production of data, such as high throughput genetic data and clinical information, generated from large Electronic Health Records (EHRs). To this end, application of machine learning and data mining methods in biosciences is presently, more than ever before, vital and indispensable in efforts to transform intelligently all available information into valuable knowledge. Diabetes mellitus (DM) is defined as a group of metabolic disorders exerting significant pressure on human health worldwide. Extensive research in all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has led to the generation of huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning, data mining techniques and tools in the field of diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms were employed. In general, 85% of those used were characterized by supervised learning approaches and 15% by unsupervised ones, and more specifically, association rules. Support vector machines (SVM) arise as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The title applications in the selected articles project the usefulness of extracting valuable knowledge leading to new hypotheses targeting deeper understanding and further investigation in DM.

  10. Classification Based on Pruning and Double Covered Rule Sets for the Internet of Things Applications

    PubMed Central

    Zhou, Zhongmei; Wang, Weiping

    2014-01-01

    The Internet of things (IOT) has been a hot issue in recent years. IOT users accumulate large amounts of data, which makes mining useful knowledge from IOT a great challenge. Classification is an effective strategy that can predict the needs of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances are covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy on some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in the training set is covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measures to prune the length of rules in rule set B. Our experimental results indicate that CDCR-P is not only feasible but can also achieve high accuracy. PMID:24511304

  11. Classification based on pruning and double covered rule sets for the internet of things applications.

    PubMed

    Li, Shasha; Zhou, Zhongmei; Wang, Weiping

    2014-01-01

    The Internet of things (IOT) has been a hot issue in recent years. IOT users accumulate large amounts of data, which makes mining useful knowledge from IOT a great challenge. Classification is an effective strategy that can predict the needs of users in IOT. However, many traditional rule-based classifiers cannot guarantee that all instances are covered by at least two classification rules. Thus, these algorithms cannot achieve high accuracy on some datasets. In this paper, we propose a new rule-based classification, CDCR-P (Classification based on the Pruning and Double Covered Rule sets). CDCR-P can induce two different rule sets A and B. Every instance in the training set is covered by at least one rule not only in rule set A, but also in rule set B. In order to improve the quality of rule set B, we take measures to prune the length of rules in rule set B. Our experimental results indicate that CDCR-P is not only feasible but can also achieve high accuracy.

  12. Mind your crossings: Mining GIS imagery for crosswalk localization.

    PubMed

    Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M; Mascetti, Sergio

    2017-04-01

    For blind travelers, finding crosswalks and remaining within their borders while traversing them is a crucial part of any trip involving street crossings. While standard Orientation & Mobility (O&M) techniques allow blind travelers to safely negotiate street crossings, additional information about crosswalks and other important features at intersections would be helpful in many situations, resulting in greater safety and/or comfort during independent travel. For instance, in planning a trip a blind pedestrian may wish to be informed of the presence of all marked crossings near a desired route. We have conducted a survey of several O&M experts from the United States and Italy to determine the role that crosswalks play in travel by blind pedestrians. The results show stark differences between survey respondents from the U.S. compared with Italy: the former group emphasized the importance of following standard O&M techniques at all legal crossings (marked or unmarked), while the latter group strongly recommended crossing at marked crossings whenever possible. These contrasting opinions reflect differences in the traffic regulations of the two countries and highlight the diversity of needs that travelers in different regions may have. To address the challenges faced by blind pedestrians in negotiating street crossings, we devised a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm can be improved by a final crowdsourcing validation. 
To this end, we developed a Pedestrian Crossing Human Validation (PCHV) web service, which supports crowdsourcing to rule out false positives and identify false negatives.

  13. Mind your crossings: Mining GIS imagery for crosswalk localization

    PubMed Central

    Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M.; Mascetti, Sergio

    2017-01-01

    For blind travelers, finding crosswalks and remaining within their borders while traversing them is a crucial part of any trip involving street crossings. While standard Orientation & Mobility (O&M) techniques allow blind travelers to safely negotiate street crossings, additional information about crosswalks and other important features at intersections would be helpful in many situations, resulting in greater safety and/or comfort during independent travel. For instance, in planning a trip a blind pedestrian may wish to be informed of the presence of all marked crossings near a desired route. We have conducted a survey of several O&M experts from the United States and Italy to determine the role that crosswalks play in travel by blind pedestrians. The results show stark differences between survey respondents from the U.S. compared with Italy: the former group emphasized the importance of following standard O&M techniques at all legal crossings (marked or unmarked), while the latter group strongly recommended crossing at marked crossings whenever possible. These contrasting opinions reflect differences in the traffic regulations of the two countries and highlight the diversity of needs that travelers in different regions may have. To address the challenges faced by blind pedestrians in negotiating street crossings, we devised a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm can be improved by a final crowdsourcing validation. 
To this end, we developed a Pedestrian Crossing Human Validation (PCHV) web service, which supports crowdsourcing to rule out false positives and identify false negatives. PMID:28757907

  14. Technique for predicting ground-water discharge to surface coal mines and resulting changes in head

    USGS Publications Warehouse

    Weiss, L.S.; Galloway, D.L.; Ishii, Audrey L.

    1986-01-01

    Changes in seepage flux and head (groundwater level) from groundwater drainage into a surface coal mine can be predicted by a technique that considers drainage from the unsaturated zone. The user applies site-specific data to precalculated head and seepage-flux profiles. Groundwater flow through hypothetical aquifer cross sections was simulated using the U.S. Geological Survey finite-difference model, VS2D, which considers variably saturated two-dimensional flow. Conceptual models considered were (1) drainage to a first cut, and (2) drainage to multiple cuts, which includes drainage effects of an area surface mine. Dimensionless head and seepage flux profiles from 246 simulations are presented. Step-by-step instructions and examples are presented. Users are required to know aquifer characteristics and to estimate size and timing of the mine operation at a proposed site. Calculated groundwater drainage to the mine is from one excavated face only. First cut considers confined and unconfined aquifers of a wide range of permeabilities; multiple cuts considers unconfined aquifers of higher permeabilities only. The technique, developed for Illinois coal-mining regions that use area surface mining and evaluated with an actual field example, will be useful in assessing potential hydrologic impacts of mining. Application is limited to hydrogeologic settings and mine operations similar to those considered. Fracture flow, recharge, and leakage are not considered. (USGS)

  15. Research of Litchi Diseases Diagnosis Expert System Based on RBR and CBR

    NASA Astrophysics Data System (ADS)

    Xu, Bing; Liu, Liqun

    To overcome the bottleneck problems of traditional rule-based reasoning disease diagnosis systems, such as low reasoning efficiency and lack of flexibility, this work studies integrated case-based reasoning (CBR) and rule-based reasoning (RBR) technology and puts forward a litchi diseases diagnosis expert system (LDDES) with an integrated reasoning method. The method uses data mining and knowledge acquisition technology to establish the knowledge base and case library. It adopts rules to guide retrieval and matching for CBR, and uses association rule and decision tree algorithms to calculate case similarity. The experiment shows that the method can increase the system's flexibility and reasoning ability, and improve the accuracy of litchi diseases diagnosis.

  16. Predictive Mining of Time Series Data

    NASA Astrophysics Data System (ADS)

    Java, A.; Perlman, E. S.

    2002-05-01

    All-sky monitors are a relatively new development in astronomy, and their data represent a largely untapped resource. Proper utilization of this resource could lead to important discoveries not only in the physics of variable objects, but in how one observes such objects. We discuss the development of a Java toolbox for astronomical time series data. Rather than using methods conventional in astronomy (e.g., power spectrum and cross-correlation analysis) we employ rule discovery techniques commonly used in analyzing stock-market data. By clustering patterns found within the data, rule discovery allows one to build predictive models, allowing one to forecast when a given event might occur or whether the occurrence of one event will trigger a second. We have tested the toolbox and accompanying display tool on datasets (representing several classes of objects) from the RXTE All Sky Monitor. We use these datasets to illustrate the methods and functionality of the toolbox. We have found predictive patterns in several ASM datasets. We also discuss problems faced in the development process, particularly the difficulties of dealing with discretized and irregularly sampled data. A possible application would be in scheduling target of opportunity observations where the astronomer wants to observe an object when a certain event or series of events occurs. By combining such a toolbox with an automatic, Java query tool which regularly gathers data on objects of interest, the astronomer or telescope operator could use the real-time datastream to efficiently predict the occurrence of (for example) a flare or other event. By combining the toolbox with dynamic time warping data-mining tools, one could predict events which may happen on variable time scales.

  17. Data Mining: Going beyond Traditional Statistics

    ERIC Educational Resources Information Center

    Zhao, Chun-Mei; Luan, Jing

    2006-01-01

    The authors provide an overview of data mining, giving special attention to the relationship between data mining and statistics to unravel some misunderstandings about the two techniques. (Contains 1 figure.)

  18. Mine Water Treatment in Hongai Coal Mines

    NASA Astrophysics Data System (ADS)

    Dang, Phuong Thao; Dang, Vu Chi

    2018-03-01

    Acid mine drainage (AMD) is recognized as one of the most serious environmental problems associated with the mining industry. Acid water, also known as acid mine drainage, forms when iron sulfide minerals found in the rock of coal seams are exposed to oxidizing conditions in coal mining. Until 2009, mine drainage in Hongai coal mines was not treated, leading to harmful effects on humans, animals and aquatic ecosystems. This report examines the acid mine drainage problem and techniques for acid mine drainage treatment in Hongai coal mines. In addition, the selection of and design criteria for the treatment systems are presented.

  19. A novel method for predicting kidney stone type using ensemble learning.

    PubMed

    Kazemi, Yassaman; Mirroshandel, Seyed Abolghasem

    2018-01-01

    The high morbidity rate associated with kidney stone disease, which is a silent killer, is one of the main concerns in healthcare systems all over the world. Advanced data mining techniques such as classification can help in the early prediction of this disease and reduce its incidence and associated costs. The objective of the present study is to derive a model for the early detection of the type of kidney stone and the most influential parameters with the aim of providing a decision-support system. Information was collected from 936 patients with nephrolithiasis at the kidney center of the Razi Hospital in Rasht from 2012 through 2016. The prepared dataset included 42 features. Data pre-processing was the first step toward extracting the relevant features. The collected data was analyzed with Weka software, and various data mining models were used to prepare a predictive model. Various data mining algorithms such as the Bayesian model, different types of Decision Trees, Artificial Neural Networks, and Rule-based classifiers were used in these models. We also proposed four models based on ensemble learning to improve the accuracy of each learning algorithm. In addition, a novel technique for combining individual classifiers in ensemble learning was proposed. In this technique, a weight is assigned to each individual classifier by our proposed genetic-algorithm-based method. The generated knowledge was evaluated using a 10-fold cross-validation technique based on standard measures. However, the assessment of each feature for building a predictive model was another significant challenge. The predictive strength of each feature for creating a reproducible outcome was also investigated. Regarding the applied models, parameters such as sex, uric acid condition, calcium level, hypertension, diabetes, nausea and vomiting, flank pain, and urinary tract infection (UTI) were the most vital parameters for predicting the chance of nephrolithiasis. The final ensemble-based model (with an accuracy of 97.1%) was a robust one and could be safely applied to future studies to predict the chances of developing nephrolithiasis. This model provides a novel way to study stone disease by deciphering the complex interaction among different biological variables, thus helping in an early identification and reduction in diagnosis time. Copyright © 2017 Elsevier B.V. All rights reserved.
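
    Combining individual classifiers through per-classifier weights can be sketched as weighted voting. The sketch below is only an illustration under stated assumptions: the class labels and the weights are invented, and the weights are simply given as inputs rather than produced by a genetic algorithm as in the study.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (hypothetical values here;
             the study derives them with a genetic algorithm).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Ties resolve to the label that was seen first.
    return max(scores, key=scores.get)


# Three hypothetical classifiers disagree on a stone type.
predictions = ["calcium", "uric_acid", "uric_acid"]
weights = [0.6, 0.25, 0.25]

combined = weighted_vote(predictions, weights)
unweighted = weighted_vote(predictions, [1.0, 1.0, 1.0])
```

    With these invented weights the single heavily weighted classifier outvotes the other two, whereas equal weights reduce the scheme to plain majority voting.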

  20. In-situ identification of anti-personnel mines using acoustic resonant spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perry, R L; Roberts, R S

    1999-02-01

    A new technique for identifying buried Anti-Personnel Mines is described, and a set of preliminary experiments designed to assess the feasibility of this technique is presented. Analysis of the experimental results indicates that the technique has potential, but additional work is required to bring the technique to fruition. In addition to the experimental results presented here, a technique used to characterize the sensor employed in the experiments is detailed.

  1. Responses of Terrestrial Herpetofauna to Persistent, Novel Ecosystems Resulting from Mountaintop Removal Mining

    Treesearch

    Jennifer M. Williams; Donald J. Brown; Petra B. Wood

    2017-01-01

    Mountaintop removal mining is a large-scale surface mining technique that removes entire floral and faunal communities, along with soil horizons located above coal seams. In West Virginia, the majority of this mining occurs on forested mountaintops. However, after mining ceases the land is typically reclaimed to grasslands and shrublands, resulting in novel ecosystems...

  2. Using Open Web APIs in Teaching Web Mining

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Li, Xin; Chau, M.; Ho, Yi-Jen; Tseng, Chunju

    2009-01-01

    With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems…

  3. Research on PM2.5 time series characteristics based on data mining technology

    NASA Astrophysics Data System (ADS)

    Zhao, Lifang; Jia, Jin

    2018-02-01

    With the development of data mining technology and the establishment of environmental air quality databases, it is necessary to discover potential correlations and rules by mining the massive environmental air quality information and analyzing the air pollution process. In this paper, we present a sequential pattern mining method based on air quality data and pattern association technology to analyze PM2.5 time series characteristics. Utilizing the real-time monitoring data of urban air quality in China, the time series rules and variation properties of PM2.5 under different pollution levels are extracted and analyzed. The analysis results show that the time sequence features of the PM2.5 concentration are directly affected by changes in the pollution degree. The longest time that PM2.5 remained stable is about 24 hours. As the pollution becomes more severe, the instability time and step ascending time gradually change from 12-24 hours to 3 hours. The presented method is helpful for controlling and forecasting air quality while saving measuring costs, which is of great significance for government regulation and public prevention of air pollution.

  4. Data-Mining Technologies for Diabetes: A Systematic Review

    PubMed Central

    Marinov, Miroslav; Mosa, Abu Saleh Mohammad; Yoo, Illhoi; Boren, Suzanne Austin

    2011-01-01

    Background: The objective of this study is to conduct a systematic review of applications of data-mining techniques in the field of diabetes research. Method: We searched the MEDLINE database through PubMed. We initially identified 31 articles by the search, and selected 17 articles representing various data-mining methods used for diabetes research. Our main interest was to identify research goals, diabetes types, data sets, data-mining methods, data-mining software and technologies, and outcomes. Results: The applications of data-mining techniques in the selected articles were useful for extracting valuable knowledge and generating new hypotheses for further scientific research/experimentation and improving health care for diabetes patients. The results could be used for both scientific research and real-life practice to improve the quality of health care for diabetes patients. Conclusions: Data mining has played an important role in diabetes research. Data mining would be a valuable asset for diabetes researchers because it can unearth hidden knowledge from a huge amount of diabetes-related data. We believe that data mining can significantly help diabetes research and ultimately improve the quality of health care for diabetes patients. PMID:22226277

  5. A fuzzy classifier system for process control

    NASA Technical Reports Server (NTRS)

    Karr, C. L.; Phillips, J. C.

    1994-01-01

    A fuzzy classifier system that discovers rules for controlling a mathematical model of a pH titration system was developed by researchers at the U.S. Bureau of Mines (USBM). Fuzzy classifier systems successfully combine the strengths of learning classifier systems and fuzzy logic controllers. Learning classifier systems resemble familiar production rule-based systems, but they represent their IF-THEN rules by strings of characters rather than in the traditional linguistic terms. Fuzzy logic is a tool that allows for the incorporation of abstract concepts into rule-based systems, thereby allowing the rules to resemble the familiar 'rules-of-thumb' commonly used by humans when solving difficult process control and reasoning problems. Like learning classifier systems, fuzzy classifier systems employ a genetic algorithm to explore and sample new rules for manipulating the problem environment. Like fuzzy logic controllers, fuzzy classifier systems encapsulate knowledge in the form of production rules. The results presented in this paper demonstrate the ability of fuzzy classifier systems to generate a fuzzy logic-based process control system.
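
    How fuzzy IF-THEN production rules turn a measurement into a control action can be sketched as follows. Everything specific here is an assumption for illustration: the triangular membership functions, the three hand-written pH rules, and the weighted-average combination are not taken from the USBM system, in which a genetic algorithm discovers the rules.

```python
def triangular(x, left, peak, right):
    """Degree (0..1) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: (fuzzy set over pH, reagent flow in arbitrary units).
# Positive flow adds base, negative flow adds acid.
RULES = [
    ((-1.0, 3.0, 7.0), 1.0),    # IF pH is acidic  THEN add base
    ((4.0, 7.0, 10.0), 0.0),    # IF pH is neutral THEN add nothing
    ((7.0, 11.0, 15.0), -1.0),  # IF pH is basic   THEN add acid
]


def control_action(ph):
    """Fire every rule and combine the actions, weighted by rule strength."""
    fired = [(triangular(ph, *fuzzy_set), action) for fuzzy_set, action in RULES]
    total = sum(strength for strength, _ in fired)
    if total == 0.0:
        return 0.0  # no rule applies to this reading
    return sum(strength * action for strength, action in fired) / total


neutral_action = control_action(7.0)
mildly_acidic_action = control_action(5.5)
```

    A reading between two fuzzy sets fires both rules partially, so the output changes smoothly instead of switching abruptly, which is the "rule-of-thumb" behavior the abstract describes.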

  6. Managing the Big Data Avalanche in Astronomy - Data Mining the Galaxy Zoo Classification Database

    NASA Astrophysics Data System (ADS)

    Borne, Kirk D.

    2014-01-01

    We will summarize a variety of data mining experiments that have been applied to the Galaxy Zoo database of galaxy classifications, which were provided by volunteer citizen scientists. The goal of these exercises is to learn new and improved classification rules for diverse populations of galaxies, which can then be applied to much larger sky surveys of the future, such as the LSST (Large Synoptic Survey Telescope), which is proposed to obtain detailed photometric data for approximately 20 billion galaxies. The massive Big Data that astronomy projects will generate in the future demand greater application of data mining and data science algorithms, as well as greater training of astronomy students in the skills of data mining and data science. The project described here has involved several graduate and undergraduate research assistants at George Mason University.

  7. Surface mining

    Treesearch

    Robert Leopold; Bruce Rowland; Reed Stalder

    1979-01-01

    The surface mining process consists of four phases: (1) exploration; (2) development; (3) production; and (4) reclamation. A variety of surface mining methods has been developed, including strip mining, auger, area strip, open pit, dredging, and hydraulic. Sound planning and design techniques are essential to implement alternatives to meet the myriad of laws,...

  8. A Survey of Educational Data-Mining Research

    ERIC Educational Resources Information Center

    Huebner, Richard A.

    2013-01-01

    Educational data mining (EDM) is an emerging discipline that focuses on applying data mining tools and techniques to educationally related data. The discipline focuses on analyzing educational data to develop models for improving learning experiences and improving institutional effectiveness. A literature review on educational data mining topics…

  9. A Comparative Study of Data Mining Techniques on Football Match Prediction

    NASA Astrophysics Data System (ADS)

    Rosli, Che Mohamad Firdaus Che Mohd; Zainuri Saringat, Mohd; Razali, Nazim; Mustapha, Aida

    2018-05-01

    Data prediction has become a trend in today’s businesses and organizations. This paper sets out to predict match outcomes for association football from the perspective of football club managers and coaches. It explores different data mining techniques for predicting match outcomes, where the target class is win, draw or lose. The main objective of this research is to find the most accurate data mining technique that fits the nature of football data. The techniques tested are Decision Trees, Neural Networks, Bayesian Network, and k-Nearest Neighbors. The results of the comparative experiments showed that Decision Trees produced the highest average prediction accuracy in the domain of football match prediction, at 99.56%.

  10. Using text-mining techniques in electronic patient records to identify ADRs from medicine use.

    PubMed

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-05-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. © 2011 The Authors. British Journal of Clinical Pharmacology © 2011 The British Pharmacological Society.

  11. Using text-mining techniques in electronic patient records to identify ADRs from medicine use

    PubMed Central

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-01-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. PMID:22122057

  12. Post-acquisition data mining techniques for LC-MS/MS-acquired data in drug metabolite identification.

    PubMed

    Dhurjad, Pooja Sukhdev; Marothu, Vamsi Krishna; Rathod, Rajeshwari

    2017-08-01

    Metabolite identification is a crucial part of the drug discovery process. LC-MS/MS-based metabolite identification has gained widespread use, but the data acquired by the LC-MS/MS instrument is complex, and thus the interpretation of data becomes troublesome. Fortunately, advancements in data mining techniques have simplified the process of data interpretation with improved mass accuracy and provide a potentially selective, sensitive, accurate and comprehensive way for metabolite identification. In this review, we have discussed the targeted (extracted ion chromatogram, mass defect filter, product ion filter, neutral loss filter and isotope pattern filter) and untargeted (control sample comparison, background subtraction and metabolomic approaches) post-acquisition data mining techniques, which facilitate the drug metabolite identification. We have also discussed the importance of integrated data mining strategy.

  13. Knowledge Discovery and Data Mining: An Overview

    NASA Technical Reports Server (NTRS)

    Fayyad, U.

    1995-01-01

    The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.

  14. Application of Information-Theoretic Data Mining Techniques in a National Ambulatory Practice Outcomes Research Network

    PubMed Central

    Wright, Adam; Ricciardi, Thomas N.; Zwick, Martin

    2005-01-01

    The Medical Quality Improvement Consortium data warehouse contains de-identified data on more than 3.6 million patients including their problem lists, test results, procedures and medication lists. This study uses reconstructability analysis, an information-theoretic data mining technique, on the MQIC data warehouse to empirically identify risk factors for various complications of diabetes including myocardial infarction and microalbuminuria. The risk factors identified match those risk factors identified in the literature, demonstrating the utility of the MQIC data warehouse for outcomes research, and RA as a technique for mining clinical data warehouses. PMID:16779156

  15. Ion Channel ElectroPhysiology Ontology (ICEPO) - a case study of text mining assisted ontology development.

    PubMed

    Elayavilli, Ravikumar Komandur; Liu, Hongfang

    2016-01-01

    Computational modeling of biological cascades is of great interest to quantitative biologists. Biomedical text has been a rich source of quantitative information. Gathering quantitative parameters and values from biomedical text is a significant challenge in the early steps of computational modeling, as it involves huge manual effort. While automatically extracting such quantitative information from biomedical text may offer some relief, the lack of an ontological representation for a subdomain is an impediment to normalizing textual extractions to a standard representation. This may render textual extractions less meaningful to domain experts. In this work, we propose a rule-based approach to automatically extract relations involving quantitative data from biomedical text describing ion channel electrophysiology. We further translated the quantitative assertions extracted through text mining to a formal representation that may help in constructing an ontology for ion channel events using a rule-based approach. We have developed the Ion Channel ElectroPhysiology Ontology (ICEPO) by integrating the information represented in closely related ontologies, such as the Cell Physiology Ontology (CPO) and the Cardiac Electro Physiology Ontology (CPEO), with the knowledge provided by domain experts. The rule-based system achieved an overall F-measure of 68.93% in extracting the quantitative data assertions on an independently annotated blind data set. We further made an initial attempt at formalizing the quantitative data assertions extracted from the biomedical text into a formal representation that offers the potential to facilitate the integration of text mining into the ontological workflow, a novel aspect of this study. This work is a case study in which we created a platform that provides formal interaction between ontology development and text mining. We have achieved partial success in extracting quantitative assertions from the biomedical text and formalizing them in an ontological framework. The ICEPO ontology is available for download at http://openbionlp.org/mutd/supplementarydata/ICEPO/ICEPO.owl.

  16. Prediction of pork quality parameters by applying fractals and data mining on MRI.

    PubMed

    Caballero, Daniel; Pérez-Palacios, Trinidad; Caro, Andrés; Amigo, José Manuel; Dahl, Anders B; Ersbøll, Bjarne K; Antequera, Teresa

    2017-09-01

    This work is the first to investigate the use of MRI, fractal algorithms and data mining techniques to determine pork quality parameters non-destructively. The main objective was to evaluate the capability of fractal algorithms (Classical Fractal Algorithm, CFA; Fractal Texture Algorithm, FTA; and One Point Fractal Texture Algorithm, OPFTA) to analyse MRI in order to predict quality parameters of loin. In addition, the effects of the MRI acquisition sequence (Gradient echo, GE; Spin echo, SE; and Turbo 3D, T3D) and of the predictive data mining technique (Isotonic regression, IR; and Multiple linear regression, MLR) were analysed. Two of the fractal algorithms, FTA and OPFTA, are appropriate for analysing MRI of loins. The acquisition sequence, the fractal algorithm and the data mining technique all seem to influence the prediction results. For most physico-chemical parameters, prediction equations with moderate to excellent correlation coefficients were achieved by using the following combinations of MRI acquisition sequences, fractal algorithms and data mining techniques: SE-FTA-MLR, SE-OPFTA-IR, GE-OPFTA-MLR and SE-OPFTA-MLR, with the last one offering the best prediction results. Thus, SE-OPFTA-MLR could be proposed as an alternative technique to determine physico-chemical traits of fresh and dry-cured loins in a non-destructive way with high accuracy. Copyright © 2017. Published by Elsevier Ltd.

  17. Fact Sheet - Final Air Toxics Rule for Gold Mine Ore Processing and Production

    EPA Pesticide Factsheets

    Fact sheet summarizing main points of National Emissions Standards for Hazardous Air Pollutants for gold ore processing and production facilities, the seventh largest source of mercury air emission in the United States.

  18. Chromite Ore from the Transvaal Region of South Africa

    EPA Pesticide Factsheets

    In 2001, EPA finalized a rule to delete both chromite ore mined in the Transvaal Region of South Africa and the unreacted ore component of the chromite ore processing residue (COPR) from TRI reporting requirements.

  19. 78 FR 77024 - Telemarketing Sales Rule; Notice of Termination of Caller ID Rulemaking

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-20

    ..., data mining and anomaly detection, and call-blocking technology). \\19\\ AT&T Servs., Inc., No. 00040, at... technically feasible, by looking at the signaling data . . . to distinguish between a CPN [calling party...

  20. An intelligent knowledge mining model for kidney cancer using rough set theory.

    PubMed

    Durai, M A Saleem; Acharjya, D P; Kannan, A; Iyengar, N Ch Sriman Narayana

    2012-01-01

    Medical diagnosis processes vary in the degree to which they attempt to deal with different complicating aspects of diagnosis, such as the relative importance of symptoms, varied symptom patterns and the relations between diseases themselves. The rough set approach has two major advantages over other methods. First, it can handle different types of data, such as categorical and numerical data. Second, it does not make any assumptions, such as a probability distribution function in stochastic modeling or a membership grade function in fuzzy set theory. It involves pattern recognition through logical computational rules rather than approximating them through smooth mathematical functional forms. In this paper we use rough set theory as a data mining tool to derive useful patterns and rules for kidney cancer faulty diagnosis. In particular, the historical data of twenty-five research hospitals and medical college is used for validation, and the results show the practical viability of the proposed approach.
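
    As an illustration of the rough set idea mentioned above, the following minimal sketch computes the lower and upper approximations of a target concept from categorical records. It is not the authors' model: the attribute names, the toy records and the function name `approximations` are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def approximations(objects, condition_attrs, target):
    """Return (lower, upper) approximations of `target`.

    objects: list of dicts mapping attribute name -> categorical value.
    condition_attrs: attributes used to decide which objects are indiscernible.
    target: set of object indices that belong to the concept of interest.
    """
    # Group objects that share identical values on all condition attributes.
    classes = defaultdict(set)
    for index, obj in enumerate(objects):
        key = tuple(obj[attr] for attr in condition_attrs)
        classes[key].add(index)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # every member is in the concept: certain
            lower |= block
        if block & target:    # at least one member is in the concept: possible
            upper |= block
    return lower, upper


# Hypothetical toy records (not taken from the paper).
records = [
    {"pain": "yes", "blood": "yes"},
    {"pain": "yes", "blood": "yes"},
    {"pain": "yes", "blood": "no"},
    {"pain": "yes", "blood": "no"},
    {"pain": "no", "blood": "no"},
]
positive = {0, 1, 2}  # indices of records labelled positive
lower, upper = approximations(records, ["pain", "blood"], positive)
```

    Records in the lower approximation support certain rules, while the boundary region (upper minus lower) holds records whose condition values cannot separate positive from negative cases.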

  1. The expert explorer: a tool for hospital data visualization and adverse drug event rules validation.

    PubMed

    Băceanu, Adrian; Atasiei, Ionuţ; Chazard, Emmanuel; Leroy, Nicolas

    2009-01-01

    An important part of adverse drug events (ADEs) detection is the validation of the clinical cases and the assessment of the decision rules to detect ADEs. For that purpose, a software called "Expert Explorer" has been designed by Ideea Advertising. Anonymized datasets have been extracted from hospitals into a common repository. The tool has 3 main features. (1) It can display hospital stays in a visual and comprehensive way (diagnoses, drugs, lab results, etc.) using tables and pretty charts. (2) It allows designing and executing dashboards in order to generate knowledge about ADEs. (3) It finally allows uploading decision rules obtained from data mining. Experts can then review the rules, the hospital stays that match the rules, and finally give their advice thanks to specialized forms. Then the rules can be validated, invalidated, or improved (knowledge elicitation phase).

  2. Data Mining and Knowledge Management in Higher Education -Potential Applications.

    ERIC Educational Resources Information Center

    Luan, Jing

    This paper introduces a new decision support tool, data mining, in the context of knowledge management. The most striking features of data mining techniques are clustering and prediction. The clustering aspect of data mining offers comprehensive characteristics analysis of students, while the predicting function estimates the likelihood for a…

  3. Statistical methods of estimating mining costs

    USGS Publications Warehouse

    Long, K.R.

    2011-01-01

    Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.
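
    A relation of the kind described above, between operating rate and ore tonnage, is often modelled as a power law and fitted by least squares on log-transformed values. The sketch below assumes that functional form (rate = a * tonnage^b); the function name `fit_power_law` and the data are hypothetical, and the constants are arbitrary rather than the estimates reported in the paper.

```python
import math


def fit_power_law(tonnage, rate):
    """Fit rate = a * tonnage**b by ordinary least squares in log-log space.

    Returns the pair (a, b).
    """
    xs = [math.log(t) for t in tonnage]
    ys = [math.log(r) for r in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


# Synthetic data generated from an arbitrary power law (not real mine data).
synthetic_tonnage = [1e6, 1e7, 1e8, 1e9]
synthetic_rate = [2.0 * t ** 0.7 for t in synthetic_tonnage]
a_hat, b_hat = fit_power_law(synthetic_tonnage, synthetic_rate)
```

    With noise-free synthetic input the fit recovers the generating constants; with real project data the residual scatter would also need to be examined.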

  4. Ensuring the Environmental and Industrial Safety in Solid Mineral Deposit Surface Mining

    NASA Astrophysics Data System (ADS)

    Trubetskoy, Kliment; Rylnikova, Marina; Esina, Ekaterina

    2017-11-01

    The growing environmental pressure of mineral deposit surface mining and the tightening of industrial safety requirements dictate the necessity of refining the regulatory framework governing safe and efficient development of underground resources. The applicable regulatory documentation governing the procedure of ore open-pit wall and bench stability design for the stage of the pit reaching its final boundary was issued several decades ago. Over recent decades, mining and geomechanical conditions have changed significantly in surface mining operations, numerous new software packages and computer developments have appeared, the opportunities offered by experimental methods of source data collection and processing and of substantiating the permissible parameters of open pit walls have changed dramatically, and, thus, methods of risk assessment have been refined [10-13]. IPKON RAS, with the support of the Federal Service for Environmental Supervision, assumed the role of initiator of the project for the development of Federal norms and regulations of industrial safety "Rules for ensuring the stability of walls and benches of open pits, open-cast mines and spoil banks", which contribute to the improvement of economic efficiency and safety of mineral deposit surface mining and to the enhancement of the competitiveness of Russian mines at the international level, which is very important in the current situation.

  5. A fuzzy decision tree for fault classification.

    PubMed

    Zio, Enrico; Baraldi, Piero; Popescu, Irina C

    2008-02-01

    In plant accident management, the control room operators are required to identify the causes of the accident, based on the different patterns of evolution of the monitored process variables thereby developing. This task is often quite challenging, given the large number of process parameters monitored and the intense emotional states under which it is performed. To aid the operators, various techniques of fault classification have been engineered. An important requirement for their practical application is the physical interpretability of the relationships among the process variables underpinning the fault classification. In this view, the present work propounds a fuzzy approach to fault classification, which relies on fuzzy if-then rules inferred from the clustering of available preclassified signal data, which are then organized in a logical and transparent decision tree structure. The advantages offered by the proposed approach are precisely that a transparent fault classification model is mined out of the signal data and that the underlying physical relationships among the process variables are easily interpretable as linguistic if-then rules that can be explicitly visualized in the decision tree structure. The approach is applied to a case study regarding the classification of simulated faults in the feedwater system of a boiling water reactor.

  6. [Analysis of medication rules for Qi-deficiency and blood-stasis syndrome of chronic heart failure based on data mining technology].

    PubMed

    Wang, Qian; Yao, Geng-Zhen; Pan, Guang-Ming; Huang, Jing-Yi; An, Yi-Pei; Zou, Xu

    2017-01-01

    To analyze the medication features and the regularity of prescriptions of traditional Chinese medicine in treating patients with Qi-deficiency and blood-stasis syndrome of chronic heart failure based on modern literature. In this article, CNKI Chinese academic journal database, Wanfang Chinese academic journal database and VIP Chinese periodical database were all searched from January 2000 to December 2015 for the relevant literature on traditional Chinese medicine treatment for Qi-deficiency and blood-stasis syndrome of chronic heart failure. Then a normalized database was established for further data mining and analysis. Subsequently, the medication features and the regularity of prescriptions were mined by using traditional Chinese medicine inheritance support system (V2.5), association rules, improved mutual information algorithm, complex system entropy clustering and other mining methods. Finally, a total of 171 articles were included, involving 171 prescriptions, 140 kinds of herbs, with a total frequency of 1 772 for the herbs. As a result, 19 core prescriptions and 7 new prescriptions were mined. The most frequently used herbs included Huangqi (Astragali Radix), Danshen (Salviae Miltiorrhizae Radix et Rhizoma), Fuling (Poria), Renshen (Ginseng Radix et Rhizoma), Tinglizi (Semen Lepidii), Baizhu (Atractylodis Macrocephalae Rhizoma), and Guizhi (Cinnamomum Ramulus). The core prescriptions were composed of Huangqi (Astragali Radix), Danshen (Salviae Miltiorrhizae Radix et Rhizoma) and Fuling (Poria), etc. The high frequent herbs and core prescriptions not only highlight the medication features of Qi-invigorating and blood-circulating therapy, but also reflect the regularity of prescriptions of blood-circulating, Yang-warming, and urination-promoting therapy based on syndrome differentiation. Moreover, the mining of the new prescriptions provide new reference and inspiration for clinical treatment of various accompanying symptoms of chronic heart failure. In conclusion, this article provides new reference for traditional Chinese medicine in the treatment of chronic heart failure. Copyright© by the Chinese Pharmaceutical Association.

  7. Knowledge discovery with classification rules in a cardiovascular dataset.

    PubMed

    Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan

    2005-12-01

    In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rules induction called AREX using evolutionary induction of decision trees and automatic programming is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes which should possibly reveal the presence of some specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of a possible new medical knowledge in the field of pediatric cardiology.

  8. 20 CFR 410.687 - Rules governing the representation and advising of claimants and parties.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... ADMINISTRATION FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, TITLE IV-BLACK LUNG BENEFITS (1969... attorney or other representative shall: (a) With intent to defraud, in any matter willfully and knowingly...

  9. PISA — Pooling Information from Several Agents: Multiplayer Argumentation from Experience

    NASA Astrophysics Data System (ADS)

    Wardeh, Maya; Bench-Capon, Trevor; Coenen, Frans

    In this paper a framework, PISA (Pooling Information from Several Agents), to facilitate multiplayer (three or more protagonists), "argumentation from experience" is described. Multiplayer argumentation is a form of dialogue game involving three or more players. The PISA framework is founded on a two player argumentation framework, PADUA (Protocol for Argumentation Dialogue Using Association Rules), also developed by the authors. One of the main advantages of both PISA and PADUA is that they avoid the resource intensive need to predefine a knowledge base, instead data mining techniques are used to facilitate the provision of "just in time" information. Many of the issues associated with multiplayer dialogue games do not present a significant challenge in the two player game. The main original contributions of this paper are the mechanisms whereby the PISA framework addresses these challenges.

  10. Text Mining in Organizational Research

    PubMed Central

    Kobayashi, Vladimer B.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.

    2017-01-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies. PMID:29881248
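
    To make the "distance and similarity computing" step listed above concrete, the following minimal sketch computes the cosine similarity between two texts represented as bags of words. It is an illustration only, not code from the article; the function name `cosine_similarity`, the whitespace tokenization and the example vacancy texts are assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between two texts using raw term counts."""
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())
    # Dot product over the terms the two documents share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical job-vacancy snippets.
vacancy_1 = "data analyst with statistics skills"
vacancy_2 = "analyst with strong statistics and data skills"
similarity = cosine_similarity(vacancy_1, vacancy_2)
```

    Real pipelines would usually add tokenization, stop-word removal and term weighting before computing similarities; those steps are omitted here to keep the sketch short.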

  11. Text Mining in Organizational Research.

    PubMed

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

  12. A Proposed Data Fusion Architecture for Micro-Zone Analysis and Data Mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kevin McCarthy; Milos Manic

    Data Fusion requires the ability to combine or “fuse” data from multiple data sources. Time Series Analysis is a data mining technique used to predict future values from a data set based upon past values. Unlike other data mining techniques, however, Time Series places special emphasis on periodicity and how seasonal and other time-based factors tend to affect trends over time. One of the difficulties encountered in developing generic time series techniques is the wide variability of the data sets available for analysis. This presents challenges all the way from the data gathering stage to results presentation. This paper presents an architecture designed and used to facilitate the collection of disparate data sets well suited to Time Series analysis as well as other predictive data mining techniques. Results show this architecture provides a flexible, dynamic framework for the capture and storage of a myriad of dissimilar data sets and can serve as a foundation from which to build a complete data fusion architecture.

  13. Application of Three Existing Stope Boundary Optimisation Methods in an Operating Underground Mine

    NASA Astrophysics Data System (ADS)

    Erdogan, Gamze; Yavuz, Mahmut

    2017-12-01

    The underground mine planning and design optimisation process has received little attention because of the complexity and variability of problems in underground mines. Although a number of optimisation studies and software tools are available, and some of them in particular have been implemented effectively to determine the ultimate-pit limits in an open pit mine, there is still a lack of studies on the optimisation of ultimate stope boundaries in underground mines. The approaches proposed for this purpose aim at maximizing the economic profit by selecting the best possible layout under operational, technical and physical constraints. In this paper, three existing heuristic techniques, namely the Floating Stope Algorithm, the Maximum Value Algorithm and the Mineable Shape Optimiser (MSO), are examined for optimisation of stope layout in a case study. Each technique is assessed in terms of applicability, algorithm capabilities and limitations, considering the underground mine planning challenges. Finally, the results are evaluated and compared.

  14. Evaluation of the environmental contamination at an abandoned mining site using multivariate statistical techniques--the Rodalquilar (Southern Spain) mining district.

    PubMed

    Bagur, M G; Morales, S; López-Chicano, M

    2009-11-15

    Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples collected in the Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored by determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation has been used to bring the data set closer to normal form, in order to reduce the non-normality of the geochemical data. The environmental impact is affected mainly by the mining activity developed in the zone, the acid drainage and finally by the chemical treatment used for the beneficiation of gold.
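
    For reference, the one-parameter Box-Cox transformation mentioned above can be sketched as follows. This shows the standard textbook form for strictly positive values; the function name `box_cox`, the choice of lambda and the sample concentrations are hypothetical and are not taken from the study.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a strictly positive value x.

    Uses (x**lam - 1) / lam for lam != 0 and log(x) for lam == 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical right-skewed concentrations (arbitrary units).
concentrations = [0.5, 1.0, 2.0, 4.0, 64.0]
transformed = [box_cox(c, 0.0) for c in concentrations]  # lam = 0 is the log transform
```

    In practice lambda is usually chosen by maximizing a normality criterion, such as the profile log-likelihood, for each variable separately.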

  15. Introduction to the mining of clinical data.

    PubMed

    Harrison, James H

    2008-03-01

    The increasing volume of medical data online, including laboratory data, represents a substantial resource that can provide a foundation for improved understanding of disease presentation, response to therapy, and health care delivery processes. Data mining supports these goals by providing a set of techniques designed to discover similarities and relationships between data elements in large data sets. Currently, medical data have several characteristics that increase the difficulty of applying these techniques, although there have been notable medical data mining successes. Future developments in integrated medical data repositories, standardized data representation, and guidelines for the appropriate research use of medical data will decrease the barriers to mining projects.

  16. Deriving preference order of post-mining land-uses through MLSA framework: application of an outranking technique

    NASA Astrophysics Data System (ADS)

    Soltanmohammadi, Hossein; Osanloo, Morteza; Aghajani Bazzazi, Abbas

    2009-08-01

    This study takes advantage of a previously developed framework for mined land suitability analysis (MLSA), consisting of economic, social, technical and mine site factors, to achieve a partial and also a complete pre-order of feasible post-mining land-uses. Analysis by an outranking multi-attribute decision-making (MADM) technique called PROMETHEE (preference ranking organization method for enrichment evaluation) was taken into consideration because of its clear advantages in the field of MLSA as compared with MADM ranking techniques. The proposed approach can be applied to a mined land through successive steps. First, performance of the MLSA attributes is scored locally by each individual decision maker (DM). Then the assigned performance scores are normalized and the deviation amplitudes of non-dominated alternatives are calculated. Weights of the attributes are calculated by another MADM technique, namely the analytical hierarchy process (AHP), in a separate procedure. Using the Gaussian preference function together with the weights, the preference indexes of the land-use alternatives are obtained. Calculating the outgoing and entering flows of the alternatives and comparing these values one by one leads to a partial pre-order of the alternatives, and calculating the net flows leads to a ranked preference for each land-use. At the final step, utilizing the PROMETHEE group decision support system, which incorporates the judgments of all the DMs, a consensual ranking can be derived. In this paper, the preference order of post-mining land-uses for a hypothetical mined land has been derived according to the judgments of one DM to demonstrate the applicability of the proposed approach.
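
    The flow calculation described in this abstract can be sketched roughly as follows. This is a generic textbook-style PROMETHEE computation with a Gaussian preference function, not the authors' MLSA implementation; the land-use names, scores, weights (which the paper derives via AHP) and the Gaussian parameter `s` are all hypothetical values chosen for illustration.

```python
import math

def gaussian_preference(d, s):
    """Gaussian preference function: 0 for d <= 0, rising towards 1 as d grows."""
    if d <= 0:
        return 0.0
    return 1.0 - math.exp(-(d * d) / (2.0 * s * s))

def promethee_flows(scores, weights, s=1.0):
    """Return (leaving, entering, net) flows for each alternative."""
    names = list(scores)
    n = len(names)

    def preference_index(a, b):
        # Weighted sum of per-criterion preferences of a over b.
        return sum(w * gaussian_preference(x - y, s)
                   for w, x, y in zip(weights, scores[a], scores[b]))

    leaving = {a: sum(preference_index(a, b) for b in names if b != a) / (n - 1)
               for a in names}
    entering = {a: sum(preference_index(b, a) for b in names if b != a) / (n - 1)
                for a in names}
    net = {a: leaving[a] - entering[a] for a in names}
    return leaving, entering, net

# Hypothetical normalized scores on three criteria (higher is better).
scores = {
    "forestry": [7.0, 8.0, 6.0],
    "agriculture": [5.0, 6.0, 4.0],
    "recreation": [6.0, 5.0, 5.0],
}
weights = [0.5, 0.3, 0.2]  # assumed weights summing to 1

leaving, entering, net = promethee_flows(scores, weights)
ranking = sorted(net, key=net.get, reverse=True)  # complete pre-order by net flow
```

    Comparing the leaving and entering flows separately gives the partial pre-order, while sorting by net flow gives the complete ranking; the net flows always sum to zero across alternatives.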

  17. Application of remote-sensing techniques to hydrologic studies in selected coal-mine areas of southeastern Kansas

    USGS Publications Warehouse

    Kenny, J.F.; McCauley, J.R.

    1983-01-01

    Disturbances resulting from intensive coal mining in the Cherry Creek basin of southeastern Kansas were investigated using color and color-infrared aerial photography in conjunction with water-quality data from simultaneously acquired samples. Imagery was used to identify the type and extent of vegetative cover on strip-mined lands and the extent and success of reclamation practices. Drainage patterns, point sources of acid mine drainage, and recharge areas for underground mines were located for onsite inspection. Comparison of these interpretations with water-quality data illustrated differences between the eastern and western parts of the Cherry Creek basin. Contamination in the eastern part is due largely to circulation of water from unreclaimed strip mines and collapse features through the network of underground mines and subsequent discharge of acidic drainage through seeps. Contamination in the western part is primarily caused by runoff and seepage from strip-mined lands in which surfaces have frequently been graded and limed but are generally devoid of mature stands of soil-anchoring vegetation. The successful use of aerial photography in the study of Cherry Creek basin indicates the potential of using remote-sensing techniques in studies of other coal-mined regions. (USGS)

  18. Exploring the social determinants of mental health service use using intersectionality theory and CART analysis.

    PubMed

    Cairney, John; Veldhuizen, Scott; Vigod, Simone; Streiner, David L; Wade, Terrance J; Kurdyak, Paul

    2014-02-01

    Fewer than half of individuals with a mental disorder seek formal care in a given year. Much research has been conducted on the factors that influence service use in this population, but the methods generally used cannot easily identify the complex interactions that are thought to exist. In this paper, we examine predictors of subsequent service use among respondents to a population health survey who met criteria for a past-year mood, anxiety or substance-related disorder. To determine service use, we use an administrative database including all physician consultations in the period of interest. To identify predictors, we use classification tree (CART) analysis, a data mining technique with the ability to identify unsuspected interactions. We compare results to those from logistic regression models. We identify 1213 individuals with past-year disorder. In the year after the survey, 24% (n=312) of these had a mental health-related physician consultation. Logistic regression revealed that age, sex and marital status predicted service use. CART analysis yielded a set of rules based on age, sex, marital status and income adequacy, with marital status playing a role among men and income adequacy being important among women. CART analysis proved moderately effective overall, with agreement of 60%, sensitivity of 82% and specificity of 53%. Results highlight the potential of data-mining techniques to uncover complex interactions, and offer support to the view that the intersection of multiple statuses influences health and behaviour in ways that are difficult to identify with conventional statistics. The disadvantages of these methods are also discussed.
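
    The three accuracy figures reported in this abstract are linked by a simple identity: overall agreement is the prevalence-weighted average of sensitivity and specificity. The sketch below only restates that arithmetic using the percentages quoted in the abstract; the function name `overall_agreement` is an assumption, and no study data beyond those percentages is used.

```python
def overall_agreement(sensitivity, specificity, prevalence):
    """Fraction of cases classified correctly, given the share of true positives."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

# Figures quoted in the abstract: 82% sensitivity, 53% specificity,
# 24% of respondents with a consultation in the following year.
agreement = overall_agreement(0.82, 0.53, 0.24)
```

    Because only about a quarter of respondents used services, the overall agreement sits much closer to the specificity than to the sensitivity.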

  19. 30 CFR 282.28 - Environmental protection measures.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... recent research or improved monitoring techniques. (5) When prototype test mining is proposed, the lessee...) The sampling techniques and procedures to be used to acquire the needed data and information; (ii) The... evaluation of the approved Delineation, Testing, or Mining Plan. The Director's review of the air quality...

  20. Analyzing Teaching Performance of Instructors Using Data Mining Techniques

    ERIC Educational Resources Information Center

    Mardikyan, Sona; Badur, Bertain

    2011-01-01

    Student evaluations to measure the teaching effectiveness of instructors have been very frequently applied in higher education for many years. This study investigates the factors associated with the assessment of instructors' teaching performance using two different data mining techniques: stepwise regression and decision trees. The data collected…

  1. Monitoring and inversion on land subsidence over mining area with InSAR technique

    USGS Publications Warehouse

    Wang, Y.; Zhang, Q.; Zhao, C.; Lu, Z.; Ding, X.

    2011-01-01

    Wulanmulun town, located in Inner Mongolia, is one of the main mining areas of Shendong Company, with mines such as the Shangwan coal mine and the Bulianta coal mine, and has been suffering serious mine collapse with the underground mine withdrawal. We use ALOS/PALSAR data to extract land deformation over these regions using the Small Baseline Subsets (SBAS) method. We then compared the InSAR results with the underground mining activities and found high correlations between them. Lastly, we applied the Distributed Dislocation (Okada) model to invert the mine collapse mechanism. © 2011 Copyright Society of Photo-Optical Instrumentation Engineers (SPIE).

  2. 15 CFR 970.600 - General.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  3. 15 CFR 970.600 - General.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  4. 15 CFR 970.600 - General.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  5. 15 CFR 971.500 - General.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  6. 15 CFR 971.500 - General.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  7. 15 CFR 971.500 - General.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  8. 15 CFR 971.500 - General.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  9. 15 CFR 971.500 - General.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR COMMERCIAL RECOVERY PERMITS Resource Development § 971.500 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  10. 15 CFR 970.600 - General.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  11. 15 CFR 970.600 - General.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... AND ATMOSPHERIC ADMINISTRATION, DEPARTMENT OF COMMERCE GENERAL REGULATIONS OF THE ENVIRONMENTAL DATA SERVICE DEEP SEABED MINING REGULATIONS FOR EXPLORATION LICENSES Resource Development Concepts § 970.600 General. Several provisions in the Act relate to appropriate mining techniques or mining efficiency. These...

  12. VRLane: a desktop virtual safety management program for underground coal mine

    NASA Astrophysics Data System (ADS)

    Li, Mei; Chen, Jingzhu; Xiong, Wei; Zhang, Pengpeng; Wu, Daozheng

    2008-10-01

    VR technologies, which generate immersive, interactive, and three-dimensional (3D) environments, are seldom applied to coal mine safety work management. In this paper, a new method that combines VR technologies with an underground mine safety management system is explored. A desktop virtual safety management program for underground coal mines, called VRLane, was developed. The paper mainly covers current research advances in VR, system design, key techniques and system application. Two important techniques are introduced. Firstly, an algorithm was designed and implemented with which the 3D laneway models and equipment models can be built automatically from the latest 2D mine drawings, whereas common VR programs establish the 3D environment using 3DS Max or other 3D modeling software packages, with which laneway models are built manually and laboriously. Secondly, VRLane achieves system integration with underground industrial automation. VRLane not only renders a realistic 3D laneway environment, but also describes the status of coal mining, with functions for displaying the run states and related parameters of equipment, pre-alarming abnormal mining events, and animating mine cars, mine workers, or long-wall shearers. The system, which is cheap, dynamic and easy to maintain, provides a useful tool for safety production management in coal mines.

  13. Text mining and its potential applications in systems biology.

    PubMed

    Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi

    2006-12-01

    With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.

  14. Application of LANDSAT data to monitor land reclamation progress in Belmont County, Ohio

    NASA Technical Reports Server (NTRS)

    Bloemer, H. H. L.; Brumfield, J. O.; Campbell, W. J.; Witt, R. G.; Bly, B. G.

    1981-01-01

    Strip and contour mining techniques are reviewed as well as some studies conducted to determine the applicability of LANDSAT and associated digital image processing techniques to the surficial problems associated with mining operations. A nontraditional unsupervised classification approach to multispectral data is considered which renders increased classification separability in land cover analysis of surface mined areas. The approach also reduces the dimensionality of the data and requires only minimal analytical skills in digital data processing.

  15. Applying Web Usage Mining for Personalizing Hyperlinks in Web-Based Adaptive Educational Systems

    ERIC Educational Resources Information Center

    Romero, Cristobal; Ventura, Sebastian; Zafra, Amelia; de Bra, Paul

    2009-01-01

    Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in…

  16. COMPARISON OF DATA FROM SYNTHETIC LEACHATE AND DIRECT SAMPLING OF ACID DRAINAGE FROM MINE WASTES: IMPLICATIONS FOR MERCURY TRANSPORT AND WASTE MANAGEMENT

    EPA Science Inventory

    The Sulphur Bank Mercury Mine (SBMM) in Lake County, California operated from the 1860s through the 1950s. Mining for sulfur started with surface operations and progressed to shaft, then open pit techniques to obtain mercury. Mining has resulted in deposition of approximately ...

  17. Examining Online Learning Patterns with Data Mining Techniques in Peer-Moderated and Teacher-Moderated Courses

    ERIC Educational Resources Information Center

    Hung, Jui-Long; Crooks, Steven M.

    2009-01-01

    The student learning process is important in online learning environments. If instructors can "observe" online learning behaviors, they can provide adaptive feedback, adjust instructional strategies, and assist students in establishing patterns of successful learning activities. This study used data mining techniques to examine and…

  18. Site investigation report mine research project GUE 70-14.10, Guernsey, Ohio.

    DOT National Transportation Integrated Search

    2003-06-01

    Geophysical investigative techniques can be a valuable supplement to standard subsurface investigations for the evaluation of abandoned underground coal mine workings and their potential impacts at the ground surface. The GUE 70-14.10 Mine Rese...

  19. Optimization of fertirrigation efficiency in strawberry crops by application of fuzzy logic techniques.

    PubMed

    de la Torre, M L; Grande, J A; Aroba, J; Andujar, J M

    2005-11-01

    A high level of price support has favoured intensive agriculture and an increasing use of fertilisers and pesticides. This has resulted in the pollution of water and soils and damage to certain eco-systems. The target relationship that must be established between agriculture and environment can be called "sustainable agriculture". In this work we aim at relating strawberry total yield with nitrate concentration in water at different soil depths. To achieve this objective, we have used the Predictive Fuzzy Rules Generator (PreFuRGe) tool, based on fuzzy logic and data mining, by means of which the dose that allows a balance between yield and environmental damage minimization can be determined. This determination is quite simple and is done directly from the obtained charts. This technique can be used in other types of crops, making it possible to determine precisely the depth at which the appropriate dose of nitrate fertilizer must be applied, on the one hand providing the maximum yield but, on the other hand, with the minimum loss of nitrates that leach through the saturated zone and pollute aquifers.

  20. Semi-automated contour recognition using DICOMautomaton

    NASA Astrophysics Data System (ADS)

    Clark, H.; Wu, J.; Moiseenko, V.; Lee, R.; Gill, B.; Duzenli, C.; Thomas, S.

    2014-03-01

    Purpose: A system has been developed which recognizes and classifies Digital Imaging and Communication in Medicine contour data with minimal human intervention. It allows researchers to overcome obstacles which tax analysis and mining systems, including inconsistent naming conventions and differences in data age or resolution. Methods: Lexicographic and geometric analysis is used for recognition. Well-known lexicographic methods implemented include Levenshtein-Damerau, bag-of-characters, Double Metaphone, Soundex, and (word and character)-N-grams. Geometrical implementations include 3D Fourier Descriptors, probability spheres, boolean overlap, simple feature comparison (e.g. eccentricity, volume) and rule-based techniques. Both analyses implement custom, domain-specific modules (e.g. emphasis differentiating left/right organ variants). Contour labels from 60 head and neck patients are used for cross-validation. Results: Mixed-lexicographical methods show an effective improvement in more than 10% of recognition attempts compared with a pure Levenshtein-Damerau approach when withholding 70% of the lexicon. Domain-specific and geometrical techniques further boost performance. Conclusions: DICOMautomaton allows users to recognize contours semi-automatically. As usage increases and the lexicon is filled with additional structures, performance improves, increasing the overall utility of the system.
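
    To illustrate the kind of lexicographic matching this abstract lists, here is a minimal sketch of a Damerau-Levenshtein distance in its restricted "optimal string alignment" form, used to pick the closest lexicon entry for a misspelled contour label. It is not DICOMautomaton's code; the function names, the lower-casing step and the three-entry lexicon are assumptions made for the example.

```python
def osa_distance(a, b):
    """Edit distance allowing insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

def best_match(label, lexicon):
    """Return the lexicon entry closest to the label (case-insensitive)."""
    return min(lexicon, key=lambda entry: osa_distance(label.lower(), entry.lower()))

# Hypothetical lexicon of canonical structure names.
lexicon = ["left parotid", "right parotid", "spinal cord"]
match = best_match("Left Partoid", lexicon)
```

    A purely lexicographic match like this cannot tell whether a left or right label is anatomically correct, which is presumably where the geometric and domain-specific modules mentioned in the abstract come in.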

  1. Analysis of Hospital Processes with Process Mining Techniques.

    PubMed

    Orellana García, Arturo; Pérez Alfonso, Damián; Larrea Armenteros, Osvaldo Ulises

    2015-01-01

    Process mining allows for discovering, monitoring, and improving processes identified in information systems from their event logs. In hospital environments, process analysis has been a crucial factor for cost reduction, control and proper use of resources, better patient care, and achieving service excellence. This paper presents a new component for event log generation in the Hospital Information System (HIS) developed at the University of Informatics Sciences. The event logs obtained are used for analysis of hospital processes with process mining techniques. The proposed solution is intended to generate high-quality event logs in the system. The performed analyses allowed for redefining functions in the system and proposing a proper flow of information. The study exposed the need to incorporate process mining techniques in hospital systems to analyze process execution. Moreover, we illustrate its application for making clinical and administrative decisions for the management of hospital activities.
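
    As a rough illustration of what process discovery from an event log involves, the sketch below counts "directly-follows" pairs of activities per case, a common first step in process mining. It is not the HIS component described in the abstract; the tuple layout (case id, timestamp, activity), the function name and the sample log are hypothetical.

```python
from collections import Counter, defaultdict

def directly_follows(event_log):
    """Count how often one activity is immediately followed by another within a case."""
    traces = defaultdict(list)
    # Sorting the tuples orders events by case id, then by timestamp.
    for case_id, timestamp, activity in sorted(event_log):
        traces[case_id].append(activity)
    counts = Counter()
    for activities in traces.values():
        counts.update(zip(activities, activities[1:]))
    return counts

# Hypothetical hospital event log: (case id, timestamp, activity).
event_log = [
    ("case1", 1, "admission"), ("case1", 2, "triage"), ("case1", 3, "discharge"),
    ("case2", 1, "admission"), ("case2", 2, "triage"),
    ("case2", 3, "surgery"), ("case2", 4, "discharge"),
]
dfg = directly_follows(event_log)
```

    The resulting counts can be read as the weighted edges of a process map, which is why each event needs at least a case identifier, an activity name and an ordering attribute.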

  2. Comparison of rule induction, decision trees and formal concept analysis approaches for classification

    NASA Astrophysics Data System (ADS)

    Kotelnikov, E. V.; Milov, V. R.

    2018-05-01

    Rule-based learning algorithms are more transparent and easier to interpret than neural networks and deep learning algorithms. These properties make it possible to use such algorithms effectively to solve descriptive tasks of data mining. The choice of an algorithm also depends on its ability to solve predictive tasks. The article compares the quality of solutions to binary and multiclass classification problems based on experiments with six datasets from the UCI Machine Learning Repository. The authors investigate three algorithms: Ripper (rule induction), C4.5 (decision trees), In-Close (formal concept analysis). The results of the experiments show that In-Close demonstrates the best quality of classification in comparison with Ripper and C4.5; however, the latter two generate more compact rule sets.

  3. Analysis of Human Mobility Based on Cellular Data

    NASA Astrophysics Data System (ADS)

    Arifiansyah, F.; Saptawati, G. A. P.

    2017-01-01

    Nowadays not only adults but even teenagers and children have their own mobile phones. This phenomenon indicates that the mobile phone has become an important part of everyday life. As a result, the amount of cellular data has also increased rapidly. Cellular data is defined as data that records communication among mobile phone users. Cellular data is easy to obtain because telecommunications companies already record it for their billing systems. Billing data keeps a log of each user's cellular data usage over time. From this data we can obtain information about communication between users. Through a data visualization process, interesting patterns can be seen in the raw cellular data, so that users can obtain prior knowledge before performing data analysis. Cellular data can be processed using data mining to find human mobility patterns in the existing data. In this paper, we use frequent pattern mining and association rule discovery to observe the relations between attributes in cellular data and then visualize them. We used Weka tools to find the rules in the data mining stage. Generally, the utilization of cellular data can provide supporting information for the decision-making process and become a data source that provides the solutions and information needed by decision makers.
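
    The association rules mentioned in this abstract are usually judged by support (how often the items occur together) and confidence (how often the consequent holds when the antecedent does). The sketch below enumerates two-item rules by brute force; it is not Apriori and not the Weka implementation used by the authors, and the attribute=value items and thresholds are hypothetical.

```python
from itertools import combinations

def mine_pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for two-item rules."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({item for t in transactions for item in t})
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules

# Hypothetical cellular records, each encoded as a set of attribute=value items.
records = [
    {"time=evening", "cell=downtown", "type=data"},
    {"time=evening", "cell=downtown", "type=voice"},
    {"time=morning", "cell=suburb", "type=data"},
    {"time=evening", "cell=downtown", "type=data"},
]
rules = mine_pair_rules(records, min_support=0.5, min_confidence=0.9)
```

    On this toy data only the evening/downtown pair clears both thresholds, in both directions; raising the support threshold is the usual way to keep the number of reported rules manageable.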

  4. Investigating the Relation Between Prevalence of Asthmatic Allergy with the Characteristics of the Environment Using Association Rule Mining

    NASA Astrophysics Data System (ADS)

    Kanani Sadat, Y.; Karimipour, F.; Kanani Sadat, A.

    2014-10-01

    The prevalence of allergic diseases has greatly increased in recent decades due to contamination of the environment with allergy stimuli. A common treatment is to identify the allergy stimulus and then prevent the patient from being exposed to it. There are, however, many unknown stimuli of allergic diseases that are related to the characteristics of the living environment. In this paper, we focus on the effect of air pollution on asthmatic allergies and investigate the association between the prevalence of such allergies and those characteristics of the environment that may affect air pollution. For this, spatial association rule mining has been deployed to mine the association between the spatial distribution of allergy prevalence and air pollution parameters such as CO, SO2, NO2, PM10, PM2.5, and O3 (compiled by the air pollution monitoring stations) as well as living distance to parks and roads. The results for the case study (i.e., Tehran metropolitan area) indicate that distance to parks and roads as well as CO, NO2, PM10, and PM2.5 are related to the allergy prevalence in December (the most polluted month of the year in Tehran), while SO2 and O3 have no effect on it.

  5. Mining Context-Aware Association Rules Using Grammar-Based Genetic Programming.

    PubMed

    Luna, Jose Maria; Pechenizkiy, Mykola; Del Jesus, Maria Jose; Ventura, Sebastian

    2017-09-25

    Real-world data usually comprise features whose interpretation depends on some contextual information. Such context-sensitive features and patterns are of high interest to be discovered and analyzed in order to obtain the right meaning. This paper formulates the problem of mining context-aware association rules, which refers to the search for associations between itemsets such that the strength of their implication depends on a contextual feature. For the discovery of this type of association, a model that restricts the search space and includes syntax constraints by means of a grammar-based genetic programming methodology is proposed. Grammars can be considered a useful way of introducing subjective knowledge to the pattern mining process as they are highly related to the background knowledge of the user. The performance and usefulness of the proposed approach are examined by considering synthetically generated datasets. A posteriori analysis on different domains is also carried out to demonstrate the utility of this kind of association. For example, in educational domains, it is essential to identify and understand contextual and context-sensitive factors that affect overall and individual student behavior and performance. The results of the experiments suggest that the approach is feasible and that it automatically identifies interesting context-aware associations from real-world datasets.

  6. Data Mining Techniques for Customer Relationship Management

    NASA Astrophysics Data System (ADS)

    Guo, Feng; Qin, Huilin

    2017-10-01

    Data mining has made customer relationship management (CRM) a new area where firms can gain a competitive advantage, and plays a key role in firms' management decisions. In this paper, we first analyze the value and application fields of data mining techniques for CRM, and then explore how data mining is applied to customer churn analysis. A new business culture is developing today. The conventional production-centered and sales-oriented market strategy is gradually shifting to a customer-centered and service-oriented one. Customers' value orientation is increasingly affecting that of firms, and customer resources have become one of the most important strategic resources. Therefore, understanding customers' needs and identifying the customers who contribute most has become the driving force of most modern business.

  7. 30 CFR 282.23 - Testing Plan.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Resources BUREAU OF OCEAN ENERGY MANAGEMENT, REGULATION, AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR... lessee needs more information to develop a detailed Mining Plan than is obtainable under an approved... techniques or technology or mining equipment, or to determine environmental effects by a pilot test mining...

  8. 43 CFR 3481.4 - Temporary interruption in coal severance.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 43 Public Lands: Interior 2 2013-10-01 2013-10-01 false Temporary interruption in coal severance... LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES General Provisions § 3481.4 Temporary interruption in coal severance. ...

  9. 43 CFR 3481.4 - Temporary interruption in coal severance.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 43 Public Lands: Interior 2 2012-10-01 2012-10-01 false Temporary interruption in coal severance... LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES General Provisions § 3481.4 Temporary interruption in coal severance. ...

  10. 43 CFR 3481.4 - Temporary interruption in coal severance.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 43 Public Lands: Interior 2 2014-10-01 2014-10-01 false Temporary interruption in coal severance... LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES General Provisions § 3481.4 Temporary interruption in coal severance. ...

  11. 43 CFR 3481.4 - Temporary interruption in coal severance.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Temporary interruption in coal severance... LAND MANAGEMENT, DEPARTMENT OF THE INTERIOR MINERALS MANAGEMENT (3000) COAL EXPLORATION AND MINING OPERATIONS RULES General Provisions § 3481.4 Temporary interruption in coal severance. ...

  12. 75 FR 61366 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-05

    ... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We are announcing receipt of a proposed... that we will follow for the public hearing, if one is requested. DATES: We will accept written comments...

  13. 78 FR 13004 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-26

    ... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We are announcing receipt of a proposed... will follow for the public hearing, if one is requested. DATES: We will accept written comments on this...

  14. 75 FR 81459 - Simplified Proceedings

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-28

    ... FEDERAL MINE SAFETY AND HEALTH REVIEW COMMISSION 29 CFR Part 2700 Simplified Proceedings AGENCY... Commission is publishing a final rule to simplify the procedures for handling certain civil penalty.... Electronic comments should state ``Comments on Simplified Proceedings'' in the subject line and be sent to...

  15. 78 FR 11796 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-20

    ... personal identifying information from public review, we cannot guarantee that we will be able to do so... our review of the proposed amendment after the close of the public comment period and determine... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and...

  16. 76 FR 73885 - Mandatory Reporting of Greenhouse Gases

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-29

    .... 211112 Natural gas liquid extraction facilities. Underground Coal Mines........ 212113 Underground... natural gas liquids in addition to suppliers of petroleum products. 2. Summary of Comments and Responses... Mandatory Reporting of Greenhouse Gases; Final Rule Federal Register / Vol. 76, No. 229 / Tuesday...

  17. Human Systems Integration (HSI) Associated Development Activities in Japan

    DTIC Science & Technology

    2008-06-12

    machine learning and data mining methods. The continuous effort (KAIZEN) to improve the analysis phases are illustrated in Figure 14. Although there...model [Fig. 14 labels: Extraction of a workflow; Extraction of a control rule; Variation analysis and improvement; Plant operation KAIZEN]

  18. Utilization of volume correlation filters for underwater mine identification in LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Walls, Bradley

    2008-04-01

    Underwater mine identification persists as a critical technology pursued aggressively by the Navy for fleet protection. As such, new and improved techniques must continue to be developed in order to provide measurable increases in mine identification performance and noticeable reductions in false alarm rates. In this paper we show how recent advances in the Volume Correlation Filter (VCF) developed for ground based LIDAR systems can be adapted to identify targets in underwater LIDAR imagery. Current automated target recognition (ATR) algorithms for underwater mine identification employ spatial based three-dimensional (3D) shape fitting of models to LIDAR data to identify common mine shapes consisting of the box, cylinder, hemisphere, truncated cone, wedge, and annulus. VCFs provide a promising alternative to these spatial techniques by correlating 3D models against the 3D rendered LIDAR data.
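
    The idea of correlating a 3D model against 3D data can be illustrated with a minimal sketch. This is plain FFT-based cross-correlation (a simple matched filter), not the Volume Correlation Filter described in the record, and it assumes NumPy. The names `correlate3d` and `best_match` and the toy voxel grid are hypothetical and chosen only for illustration.

```python
import numpy as np

def correlate3d(volume, template):
    """Circular cross-correlation of a 3D volume with a smaller 3D template.

    Computed in the frequency domain: corr = IFFT(FFT(volume) * conj(FFT(template))).
    No normalisation is applied, so this is only suitable for a toy example.
    """
    padded = np.zeros(volume.shape, dtype=float)
    dx, dy, dz = template.shape
    padded[:dx, :dy, :dz] = template  # zero-pad the template to the volume size
    spectrum = np.fft.fftn(volume) * np.conj(np.fft.fftn(padded))
    return np.real(np.fft.ifftn(spectrum))

def best_match(volume, template):
    """Return the (x, y, z) offset at which the correlation surface peaks."""
    surface = correlate3d(volume, template)
    return np.unravel_index(np.argmax(surface), surface.shape)

# Hypothetical data: an empty 16x16x16 voxel grid containing one 3x3x3 box.
volume = np.zeros((16, 16, 16))
volume[5:8, 9:12, 2:5] = 1.0
box_model = np.ones((3, 3, 3))

peak = best_match(volume, box_model)
print(peak)  # offset of the box corner within the grid
```

    In this toy case the correlation peak falls at the corner offset of the embedded box. Real LIDAR data would additionally need normalisation and a detection threshold to control false alarms.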

  19. Intelligent and integrated techniques for coalbed methane (CBM) recovery and reduction of greenhouse gas emission.

    PubMed

    Qianting, Hu; Yunpei, Liang; Han, Wang; Quanle, Zou; Haitao, Sun

    2017-07-01

    Coalbed methane (CBM) recovery is a crucial approach to exploiting a clean energy source and reducing greenhouse gas emissions. In the past 10 years, remarkable achievements in CBM recovery have been made in China. However, some key difficulties remain, such as long borehole drilling in complicated geological conditions and poor gas drainage due to low permeability. In this study, intelligent and integrated techniques for CBM recovery are introduced. These integrated techniques mainly include underground CBM recovery techniques and ground well CBM recovery techniques. The underground CBM recovery techniques consist of the borehole formation technique, the gas concentration improvement technique, and the permeability enhancement technique. According to the division of the mining-induced disturbance area, the ground well arrangement area and well structure type in the mining-induced disturbance developing area and the mining-induced disturbance stable area are optimized to significantly improve ground well CBM recovery. In addition, automatic devices such as a drilling pipe installation device have been developed to achieve remote control of data recording, which makes the integrated techniques intelligent. These techniques can provide key solutions to some long-term difficulties in CBM recovery.

  20. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.
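
    As a rough illustration of the rule-based (knowledge-based) approach mentioned above, the sketch below applies a single hand-written pattern to pull simple relations out of a sentence. The pattern, the relation verbs, the function name `extract_relations`, and the example sentence are all assumptions made for illustration and are not taken from the chapter; a machine-learning-based system would instead learn such patterns from annotated corpora.

```python
import re

# Hypothetical hand-written rule: "<agent> inhibits|activates <target>".
# Entity names are approximated as runs of letters, digits and hyphens.
RULE = re.compile(
    r"\b(?P<agent>[A-Za-z0-9-]+) "
    r"(?P<relation>inhibits|activates) "
    r"(?P<target>[A-Za-z0-9-]+)\b"
)

def extract_relations(text):
    """Return (agent, relation, target) triples matched by the rule."""
    return [(m["agent"], m["relation"], m["target"]) for m in RULE.finditer(text)]

sentence = "Imatinib inhibits BCR-ABL, while EGF activates EGFR."
relations = extract_relations(sentence)
print(relations)
```

    A single surface pattern like this misses paraphrases and negation, which is one reason the ambiguity at every level of linguistic structure matters and why many systems combine rules with statistical components.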
