Sample records for mining approach identifies

  1. A Data Mining Approach to Identify Sexuality Patterns in a Brazilian University Population.

    PubMed

    Waleska Simões, Priscyla; Cesconetto, Samuel; Toniazzo de Abreu, Larissa Letieli; Côrtes de Mattos Garcia, Merisandra; Cassettari Junior, José Márcio; Comunello, Eros; Bisognin Ceretta, Luciane; Aparecida Manenti, Sandra

    2015-01-01

    This paper presents the profile and experience of sexuality generated from a data mining classification task. We used a database about sexuality and gender violence performed on a university population in southern Brazil. The data mining task identified two relationships between the variables, which enabled the distinction of subgroups that better detail the profile and experience of sexuality. The identification of the relationships between the variables defines behavioral models and factors of risk that will help define the algorithms being implemented in the data mining classification task.

  2. An Integrative data mining approach to identifying Adverse ...

    EPA Pesticide Factsheets

    The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP

  3. An integrative data mining approach to identifying adverse outcome pathway signatures.

    PubMed

    Oki, Noffisat O; Edwards, Stephen W

    2016-03-28

    The Adverse Outcome Pathway (AOP) framework is a tool for making biological connections and summarizing key information across different levels of biological organization to connect biological perturbations at the molecular level to adverse outcomes for an individual or population. Computational approaches to explore and determine these connections can accelerate the assembly of AOPs. By leveraging the wealth of publicly available data covering chemical effects on biological systems, computationally-predicted AOPs (cpAOPs) were assembled via data mining of high-throughput screening (HTS) in vitro data, in vivo data and other disease phenotype information. Frequent Itemset Mining (FIM) was used to find associations between the gene targets of ToxCast HTS assays and disease data from Comparative Toxicogenomics Database (CTD) by using the chemicals as the common aggregators between datasets. The method was also used to map gene expression data to disease data from CTD. A cpAOP network was defined by considering genes and diseases as nodes and FIM associations as edges. This network contained 18,283 gene to disease associations for the ToxCast data and 110,253 for CTD gene expression. Two case studies show the value of the cpAOP network by extracting subnetworks focused either on fatty liver disease or the Aryl Hydrocarbon Receptor (AHR). The subnetwork surrounding fatty liver disease included many genes known to play a role in this disease. When querying the cpAOP network with the AHR gene, an interesting subnetwork including glaucoma was identified. While substantial literature exists to support the potential for AHR ligands to elicit glaucoma, it was not explicitly captured in the public annotation information in CTD. The subnetwork from this analysis suggests a cpAOP that includes changes in CYP1B1 expression, which has been previously established in the literature as a primary cause of glaucoma. These case studies highlight the value in integrating multiple data

  4. Identifying Engineering Students' English Sentence Reading Comprehension Errors: Applying a Data Mining Technique

    ERIC Educational Resources Information Center

    Tsai, Yea-Ru; Ouyang, Chen-Sen; Chang, Yukon

    2016-01-01

    The purpose of this study is to propose a diagnostic approach to identify engineering students' English reading comprehension errors. Student data were collected during the process of reading texts of English for science and technology on a web-based cumulative sentence analysis system. For the analysis, the association-rule, data mining technique…

  5. Design approaches in quarrying and pit-mining reclamation

    USGS Publications Warehouse

    Arbogast, Belinda F.

    1999-01-01

    Reclaimed mine sites have been evaluated so that the public, industry, and land planners may recognize there are innovative designs available for consideration and use. People tend to see cropland, range, and road cuts as a necessary part of their everyday life, not as disturbed areas despite their high visibility. Mining also generates a disturbed landscape, unfortunately one that many consider waste until reclaimed by human beings. The development of mining provides an economic base and use of a natural resource to improve the quality of human life. Equally important is a sensitivity to the geologic origin and natural pattern of the land. Wisely shaping our environment requires a design plan and product that responds to a site's physiography, ecology, function, artistic form, and public perception. An examination of selected sites for their landscape design suggested nine approaches for mining reclamation. The oldest design approach around is nature itself. Humans may sometimes do more damage going to an area in the attempt to repair it. Given enough geologic time, a small-site area, and stable adjacent ecosystems, disturbed areas recover without mankind's input. Visual screens and buffer zones conceal the facility in a camouflage approach. Typically, earth berms, fences, and plantings are used to disguise the mining facility. Restoration targets social or economic benefits by reusing the site for public amenities, most often in urban centers with large populations. A mitigation approach attempts to protect the environment and return mined areas to use with scientific input. The reuse of cement, building rubble, and macadam meets only about 10% of the demand for aggregate. Recognizing the limited supply of mineral resources and encouraging recycling efforts are steps in a renewable resource approach. An educative design approach effectively communicates mining information through outreach, land stewardship, and community service. Mine sites used for

  6. Data Mining and Pattern Recognition Models for Identifying Inherited Diseases: Challenges and Implications.

    PubMed

    Iddamalgoda, Lahiru; Das, Partha S; Aponso, Achala; Sundararajan, Vijayaraghava S; Suravajhala, Prashanth; Valadi, Jayaraman K

    2016-01-01

    Data mining and pattern recognition methods reveal interesting findings in genetic studies, especially on how the genetic makeup is associated with inherited diseases. Although researchers have proposed various data mining models for biomedical approaches, there remains a challenge in accurately prioritizing the single nucleotide polymorphisms (SNP) associated with the disease. In this commentary, we review the state-of-the-art data mining and pattern recognition models for identifying inherited diseases and deliberate on the need for binary classification- and scoring-based prioritization methods in determining causal variants. While we discuss the pros and cons associated with these known methods, we argue that the gene prioritization methods and the protein interaction (PPI) methods, in conjunction with K-nearest neighbors, could be used in accurately categorizing the genetic factors in disease causation.

  7. Identifying antecedent conditions responsible for the high rate of mining injuries in Zambia.

    PubMed

    Miller, Hugh B; Sinkala, Thomson; Renger, Ralph F; Peacock, Erin M; Tabor, Joseph A; Burgess, Jefferey L

    2006-01-01

    The incident rates of mining-related accidents and injuries in developing countries exceed those of developed nations. Interventions by international organizations routinely fail to produce appreciable long-term improvement. One major reason is the inability to identify and analyze the underlying factors responsible for creating unsafe working conditions. Understanding these antecedent conditions is necessary to formulate effective intervention strategies and prioritize the use of limited resources. This study utilized a logic model approach to determine the root causes and broad categories of potential interventions for mining accidents and injuries in Zambia. Results showed that policy interventions have the greatest potential for substantive change. A process of educating officials from government and mining companies about the economic and social merits of health and safety programs and extensive changes in regulatory structure and enforcement are needed.

  8. Text mining applied to electronic cardiovascular procedure reports to identify patients with trileaflet aortic stenosis and coronary artery disease.

    PubMed

    Small, Aeron M; Kiss, Daniel H; Zlatsin, Yevgeny; Birtwell, David L; Williams, Heather; Guerraty, Marie A; Han, Yuchi; Anwaruddin, Saif; Holmes, John H; Chirinos, Julio A; Wilensky, Robert L; Giri, Jay; Rader, Daniel J

    2017-08-01

    Interrogation of the electronic health record (EHR) using billing codes as a surrogate for diagnoses of interest has been widely used for clinical research. However, the accuracy of this methodology is variable, as it reflects billing codes rather than severity of disease, and depends on the disease and the accuracy of the coding practitioner. Systematic application of text mining to the EHR has had variable success for the detection of cardiovascular phenotypes. We hypothesize that the application of text mining algorithms to cardiovascular procedure reports may be a superior method to identify patients with cardiovascular conditions of interest. We adapted the Oracle product Endeca, which utilizes text mining to identify terms of interest from a NoSQL-like database, for purposes of searching cardiovascular procedure reports and termed the tool "PennSeek". We imported 282,569 echocardiography reports representing 81,164 individuals and 27,205 cardiac catheterization reports representing 14,567 individuals from non-searchable databases into PennSeek. We then applied clinical criteria to these reports in PennSeek to identify patients with trileaflet aortic stenosis (TAS) and coronary artery disease (CAD). Accuracy of patient identification by text mining through PennSeek was compared with ICD-9 billing codes. Text mining identified 7115 patients with TAS and 9247 patients with CAD. ICD-9 codes identified 8272 patients with TAS and 6913 patients with CAD. 4346 patients with AS and 6024 patients with CAD were identified by both approaches. A randomly selected sample of 200-250 patients uniquely identified by text mining was compared with 200-250 patients uniquely identified by billing codes for both diseases. We demonstrate that text mining was superior, with a positive predictive value (PPV) of 0.95 compared to 0.53 by ICD-9 for TAS, and a PPV of 0.97 compared to 0.86 for CAD. These results highlight the superiority of text mining algorithms applied to electronic

  9. A Node Linkage Approach for Sequential Pattern Mining

    PubMed Central

    Navarro, Osvaldo; Cumplido, René; Villaseñor-Pineda, Luis; Feregrino-Uribe, Claudia; Carrasco-Ochoa, Jesús Ariel

    2014-01-01

    Sequential Pattern Mining is a widely addressed problem in data mining, with applications such as analyzing Web usage, examining purchase behavior, and text mining, among others. Nevertheless, with the dramatic increase in data volume, the current approaches prove inefficient when dealing with large input datasets, a large number of different symbols and low minimum supports. In this paper, we propose a new sequential pattern mining algorithm, which follows a pattern-growth scheme to discover sequential patterns. Unlike most pattern growth algorithms, our approach does not build a data structure to represent the input dataset, but instead accesses the required sequences through pseudo-projection databases, achieving better runtime and reducing memory requirements. Our algorithm traverses the search space in a depth-first fashion and only preserves in memory a pattern node linkage and the pseudo-projections required for the branch being explored at the time. Experimental results show that our new approach, the Node Linkage Depth-First Traversal algorithm (NLDFT), has better performance and scalability in comparison with state of the art algorithms. PMID:24933123
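
    The pattern-growth scheme with pseudo-projections mentioned in the abstract above can be illustrated with a minimal sketch. This is not the NLDFT algorithm itself (the abstract does not give its details); it is a generic PrefixSpan-style depth-first miner, restricted to sequences of single items. The function name `mine_sequential_patterns` and the toy data are assumptions made for illustration.

```python
from collections import defaultdict


def mine_sequential_patterns(sequences, min_support):
    """Return {pattern_tuple: support} for all frequent sequential patterns.

    Illustrative PrefixSpan-style sketch (not NLDFT): each sequence is a
    list of single items, and support counts the sequences that contain
    the pattern as a subsequence.
    """
    results = {}

    def grow(prefix, projections):
        # `projections` holds (sequence_index, start_offset) pairs. Each pair
        # is a pseudo-projection: a pointer into the original data rather
        # than a copied suffix, which keeps memory use low.
        first_hit = defaultdict(dict)  # item -> {sequence_index: next offset}
        for seq_idx, start in projections:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if seq_idx not in first_hit[item]:
                    first_hit[item][seq_idx] = pos + 1
        # Depth-first traversal: extend the prefix by one frequent item,
        # then recurse before trying the next item.
        for item in sorted(first_hit):
            hits = first_hit[item]
            if len(hits) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(hits)
                grow(pattern, list(hits.items()))

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy dataset, for illustration only.
toy_sequences = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
toy_patterns = mine_sequential_patterns(toy_sequences, min_support=2)
```

    Only the projections for the branch currently being explored are alive at any time, which is the property the abstract highlights; a production miner would also need to handle itemset elements and very deep recursion.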

  10. Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling.

    ERIC Educational Resources Information Center

    Kostoff, Ronald N.; del Rio, J. Antonio; Humenik, James A.; Garcia, Esther Ofilia; Ramirez, Ana Maria

    2001-01-01

    Discusses the importance of identifying the users and impact of research, and describes an approach for identifying the pathways through which research can impact other research, technology development, and applications. Describes a study that used citation mining, an integration of citation bibliometrics and text mining, on articles from the…

  11. A probabilistic approach for mine burial prediction

    NASA Astrophysics Data System (ADS)

    Barbu, Costin; Valent, Philip; Richardson, Michael; Abelev, Andrei; Plant, Nathaniel

    2004-09-01

    Predicting the degree of burial of mines in soft sediments is one of the main concerns of Naval Mine CounterMeasures (MCM) operations. This is a difficult problem to solve due to uncertainties and variability of the sediment parameters (i.e., density and shear strength) and of the mine state at contact with the seafloor (i.e., vertical and horizontal velocity, angular rotation rate, and pitch angle at the mudline). A stochastic approach is proposed in this paper to better incorporate the dynamic nature of free-falling cylindrical mines in the modeling of impact burial. The orientation, trajectory and velocity of cylindrical mines, after about 4 meters free-fall in the water column, are very strongly influenced by boundary layer effects causing quite chaotic behavior. The model's convolution of the uncertainty through its nonlinearity is addressed by employing Monte Carlo simulations. Finally a risk analysis based on the probability of encountering an undetectable mine is performed.

  12. A data mining approach to intelligence operations

    NASA Astrophysics Data System (ADS)

    Memon, Nasrullah; Hicks, David L.; Harkiolakis, Nicholas

    2008-03-01

    In this paper we examine the latest thinking, approaches and methodologies in use for finding the nuggets of information and subliminal (and perhaps intentionally hidden) patterns and associations that are critical to identify criminal activity and suspects to private and government security agencies. An emphasis in the paper is placed on Social Network Analysis and Investigative Data Mining, and the use of these technologies in the counterterrorism domain. Tools and techniques from both areas are described, along with the important tasks for which they can be used to assist with the investigation and analysis of terrorist organizations. The process of collecting data about these organizations is also considered along with the inherent difficulties that are involved.

  13. Application of data mining approaches to drug delivery.

    PubMed

    Ekins, Sean; Shimada, Jun; Chang, Cheng

    2006-11-30

    Computational approaches play a key role in all areas of the pharmaceutical industry from data mining, experimental and clinical data capture to pharmacoeconomics and adverse events monitoring. They will likely continue to be indispensable assets along with a growing library of software applications. This is primarily due to the increasingly massive amount of biology, chemistry and clinical data, which is now entering the public domain mainly as a result of NIH and commercially funded projects. We are therefore in need of new methods for mining this mountain of data in order to enable new hypothesis generation. The computational approaches include, but are not limited to, database compilation, quantitative structure activity relationships (QSAR), pharmacophores, network visualization models, decision trees, machine learning algorithms and multidimensional data visualization software that could be used to improve drug delivery after mining public and/or proprietary data. We will discuss some areas of unmet needs in the area of data mining for drug delivery that can be addressed with new software tools or databases of relevance to future pharmaceutical projects.

  14. Selecting Proper Plant Species for Mine Reclamation Using Fuzzy AHP Approach (Case Study: Chadormaloo Iron Mine of Iran)

    NASA Astrophysics Data System (ADS)

    Ebrahimabadi, Arash

    2016-12-01

    This paper describes an effective approach to select suitable plant species for reclamation of mined lands in Chadormaloo iron mine, which is located in the central part of Iran, near the city of Bafgh in Yazd province. After the mine's total reserves are excavated, the mine must be permanently closed and reclaimed. Mine reclamation and post-mining land-use are the main issues in the phase of mine closure. In general, among various scenarios for the mine reclamation process, i.e. planting, agriculture, forestry, residency, tourist attraction, etc., planting is the oldest and most commonly used technology for the reclamation of lands damaged by mining activities. Planting and vegetation play a major role in restoring productivity, ecosystem stability and biological diversity to degraded areas; therefore the main goal of this research work is to choose proper and suitable plants compatible with the conditions of the Chadormaloo mined area, providing consistent conditions for future use. To ensure the sustainability of the reclaimed landscape, the most suitable plant species adapted to the mine conditions are selected. Plant species selection is a Multi Criteria Decision Making (MCDM) problem. In this paper, a fuzzy MCDM technique, namely Fuzzy Analytic Hierarchy Process (FAHP), is developed to assist Chadormaloo iron mine managers and designers in the process of plant type selection for reclamation of the mine under a fuzzy environment where the vagueness and uncertainty are taken into account with linguistic variables parameterized by triangular fuzzy numbers. The results achieved from using the FAHP approach demonstrate that the most proper plant species are ranked as Artemisia sieberi, Salsola yazdiana, Halophytes types, and Zygophyllum, respectively, for reclamation of Chadormaloo iron mine.

  15. Using Helicopter Electromagnetic Surveys to Identify Potential Hazards at Mine Waste Impoundments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hammack, R.W.

    2008-01-01

    In July 2003, helicopter electromagnetic surveys were conducted at 14 coal waste impoundments in southern West Virginia. The purpose of the surveys was to detect conditions that could lead to impoundment failure either by structural failure of the embankment or by the flooding of adjacent or underlying mine works. Specifically, the surveys attempted to: 1) identify saturated zones within the mine waste, 2) delineate filtrate flow paths through the embankment or into adjacent strata and receiving streams, and 3) identify flooded mine workings underlying or adjacent to the waste impoundment. Data from the helicopter surveys were processed to generate conductivity/depth images. Conductivity/depth images were then spatially linked to georeferenced air photos or topographic maps for interpretation. Conductivity/depth images were found to provide a snapshot of the hydrologic conditions that exist within the impoundment. This information can be used to predict potential areas of failure within the embankment because of its ability to image the phreatic zone. Also, the electromagnetic survey can identify areas of unconsolidated slurry in the decant basin and beneath the embankment. Although shallow, flooded mineworks beneath the impoundment were identified by this survey, it cannot be assumed that electromagnetic surveys can detect all underlying mines. A preliminary evaluation of the data implies that helicopter electromagnetic surveys can provide a better understanding of the phreatic zone than the piezometer arrays that are typically used.

  16. Screening and prioritisation of chemical risks from metal mining operations, identifying exposure media of concern.

    PubMed

    Pan, Jilang; Oates, Christopher J; Ihlenfeld, Christian; Plant, Jane A; Voulvoulis, Nikolaos

    2010-04-01

    Metals have been central to the development of human civilisation from the Bronze Age to modern times, although in the past, metal mining and smelting have been the cause of serious environmental pollution with the potential to harm human health. Despite problems from artisanal mining in some developing countries, modern mining to Western standards now uses the best available mining technology combined with environmental monitoring, mitigation and remediation measures to limit emissions to the environment. This paper develops risk screening and prioritisation methods previously used for contaminated land on military and civilian sites and engineering systems for the analysis and prioritisation of chemical risks from modern metal mining operations. It uses hierarchical holographic modelling and multi-criteria decision making to analyse and prioritise the risks from potentially hazardous inorganic chemical substances released by mining operations. A case study of an active platinum group metals mine in South Africa is used to demonstrate the potential of the method. This risk-based methodology for identifying, filtering and ranking mining-related environmental and human health risks can be used to identify exposure media of greatest concern to inform risk management. It also provides a practical decision-making tool for mine acquisition and helps to communicate risk to all members of mining operation teams.

  17. The systematic assessment of traditional evidence from the premodern Chinese medical literature: a text-mining approach.

    PubMed

    May, Brian H; Zhang, Anthony; Lu, Yubo; Lu, Chuanjian; Xue, Charlie C L

    2014-12-01

    This project aimed to develop an approach to evaluating information contained in the premodern Traditional Chinese Medicine (TCM) literature that was (1) comprehensive, systematic, and replicable and (2) able to produce quantifiable output that could be used to answer specific research questions in order to identify natural products for clinical and experimental research. The project involved two stages. In stage 1, 14 TCM collections and compendia were evaluated for suitability as sources for searching; 8 of these were compared in detail. The results were published in the Journal of Alternative and Complementary Medicine. Stage 2 developed a text-mining approach for two of these sources. The text-mining approach was developed for Zhong Hua Yi Dian (Encyclopaedia of Traditional Chinese Medicine, 4th edition) and Zhong Yi Fang Ji Da Ci Dian (Great Compendium of Chinese Medical Formulae). This approach developed procedures for search term selection; methods for screening, classifying, and scoring data; procedures for systematic searching and data extraction; data checking procedures; and approaches for analyzing results. Examples are provided for studies of memory impairment and diabetic nephropathy, and issues relating to data interpretation are discussed. This approach to the analysis of large collections of the premodern TCM literature uses widely available sources and provides a text-mining approach that is systematic, replicable, and adaptable to the requirements of the particular project. Researchers can use these methods to explore changes in the names and conceptions of a disease over time, to identify which therapeutic methods have been more or less frequently used in different eras for particular disorders, and to assist in the selection of natural products for research efforts.

  18. Detection of antipersonnel (AP) mines using mechatronics approach

    NASA Astrophysics Data System (ADS)

    Shahri, Ali M.; Naghdy, Fazel

    1998-09-01

    At present there are approximately 110 million land-mines scattered around the world in 64 countries. The clearance of these mines takes place manually. Unfortunately, on average for every 5000 mines cleared one mine clearer is killed. A Mine Detector Arm (MDA) using mechatronics approach is under development in this work. The robot arm imitates manual hand-prodding technique for mine detection. It inserts a bayonet into the soil and models the dynamics of the manipulator and environment parameters, such as stiffness variation in the soil to control the impact caused by contacting a stiff object. An explicit impact control scheme is applied as the main control scheme, while two different intelligent control methods are designed to deal with uncertainties and varying environmental parameters. Firstly, a neuro-fuzzy adaptive gain controller (NFAGC) is designed to adapt the force gain control according to the estimated environment stiffness. Then, an adaptive neuro-fuzzy plus PID controller is employed to switch from a conventional PID controller to neuro-fuzzy impact control (NFIC), when an impact is detected. The developed control schemes are validated through computer simulation and experimental work.

  19. A New Approach in Coal Mine Exploration Using Cosmic Ray Muons

    NASA Astrophysics Data System (ADS)

    Darijani, Reza; Negarestani, Ali; Rezaie, Mohammad Reza; Fatemi, Syed Jalil; Akhond, Ahmad

    2016-08-01

    Muon radiography is a technique that uses cosmic ray muons to image the interior of large scale geological structures. The muon absorption in matter is the most important parameter in cosmic ray muon radiography. Cosmic ray muon radiography is similar to X-ray radiography. The main aim in this survey is the simulation of the muon radiography for exploration of mines. So, the production source, tracking, and detection of cosmic ray muons were simulated by MCNPX code. For this purpose, the input data of the source card in MCNPX code were extracted from the muon energy spectrum at sea level. In addition, the other input data such as average density and thickness of layers that were used in this code are the measured data from Pabdana (Kerman, Iran) coal mines. The average thickness and density of these layers in the coal mines are from 2 to 4 m and 1.3 g/cm³, respectively. To increase the spatial resolution, a detector was placed inside the mountain. The results indicated that using this approach, the layers with minimum thickness about 2.5 m can be identified.

  20. An Approach to Realizing Process Control for Underground Mining Operations of Mobile Machines

    PubMed Central

    Song, Zhen; Schunnesson, Håkan; Rinne, Mikael; Sturgul, John

    2015-01-01

    The excavation and production in underground mines are complicated processes which consist of many different operations. The process of underground mining is considerably constrained by the geometry and geology of the mine. The various mining operations are normally performed in series at each working face. The delay of a single operation will lead to a domino effect, thus delaying the starting time for the next process and the completion time of the entire process. This paper presents a new approach to the process control for underground mining operations, e.g. drilling, bolting, mucking. This approach can estimate the working time and its probability for each operation more efficiently and objectively by improving the existing PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method). If the delay of the critical operation (which is on a critical path) inevitably affects the productivity of mined ore, the approach can rapidly assign mucking machines new jobs to increase this amount at a maximum level by using a new mucking algorithm under external constraints. PMID:26062092
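
    For orientation, the classical PERT three-point estimate that the abstract above builds on can be sketched as follows. This shows only textbook PERT for operations performed in series, not the authors' improved method, whose details the abstract does not give. The function names and the durations (in hours) are hypothetical, and the completion probability relies on the usual PERT assumptions of independent operations and an approximately normal total duration.

```python
import math
from statistics import NormalDist


def pert_estimate(optimistic, most_likely, pessimistic):
    """Classical PERT mean and standard deviation for one operation."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    std = (pessimistic - optimistic) / 6
    return mean, std


def series_completion_probability(operations, deadline):
    """Probability that operations run in series finish by `deadline`.

    Assumes independent operations and a normal approximation for the sum
    (the standard PERT simplification). At least one operation must have
    pessimistic > optimistic so the total standard deviation is non-zero.
    """
    estimates = [pert_estimate(*op) for op in operations]
    total_mean = sum(mean for mean, _ in estimates)
    total_std = math.sqrt(sum(std ** 2 for _, std in estimates))
    return NormalDist(total_mean, total_std).cdf(deadline)


# Hypothetical (optimistic, most likely, pessimistic) durations in hours
# for drilling, bolting, and mucking at one working face.
face_cycle = [(2, 3, 4), (1, 2, 3), (2, 4, 6)]
on_time = series_completion_probability(face_cycle, deadline=9)
```

    Because the operations are in series, a delay in any one of them shifts the whole cycle, which is why the summed mean and variance are the quantities of interest here.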

  1. An Approach to Realizing Process Control for Underground Mining Operations of Mobile Machines.

    PubMed

    Song, Zhen; Schunnesson, Håkan; Rinne, Mikael; Sturgul, John

    2015-01-01

    The excavation and production in underground mines are complicated processes which consist of many different operations. The process of underground mining is considerably constrained by the geometry and geology of the mine. The various mining operations are normally performed in series at each working face. The delay of a single operation will lead to a domino effect, thus delaying the starting time for the next process and the completion time of the entire process. This paper presents a new approach to the process control for underground mining operations, e.g. drilling, bolting, mucking. This approach can estimate the working time and its probability for each operation more efficiently and objectively by improving the existing PERT (Program Evaluation and Review Technique) and CPM (Critical Path Method). If the delay of the critical operation (which is on a critical path) inevitably affects the productivity of mined ore, the approach can rapidly assign mucking machines new jobs to increase this amount at a maximum level by using a new mucking algorithm under external constraints.

  2. A vector space model approach to identify genetically related diseases.

    PubMed

    Sarkar, Indra Neil

    2012-01-01

    The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Further exacerbating the challenges in their study is that many genes may be causally related to multiple diseases. This study explored the relationship between diseases through the adaptation of an approach pioneered in the context of information retrieval: vector space models. A vector space model approach was developed that bridges gene disease knowledge inferred across three knowledge bases: Online Mendelian Inheritance in Man, GenBank, and Medline. The approach was then used to identify potentially related diseases for two target diseases: Alzheimer disease and Prader-Willi Syndrome. In the case of both Alzheimer Disease and Prader-Willi Syndrome, a set of plausible diseases were identified that may warrant further exploration. This study furthers seminal work by Swanson, et al. that demonstrated the potential for mining literature for putative correlations. Using a vector space modeling approach, information from both biomedical literature and genomic resources (like GenBank) can be combined towards identification of putative correlations of interest. To this end, the relevance of the predicted diseases of interest in this study using the vector space modeling approach were validated based on supporting literature. The results of this study suggest that a vector space model approach may be a useful means to identify potential relationships between complex diseases, and thereby enable the coordination of gene-based findings across multiple complex diseases.
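
    As a rough illustration of the vector space idea in the abstract above, each disease can be represented as a vector of gene weights and compared with the others by cosine similarity. This is a minimal sketch and not the study's pipeline: the weighting scheme, the function names, and the placeholder disease and gene labels are all assumptions, and no real gene-disease associations are implied.

```python
import math


def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[key] * vec_b[key] for key in shared)
    norm_a = math.sqrt(sum(value * value for value in vec_a.values()))
    norm_b = math.sqrt(sum(value * value for value in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty vector is treated as unrelated to everything.
    return dot / (norm_a * norm_b)


def rank_related(target, disease_vectors):
    """Rank all other diseases by similarity to `target`, highest first."""
    target_vec = disease_vectors[target]
    scores = [
        (name, cosine_similarity(target_vec, vec))
        for name, vec in disease_vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder data: weights might be co-mention counts from a knowledge base.
disease_vectors = {
    "disease_A": {"gene_1": 2, "gene_2": 1},
    "disease_B": {"gene_1": 1, "gene_2": 1},
    "disease_C": {"gene_3": 4},
}
ranking = rank_related("disease_A", disease_vectors)
```

    Diseases that share heavily weighted genes score close to 1, while diseases with no genes in common score 0, so the ranking gives a simple list of candidates for manual review.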

  3. Using text-mining techniques in electronic patient records to identify ADRs from medicine use

    PubMed Central

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-01-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. PMID:22122057

  4. Mining Clinicians' Electronic Documentation to Identify Heart Failure Patients with Ineffective Self-Management: A Pilot Text-Mining Study.

    PubMed

    Topaz, Maxim; Radhakrishnan, Kavita; Lei, Victor; Zhou, Li

    2016-01-01

    Effective self-management can decrease up to 50% of heart failure hospitalizations. Unfortunately, self-management by patients with heart failure remains poor. This pilot study aimed to explore the use of text-mining to identify heart failure patients with ineffective self-management. We first built a comprehensive self-management vocabulary based on the literature and clinical notes review. We then randomly selected 545 heart failure patients treated within Partners Healthcare hospitals (Boston, MA, USA) and conducted a regular expression search with the compiled vocabulary within 43,107 interdisciplinary clinical notes of these patients. We found that 38.2% (n = 208) of patients had documentation of ineffective heart failure self-management in the domains of poor diet adherence (28.4%), missed medical encounters (26.4%), poor medication adherence (20.2%) and non-specified self-management issues (e.g., "compliance issues", 34.6%). We showed the feasibility of using text-mining to identify patients with ineffective self-management. More natural language processing algorithms are needed to help busy clinicians identify these patients.
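
    A regular expression search of this kind might look roughly like the sketch below. The four domains follow the abstract, but every pattern, the sample notes, and the function name are invented for illustration; the study's actual vocabulary is not reproduced in the abstract.

```python
# Sketch of a vocabulary-driven regular expression search over clinical notes.
# The patterns are hypothetical stand-ins for the study's compiled vocabulary.
import re
from collections import defaultdict

VOCABULARY = {
    "diet": [r"dietary indiscretion", r"non-?adheren\w* (?:to|with) (?:low[- ]sodium )?diet"],
    "missed_encounters": [r"missed (?:appointment|visit)s?", r"no[- ]show"],
    "medication": [r"(?:not taking|stopped taking|ran out of) (?:his |her |their )?medications?"],
    "non_specified": [r"compliance issues?", r"non-?complian\w*"],
}

# Compile once, case-insensitively, so each note is scanned with ready-made patterns.
COMPILED = {
    domain: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for domain, patterns in VOCABULARY.items()
}


def flag_patients(notes):
    """Map each patient ID to the set of self-management domains found in their notes.

    `notes` is an iterable of (patient_id, note_text) pairs. Patients with no
    matching note are left out of the result.
    """
    flagged = defaultdict(set)
    for patient_id, text in notes:
        for domain, patterns in COMPILED.items():
            if any(pattern.search(text) for pattern in patterns):
                flagged[patient_id].add(domain)
    return dict(flagged)


sample_notes = [
    ("p1", "Pt reports dietary indiscretion over the holidays."),
    ("p1", "Missed appointment on Tuesday."),
    ("p2", "Ongoing compliance issues noted by nursing."),
    ("p3", "Tolerating medications well, follows low-sodium diet."),
]

flagged = flag_patients(sample_notes)
print(flagged)
```

    Plain pattern matching like this does not handle negation ("denies missed visits") or context, which is one reason the abstract calls for more natural language processing.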

  5. Pilot study on the use of data mining to identify cochlear implant candidates.

    PubMed

    Grisel, Jedidiah J; Schafer, Erin; Lam, Anne; Griffin, Terry

    2018-05-01

    The goal of this pilot study was to determine the clinical utility of data-mining software that screens for cochlear implant (CI) candidacy. The Auditory Implant Initiative developed a software module that screens for CI candidates via integration with a software system (Noah 4) that serves as a depository for hearing test data. To identify candidates, patient audiograms from one practice were exported into the screening module. Candidates were tracked to determine if any eventually underwent implantation. After loading 4836 audiograms from the Noah 4 system, the screening module identified 558 potential CI candidates. After reviewing the data for the potential candidates, 117 were targeted and invited to an educational event. Following the event, a total of six candidates were evaluated, and two were implanted. This objective approach to identifying candidates has the potential to address the gross underutilization of CIs by removing any bias or lack of knowledge regarding the management of severe to profound sensorineural hearing loss with CIs. The screening module was an effective tool for identifying potential CI candidates at one ENT practice. On a larger scale, the screening module has the potential to impact thousands of CI candidates worldwide.

  6. Using text-mining techniques in electronic patient records to identify ADRs from medicine use.

    PubMed

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-05-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. © 2011 The Authors. British Journal of Clinical Pharmacology © 2011 The British Pharmacological Society.

  7. Identifying Understudied Nuclear Reactions by Text-mining the EXFOR Experimental Nuclear Reaction Library

    NASA Astrophysics Data System (ADS)

    Hirdt, J. A.; Brown, D. A.

    2016-01-01

    The EXFOR library contains the largest collection of experimental nuclear reaction data available as well as the data's bibliographic information and experimental details. We text-mined the REACTION and MONITOR fields of the ENTRYs in the EXFOR library in order to identify understudied reactions and quantities. Using the results of the text-mining, we created an undirected graph from the EXFOR datasets with each graph node representing a single reaction and quantity and graph links representing the various types of connections between these reactions and quantities. This graph is an abstract representation of the connections in EXFOR, similar to graphs of social networks, authorship networks, etc. We use various graph theoretical tools to identify important yet understudied reactions and quantities in EXFOR. Although we identified a few cross sections relevant for shielding applications and isotope production, mostly we identified charged particle fluence monitor cross sections. As a side effect of this work, we learn that our abstract graph is typical of other real-world graphs.

  8. An extended data mining method for identifying differentially expressed assay-specific signatures in functional genomic studies.

    PubMed

    Rollins, Derrick K; Teh, Ailing

    2010-12-17

    Microarray data sets provide relative expression levels for thousands of genes for a comparatively small number of different experimental conditions called assays. Data mining techniques are used to extract specific information about genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to the ranking of genes in microarray data sets that are expressed most differently between two biologically different groupings of assays. This method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP), which is the ability to correctly identify important genes. This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant and for one of the test statistics when the gene variance is non-constant. PM compares quite favorably to CM in terms of lower FDR and much higher SP. Thus, PM can be quite effective in producing accurate signatures from large microarray data sets for differential expression between assay groups identified in a preliminary step of the PCA procedure and is, therefore, recommended for use in these applications.

  9. Development and application of the Safe Performance Index as a risk-based methodology for identifying major hazard-related safety issues in underground coal mines

    NASA Astrophysics Data System (ADS)

    Kinilakodi, Harisha

    The underground coal mining industry has been under constant watch due to the high risk involved in its activities, and scrutiny increased because of the disasters that occurred in 2006-07. In the aftermath of the incidents, the U.S. Congress passed the Mine Improvement and New Emergency Response Act of 2006 (MINER Act), which strengthened the existing regulations and mandated new laws to address the various issues related to a safe working environment in the mines. Risk analysis in any form should be done on a regular basis to tackle the possibility of unwanted major hazard-related events such as explosions, outbursts, airbursts, inundations, spontaneous combustion, and roof fall instabilities. One of the responses by the Mine Safety and Health Administration (MSHA) in 2007 involved a new pattern of violations (POV) process to target mines with a poor safety performance, specifically to improve their safety. However, the 2010 disaster (worst in 40 years) gave an impression that the collective effort of the industry, federal/state agencies, and researchers to achieve the goal of zero fatalities and serious injuries has gone awry. The Safe Performance Index (SPI) methodology developed in this research is a straight-forward, effective, transparent, and reproducible approach that can help in identifying and addressing some of the existing issues while targeting (poor safety performance) mines which need help. It combines three injury and three citation measures that are scaled to have an equal mean (5.0) in a balanced way with proportionate weighting factors (0.05, 0.15, 0.30) and overall normalizing factor (15) into a mine safety performance evaluation tool. It can be used to assess the relative safety-related risk of mines, including by mine-size category. Using 2008 and 2009 data, comparisons were made of SPI-associated, normalized safety performance measures across mine-size categories, with emphasis on small-mine safety performance as compared to large- and

  10. Differentially Private Frequent Subgraph Mining

    PubMed Central

    Xu, Shengzhi; Xiong, Li; Cheng, Xiang; Xiao, Ke

    2016-01-01

    Mining frequent subgraphs from a collection of input graphs is an important topic in data mining research. However, if the input graphs contain sensitive information, releasing frequent subgraphs may pose considerable threats to individuals' privacy. In this paper, we study the problem of frequent subgraph mining (FGM) under the rigorous differential privacy model. We introduce a novel differentially private FGM algorithm, which is referred to as DFG. In this algorithm, we first privately identify frequent subgraphs from input graphs, and then compute the noisy support of each identified frequent subgraph. In particular, to privately identify frequent subgraphs, we present a frequent subgraph identification approach which can improve the utility of frequent subgraph identifications through candidate pruning. Moreover, to compute the noisy support of each identified frequent subgraph, we devise a lattice-based noisy support derivation approach, where a series of methods has been proposed to improve the accuracy of the noisy supports. Through formal privacy analysis, we prove that our DFG algorithm satisfies ε-differential privacy. Extensive experimental results on real datasets show that the DFG algorithm can privately find frequent subgraphs with high data utility. PMID:27616876
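
    To make the notion of a "noisy support" concrete, the sketch below applies the generic Laplace mechanism to a single support count. This is the textbook mechanism, not the lattice-based derivation that DFG uses; the function names and the example count are assumptions. It relies on the fact that adding or removing one input graph changes a subgraph's support by at most 1, so the sensitivity of one count is 1.

```python
# Sketch of the generic Laplace mechanism for one support count.
# This is NOT the DFG algorithm's lattice-based approach, only the basic idea.
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_support(true_support, epsilon, sensitivity=1.0, rng=None):
    """Return the support count perturbed for epsilon-differential privacy.

    The noise scale is sensitivity / epsilon, so a smaller epsilon (stronger
    privacy) produces a noisier count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_support + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical example: a subgraph that occurs in 100 input graphs.
rng = random.Random(0)
released = noisy_support(100, epsilon=0.5, rng=rng)
print(f"released support: {released:.2f}")
```

    When several supports are released, the privacy budget has to be divided among them, which is why techniques that improve the accuracy of noisy supports matter.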

  11. Identifying Understudied Nuclear Reactions by Text-mining the EXFOR Experimental Nuclear Reaction Library

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hirdt, J.A.; Brown, D.A., E-mail: dbrown@bnl.gov

    The EXFOR library contains the largest collection of experimental nuclear reaction data available as well as the data's bibliographic information and experimental details. We text-mined the REACTION and MONITOR fields of the ENTRYs in the EXFOR library in order to identify understudied reactions and quantities. Using the results of the text-mining, we created an undirected graph from the EXFOR datasets with each graph node representing a single reaction and quantity and graph links representing the various types of connections between these reactions and quantities. This graph is an abstract representation of the connections in EXFOR, similar to graphs of social networks, authorship networks, etc. We use various graph theoretical tools to identify important yet understudied reactions and quantities in EXFOR. Although we identified a few cross sections relevant for shielding applications and isotope production, mostly we identified charged particle fluence monitor cross sections. As a side effect of this work, we learn that our abstract graph is typical of other real-world graphs.

  12. National Conference on Mining-Influenced Waters: Approaches for Characterization, Source Control and Treatment

    EPA Science Inventory

    The conference goal was to provide a forum for the exchange of scientific information on current and emerging approaches to assessing characterization, monitoring, source control, treatment and/or remediation on mining-influenced waters. The conference was aimed at mining remedi...

  13. Data mining approach to model the diagnostic service management.

    PubMed

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management taking into account the characteristics of subjects using a data mining approach. This study could be further used to develop an automated CRM system contributing to the increase in the rate of receiving diagnostic services.

  14. Synoptic sampling and principal components analysis to identify sources of water and metals to an acid mine drainage stream.

    PubMed

    Byrne, Patrick; Runkel, Robert L; Walton-Day, Katherine

    2017-07-01

    Combining the synoptic mass balance approach with principal components analysis (PCA) can be an effective method for discretising the chemistry of inflows and source areas in watersheds where contamination is diffuse in nature and/or complicated by groundwater interactions. This paper presents a field-scale study in which synoptic sampling and PCA are employed in a mineralized watershed (Lion Creek, Colorado, USA) under low flow conditions to (i) quantify the impacts of mining activity on stream water quality; (ii) quantify the spatial pattern of constituent loading; and (iii) identify inflow sources most responsible for observed changes in stream chemistry and constituent loading. Several of the constituents investigated (Al, Cd, Cu, Fe, Mn, Zn) fail to meet chronic aquatic life standards along most of the study reach. The spatial pattern of constituent loading suggests four primary sources of contamination under low flow conditions. Three of these sources are associated with acidic (pH <3.1) seeps that enter along the left bank of Lion Creek. Investigation of inflow water (trace metal and major ion) chemistry using PCA suggests a hydraulic connection between many of the left bank inflows and mine water in the Minnesota Mine shaft located to the north-east of the river channel. In addition, water chemistry data during a rainfall-runoff event suggests the spatial pattern of constituent loading may be modified during rainfall due to dissolution of efflorescent salts or erosion of streamside tailings. These data point to the complexity of contaminant mobilisation processes and constituent loading in mining-affected watersheds but the combined synoptic sampling and PCA approach enables a conceptual model of contaminant dynamics to be developed to inform remediation.

  15. Synoptic sampling and principal components analysis to identify sources of water and metals to an acid mine drainage stream

    USGS Publications Warehouse

    Byrne, Patrick; Runkel, Robert L.; Walton-Day, Katie

    2017-01-01

    Combining the synoptic mass balance approach with principal components analysis (PCA) can be an effective method for discretising the chemistry of inflows and source areas in watersheds where contamination is diffuse in nature and/or complicated by groundwater interactions. This paper presents a field-scale study in which synoptic sampling and PCA are employed in a mineralized watershed (Lion Creek, Colorado, USA) under low flow conditions to (i) quantify the impacts of mining activity on stream water quality; (ii) quantify the spatial pattern of constituent loading; and (iii) identify inflow sources most responsible for observed changes in stream chemistry and constituent loading. Several of the constituents investigated (Al, Cd, Cu, Fe, Mn, Zn) fail to meet chronic aquatic life standards along most of the study reach. The spatial pattern of constituent loading suggests four primary sources of contamination under low flow conditions. Three of these sources are associated with acidic (pH <3.1) seeps that enter along the left bank of Lion Creek. Investigation of inflow water (trace metal and major ion) chemistry using PCA suggests a hydraulic connection between many of the left bank inflows and mine water in the Minnesota Mine shaft located to the north-east of the river channel. In addition, water chemistry data during a rainfall-runoff event suggests the spatial pattern of constituent loading may be modified during rainfall due to dissolution of efflorescent salts or erosion of streamside tailings. These data point to the complexity of contaminant mobilisation processes and constituent loading in mining-affected watersheds but the combined synoptic sampling and PCA approach enables a conceptual model of contaminant dynamics to be developed to inform remediation.

  16. An efficient, versatile and scalable pattern growth approach to mine frequent patterns in unaligned protein sequences.

    PubMed

    Ye, Kai; Kosters, Walter A; Ijzerman, Adriaan P

    2007-03-15

    Pattern discovery in protein sequences is often based on multiple sequence alignments (MSA). The procedure can be computationally intensive and often requires manual adjustment, which may be particularly difficult for a set of deviating sequences. In contrast, two algorithms, PRATT2 (http://www.ebi.ac.uk/pratt/) and TEIRESIAS (http://cbcsrv.watson.ibm.com/), are used to directly identify frequent patterns from unaligned biological sequences without an attempt to align them. Here we propose a new algorithm with more efficiency and more functionality than both PRATT2 and TEIRESIAS, and discuss some of its applications to G protein-coupled receptors, a protein family of important drug targets. In this study, we designed and implemented six algorithms to mine three different pattern types from either one or two datasets using a pattern growth approach. We compared our approach to PRATT2 and TEIRESIAS in efficiency, completeness and the diversity of pattern types. Compared to PRATT2, our approach is faster, capable of processing large datasets and able to identify the so-called type III patterns. Our approach is comparable to TEIRESIAS in the discovery of the so-called type I patterns but has additional functionality such as mining the so-called type II and type III patterns and finding discriminating patterns between two datasets. The source code for pattern growth algorithms and their pseudo-code are available at http://www.liacs.nl/home/kosters/pg/.

  17. Implementation of an original approach on the Mines-Douai Comparative Reactivity Method (MD-CRM) instrument to identify part of the missing OH reactivity at an urban site

    NASA Astrophysics Data System (ADS)

    Dusanter, S.; Michoud, V.; Leonardis, T.; Riffault, V.; Zhang, S.; Locoge, N.

    2015-12-01

    Due to the large number of Volatile Organic Compounds (VOCs) expected in the atmosphere (10^4 to 10^5) (Goldstein and Galbally, ES&T, 2007), exhaustive measurements of VOCs appear to be currently unfeasible using common analytical techniques. In this context, measurements of the total sink of OH, referred to as total OH reactivity, can provide a critical test to assess the completeness of trace gas measurements during field campaigns. This can be done by comparing the measured total OH reactivity to values calculated from trace gas measurements. Indeed, large discrepancies are usually found between measured and calculated OH reactivity values, revealing the presence of important unmeasured reactive species, which have yet to be identified. A Comparative Reactivity Method (CRM) instrument has been set up at Mines Douai to allow sequential measurements of VOCs and OH reactivity using the same Proton Transfer Reaction-Time of Flight Mass Spectrometer. This approach aims at identifying unmeasured reactive VOCs based on a method proposed by Kato et al. (Atmos. Environ., 2011), taking advantage of VOC oxidations occurring in the CRM sampling reactor. MD-CRM was deployed at an urban site in Dunkirk (France) during July 2014 to test this new approach. During this campaign, a large fraction of the OH reactivity was not explained by collocated measurements of trace gases (67% on average). In this presentation, we will first describe the approach that was implemented in the CRM instrument to identify part of the observed missing OH reactivity and we will then discuss the OH reactivity budget regarding the origin of air masses reaching the measurement site.
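
    The comparison between measured and calculated OH reactivity can be sketched as follows, assuming the usual definition of calculated reactivity as the sum over measured species of rate constant times concentration. The species labels, rate constants, concentrations, and the measured value are placeholders chosen for round numbers; none of them are data from the Dunkirk campaign.

```python
# Sketch of an OH reactivity budget. All numbers are illustrative placeholders.
# Units: rate constants in cm^3 molecule^-1 s^-1, concentrations in molecules cm^-3,
# reactivities in s^-1.


def calculated_oh_reactivity(species):
    """Sum k_i * [X_i] over all measured trace gases."""
    return sum(rate_constant * concentration for rate_constant, concentration in species.values())


def missing_reactivity(measured, calculated):
    """Return the absolute missing reactivity and its fraction of the measured total."""
    missing = measured - calculated
    return missing, missing / measured


# Hypothetical measured trace gases: (OH rate constant, concentration).
measured_species = {
    "species_A": (1.0e-11, 2.0e11),  # contributes 2.0 s^-1
    "species_B": (2.0e-13, 5.0e12),  # contributes 1.0 s^-1
}

k_calculated = calculated_oh_reactivity(measured_species)
k_measured = 9.0  # hypothetical total OH reactivity from the CRM instrument
missing, fraction = missing_reactivity(k_measured, k_calculated)
print(f"calculated: {k_calculated:.1f} s^-1, missing: {missing:.1f} s^-1 ({fraction:.0%})")
```

    A large missing fraction in such a budget points to reactive species that the trace gas measurements did not capture.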

  18. Approaches to Post-Mining Land Reclamation in Polish Open-Cast Lignite Mining

    NASA Astrophysics Data System (ADS)

    Kasztelewicz, Zbigniew

    2014-06-01

    The paper presents the situation regarding the reclamation of post-mining land in the case of particular lignite mines in Poland until 2012 against the background of the whole opencast mining sector. It discusses the process of land purchase for mining operations and its sale after reclamation. It presents the achievements of mines in the reclamation and regeneration of post-mining land, which, after development processes carried out according to European standards, now serves the inhabitants as a recreational area that increases the attractiveness of the regions.

  19. A proactive approach to sustainable management of mine tailings

    NASA Astrophysics Data System (ADS)

    Edraki, Mansour; Baumgartl, Thomas

    2015-04-01

    The reactive strategies to manage mine tailings, i.e. containment of slurries of tailings in tailings storage facilities (TSFs) and remediation of tailings solids or tailings seepage water after the decommissioning of those facilities, can be technically inefficient at eliminating environmental risks (e.g. preventing dispersion of contaminants and catastrophic dam wall failures), pose a long-term economic burden for companies, governments and society after mine closure, and often fail to meet community expectations. Most preventive environmental management practices promote proactive integrated approaches to waste management whereby the sources of environmental issues are identified to help make more informed decisions. They often use life cycle assessment to find the "hot spots" of environmental burdens. This kind of approach is often based on generic data and has rarely been used for tailings. Besides, life cycle assessments are less useful for designing operations or simulating changes in the process and consequent environmental outcomes. It is evident that an integrated approach for tailings research linked to better processing options is needed. A literature review revealed that there are only a few examples of integrated approaches. The aim of this project is to develop new tailings management models by streamlining orebody characterization, process optimization and rehabilitation. The approach is based on continuous fingerprinting of geochemical processes from orebody to tailings storage facility, and on benchmarking the success of such proactive initiatives by evidence of no impacts and no future projected impacts on receiving environments. We present an approach for developing such a framework and preliminary results from a case study where combined grinding and flotation models developed using geometallurgical data from the orebody were constructed to predict the properties of tailings produced under various processing scenarios. The modelling scenarios based on the

  20. Building a glaucoma interaction network using a text mining approach.

    PubMed

    Soliman, Maha; Nasraoui, Olfa; Cooper, Nigel G F

    2016-01-01

    The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease. A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx. This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of

  1. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motif mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a user-defined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.

  2. Phylogeny-guided (meta)genome mining approach for the targeted discovery of new microbial natural products.

    PubMed

    Kang, Hahk-Soo

    2017-02-01

    Genomics-based methods are now commonplace in natural products research. A phylogeny-guided mining approach provides a means to quickly screen a large number of microbial genomes or metagenomes in search of new biosynthetic gene clusters of interest. In this approach, biosynthetic genes serve as molecular markers, and phylogenetic trees built with known and unknown marker gene sequences are used to quickly prioritize biosynthetic gene clusters for their metabolites characterization. An increase in the use of this approach has been observed for the last couple of years along with the emergence of low cost sequencing technologies. The aim of this review is to discuss the basic concept of a phylogeny-guided mining approach, and also to provide examples in which this approach was successfully applied to discover new natural products from microbial genomes and metagenomes. I believe that the phylogeny-guided mining approach will continue to play an important role in genomics-based natural products research.

  3. A novel approach to generating CER hypotheses based on mining clinical data.

    PubMed

    Zhang, Shuo; Li, Lin; Yu, Yiqin; Sun, Xingzhi; Xu, Linhao; Zhao, Wei; Teng, Xiaofei; Pan, Yue

    2013-01-01

    Comparative effectiveness research (CER) is a scientific method of investigating the effectiveness of alternative intervention methods. In a CER study, clinical researchers typically start with a CER hypothesis, and aim to evaluate it by applying a series of medical statistical methods. Traditionally, the CER hypotheses are defined manually by clinical researchers. This makes the task of hypothesis generation very time-consuming and the quality of the hypotheses heavily dependent on the researchers' skills. Recently, with more electronic medical data being collected, it has become highly promising to apply computerized methods for discovering CER hypotheses from clinical data sets. In this poster, we propose a novel approach to automatically generating CER hypotheses based on mining clinical data, and present a case study showing that the approach can help clinical researchers identify potentially valuable hypotheses and eventually define high quality CER studies.

  4. A data mining based approach to predict spatiotemporal changes in satellite images

    NASA Astrophysics Data System (ADS)

    Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben

    2011-06-01

    The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on collected data a challenging task. Recently, data mining appears to be a promising research field leading to several interesting discoveries in various areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show good performances of the proposed framework in predicting change for the urban zone.

  5. Geologic considerations in underground coal mining system design

    NASA Technical Reports Server (NTRS)

    Camilli, F. A.; Maynard, D. P.; Mangolds, A.; Harris, J.

    1981-01-01

    Geologic characteristics of coal resources which may impact new extraction technologies are identified and described to aid system designers and planners in their task of designing advanced coal extraction systems for the central Appalachian region. These geologic conditions are then organized into a matrix identified as the baseline mine concept. A sample region, eastern Kentucky, is analyzed using both the developed baseline mine concept and the traditional geologic investigative approach.

  6. Efflorescent sulfates from Baia Sprie mining area (Romania)--Acid mine drainage and climatological approach.

    PubMed

    Buzatu, Andrei; Dill, Harald G; Buzgar, Nicolae; Damian, Gheorghe; Maftei, Andreea Elena; Apopei, Andrei Ionuț

    2016-01-15

    The Baia Sprie epithermal system, a well-known deposit for its impressive mineralogical associations, shows the proper conditions for acid mine drainage and can be considered a general example for affected mining areas around the globe. Efflorescent samples from the abandoned open pit Minei Hill have been analyzed by X-ray diffraction (XRD), scanning electron microscopy (SEM), Raman and near-infrared (NIR) spectrometry. The identified phases represent mostly iron sulfates with different hydration degrees (szomolnokite, rozenite, melanterite, coquimbite, ferricopiapite), Zn and Al sulfates (gunningite, alunogen, halotrichite). The samples were heated at different temperatures in order to establish the phase transformations among the studied sulfates. The dehydration temperatures and intermediate phases upon decomposition were successfully identified for each of the mineral phases. Gunningite was the single sulfate that showed no transformations during the heating experiment. All the other sulfates started to dehydrate within the 30-90 °C temperature range. Acid mine drainage is the main cause of sulfate formation, triggered by pyrite oxidation as the major source for the abundant iron sulfates. Based on the dehydration temperatures, the climatological interpretation indicated that melanterite formation and long-term presence is related to continental and temperate climates. Coquimbite and rozenite are attributed also to the dry arid/semi-arid areas, in addition to the above mentioned ones. The more stable sulfates, alunogen, halotrichite, szomolnokite, ferricopiapite and gunningite, can form and persist in all climate regimes, from dry continental to even tropical humid.

  7. Identifying Liver Cancer and Its Relations with Diseases, Drugs, and Genes: A Literature-Based Approach

    PubMed Central

    Song, Min

    2016-01-01

    In biomedicine, scientific literature is a valuable source for knowledge discovery. Mining knowledge from textual data has become an ever important task as the volume of scientific literature is growing unprecedentedly. In this paper, we propose a framework for examining a certain disease based on existing information provided by scientific literature. Disease-related entities that include diseases, drugs, and genes are systematically extracted and analyzed using a three-level network-based approach. A paper-entity network and an entity co-occurrence network (macro-level) are explored and used to construct six entity specific networks (meso-level). Important diseases, drugs, and genes as well as salient entity relations (micro-level) are identified from these networks. Results obtained from the literature-based literature mining can serve to assist clinical applications. PMID:27195695

  8. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana.

    PubMed

    Basu, Niladri; Renne, Elisha P; Long, Rachel N

    2015-09-17

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally.

  9. An Integrated Assessment Approach to Address Artisanal and Small-Scale Gold Mining in Ghana

    PubMed Central

    Basu, Niladri; Renne, Elisha P.; Long, Rachel N.

    2015-01-01

    Artisanal and small-scale gold mining (ASGM) is growing in many regions of the world including Ghana. The problems in these communities are complex and multi-faceted. To help increase understanding of such problems, and to enable consensus-building and effective translation of scientific findings to stakeholders, help inform policies, and ultimately improve decision making, we utilized an Integrated Assessment approach to study artisanal and small-scale gold mining activities in Ghana. Though Integrated Assessments have been used in the fields of environmental science and sustainable development, their use in addressing specific matter in public health, and in particular, environmental and occupational health is quite limited despite their many benefits. The aim of the current paper was to describe specific activities undertaken and how they were organized, and the outputs and outcomes of our activity. In brief, three disciplinary workgroups (Natural Sciences, Human Health, Social Sciences and Economics) were formed, with 26 researchers from a range of Ghanaian institutions plus international experts. The workgroups conducted activities in order to address the following question: What are the causes, consequences and correctives of small-scale gold mining in Ghana? More specifically: What alternatives are available in resource-limited settings in Ghana that allow for gold-mining to occur in a manner that maintains ecological health and human health without hindering near- and long-term economic prosperity? Several response options were identified and evaluated, and are currently being disseminated to various stakeholders within Ghana and internationally. PMID:26393627

  10. Mining integrated semantic networks for drug repositioning opportunities

    PubMed Central

    Mullen, Joseph; Tipney, Hannah

    2016-01-01

    Current research and development approaches to drug discovery have become less fruitful and more costly. One alternative paradigm is that of drug repositioning. Many marketed examples of repositioned drugs have been identified through serendipitous or rational observations, highlighting the need for more systematic methodologies to tackle the problem. Systems level approaches have the potential to enable the development of novel methods to understand the action of therapeutic compounds, but require an integrative approach to biological data. Integrated networks can facilitate systems level analyses by combining multiple sources of evidence to provide a rich description of drugs, their targets and their interactions. Classically, such networks can be mined manually where a skilled person is able to identify portions of the graph (semantic subgraphs) that are indicative of relationships between drugs and highlight possible repositioning opportunities. However, this approach is not scalable. Automated approaches are required to systematically mine integrated networks for these subgraphs and bring them to the attention of the user. We introduce a formal framework for the definition of integrated networks and their associated semantic subgraphs for drug interaction analysis and describe DReSMin, an algorithm for mining semantically-rich networks for occurrences of a given semantic subgraph. This algorithm allows instances of complex semantic subgraphs that contain data about putative drug repositioning opportunities to be identified in a computationally tractable fashion, scaling close to linearly with network data. We demonstrate the utility of our approach by mining an integrated drug interaction network built from 11 sources. This work identified and ranked 9,643,061 putative drug-target interactions, showing a strong correlation between highly scored associations and those supported by literature. We discuss the 20 top ranked associations in more detail, of which

  11. An Efficient Pattern Mining Approach for Event Detection in Multivariate Temporal Data

    PubMed Central

    Batal, Iyad; Cooper, Gregory; Fradkin, Dmitriy; Harrison, James; Moerchen, Fabian; Hauskrecht, Milos

    2015-01-01

    This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present Recent Temporal Pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the Minimal Predictive Recent Temporal Patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems. PMID:26752800
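    The first step named in the abstract above, converting a time series into time-interval sequences of temporal abstractions, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `abstract_states`, the three state labels, and the cut-off values are hypothetical choices made only for illustration.

```python
from itertools import groupby


def abstract_states(samples, low, high):
    """Turn (time, value) samples into (state, start, end) intervals.

    `low` and `high` are hypothetical cut-offs: values below `low` map to
    "L", values above `high` map to "H", everything else maps to "N".
    Consecutive samples sharing a state are merged into one interval.
    """
    def state(value):
        if value < low:
            return "L"
        if value > high:
            return "H"
        return "N"

    labelled = [(t, state(v)) for t, v in samples]
    intervals = []
    for label, run in groupby(labelled, key=lambda pair: pair[1]):
        times = [t for t, _ in run]
        intervals.append((label, times[0], times[-1]))
    return intervals


# Made-up lab series: (day, measured value).
series = [(0, 250), (1, 240), (2, 140), (3, 120), (4, 260)]
intervals = abstract_states(series, low=150, high=400)
print(intervals)
```

    More complex time-interval patterns, as the abstract describes, would then be built by relating such intervals with temporal operators; that step is not sketched here.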

  12. The application of Airborne Laser Scanning for identifying old lignite workings - case study: the mine "Borussia" near Ośno Lubuskie (Western Poland)

    NASA Astrophysics Data System (ADS)

    Gontaszewska-Piekarz, Agnieszka; Mrówczyńska, Maria

    2018-04-01

    The paper presents the possibilities of using data obtained by airborne laser scanning for identifying areas where lignite used to be mined. Airborne laser scanning, as presented in the paper, and its results have vast potential for identifying local terrain deformations. The paper also presents the history of lignite mining in the region of Ośno Lubuskie (the north-west of Ziemia Lubuska - western Poland). It describes underground mining in complicated geological conditions (glaciotectonic deformations). The paper is supplemented with historical maps showing the locations of the mines

  13. Mines and human casualties: a robotics approach toward mine clearing

    NASA Astrophysics Data System (ADS)

    Ghaffari, Masoud; Manthena, Dinesh; Ghaffari, Alireza; Hall, Ernest L.

    2004-10-01

    An estimated 100 million landmines, which have been planted in more than 60 countries, kill or maim thousands of civilians every year. Millions of people live in the vast dangerous areas and are not able to access basic human services because of the threat of landmines. This problem has affected many third world countries and poor nations which are not able to afford high cost solutions. This paper presents some experiences with land mine victims and solutions for mine clearing. It studies the current situation of this crisis as well as state-of-the-art robotics technology for mine clearing. It also introduces a survey robot which is suitable for mine clearing applications. The results show that in addition to technical aspects, this problem has many socio-economic issues. The significance of this study is to draw robotics researchers toward this topic and to pursue the technical and humanitarian facets of this issue.

  14. Evolutionary Data Mining Approach to Creating Digital Logic

    DTIC Science & Technology

    2010-01-01

    To deal with this problem a genetic program (GP) based data mining (DM) procedure has been invented (Smith 2005). A genetic program is an algorithm...that can operate on the variables. When a GP was used as a DM function in the past to automatically create fuzzy decision trees, the Report...rules represents an approach to the determining the effect of linguistic imprecision, i.e., the inability of experts to provide crisp rules. The

  15. Environmental considerations related to mining of nonfuel minerals

    USGS Publications Warehouse

    Seal, Robert R.; Piatak, Nadine M.; Kimball, Bryn E.; Hammarstrom, Jane M.; Schulz, Klaus J.; DeYoung, John H.; Seal, Robert R.; Bradley, Dwight C.

    2017-12-19

    Throughout most of human history, environmental stewardship during mining has not been a priority partly because of the lack of applicable laws and regulations and partly because of ignorance about the effects that mining can have on the environment. In the United States, the National Environmental Policy Act of 1969, in conjunction with related laws, codified a more modern approach to mining, including the responsibility for environmental stewardship, and provided a framework for incorporating environmental protection into mine planning. Today, similar frameworks are in place in the other developed countries of the world, and international mining companies generally follow similar procedures wherever they work in the world. The regulatory guidance has fostered an international effort among all stakeholders to identify best practices for environmental stewardship. The modern approach to mining using best practices involves the following: (a) establishment of a pre-mining baseline from which to monitor environmental effects during mining and help establish geologically reasonable closure goals; (b) identification of environmental risks related to mining through standardized approaches; and (c) formulation of an environmental closure plan before the start of mining. A key aspect of identifying the environmental risks and mitigating those risks is understanding how the risks vary from one deposit type to another, a concept that forms the basis for geoenvironmental mineral-deposit models. Accompanying the quest for best practices is the goal of making mining sustainable into the future. Sustainable mine development is generally considered to be development that meets the needs of the present generation without compromising the ability of future generations to meet their own needs. The concept extends beyond the availability of nonrenewable mineral commodities and includes the environmental and social effects of mine development. Global population growth, meanwhile, has

  16. APPLYING DATA MINING APPROACHES TO FURTHER ...

    EPA Pesticide Factsheets

    This dataset will be used to illustrate various data mining techniques to biologically profile the chemical space.

  17. Integrated approach of environmental impact and risk assessment of Rosia Montana Mining Area, Romania.

    PubMed

    Stefănescu, Lucrina; Robu, Brînduşa Mihaela; Ozunu, Alexandru

    2013-11-01

    The environmental impact assessment of mining sites is currently a topic of great interest in Romania. Historical pollution in the Rosia Montana mining area of Romania caused extensive damage to environmental media. This paper has two goals: to investigate the environmental pollution induced by mining activities in the Rosia Montana area and to quantify the environmental impacts and associated risks by means of an integrated approach. Thus, a new method was developed and applied for quantifying the impact of mining activities, taking account of the quality of environmental media in the mining area, and used as a case study in the present paper. The associated risks are a function of the environmental impacts and the probability of their occurrence. The results show that the environmental impacts and quantified risks, based on quality indicators to characterize the environmental quality, are of a higher order, and thus measures for pollution remediation and control need to be considered in the investigated area. The conclusion drawn is that an integrated approach for the assessment of environmental impact and associated risks is a valuable and more objective method, and is an important tool that can be applied in the decision-making process for national authorities in the prioritization of emergency action.

  18. A Hybrid Data Mining Approach for Credit Card Usage Behavior Analysis

    NASA Astrophysics Data System (ADS)

    Tsai, Chieh-Yuan

    Credit cards are one of the most popular e-payment methods in current online e-commerce. To retain valuable customers, card issuers invest a lot of money to maintain good relationships with their customers. Although several studies have examined card usage motivation, few focus on analyzing how credit card usage behavior changes between time periods t and t+1. To address this issue, an integrated data mining approach is proposed in this paper. First, the customer profiles and their transaction data at time period t are retrieved from databases. Second, a LabelSOM neural network groups customers into segments and identifies critical characteristics for each group. Third, a fuzzy decision tree algorithm is used to construct usage behavior rules for customer groups of interest. Finally, these rules are used to analyze the behavior changes between time periods t and t+1. An implementation case using a practical credit card database provided by a commercial bank in Taiwan is presented to show the benefits of the proposed framework.

  19. A Hybrid Approach for Efficient Modeling of Medium-Frequency Propagation in Coal Mines

    PubMed Central

    Brocker, Donovan E.; Sieber, Peter E.; Waynert, Joseph A.; Li, Jingcheng; Werner, Pingjuan L.; Werner, Douglas H.

    2015-01-01

    An efficient procedure for modeling medium frequency (MF) communications in coal mines is introduced. In particular, a hybrid approach is formulated and demonstrated utilizing ideal transmission line equations to model MF propagation in combination with full-wave sections used for accurate simulation of local antenna-line coupling and other near-field effects. This work confirms that the hybrid method accurately models signal propagation from a source to a load for various system geometries and material compositions, while significantly reducing computation time. With such dramatic improvement to solution times, it becomes feasible to perform large-scale optimizations with the primary motivation of improving communications in coal mines both for daily operations and emergency response. Furthermore, it is demonstrated that the hybrid approach is suitable for modeling and optimizing large communication networks in coal mines that may otherwise be intractable to simulate using traditional full-wave techniques such as moment methods or finite-element analysis. PMID:26478686

  20. Determining the familial risk distribution of colorectal cancer: a data mining approach.

    PubMed

    Chau, Rowena; Jenkins, Mark A; Buchanan, Daniel D; Ait Ouakrim, Driss; Giles, Graham G; Casey, Graham; Gallinger, Steven; Haile, Robert W; Le Marchand, Loic; Newcomb, Polly A; Lindor, Noralane M; Hopper, John L; Win, Aung Ko

    2016-04-01

    This study aimed to characterize the distribution of colorectal cancer risk using family history of cancers by data mining. Family histories for 10,066 colorectal cancer cases recruited to population cancer registries of the Colon Cancer Family Registry were analyzed using a data mining framework. A novel index was developed to quantify familial cancer aggregation. An artificial neural network was used to identify distinct categories of familial risk. Standardized incidence ratios (SIRs) and corresponding 95% confidence intervals (CIs) of colorectal cancer were calculated for each category. We identified five major and 66 minor categories of familial risk for developing colorectal cancer. The distribution of the major risk categories was: (1) 7% of families (SIR = 7.11; 95% CI 6.65-7.59) had a strong family history of colorectal cancer; (2) 13% of families (SIR = 2.94; 95% CI 2.78-3.10) had a moderate family history of colorectal cancer; (3) 11% of families (SIR = 1.23; 95% CI 1.12-1.36) had a strong family history of breast cancer and a weak family history of colorectal cancer; (4) 9% of families (SIR = 1.06; 95% CI 0.96-1.18) had a strong family history of prostate cancer and a weak family history of colorectal cancer; and (5) 60% of families (SIR = 0.61; 95% CI 0.57-0.65) had a weak family history of all cancers. There is a wide variation of colorectal cancer risk that can be categorized by family history of cancer, with a strong gradient of colorectal cancer risk between the highest and lowest risk categories. The risk of colorectal cancer for people in the highest risk category of family history (7% of the population) was 12 times that for people in the lowest risk category (60% of the population). Data mining proved an effective approach for gaining insight into the underlying cancer aggregation patterns and for categorizing familial risk of colorectal cancer.
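    The "12 times" comparison in the abstract above can be checked from the two reported SIRs. The sketch below assumes the standard definition of a standardized incidence ratio (observed cases divided by expected cases); the helper name `sir` and the sample counts passed to it are hypothetical, and only the values 7.11 and 0.61 come from the abstract.

```python
def sir(observed, expected):
    """Standardized incidence ratio: observed cases / expected cases."""
    if expected <= 0:
        raise ValueError("expected case count must be positive")
    return observed / expected


# SIRs reported in the abstract for the highest and lowest risk categories.
highest_sir = 7.11
lowest_sir = 0.61

# Relative risk between the two categories, as a ratio of their SIRs.
ratio = highest_sir / lowest_sir
print(f"highest vs. lowest category: {ratio:.2f}x")
```

    The ratio comes out a little under 12, which is consistent with the rounded figure quoted in the abstract.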

  1. Determining the familial risk distribution of colorectal cancer: A data mining approach

    PubMed Central

    Chau, Rowena; Jenkins, Mark A.; Buchanan, Daniel D.; Ouakrim, Driss Ait; Giles, Graham G.; Casey, Graham; Gallinger, Steven; Haile, Robert W.; Le Marchand, Loic; Newcomb, Polly A.; Lindor, Noralane M.; Hopper, John L.; Win, Aung Ko

    2016-01-01

    This study aimed to characterize the distribution of colorectal cancer risk using family history of cancers by data mining. Family histories for 10,066 colorectal cancer cases recruited to population cancer registries of the Colon Cancer Family Registry were analyzed using a data mining framework. A novel index was developed to quantify familial cancer aggregation. An artificial neural network was used to identify distinct categories of familial risk. Standardized incidence ratios (SIRs) and corresponding 95% confidence intervals (CIs) of colorectal cancer were calculated for each category. We identified five major and sixty-six minor categories of familial risk for developing colorectal cancer. The distribution of the major risk categories was: (i) 7% of families (SIR=7.11; 95%CI=6.65–7.59) had a strong family history of colorectal cancer; (ii) 13% of families (SIR=2.94; 95%CI=2.78–3.10) had a moderate family history of colorectal cancer; (iii) 11% of families (SIR=1.23; 95%CI=1.12–1.36) had a strong family history of breast cancer and a weak family history of colorectal cancer; (iv) 9% of families (SIR=1.06; 95%CI=0.96–1.18) had a strong family history of prostate cancer and a weak family history of colorectal cancer; and (v) 60% of families (SIR=0.61; 95%CI=0.57–0.65) had a weak family history of all cancers. There is a wide variation of colorectal cancer risk that can be categorized by family history of cancer, with a strong gradient of colorectal cancer risk between the highest and lowest risk categories. The risk of colorectal cancer for people in the highest risk category of family history (7% of the population) was 12 times that for people in the lowest risk category (60% of the population). Data mining proved an effective approach for gaining insight into the underlying cancer aggregation patterns and for categorizing familial risk of colorectal cancer. PMID:26681340

  2. A Temporal Pattern Mining Approach for Classifying Electronic Health Record Data

    PubMed Central

    Batal, Iyad; Valizadegan, Hamed; Cooper, Gregory F.; Hauskrecht, Milos

    2013-01-01

    We study the problem of learning classification models from complex multivariate temporal data encountered in electronic health record systems. The challenge is to define a good set of features that are able to represent well the temporal aspect of the data. Our method relies on temporal abstractions and temporal pattern mining to extract the classification features. Temporal pattern mining usually returns a large number of temporal patterns, most of which may be irrelevant to the classification task. To address this problem, we present the Minimal Predictive Temporal Patterns framework to generate a small set of predictive and non-spurious patterns. We apply our approach to the real-world clinical task of predicting patients who are at risk of developing heparin induced thrombocytopenia. The results demonstrate the benefit of our approach in efficiently learning accurate classifiers, which is a key step for developing intelligent clinical monitoring systems. PMID:25309815

  3. Integrating text mining, data mining, and network analysis for identifying genetic breast cancer trends.

    PubMed

    Jurca, Gabriela; Addam, Omar; Aksac, Alper; Gao, Shang; Özyer, Tansel; Demetrick, Douglas; Alhajj, Reda

    2016-04-26

    Breast cancer is a serious disease which affects many women and may lead to death. It has received considerable attention from the research community. Thus, biomedical researchers aim to find genetic biomarkers indicative of the disease. Novel biomarkers can be elucidated from the existing literature. However, the vast number of scientific publications on breast cancer makes this a daunting task. This paper presents a framework which investigates existing literature data for informative discoveries. It integrates text mining and social network analysis in order to identify new potential biomarkers for breast cancer. We utilized PubMed for the testing. We investigated gene-gene interactions, as well as novel interactions such as gene-year, gene-country, and abstract-country to find out how the discoveries varied over time and how overlapping or diverse the discoveries and the interests of various research groups in different countries are. Interesting trends have been identified and discussed, e.g., different genes are highlighted in relationship to different countries though the various genes were found to share functionality. Some text analysis based results have been validated against results from other tools that predict gene-gene relations and gene functions.

  4. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities.

    PubMed

    Clapcott, Joanne E; Goodwin, Eric O; Harding, Jon S

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.
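    The two area-normalized predictors defined in the abstract above, mining density (number of mines upstream per catchment area) and mining intensity (tons of coal production per catchment area), reduce to simple ratios. The sketch below is illustrative only: the `Catchment` record, its field names, the choice of square kilometres as the area unit, and the sample values are all assumptions, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class Catchment:
    """Hypothetical record for one stream site's upstream catchment."""
    area_km2: float       # catchment area (unit assumed for illustration)
    mines_upstream: int   # number of mines upstream of the site
    coal_tons: float      # coal production attributed to those mines


def mining_density(c: Catchment) -> float:
    """Mines upstream per unit catchment area."""
    return c.mines_upstream / c.area_km2


def mining_intensity(c: Catchment) -> float:
    """Tons of coal production per unit catchment area."""
    return c.coal_tons / c.area_km2


# Made-up example site.
site = Catchment(area_km2=50.0, mines_upstream=4, coal_tons=120_000.0)
density = mining_density(site)
intensity = mining_intensity(site)
print(density, intensity)
```

    Normalizing by catchment area is what lets the predictors be compared across catchments of different sizes, in line with the abstract's finding that the decline in biodiversity held irrespective of catchment size.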

  5. Identifying Catchment-Scale Predictors of Coal Mining Impacts on New Zealand Stream Communities

    NASA Astrophysics Data System (ADS)

    Clapcott, Joanne E.; Goodwin, Eric O.; Harding, Jon S.

    2016-03-01

    Coal mining activities can have severe and long-term impacts on freshwater ecosystems. At the individual stream scale, these impacts have been well studied; however, few attempts have been made to determine the predictors of mine impacts at a regional scale. We investigated whether catchment-scale measures of mining impacts could be used to predict biological responses. We collated data from multiple studies and analyzed algae, benthic invertebrate, and fish community data from 186 stream sites, including un-mined streams, and those associated with 620 mines on the West Coast of the South Island, New Zealand. Algal, invertebrate, and fish richness responded to mine impacts and were significantly higher in un-mined compared to mine-impacted streams. Changes in community composition toward more acid- and metal-tolerant species were evident for algae and invertebrates, whereas changes in fish communities were significant and driven by a loss of nonmigratory native species. Consistent catchment-scale predictors of mining activities affecting biota included the time post mining (years), mining density (the number of mines upstream per catchment area), and mining intensity (tons of coal production per catchment area). Mining was associated with a decline in stream biodiversity irrespective of catchment size, and recovery was not evident until at least 30 years after mining activities have ceased. These catchment-scale predictors can provide managers and regulators with practical metrics to focus on management and remediation decisions.

  6. Identification of ex-sand mining area using optical and SAR imagery

    NASA Astrophysics Data System (ADS)

    Indriasari, Novie; Kusratmoko, Eko; Indra, Tito Latif; Julzarika, Atriyon

    2018-05-01

    Open mining activities in Sumedang Regency, in operation since 1984, have degraded the environment by leaving large ex-mining areas. Identifying ex-mining areas, which were generally used for sand mining, is therefore crucial for detecting and monitoring recent environmental degradation caused by the former mining activities. This research discusses the identification of ex-sand mining areas in Sumedang Regency using optical and SAR data. We use Landsat 5 TM imagery acquired on August 01, 2009 and Landsat 8 OLI imagery acquired on June 24, 2016 to locate sand mining areas, processed using the Tasselled Cap Transformation (TCT), while landform deformation is estimated from ALOS PALSAR data from 2009 and ALOS PALSAR 2 data from 2016, processed using the SAR interferometry (InSAR) method. The results show that the TCT and InSAR methods can be used to identify ex-sand mining areas clearly. In 2016 the total ex-mining area was 352.92 ha. Land deformation over the 7-year period since 2009 amounted to 7 meters.

  7. Citation-related reliability analysis for a pilot sample of underground coal mines.

    PubMed

    Kinilakodi, Harisha; Grayson, R Larry

    2011-05-01

    The scrutiny of underground coal mine safety was heightened because of the disasters that occurred in 2006-2007, and more recently in 2010. In the aftermath of the 2006 incidents, the U.S. Congress passed the Mine Improvement and New Emergency Response Act of 2006 (MINER Act), which strengthened the existing regulations and mandated new laws to address various issues related to emergency preparedness and response, escape from an emergency situation, and protection of miners. The National Mining Association-sponsored Mine Safety Technology and Training Commission study highlighted the role of risk management in identifying and controlling major hazards, which are elements that could come together and cause a mine disaster. In 2007 MSHA revised its approach to the "Pattern of Violations" (POV) process in order to target unsafe mines and then force them to remediate conditions in their mines. The POV approach has certain limitations that make it difficult for it to be enforced. One very understandable way to focus on removing threats from major-hazard conditions is to use citation-related reliability analysis. The citation reliability approach, which focuses on the probability of not getting a citation on a given inspector day, is considered an analogue to the maintenance reliability approach, which many mine operators understand and use. In this study, the citation reliability approach was applied to a stratified random sample of 31 underground coal mines to examine its potential for broader application. The results clearly show the best-performing and worst-performing mines for compliance with mine safety standards, and they highlight differences among different mine sizes. Copyright © 2010 Elsevier Ltd. All rights reserved.
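
    The citation reliability idea described above can be illustrated with a minimal sketch. It assumes, by analogy with maintenance reliability, that citations arrive at a constant rate per inspector day (a Poisson assumption that the abstract does not state). The function names and the sample figures are hypothetical and are not taken from the study.

```python
import math

def citation_rate(citations: int, inspector_days: float) -> float:
    """Average number of citations per inspector day (hypothetical helper)."""
    if inspector_days <= 0:
        raise ValueError("inspector_days must be positive")
    return citations / inspector_days

def citation_reliability(citations: int, inspector_days: float, days: float = 1.0) -> float:
    """Probability of receiving no citation over `days` inspector days,
    assuming a constant-rate (Poisson) process: R = exp(-rate * days)."""
    return math.exp(-citation_rate(citations, inspector_days) * days)

# Made-up (citations, inspector days) figures for two mines, for illustration only.
mines = {"mine_a": (12, 120.0), "mine_b": (90, 150.0)}

# Rank mines from most to least reliable on a single inspector day.
ranked = sorted(mines, key=lambda m: citation_reliability(*mines[m]), reverse=True)
for name in ranked:
    print(name, round(citation_reliability(*mines[name]), 3))
```

    Under this assumption a lower citation rate translates directly into a higher one-day reliability, which is what allows mines to be ranked from best-performing to worst-performing.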

  8. Meta-control of combustion performance with a data mining approach

    NASA Astrophysics Data System (ADS)

    Song, Zhe

    Large scale combustion process is complex and poses challenges for optimizing its performance. Traditional approaches based on thermal dynamics have limitations on finding optimal operational regions due to time-shift nature of the process. Recent advances in information technology enable people to collect large volumes of process data easily and continuously. The collected process data contains rich information about the process and, to some extent, represents a digital copy of the process over time. Although large volumes of data exist in industrial combustion processes, they are not fully utilized to the level where the process can be optimized. Data mining is an emerging science which finds patterns or models from large data sets. It has found many successful applications in business marketing, medical and manufacturing domains. The focus of this dissertation is on applying data mining to industrial combustion processes, and ultimately optimizing the combustion performance. However, the philosophy, methods and frameworks discussed in this research can also be applied to other industrial processes. Optimizing an industrial combustion process has two major challenges. One is that the underlying process model changes over time and obtaining an accurate process model is nontrivial. The other is that a process model with high fidelity is usually highly nonlinear, so solving the optimization problem needs efficient heuristics. This dissertation is set to solve these two major challenges. The major contribution of this 4-year research is the data-driven solution to optimize the combustion process, where process model or knowledge is identified based on the process data, then optimization is executed by evolutionary algorithms to search for optimal operating regions.

  9. Ensemble-based classification approach for micro-RNA mining applied on diverse metagenomic sequences.

    PubMed

    ElGokhy, Sherin M; ElHefnawi, Mahmoud; Shoukry, Amin

    2014-05-06

    MicroRNAs (miRNAs) are endogenous ∼22 nt RNAs that are identified in many species as powerful regulators of gene expression. Experimental identification of miRNAs is still slow since miRNAs are difficult to isolate by cloning due to their low expression, low stability, tissue specificity and the high cost of the cloning procedure. Thus, computational identification of miRNAs from genomic sequences provides a valuable complement to cloning. Different approaches for identification of miRNAs have been proposed based on homology, thermodynamic parameters, and cross-species comparisons. The present paper focuses on the integration of miRNA classifiers in a meta-classifier and the identification of miRNAs from metagenomic sequences collected from different environments. An ensemble of classifiers is proposed for miRNA hairpin prediction based on four well-known classifiers (Triplet SVM, Mipred, Virgo and EumiR), with non-identical features, and which have been trained on different data. Their decisions are combined using a single hidden layer neural network to increase the accuracy of the predictions. Our ensemble classifier achieved 89.3% accuracy, 82.2% f-measure, 74% sensitivity, 97% specificity, 92.5% precision and 88.2% negative predictive value when tested on real miRNA and pseudo sequence data. The area under the receiver operating characteristic curve of our classifier is 0.9 which represents a high performance index. The proposed classifier yields a significant performance improvement relative to Triplet-SVM, Virgo and EumiR and a minor refinement over MiPred. The developed ensemble classifier is used for miRNA prediction in mine drainage, groundwater and marine metagenomic sequences downloaded from the NCBI Sequence Read Archive. By consulting the miRBase repository, 179 miRNAs have been identified as highly probable miRNAs. Our new approach could thus be used for mining metagenomic sequences and finding new and homologous miRNAs. The paper investigates a
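
    The performance figures quoted above are related by standard confusion-matrix formulas, which the following minimal sketch spells out. The abstract does not give the size of the test set; the counts used below (100 real miRNAs, 200 pseudo sequences) are a hypothetical choice that happens to be consistent with the reported percentages, and the function names are illustrative.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    return 2 * precision * sensitivity / (precision + sensitivity)

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "npv": tn / (tn + fn),
        "f_measure": f_measure(precision, sensitivity),
    }

# Consistency check on the reported figures: precision 92.5%, sensitivity 74%.
print(round(100 * f_measure(0.925, 0.74), 1))  # 82.2

# Hypothetical counts (NOT from the paper): 100 positives, 200 negatives.
for name, value in metrics(tp=74, fp=6, tn=194, fn=26).items():
    print(f"{name}: {100 * value:.1f}%")
```

    The first check shows that the reported 82.2% f-measure follows from the reported precision and sensitivity, so the quoted figures are internally consistent.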

  10. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life.

    PubMed

    Verbruggen, Heroen; Maggs, Christine A; Saunders, Gary W; Le Gall, Line; Yoon, Hwan Su; De Clerck, Olivier

    2010-01-20

    The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes of more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Our study takes the red algae one step closer to meaningful inclusion in the tree of life.
In addition to the recovery of stable relationships, the recognition of

  11. Theoretical approaches to creation of robotic coal mines based on the synthesis of simulation technologies

    NASA Astrophysics Data System (ADS)

    Fryanov, V. N.; Pavlova, L. D.; Temlyantsev, M. V.

    2017-09-01

    Methodological approaches to theoretical substantiation of the structure and parameters of robotic coal mines are outlined. The results of mathematical and numerical modeling revealed the features of manifestation of geomechanical and gas dynamic processes in the conditions of robotic mines. Technological solutions for the design and manufacture of technical means for robotic mine are adopted using the method of economic and mathematical modeling and in accordance with the current regulatory documents. For a comparative performance evaluation of technological schemes of traditional and robotic mines, methods of cognitive modeling and matrix search for subsystem elements in the synthesis of a complex geotechnological system are applied. It is substantiated that the process of technical re-equipment of a traditional mine with a phased transition to a robotic mine will reduce unit costs by almost 1.5 times with a significant social effect due to a reduction in the number of personnel engaged in hazardous work.

  12. THE FUTURE OF COMPUTER-BASED TOXICITY PREDICTION: MECHANISM-BASED MODELS VS. INFORMATION MINING APPROACHES

    EPA Science Inventory

    When we speak of computer-based toxicity prediction, we are generally referring to a broad array of approaches which rely primarily upon chemical structure ...

  13. Systematic analysis of molecular mechanisms for HCC metastasis via text mining approach.

    PubMed

    Zhen, Cheng; Zhu, Caizhong; Chen, Haoyang; Xiong, Yiru; Tan, Junyuan; Chen, Dong; Li, Jin

    2017-02-21

    To systematically explore the molecular mechanism for hepatocellular carcinoma (HCC) metastasis and identify regulatory genes with text mining methods. Genes with the highest frequencies and significant pathways related to HCC metastasis were listed. A handful of proteins, such as EGFR, MDM2, TP53 and APP, were identified as hub nodes in the PPI (protein-protein interaction) network. Compared with the unique genes for HBV-HCCs, genes particular to HCV-HCCs were fewer, but may participate in more extensive signaling processes. VEGFA, PI3KCA, MAPK1, MMP9 and other genes may play important roles in multiple phenotypes of metastasis. Genes in abstracts of the HCC-metastasis literature were identified. Word frequency analysis, KEGG pathway and PPI network analysis were performed. Then co-occurrence analysis between genes and metastasis-related phenotypes was carried out. Text mining is effective for revealing potential regulators or pathways, but its purpose should be specific, and the combination of various methods will be more useful.
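
    The word frequency and co-occurrence steps mentioned above can be sketched in a few lines. This is a minimal illustration that counts co-occurrence at the level of whole abstracts; the abstract does not describe the authors' actual pipeline, and the mini-corpus, gene list and phenotype list below are invented for the example.

```python
from collections import Counter
from itertools import product

# Hypothetical mini-corpus and term lists, for illustration only.
abstracts = [
    "EGFR and TP53 were associated with invasion in HCC.",
    "MMP9 promoted invasion and angiogenesis; VEGFA drove angiogenesis.",
    "TP53 mutation correlated with invasion.",
]
genes = {"EGFR", "TP53", "MMP9", "VEGFA"}
phenotypes = {"invasion", "angiogenesis"}

def tokens(text: str) -> set:
    """Split on non-alphanumeric characters; case is kept for gene symbols."""
    cleaned = "".join(c if c.isalnum() else " " for c in text)
    return set(cleaned.split())

gene_freq = Counter()      # number of abstracts mentioning each gene
cooccurrence = Counter()   # number of abstracts mentioning a (gene, phenotype) pair
for abstract in abstracts:
    words = tokens(abstract)
    found_genes = genes & words
    found_phenotypes = phenotypes & {w.lower() for w in words}
    gene_freq.update(found_genes)
    cooccurrence.update(product(sorted(found_genes), sorted(found_phenotypes)))

print(gene_freq.most_common())
print(cooccurrence.most_common(3))
```

    Pairs that appear together in many abstracts are candidates for closer inspection; a real analysis would also need gene-name normalization and a statistical test for over-representation.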

  14. Using stable isotopes (δD, δ18O, δ34S and 87Sr/86Sr) to identify sources of water in abandoned mines in the Fengfeng coal mining district, northern China

    NASA Astrophysics Data System (ADS)

    Qu, Shen; Wang, Guangcai; Shi, Zheming; Xu, Qingyu; Guo, Yuying; Ma, Luan; Sheng, Yizhi

    2018-05-01

    With depleted coal resources or deteriorating mining geological conditions, some coal mines have been abandoned in the Fengfeng mining district, China. Water that accumulates in an abandoned underground mine (goaf water) may be a hazard to neighboring mines and impact the groundwater environment. Groundwater samples at three abandoned mines (Yi, Er and Quantou mines) in the Fengfeng mining district and the underlying Ordovician limestone aquifer were collected to characterize their chemical and isotopic compositions and identify the sources of the mine water. The water was HCO3·SO4-Ca·Mg type in Er mine and the auxiliary shaft of Yi mine, and HCO3·SO4-Na type in the main shaft of Quantou mine. The isotopic compositions (δD and δ18O) of water in the three abandoned mines were close to that of Ordovician limestone groundwater. Faults in the abandoned mines were well developed, possibly facilitating inflows of groundwater from the underlying Ordovician limestone aquifers into the coal mines. Although the Sr2+ concentrations differed considerably, the ratios of Sr2+/Ca2+ and 87Sr/86Sr and the 34S content of SO4 2- were similar for all three mine waters and Ordovician limestone groundwater, indicating that a close hydraulic connection may exist. Geochemical and isotopic indicators suggest that (1) the mine waters may originate mainly from the Ordovician limestone groundwater inflows, and (2) the upward hydraulic gradient in the limestone aquifer may prevent its contamination by the overlying abandoned mine water. The results of this study could be useful for water resources management in this area and other similar mining areas.

  15. North American Bats and Mines Project: A cooperative approach for integrating bat conservation and mine-land reclamation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ducummon, S.L.

    Inactive underground mines now provide essential habitat for more than half of North America's 44 bat species, including some of the largest remaining populations. Thousands of abandoned mines have already been closed or are slated for safety closures, and many are destroyed during renewed mining in historic districts. The available evidence suggests that millions of bats have already been lost due to these closures. Bats are primary predators of night-flying insects that cost American farmers and foresters billions of dollars annually; therefore, threats to bat survival are cause for serious concern. Fortunately, mine closure methods exist that protect both bats and humans. Bat Conservation International (BCI) and the USDI-Bureau of Land Management founded the North American Bats and Mines Project to provide national leadership and coordination to minimize the loss of mine-roosting bats. This partnership has involved federal and state mine-land and wildlife managers and the mining industry. BCI has trained hundreds of mine-land and wildlife managers nationwide in mine assessment techniques for bats and bat-compatible closure methods, published technical information on bats and mine-land management, presented papers on bats and mines at national mining and wildlife conferences, and collaborated with numerous federal, state, and private partners to protect some of the most important mine-roosting bat populations. Our new mining industry initiative, Mining for Habitat, is designed to develop bat habitat conservation and enhancement plans for active mining operations. It includes the creation of cost-effective artificial underground bat roosts using surplus mining materials such as old mine-truck tires and culverts buried beneath waste rock.

  16. Combined mining: discovering informative knowledge in complex data.

    PubMed

    Cao, Longbing; Zhang, Huaifeng; Zhao, Yanchang; Luo, Dan; Zhang, Chengqi

    2011-06-01

    Enterprise data mining applications often involve complex data such as multiple large heterogeneous data sources, user preferences, and business impact. In such situations, a single method or one-step mining is often limited in discovering informative knowledge. It would also be very time and space consuming, if not impossible, to join relevant large data sources for mining patterns consisting of multiple aspects of information. It is crucial to develop effective approaches for mining patterns combining necessary information from multiple relevant business lines, catering for real business settings and decision-making actions rather than just providing a single line of patterns. The recent years have seen increasing efforts on mining more informative patterns, e.g., integrating frequent pattern mining with classifications to generate frequent pattern-based classifiers. Rather than presenting a specific algorithm, this paper builds on our existing works and proposes combined mining as a general approach to mining for informative patterns combining components from either multiple data sets or multiple features or by multiple methods on demand. We summarize general frameworks, paradigms, and basic processes for multifeature combined mining, multisource combined mining, and multimethod combined mining. Novel types of combined patterns, such as incremental cluster patterns, can result from such frameworks, which cannot be directly produced by the existing methods. A set of real-world case studies has been conducted to test the frameworks, with some of them briefed in this paper. They identify combined patterns for informing government debt prevention and improving government service objectives, which show the flexibility and instantiation capability of combined mining in discovering informative knowledge in complex data.

  17. WHAT INNOVATIVE APPROACHES CAN BE DEVELOPED FOR MINING SITES?

    EPA Science Inventory

    Mining is essential to maintain our way of life. However, based upon industry's reporting in the most recent Toxic Release Inventory (TRI), the primary sources of heavy metal releases to the environment are mining and mining related activities. The hard rock mining industry rel...

  18. EVALUATION OF A TWO-STAGE PASSIVE TREATMENT APPROACH FOR MINING INFLUENCE WATERS

    EPA Science Inventory

    A two-stage passive treatment approach was assessed at bench-scale using two Colorado Mining Influenced Waters (MIWs). The first-stage was a limestone drain with the purpose of removing iron and aluminum and mitigating the potential effects of mineral acidity. The second stage w...

  19. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    PubMed

    Schneider, Nadine; Fechner, Nikolas; Landrum, Gregory A; Stiefl, Nikolaus

    2017-08-28

    Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods. Handling those huge sets of molecules effectively is quite challenging and requires compromises that often come at the expense of the interpretability of the results. In order to find an intuitive and meaningful approach to organizing large molecular data sets, we adopted a probabilistic framework called "topic modeling" from the text-mining field. Here we present the first chemistry-related implementation of this method, which allows large molecule sets to be assigned to "chemical topics" and investigating the relationships between those. In this first study, we thoroughly evaluate this novel method in different experiments and discuss both its disadvantages and advantages. We show very promising results in reproducing human-assigned concepts using the approach to identify and retrieve chemical series from sets of molecules. We have also created an intuitive visualization of the chemical topics output by the algorithm. This is a huge benefit compared to other unsupervised machine-learning methods, like clustering, which are commonly used to group sets of molecules. Finally, we applied the new method to the 1.6 million molecules of the ChEMBL22 data set to test its robustness and efficiency. In about 1 h we built a 100-topic model of this large data set in which we could identify interesting topics like "proteins", "DNA", or "steroids". Along with this publication we provide our data sets and an open-source implementation of the new method (CheTo) which

  20. Using text mining for study identification in systematic reviews: a systematic review of current approaches.

    PubMed

    O'Mara-Eves, Alison; Thomas, James; McNaught, John; Miwa, Makoto; Ananiadou, Sophia

    2015-01-14

    The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: through automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities. Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged? We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis to synthesise findings. The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish any overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable. On the whole, most suggested that a saving in workload of between 30% and 70% might be possible, though sometimes the saving in workload is accompanied by the loss of 5% of relevant studies (i.e. a 95% recall). Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in 'live' reviews. 
The use of text mining as a 'second screener' may also be used cautiously

  1. Text Mining.

    ERIC Educational Resources Information Center

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  2. A new approach to preserve privacy data mining based on fuzzy theory in numerical database

    NASA Astrophysics Data System (ADS)

    Cui, Run; Kim, Hyoung Joong

    2014-01-01

    With the rapid development of information techniques, data mining approaches have become one of the most important tools for discovering in-depth associations among tuples in large-scale databases. Protecting private information is therefore a major challenge, especially during the data mining procedure. In this paper, a new method for privacy protection based on fuzzy theory is proposed. The traditional fuzzy approach in this area applies fuzzification to the data without considering its readability. A new style of obscured data expression is introduced to provide more details of the subsets without reducing the readability. We also adopt a balanced approach between privacy level and utility when forming suitable subgroups. An experiment shows that this approach is suitable for classification without lowering accuracy. In the future, with suitable modification, this approach could be adapted to data streams because of the low computational complexity of the fuzzy function.

  3. New approach to generating insights for aging research based on literature mining and knowledge integration

    PubMed Central

    Kwon, Yeondae; Natori, Yukikazu

    2017-01-01

    The proportion of the elderly population in most countries worldwide is increasing dramatically. Therefore, social interest in the fields of health, longevity, and anti-aging has been increasing as well. However, the basic research results obtained from a reductionist approach in biology and a bioinformatic approach in genome science have limited usefulness for generating insights on future health, longevity, and anti-aging-related research on a case by case basis. We propose a new approach that uses our literature mining technique and bioinformatics, which lead to a better perspective on research trends by providing an expanded knowledge base to work from. We demonstrate that our approach provides useful information that deepens insights on future trends which differs from data obtained conventionally, and this methodology is already paving the way for a new field in aging-related research based on literature mining. One compelling example of this is how our new approach can be a useful tool in drug repositioning. PMID:28817730

  4. A novel association rule mining approach using TID intermediate itemset.

    PubMed

    Aqra, Iyad; Herawan, Tutut; Abdul Ghani, Norjihan; Akhunzada, Adnan; Ali, Akhtar; Bin Razali, Ramdan; Ilahi, Manzoor; Raymond Choo, Kim-Kwang

    2018-01-01

    Designing an efficient association rule mining (ARM) algorithm for multilevel knowledge-based transactional databases that is appropriate for real-world deployments is of paramount concern. However, dynamic decision making that needs to modify the threshold, either to minimize or maximize the output knowledge, requires the extant state-of-the-art algorithms to rescan the entire database. Consequently, the process incurs heavy computation cost and is not feasible for real-time applications. The paper efficiently addresses the problem of dynamic threshold updating for a given purpose. The paper contributes by presenting a novel ARM approach that creates an intermediate itemset and applies a threshold to extract categorical frequent itemsets with diverse threshold values, thus improving the overall efficiency as we no longer need to scan the whole database. After the entire itemset is built, we are able to obtain real support without the need to rebuild the itemset (e.g., the itemset list is intersected to obtain the actual support). Moreover, the algorithm supports extracting many frequent itemsets according to a pre-determined minimum support with an independent purpose. Additionally, the experimental results of our proposed approach demonstrate the capability to be deployed in any mining system in a fully parallel mode, consequently increasing the efficiency of the real-time association rules discovery process. The proposed approach outperforms the extant state-of-the-art and shows promising results that reduce computation cost, increase accuracy, and produce all possible itemsets.
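
    The idea of obtaining support by intersecting transaction-ID (TID) lists, so that the threshold can change without rescanning the database, can be sketched as follows. This is a generic vertical-format illustration and not the authors' algorithm; the transactions and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions (TID -> items), for illustration only.
transactions = {
    1: {"bread", "milk"},
    2: {"bread", "butter", "milk"},
    3: {"butter", "milk"},
    4: {"bread", "butter"},
}

def build_tid_lists(db: dict) -> dict:
    """Single database scan: map each item to the set of TIDs that contain it."""
    tid_lists = {}
    for tid, items in db.items():
        for item in items:
            tid_lists.setdefault(item, set()).add(tid)
    return tid_lists

def support(itemset, tid_lists: dict) -> int:
    """Support count of an itemset = size of the intersection of its TID lists."""
    return len(set.intersection(*(tid_lists[i] for i in itemset)))

def frequent_itemsets(tid_lists: dict, min_support: int, max_size: int = 2) -> dict:
    """Enumerate itemsets up to max_size using only the stored TID lists."""
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(sorted(tid_lists), size):
            count = support(itemset, tid_lists)
            if count >= min_support:
                result[itemset] = count
    return result

tid_lists = build_tid_lists(transactions)
# The threshold can be changed without reading `transactions` again.
print(frequent_itemsets(tid_lists, min_support=3))
print(frequent_itemsets(tid_lists, min_support=2))
```

    The brute-force enumeration over all candidate itemsets is only suitable for a toy example; it is used here to keep the focus on the intersection step rather than on candidate pruning.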

  5. A novel association rule mining approach using TID intermediate itemset

    PubMed Central

    Ali, Akhtar; Bin Razali, Ramdan; Ilahi, Manzoor; Raymond Choo, Kim-Kwang

    2018-01-01

    Designing an efficient association rule mining (ARM) algorithm for multilevel knowledge-based transactional databases that is appropriate for real-world deployments is of paramount concern. However, dynamic decision making that needs to modify the threshold, either to minimize or maximize the output knowledge, requires the extant state-of-the-art algorithms to rescan the entire database. Consequently, the process incurs heavy computation cost and is not feasible for real-time applications. The paper efficiently addresses the problem of dynamic threshold updating for a given purpose. The paper contributes by presenting a novel ARM approach that creates an intermediate itemset and applies a threshold to extract categorical frequent itemsets with diverse threshold values, thus improving the overall efficiency as we no longer need to scan the whole database. After the entire itemset is built, we are able to obtain real support without the need to rebuild the itemset (e.g., the itemset list is intersected to obtain the actual support). Moreover, the algorithm supports extracting many frequent itemsets according to a pre-determined minimum support with an independent purpose. Additionally, the experimental results of our proposed approach demonstrate the capability to be deployed in any mining system in a fully parallel mode, consequently increasing the efficiency of the real-time association rules discovery process. The proposed approach outperforms the extant state-of-the-art and shows promising results that reduce computation cost, increase accuracy, and produce all possible itemsets. PMID:29351287

  6. Design risk assessment for burst-prone mines: Application in a Canadian mine

    NASA Astrophysics Data System (ADS)

    Cheung, David J.

    reduce both exposure risk (personnel and equipment), and economical risk (revenue and costs). Fatal and catastrophic consequences can be averted through robust planning and design. Two customized approaches were developed to conduct risk assessment of case studies at Craig Mine. Firstly, the Brownfield Approach utilizes the seismic database to determine the seismic hazard from a rating system that evaluates frequency-magnitude, event size, and event-blast relation. Secondly, the Greenfield Approach utilizes the seismic database, focusing on larger magnitude events, rocktype, and geological structure. The customized Greenfield Approach can also be applied in the evaluation of design risk in deep mines with the same setting and condition as Craig Mine. Other mines with different settings and conditions can apply the principles in the methodology to evaluate design alternatives and risk reduction strategies for burst-prone mines.

  7. A Feature Mining Based Approach for the Classification of Text Documents into Disjoint Classes.

    ERIC Educational Resources Information Center

    Nieto Sanchez, Salvador; Triantaphyllou, Evangelos; Kraft, Donald

    2002-01-01

    Proposes a new approach for classifying text documents into two disjoint classes. Highlights include a brief overview of document clustering; a data mining approach called the One Clause at a Time (OCAT) algorithm which is based on mathematical logic; vector space model (VSM); and comparing the OCAT to the VSM. (Author/LRW)
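
    For readers unfamiliar with the vector space model (VSM) mentioned above, the following minimal sketch represents documents as term-frequency vectors and compares them by cosine similarity. The whitespace tokenizer and raw term counts are simplifying assumptions; the abstract does not state how the authors weighted terms, and the OCAT algorithm itself is not shown.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of a document (naive whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents used only for illustration.
doc_a = tf_vector("data mining of text documents")
doc_b = tf_vector("text mining of documents")
doc_c = tf_vector("seismic hazard rating")
similar = cosine(doc_a, doc_b)
unrelated = cosine(doc_a, doc_c)
```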

  8. Identification of hydrologic indicators related to fish diversity and abundance: A data mining approach for fish community analysis

    NASA Astrophysics Data System (ADS)

    Yang, Yi-Chen E.; Cai, Ximing; Herricks, Edwin E.

    2008-04-01

    This paper develops a new approach to identify hydrologic indicators related to fish community and generate a quantitative function between an ecological target index and the identified hydrologic indicators. The approach is based on genetic programming (GP), a data mining method. Using the Shannon Index (a fish community diversity index) or the number of individuals (total abundance) of a fish community, as an ecological target, the GP identified the most ecologically relevant hydrologic indicators (ERHIs) from 32 indicators of hydrologic alteration, for the case study site, the upper Illinois River. Robustness analysis showed that different GP runs found a similar set of ERHIs; each of the identified ERHI from different GP runs had a consistent relationship with the target index. By comparing the GP results with those from principal component analysis and autecology matrix, the three approaches identified a small number (six) of common ERHIs. Particularly, the timing of low flow (Dmin) seems to be more relevant to the diversity of the fish community, while the magnitude of the low flow (Qb) is more relevant to the total fish abundance; large rising rates result in a significant improvement of fish diversity, which is counterintuitive and against previous findings. The quantitative function developed by GP was further used to construct an indicator impact matrix (IIM), which was demonstrated as a potentially useful tool for streamflow restoration design.
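
    The Shannon Index used as an ecological target above is conventionally defined as H = -sum_i p_i ln p_i, where p_i is the proportion of individuals belonging to species i. A minimal sketch follows; the natural logarithm and the sample counts are assumptions, since the abstract does not state the log base or give data.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over species with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    proportions = (c / total for c in counts if c > 0)
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical species counts at two sampling sites.
even_site = [10, 10, 10, 10]    # four equally abundant species
skewed_site = [37, 1, 1, 1]     # one dominant species
h_even = shannon_index(even_site)
h_skewed = shannon_index(skewed_site)
```

    An even community yields the maximum value ln(S) for S species, while dominance by one species pushes the index toward zero.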

  9. Monitoring the growth or decline of vegetation on mine dumps

    NASA Technical Reports Server (NTRS)

    Gilbertson, B. P. (Principal Investigator)

    1975-01-01

    The author has identified the following significant results. It was established that particular mine dumps throughout the entire test area can be detected and identified. It was also established that patterns of vegetative growth on the mine dumps can be recognized from a simple visual analysis of photographic images. Because vegetation tends to occur in patches on many mine dumps, it is unsatisfactory to classify complete dumps into categories of percentage vegetative cover. A more desirable approach is to classify the patches of vegetation themselves. The coarse resolution of conventional densitometers restricts the accuracy of this procedure, and consequently a direct analysis of ERTS CCTs is preferred. A set of computer programs was written to perform the data reading and manipulating functions required for basic CCT analysis.

  10. Managing equipment innovations in mining: A review.

    PubMed

    Trudel, Bryan; Nadeau, Sylvie; Zaras, Kazimierz; Deschamps, Isabelle

    2015-01-01

    Technological innovations in mining equipment have led to increased productivity and occupational health and safety (OHS) performance, but their introduction also brings new risks for workers. The aim of this study is to provide support for mining industry managers who are required to reconcile equipment choices with OHS and productivity. The literature was examined through interdisciplinary digital databases. Databases were searched using specific combinations of keywords, limited to studies dating back no farther than 1992. The "snowball" technique was also used to examine the references listed in research articles initially identified with the databases. A total of 19 contextual factors were identified as having the potential to influence the OHS and productivity leverage of equipment innovations. The most often cited among these factors are the level of training provided to the equipment operators, operator experience and age, supervisor leadership abilities, and maintaining good relations within work crews. Interactions between these factors are not discussed in mining innovation literature. It would be helpful to use a systems thinking approach which incorporates interaction between relevant actors and factors to define properly the most sensitive aspects of innovation management as it applies to mining equipment.

  11. An ecosystem approach to evaluate restoration measures in the lignite mining district of Lusatia/Germany

    NASA Astrophysics Data System (ADS)

    Schaaf, Wolfgang

    2015-04-01

    Lignite mining in Lusatia has a history of over 100 years. Open-cast mining directly affected an area of 1000 km². Over the past 20 years we have established an ecosystem-oriented approach to evaluate the development and site characteristics of post-mining areas mainly restored for agricultural and silvicultural land use. Water and element budgets of afforested sites were studied under different geochemical settings in a chronosequence approach (Schaaf 2001), as well as the effect of soil amendments like sewage sludge or compost in restoration (Schaaf & Hüttl 2006). For the past 10 years we have also studied the development of natural site regeneration in the constructed catchment Chicken Creek at the watershed scale (Schaaf et al. 2011, 2013). One of the striking characteristics of post-mining sites is a very large small-scale soil heterogeneity that has to be taken into account with respect to soil forming processes and element cycling. Results from these studies, in combination with smaller-scale process studies, make it possible to evaluate the long-term effect of restoration measures and adapted land use options. In addition, it is crucial to compare these results with data from undisturbed, i.e. non-mined sites. Schaaf, W., 2001: What can element budgets of false-time series tell us about ecosystem development on post-lignite mining sites? Ecological Engineering 17, 241-252. Schaaf, W. and Hüttl, R. F., 2006: Direct and indirect effects of soil pollution by lignite mining. Water, Air and Soil Pollution - Focus 6, 253-264. Schaaf, W., Bens, O., Fischer, A., Gerke, H.H., Gerwin, W., Grünewald, U., Holländer, H.M., Kögel-Knabner, I., Mutz, M., Schloter, M., Schulin, R., Veste, M., Winter, S. & Hüttl, R.F., 2011: Patterns and processes of initial terrestrial-ecosystem development. Journal of Plant Nutrition and Soil Science, 174, 229-239. Schaaf, W., Elmer, M., Fischer, A., Gerwin, W., Nenov, R., Pretsch, H. and Zaplate, M.K., 2013: Feedbacks between vegetation, surface structures and hydrology

  12. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. Conclusions In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  13. Information mining in remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Li, Jiang

    The volume of remotely sensed imagery continues to grow at an enormous rate due to the advances in sensor technology, and our capability for collecting and storing images has greatly outpaced our ability to analyze and retrieve information from the images. This motivates us to develop image information mining techniques, which is very much an interdisciplinary endeavor drawing upon expertise in image processing, databases, information retrieval, machine learning, and software design. This dissertation proposes and implements an extensive remote sensing image information mining (ReSIM) system prototype for mining useful information implicitly stored in remote sensing imagery. The system consists of three modules: image processing subsystem, database subsystem, and visualization and graphical user interface (GUI) subsystem. Land cover and land use (LCLU) information corresponding to spectral characteristics is identified by supervised classification based on support vector machines (SVM) with automatic model selection, while textural features that characterize spatial information are extracted using Gabor wavelet coefficients. Within LCLU categories, textural features are clustered using an optimized k-means clustering approach to acquire search efficient space. The clusters are stored in an object-oriented database (OODB) with associated images indexed in an image database (IDB). A k-nearest neighbor search is performed using a query-by-example (QBE) approach. Furthermore, an automatic parametric contour tracing algorithm and an O(n) time piecewise linear polygonal approximation (PLPA) algorithm are developed for shape information mining of interesting objects within the image. A fuzzy object-oriented database based on the fuzzy object-oriented data (FOOD) model is developed to handle the fuzziness and uncertainty. Three specific applications are presented: integrated land cover and texture pattern mining, shape information mining for change detection of lakes, and

  14. Using data mining techniques to characterize participation in observational studies.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2016-12-01

    Data mining techniques are gaining in popularity among health researchers for an array of purposes, such as improving diagnostic accuracy, identifying high-risk patients and extracting concepts from unstructured data. In this paper, we describe how these techniques can be applied to another area in the health research domain: identifying characteristics of individuals who do and do not choose to participate in observational studies. In contrast to randomized studies where individuals have no control over their treatment assignment, participants in observational studies self-select into the treatment arm and therefore have the potential to differ in their characteristics from those who elect not to participate. These differences may explain part, or all, of the difference in the observed outcome, making it crucial to assess whether there is differential participation based on observed characteristics. As compared to traditional approaches to this assessment, data mining offers a more precise understanding of these differences. To describe and illustrate the application of data mining in this domain, we use data from a primary care-based medical home pilot programme and compare the performance of commonly used classification approaches - logistic regression, support vector machines, random forests and classification tree analysis (CTA) - in correctly classifying participants and non-participants. We find that CTA is substantially more accurate than the other models. Moreover, unlike the other models, CTA offers transparency in its computational approach, ease of interpretation via the decision rules produced and provides statistical results familiar to health researchers. Beyond their application to research, data mining techniques could help administrators to identify new candidates for participation who may most benefit from the intervention. © 2016 John Wiley & Sons, Ltd.

  15. A structural informatics approach to mine kinase knowledge bases.

    PubMed

    Brooijmans, Natasja; Mobilio, Dominick; Walker, Gary; Nilakantan, Ramaswamy; Denny, Rajiah A; Feyfant, Eric; Diller, David; Bikker, Jack; Humblet, Christine

    2010-03-01

    In this paper, we describe a combination of structural informatics approaches developed to mine data extracted from existing structure knowledge bases (Protein Data Bank and the GVK database) with a focus on kinase ATP-binding site data. In contrast to existing systems that retrieve and analyze protein structures, our techniques are centered on a database of ligand-bound geometries in relation to residues lining the binding site and transparent access to ligand-based SAR data. We illustrate the systems in the context of the Abelson kinase and related inhibitor structures. 2009 Elsevier Ltd. All rights reserved.

  16. Mining of relations between proteins over biomedical scientific literature using a deep-linguistic approach.

    PubMed

    Rinaldi, Fabio; Schneider, Gerold; Kaljurand, Kaarel; Hess, Michael; Andronis, Christos; Konstandi, Ourania; Persidis, Andreas

    2007-02-01

    The amount of new discoveries (as published in the scientific literature) in the biomedical area is growing at an exponential rate. This growth makes it very difficult to filter the most relevant results, and thus the extraction of the core information becomes very expensive. Therefore, there is a growing interest in text processing approaches that can deliver selected information from scientific publications, which can limit the amount of human intervention normally needed to gather those results. This paper presents and evaluates an approach aimed at automating the process of extracting functional relations (e.g. interactions between genes and proteins) from scientific literature in the biomedical domain. The approach, using a novel dependency-based parser, is based on a complete syntactic analysis of the corpus. We have implemented a state-of-the-art text mining system for biomedical literature, based on a deep-linguistic, full-parsing approach. The results are validated on two different corpora: the manually annotated genomics information access (GENIA) corpus and the automatically annotated arabidopsis thaliana circadian rhythms (ATCR) corpus. We show how a deep-linguistic approach (contrary to common belief) can be used in a real world text mining application, offering high-precision relation extraction, while at the same time retaining a sufficient recall.

  17. Process mining in oncology using the MIMIC-III dataset

    NASA Astrophysics Data System (ADS)

    Prima Kurniati, Angelina; Hall, Geoff; Hogg, David; Johnson, Owen

    2018-03-01

    Process mining is a data analytics approach to discover and analyse process models based on the real activities captured in information systems. There is a growing body of literature on process mining in healthcare, including oncology, the study of cancer. In earlier work we found 37 peer-reviewed papers describing process mining research in oncology with a regular complaint being the limited availability and accessibility of datasets with suitable information for process mining. Publicly available datasets are one option and this paper describes the potential to use MIMIC-III, for process mining in oncology. MIMIC-III is a large open access dataset of de-identified patient records. There are 134 publications listed as using the MIMIC dataset, but none of them have used process mining. The MIMIC-III dataset has 16 event tables which are potentially useful for process mining and this paper demonstrates the opportunities to use MIMIC-III for process mining in oncology. Our research applied the L* lifecycle method to provide a worked example showing how process mining can be used to analyse cancer pathways. The results and data quality limitations are discussed along with opportunities for further work and reflection on the value of MIMIC-III for reproducible process mining research.
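
    As a small illustration of what "discovering a process model from recorded activities" involves, the sketch below counts directly-follows relations in an event log, a common first step in process discovery. It is not the L* lifecycle method used in the paper, and the event tuples are hypothetical rather than drawn from MIMIC-III.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is immediately followed by activity b
    within the same case. event_log holds (case_id, timestamp, activity)."""
    traces = defaultdict(list)
    for case_id, timestamp, activity in event_log:
        traces[case_id].append((timestamp, activity))
    counts = Counter()
    for events in traces.values():
        events.sort()  # order each case's events by timestamp
        activities = [activity for _, activity in events]
        counts.update(zip(activities, activities[1:]))
    return counts


# Hypothetical event log: integer timestamps, made-up activity names.
log = [
    (1, 1, "admission"), (1, 2, "chemotherapy"), (1, 3, "discharge"),
    (2, 2, "discharge"), (2, 1, "admission"),
]
relations = directly_follows(log)
```

    The resulting counts can be read as weighted edges of a directly-follows graph, which process mining tools typically refine into a process model.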

  18. Mines Systems Safety Improvement Using an Integrated Event Tree and Fault Tree Analysis

    NASA Astrophysics Data System (ADS)

    Kumar, Ranjan; Ghosh, Achyuta Krishna

    2017-04-01

    Mine systems such as ventilation systems, strata support systems, and flameproof safety equipment are exposed to dynamic operational conditions such as stress, humidity, dust, and temperature, and the safety of such systems is preferably improved during the planning and design stage. However, the existing safety analysis methods do not handle the accident initiation and progression of mine systems explicitly. To bridge this gap, this paper presents an integrated Event Tree (ET) and Fault Tree (FT) approach for safety analysis and improvement of mine systems design. This approach includes ET and FT modeling coupled with a redundancy allocation technique. In this method, a concept of top hazard probability is introduced for identifying system failure probability, and redundancy is allocated to the system either at component or system level. A case study on mine methane explosion safety with two initiating events is performed. The results demonstrate that the presented method can reveal the accident scenarios and improve the safety of complex mine systems simultaneously.
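
    The arithmetic behind a combined event tree and fault tree can be sketched as follows. The sketch assumes statistically independent basic events, sequential safety barriers, and redundancy in the form of identical parallel components; all probabilities are hypothetical, and the paper's own definition of top hazard probability may differ.

```python
from math import prod


def and_gate(probs):
    """Fault-tree AND gate: output fails only if every input fails."""
    return prod(probs)


def or_gate(probs):
    """Fault-tree OR gate: output fails if at least one input fails."""
    return 1.0 - prod(1.0 - p for p in probs)


def with_redundancy(p_fail, n):
    """Failure probability of n identical, independent parallel components."""
    return p_fail ** n


def event_tree(initiating_prob, barrier_fail_probs):
    """Propagate an initiating event through sequential safety barriers."""
    outcomes = {}
    reach = initiating_prob  # probability that the sequence reaches this barrier
    for i, p_fail in enumerate(barrier_fail_probs, start=1):
        outcomes[f"stopped_by_barrier_{i}"] = reach * (1.0 - p_fail)
        reach *= p_fail
    outcomes["accident"] = reach
    return outcomes


# Hypothetical numbers for illustration only.
ventilation_fail = or_gate([0.02, 0.01])         # fan failure OR power loss
detector_fail = 0.05
baseline = event_tree(0.1, [ventilation_fail, detector_fail])
improved = event_tree(0.1, [ventilation_fail, with_redundancy(detector_fail, 2)])
```

    Comparing the "accident" entries of the baseline and improved trees shows how allocating redundancy to one barrier lowers the overall hazard probability.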

  19. Prediction model for peninsular Indian summer monsoon rainfall using data mining and statistical approaches

    NASA Astrophysics Data System (ADS)

    Vathsala, H.; Koolagudi, Shashidhar G.

    2017-01-01

    In this paper we discuss a data mining application for predicting peninsular Indian summer monsoon rainfall, and propose an algorithm that combines data mining and statistical techniques. We select likely predictors based on association rules that have the highest confidence levels. We then cluster the selected predictors to reduce their dimensions and use cluster membership values for classification. We derive the predictors from local conditions in southern India, including mean sea level pressure, wind speed, and maximum and minimum temperatures. The global condition variables include southern oscillation and Indian Ocean dipole conditions. The algorithm predicts rainfall in five categories: Flood, Excess, Normal, Deficit and Drought. We use closed itemset mining, cluster membership calculations and a multilayer perceptron function in the algorithm to predict monsoon rainfall in peninsular India. Using Indian Institute of Tropical Meteorology data, we found the prediction accuracy of our proposed approach to be exceptionally good.
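
    Selecting predictors by rule confidence can be illustrated with a minimal sketch. The confidence of a rule A -> B is the fraction of records containing A that also contain B. The discretized records, the attribute names and the ranking step below are hypothetical, and the closed itemset mining, clustering and perceptron stages of the paper are not shown.

```python
def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent:
    count(antecedent and consequent) / count(antecedent)."""
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    return n_both / n_antecedent if n_antecedent else 0.0


# Hypothetical discretized seasonal records, not real meteorological data.
records = [
    {"pressure=low", "wind=high", "rain=Excess"},
    {"pressure=low", "wind=low", "rain=Excess"},
    {"pressure=low", "wind=high", "rain=Normal"},
    {"pressure=high", "wind=low", "rain=Deficit"},
]
candidates = ["pressure=low", "wind=high", "wind=low"]
# Rank candidate predictors by the confidence of "predictor -> rain=Excess".
ranked = sorted(
    candidates,
    key=lambda item: confidence(records, {item}, {"rain=Excess"}),
    reverse=True,
)
```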

  20. A software tool for determination of breast cancer treatment methods using data mining approach.

    PubMed

    Cakır, Abdülkadir; Demirel, Burçin

    2011-12-01

    In this work, breast cancer treatment methods are determined using data mining. For this purpose, software was developed to help oncologists suggest treatment methods for breast cancer patients. Data from 462 breast cancer patients, obtained from Ankara Oncology Hospital, are used to determine treatment methods for new patients. This dataset is processed with the Weka data mining tool. Classification algorithms are applied one by one to this dataset and the results are compared to find the proper treatment method. The developed software program, called "Treatment Assistant", has a Java NetBeans interface and uses different algorithms (IB1, Multilayer Perceptron and Decision Table) to find out which one gives the better result for each attribute to be predicted. Treatment methods are determined for post-surgical breast cancer patients using this software tool. At the modeling step of the data mining process, different Weka algorithms are used for the output attributes. IB1 shows the best accuracy for the hormonotherapy output, Multilayer Perceptron for the tamoxifen and radiotherapy outputs, and the Decision Table algorithm for the chemotherapy output. In conclusion, this work shows that a data mining approach can be a useful tool for medical applications, particularly at the treatment decision step. Data mining helps the doctor to decide in a short time.

  1. Collaborative Data Mining

    NASA Astrophysics Data System (ADS)

    Moyle, Steve

    Collaborative Data Mining is a setting where the Data Mining effort is distributed to multiple collaborating agents - human or software. The objective of the collaborative Data Mining effort is to produce solutions to the tackled Data Mining problem which are considered better by some metric, with respect to those solutions that would have been achieved by individual, non-collaborating agents. The solutions require evaluation, comparison, and approaches for combination. Collaboration requires communication, and implies some form of community. The human form of collaboration is a social task. Organizing communities in an effective manner is non-trivial and often requires well defined roles and processes. Data Mining, too, benefits from a standard process. This chapter explores the standard Data Mining process CRISP-DM utilized in a collaborative setting.

  2. Abandoned Uranium Mine (AUM) Trust Mine Points, Navajo Nation, 2016, US EPA Region 9

    EPA Pesticide Factsheets

    This GIS dataset contains point features that represent mines included in the Navajo Environmental Response Trust. This mine category also includes Priority mines. USEPA and NNEPA prioritized mines based on gamma radiation levels, proximity to homes and potential for water contamination identified in the preliminary assessments. Attributes include mine names, reclaimed status, links to US EPA AUM reports, and the region in which the mine is located. This dataset contains 19 features.

  3. Renewed mining and reclamation: Impacts on bats and potential mitigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown, P.E.; Berry, R.D.

    Historic mining created new roosting habitat for many bat species. Now the same industry has the potential to adversely impact bats. Contemporary mining operations usually occur in historic districts; consequently the old workings are destroyed by open pit operations. Occasionally, underground techniques are employed, resulting in the enlargement or destruction of the original workings. Even during exploratory operations, historic mine openings can be covered as drill roads are bulldozed, or drills can penetrate and collapse underground workings. Nearby blasting associated with mine construction and operation can disrupt roosting bats. Bats can also be disturbed by the entry of mine personnel to collect ore samples or by recreational mine explorers, since the creation of roads often results in easier access. In addition to roost disturbance, other aspects of renewed mining can have adverse impacts on bat populations, and affect even those bats that do not live in mines. Open cyanide ponds, or other water in which toxic chemicals accumulate, can poison bats and other wildlife. The creation of the pits, roads and processing areas often destroys critical foraging habitat, or changes drainage patterns. Finally, at the completion of mining, any historic mines still open may be sealed as part of closure and reclamation activities. The net result can be a loss of bats and bat habitat. Conversely, in some contemporary underground operations, future roosting habitat for bats can be fabricated. An experimental approach to the creation of new roosting habitat is to bury culverts or old tires beneath waste rock. Mining companies can mitigate for impacts to bats by surveying to identify bat-roosting habitat, removing bats prior to renewed mining or closure, protecting non-impacted roost sites with gates and fences, researching to identify habitat requirements and creating new artificial roosts.

  4. Semi-automated literature mining to identify putative biomarkers of disease from multiple biofluids

    PubMed Central

    2014-01-01

    Background Computational methods for mining of biomedical literature can be useful in augmenting manual searches of the literature using keywords for disease-specific biomarker discovery from biofluids. In this work, we develop and apply a semi-automated literature mining method to mine abstracts obtained from PubMed to discover putative biomarkers of breast and lung cancers in specific biofluids. Methodology A positive set of abstracts was defined by the terms ‘breast cancer’ and ‘lung cancer’ in conjunction with 14 separate ‘biofluids’ (bile, blood, breastmilk, cerebrospinal fluid, mucus, plasma, saliva, semen, serum, synovial fluid, stool, sweat, tears, and urine), while a negative set of abstracts was defined by the terms ‘(biofluid) NOT breast cancer’ or ‘(biofluid) NOT lung cancer.’ More than 5.3 million total abstracts were obtained from PubMed and examined for biomarker-disease-biofluid associations (34,296 positive and 2,653,396 negative for breast cancer; 28,355 positive and 2,595,034 negative for lung cancer). Biological entities such as genes and proteins were tagged using ABNER, and processed using Python scripts to produce a list of putative biomarkers. Z-scores were calculated, ranked, and used to determine significance of putative biomarkers found. Manual verification of relevant abstracts was performed to assess our method’s performance. Results Biofluid-specific markers were identified from the literature, assigned relevance scores based on frequency of occurrence, and validated using known biomarker lists and/or databases for lung and breast cancer [NCBI’s On-line Mendelian Inheritance in Man (OMIM), Cancer Gene annotation server for cancer genomics (CAGE), NCBI’s Genes & Disease, NCI’s Early Detection Research Network (EDRN), and others]. The specificity of each marker for a given biofluid was calculated, and the performance of our semi-automated literature mining method assessed for breast and lung cancer

  5. Graph-based biomedical text summarization: An itemset mining and sentence clustering approach.

    PubMed

    Nasr Azadani, Mozhgan; Ghadiri, Nasser; Davoodijam, Ensieh

    2018-06-12

    Automatic text summarization offers an efficient solution to access the ever-growing amounts of both scientific and clinical literature in the biomedical domain by summarizing the source documents while maintaining their most informative contents. In this paper, we propose a novel graph-based summarization method that takes advantage of domain-specific knowledge and a well-established data mining technique called frequent itemset mining. Our summarizer exploits the Unified Medical Language System (UMLS) to construct a concept-based model of the source document and map the document to the concepts. Then, it discovers frequent itemsets to take the correlations among multiple concepts into account. The method uses these correlations to propose a similarity function based on which a represented graph is constructed. The summarizer then employs a minimum spanning tree based clustering algorithm to discover various subthemes of the document. Eventually, it generates the final summary by selecting the most informative and relevant sentences from all subthemes within the text. We perform an automatic evaluation over a large number of summaries using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics. The results demonstrate that the proposed summarization system outperforms various baselines and benchmark approaches. The research suggests that the incorporation of domain-specific knowledge and frequent itemset mining better equips the summarization system to measure the informativeness of sentences. Moreover, clustering the graph nodes (sentences) can enable the summarizer to target different main subthemes of a source document efficiently. The evaluation results show that the proposed approach can significantly improve the performance of summarization systems in the biomedical domain. Copyright © 2018. Published by Elsevier Inc.
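
    The minimum spanning tree based clustering step can be sketched as follows. Sentences are represented as sets of concept identifiers, and Kruskal's algorithm is stopped once k components remain, which is equivalent to building the full tree and cutting its k-1 heaviest edges. Jaccard distance stands in here for the authors' itemset-based similarity function, and the concept identifiers are made up.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; a stand-in for the paper's similarity function."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def mst_clusters(concept_sets, k):
    """Cluster sentences by running Kruskal's algorithm until k components remain."""
    n = len(concept_sets)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (jaccard_distance(concept_sets[i], concept_sets[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    components = n
    for _, i, j in edges:
        if components <= k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # accept this edge into the spanning forest
            components -= 1
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical sentences, each mapped to a set of made-up concept IDs.
sentences = [
    {"C1", "C2"},
    {"C1", "C2", "C3"},
    {"C7", "C8"},
    {"C7", "C8", "C9"},
]
subthemes = mst_clusters(sentences, k=2)
```

    Each returned cluster lists the indices of sentences that share a subtheme; a summarizer would then pick representative sentences from every cluster.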

  6. Abandoned Mine Lands: Site Information

    EPA Pesticide Factsheets

    A catalogue of mining sites proposed for and listed on the NPL as well as mining sites being cleaned up using the Superfund Alternative Approach. Also mine sites not on the NPL but that have had removal or emergency response cleanup actions.

  7. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways

    PubMed Central

    Koumakis, Lefteris; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Vassou, Despoina; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-01-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach. In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the

  8. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways.

    PubMed

    Koumakis, Lefteris; Kanterakis, Alexandros; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Tsiknakis, Manolis; Vassou, Despoina; Kafetzopoulos, Dimitris; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-11-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype-differentiating genes. Early pathway analysis methodologies, known as gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. More recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach. In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the
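
    The sub-path idea in this abstract can be illustrated with a minimal sketch. It is not the MinePath implementation: the toy pathway, the binarized expression profiles, the matching rule (regulator on; target on for activation, off for inhibition) and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

# Hypothetical toy pathway (not taken from the paper): edge -> relation type.
PATHWAY: Dict[Edge, str] = {
    ("A", "B"): "activation",
    ("B", "C"): "inhibition",
    ("B", "D"): "activation",
}


def enumerate_subpaths(pathway: Dict[Edge, str]) -> List[List[str]]:
    """List every simple directed path that contains at least one relation."""
    successors: Dict[str, List[str]] = {}
    for src, dst in pathway:
        successors.setdefault(src, []).append(dst)
    genes = sorted({gene for edge in pathway for gene in edge})
    found: List[List[str]] = []

    def extend(path: List[str]) -> None:
        if len(path) > 1:
            found.append(list(path))
        for nxt in successors.get(path[-1], []):
            if nxt not in path:  # skip cycles so paths stay simple
                extend(path + [nxt])

    for gene in genes:
        extend([gene])
    return found


def is_functional(path, sample, pathway=PATHWAY) -> bool:
    """Assumed rule: regulator on; activation needs target on, inhibition needs it off."""
    for src, dst in zip(path, path[1:]):
        wanted = 1 if pathway[(src, dst)] == "activation" else 0
        if sample[src] != 1 or sample[dst] != wanted:
            return False
    return True


def differential_power(path, class_a, class_b) -> float:
    """Difference between classes in the fraction of samples where the sub-path is functional."""
    rate_a = sum(is_functional(path, s) for s in class_a) / len(class_a)
    rate_b = sum(is_functional(path, s) for s in class_b) / len(class_b)
    return rate_a - rate_b


# Hypothetical binarized expression profiles for two phenotype classes.
CLASS_A = [{"A": 1, "B": 1, "C": 0, "D": 1}, {"A": 1, "B": 1, "C": 0, "D": 0}]
CLASS_B = [{"A": 1, "B": 1, "C": 1, "D": 0}, {"A": 0, "B": 1, "C": 1, "D": 1}]

for subpath in enumerate_subpaths(PATHWAY):
    print(subpath, differential_power(subpath, CLASS_A, CLASS_B))
```

    In this toy setting a longer sub-path can separate the two classes more sharply than any of its single relations, which is the motivation the abstract gives for scoring sub-paths rather than single-relations.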

  9. Integrated investigations of environmental effects of historical mining in the Basin and Boulder Mining Districts, Boulder River watershed, Jefferson County, Montana

    USGS Publications Warehouse

    Nimick, David A.; Church, Stan E.; Finger, Susan E.

    2004-01-01

    The Boulder River watershed is one of many watersheds in the western United States where historical mining has left a legacy of acid mine drainage and elevated concentrations of potentially toxic trace elements. Abandoned mine lands commonly are located on or affect Federal land. Cleaning up these Federal lands will require substantial investment of resources. As part of a cooperative effort with Federal land-management agencies, the U.S. Geological Survey implemented an Abandoned Mine Lands Initiative in 1997. The goal of the initiative was to use the watershed approach to develop a strategy for gathering and communicating the scientific information needed to formulate effective and cost-efficient remediation of affected lands in a watershed. The watershed approach is based on the premise that contaminated sites that have the most profound effect on water and ecosystem quality within an entire watershed should be identified, characterized, and ranked for remediation. The watershed approach provides an effective means to evaluate the overall status of affected resources and helps to focus remediation at sites where the most benefit will be gained in the watershed. Such a large-scale approach can result in the collection of extensive information on the geology and geochemistry of rocks and sediment, the hydrology and water chemistry of streams and ground water, and the diversity and health of aquatic and terrestrial organisms. During the assessment of the Boulder River watershed, we inventoried historical mines, defined geological conditions, assessed fish habitat, collected and chemically analyzed hundreds of water and sediment samples, conducted toxicity tests, analyzed fish tissue and indicators of physiological malfunction, examined invertebrates and biofilm, and defined hydrological regimes. Land- and resource-management agencies are faced with evaluating risks associated with thousands of potentially harmful mine sites, and this level of effort is not always

  10. Improving Fraud and Abuse Detection in General Physician Claims: A Data Mining Study

    PubMed Central

    Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad

    2016-01-01

    Background: We aimed to identify the indicators of healthcare fraud and abuse in general physicians’ drug prescription claims, and to identify a subset of general physicians that were more likely to have committed fraud and abuse. Methods: We applied a data mining approach to a major health insurance organization dataset of private sector general physicians’ prescription claims. It involved 5 steps: clarifying the nature of the problem and objectives, data preparation, indicator identification and selection, cluster analysis to identify suspect physicians, and discriminant analysis to assess the validity of the clustering approach. Results: Thirteen indicators were developed in total. Over half of the general physicians (54%) were ‘suspects’ of conducting abusive behavior. The results also identified 2% of physicians as suspects of fraud. Discriminant analysis suggested that the indicators demonstrated adequate performance in the detection of physicians who were suspected of perpetrating fraud (98%) and abuse (85%) in a new sample of data. Conclusion: Our data mining approach will help health insurance organizations in low- and middle-income countries (LMICs) in streamlining auditing approaches towards the suspect groups rather than routine auditing of all physicians. PMID:26927587

  11. Improving Fraud and Abuse Detection in General Physician Claims: A Data Mining Study.

    PubMed

    Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad

    2015-11-10

    We aimed to identify the indicators of healthcare fraud and abuse in general physicians' drug prescription claims, and to identify a subset of general physicians that were more likely to have committed fraud and abuse. We applied a data mining approach to a major health insurance organization dataset of private sector general physicians' prescription claims. It involved 5 steps: clarifying the nature of the problem and objectives, data preparation, indicator identification and selection, cluster analysis to identify suspect physicians, and discriminant analysis to assess the validity of the clustering approach. Thirteen indicators were developed in total. Over half of the general physicians (54%) were 'suspects' of conducting abusive behavior. The results also identified 2% of physicians as suspects of fraud. Discriminant analysis suggested that the indicators demonstrated adequate performance in the detection of physicians who were suspected of perpetrating fraud (98%) and abuse (85%) in a new sample of data. Our data mining approach will help health insurance organizations in low- and middle-income countries (LMICs) in streamlining auditing approaches towards the suspect groups rather than routine auditing of all physicians. © 2016 by Kerman University of Medical Sciences.

  12. Online discourse on fibromyalgia: text-mining to identify clinical distinction and patient concerns.

    PubMed

    Park, Jungsik; Ryu, Young Uk

    2014-10-07

    The purpose of this study was to evaluate the possibility of using text-mining to identify clinical distinctions and patient concerns in online memoirs posted by patients with fibromyalgia (FM). A total of 399 memoirs were collected from an FM group website. The unstructured data of memoirs associated with FM were collected through a crawling process and converted into structured data with a concordance, parts of speech tagging, and word frequency. We also conducted a lexical analysis and phrase pattern identification. After examining the data, a set of FM-related keywords was obtained and phrase net relationships were set through a web-based visualization tool. The clinical distinction of FM was verified. Pain is the biggest issue for FM patients. The pain affected body parts including 'muscles,' 'leg,' 'neck,' 'back,' 'joints,' and 'shoulders' with accompanying symptoms such as 'spasms,' 'stiffness,' and 'aching,' and was described as 'severe,' 'chronic,' and 'constant.' This study also demonstrated that it was possible to understand the interests and concerns of FM patients through text-mining. FM patients wanted to escape from the pain and symptoms, so they were interested in medical treatment and help. Also, they seemed to have interest in their work and occupation, and hoped to continue to live life through their relationships with the people around them. This research shows the potential for extracting keywords to confirm the clinical distinction of a certain disease, and text-mining can help objectively understand the concerns of patients by generalizing their large number of subjective illness experiences. However, it is believed that there are limitations to the processes and methods for organizing and classifying large amounts of text, so these limits have to be considered when analyzing the results. The development of research methodology to overcome these limitations is greatly needed.
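
    The word-frequency and concordance steps mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's pipeline: the two example sentences, the stopword list and the function names are invented here, and the parts-of-speech tagging step is omitted.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "and", "in", "my", "is", "a", "of", "i", "have"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(docs: List[str]) -> Counter:
    """Count non-stopword tokens across all documents."""
    counts: Counter = Counter()
    for doc in docs:
        counts.update(tok for tok in tokenize(doc) if tok not in STOPWORDS)
    return counts


def concordance(docs: List[str], keyword: str, window: int = 2) -> List[str]:
    """Return each occurrence of keyword with `window` tokens of context on each side."""
    lines = []
    for doc in docs:
        tokens = tokenize(doc)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                lines.append(" ".join(tokens[max(0, i - window): i + window + 1]))
    return lines


# Invented example text, not data from the study.
memoirs = [
    "The pain in my neck is constant.",
    "Chronic pain and stiffness in my back.",
]

print(word_frequencies(memoirs).most_common(3))
print(concordance(memoirs, "pain"))
```

    Frequency counts surface candidate keywords, while the concordance shows the context in which each keyword appears, which is what allows body parts and symptom descriptors to be linked to the same term.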

  13. A Swarm Optimization approach for clinical knowledge mining.

    PubMed

    Christopher, J Jabez; Nehemiah, H Khanna; Kannan, A

    2015-10-01

    Rule-based classification is a typical data mining task that is being used in several medical diagnosis and decision support systems. The rules stored in the rule base have an impact on classification efficiency. Rule sets that are extracted with data mining tools and techniques are optimized using heuristic or meta-heuristic approaches in order to improve the quality of the rule base. In this work, a meta-heuristic approach called Wind-driven Swarm Optimization (WSO) is used. The uniqueness of this work lies in the biological inspiration that underlies the algorithm. WSO uses Jval, a new metric, to evaluate the efficiency of a rule-based classifier. Rules are extracted from decision trees. WSO is used to obtain different permutations and combinations of rules whereby the optimal ruleset that satisfies the requirement of the developer is used for predicting the test data. The performance of various extensions of decision trees, namely, RIPPER, PART, FURIA and Decision Tables, is analyzed. The efficiency of WSO is also compared with the traditional Particle Swarm Optimization (PSO). Experiments were carried out with six benchmark medical datasets. The traditional C4.5 algorithm yields 62.89% accuracy with 43 rules for the liver disorders dataset, whereas WSO yields 64.60% with 19 rules. For the heart disease dataset, C4.5 is 68.64% accurate with 98 rules, whereas WSO is 77.8% accurate with 34 rules. The normalized standard deviations for accuracy of PSO and WSO are 0.5921 and 0.5846, respectively. WSO provides accurate and concise rulesets. PSO yields results similar to those of WSO, but the novelty of WSO lies in its biological motivation and its customization for rule base optimization. The trade-off between the prediction accuracy and the size of the rule base is optimized during the design and development of rule-based clinical decision support system. The efficiency of a decision support system relies on the content of the rule base and classification accuracy. Copyright

  14. A Control Chart Approach for Representing and Mining Data Streams with Shape Based Similarity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Omitaomu, Olufemi A

    The mining of data streams for online condition monitoring is a challenging task in several domains including (electric) power grid systems, intelligent manufacturing, and consumer science. Consider a power grid application in which thousands of sensors, called phasor measurement units, are deployed on the power grid network to continuously collect streams of digital data for real-time situational awareness and system management. Depending on design, each sensor could stream between ten and sixty data samples per second. The myriad of sensory data captured could convey deeper insights about the sequence of events in real time, before major damage is done. However, the timely processing and analysis of these high-velocity and high-volume data streams is a challenge. Hence, a new data processing and transformation approach, based on the concept of control charts, for representing sequences of data streams from sensors is proposed. In addition, an application of the proposed approach for enhancing data mining tasks such as clustering using real-world power grid data streams is presented. The results indicate that the proposed approach is very efficient for data stream storage and manipulation.
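
    The abstract does not spell out the transformation, so the following is only a sketch of one way a control chart could yield a compact stream representation. It assumes a Shewhart-style chart (center line plus or minus k standard deviations estimated from a baseline window), a three-symbol encoding, and run-length compression; the readings and all names are hypothetical.

```python
from itertools import groupby
from statistics import mean, pstdev
from typing import List, Tuple


def control_limits(baseline: List[float], k: float = 3.0) -> Tuple[float, float]:
    """Lower and upper control limits: baseline mean -/+ k standard deviations."""
    center = mean(baseline)
    sigma = pstdev(baseline)
    return center - k * sigma, center + k * sigma


def encode(stream: List[float], lower: float, upper: float) -> List[str]:
    """Map each sample to 'L' (below limits), 'N' (in control) or 'H' (above limits)."""
    symbols = []
    for value in stream:
        if value > upper:
            symbols.append("H")
        elif value < lower:
            symbols.append("L")
        else:
            symbols.append("N")
    return symbols


def run_length(symbols: List[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive identical symbols into (symbol, count) pairs."""
    return [(sym, len(list(group))) for sym, group in groupby(symbols)]


# Hypothetical sensor readings (for example, frequency in Hz); not real grid data.
baseline = [60.0, 60.1, 59.9, 60.0, 60.1, 59.9]
stream = [60.0, 60.1, 60.5, 60.6, 59.9, 59.5]

lower, upper = control_limits(baseline)
compact = run_length(encode(stream, lower, upper))
print(compact)
```

    Because long in-control stretches collapse into a single pair, the symbolic form is much shorter than the raw stream, and sequences of symbols can be compared by shape when clustering sensors.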

  15. Metal dispersion resulting from mining activities in coastal environments: A pathways approach

    USGS Publications Warehouse

    Koski, Randolph A.

    2012-01-01

    Acid rock drainage (ARD) and disposal of tailings that result from mining activities impact coastal areas in many countries. The dispersion of metals from mine sites that are both proximal and distal to the shoreline can be examined using a pathways approach in which physical and chemical processes guide metal transport in the continuum from sources (sulfide minerals) to bioreceptors (marine biota). Large amounts of metals can be physically transported to the coastal environment by intentional or accidental release of sulfide-bearing mine tailings. Oxidation of sulfide minerals results in elevated dissolved metal concentrations in surface waters on land (producing ARD) and in pore waters of submarine tailings. Changes in pH, adsorption by insoluble secondary minerals (e.g., Fe oxyhydroxides), and precipitation of soluble salts (e.g., sulfates) affect dissolved metal fluxes. Evidence for bioaccumulation includes anomalous metal concentrations in bivalves and reef corals, and overlapping Pb isotope ratios for sulfides, shellfish, and seaweed in contaminated environments. Although bioavailability and potential toxicity are, to a large extent, functions of metal speciation, specific uptake pathways, such as adsorption from solution and ingestion of particles, also play important roles. Recent emphasis on broader ecological impacts has led to complementary methodologies involving laboratory toxicity tests and field studies of species richness and diversity.

  16. Metal dispersion resulting from mining activities in coastal environments: a pathways approach

    USGS Publications Warehouse

    Koski, Randolph A.

    2012-01-01

    Acid rock drainage (ARD) and disposal of tailings that result from mining activities impact coastal areas in many countries. The dispersion of metals from mine sites that are both proximal and distal to the shoreline can be examined using a pathways approach in which physical and chemical processes guide metal transport in the continuum from sources (sulfide minerals) to bioreceptors (marine biota). Large amounts of metals can be physically transported to the coastal environment by intentional or accidental release of sulfide-bearing mine tailings. Oxidation of sulfide minerals results in elevated dissolved metal concentrations in surface waters on land (producing ARD) and in pore waters of submarine tailings. Changes in pH, adsorption by insoluble secondary minerals (e.g., Fe oxyhydroxides), and precipitation of soluble salts (e.g., sulfates) affect dissolved metal fluxes. Evidence for bioaccumulation includes anomalous metal concentrations in bivalves and reef corals, and overlapping Pb isotope ratios for sulfides, shellfish, and seaweed in contaminated environments. Although bioavailability and potential toxicity are, to a large extent, functions of metal speciation, specific uptake pathways, such as adsorption from solution and ingestion of particles, also play important roles. Recent emphasis on broader ecological impacts has led to complementary methodologies involving laboratory toxicity tests and field studies of species richness and diversity.

  17. A review of approaches to identifying patient phenotype cohorts using electronic health records

    PubMed Central

    Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric; Embi, Peter J; Elhadad, Noemie; Johnson, Stephen B; Lai, Albert M

    2014-01-01

    Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses. PMID:24201027

  18. Monitoring genotoxic exposure in uranium mines.

    PubMed Central

    Srám, R J; Dobiás, L; Rössner, P; Veselá, D; Veselý, D; Rakusová, R; Rericha, V

    1993-01-01

    Recent data from deep uranium mines in Czechoslovakia indicated that miners are exposed to other mutagenic factors in addition to radon daughter products. Mycotoxins were identified as a possible source of mutagens in these mines. Mycotoxins were examined in 38 samples from mines and in throat swabs taken from 116 miners and 78 controls. The following mycotoxins were identified from mine samples: aflatoxins B1 and G1, citrinin, citreoviridin, mycophenolic acid, and sterigmatocystin. Some mold strains isolated from mines and throat swabs were investigated for mutagenic activity by the SOS chromotest and Salmonella assay with strains TA100 and TA98. Mutagenicity was observed, especially with metabolic activation in vitro. These data suggest that mycotoxins produced by molds in uranium mines are a new genotoxic factor for uranium miners. PMID:8143610

  19. Restoring Forests and Associated Ecosystem Services on Appalachian Coal Surface Mines

    NASA Astrophysics Data System (ADS)

    Zipper, Carl E.; Burger, James A.; Skousen, Jeffrey G.; Angel, Patrick N.; Barton, Christopher D.; Davis, Victor; Franklin, Jennifer A.

    2011-05-01

    Surface coal mining in Appalachia has caused extensive replacement of forest with non-forested land cover, much of which is unmanaged and unproductive. Although forested ecosystems are valued by society for both marketable products and ecosystem services, forests have not been restored on most Appalachian mined lands because traditional reclamation practices, encouraged by regulatory policies, created conditions poorly suited for reforestation. Reclamation scientists have studied productive forests growing on older mine sites, established forest vegetation experimentally on recent mines, and identified mine reclamation practices that encourage forest vegetation re-establishment. Based on these findings, they developed a Forestry Reclamation Approach (FRA) that can be employed by coal mining firms to restore forest vegetation. Scientists and mine regulators, working collaboratively, have communicated the FRA to the coal industry and to regulatory enforcement personnel. Today, the FRA is used routinely by many coal mining firms, and thousands of mined hectares have been reclaimed to restore productive mine soils and planted with native forest trees. Reclamation of coal mines using the FRA is expected to restore these lands' capabilities to provide forest-based ecosystem services, such as wood production, atmospheric carbon sequestration, wildlife habitat, watershed protection, and water quality protection to a greater extent than conventional reclamation practices.

  20. MINE: Module Identification in Networks

    PubMed Central

    2011-01-01

    Background Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans. PMID:21605434

  1. TOXICITY APPROACHES TO ASSESSING MINING IMPACTS AND MINE WASTE TREATMENT EFFECTIVENESS

    EPA Science Inventory

    The USEPA Office of Research and Development's National Exposure Research Laboratory and National Risk Management Research Laboratory have been evaluating the impact of mining sites on receiving streams and the effectiveness of waste treatment technologies in removing toxicity fo...

  2. Realising the knowledge spiral in healthcare: the role of data mining and knowledge management.

    PubMed

    Wickramasinghe, Nilmini; Bali, Rajeev K; Gibbons, M Chris; Schaffer, Jonathan

    2008-01-01

    Knowledge Management (KM) is an emerging business approach aimed at solving current problems such as competitiveness and the need to innovate which are faced by businesses today. The premise for the need for KM is based on a paradigm shift in the business environment where knowledge is central to organizational performance. Organizations trying to embrace KM have many tools, techniques and strategies at their disposal. A vital technique in KM is data mining which enables critical knowledge to be gained from the analysis of large amounts of data and information. The healthcare industry is a very information rich industry. The collecting of data and information permeates most, if not all areas of this industry; however, the healthcare industry has yet to fully embrace KM, let alone the new evolving techniques of data mining. In this paper, we demonstrate the ubiquitous benefits of data mining and KM to healthcare by highlighting their potential to enable and facilitate superior clinical practice and administrative management to ensue. Specifically, we show how data mining can realize the knowledge spiral by effecting the four key transformations identified by Nonaka of turning: (1) existing explicit knowledge to new explicit knowledge, (2) existing explicit knowledge to new tacit knowledge, (3) existing tacit knowledge to new explicit knowledge and (4) existing tacit knowledge to new tacit knowledge. This is done through the establishment of theoretical models that respectively identify the function of the knowledge spiral and the powers of data mining, both exploratory and predictive, in the knowledge discovery process. Our models are then applied to a healthcare data set to demonstrate the potential of this approach as well as the implications of such an approach to the clinical and administrative aspects of healthcare. Further, we demonstrate how these techniques can facilitate hospitals to address the six healthcare quality dimensions identified by the Committee

  3. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches

    PubMed Central

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D.; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  4. Assessing Weather-Yield Relationships in Rice at Local Scale Using Data Mining Approaches.

    PubMed

    Delerce, Sylvain; Dorado, Hugo; Grillon, Alexandre; Rebolledo, Maria Camila; Prager, Steven D; Patiño, Victor Hugo; Garcés Varón, Gabriel; Jiménez, Daniel

    2016-01-01

    Seasonal and inter-annual climate variability have become important issues for farmers, and climate change has been shown to increase them. Simultaneously farmers and agricultural organizations are increasingly collecting observational data about in situ crop performance. Agriculture thus needs new tools to cope with changing environmental conditions and to take advantage of these data. Data mining techniques make it possible to extract embedded knowledge associated with farmer experiences from these large observational datasets in order to identify best practices for adapting to climate variability. We introduce new approaches through a case study on irrigated and rainfed rice in Colombia. Preexisting observational datasets of commercial harvest records were combined with in situ daily weather series. Using Conditional Inference Forest and clustering techniques, we assessed the relationships between climatic factors and crop yield variability at the local scale for specific cultivars and growth stages. The analysis showed clear relationships in the various location-cultivar combinations, with climatic factors explaining 6 to 46% of spatiotemporal variability in yield, and with crop responses to weather being non-linear and cultivar-specific. Climatic factors affected cultivars differently during each stage of development. For instance, one cultivar was affected by high nighttime temperatures in the reproductive stage but responded positively to accumulated solar radiation during the ripening stage. Another was affected by high nighttime temperatures during both the vegetative and reproductive stages. Clustering of the weather patterns corresponding to individual cropping events revealed different groups of weather patterns for irrigated and rainfed systems with contrasting yield levels. Best-suited cultivars were identified for some weather patterns, making weather-site-specific recommendations possible. This study illustrates the potential of data mining for

  5. Application of EREP imagery to fracture-related mine safety hazards and environmental problems in mining

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J.; Amato, R. V.; Russell, O. R. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. Numerous fracture traces were detected on both the color transparencies and black and white spectral bands. Fracture traces of value to mining hazards analysis were noted on the EREP imagery which could not be detected on either the ERTS-1 or high altitude aircraft color infrared photography. Several areas of mine subsidence occurring in the Busseron Creek area near Sullivan, Indiana, were successfully identified using color photography. Skylab photography affords an increase over comparable scale ERTS-1 imagery in level of information obtained in mined lands inventory and reclamation analysis. A review of EREP color photography permitted the identification of a substantial number of non-fuel mines within the Southern Indiana test area. A new mine was detected on the EREP photography without prior data. EREP has definite value for estimating areal changes in active mines and for detecting new non-fuel mines. Gob piles and slurry ponds of several acres could be detected on the S-190B color photography when observed in association with large scale mining operations. Apparent degradation of water quality resulting from acid mine drainage and/or siltation was noted in several ponds or small lakes and appears to be related to intensive mining activity near Sullivan, Indiana.

  6. Quantitative Analysis of Critical Factors for the Climate Impact of Landfill Mining.

    PubMed

    Laner, David; Cencic, Oliver; Svensson, Niclas; Krook, Joakim

    2016-07-05

    Landfill mining has been proposed as an innovative strategy to mitigate environmental risks associated with landfills, to recover secondary raw materials and energy from the deposited waste, and to enable high-valued land uses at the site. The present study quantitatively assesses the importance of specific factors and conditions for the net contribution of landfill mining to global warming using a novel, set-based modeling approach and provides policy recommendations for facilitating the development of projects contributing to global warming mitigation. Building on life-cycle assessment, scenario modeling and sensitivity analysis methods are used to identify critical factors for the climate impact of landfill mining. The net contributions to global warming of the scenarios range from -1550 (saving) to 640 (burden) kg CO2e per Mg of excavated waste. Nearly 90% of the results' total variation can be explained by changes in four factors, namely the landfill gas management in the reference case (i.e., alternative to mining the landfill), the background energy system, the composition of the excavated waste, and the applied waste-to-energy technology. Based on the analyses, circumstances under which landfill mining should be prioritized or not are identified and sensitive parameters for the climate impact assessment of landfill mining are highlighted.

  7. Sustainable rehabilitation of mining waste and acid mine drainage using geochemistry, mine type, mineralogy, texture, ore extraction and climate knowledge.

    PubMed

    Anawar, Hossain Md

    2015-08-01

    The oxidative dissolution of sulfidic minerals releases the extremely acidic leachate, sulfate and potentially toxic elements e.g., As, Ag, Cd, Cr, Cu, Hg, Ni, Pb, Sb, Th, U, Zn, etc. from different mine tailings and waste dumps. For the sustainable rehabilitation and disposal of mining waste, the sources and mechanisms of contaminant generation, fate and transport of contaminants should be clearly understood. Therefore, this study has provided a critical review on (1) recent insights in mechanisms of oxidation of sulfidic minerals, (2) environmental contamination by mining waste, and (3) remediation and rehabilitation techniques, and (4) then developed the GEMTEC conceptual model/guide [(bio)-geochemistry-mine type-mineralogy- geological texture-ore extraction process-climatic knowledge)] to provide the new scientific approach and knowledge for remediation of mining wastes and acid mine drainage. This study has suggested the pre-mining geological, geochemical, mineralogical and microtextural characterization of different mineral deposits, and post-mining studies of ore extraction processes, physical, geochemical, mineralogical and microbial reactions, natural attenuation and effect of climate change for sustainable rehabilitation of mining waste. All components of this model should be considered for effective and integrated management of mining waste and acid mine drainage. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. A study of acid and ferruginous mine water in coal mining operations

    NASA Astrophysics Data System (ADS)

    Atkins, A. S.; Singh, R. N.

    1982-06-01

    The paper describes a bio-chemical investigation in the laboratory to identify various factors which promote the formation of acidic and ferruginous mine water. Biochemical reactions responsible for bacterial oxidation of Iron pyrites are described. The acidic and ferruginous mine water are not only responsible for the corrosion of mine plant and equipment and formation of scales in the delivery pipe range, but also pollution of the mine surface environment, thus affecting the surface ecology. Control measures to mitigate the adverse effects of acid mine discharge include the protection of mining equipment and prevention of formation of acid and ferruginous water. Various control measures discussed in the paper are blending with alkaline or spring water, use of neutralising agents and bactericides, and various types of seals for preventing water and air coming into contact with pyrites in caved mine workings.

  9. Image Information Mining Utilizing Hierarchical Segmentation

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Marchisio, Giovanni; Koperski, Krzysztof; Datcu, Mihai

    2002-01-01

    The Hierarchical Segmentation (HSEG) algorithm is an approach for producing high quality, hierarchically related image segmentations. The VisiMine image information mining system utilizes clustering and segmentation algorithms for reducing visual information in multispectral images to a manageable size. The project discussed herein seeks to enhance the VisiMine system through incorporating hierarchical segmentations from HSEG into the VisiMine system.

  10. Data mining mining data: MSHA enforcement efforts, underground coal mine safety, and new health policy implications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kniesner, T.J.; Leeth, J.D.

    2004-09-15

    Using recently assembled data from the Mine Safety and Health Administration (MSHA) we shed new light on the regulatory approach to workplace safety. Because all underground coal mines are inspected quarterly, MSHA regulations will not be ineffective because of infrequent inspections. From over 200 different specifications of dynamic mine safety regressions we select the specification producing the largest MSHA impact. Even using results most favorable to the agency, MSHA is not currently cost effective. Almost 700,000 life years could be gained for typical miners if a quarter of MSHA's enforcement budget were reallocated to other programs (more heart disease screening or defibrillators at worksites).

  11. Text mining approach to predict hospital admissions using early medical records from the emergency department.

    PubMed

    Lucini, Filipe R; S Fogliatto, Flavio; C da Silveira, Giovani J; L Neyeloff, Jeruza; Anzanello, Michel J; de S Kuchenbecker, Ricardo; D Schaan, Beatriz

    2017-04-01

    Emergency department (ED) overcrowding is a serious issue for hospitals. Early information on short-term inward bed demand from patients receiving care at the ED may reduce the overcrowding problem, and optimize the use of hospital resources. In this study, we use text mining methods to process data from early ED patient records using the SOAP framework, and predict future hospitalizations and discharges. We try different approaches for pre-processing of text records and to predict hospitalization. Sets-of-words are obtained via binary representation, term frequency, and term frequency-inverse document frequency. Unigrams, bigrams and trigrams are tested for feature formation. Feature selection is based on χ² and F-score metrics. In the prediction module, eight text mining methods are tested: Decision Tree, Random Forest, Extremely Randomized Tree, AdaBoost, Logistic Regression, Multinomial Naïve Bayes, Support Vector Machine (Kernel linear) and Nu-Support Vector Machine (Kernel linear). Prediction performance is evaluated by F1-scores. Precision and Recall values are also reported for all text mining methods tested. Nu-Support Vector Machine was the text mining method with the best overall performance. Its average F1-score in predicting hospitalization was 77.70%, with a standard deviation (SD) of 0.66%. The method could be used to manage daily routines in EDs such as capacity planning and resource allocation. Text mining could provide valuable information and facilitate decision-making by inward bed management teams. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
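
    The feature formation and evaluation steps named in this abstract (n-gram features, term frequency-inverse document frequency weighting, and precision, recall and F1) can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline: the function names, the whitespace tokenizer, and the plain tf × log(N/df) weighting are assumptions chosen for brevity, and no classifier is included.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=3):
    """Weight each n-gram by tf * log(N / df); one dict per document.

    Assumption: lowercase + whitespace tokenization and the unsmoothed
    idf variant. Real toolkits often smooth or normalize differently.
    """
    counts = [Counter(ngrams(d.lower().split(), n_max)) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    return [
        {gram: tf * math.log(n_docs / df[gram]) for gram, tf in c.items()}
        for c in counts
    ]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

    An n-gram that occurs in every document gets weight zero under this variant, which is why such terms contribute nothing to distinguishing admitted from discharged patients.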

  12. Soil quality assessment using GIS-based chemometric approach and pollution indices: Nakhlak mining district, Central Iran.

    PubMed

    Moore, Farid; Sheykhi, Vahideh; Salari, Mohammad; Bagheri, Adel

    2016-04-01

    This paper is a comprehensive assessment of the quality of soil in the Nakhlak mining district in Central Iran with special reference to potentially toxic metals. In this regard, an integrated approach involving geostatistical, correlation matrix, pollution indices, and chemical fractionation measurement is used to evaluate selected potentially toxic metals in soil samples. The fractionation of metals indicated a relatively high variability. Some metals (Mo, Ag, and Pb) showed important enrichment in the bioavailable fractions (i.e., exchangeable and carbonate), whereas the residual fraction mostly comprised Sb and Cr. The Cd, Zn, Co, Ni, Mo, Cu, and As were retained in Fe-Mn oxide and oxidizable fractions, suggesting that they may be released to the environment by changes in physicochemical conditions. The spatial variability patterns of 11 soil heavy metals (Ag, As, Cd, Co, Cr, Cu, Mo, Ni, Pb, Sb, and Zn) were identified and mapped. The results demonstrated that Ag, As, Cd, Mo, Cu, Pb, Sb, and Zn pollution are associated with mineralized veins and mining operations in this area. Further environmental monitoring and remedial actions are required for management of soil heavy metals in the study area. The present study not only enhanced our knowledge regarding soil pollution in the study area but also introduced a better technique to analyze pollution indices by multivariate geostatistical methods.

  13. Mining and biodiversity offsets: a transparent and science-based approach to measure "no-net-loss".

    PubMed

    Virah-Sawmy, Malika; Ebeling, Johannes; Taplin, Roslyn

    2014-10-01

    Mining and associated infrastructure developments can present themselves as economic opportunities that are difficult to forego for developing and industrialised countries alike. Almost inevitably, however, they lead to biodiversity loss. This trade-off can be greatest in economically poor but highly biodiverse regions. Biodiversity offsets have, therefore, increasingly been promoted as a mechanism to help achieve both the aims of development and biodiversity conservation. Accordingly, this mechanism is emerging as a key tool for multinational mining companies to demonstrate good environmental stewardship. Relying on offsets to achieve "no-net-loss" of biodiversity, however, requires certainty in their ecological integrity where they are used to sanction habitat destruction. Here, we discuss real-world practices in biodiversity offsetting by assessing how well some leading initiatives internationally integrate critical aspects of biodiversity attributes, net loss accounting and project management. With the aim of improving, rather than merely critiquing the approach, we analyse different aspects of biodiversity offsetting. Further, we analyse the potential pitfalls of developing counterfactual scenarios of biodiversity loss or gains in a project's absence. In this, we draw on insights from experience with carbon offsetting. This informs our discussion of realistic projections of project effectiveness and permanence of benefits to ensure no net losses, and the risk of displacing, rather than avoiding biodiversity losses ("leakage"). We show that the most prominent existing biodiversity offset initiatives employ broad and somewhat arbitrary parameters to measure habitat value and do not sufficiently consider real-world challenges in compensating losses in an effective and lasting manner. We propose a more transparent and science-based approach, supported with a new formula, to help design biodiversity offsets to realise their potential in enabling more responsible

  14. Using MSD prevention for cultural change in mining: Queensland Government/Anglo Coal Industry partnership.

    PubMed

    Tilbury, Trudy; Sanderson, Liz

    2012-01-01

    Queensland Mining has a strong focus on safety performance, but risk management of health, including Musculoskeletal Disorders (MSDs), continues to have a lower priority. The reliance on individual screening of workers and lower level approaches such as manual handling training is part of the coal mining 'culture'. Initiatives such as the New South Wales and Queensland Mining joint project to develop good practice guidance for mining have allowed for a more consistent message on participatory ergonomics and prevention of MSD. An evidence based practice approach, including the introduction of participatory ergonomics and safe design principles, was proposed to Anglo American Coal operations in Queensland. The project consisted of a skills analysis of current health personnel, design of a facilitated participatory ergonomics training program, site visits to identify good practice and champions, and a graduated mentoring program for health personnel. Early results demonstrate a number of sites are benefiting from site taskforces with a focus on positive performance outcomes.

  15. Monitoring genotoxic exposure in uranium mines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sram, R.J.; Vesela, D.; Vesely, D.

    1993-10-01

    Recent data from deep uranium mines in Czechoslovakia indicated that miners are exposed to other mutagenic factors in addition to radon daughter products. Mycotoxins were identified as a possible source of mutagens in these mines. Mycotoxins were examined in 38 samples from mines and in throat swabs taken from 116 miners and 78 controls. The following mycotoxins were identified from mine samples: aflatoxins B1 and G1, citrinin, citreoviridin, mycophenolic acid, and sterigmatocystin. Some mold strains isolated from mines and throat swabs were investigated for mutagenic activity by the SOS chromotest and Salmonella assay with strains TA100 and TA98. Mutagenicity was observed, especially with metabolic activation in vitro. These data suggest that mycotoxins produced by molds in uranium mines are a new genotoxic factor in uranium miners. 17 refs., 4 tabs.

  16. Using association rule mining to identify risk factors for early childhood caries.

    PubMed

    Ivančević, Vladimir; Tušek, Ivan; Tušek, Jasmina; Knežević, Marko; Elheshk, Salaheddin; Luković, Ivan

    2015-11-01

    Early childhood caries (ECC) is a potentially severe disease affecting children all over the world. The available findings are mostly based on a logistic regression model, but data mining, in particular association rule mining, could be used to extract more information from the same data set. ECC data was collected in a cross-sectional analytical study of the 10% sample of preschool children in the South Bačka area (Vojvodina, Serbia). Association rules were extracted from the data by association rule mining. Risk factors were extracted from the highly ranked association rules. Discovered dominant risk factors include male gender, frequent breastfeeding (with other risk factors), high birth order, language, and low body weight at birth. Low health awareness of parents was significantly associated with ECC only in male children. The discovered risk factors are mostly confirmed by the literature, which corroborates the value of the methods. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
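
    For readers unfamiliar with association rule mining, the sketch below shows how rules of the form "factors → outcome" can be scored by support, confidence and lift and then ranked. It is a brute-force illustration under stated assumptions: the function names, thresholds, and the generic item labels (`factor_a`, `factor_b`, `outcome`) are hypothetical, the toy records are invented and are not study data, and the abstract does not say which algorithm or ranking measure the authors used.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rules_for_target(transactions, target, min_support=0.2,
                     min_confidence=0.6, max_len=2):
    """Enumerate rules `lhs -> target` and rank them by lift.

    Returns tuples (lhs, support, confidence, lift). Brute force over
    all antecedents of up to `max_len` items, fine for toy data only.
    """
    transactions = [frozenset(t) for t in transactions]
    base = support([target], transactions)
    if base == 0:
        return []  # the target never occurs, so no rule can be scored
    items = sorted({i for t in transactions for i in t} - {target})
    found = []
    for k in range(1, max_len + 1):
        for lhs in combinations(items, k):
            s_lhs = support(lhs, transactions)
            s_both = support(lhs + (target,), transactions)
            if s_lhs == 0 or s_both < min_support:
                continue
            confidence = s_both / s_lhs
            if confidence >= min_confidence:
                # Lift > 1 means the target is more frequent when lhs holds.
                found.append((lhs, s_both, confidence, confidence / base))
    return sorted(found, key=lambda r: r[3], reverse=True)


# Invented toy records with generic labels (not data from the study).
records = [
    {"factor_a", "factor_b", "outcome"},
    {"factor_a", "outcome"},
    {"factor_b"},
    {"factor_a", "factor_b", "outcome"},
    {"factor_b"},
]
ranked = rules_for_target(records, "outcome")
```

    Restricting the consequent to a single target item mirrors the risk-factor use case, where only rules that end in the outcome of interest are inspected.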

  17. Image Mining in Remote Sensing for Coastal Wetlands Mapping: from Pixel Based to Object Based Approach

    NASA Astrophysics Data System (ADS)

    Farda, N. M.; Danoedoro, P.; Hartono; Harjoko, A.

    2016-11-01

    Remote sensing image data are now abundant, and the sheer volume creates a “knowledge gap” in extracting selected information, especially on coastal wetlands. Coastal wetlands provide ecosystem services essential to people and the environment. The aim of this research is to extract coastal wetland information from satellite data using pixel-based and object-based image mining approaches. Landsat MSS, Landsat 5 TM, Landsat 7 ETM+, and Landsat 8 OLI images of the Segara Anakan lagoon were selected to represent multi-temporal data. The inputs for image mining are visible and near-infrared bands, PCA band, inverse PCA bands, mean shift segmentation bands, bare soil index, vegetation index, wetness index, elevation from SRTM and ASTER GDEM, and GLCM (Haralick) or variability texture. Three methods were applied to extract coastal wetlands using image mining: pixel-based Decision Tree C4.5, pixel-based Back Propagation Neural Network, and object-based Mean Shift segmentation with Decision Tree C4.5. The results show that remote sensing image mining can be used to map coastal wetland ecosystems. Decision Tree C4.5 gave the highest mapping accuracy (overall kappa of 0.75). Remote sensing image mining for mapping coastal wetlands is important for a better understanding of their spatiotemporal dynamics and distribution.

  18. Model of environmental life cycle assessment for coal mining operations.

    PubMed

    Burchart-Korol, Dorota; Fugiel, Agata; Czaplicka-Kolarz, Krystyna; Turek, Marian

    2016-08-15

    This paper presents a novel approach to environmental assessment of coal mining operations, which enables assessment of the factors that are both directly and indirectly affecting the environment and are associated with the production of raw materials and energy used in processes. The primary novelty of the paper is the development of a computational environmental life cycle assessment (LCA) model for coal mining operations and the application of the model for coal mining operations in Poland. The LCA model enables the assessment of environmental indicators for all identified unit processes in hard coal mines with the life cycle approach. The proposed model enables the assessment of greenhouse gas emissions (GHGs) based on the IPCC method and the assessment of damage categories, such as human health, ecosystems and resources based on the ReCiPe method. The model enables the assessment of GHGs for hard coal mining operations in three time frames: 20, 100 and 500 years. The model was used to evaluate the coal mines in Poland. It was demonstrated that the largest environmental impacts in damage categories were associated with the use of fossil fuels, methane emissions and the use of electricity, processing of wastes, heat, and steel supports. It was concluded that an environmental assessment of coal mining operations, apart from direct influence from processing waste, methane emissions and drainage water, should include the use of electricity, heat and steel, particularly for steel supports. Because the model allows the comparison of environmental impact assessment for various unit processes, it can be used for all hard coal mines, not only in Poland but also in the world. This development is an important step forward in the study of the impacts of fossil fuels on the environment with the potential to mitigate the impact of the coal industry on the environment. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. Landscape Character of Pongkor Mining Ecotourism Area

    NASA Astrophysics Data System (ADS)

    Kusumoarto, A.; Gunawan, A.; Machfud; Hikmat, A.

    2017-10-01

    The Pongkor Mining Ecotourism Area has a diverse landscape character that is a potential landscape resource for the development of an ecotourism destination. The area is part of the Mount of Botol Resort, Halimun Salak National Park (HSNP), and also has fairly high biodiversity. This study aims to identify and analyse the categories of landscape character in the Pongkor Mining Ecotourism Area for the development of an ecotourism destination. The study used a descriptive approach based on field surveys and interviews and was carried out in two steps: 1) identification of the landscape character, and 2) analysis of the landscape character. The results showed that, in the areas set aside as an ecotourism destination in Pongkor Mining, the landscape character categories are forest, tailing ponds, river, plain, and the built environment. The most dominant category of landscape character in the area is forest, followed by river, plain, tailing ponds, and the built environment. Landscape character in a natural environment is the most preferred for ecotourism activities. The landscape character found across the natural and built environments is a potential that must be protected and modified, for example through elimination of incongruous elements, accentuation of natural form, alteration of natural form, and intensification and enhancement of visual quality, in order to be developed as an ecotourism destination area.

  20. A Tools-Based Approach to Teaching Data Mining Methods

    ERIC Educational Resources Information Center

    Jafar, Musa J.

    2010-01-01

    Data mining is an emerging field of study in Information Systems programs. Although the course content has been streamlined, the underlying technology is still in a state of flux. The purpose of this paper is to describe how we utilized Microsoft Excel's data mining add-ins as a front-end to Microsoft's Cloud Computing and SQL Server 2008 Business…

  1. Can abstract screening workload be reduced using text mining? User experiences of the tool Rayyan.

    PubMed

    Olofsson, Hanna; Brolund, Agneta; Hellberg, Christel; Silverstein, Rebecca; Stenström, Karin; Österberg, Marie; Dagerhamn, Jessica

    2017-09-01

    One time-consuming aspect of conducting systematic reviews is the task of sifting through abstracts to identify relevant studies. One promising approach for reducing this burden uses text mining technology to identify those abstracts that are potentially most relevant for a project, allowing those abstracts to be screened first. To examine the effectiveness of the text mining functionality of the abstract screening tool Rayyan. User experiences were collected. Rayyan was used to screen abstracts for 6 reviews in 2015. After screening 25%, 50%, and 75% of the abstracts, the screeners logged the relevant references identified. A survey was sent to users. After screening half of the search result with Rayyan, 86% to 99% of the references deemed relevant to the study were identified. Of those studies included in the final reports, 96% to 100% were already identified in the first half of the screening process. Users rated Rayyan 4.5 out of 5. The text mining function in Rayyan successfully helped reviewers identify relevant studies early in the screening process. Copyright © 2017 John Wiley & Sons, Ltd.
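
    The headline figure in this abstract (the share of relevant references already found after screening a given fraction of the ranked abstracts) is a recall-at-cutoff measure. A minimal sketch of that calculation follows; the function name and the toy relevance labels are hypothetical and are not taken from the study or from Rayyan itself.

```python
def recall_at_fraction(ranked_labels, fraction):
    """Share of all relevant items found in the first `fraction` of a ranking.

    `ranked_labels` holds 1 (relevant) or 0 (not relevant) in the order
    the abstracts were screened, e.g. the order proposed by a text
    mining tool. The cutoff is rounded down to a whole abstract.
    """
    total_relevant = sum(ranked_labels)
    if total_relevant == 0:
        return 0.0
    cutoff = int(len(ranked_labels) * fraction)
    return sum(ranked_labels[:cutoff]) / total_relevant


# Invented screening order: relevant abstracts cluster near the top.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
recall_by_stage = {f: recall_at_fraction(labels, f) for f in (0.25, 0.5, 0.75)}
```

    With a random screening order the expected recall at 50% would be about 0.5, so values well above that indicate that the ranking is moving relevant abstracts forward.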

  2. Hazards identified and the need for health risk assessment in the South African mining industry.

    PubMed

    Utembe, W; Faustman, E M; Matatiele, P; Gulumian, M

    2015-12-01

    Although mining plays a prominent role in the economy of South Africa, it is associated with many chemical hazards. Exposure to dust from mining can lead to many pathological effects depending on mineralogical composition, size, shape and levels and duration of exposure. Mining and processing of minerals also result in occupational exposure to toxic substances such as platinum, chromium, vanadium, manganese, mercury, cyanide and diesel particulate. South Africa has set occupational exposure limits (OELs) for some hazards, but mine workers are still at a risk. Since the hazard posed by a mineral depends on its physiochemical properties, it is recommended that South Africa should not simply adopt OELs from other countries but rather set her own standards based on local toxicity studies. The limits should take into account the issue of mixtures to which workers could be exposed as well as the health status of the workers. The mining industry is also a source of contamination of the environment, due inter alia to the large areas of tailings dams and dumps left behind. Therefore, there is need to develop guidelines for safe land-uses of contaminated lands after mine closure. © The Author(s) 2015.

  3. Application of ERTS-1 imagery to fracture related mine safety hazards in the coal mining industry. [Indiana]

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.; Leshendok, T. V.

    1974-01-01

    The author has identified the following significant results. New fracture detail of Indiana has been observed and mapped from ERTS-1 imagery. Studies so far indicate a close relationship between the directions of fracture traces mapped from the imagery, fractures measured on bedrock outcrops, and fractures measured in the underground mines. First hand observations and discussions with underground mine operators indicate good correlation of mine hazard maps prepared from ERTS-1/aircraft imagery and actual roof falls. The inventory of refuse piles/slurry ponds of the coal field of Indiana has identified over 225 such sites from past mining operations. These data will serve the State Legislature in making tax decisions on coal mining which take on increased importance because of the energy crisis.

  4. Incorporating ecosystem services into environmental management of deep-seabed mining

    NASA Astrophysics Data System (ADS)

    Le, Jennifer T.; Levin, Lisa A.; Carson, Richard T.

    2017-03-01

    Accelerated exploration of minerals in the deep sea over the past decade has raised the likelihood that commercial mining of the deep seabed will commence in the near future. Environmental concerns create a growing urgency for development of environmental regulations under commercial exploitation. Here, we consider an ecosystem services approach to the environmental policy and management of deep-sea mineral resources. Ecosystem services link the environment and human well-being, and can help improve sustainability and stewardship of the deep sea by providing a quantitative basis for decision-making. This paper briefly reviews ecosystem services provided by habitats targeted for deep-seabed mining (hydrothermal vents, seamounts, nodule provinces, and phosphate-rich margins), and presents practical steps to incorporate ecosystem services into deep-seabed mining regulation. The linkages and translation between ecosystem structure, ecological function (including supporting services), and ecosystem services are highlighted as generating human benefits. We consider criteria for identifying which ecosystem services are vulnerable to potential mining impacts, the role of ecological functions in providing ecosystem services, development of ecosystem service indicators, valuation of ecosystem services, and implementation of ecosystem services concepts. The first three steps put ecosystem services into a deep-seabed mining context; the last two steps help to incorporate ecosystem services into a management and decision-making framework. Phases of environmental planning discussed in the context of ecosystem services include conducting strategic environmental assessments, collecting baseline data, monitoring, establishing marine protected areas, assessing cumulative impacts, identifying thresholds and triggers, and creating an environmental damage compensation regime. We also identify knowledge gaps that need to be addressed in order to operationalize ecosystem services

  5. Miners' Misconceptions of Flow Distribution Within Circuits as a Factor Influencing Underground Mining Accidents.

    NASA Astrophysics Data System (ADS)

    Passaro, Perry David

    Misconceptions can be thought of as naive approaches to problem solving that are perceptually appealing but incorrect and inconsistent with scientific evidence (Piaget, 1929). One type of misconception involves flow distributions within circuits. This concept is important because miners' conceptual errors about flow distribution changes within complex circuits may be in part responsible for fatal mine disasters. Based on the theory that misconceptions of flow distribution changes within circuits were responsible for underground mine disasters involving mine ventilation circuits, a series of studies was undertaken with mining engineering students, professional mining engineers, as well as mine foremen, mine supervisors, mine rescue members, mine maintenance personnel, mining researchers and working miners to identify these conceptual errors and errors in mine ventilation procedures. Results indicate that misconceptions of flow distribution changes within circuits exist in over 70 percent of the subjects sampled. It is assumed that these misconceptions of flow distribution changes within circuits result in errors of judgment when miners are faced with inferring and changing ventilation arrangements when two or more mine sections are connected. Furthermore, it is assumed that these misconceptions are pervasive in the mining industry and may be responsible for at least two mine ventilation disasters. The findings of this study are consistent with Piaget's (1929) model of figurative and operative knowledge. This model states that misconceptions are in part due to a lack of knowledge of dynamic transformations and how to apply content information. Recommendations for future research include the development of an interactive expert system for training miners with ventilation arrangements. Such a system would meet the educational recommendations made by Piaget (1973b) by involving a hands-on approach that allows discovery, interaction, the opportunity to make mistakes and

  6. Management of the water balance and quality in mining areas

    NASA Astrophysics Data System (ADS)

    Pasanen, Antti; Krogerus, Kirsti; Mroueh, Ulla-Maija; Turunen, Kaisa; Backnäs, Soile; Vento, Tiia; Veijalainen, Noora; Hentinen, Kimmo; Korkealaakso, Juhani

    2015-04-01

    Although mining companies have long been conscious of water related risks they still face environmental management problems. These problems mainly emerge because mine sites' water balances have not been adequately assessed in the stage of the planning of mines. A more consistent approach is required to help mining companies identify risks and opportunities related to the management of water resources in all stages of mining. This approach requires that the water cycle of a mine site is interconnected with the general hydrologic water cycle. In addition to knowledge on hydrological conditions, the control of the water balance in the mining processes requires knowledge of mining processes, the ability to adjust process parameters to variable hydrological conditions, adaptation of suitable water management tools and systems, systematic monitoring of amounts and quality of water, adequate capacity in water management infrastructure to handle the variable water flows, best practices to assess the dispersion, mixing and dilution of mine water and pollutant loading to receiving water bodies, and dewatering and separation of water from tailing and precipitates. The WaterSmart project aims to improve the awareness of actual quantities of water, and water balances in mine areas to improve the forecasting and the management of the water volumes. The study is executed through hydrogeological and hydrological surveys and online monitoring procedures. One of the aims is to exploit on-line water quantity and quality monitoring for the better management of the water balances. The target is to develop practical and end-user-specific on-line input and output procedures. The second objective is to develop mathematical models to calculate combined water balances including the surface, ground and process waters. WSFS, the Hydrological Modeling and Forecasting System of SYKE is being modified for mining areas. New modelling tools are developed on spreadsheet and system dynamics platforms to

  7. A science-based, watershed strategy to support effective remediation of abandoned mine lands

    USGS Publications Warehouse

    Buxton, Herbert T.; Nimick, David A.; Von Guerard, Paul; Church, Stan E.; Frazier, Ann G.; Gray, John R.; Lipin, Bruce R.; Marsh, Sherman P.; Woodward, Daniel F.; Kimball, Briant A.; Finger, Susan E.; Ischinger, Lee S.; Fordham, John C.; Power, Martha S.; Bunch, Christine M.; Jones, John W.

    1997-01-01

    A U.S. Geological Survey Abandoned Mine Lands Initiative will develop a strategy for gathering and communicating the scientific information needed to formulate effective and cost-efficient remediation of abandoned mine lands. A watershed approach will identify, characterize, and remediate contaminated sites that have the most profound effect on water and ecosystem quality within a watershed. The Initiative will be conducted during 1997 through 2001 in two pilot watersheds, the Upper Animas River watershed in Colorado and the Boulder River watershed in Montana. Initiative efforts are being coordinated with the U.S. Forest Service, Bureau of Land Management, National Park Service, and other stakeholders which are using the resulting scientific information to design and implement remediation activities. The Initiative has the following eight objective-oriented components: estimate background (pre-mining) conditions; define baseline (current) conditions; identify target sites (major contaminant sources); characterize target sites and processes affecting contaminant dispersal; characterize ecosystem health and controlling processes at target sites; develop remediation goals and monitoring network; provide an integrated, quality-assured and accessible data network; and document lessons learned for future applications of the watershed approach.

  8. Distributed design approach in persistent identifiers systems

    NASA Astrophysics Data System (ADS)

    Golodoniuc, Pavel; Car, Nicholas; Klump, Jens

    2017-04-01

    The need to identify both digital and physical objects is ubiquitous in our society. Past and present persistent identifier (PID) systems, of which there is a great variety in terms of technical and social implementations, have evolved with the advent of the Internet, which has allowed for globally unique and globally resolvable identifiers. PID systems have catered for identifier uniqueness, integrity, persistence, and trustworthiness, regardless of the identifier's application domain, the scope of which has expanded significantly in the past two decades. Since many PID systems have been largely conceived and developed by small communities, or even a single organisation, they have faced challenges in gaining widespread adoption and, most importantly, the ability to survive change of technology. This has left a legacy of identifiers that still exist and are being used but which have lost their resolution service. We believe that one of the causes of once successful PID systems fading is their reliance on a centralised technical infrastructure or a governing authority. Golodoniuc et al. (2016) proposed an approach to the development of PID systems that combines the use of (a) the Handle system, as a distributed system for the registration and first-degree resolution of persistent identifiers, and (b) the PID Service (Golodoniuc et al., 2015), to enable fine-grained resolution to different information object representations. The proposed approach solved the problem of guaranteed first-degree resolution of identifiers, but left fine-grained resolution and information delivery under the control of a single authoritative source, posing risk to the long-term availability of information resources. Herein, we develop these approaches further and explore the potential of large-scale decentralisation at all levels: (i) persistent identifiers and information resources registration; (ii) identifier resolution; and (iii) data delivery. To achieve large-scale decentralisation

  9. Using Data Mining to Detect Health Care Fraud and Abuse: A Review of Literature

    PubMed Central

    Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad

    2015-01-01

    Inappropriate payments by insurance organizations or third-party payers occur because of errors, abuse and fraud. The scale of this problem is large enough to make it a priority issue for health systems. Traditional methods of detecting health care fraud and abuse are time-consuming and inefficient. Combining automated methods and statistical knowledge led to the emergence of a new interdisciplinary branch of science named Knowledge Discovery from Databases (KDD). Data mining is the core of the KDD process. Data mining can help third-party payers such as health insurance organizations to extract useful information from thousands of claims and identify a smaller subset of the claims or claimants for further assessment. We reviewed studies that applied data mining techniques to detect health care fraud and abuse, using supervised and unsupervised data mining approaches. Most available studies have focused on algorithmic data mining without an emphasis on or application to fraud detection efforts in the context of health service provision or health insurance policy. More studies are needed to connect sound and evidence-based diagnosis and treatment approaches toward fraudulent or abusive behaviors. Ultimately, based on available studies, we recommend seven general steps to data mining of health care claims. PMID:25560347

  10. Application of EREP imagery to fracture-related mine safety hazards and environmental problems in mining. [Indiana]

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J.; Amato, R. V.; Russell, O. R. (Principal Investigator)

    1974-01-01

    The author has identified the following significant results. All Skylab 2 imagery received to date has been analyzed manually and data related to fracture analysis and mined land inventories has been summarized on map-overlays. A comparison of the relative utility of the Skylab image products for fracture detection, soil tone/vegetation contrast mapping, and mined land mapping has been completed. Numerous fracture traces were detected on both color and black and white transparencies. Unique fracture trace data which will contribute to the investigator's mining hazards analysis were noted on the EREP imagery; these data could not be detected on ERTS-1 imagery or high altitude aircraft color infrared photography. Stream segments controlled by fractures or joint systems could be identified in more detail than with ERTS-1 imagery of comparable scale. ERTS-1 mine hazards products will be modified to demonstrate the value of this additional data. Skylab images were used successfully to update a mined land map of Indiana made in 1972. Changes in mined area as small as two acres can be identified. As the Energy Crisis increases the demand for coal, such demonstrations of the application of Skylab data to coal resources will take on new importance.

  11. The Forestry Reclamation Approach: guide to successful reforestation of mined lands

    Treesearch

    Mary Beth Adams

    2017-01-01

    Appalachian forests are among the most productive and diverse in the world. The land underlying them is also rich in coal, and surface mines operated on more than 2.4 million acres in the region from 1977, when the federal Surface Mining Control and Reclamation Act was passed, through 2015. Many efforts to reclaim mined lands most often resulted in the establishment of...

  12. Annotating images by mining image search results.

    PubMed

    Wang, Xin-Jing; Zhang, Lei; Li, Xirong; Ma, Wei-Ying

    2008-11-01

    Although it has been studied for years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we propose a novel attempt at model-free image annotation, which is a data-driven approach that annotates images by mining their search results. Some 2.4 million images with their surrounding text are collected from a few photo forums to support this approach. The entire process is formulated in a divide-and-conquer framework where a query keyword is provided along with the uncaptioned image to improve both the effectiveness and efficiency. This is helpful when the collected data set is not dense everywhere. In this sense, our approach contains three steps: 1) the search process to discover visually and semantically similar search results, 2) the mining process to identify salient terms from textual descriptions of the search results, and 3) the annotation rejection process to filter out noisy terms yielded by Step 2. To ensure real-time annotation, two key techniques are leveraged: one is to map the high-dimensional image visual features into hash codes, the other is to implement it as a distributed system, of which the search and mining processes are provided as Web services. As a typical result, the entire process finishes in less than 1 second. Since no training data set is required, our approach enables annotating with unlimited vocabulary and is highly scalable and robust to outliers. Experimental results on both real Web images and a benchmark image data set show the effectiveness and efficiency of the proposed algorithm. It is also worth noting that, although the entire approach is illustrated within the divide-and-conquer framework, a query keyword is not crucial to our current implementation. We provide experimental results to prove this.
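
    Illustrative sketch (not from the paper): the abstract does not say which hashing scheme maps visual features to hash codes, so the example below swaps in sign random projection, a common locality-sensitive hashing technique, purely as a stand-in. The feature vectors, dimension, bit count and function names are all made up for illustration.

```python
import random

def make_hasher(dim, n_bits, seed=0):
    """Build a sign-random-projection hasher: each bit records which side
    of a random hyperplane the feature vector falls on."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def hash_vector(vec):
        code = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            code = (code << 1) | (1 if dot > 0 else 0)
        return code

    return hash_vector

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")

hasher = make_hasher(dim=8, n_bits=16)
image_a = [0.9, 0.1, 0.4, 0.0, 0.7, 0.2, 0.3, 0.5]       # made-up feature vector
image_b = [0.88, 0.12, 0.41, 0.0, 0.69, 0.2, 0.31, 0.5]  # made-up near-duplicate
distance = hamming(hasher(image_a), hasher(image_b))      # small distance suggests similar images
```

    Comparing short integer codes by Hamming distance is much cheaper than comparing high-dimensional vectors, which is the usual motivation for hashing in a real-time search step.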

  13. A Novel Biclustering Approach to Association Rule Mining for Predicting HIV-1–Human Protein Interactions

    PubMed Central

    Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra

    2012-01-01

    Identification of potential viral-host protein interactions is a vital and useful approach towards development of new drugs targeting those interactions. In recent days, computational tools are being utilized for predicting viral-host interactions. Recently a database containing records of experimentally validated interactions between a set of HIV-1 proteins and a set of human proteins has been published. The problem of predicting new interactions based on this database is usually posed as a classification problem. However, posing the problem as a classification one suffers from the lack of biologically validated negative interactions. Therefore it will be beneficial to use the existing database for predicting new viral-host interactions without the need for negative samples. Motivated by this, in this article, the HIV-1–human protein interaction database has been analyzed using association rule mining. The main objective is to identify a set of association rules both among the HIV-1 proteins and among the human proteins, and use these rules for predicting new interactions. In this regard, a novel association rule mining technique based on biclustering has been proposed for discovering frequent closed itemsets followed by the association rules from the adjacency matrix of the HIV-1–human interaction network. Novel HIV-1–human interactions have been predicted based on the discovered association rules and tested for biological significance. For validation of the predicted new interactions, gene ontology-based and pathway-based studies have been performed. These studies show that the human proteins which are predicted to interact with a particular viral protein share many common biological activities. Moreover, a literature survey has been used for validation purposes to identify some predicted interactions that are already validated experimentally but not present in the database. Comparison with other prediction methods is also discussed. PMID:22539940
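
    Illustrative sketch (not the authors' biclustering algorithm): to show what association rules over an interaction matrix look like, the example below brute-forces single-item rules by support and confidence on a toy data set. The protein names, interactions and thresholds are invented.

```python
from itertools import permutations

# Toy adjacency data: each hypothetical human protein (row) maps to the
# set of HIV-1 proteins it interacts with. All entries are made up.
interactions = {
    "H1": {"tat", "nef"},
    "H2": {"tat", "nef", "env"},
    "H3": {"tat", "env"},
    "H4": {"nef"},
}

def support(itemset, rows):
    """Fraction of rows that contain every item in `itemset`."""
    return sum(itemset <= row for row in rows) / len(rows)

def mine_rules(data, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) tuples for
    all single-item rules a -> b that meet both thresholds."""
    rows = list(data.values())
    items = sorted(set().union(*rows))
    rules = []
    for a, b in permutations(items, 2):
        supp_ab = support({a, b}, rows)
        supp_a = support({a}, rows)
        if supp_a == 0:
            continue
        conf = supp_ab / supp_a
        if supp_ab >= min_support and conf >= min_confidence:
            rules.append((a, b, supp_ab, conf))
    return rules

rules = mine_rules(interactions)
for a, b, supp, conf in rules:
    print(f"{a} -> {b}  support={supp:.2f}  confidence={conf:.2f}")
```

    A rule such as env -> tat would suggest that a human protein known to interact with env is a candidate interactor for tat; no negative examples are needed to derive it.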

  14. From data towards knowledge: revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data.

    PubMed

    Lu, Songjian; Jin, Bo; Cowart, L Ashley; Lu, Xinghua

    2013-01-01

    Genetic and pharmacological perturbation experiments, such as deleting a gene and monitoring gene expression responses, are powerful tools for studying cellular signal transduction pathways. However, it remains a challenge to automatically derive knowledge of a cellular signaling system at a conceptual level from systematic perturbation-response data. In this study, we explored a framework that unifies knowledge mining and data mining towards the goal. The framework consists of the following automated processes: 1) applying an ontology-driven knowledge mining approach to identify functional modules among the genes responding to a perturbation in order to reveal potential signals affected by the perturbation; 2) applying a graph-based data mining approach to search for perturbations that affect a common signal; and 3) revealing the architecture of a signaling system by organizing signaling units into a hierarchy based on their relationships. Applying this framework to a compendium of yeast perturbation-response data, we have successfully recovered many well-known signal transduction pathways; in addition, our analysis has led to many new hypotheses regarding the yeast signal transduction system; finally, our analysis automatically organized perturbed genes as a graph reflecting the architecture of the yeast signaling system. Importantly, this framework transformed molecular findings from a gene level to a conceptual level, which can be readily translated into computable knowledge in the form of rules regarding the yeast signaling system, such as "if genes involved in the MAPK signaling are perturbed, genes involved in pheromone responses will be differentially expressed."

  15. The adaptive approach for storage assignment by mining data of warehouse management system for distribution centres

    NASA Astrophysics Data System (ADS)

    Ming-Huang Chiang, David; Lin, Chia-Ping; Chen, Mu-Chen

    2011-05-01

    Among distribution centre operations, order picking has been reported to be the most labour-intensive activity. Sophisticated storage assignment policies adopted to reduce the travel distance of order picking have been explored in the literature. Unfortunately, previous research has been devoted to locating entire products from scratch. Instead, this study proposes an adaptive approach, a Data Mining-based Storage Assignment approach (DMSA), to find the optimal storage assignment for newly delivered products that need to be put away when there is vacant shelf space in a distribution centre. In the DMSA, a new association index (AIX) is developed to evaluate the fitness between the put away products and the unassigned storage locations by applying association rule mining. With AIX, the storage location assignment problem (SLAP) can be formulated and solved as a binary integer programming problem. To evaluate the performance of DMSA, a real-world order database of a distribution centre is obtained and used to compare the results from DMSA with a random assignment approach. It turns out that DMSA outperforms random assignment as the number of put away products and the proportion of put away products with high turnover rates increase.

  16. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    ERIC Educational Resources Information Center

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  17. Proceedings: Fourth Workshop on Mining Scientific Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, C

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of

  18. MELODI: Mining Enriched Literature Objects to Derive Intermediates

    PubMed Central

    Elsworth, Benjamin; Dawe, Karen; Vincent, Emma E; Langdon, Ryan; Lynch, Brigid M; Martin, Richard M; Relton, Caroline; Higgins, Julian P T; Gaunt, Tom R

    2018-01-01

    Background: The scientific literature contains a wealth of information from different fields on potential disease mechanisms. However, identifying and prioritizing mechanisms for further analytical evaluation presents enormous challenges in terms of the quantity and diversity of published research. The application of data mining approaches to the literature offers the potential to identify and prioritize mechanisms for more focused and detailed analysis. Methods: Here we present MELODI, a literature mining platform that can identify mechanistic pathways between any two biomedical concepts. Results: Two case studies demonstrate the potential uses of MELODI and how it can generate hypotheses for further investigation. First, an analysis of ETS-related gene ERG and prostate cancer derives the intermediate transcription factor SP1, recently confirmed to be physically interacting with ERG. Second, examining the relationship between a new potential risk factor for pancreatic cancer identifies possible mechanistic insights which can be studied in vitro. Conclusions: We have demonstrated the possible applications of MELODI, including two case studies. MELODI has been implemented as a Python/Django web application, and is freely available to use at [www.melodi.biocompute.org.uk]. PMID:29342271

  19. MELODI: Mining Enriched Literature Objects to Derive Intermediates.

    PubMed

    Elsworth, Benjamin; Dawe, Karen; Vincent, Emma E; Langdon, Ryan; Lynch, Brigid M; Martin, Richard M; Relton, Caroline; Higgins, Julian P T; Gaunt, Tom R

    2018-01-12

    The scientific literature contains a wealth of information from different fields on potential disease mechanisms. However, identifying and prioritizing mechanisms for further analytical evaluation presents enormous challenges in terms of the quantity and diversity of published research. The application of data mining approaches to the literature offers the potential to identify and prioritize mechanisms for more focused and detailed analysis. Here we present MELODI, a literature mining platform that can identify mechanistic pathways between any two biomedical concepts. Two case studies demonstrate the potential uses of MELODI and how it can generate hypotheses for further investigation. First, an analysis of ETS-related gene ERG and prostate cancer derives the intermediate transcription factor SP1, recently confirmed to be physically interacting with ERG. Second, examining the relationship between a new potential risk factor for pancreatic cancer identifies possible mechanistic insights which can be studied in vitro. We have demonstrated the possible applications of MELODI, including two case studies. MELODI has been implemented as a Python/Django web application, and is freely available to use at [www.melodi.biocompute.org.uk]. © The Author(s) 2018. Published by Oxford University Press on behalf of the International Epidemiological Association

  20. PubMedMiner: Mining and Visualizing MeSH-based Associations in PubMed.

    PubMed

    Zhang, Yucan; Sarkar, Indra Neil; Chen, Elizabeth S

    2014-01-01

    The exponential growth of biomedical literature provides the opportunity to develop approaches for facilitating the identification of possible relationships between biomedical concepts. Indexing by Medical Subject Headings (MeSH) represents high-quality summaries of much of this literature that can be used to support hypothesis generation and knowledge discovery tasks using techniques such as association rule mining. Based on a survey of literature mining tools, a tool implemented using Ruby and R, PubMedMiner, was developed in this study for mining and visualizing MeSH-based associations for a set of MEDLINE articles. To demonstrate PubMedMiner's functionality, a case study was conducted that focused on identifying and comparing comorbidities for asthma in children and adults. Relative to the tools surveyed, the initial results suggest that PubMedMiner provides complementary functionality for summarizing and comparing topics as well as identifying potentially new knowledge.

  1. Information mining over heterogeneous and high-dimensional time-series data in clinical trials databases.

    PubMed

    Altiparmak, Fatih; Ferhatosmanoglu, Hakan; Erdal, Selnur; Trost, Donald C

    2006-04-01

    An effective analysis of clinical trials data involves analyzing different types of data such as heterogeneous and high dimensional time series data. The current time series analysis methods generally assume that the series at hand have sufficient length to apply statistical techniques to them. Other ideal case assumptions are that data are collected in equal length intervals, and while comparing time series, the lengths are usually expected to be equal to each other. However, these assumptions are not valid for many real data sets, especially for the clinical trials data sets. In addition, the data sources are different from each other, the data are heterogeneous, and the sensitivity of the experiments varies by the source. Approaches for mining time series data need to be revisited, keeping the wide range of requirements in mind. In this paper, we propose a novel approach for information mining that involves two major steps: applying a data mining algorithm over homogeneous subsets of data, and identifying common or distinct patterns over the information gathered in the first step. Our approach is implemented specifically for heterogeneous and high dimensional time series clinical trials data. Using this framework, we propose a new way of utilizing frequent itemset mining, as well as clustering and declustering techniques with novel distance metrics for measuring similarity between time series data. By clustering the data, we find groups of analytes (substances in blood) that are most strongly correlated. Most of these relationships are already known and verified by the clinical panels, and, in addition, we identify novel groups that need further biomedical analysis. A slight modification to our algorithm results in an effective declustering of high dimensional time series data, which is then used for "feature selection." Using industry-sponsored clinical trials data sets, we are able to identify a small set of analytes that effectively models the state of normal health.

  2. Magnetic signature of overbank sediment in industry impacted floodplains identified by data mining methods

    NASA Astrophysics Data System (ADS)

    Chudaničová, Monika; Hutchinson, Simon M.

    2016-11-01

    Our study attempts to identify a characteristic magnetic signature of overbank sediments exhibiting anthropogenically induced magnetic enhancement and thereby to distinguish them from unenhanced sediments with weak magnetic background values, using a novel approach based on data mining methods, thus providing a means of rapid pollution determination. Data were obtained from 539 bulk samples from vertical profiles through overbank sediment, collected on seven rivers in the eastern Czech Republic and three rivers in northwest England. k-Means clustering and hierarchical clustering methods, paired group (UPGMA) and Ward's method, were used to divide the samples into natural groups according to their attributes. Interparametric ratios (SIRM/χ, SIRM/ARM and S-0.1T) were chosen as attributes for analyses, making the resultant model more widely applicable as magnetic concentration values can differ by two orders of magnitude. Division into three clusters appeared to be optimal and corresponded to inherent clusters in the data scatter. Clustering managed to separate samples with relatively weak anthropogenically induced enhancement, relatively strong anthropogenically induced enhancement and samples lacking enhancement. To describe the clusters explicitly and thus obtain a discrete magnetic signature, classification rules (JRip method) and decision trees (J4.8 and Simple Cart methods) were used. Samples lacking anthropogenic enhancement typically exhibited an S-0.1T < c. 0.5, SIRM/ARM < c. 150 and SIRM/χ < c. 6000 A m⁻¹. Samples with magnetic enhancement all exhibited an S-0.1T > 0.5. Samples with relatively stronger anthropogenic enhancement were unequivocally distinguished from the samples with weaker enhancement by an SIRM/ARM > c. 150. Samples with SIRM/ARM in a range c. 126-150 were classified as relatively strongly enhanced when their SIRM/χ > 18 000 A m⁻¹ and relatively less enhanced when their SIRM/χ < 18 000 A m⁻¹. An additional rule was arbitrarily added to exclude samples with
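
    Illustrative sketch: the approximate thresholds quoted in the abstract can be read as a small decision procedure. The function below is one possible encoding, not the authors' JRip or J4.8 model. The abstract is truncated, so the rule set is incomplete; the function name, the handling of values exactly on a threshold, and the label given to enhanced samples with SIRM/ARM below 126 are assumptions.

```python
def classify_sample(s_ratio, sirm_arm, sirm_chi):
    """Classify one sample from its S-0.1T ratio, SIRM/ARM ratio and
    SIRM/chi (in A/m), using the approximate thresholds in the abstract."""
    if s_ratio <= 0.5:
        # All enhanced samples had S-0.1T > 0.5 (boundary handling assumed).
        return "unenhanced"
    if sirm_arm > 150:
        return "strongly enhanced"
    if 126 <= sirm_arm <= 150:
        # Borderline band: SIRM/chi decides between the two enhanced classes.
        return "strongly enhanced" if sirm_chi > 18000 else "weakly enhanced"
    # Assumption: enhanced samples below the borderline band count as weak.
    return "weakly enhanced"

# Made-up measurements, for illustration only.
samples = [(0.3, 100, 5000), (0.8, 200, 10000), (0.8, 140, 20000), (0.8, 140, 10000)]
labels = [classify_sample(*s) for s in samples]
```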

  3. Selection of remedial alternatives for mine sites: a multicriteria decision analysis approach.

    PubMed

    Betrie, Getnet D; Sadiq, Rehan; Morin, Kevin A; Tesfamariam, Solomon

    2013-04-15

    The selection of remedial alternatives for mine sites is a complex task because it involves multiple criteria, often with conflicting objectives. However, an existing framework used to select remedial alternatives lacks multicriteria decision analysis (MCDA) aids and does not consider uncertainty in the selection of alternatives. The objective of this paper is to improve the existing framework by introducing deterministic and probabilistic MCDA methods. The Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE) methods have been implemented in this study. The MCDA analysis involves preparing inputs to the PROMETHEE methods, namely identifying the alternatives, defining the criteria, defining the criteria weights using analytical hierarchical process (AHP), defining the probability distribution of criteria weights, and conducting Monte Carlo Simulation (MCS); running the PROMETHEE methods using these inputs; and conducting a sensitivity analysis. A case study was presented to demonstrate the improved framework at a mine site. The results showed that the improved framework provides a reliable way of selecting remedial alternatives as well as quantifying the impact of different criteria on selecting alternatives. Copyright © 2013 Elsevier Ltd. All rights reserved.
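
    Illustrative sketch: a minimal deterministic PROMETHEE II ranking, assuming the "usual" preference function (an alternative is preferred on a criterion whenever it is strictly better) and fixed weights. The AHP weighting and Monte Carlo steps described in the abstract are not reproduced, and the alternatives, scores and weights are invented.

```python
def promethee_ii(scores, weights, maximise):
    """Net outranking flow per alternative, using the 'usual' preference
    function (1 if strictly better on a criterion, else 0)."""
    names = list(scores)
    n = len(names)

    def pref(a, b):
        # Weighted share of criteria on which a is strictly better than b.
        total = 0.0
        for j, w in enumerate(weights):
            diff = scores[a][j] - scores[b][j]
            if not maximise[j]:
                diff = -diff
            if diff > 0:
                total += w
        return total

    flows = {}
    for a in names:
        plus = sum(pref(a, b) for b in names if b != a) / (n - 1)
        minus = sum(pref(b, a) for b in names if b != a) / (n - 1)
        flows[a] = plus - minus
    return flows

# Hypothetical alternatives scored on (cost in M$, effectiveness 0-10).
scores = {
    "dry cover": (5.0, 6.0),
    "water treatment": (8.0, 9.0),
    "relocation": (12.0, 7.0),
}
flows = promethee_ii(scores, weights=(0.4, 0.6), maximise=(False, True))
ranking = sorted(flows, key=flows.get, reverse=True)
```

    Net flows always sum to zero, so they rank the alternatives relative to one another; re-running the function with weights drawn from a distribution would be one way to approximate the probabilistic variant.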

  4. Development and testing of a text-mining approach to analyse patients' comments on their experiences of colorectal cancer care.

    PubMed

    Wagland, Richard; Recio-Saucedo, Alejandra; Simon, Michael; Bracher, Michael; Hunt, Katherine; Foster, Claire; Downing, Amy; Glaser, Adam; Corner, Jessica

    2016-08-01

    Quality of cancer care may greatly impact on patients' health-related quality of life (HRQoL). Free-text responses to patient-reported outcome measures (PROMs) provide rich data but analysis is time and resource-intensive. This study developed and tested a learning-based text-mining approach to facilitate analysis of patients' experiences of care and develop an explanatory model illustrating impact on HRQoL. Respondents to a population-based survey of colorectal cancer survivors provided free-text comments regarding their experience of living with and beyond cancer. An existing coding framework was tested and adapted, which informed learning-based text mining of the data. Machine-learning algorithms were trained to identify comments relating to patients' specific experiences of service quality, which were verified by manual qualitative analysis. Comparisons between coded retrieved comments and a HRQoL measure (EQ5D) were explored. The survey response rate was 63.3% (21 802/34 467), of which 25.8% (n=5634) participants provided free-text comments. Of retrieved comments on experiences of care (n=1688), over half (n=1045, 62%) described positive care experiences. Most negative experiences concerned a lack of post-treatment care (n=191, 11% of retrieved comments) and insufficient information concerning self-management strategies (n=135, 8%) or treatment side effects (n=160, 9%). Associations existed between HRQoL scores and coded algorithm-retrieved comments. Analysis indicated that the mechanism by which service quality impacted on HRQoL was the extent to which services prevented or alleviated challenges associated with disease and treatment burdens. Learning-based text mining techniques were found useful and practical tools to identify specific free-text comments within a large dataset, facilitating resource-efficient qualitative analysis. This method should be considered for future PROM analysis to inform policy and practice. Study findings indicated that

  5. Stopping Antidepressants and Anxiolytics as Major Concerns Reported in Online Health Communities: A Text Mining Approach.

    PubMed

    Abbe, Adeline; Falissard, Bruno

    2017-10-23

    The Internet is a particularly dynamic way to quickly capture the perceptions of a population in real time. Complementary to traditional face-to-face communication, online social networks help patients to improve self-esteem and self-help. The aim of this study was to use text mining on material from an online forum exploring patients' concerns about treatment (antidepressants and anxiolytics). Concerns about treatment were collected from discussion titles in a patients' online community related to antidepressants and anxiolytics. To examine the content of these titles automatically, we used text mining methods, such as word frequency in a document-term matrix and co-occurrence of words using a network analysis. It was thus possible to identify topics discussed on the forum. The forum included 2415 discussions on antidepressants and anxiolytics over a period of 3 years. After a preprocessing step, the text mining algorithm identified the 99 most frequently occurring words in titles, among which were escitalopram, withdrawal, antidepressant, venlafaxine, paroxetine, and effect. Patients' concerns were related to antidepressant withdrawal, the need to share experience about symptoms, effects, and questions on weight gain with some drugs. Patients' expression on the Internet is a potential additional resource in addressing patients' concerns about treatment. Patient profiles are close to those of patients treated in psychiatry. ©Adeline Abbe, Bruno Falissard. Originally published in JMIR Mental Health (http://mental.jmir.org), 23.10.2017.
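
    Illustrative sketch: the two text mining steps named in the abstract, a document-term matrix and word co-occurrence counts, can be built with the standard library alone. The titles and stopword list below are invented, and the preprocessing is far simpler than anything a real study would use.

```python
from collections import Counter
from itertools import combinations

# Invented forum titles, for illustration only.
titles = [
    "escitalopram withdrawal symptoms",
    "venlafaxine withdrawal effect",
    "weight gain with paroxetine",
    "escitalopram weight gain",
]
STOPWORDS = {"with"}

def tokenize(title):
    """Lowercase, split on whitespace and drop stopwords."""
    return [w for w in title.lower().split() if w not in STOPWORDS]

docs = [tokenize(t) for t in titles]
vocab = sorted({w for d in docs for w in d})

# Document-term matrix: one row per title, one column per vocabulary word.
dtm = [[Counter(d)[w] for w in vocab] for d in docs]

# Word frequency across all titles.
freq = Counter(w for d in docs for w in d)

# Co-occurrence: number of titles in which two words appear together.
cooc = Counter()
for d in docs:
    for a, b in combinations(sorted(set(d)), 2):
        cooc[(a, b)] += 1
```

    The pairs in `cooc` are the weighted edges of a word network; frequently co-occurring pairs such as ("gain", "weight") point to a shared topic.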

  6. Text Mining for Neuroscience

    NASA Astrophysics Data System (ADS)

    Tirupattur, Naveen; Lapish, Christopher C.; Mukhopadhyay, Snehasis

    2011-06-01

    Text mining, sometimes alternately referred to as text analytics, refers to the process of extracting high-quality knowledge from the analysis of textual data. Text mining has a wide variety of applications in areas such as biomedical science, news analysis, and homeland security. In this paper, we describe an approach and some relatively small-scale experiments which apply text mining to neuroscience research literature to find novel associations among a diverse set of entities. Neuroscience is a discipline which encompasses an exceptionally wide range of experimental approaches and rapidly growing interest. This combination results in an overwhelmingly large and often diffuse literature which makes a comprehensive synthesis difficult. Understanding the relations or associations among the entities appearing in the literature not only improves the researchers' current understanding of recent advances in their field, but also provides an important computational tool to formulate novel hypotheses and thereby assist in scientific discoveries. We describe a methodology to automatically mine the literature and form novel associations through direct analysis of published texts. The method first retrieves a set of documents from databases such as PubMed using a set of relevant domain terms. In the current study these terms yielded document sets ranging from 160,909 to 367,214 documents. Each document is then represented in a numerical vector form from which an Association Graph is computed which represents relationships between all pairs of domain terms, based on co-occurrence. Association graphs can then be subjected to various graph theoretic algorithms such as transitive closure and cycle (circuit) detection to derive additional information, and can also be visually presented to a human researcher for understanding. In this paper, we present three relatively small-scale problem-specific case studies to demonstrate that such an approach is very successful in
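
    The pipeline described above (co-occurrence based association graph, then transitive closure) might be sketched roughly as follows. The term sets are hypothetical, and the simple count threshold `min_count` is an assumption; the abstract does not say how edges are weighted or thresholded.

```python
from itertools import combinations

# Hypothetical domain terms found in each document; real input would be
# derived from retrieved abstracts.
docs = [
    {"dopamine", "prefrontal cortex"},
    {"prefrontal cortex", "working memory"},
    {"hippocampus"},
]

def association_graph(doc_terms, min_count=1):
    """Link two terms when they co-occur in at least min_count documents."""
    counts = {}
    for terms in doc_terms:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    graph = {t: set() for terms in doc_terms for t in terms}
    for (a, b), c in counts.items():
        if c >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def transitive_closure(graph):
    """For each term, collect every other term reachable via one or more edges."""
    closure = {}
    for start in graph:
        seen = set()
        stack = list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        closure[start] = seen - {start}
    return closure

graph = association_graph(docs)
closure = transitive_closure(graph)
# Indirect associations: reachable, but never observed together in one document.
indirect = {t: closure[t] - graph[t] for t in graph}
```

    In this toy input, "dopamine" and "working memory" never co-occur directly, so the pair shows up only in `indirect`, which is the kind of candidate association the abstract calls novel.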

  7. A data mining approach to predict in situ chlorinated ethene detoxification potential

    NASA Astrophysics Data System (ADS)

    Lee, J.; Im, J.; Kim, U.; Loeffler, F. E.

    2015-12-01

    Despite major advances in physicochemical remediation technologies, in situ biostimulation and bioaugmentation treatment aimed at stimulating Dehalococcoides mccartyi (Dhc) reductive dechlorination activity remains a cornerstone approach to remedy sites impacted with chlorinated ethenes. In practice, selecting the best remedial strategy is challenging due to uncertainties associated with the microbiology (e.g., presence and activity of Dhc) and geochemical factors influencing Dhc activity. Extensive groundwater datasets collected over decades of monitoring exist, but have not been systematically analyzed. In the present study, geochemical and microbial data sets collected from 35 wells at 5 contaminated sites were used to develop a predictive empirical model using a machine learning algorithm (i) to rank the relative importance of parameters that affect in situ reductive dechlorination potential, and (ii) to provide recommendations for selecting the optimal remediation strategy at a specific site. Classification and regression tree (CART) analysis was applied, and a representative classification tree model was developed that allowed short-term prediction of dechlorination potential. Indirect indicators for low dissolved oxygen (e.g., low NO3- and NO2-, high Fe2+ and CH4) were the most influential factors for predicting dechlorination potential, followed by total organic carbon content (TOC) and Dhc cell abundance. These findings indicate that machine learning-based data mining techniques applied to groundwater monitoring data can lead to the development of predictive groundwater remediation models. A major need for improving the predictive capabilities of the data mining approach is a curated, up-to-date and comprehensive collection of groundwater monitoring data.
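
    As a rough illustration of the core step in CART classification, the sketch below picks the threshold on a single numeric feature that minimizes weighted Gini impurity. The dissolved oxygen values and class labels are invented for illustration; the study's actual features, data, and tree are not reproduced here.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split point between identical values
        thr = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [lab for v, lab in pairs if v <= thr]
        right = [lab for v, lab in pairs if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (thr, score)
    return best

# Hypothetical dissolved oxygen readings (mg/L) and dechlorination potential.
dissolved_oxygen = [0.2, 0.4, 0.5, 3.0, 4.5, 5.0]
potential = ["high", "high", "high", "low", "low", "low"]

threshold, impurity = best_threshold(dissolved_oxygen, potential)
```

    A full tree would apply this search to every feature at every node and recurse on the two resulting subsets; ranking features by how much they reduce impurity is one common way such models report variable importance.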

  8. Multiagent data warehousing and multiagent data mining for cerebrum/cerebellum modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Wen-Ran

    2002-03-01

    An algorithm named Neighbor-Miner is outlined for multiagent data warehousing and multiagent data mining. The algorithm is defined in an evolving dynamic environment with autonomous or semiautonomous agents. Instead of mining frequent itemsets from customer transactions, the new algorithm discovers new agents and mines agent associations in first-order logic from agent attributes and actions. While the Apriori algorithm uses frequency as an a priori threshold, the new algorithm uses agent similarity as a priori knowledge. The concept of agent similarity leads to the notions of agent cuboid, orthogonal multiagent data warehousing (MADWH), and multiagent data mining (MADM). Based on agent similarities and action similarities, Neighbor-Miner is proposed and illustrated in a MADWH/MADM approach to cerebrum/cerebellum modeling. It is shown that (1) semiautonomous neurofuzzy agents can be identified for uniped locomotion and gymnastic training based on attribute relevance analysis; (2) new agents can be discovered and agent cuboids can be dynamically constructed in an orthogonal MADWH, which resembles an evolving cerebrum/cerebellum system; and (3) dynamic motion laws can be discovered as association rules in first-order logic. Although examples in legged robot gymnastics are used to illustrate the basic ideas, the new approach is generally suitable for a broad category of data mining tasks where knowledge can be discovered collectively by a set of agents from a geographically or geometrically distributed but relevant environment, especially in scientific and engineering data environments.

  9. Automated Assessment of Patients' Self-Narratives for Posttraumatic Stress Disorder Screening Using Natural Language Processing and Text Mining.

    PubMed

    He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo

    2017-03-01

    Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms (decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model) were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
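
    The n-gram representation mentioned above can be sketched as a bag of unigram and bigram counts. The example sentence and the whitespace tokenizer are assumptions for illustration; the classifiers themselves (including the product score model) are not implemented here.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous runs of n tokens, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(text, max_n=2):
    """Count every n-gram of length 1..max_n in a whitespace-tokenized text."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, max_n + 1):
        feats.update(ngrams(tokens, n))
    return feats

# Hypothetical narrative fragment used only to show the feature shape.
features = ngram_features("I could not sleep I could not eat")
```

    Each key in `features` is a tuple, so unigrams such as `("sleep",)` and multigrams such as `("could", "not")` can be fed to a classifier as separate features.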

  10. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  11. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
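
    The two tasks distinguished above can be illustrated with a small sketch: repeated segments within one sequence (here, repeated k-mers) and a common segment between two sequences (here, the longest common substring by dynamic programming). These are simple textbook formulations chosen for clarity, not the index-based methods a string-mining chapter would likely favor for genome-scale data.

```python
from collections import defaultdict

def repeated_kmers(seq, k):
    """Intra-sequence similarity: k-mers occurring more than once, with positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}

def longest_common_substring(a, b):
    """Inter-sequence similarity: longest contiguous segment shared by a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]

repeats = repeated_kmers("ACGTACGA", 3)
shared = longest_common_substring("GATTACA", "TTACCA")
```

    The dynamic program runs in time proportional to the product of the two sequence lengths, which is acceptable for short strings only.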

  12. Mining Consumer Health Vocabulary from Community-Generated Text

    PubMed Central

    Vydiswaran, V.G. Vinod; Mei, Qiaozhu; Hanauer, David A.; Zheng, Kai

    2014-01-01

    Community-generated text corpora can be a valuable resource to extract consumer health vocabulary (CHV) and link them to professional terminologies and alternative variants. In this research, we propose a pattern-based text-mining approach to identify pairs of CHV and professional terms from Wikipedia, a large text corpus created and maintained by the community. A novel measure, leveraging the ratio of frequency of occurrence, was used to differentiate consumer terms from professional terms. We empirically evaluated the applicability of this approach using a large data sample consisting of MedLine abstracts and all posts from an online health forum, MedHelp. The results show that the proposed approach is able to identify synonymous pairs and label the terms as either consumer or professional term with high accuracy. We conclude that the proposed approach provides great potential to produce a high quality CHV to improve the performance of computational applications in processing consumer-generated health text. PMID:25954426

  13. IDENTIFYING RECENT SURFACE MINING ACTIVITIES USING A NORMALIZED DIFFERENCE VEGETATION INDEX (NDVI) CHANGE DETECTION METHOD

    EPA Science Inventory



    Coal mining is a major resource extraction activity in the Appalachian Mountains. The increased size and frequency of a specific type of surface mining, known as mountain top removal-valley fill, has in recent years raised various environmental concerns. During mountainto...

  14. Adaptive semantic tag mining from heterogeneous clinical research texts.

    PubMed

    Hao, T; Weng, C

    2015-01-01

    To develop an adaptive approach to mine frequent semantic tags (FSTs) from heterogeneous clinical research texts. We develop a "plug-n-play" framework that integrates replaceable unsupervised kernel algorithms with formatting, functional, and utility wrappers for FST mining. Temporal information identification and semantic equivalence detection were two example functional wrappers. We first compared this approach's recall and efficiency for mining FSTs from ClinicalTrials.gov to that of a recently published tag-mining algorithm. Then we assessed this approach's adaptability to two other types of clinical research texts: clinical data requests and clinical trial protocols, by comparing the prevalence trends of FSTs across three texts. Our approach increased the average recall and speed by 12.8% and 47.02% respectively upon the baseline when mining FSTs from ClinicalTrials.gov, and maintained an overlap in relevant FSTs with the baseline ranging between 76.9% and 100% for varying FST frequency thresholds. The FSTs saturated when the data size reached 200 documents. Consistent trends in the prevalence of FST were observed across the three texts as the data size or frequency threshold changed. This paper contributes an adaptive tag-mining framework that is scalable and adaptable without sacrificing its recall. This component-based architectural design can be potentially generalizable to improve the adaptability of other clinical text mining methods.

  15. SMM-system: A mining tool to identify specific markers in Salmonella enterica.

    PubMed

    Yu, Shuijing; Liu, Weibing; Shi, Chunlei; Wang, Dapeng; Dan, Xianlong; Li, Xiao; Shi, Xianming

    2011-03-01

    This report presents SMM-system, a software package that implements various personalized pre- and post-BLASTN tasks for mining specific markers of microbial pathogens. The main functionalities of SMM-system are summarized as follows: (i) converting multi-FASTA file, (ii) cutting interesting genomic sequence, (iii) automatic high-throughput BLASTN searches, and (iv) screening target sequences. The utility of SMM-system was demonstrated by using it to identify 214 Salmonella enterica-specific protein-coding sequences (CDSs). Eighteen primer pairs were designed based on eighteen S. enterica-specific CDSs, respectively. Seven of these primer pairs were validated with PCR assay, which showed 100% inclusivity for the 101 S. enterica genomes and 100% exclusivity of 30 non-S. enterica genomes. Three specific primer pairs were chosen to develop a multiplex PCR assay, which generated specific amplicons with a size of 180bp (SC1286), 238bp (SC1598) and 405bp (SC4361), respectively. This study demonstrates that SMM-system is a high-throughput specific marker generation tool that can be used to identify genus-, species-, serogroup- and even serovar-specific DNA sequences of microbial pathogens, which has a potential to be applied in food industries, diagnostics and taxonomic studies. SMM-system is freely available and can be downloaded from http://foodsafety.sjtu.edu.cn/SMM-system.html. Copyright © 2011 Elsevier B.V. All rights reserved.

  16. Investigation into the effect of infrastructure on fly-in fly-out mining workers.

    PubMed

    Perring, Adam; Pham, Kieu; Snow, Steve; Buys, Laurie

    2014-12-01

    To explore fly-in fly-out (FIFO) mining workers' attitudes towards the leisure time they spend in mining camps, the recreational and social aspects of mining camp culture, the camps' communal and recreational infrastructure and activities, and implications for health. In-depth semistructured interviews. Individual interviews at locations convenient for each participant. A total of seven participants, one female and six males. The age group varied within 20-59 years. Marital status varied across participants. A qualitative approach was used to interview participants, with responses thematically analysed. Findings highlight how the recreational infrastructure and activities at mining camps impact participants' enjoyment of the camps and their feelings of community and social inclusion. Three main areas of need were identified in the interviews, as follows: (i) on-site facilities and activities; (ii) the role of infrastructure in facilitating a sense of community; and (iii) barriers to social interaction. Recreational infrastructure and activities enhance the experience of FIFO workers at mining camps. The availability of quality recreational facilities helps promote social interaction, provides for greater social inclusion and improves the experience of mining camps for their temporary FIFO residents. The infrastructure also needs to allow for privacy and individual recreational activities, which participants identified as important emotional needs. Developing appropriate recreational infrastructure at mining camps would enhance social interactions among FIFO workers, improve their well-being and foster a sense of community. Introducing infrastructure to promote social and recreational activities could also reduce alcohol-related social exclusion. © 2014 National Rural Health Alliance Inc.

  17. Order batching in warehouses by minimizing total tardiness: a hybrid approach of weighted association rule mining and genetic algorithms.

    PubMed

    Azadnia, Amir Hossein; Taheri, Shahrooz; Ghadimi, Pezhman; Saman, Muhamad Zameri Mat; Wong, Kuan Yew

    2013-01-01

    One of the cost-intensive issues in managing warehouses is the order picking problem which deals with the retrieval of items from their storage locations in order to meet customer requests. Many solution approaches have been proposed in order to minimize traveling distance in the process of order picking. However, in practice, customer orders have to be completed by certain due dates in order to avoid tardiness, which is neglected in most of the related scientific papers. Consequently, we proposed a novel solution approach, consisting of four phases, in order to minimize tardiness. First of all, weighted association rule mining has been used to calculate associations between orders with respect to their due date. Next, a batching model based on binary integer programming has been formulated to maximize the associations between orders within each batch. Subsequently, in the order picking phase, a Genetic Algorithm integrated with the Traveling Salesman Problem has been used to identify the most suitable travel path. Finally, the Genetic Algorithm has been applied for sequencing the constructed batches in order to minimize tardiness. Illustrative examples and comparisons are presented to demonstrate the proficiency and solution quality of the proposed approach.

  18. Defining hazard from the mine worker's perspective

    PubMed Central

    Eiter, B.M.; Kosmoski, C.L.; Connor, B.P.

    2016-01-01

    In the recent past, the mining industry has witnessed a substantial increase in the numbers of fatalities occurring at metal and nonmetal mine sites, but it is unclear why this is occurring. One possible explanation is that workers struggle with identifying worksite hazards and accurately assessing the associated risk. The purpose of this research was to explore this possibility within the mining industry and to more fully understand stone, sand and gravel (SSG) mine workers' thoughts, understandings and perceptions of worksite hazards and risks. Eight mine workers were interviewed and asked to identify common hazards they come across when doing their jobs and to then discuss their perceptions of the risks associated with those identified hazards. The results of this exploratory study indicate the importance of workers' job-related experience as it applies to hazard identification and risk perception, particularly their knowledge of or familiarity with a task, whether or not they had personal control over that task, and the frequency with which they perform that task. PMID:28042176

  19. Automated data mining: an innovative and efficient web-based approach to maintaining resident case logs.

    PubMed

    Bhattacharya, Pratik; Van Stavern, Renee; Madhavan, Ramesh

    2010-12-01

    Use of resident case logs has been considered by the Residency Review Committee for Neurology of the Accreditation Council for Graduate Medical Education (ACGME). This study explores the effectiveness of a data-mining program for creating resident logs and compares the results to a manual data-entry system. Other potential applications of data mining to enhancing resident education are also explored. Patient notes dictated by residents were extracted from the Hospital Information System and analyzed using an unstructured mining program. History, examination and ICD codes were obtained and compared to the existing manual log. The automated history, examination, and ICD code data were gathered for a 30-day period and compared to manual case logs. The automated method extracted all resident dictations with the dates of encounter and transcription. The automated data-miner processed information from all 19 residents, while only 4 residents logged manually. The manual method identified only broad categories of diseases; the major categories were stroke or vascular disorder 53 (27.6%), epilepsy 28 (14.7%), and pain syndromes 26 (13.5%). In the automated method, epilepsy 114 (21.1%), cerebral atherosclerosis 114 (21.1%), and headache 105 (19.4%) were the most frequent primary diagnoses, and headache 89 (16.5%), seizures 94 (17.4%), and low back pain 47 (9%) were the most common chief complaints. More detailed patient information such as tobacco use 227 (42%), alcohol use 205 (38%), and drug use 38 (7%) was extracted by the data-mining method. Manual case logs are time-consuming, provide limited information, and may be unpopular with residents. Data mining is a time-effective tool that may aid in the assessment of resident experience or the ACGME core competencies or in resident clinical research. More study of this method in larger numbers of residency programs is needed.

  20. GROUNDWATER IMPACTED BY ACID MINE DRAINAGE

    EPA Science Inventory

    The generation and release of acidic, metal-rich water from mine wastes continues to be an intractable environmental problem. Although the effects of acid mine drainage (AMD) are most evident in surface waters, there is an obvious need for developing cost-effective approaches fo...

  1. Association rule mining in the US Vaccine Adverse Event Reporting System (VAERS).

    PubMed

    Wei, Lai; Scott, John

    2015-09-01

    Spontaneous adverse event reporting systems are critical tools for monitoring the safety of licensed medical products. Commonly used signal detection algorithms identify disproportionate product-adverse event pairs and may not be sensitive to more complex potential signals. We sought to develop a computationally tractable multivariate data-mining approach to identify product-multiple adverse event associations. We describe an application of stepwise association rule mining (Step-ARM) to detect potential vaccine-symptom group associations in the US Vaccine Adverse Event Reporting System. Step-ARM identifies strong associations between one vaccine and one or more adverse events. To reduce the number of redundant association rules found by Step-ARM, we also propose a clustering method for the post-processing of association rules. In sample applications to a trivalent intradermal inactivated influenza virus vaccine and to measles, mumps, rubella, and varicella (MMRV) vaccine and in simulation studies, we find that Step-ARM can detect a variety of medically coherent potential vaccine-symptom group signals efficiently. In the MMRV example, Step-ARM appears to outperform univariate methods in detecting a known safety signal. Our approach is sensitive to potentially complex signals, which may be particularly important when monitoring novel medical countermeasure products such as pandemic influenza vaccines. The post-processing clustering algorithm improves the applicability of the approach as a screening method to identify patterns that may merit further investigation. Copyright © 2015 John Wiley & Sons, Ltd.
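
    For readers unfamiliar with association rules, the sketch below computes the standard support, confidence, and lift of a rule "vaccine implies symptom group" over a toy set of reports. It is not Step-ARM, whose stepwise procedure the abstract does not detail; the report structure, vaccine codes, and symptoms are all hypothetical.

```python
# Hypothetical spontaneous reports; not VAERS data.
reports = [
    {"vaccine": "V1", "symptoms": {"fever", "rash"}},
    {"vaccine": "V1", "symptoms": {"fever", "rash", "seizure"}},
    {"vaccine": "V1", "symptoms": {"headache"}},
    {"vaccine": "V2", "symptoms": {"fever"}},
]

def rule_metrics(reports, vaccine, symptom_set):
    """Support, confidence, and lift for the rule: vaccine -> symptom_set."""
    n = len(reports)
    with_vaccine = [r for r in reports if r["vaccine"] == vaccine]
    with_symptoms = [r for r in reports if symptom_set <= r["symptoms"]]
    both = [r for r in with_vaccine if symptom_set <= r["symptoms"]]
    support = len(both) / n
    confidence = len(both) / len(with_vaccine) if with_vaccine else 0.0
    base_rate = len(with_symptoms) / n
    # Lift above 1 means the symptom group is reported more often with this
    # vaccine than across all reports.
    lift = confidence / base_rate if base_rate else 0.0
    return support, confidence, lift

support, confidence, lift = rule_metrics(reports, "V1", {"fever", "rash"})
```

    Allowing `symptom_set` to hold more than one symptom is what separates this multivariate view from the pairwise product-event disproportionality measures the abstract contrasts it with.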

  2. Redundancy and Novelty Mining in the Business Blogosphere

    ERIC Educational Resources Information Center

    Tsai, Flora S.; Chan, Kap Luk

    2010-01-01

    Purpose: The paper aims to explore the performance of redundancy and novelty mining in the business blogosphere, which has not been studied before. Design/methodology/approach: Novelty mining techniques are implemented to single out novel information out of a massive set of text documents. This paper adopted the mixed metric approach which…

  3. A Need for Systems Architecture Approach for Next Generation Mine Warfare Capability

    DTIC Science & Technology

    2006-09-01

    MRUUV Mission Reconfigurable Unmanned Undersea Vehicle MSC Mine Countermeasures Ship Coastal MSO Mine Countermeasures Ship Open-ocean P3I Preplanned...Helicopter, the Remote Mine Hunting System (RMS), the Mission Reconfigurable Unmanned Undersea Vehicle (MRUUV) and finally the Littoral Combat Ship (LCS...guarding against the sophisticated Soviet blue-water, air, and undersea threats. Yet since World War II, U.S. Naval Forces have suffered significantly

  4. Occupational respiratory diseases in the South African mining industry

    PubMed Central

    Nelson, Gill

    2013-01-01

    Background: Crystalline silica and asbestos are common minerals that occur throughout South Africa; exposure to either causes respiratory disease. Most studies on silicosis in South Africa have been cross-sectional and long-term trends have not been reported. Although much research has been conducted on the health effects of silica dust and asbestos fibre in the gold-mining and asbestos-mining sectors, little is known about their health effects in other mining sectors. Objective: The aims of this thesis were to describe silicosis trends in gold miners over three decades, and to explore the potential for diamond mine workers to develop asbestos-related diseases and platinum mine workers to develop silicosis. Methods: Mine workers for the three sub-studies were identified from a mine worker autopsy database at the National Institute for Occupational Health. Results: From 1975 to 2007, the proportions of white and black gold mine workers with silicosis increased from 18 to 22% and from 3 to 32% respectively. Cases of diamond and platinum mine workers with asbestos-related diseases and silicosis, respectively, were also identified. Conclusion: The trends in silicosis in gold miners at autopsy clearly demonstrate the failure of the gold mines to adequately control dust and prevent occupational respiratory disease. The two case series of diamond and platinum mine workers contribute to the evidence for the risk of asbestos-related diseases in diamond mine workers and silicosis in platinum mine workers, respectively. The absence of reliable environmental dust measurements and incomplete work history records impedes occupational health research in South Africa because it is difficult to identify and/or validate sources of dust exposure that may be associated with occupational respiratory disease. PMID:23364097

  5. Occupational respiratory diseases in the South African mining industry.

    PubMed

    Nelson, Gill

    2013-01-24

    Crystalline silica and asbestos are common minerals that occur throughout South Africa; exposure to either causes respiratory disease. Most studies on silicosis in South Africa have been cross-sectional and long-term trends have not been reported. Although much research has been conducted on the health effects of silica dust and asbestos fibre in the gold-mining and asbestos-mining sectors, little is known about their health effects in other mining sectors. The aims of this thesis were to describe silicosis trends in gold miners over three decades, and to explore the potential for diamond mine workers to develop asbestos-related diseases and platinum mine workers to develop silicosis. Mine workers for the three sub-studies were identified from a mine worker autopsy database at the National Institute for Occupational Health. From 1975 to 2007, the proportions of white and black gold mine workers with silicosis increased from 18 to 22% and from 3 to 32% respectively. Cases of diamond and platinum mine workers with asbestos-related diseases and silicosis, respectively, were also identified. The trends in silicosis in gold miners at autopsy clearly demonstrate the failure of the gold mines to adequately control dust and prevent occupational respiratory disease. The two case series of diamond and platinum mine workers contribute to the evidence for the risk of asbestos-related diseases in diamond mine workers and silicosis in platinum mine workers, respectively. The absence of reliable environmental dust measurements and incomplete work history records impedes occupational health research in South Africa because it is difficult to identify and/or validate sources of dust exposure that may be associated with occupational respiratory disease.

  6. Factors influencing mine rescue team behaviors.

    PubMed

    Jansky, Jacqueline H; Kowalski-Trakofler, K M; Brnich, M J; Vaught, C

    2016-01-01

    A focus group study of the first moments in an underground mine emergency response was conducted by the National Institute for Occupational Safety and Health (NIOSH), Office for Mine Safety and Health Research. Participants in the study included mine rescue team members, team trainers, mine officials, state mining personnel, and individual mine managers. A subset of the data consists of responses from participants with mine rescue backgrounds. These responses were noticeably different from those given by on-site emergency personnel who were at the mine and involved with decisions made during the first moments of an event. As a result, mine rescue team behavior data were separated in the analysis and are reported in this article. By considering the responses from mine rescue team members and trainers, it was possible to sort the data and identify seven key areas of importance to them. On the basis of the responses from the focus group participants with a mine rescue background, the authors concluded that accurate and complete information and a unity of purpose among all command center personnel are two of the key conditions needed for an effective mine rescue operation.

  7. Overview of mine drainage geochemistry at historical mines, Humboldt River basin and adjacent mining areas, Nevada. Chapter E.

    USGS Publications Warehouse

    Nash, J. Thomas; Stillings, Lisa L.

    2004-01-01

    Reconnaissance hydrogeochemical studies of the Humboldt River basin and adjacent areas of northern Nevada have identified local sources of acidic waters generated by historical mine workings and mine waste. The mine-related acidic waters are rare and generally flow less than a kilometer before being neutralized by natural processes. Where waters have a pH of less than about 3, particularly in the presence of sulfide minerals, the waters take on high to extremely high concentrations of many potentially toxic metals. The processes that create these acidic, metal-rich waters in Nevada are the same as for other parts of the world, but the scale of transport and the fate of metals are much more localized because of the ubiquitous presence of caliche soils. Acid mine drainage is rare in historical mining districts of northern Nevada, and the volume of drainage rarely exceeds about 20 gpm. My findings are in close agreement with those of Price and others (1995) who estimated that less than 0.05 percent of inactive and abandoned mines in Nevada are likely to be a concern for acid mine drainage. Most historical mining districts have no draining mines. Only in two districts (Hilltop and National) does water affected by mining flow into streams of significant size and length (more than 8 km). Water quality in even the worst cases is naturally attenuated to meet water-quality standards within about 1 km of the source. Only a few historical mines release acidic water with elevated metal concentrations to small streams that reach the Humboldt River, and these contaminants are not detectable in the Humboldt. These reconnaissance studies offer encouraging evidence that abandoned mines in Nevada create only minimal and local water-quality problems. Natural attenuation processes are sufficient to compensate for these relatively small sources of contamination. These results may provide useful analogs for future mining in the Humboldt River basin, but attention must be given to

  8. A multitrophic approach to monitoring the effects of metal mining in otherwise pristine and ecologically sensitive rivers in northern Canada.

    PubMed

    Spencer, Paula; Bowman, Michelle F; Dubé, Monique G

    2008-07-01

    It is not known if current chemical and biological monitoring methods are appropriate for assessing the impacts of growing industrial development on ecologically sensitive northern waters. We used a multitrophic level approach to evaluate current monitoring methods and to determine whether metal-mining activities had affected 2 otherwise pristine rivers that flow into the South Nahanni River, Northwest Territories, a World Heritage Site. We compared upstream reference conditions in the rivers to sites downstream and further downstream of mines. The endpoints we evaluated included concentrations of metals in river water, sediments, and liver and flesh of slimy sculpin (Cottus cognatus); benthic algal and macroinvertebrate abundance, richness, diversity, and community composition; and various slimy sculpin measures, our sentinel forage fish species. Elevated concentrations of copper and iron in liver tissue of sculpin from the Flat River were associated with high concentrations of mine-derived iron in river water and copper in sediments that were above national guidelines. In addition, sites downstream of the mine on the Flat River had increased algal abundances and altered benthic macroinvertebrate communities, whereas the sites downstream of the mine on Prairie Creek had increased benthic macroinvertebrate taxa richness and improved sculpin condition. Biological differences in both rivers were consistent with mild enrichment of the rivers downstream of current and historical mining activity. We recommend that monitoring in these northern rivers focus on indicators in epilithon and benthic macroinvertebrate communities due to their responsiveness and as alternatives to lethal fish sampling in habitats with low fish abundance. We also recommend monitoring of metal burdens in periphyton and benthic invertebrates for assessment of exposure to mine effluent and causal association. Although the effects of mining activities on riverine biota currently are limited, our

  9. Alzheimer's disease biomarker discovery using in silico literature mining and clinical validation

    PubMed Central

    2012-01-01

    Background Alzheimer’s Disease (AD) is the most widespread form of dementia in the elderly but despite progress made in recent years towards a mechanistic understanding, there is still an urgent need for disease modification therapy and for early diagnostic tests. Substantial international efforts are being made to discover and validate biomarkers for AD using candidate analytes and various data-driven 'omics' approaches. Cerebrospinal fluid is in many ways the tissue of choice for biomarkers of brain disease but is limited by patient and clinician acceptability, and increasing attention is being paid to the search for blood-based biomarkers. The aim of this study was to use a novel in silico approach to discover a set of candidate biomarkers for AD. Methods We used an in silico literature mining approach to identify potential biomarkers by creating a summarized set of assertional metadata derived from relevant legacy information. We then assessed the validity of this approach using direct assays of the identified biomarkers in plasma by immunodetection methods. Results Using this in silico approach, we identified 25 biomarker candidates, at least three of which have subsequently been reported to be altered in blood or CSF from AD patients. Two further candidate biomarkers, indicated from the in silico approach, were choline acetyltransferase and urokinase-type plasminogen activator receptor. Using immunodetection, we showed that, in a large sample set, these markers are either altered in disease or correlate with MRI markers of atrophy. Conclusions These data support as a proof of concept the use of data mining and in silico analyses to derive valid biomarker candidates for AD and, by extension, for other disorders. PMID:23113945

  10. Multifunctional greenway approach for landscape planning and reclamation of a post-mining district: Cartagena-La Unión, SE Spain

    NASA Astrophysics Data System (ADS)

    Acosta, Jose A.; Faz, Ángel; Zornoza, Raúl; Martínez-Martínez, Silvia; Kabas, Sebla; Bech, Jaume

    2015-04-01

    Fragmented structures create metaphorical wounds in the landscape, altering the ecological and cultural processes associated with it, as can be seen in many mine areas. Therefore it is advisable to organize the reclamation plan at the beginning of mine operation to provide spatial and functional integration of the landscape based on scientific arguments and with all possible legal and administrative means, which is generally the case of the Strategic Environmental Assessment. However, there are many abandoned mine areas where no reclamation plan has been carried out, such as the Mining District of Sierra Minera Cartagena-La Unión, SE Spain. In these cases it is vital to respond in a sustainable manner to heal the landscape wounds of post-mining activities. Reclamation activities in a post-mining district include not only the mine soils but also all land uses around them; for this reason it is necessary to create practical solutions for restoring the functions of the ecological and cultural processes of the area. The greenway approach shows the main veins which are crucial for keeping alive and sustaining the mentioned processes of the area. Therefore the main objectives of this study are to 1) develop an integrated local greenway network to be able to preserve significant resources and values of the district, and to 2) develop this greenway network as a part of the reclamation process for degraded areas. Landscape assessments revealed the most valuable and potential connectivity resources of the area. These clustering and linear patterns of resource concentrations include mountain ranges and valleys, the natural drainage network, legally protected areas and cultural-historical resources. Conservation areas, cultural-educational resources of post-mining activities and the riverbeds have been the main building stones for the greenway corridor. The multifunctional greenway approach serves as a landscape reclamation and planning tool in a degraded area by showing the priority zones for

  11. Association between borderline dysnatremia and mortality insight into a new data mining approach.

    PubMed

    Girardeau, Yannick; Jannot, Anne-Sophie; Chatellier, Gilles; Saint-Jean, Olivier

    2017-11-22

    Even small variations of serum sodium concentration may be associated with mortality. Our objective was to confirm the impact of borderline dysnatremia on in-hospital mortality for patients admitted to hospital, using real life care data from our electronic health record (EHR) and a phenome-wide association analysis (PheWAS). This was a retrospective observational study based on data from patients admitted to Hôpital Européen George Pompidou between 01/01/2008 and 31/06/2014, including 45,834 patients with serum sodium determinations on admission. We analyzed the association between dysnatremia and in-hospital mortality, using a multivariate logistic regression model to adjust for classical potential confounders. We performed a PheWAS to identify new potential confounders. Hyponatremia and hypernatremia were recorded for 12.0% and 1.0% of hospital stays, respectively. Adjusted odds ratios (ORa) for severe, moderate and borderline hyponatremia were 3.44 (95% CI, 2.41-4.86), 2.48 (95% CI, 1.96-3.13) and 1.98 (95% CI, 1.73-2.28), respectively. ORa for severe, moderate and borderline hypernatremia were 4.07 (95% CI, 2.92-5.62), 4.42 (95% CI, 2.04-9.20) and 3.72 (95% CI, 1.53-8.45), respectively. Borderline hyponatremia (ORa = 1.57 95% CI, 1.35-1.81) and borderline hypernatremia (ORa = 3.47 95% CI, 2.43-4.90) were still associated with in-hospital mortality after adjustment for classical and new confounding factors identified through the PheWAS analysis. Borderline dysnatremia on admission is independently associated with a higher risk of in-hospital mortality. By using medical data automatically collected in EHR and a new data mining approach, we identified new potential confounding factors that were highly associated with both mortality and dysnatremia.
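
    The adjusted odds ratios in this record come from a multivariate logistic regression that is not reproduced here. As a much simpler illustration of what an odds ratio and its 95% confidence interval express, the following Python sketch computes an unadjusted odds ratio with a Wald interval from a 2x2 table. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(exposed_dead, exposed_alive, unexposed_dead, unexposed_alive, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval from a 2x2 table.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exposed_dead * unexposed_alive) / (exposed_alive * unexposed_dead)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(
        1 / exposed_dead + 1 / exposed_alive + 1 / unexposed_dead + 1 / unexposed_alive
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the study).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

    An adjusted odds ratio, as reported in the abstract, would instead be obtained by exponentiating the coefficient of the exposure term in a fitted logistic regression that also includes the confounders.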

  12. Assessing the effectiveness of sustainable land management policies for combating desertification: A data mining approach.

    PubMed

    Salvati, L; Kosmas, C; Kairis, O; Karavitis, C; Acikalin, S; Belgacem, A; Solé-Benet, A; Chaker, M; Fassouli, V; Gokceoglu, C; Gungor, H; Hessel, R; Khatteli, H; Kounalaki, A; Laouina, A; Ocakoglu, F; Ouessar, M; Ritsema, C; Sghaier, M; Sonmez, H; Taamallah, H; Tezcan, L; de Vente, J; Kelly, C; Colantoni, A; Carlucci, M

    2016-12-01

    This study investigates the relationship between fine resolution, local-scale biophysical and socioeconomic contexts within which land degradation occurs, and the human responses to it. The research draws on experimental data collected under different territorial and socioeconomic conditions at 586 field sites in five Mediterranean countries (Spain, Greece, Turkey, Tunisia and Morocco). We assess the level of desertification risk under various land management practices (terracing, grazing control, prevention of wildland fires, soil erosion control measures, soil water conservation measures, sustainable farming practices, land protection measures and financial subsidies) taken as possible responses to land degradation. A data mining approach, incorporating principal component analysis, non-parametric correlations, multiple regression and canonical analysis, was developed to identify the spatial relationship between land management conditions, the socioeconomic and environmental context (described using 40 biophysical and socioeconomic indicators) and desertification risk. Our analysis identified a number of distinct relationships between the level of desertification experienced and the underlying socioeconomic context, suggesting that the effectiveness of responses to land degradation is strictly dependent on the local biophysical and socioeconomic context. Assessing the latent relationship between land management practices and the biophysical/socioeconomic attributes characterizing areas exposed to different levels of desertification risk proved to be an indirect measure of the effectiveness of field actions contrasting land degradation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Reduction of Conflicts in Mining Development Using "Good Neighbor Agreements"

    NASA Astrophysics Data System (ADS)

    Masaitis, A.

    2013-05-01

    New environmental and social challenges for the mining industry in both developed and developing countries show the obvious need to implement "responsible" mining practices that include improved community involvement. Good Neighbor Agreements (GNA's) are a relatively new mechanism for improving communication and trust between a mining company and the community. The focus of a GNA will be to provide a written and enforceable agreement, negotiated between the concerned public and the respective mining company to respond to concerns from the public, and also provide a mechanism for conflict resolution, when there is mutual benefit to maintain a working relationship. Development of GNA's is a recently evolving process that promotes environmentally sound relationships between mines and the surrounding communities. Modify and apply the resulting GNA formulas to the developing countries and countries with transitional economies. This is particularly important for countries that have poorly functioning regulatory systems that cannot guarantee a healthy and safe environment for the communities. The fundamental questions addressed by this research: 1. This is a three-year research project started in August 2012 at the University of Nevada, Reno (UNR) to develop Good Neighbor Agreement standards as well as to investigate the details of mine development. 2. Identify spheres of possible cooperation between mining companies, government organizations, and the Non-Governmental Organizations (NGO's). Use this cooperation to develop international standards for the GNA, to promote exchange of environmental information, and exchange of successful environmental, health, and safety practices between mining operations from different countries. Discussion: The Good Neighbor Agreement currently evolving will address the following: 1. Provide an economically viable mechanism for developing a partnership between mining operations and the local communities that will increase mining industry

  14. Study of application of ERTS-A imagery to fracture related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.

    1973-01-01

    The author has identified the following significant results. The utility of ERTS-1/high altitude aircraft imagery to detect underground mine hazards is strongly suggested. A 1:250,000 scale mined lands map of the Vincennes Quadrangle, Indiana has been prepared. This map is a prototype for a national mined lands inventory and will be distributed to State and Federal offices.

  15. Comparative Characterization of Crofelemer Samples Using Data Mining and Machine Learning Approaches With Analytical Stability Data Sets.

    PubMed

    Nariya, Maulik K; Kim, Jae Hyun; Xiong, Jian; Kleindl, Peter A; Hewarathna, Asha; Fisher, Adam C; Joshi, Sangeeta B; Schöneich, Christian; Forrest, M Laird; Middaugh, C Russell; Volkin, David B; Deeds, Eric J

    2017-11-01

    There is growing interest in generating physicochemical and biological analytical data sets to compare complex mixture drugs, for example, products from different manufacturers. In this work, we compare various crofelemer samples prepared from a single lot by filtration with varying molecular weight cutoffs combined with incubation for different times at different temperatures. The 2 preceding articles describe experimental data sets generated from analytical characterization of fractionated and degraded crofelemer samples. In this work, we use data mining techniques such as principal component analysis and mutual information scores to help visualize the data and determine discriminatory regions within these large data sets. The mutual information score identifies chemical signatures that differentiate crofelemer samples. These signatures, in many cases, would likely be missed by traditional data analysis tools. We also found that supervised learning classifiers robustly discriminate samples with around 99% classification accuracy, indicating that mathematical models of these physicochemical data sets are capable of identifying even subtle differences in crofelemer samples. Data mining and machine learning techniques can thus identify fingerprint-type attributes of complex mixture drugs that may be used for comparative characterization of products. Copyright © 2017 American Pharmacists Association®. All rights reserved.
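
    To make the idea of a mutual information score concrete, the following Python sketch estimates the mutual information (in bits) between a discrete feature and a sample label from paired observations. It is only an illustration under stated assumptions: the function name is hypothetical, the inputs are assumed to be already discretized (the study's analytical data are continuous and would need binning first), and this is not the authors' implementation.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (bits) between two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical binned signal values and sample labels.
informative = mutual_information(["low", "low", "high", "high"], ["A", "A", "B", "B"])
uninformative = mutual_information(["low", "high", "low", "high"], ["A", "A", "B", "B"])
print(informative, uninformative)
```

    A feature whose binned values track the sample label scores high, while a feature that is independent of the label scores zero, which is how such a score can flag discriminatory regions in a large data set.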

  16. 30 CFR 47.21 - Identifying hazardous chemicals.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Identifying hazardous chemicals. 47.21 Section... TRAINING HAZARD COMMUNICATION (HazCom) Hazard Determination § 47.21 Identifying hazardous chemicals. The operator must evaluate each chemical brought on mine property and each chemical produced on mine property...

  17. 30 CFR 47.21 - Identifying hazardous chemicals.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Identifying hazardous chemicals. 47.21 Section... TRAINING HAZARD COMMUNICATION (HazCom) Hazard Determination § 47.21 Identifying hazardous chemicals. The operator must evaluate each chemical brought on mine property and each chemical produced on mine property...

  18. 30 CFR 47.21 - Identifying hazardous chemicals.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Identifying hazardous chemicals. 47.21 Section... TRAINING HAZARD COMMUNICATION (HazCom) Hazard Determination § 47.21 Identifying hazardous chemicals. The operator must evaluate each chemical brought on mine property and each chemical produced on mine property...

  19. 30 CFR 47.21 - Identifying hazardous chemicals.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Identifying hazardous chemicals. 47.21 Section... TRAINING HAZARD COMMUNICATION (HazCom) Hazard Determination § 47.21 Identifying hazardous chemicals. The operator must evaluate each chemical brought on mine property and each chemical produced on mine property...

  20. 30 CFR 47.21 - Identifying hazardous chemicals.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Identifying hazardous chemicals. 47.21 Section... TRAINING HAZARD COMMUNICATION (HazCom) Hazard Determination § 47.21 Identifying hazardous chemicals. The operator must evaluate each chemical brought on mine property and each chemical produced on mine property...

  1. Spectral methods to detect surface mines

    NASA Astrophysics Data System (ADS)

    Winter, Edwin M.; Schatten Silvious, Miranda

    2008-04-01

    Over the past five years, advances have been made in the spectral detection of surface mines under minefield detection programs at the U. S. Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate (NVESD). The problem of detecting surface land mines ranges from the relatively simple, the detection of large anti-vehicle mines on bare soil, to the very difficult, the detection of anti-personnel mines in thick vegetation. While spatial and spectral approaches can be applied to the detection of surface mines, spatial-only detection requires many pixels-on-target such that the mine is actually imaged and shape-based features can be exploited. This method is unreliable in vegetated areas because only part of the mine may be exposed, while spectral detection is possible without the mine being resolved. At NVESD, hyperspectral and multi-spectral sensors throughout the reflection and thermal spectral regimes have been applied to the mine detection problem. Data has been collected on mines in forest and desert regions and algorithms have been developed both to detect the mines as anomalies and to detect the mines based on their spectral signature. In addition to the detection of individual mines, algorithms have been developed to exploit the similarities of mines in a minefield to improve their detection probability. In this paper, the types of spectral data collected over the past five years will be summarized along with the advances in algorithm development.

  2. An application of data mining in district heating substations for improving energy performance

    NASA Astrophysics Data System (ADS)

    Xue, Puning; Zhou, Zhigang; Chen, Xin; Liu, Jing

    2017-11-01

    An automatic meter reading system is capable of collecting and storing a huge amount of district heating (DH) data. However, the data obtained are rarely fully utilized. Data mining is a promising technology for discovering potentially interesting knowledge from vast amounts of data. This paper applies data mining methods to analyse the massive data for improving the energy performance of a DH substation. The technical approach contains three steps: data selection, cluster analysis and association rule mining (ARM). Data from two heating seasons of a substation are used for a case study. Cluster analysis identifies six distinct heating patterns based on the primary heat of the substation. ARM reveals that secondary pressure difference and secondary flow rate have a strong correlation. Using the discovered rules, a fault occurring in a remote flow meter installed on the secondary network is detected accurately. The application demonstrates that data mining techniques can effectively extract potentially useful knowledge to better understand substation operation strategies and improve substation energy performance.
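
    The association rule mining step can be illustrated with the standard support, confidence and lift measures. The Python sketch below evaluates one candidate rule over a list of discretized readings. The item labels (dp_high, flow_high and so on), the function name and the data are hypothetical and are not taken from the paper.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    if n == 0:
        raise ValueError("transactions must not be empty")
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_consequent = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    lift = confidence / (n_consequent / n) if n_consequent else 0.0
    return support, confidence, lift


# Hypothetical hourly readings, each discretized into labelled levels.
readings = [
    {"dp_high", "flow_high"},
    {"dp_high", "flow_high"},
    {"dp_low", "flow_low"},
    {"dp_high", "flow_low"},
    {"dp_low", "flow_low"},
]
support, confidence, lift = rule_metrics(readings, {"dp_high"}, {"flow_high"})
print(support, confidence, lift)
```

    A reading that violates a rule with high confidence and lift (for example, high pressure difference with a low reported flow rate) is the kind of observation that could point to a faulty meter.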

  3. Geotechnical approach for occupational safety risk analysis of critical slope in open pit mining as implication for earthquake hazard

    NASA Astrophysics Data System (ADS)

    Munirwansyah; Irsyam, Masyhur; Munirwan, Reza P.; Yunita, Halida; Zulfan Usrina, M.

    2018-05-01

    Occupational safety and health (OSH) is a planned effort to prevent accidents and diseases caused by work. Mining activities often involve work accidents caused by unsafe field conditions. In open pit mine areas, slumps due to unstable slopes often occur, which can disrupt the activities and productivity of mining companies. Research on the stability of open pit slopes conducted by Febrianti [8] indicated unsafe slope conditions at the Meureubo coal mine, located in Aceh Barat district. The present research continues that work with respect to OSH for landslides, in order to understand the stability of the excavation slope and the shape of the slope collapse. Plaxis software was used for this research. After analyzing the slope stability and the effect of landslides on OSH with the Job Safety Analysis (JSA) method to identify the hazards to work safety, a risk management analysis is conducted to classify the hazard level and its handling technique. The aim of this research is to determine the level of risk of work accidents at the company and the efforts to prevent them. The risk analysis yields a very high risk value (> 350), so the activity must be stopped until the risk can be reduced to the allowed or accepted risk value limit (< 20).
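
    The decision rule stated at the end of this abstract can be written as a small Python sketch. Only the two thresholds (> 350 and < 20) come from the abstract; the function name, the constant names and the label for values between the thresholds are assumptions, since the abstract does not describe intermediate risk levels.

```python
STOP_THRESHOLD = 350    # above this value the activity must be stopped (from the abstract)
ACCEPT_THRESHOLD = 20   # below this value the risk is allowed or accepted (from the abstract)


def classify_risk(risk_value):
    """Map a risk value to an action using the two thresholds given in the abstract."""
    if risk_value > STOP_THRESHOLD:
        return "stop activity until risk is reduced"
    if risk_value < ACCEPT_THRESHOLD:
        return "acceptable"
    # Assumption: the abstract does not say how intermediate values are handled.
    return "intermediate (not specified in the abstract)"


for value in (400, 100, 10):
    print(value, classify_risk(value))
```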

  4. Data-Mining Technologies for Diabetes: A Systematic Review

    PubMed Central

    Marinov, Miroslav; Mosa, Abu Saleh Mohammad; Yoo, Illhoi; Boren, Suzanne Austin

    2011-01-01

    Background The objective of this study is to conduct a systematic review of applications of data-mining techniques in the field of diabetes research. Method We searched the MEDLINE database through PubMed. We initially identified 31 articles by the search, and selected 17 articles representing various data-mining methods used for diabetes research. Our main interest was to identify research goals, diabetes types, data sets, data-mining methods, data-mining software and technologies, and outcomes. Results The applications of data-mining techniques in the selected articles were useful for extracting valuable knowledge and generating new hypotheses for further scientific research/experimentation and improving health care for diabetes patients. The results could be used for both scientific research and real-life practice to improve the quality of health care for diabetes patients. Conclusions Data mining has played an important role in diabetes research. Data mining would be a valuable asset for diabetes researchers because it can unearth hidden knowledge from a huge amount of diabetes-related data. We believe that data mining can significantly help diabetes research and ultimately improve the quality of health care for diabetes patients. PMID:22226277

  5. American elm in mine land reforestation

    Treesearch

    M.B. Adams; P. Angel; C. Barton; J. Slavicek

    2015-01-01

    Reforestation of mined land in the Appalachians realizes many important benefits and provides important ecosystem services. Because much of the reclaimed mine lands in Appalachia were previously in forest, reclaiming these drastically disturbed areas to forests is desirable, feasible and cost-effective. The Forestry Reclamation Approach (FRA) provides a five-step...

  6. Hyperspectral analysis for qualitative and quantitative features related to acid mine drainage at a remediated open-pit mine

    NASA Astrophysics Data System (ADS)

    Davies, G.; Calvin, W. M.

    2015-12-01

    The exposure of pyrite to oxygen and water in mine waste environments is known to generate acidity and the accumulation of secondary iron minerals. Sulfates and secondary iron minerals associated with acid mine drainage (AMD) exhibit diverse spectral properties in the ultraviolet, visible and near-infrared regions of the electromagnetic spectrum. The use of hyperspectral imagery for identification of AMD mineralogy and contamination has been well studied. Fewer studies have examined the impacts of hydrologic variations on mapping AMD or the unique spectral signatures of mine waters. Open-pit mine lakes are an additional environmental hazard which has not been widely studied using imaging spectroscopy. A better understanding of AMD variation related to climate fluctuations and the spectral signatures of contaminated surface waters will aid future assessments of environmental contamination. This study examined the ability of multi-season airborne hyperspectral data to identify the geochemical evolution of substances and contaminant patterns at the Leviathan Mine Superfund site. The mine is located 24 miles southeast of Lake Tahoe and contains remnant tailings piles and several AMD collection ponds. The objectives were to 1) distinguish temporal changes in mineralogy at the remediated open-pit sulfur mine, 2) identify the absorption features of mine affected waters, and 3) quantitatively link water spectra to known dissolved iron concentrations. Images from NASA's AVIRIS instrument were collected in the spring, summer, and fall seasons for two consecutive years at Leviathan (HyspIRI campaign). Images had a spatial resolution of 15 meters at nadir. Ground-based surveys using the ASD FieldSpecPro spectrometer and laboratory spectral and chemical analysis complemented the remote sensing data. Temporal changes in surface mineralogy were difficult to distinguish. However, seasonal changes in pond water quality were identified. Dissolved ferric iron and chlorophyll

  7. A data mining paradigm for identifying key factors in biological processes using gene expression data.

    PubMed

    Li, Jin; Zheng, Le; Uchiyama, Akihiko; Bin, Lianghua; Mauro, Theodora M; Elias, Peter M; Pawelczyk, Tadeusz; Sakowicz-Burkiewicz, Monika; Trzeciak, Magdalena; Leung, Donald Y M; Morasso, Maria I; Yu, Peng

    2018-06-13

    A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.

  8. New approach for reduction of diesel consumption by comparing different mining haulage configurations.

    PubMed

    Rodovalho, Edmo da Cunha; Lima, Hernani Mota; de Tomi, Giorgio

    2016-05-01

    The mining operations of loading and haulage have an energy source that is highly dependent on fossil fuels. In mining companies that select trucks for haulage, this input is the main component of mining costs. How can the impact of the operational aspects on the diesel consumption of haulage operations in surface mines be assessed? There are many studies relating the fuel consumption of trucks to several variables, but a methodology that prioritizes higher-impact variables under each specific condition is not available. Generic models may not apply to all operational settings presented in the mining industry. This study aims to create a method of analysis, identification, and prioritization of variables related to fuel consumption of haul trucks in open pit mines. For this purpose, statistical analysis techniques and mathematical modelling tools using multiple linear regressions will be applied. The model is shown to be suitable because the results generate a good description of the fuel consumption behaviour. In the practical application of the method, the reduction of diesel consumption reached 10%. The implementation requires no large-scale investments or very long deadlines and can be applied to mining haulage operations in other settings. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Mining with Rare Cases

    NASA Astrophysics Data System (ADS)

    Weiss, Gary M.

    Rare cases are often the most interesting cases. For example, in medical diagnosis one is typically interested in identifying relatively rare diseases, such as cancer, rather than more frequently occurring ones, such as the common cold. In this chapter we discuss the role of rare cases in Data Mining. Specific problems associated with mining rare cases are discussed, followed by a description of methods for addressing these problems.

  10. Applications of multi-season hyperspectral remote sensing for acid mine water characterization and mapping of secondary iron minerals associated with acid mine drainage

    NASA Astrophysics Data System (ADS)

    Davies, Gwendolyn E.

    Acid mine drainage (AMD) resulting from the oxidation of sulfides in mine waste is a major environmental issue facing the mining industry today. Open pit mines, tailings ponds, ore stockpiles, and waste rock dumps can all be significant sources of pollution, primarily heavy metals. These large mining-induced footprints are often located across vast geographic expanses and are difficult to access. With the continuing advancement of imaging satellites, remote sensing may provide a useful monitoring tool for pit lake water quality and the rapid assessment of abandoned mine sites. This study explored the applications of laboratory spectroscopy and multi-season hyperspectral remote sensing for environmental monitoring of mine waste environments. Laboratory spectral experiments were first performed on acid mine waters and synthetic ferric iron solutions to identify and isolate the unique spectral properties of mine waters. These spectral characterizations were then applied to airborne hyperspectral imagery for identification of poor water quality in AMD ponds at the Leviathan Mine Superfund site, CA. Finally, imagery varying in temporal and spatial resolution was used to identify changes in mineralogy over weathering overburden piles and on dry AMD pond liner surfaces at the Leviathan Mine. Results show the utility of hyperspectral remote sensing for monitoring a diverse range of surfaces associated with AMD.

  11. Text Mining in Organizational Research

    PubMed Central

    Kobayashi, Vladimer B.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.

    2017-01-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies. PMID:29881248
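
    Of the techniques listed in this abstract, distance and similarity computing is the easiest to show in a few lines. The Python sketch below computes the cosine similarity between two short texts represented as bag-of-words count vectors. The function name, the whitespace tokenization and the example job titles are assumptions for illustration; the article itself does not prescribe this code.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    shared_words = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical job vacancy titles.
print(cosine_similarity("data analyst", "data engineer"))
print(cosine_similarity("data analyst", "truck driver"))
```

    Pairwise similarities of this kind are also a common input to the clustering step named in the same list.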

  12. Text Mining in Organizational Research.

    PubMed

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

  13. Socially Responsible Mining: the Relationship between Mining and Poverty, Human Health and the Environment

    PubMed Central

    Maier, Raina M.; Díaz-Barriga, Fernando; Field, James A.; Hopkins, James; Klein, Bern; Poulton, Mary M.

    2016-01-01

    Increasing global demand for metals is straining the ability of the mining industry to physically keep up with demand (physical scarcity). On the other hand, social issues including the environmental and human health consequences of mining as well as the disparity in income distribution from mining revenues are disproportionately felt at the local community level. This has created social rifts, particularly in the developing world, between affected communities and both industry and governments. Such rifts can result in a disruption of the steady supply of metals (situational scarcity). Here we discuss the importance of mining in relationship to poverty, identify steps that have been taken to create a framework for socially responsible mining, and then discuss the need for academia to work in partnership with communities, government, and industry to develop trans-disciplinary research-based step change solutions to the intertwined problems of physical and situational scarcity. PMID:24552962

  14. Activity recognition from minimal distinguishing subsequence mining

    NASA Astrophysics Data System (ADS)

    Iqbal, Mohammad; Pao, Hsing-Kuo

    2017-08-01

    Human activity recognition is one of the most important research topics in the era of Internet of Things. To separate different activities given sensory data, we utilize a Minimal Distinguishing Subsequence (MDS) mining approach to efficiently find distinguishing patterns among different activities. We first transform the sensory data into a series of sensor triggering events and operate the MDS mining procedure afterwards. The gap constraints are also considered in the MDS mining. Given the multi-class nature of most activity recognition tasks, we modify the MDS mining approach from a binary case to a multi-class one to fit the need for multiple activity recognition. We also study how to select the best parameter set including the minimal and the maximal support thresholds in finding the MDSs for effective activity recognition. Overall, the prediction accuracy is 86.59% on the van Kasteren dataset which consists of four different activities for recognition.
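
    The gap-constrained support test that underlies this kind of mining can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the sensor event names, and the threshold values are all hypothetical, and the minimality check (that no proper subsequence is itself distinguishing) is omitted.

```python
def occurs(pattern, sequence, max_gap):
    """True if pattern is a subsequence of sequence with at most max_gap
    events skipped between consecutive matched events (assumed gap rule)."""
    def match(p_idx, start, end):
        # Try to place pattern[p_idx] at some position in [start, end).
        if p_idx == len(pattern):
            return True
        for pos in range(start, min(end, len(sequence))):
            if sequence[pos] == pattern[p_idx]:
                # The next event must fall within max_gap skipped positions.
                if match(p_idx + 1, pos + 1, pos + 2 + max_gap):
                    return True
        return False
    return match(0, 0, len(sequence))


def support(pattern, sequences, max_gap):
    """Fraction of sequences that contain the pattern under the gap constraint."""
    if not sequences:
        return 0.0
    return sum(occurs(pattern, s, max_gap) for s in sequences) / len(sequences)


def is_distinguishing(pattern, target, others, min_sup, max_sup, max_gap):
    """Frequent in the target activity, infrequent in the other activities."""
    return (support(pattern, target, max_gap) >= min_sup
            and support(pattern, others, max_gap) <= max_sup)


# Hypothetical sensor-triggering event sequences for two activities.
breakfast = [["fridge", "plates", "microwave"],
             ["fridge", "cups", "plates", "microwave"]]
leaving = [["hall", "frontdoor"],
           ["fridge", "hall", "frontdoor"]]
pattern = ["fridge", "microwave"]

print(support(pattern, breakfast, max_gap=2))   # support in the target class
print(is_distinguishing(pattern, breakfast, leaving,
                        min_sup=0.8, max_sup=0.2, max_gap=2))
```

    A multi-class variant could call `is_distinguishing` once per activity, treating the sequences of all remaining activities as `others`.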

  15. Mercury methylation in mine wastes collected from abandoned mercury mines in the USA

    USGS Publications Warehouse

    Gray, J.E.; Hines, M.E.; Biester, H.; Lasorsa, B.K.

    2003-01-01

    Speciation and transformation of Hg was studied in mine wastes collected from abandoned Hg mines at McDermitt, Nevada, and Terlingua, Texas, to evaluate formation of methyl-Hg, which is highly toxic. In these mine wastes, we measured total Hg and methyl-Hg contents, identified various Hg compounds using a pyrolysis technique, and determined rates of Hg methylation and methyl-Hg demethylation using isotopic-tracer methods. Mine wastes contain total Hg contents as high as 14000 μg/g and methyl-Hg concentrations as high as 88 ng/g. Mine wastes were found to contain variable amounts of cinnabar, metacinnabar, Hg salts, Hg0, and Hg0 and Hg2+ sorbed onto matrix particulates. Samples with Hg0 and matrix-sorbed Hg generally contained significant methyl-Hg contents. Similarly, samples containing Hg0 compounds generally produced significant Hg methylation rates, as much as 26%/day. Samples containing mostly cinnabar showed little or no Hg methylation. Mine wastes with high methyl-Hg contents generally showed low methyl-Hg demethylation, suggesting that Hg methylation was dominant. Methyl-Hg demethylation was by both oxidative and microbial pathways. The correspondence of mine wastes containing Hg0 compounds and measured Hg methylation suggests that Hg0 oxidizes to Hg2+, which is subsequently bioavailable for microbial Hg methylation.

  16. Risk evaluation of uranium mining: A geochemical inverse modelling approach

    NASA Astrophysics Data System (ADS)

    Rillard, J.; Zuddas, P.; Scislewski, A.

    2011-12-01

    It is well known that uranium extraction operations can increase risks linked to radiation exposure. The toxicity of uranium and associated heavy metals is the main environmental concern regarding exploitation and processing of U-ore. In areas where U mining is planned, a careful assessment of toxic and radioactive element concentrations is recommended before the start of mining activities. A background evaluation of harmful elements is important in order to prevent and/or quantify future water contamination resulting from possible migration of toxic metals coming from ore and waste water interaction. Controlled leaching experiments were carried out to investigate processes of ore and waste (leached ore) degradation, using samples from the uranium exploitation site located in Caetité-Bahia, Brazil. In experiments in which the reaction of waste with water was tested, we found that the water had low pH and high levels of sulphates and aluminium. On the other hand, in experiments in which ore was tested, the water had a chemical composition comparable to natural water found in the region of Caetité. On the basis of our experiments, we suggest that waste resulting from sulphuric acid treatment can induce acidification and salinization of surface and ground water. For this reason proper storage of waste is imperative. As a tool to evaluate the risks, a geochemical inverse modelling approach was developed to estimate the water-mineral interaction involving the presence of toxic elements. We used a method earlier described by Scislewski and Zuddas 2010 (Geochim. Cosmochim. Acta 74, 6996-7007) in which the reactive surface area of mineral dissolution can be estimated. We found that the reactive surface area of rock parent minerals is not constant over time but varies by several orders of magnitude in only two months of interaction. We propose that parent mineral heterogeneity and particularly, neogenic phase formation may explain the observed variation of the

  17. A Text-Mining Framework for Supporting Systematic Reviews.

    PubMed

    Li, Dingcheng; Wang, Zhen; Wang, Liwei; Sohn, Sunghwan; Shen, Feichen; Murad, Mohammad Hassan; Liu, Hongfang

    2016-11-01

    Systematic reviews (SRs) involve the identification, appraisal, and synthesis of all relevant studies for focused questions in a structured reproducible manner. High-quality SRs follow strict procedures and require significant resources and time. We investigated advanced text-mining approaches to reduce the burden associated with abstract screening in SRs and provide a high-level information summary. A text-mining SR supporting framework consisting of three self-defined semantics-based ranking metrics was proposed, including keyword relevance, indexed-term relevance and topic relevance. Keyword relevance is based on the user-defined keyword list used in the search strategy. Indexed-term relevance is derived from indexed vocabulary developed by domain experts used for indexing journal articles and books. Topic relevance is defined as the semantic similarity among retrieved abstracts in terms of topics generated by latent Dirichlet allocation, a Bayesian-based model for discovering topics. We tested the proposed framework using three published SRs addressing a variety of topics (Mass Media Interventions, Rectal Cancer and Influenza Vaccine). The results showed that when 91.8%, 85.7%, and 49.3% of the abstract screening labor was saved, the recalls were as high as 100% for the three cases, respectively. Relevant studies identified manually showed strong topic similarity through topic analysis, which supported the inclusion of topic analysis as a relevance metric. It was demonstrated that advanced text mining approaches can significantly reduce the abstract screening labor of SRs and provide an informative summary of relevant studies.
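
    The keyword-relevance idea and the recall versus labor-saved trade-off can be sketched as below. This is a rough illustration under stated assumptions, not the framework's actual metric: the scoring rule (fraction of keywords present), the function names, and the toy abstracts are all hypothetical.

```python
import re


def keyword_relevance(abstract, keywords):
    """Fraction of the user-defined keywords found in the abstract
    (an assumed, simplified scoring rule)."""
    text = abstract.lower()
    hits = sum(
        1 for kw in keywords
        if re.search(r"\b" + re.escape(kw.lower()) + r"\b", text)
    )
    return hits / len(keywords) if keywords else 0.0


def rank_abstracts(abstracts, keywords):
    """Indices of abstracts sorted from most to least relevant."""
    scores = [keyword_relevance(a, keywords) for a in abstracts]
    # sorted() is stable, so ties keep their original order.
    return sorted(range(len(abstracts)), key=lambda i: scores[i], reverse=True)


def recall_and_labor_saved(ranking, relevant, n_screened):
    """Recall and fraction of screening avoided when only the
    top n_screened abstracts are read by a reviewer."""
    screened = set(ranking[:n_screened])
    recall = len(screened & relevant) / len(relevant) if relevant else 1.0
    labor_saved = 1 - n_screened / len(ranking)
    return recall, labor_saved


# Toy data: indices 0 and 2 are the manually identified relevant studies.
abstracts = [
    "Influenza vaccine uptake among older adults",
    "Soil geochemistry of mine tailings",
    "Effectiveness of the seasonal influenza vaccine",
    "Order batching in warehouses",
]
keywords = ["influenza", "vaccine"]

ranking = rank_abstracts(abstracts, keywords)
recall, saved = recall_and_labor_saved(ranking, relevant={0, 2}, n_screened=2)
print(ranking, recall, saved)
```

    Sweeping `n_screened` from 1 to the number of abstracts gives the full recall versus labor-saved curve for a given ranking.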

  18. Field-scale study of the influence of differing remediation strategies on trace metal geochemistry in metal mine tailings from the Irish Midlands.

    PubMed

    Perkins, William T; Bird, Graham; Jacobs, Suzanne R; Devoy, Cora

    2016-03-01

    Mine tailings represent a globally significant source of potentially harmful elements (PHEs) to the environment. The management of large volumes of mine tailings represents a major challenge to the mining industry and environmental managers. This field-scale study evaluates the impact of two highly contrasting remediation approaches to the management and stabilisation of mine tailings. The geochemistry of the tailings, overlying amendment layers and vegetation are examined in the light of the different management approaches. Pseudo-total As, Cd and Pb concentrations and solid-state partitioning (speciation), determined via sequential extraction, were established for two Tailings Management Facilities (TMFs) in Ireland subjected to the following: (1) a 'walk-away' approach (Silvermines) and (2) application of an amendment layer (Galmoy). PHE concentrations in roots and herbage of grasses growing on the TMFs were also determined. Results identify very different PHE concentration profiles with depth through the TMFs and the impact of remediation approach on concentrations and their potential bioavailability in the rooting zone of grass species. Data also highlight the importance of choice of grass species in remediation approaches and the benefits of relatively shallow-rooting Agrostis capillaris and Festuca rubra varieties. In addition, data from the Galmoy TMF indicate the importance of regional soil geochemistry for interpreting the influence of the PHE geochemistry of capping and amendment layers applied to mine tailings.

  19. Integrating Communication into Engineering Curricula: An Interdisciplinary Approach to Facilitating Transfer at New Mexico Institute of Mining and Technology

    ERIC Educational Resources Information Center

    Ford, Julie Dyke

    2012-01-01

    This program profile describes a new approach towards integrating communication within Mechanical Engineering curricula. The author, who holds a joint appointment between Technical Communication and Mechanical Engineering at New Mexico Institute of Mining and Technology, has been collaborating with Mechanical Engineering colleagues to establish a…

  20. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  1. pubmed.mineR: an R package with text-mining algorithms to analyse PubMed abstracts.

    PubMed

    Rani, Jyoti; Shah, A B Rauf; Ramachandran, Srinivasan

    2015-10-01

    The PubMed literature database is a valuable source of information for scientific research. It is rich in biomedical literature with more than 24 million citations. Data-mining of voluminous literature is a challenging task. Although several text-mining algorithms have been developed in recent years with a focus on data visualization, they are limited in speed, are rigid, and are not available as open source. We have developed an R package, pubmed.mineR, wherein we have combined the advantages of existing algorithms, overcome their limitations, and offer user flexibility and links with other packages in Bioconductor and the Comprehensive R Archive Network (CRAN) in order to expand the user capabilities for executing multifaceted approaches. Three case studies are presented, namely, 'Evolving role of diabetes educators', 'Cancer risk assessment' and 'Dynamic concepts on disease and comorbidity' to illustrate the use of pubmed.mineR. The package generally runs fast with small elapsed times in regular workstations even on large corpus sizes and with compute intensive functions. The pubmed.mineR is available at http://cran.r-project.org/web/packages/pubmed.mineR.

  2. Mining the Temporal Dimension of the Information Propagation

    NASA Astrophysics Data System (ADS)

    Berlingerio, Michele; Coscia, Michele; Giannotti, Fosca

    In the last decade, Social Network Analysis has been a field to which researchers in the Data Mining area have devoted rapidly increasing effort. Among the possible related topics, the study of information propagation in a network has attracted the interest of many researchers, including those from the industrial world. However, only a few answers to the question “How does information propagate over a network, why and how fast?” have been discovered so far. These answers are of great interest, since they help in the tasks of finding experts in a network, assessing viral marketing strategies, and identifying fast or slow paths of information inside a collaborative network. In this paper we study the problem of finding frequent patterns in a network with the help of two different techniques: TAS (Temporally Annotated Sequences) mining, aimed at extracting sequential patterns where each transition between two events is annotated with a typical transition time that emerges from input data, and Graph Mining, which is helpful for locally analyzing the nodes of the networks with their properties. Finally, we show preliminary results on mining information propagation over a network, obtained on two well-known email datasets, that show the power of combining these two approaches.

  3. Order Batching in Warehouses by Minimizing Total Tardiness: A Hybrid Approach of Weighted Association Rule Mining and Genetic Algorithms

    PubMed Central

    Taheri, Shahrooz; Mat Saman, Muhamad Zameri; Wong, Kuan Yew

    2013-01-01

    One of the cost-intensive issues in managing warehouses is the order picking problem, which deals with the retrieval of items from their storage locations in order to meet customer requests. Many solution approaches have been proposed to minimize traveling distance in the process of order picking. However, in practice, customer orders have to be completed by certain due dates in order to avoid tardiness, which is neglected in most of the related scientific papers. Consequently, we propose a novel four-phase solution approach to minimize tardiness. First, weighted association rule mining is used to calculate associations between orders with respect to their due dates. Next, a batching model based on binary integer programming is formulated to maximize the associations between orders within each batch. Subsequently, in the order picking phase, a Genetic Algorithm integrated with the Traveling Salesman Problem is used to identify the most suitable travel path. Finally, the Genetic Algorithm is applied to sequence the constructed batches in order to minimize tardiness. Illustrative examples and comparisons are presented to demonstrate the proficiency and solution quality of the proposed approach. PMID:23864823

  4. String Mining in Bioinformatics

    NASA Astrophysics Data System (ADS)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has been explicitly embraced neither in the data mining nor in the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the term “data-mining” is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].

  5. Mining Context-Aware Association Rules Using Grammar-Based Genetic Programming.

    PubMed

    Luna, Jose Maria; Pechenizkiy, Mykola; Del Jesus, Maria Jose; Ventura, Sebastian

    2017-09-25

    Real-world data usually comprise features whose interpretation depends on some contextual information. Such contextual-sensitive features and patterns are of high interest to be discovered and analyzed in order to obtain the right meaning. This paper formulates the problem of mining context-aware association rules, which refers to the search for associations between itemsets such that the strength of their implication depends on a contextual feature. For the discovery of this type of associations, a model that restricts the search space and includes syntax constraints by means of a grammar-based genetic programming methodology is proposed. Grammars can be considered as a useful way of introducing subjective knowledge to the pattern mining process as they are highly related to the background knowledge of the user. The performance and usefulness of the proposed approach is examined by considering synthetically generated datasets. A posteriori analysis on different domains is also carried out to demonstrate the utility of this kind of associations. For example, in educational domains, it is essential to identify and understand contextual and context-sensitive factors that affect overall and individual student behavior and performance. The results of the experiments suggest that the approach is feasible and it automatically identifies interesting context-aware associations from real-world datasets.
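
    The notion of a rule whose strength depends on a contextual feature can be illustrated with plain support and confidence computed per context value. This sketch does not reproduce the grammar-based genetic programming search described above; the record layout, function names, and toy educational data are assumptions made for illustration only.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in itemset."""
    itemset = set(itemset)
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t["items"]) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    sup_a = support(transactions, antecedent)
    if sup_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / sup_a


def confidence_by_context(transactions, antecedent, consequent):
    """Confidence of the same rule, evaluated separately per context value."""
    contexts = sorted({t["context"] for t in transactions})
    return {
        c: confidence([t for t in transactions if t["context"] == c],
                      antecedent, consequent)
        for c in contexts
    }


# Toy records: the context is the (hypothetical) session a student attends.
transactions = [
    {"context": "morning", "items": {"attends_lab", "passes"}},
    {"context": "morning", "items": {"attends_lab", "passes"}},
    {"context": "evening", "items": {"attends_lab"}},
    {"context": "evening", "items": {"attends_lab", "passes"}},
]

overall = confidence(transactions, {"attends_lab"}, {"passes"})
per_context = confidence_by_context(transactions, {"attends_lab"}, {"passes"})
print(overall, per_context)
```

    A large spread between the per-context values and the overall confidence is what would mark a rule as context-sensitive in this simplified view.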

  6. Application of EREP imagery to fracture-related mine safety hazards in coal mining and mining-environmental problems in Indiana. [Indiana and Illinois

    NASA Technical Reports Server (NTRS)

    Wier, C. E. (Principal Investigator); Powell, R. L.; Amato, R. V.; Russell, O. R.; Martin, K. R.

    1975-01-01

    The author has identified the following significant results. This investigation evaluated the applicability of a variety of sensor types, formats, and resolution capabilities to the study of both fuel and nonfuel mined lands. The image reinforcement provided by stereo viewing of the EREP images proved useful for identifying lineaments and for mined lands mapping. Skylab S190B color and color infrared transparencies were the most useful EREP imagery. New information on lineament and fracture patterns in the bedrock of Indiana and Illinois extracted from analysis of the Skylab imagery has contributed to furthering the geological understanding of this portion of the Illinois basin.

  7. An Empirical Model for Mine-Blast Loading

    DTIC Science & Technology

    2014-10-17

    fledged experimental program. The numerical approach however suffers from several drawbacks in the mine blast simulations. First, it is a very...Suffield consisted in a pendulum type device to measure global impulse of buried mine [15]. One of the main purposes of the ONAGER pendulum was to study...TP-1 Terminal effects, KTA 1-34 report, 2004. [15] Bues, R., Hlady, S.L. and Bergeron, D.M., Pendulum Measurement of Land Mine Blast Output, Volume

  8. A systematic review of lost-time injuries in the global mining industry.

    PubMed

    Nowrouzi-Kia, Behdin; Gohar, Basem; Casole, Jennifer; Chidu, Carla; Dumond, Jennifer; McDougall, Alicia; Nowrouzi-Kia, Behnam

    2018-05-01

    Mining is a hazardous occupation with elevated rates of lost-time injury and disability. The purpose of this study is twofold: 1) To identify the type of lost-time injuries in the mining workforce, regardless of the kind of mining and 2) To examine the antecedent factors to the occupational injury (lost-time injuries). We identified and extracted primary papers related to lost-time injuries in the mining sector by conducting a systematic search of the electronic literature in the eight health and related databases. We critically reviewed nine articles in the mining sector that examined lost-time injuries. Musculoskeletal injuries (hand, back, limbs, fractures, lacerations and muscle contusions), slips and falls were identified as types of lost-time injuries. The review identified the following antecedent factors related to lost-time injuries: the mining work environment (underground mining), being male, age, working with mining equipment, organizational size, falling objects, disease status, job training and lack of occupational safety management teams, recovery time, social supports, access to health services, pre-injury health status and susceptibility to injury. The mining sector is a hazardous environment that increases workers' susceptibility to occupational injuries. There is a need to create and implement monitoring systems of lost-time injuries to implement prevention programs.

  9. A new genome-mining tool redefines the lasso peptide biosynthetic landscape

    PubMed Central

    Tietz, Jonathan I.; Schwalen, Christopher J.; Patel, Parth S.; Maxson, Tucker; Blair, Patricia M.; Tai, Hua-Chia; Zakai, Uzma I.; Mitchell, Douglas A.

    2016-01-01

    Ribosomally synthesized and post-translationally modified peptide (RiPP) natural products are attractive for genome-driven discovery and re-engineering, but limitations in bioinformatic methods and exponentially increasing genomic data make large-scale mining difficult. We report RODEO (Rapid ORF Description and Evaluation Online), which combines hidden Markov model-based analysis, heuristic scoring, and machine learning to identify biosynthetic gene clusters and predict RiPP precursor peptides. We initially focused on lasso peptides, which display intriguing physiochemical properties and bioactivities, but their hypervariability renders them challenging prospects for automated mining. Our approach yielded the most comprehensive mapping of lasso peptide space, revealing >1,300 compounds. We characterized the structures and bioactivities of six lasso peptides, prioritized based on predicted structural novelty, including an unprecedented handcuff-like topology and another with a citrulline modification exceptionally rare among bacteria. These combined insights significantly expand the knowledge of lasso peptides, and more broadly, provide a framework for future genome-mining efforts. PMID:28244986

  10. Investigating the management performance of disinfection analysis of water distribution networks using data mining approaches.

    PubMed

    Zounemat-Kermani, Mohammad; Ramezani-Charmahineh, Abdollah; Adamowski, Jan; Kisi, Ozgur

    2018-06-13

    Chlorination, the basic treatment utilized for drinking water sources, is widely used for water disinfection and pathogen elimination in water distribution networks. Therefore, the proper prediction of chlorine consumption is of great importance in water distribution network performance. In this respect, data mining techniques, which have the ability to discover the relationship between dependent variable(s) and independent variables, can be considered as alternative approaches in comparison to conventional methods (e.g., numerical methods). This study examines the applicability of three key methods, based on the data mining approach, for predicting chlorine levels in four water distribution networks. ANNs (artificial neural networks, including the multi-layer perceptron neural network, MLPNN, and radial basis function neural network, RBFNN), SVM (support vector machine), and CART (classification and regression tree) methods were used to estimate the concentration of residual chlorine in distribution networks for three villages in Kerman Province, Iran. Produced water (flow), chlorine consumption, and residual chlorine were collected daily for 3 years. An assessment of the studied models using several statistical criteria (NSC, RMSE, R², and SEP) indicated that, in general, MLPNN has the greatest capability for predicting chlorine levels followed by CART, SVM, and RBF-ANN. Weaker performance of the data-driven methods in the water distribution networks, in some cases, could be attributed to improper chlorination management rather than the methods' capability.
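
    Two of the evaluation criteria named above can be written out as a short sketch. It assumes that NSC denotes the Nash-Sutcliffe coefficient, which the abstract does not spell out, and the observed and predicted values are made-up numbers for illustration, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe coefficient: 1 is a perfect fit, 0 means the model
    is no better than predicting the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - sse / sst


# Illustrative residual chlorine values (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]

print(rmse(observed, predicted))
print(nash_sutcliffe(observed, predicted))
```

    Both functions expect sequences of equal length, and `nash_sutcliffe` is undefined when all observations are identical because the denominator is then zero.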

  11. Open-source tools for data mining.

    PubMed

    Zupan, Blaz; Demsar, Janez

    2008-03-01

    With a growing volume of biomedical databases and repositories, the need to develop a set of tools to address their analysis and support knowledge discovery is becoming acute. The data mining community has developed a substantial set of techniques for computational treatment of these data. In this article, we discuss the evolution of open-source toolboxes that data mining researchers and enthusiasts have developed over the span of a few decades and review several currently available open-source data mining suites. The approaches we review are diverse in data mining methods and user interfaces and also demonstrate that the field and its tools are ready to be fully exploited in biomedical research.

  12. Aquatic Ecosystem Enhancement at Mountaintop Mining Sites Symposium

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Black, D. Courtney; Lawson, Peter; Morgan, John

    2000-01-12

    Welcome to this symposium which is part of the ongoing effort to prepare an Environmental Impact Statement (EIS) regarding mountaintop mining and valley fills. The EIS is being prepared by the U.S. Environmental Protection Agency, U.S. Army Corps of Engineers, U.S. Office of Surface Mining, and U.S. Fish and Wildlife Service, in cooperation with the State of West Virginia. Aquatic Ecosystem Enhancement (AEE) at mountaintop mining sites is one of fourteen technical areas identified for study by the EIS Interagency Steering Committee. Three goals were identified in the AEE Work Plan: 1. Assess mining and reclamation practices to show how mining operations might be carried out in a way that minimizes adverse impacts to streams and other environmental resources and to local communities. Clarify economic and technical constraints and benefits. 2. Help citizens clarify choices by showing whether there are affordable ways to enhance existing mining, reclamation, mitigation processes and/or procedures. 3. Identify data needed to improve environmental evaluation and design of mining projects to protect the environment. Today’s symposium was proposed in the AEE Team Work Plans but coordinated planning for the event began September 15, 1999 when representatives from coal industry, environmental groups and government regulators met in Morgantown. The meeting participants worked with a facilitator from the Canaan Valley Institute to outline plans for the symposium. Several teams were formed to carry out the plans we outlined in the meeting.

  13. Rapid Evaluation of Radioactive Contamination in Rare Earth Mine Mining

    NASA Astrophysics Data System (ADS)

    Wang, N.

    2017-12-01

    In order to estimate the current levels of environmental radioactivity at the Bayan Obo rare earth mine and to study rapid methods for evaluating radioactive contamination in rare earth mines, in-situ gamma-ray spectrometry and gamma dose rate surveys were carried out around the mining area and living area. The in-situ gamma-ray spectrometer was composed of a NaI(Tl) scintillation detector (Φ75mm×75mm) and a multichannel analyzer. Our survey results for the Bayan Obo Mine show: (1) Thorium-232 is the radioactive contamination source in this region, while uranium-238 and potassium-40 are at background levels. (2) The average content of thorium-232 in the slag of the tailings dam in Bayan Obo is as high as 276 mg/kg, which is 37 times the global average thorium content. (3) We found that the thorium-232 content in the soil in the living area near the mine is higher than that in the local soil in Guyang County. The average thorium-232 concentrations in the mining areas of the Bayan Obo Mine and the living areas of the Bayan Obo Town were 18.7±7.5 and 26.2±9.1 mg/kg, respectively. (4) Thorium-232 was observed to be abnormally distributed in the contaminated area near the tailings dam. Our preliminary research results show that in-situ gamma-ray spectrometry is an effective approach for rapidly evaluating radioactive pollution from rare earths: it can determine the types of radioactive contamination source on site and also measure the radioactivity concentration of thorium and uranium in soil. Environmental radioactivity evaluation of rare earth ore and tailings dams in open-pit mining is also needed. The research was supported by National Natural Science Foundation of China (No. 41674111).

  14. VALUING ACID MINE DRAINAGE REMEDIATION OF IMPAIRED WATERWAYS IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD), the metal rich runoff flowing primarily from abandoned mines and surface deposits of mine waste. AMD can lower stream and river pH ...

  15. Mining of Business-Oriented Conversations at a Call Center

    NASA Astrophysics Data System (ADS)

    Takeuchi, Hironori; Nasukawa, Tetsuya; Watanabe, Hideo

    Recently it has become feasible to transcribe textual records from telephone conversations at call centers by using automatic speech recognition. In this research, we extended a text mining system for call summary records and constructed a conversation mining system for the business-oriented conversations at the call center. To acquire useful business insights from the conversational data through the text mining system, it is critical to identify appropriate textual segments and expressions as the viewpoints to focus on. In the analysis of call summary data using a text mining system, some experts defined the viewpoints for the analysis by looking at some sample records and by preparing the dictionaries based on frequent keywords in the sample dataset. However with conversations it is difficult to identify such viewpoints manually and in advance because the target data consists of complete transcripts that are often lengthy and redundant. In this research, we defined a model of the business-oriented conversations and proposed a mining method to identify segments that have impacts on the outcomes of the conversations and can then extract useful expressions in each of these identified segments. In the experiment, we processed the real datasets from a car rental service center and constructed a mining system. With this system, we show the effectiveness of the method based on the defined conversation model.

  16. Spatial variability of sediment erosion processes using GIS analysis within watersheds in a historically mined region, Patagonia Mountains, Arizona

    USGS Publications Warehouse

    Brady, Laura M.; Gray, Floyd; Wissler, Craig A.; Guertin, D. Phillip

    2001-01-01

    In this study, a geographic information system (GIS) is used to integrate and accurately map field studies, information from remotely sensed data, watershed models, and the dispersion of potentially toxic mine waste and tailings. The purpose of this study is to identify erosion rates and net sediment delivery of soil and mine waste/tailings to the drainage channel within several watershed regions to determine source areas of sediment delivery as a method of quantifying geo-environmental analysis of transport mechanisms in abandoned mine lands in arid climate conditions. Users of this study are the researchers interested in exploration of approaches to depicting historical activity in an area which has no baseline data records for environmental analysis of heavily mined terrain.

  17. Automatic target validation based on neuroscientific literature mining for tractography

    PubMed Central

    Vasques, Xavier; Richardet, Renaud; Hill, Sean L.; Slater, David; Chappelier, Jean-Cedric; Pralong, Etienne; Bloch, Jocelyne; Draganski, Bogdan; Cif, Laura

    2015-01-01

    Target identification for tractography studies requires solid anatomical knowledge validated by an extensive literature review across species for each seed structure to be studied. Manual literature review to identify targets for a given seed region is tedious and potentially subjective. Therefore, complementary approaches would be useful. We propose to use text-mining models to automatically suggest potential targets from the neuroscientific literature, full-text articles and abstracts, so that they can be used for anatomical connection studies and more specifically for tractography. We applied text-mining models to three structures: two well-studied structures, since validated deep brain stimulation targets, the internal globus pallidus and the subthalamic nucleus and, the nucleus accumbens, an exploratory target for treating psychiatric disorders. We performed a systematic review of the literature to document the projections of the three selected structures and compared it with the targets proposed by text-mining models, both in rat and primate (including human). We ran probabilistic tractography on the nucleus accumbens and compared the output with the results of the text-mining models and literature review. Overall, text-mining the literature could find three times as many targets as two man-weeks of curation could. The overall efficiency of the text-mining against literature review in our study was 98% recall (at 36% precision), meaning that over all the targets for the three selected seeds, only one target has been missed by text-mining. We demonstrate that connectivity for a structure of interest can be extracted from a very large amount of publications and abstracts. We believe this tool will be useful in helping the neuroscience community to facilitate connectivity studies of particular brain regions. The text mining tools used for the study are part of the HBP Neuroinformatics Platform, publicly available at http://connectivity-brainer.rhcloud.com/. 
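
    The recall and precision figures quoted in the abstract above can be related to raw counts with a short sketch. The counts used here (49 targets found, 1 missed, 87 spurious suggestions) are hypothetical values chosen only to reproduce the reported percentages; they are not taken from the paper.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of suggested targets that are correct."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of known targets that were suggested."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts, for illustration only (not reported in the abstract).
found, missed, spurious = 49, 1, 87

print(round(recall(found, missed), 2))       # share of known targets recovered
print(round(precision(found, spurious), 2))  # share of suggestions that are correct
```

    Under these assumed counts, a single missed target out of about fifty corresponds to 98% recall, while low precision reflects the many suggestions a curator would still need to reject.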

  18. Detecting and characterizing coal mine related seismicity in the Western U.S. using subspace methods

    NASA Astrophysics Data System (ADS)

    Chambers, Derrick J. A.; Koper, Keith D.; Pankow, Kristine L.; McCarter, Michael K.

    2015-11-01

    We present an approach for subspace detection of small seismic events that includes methods for estimating magnitudes and associating detections from multiple stations into unique events. The process is used to identify mining related seismicity from a surface coal mine and an underground coal mining district, both located in the Western U.S. Using a blasting log and a locally derived seismic catalogue as ground truth, we assess detector performance in terms of verified detections, false positives and failed detections. We are able to correctly identify over 95 per cent of the surface coal mine blasts and about 33 per cent of the events from the underground mining district, while keeping the number of potential false positives relatively low by requiring all detections to occur on two stations. We find that most of the potential false detections for the underground coal district are genuine events missed by the local seismic network, demonstrating the usefulness of regional subspace detectors in augmenting local catalogues. We note a trade-off in detection performance between stations at smaller source-receiver distances, which have increased signal-to-noise ratio, and stations at larger distances, which have greater waveform similarity. We also explore the increased detection capabilities of a single higher dimension subspace detector, compared to multiple lower dimension detectors, in identifying events that can be described as linear combinations of training events. We find, in our data set, that such an advantage can be significant, justifying the use of a subspace detection scheme over conventional correlation methods.

  19. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  20. Mine Water Treatment in Hongai Coal Mines

    NASA Astrophysics Data System (ADS)

    Dang, Phuong Thao; Dang, Vu Chi

    2018-03-01

    Acid mine drainage (AMD) is recognized as one of the most serious environmental problems associated with the mining industry. Acid water, also known as acid mine drainage, forms when iron sulfide minerals found in the rock of coal seams are exposed to oxidizing conditions during coal mining. Until 2009, mine drainage in the Hongai coal mines was not treated, leading to harmful effects on humans, animals and the aquatic ecosystem. This report examines the acid mine drainage problem and techniques for acid mine drainage treatment in the Hongai coal mines. In addition, the selection of treatment systems and the criteria for their design are presented.

  1. The influence of the scale of mining activity and mine site remediation on the contamination legacy of historical metal mining activity.

    PubMed

    Bird, Graham

    2016-12-01

    Globally, thousands of kilometres of rivers are degraded due to the presence of elevated concentrations of potentially harmful elements (PHEs) sourced from historical metal mining activity. In many countries, the presence of contaminated water and river sediment creates a legal requirement to address such problems. Remediation of mining-associated point sources has often been focused upon improving river water quality; however, this study evaluates the contaminant legacy present within river sediments and attempts to assess the influence of the scale of mining activity and post-mining remediation upon the magnitude of PHE contamination found within contemporary river sediments. Data collected from four exemplar catchments indicate a strong relationship between the scale of historical mining, as measured by ore output, and maximum PHE enrichment factors, calculated versus environmental quality guidelines. The use of channel slope as a proxy measure for the degree of channel-floodplain coupling indicates that enrichment factors for PHEs in contemporary river sediments may also be the highest where channel-floodplain coupling is the greatest. Calculation of a metric score for mine remediation activity indicates no clear influence of the scale of remediation activity on PHE enrichment factors for river sediments. It is suggested that whilst exemplars of significant successes at improving post-remediation river water quality can be identified, river sediment quality is a much more long-lasting environmental problem. In addition, it is suggested that improvements to river sediment quality do not occur quickly or easily as a result of remediation actions focused on specific mining point sources. Data indicate that PHEs continue to be episodically dispersed through river catchments hundreds of years after the cessation of mining activity, especially during flood flows. The high PHE loads of flood sediments in mining-affected river catchments and the predicted changes to

  2. Development and implementation of the Good Neighbor Agreement (GNA) practice in the USA sustainable mining development.

    NASA Astrophysics Data System (ADS)

    Masaitis, Alexandra

    2014-05-01

    New economic, environmental and social challenges for the mining industry in the USA show the need to implement "responsible" mining practices that include improved community involvement. Conflicts which occur in the US territory and with US mining companies around the world are now common between the mining proponents, NGO's and communities. These conflicts can sometimes be alleviated by early development of modes of communication, and a formal discussion format that allows airing of concerns and potential resolution of problems. One of the methods that can formalize this process is to establish a Good Neighbor Agreement (GNA), which deals specifically with challenges in relationships between mining operations and the local communities. It is a new practice related to mining operations that are oriented toward social needs and concerns of local communities that arise during the normal life of a mine, which can achieve sustainable mining practices. The GNA project being currently developed at the University of Nevada, USA in cooperation with the Newmont Mining Corporation has a goal of creating an open company/community dialog that will help identify and address sociological and environmental concerns associated with mining. Discussion: The Good Neighbor Agreement currently evolving will address the following: 1. Identify spheres of possible cooperation between mining companies, government organizations, and NGO's. 2. Provide an economically viable mechanism for developing a partnership between mining operations and the local communities that will increase mining industry's accountability and provide higher levels of confidence for the community that a mine is operated in a safe and sustainable manner. Implementation of the GNA can help identify and evaluate conflict criteria in mining/community relationships; determine the status of concerns; determine the role and responsibilities of stakeholders; analyze problem resolution feasibility; maintain the community

  3. Identifying Topics in Microblogs Using Wikipedia.

    PubMed

    Yıldırım, Ahmet; Üsküdarlı, Suzan; Özgür, Arzucan

    2016-01-01

    Twitter is an extremely high volume platform for user generated contributions regarding any topic. The wealth of content created at real-time in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches to relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. Therefore, identifying topics based on collections of posts, where individual posts contribute to some aspect of the greater topic is necessary. Models, such as Latent Dirichlet Allocation (LDA), propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, figuring out what the specific topic(s) the keyword sets represent remains as a separate task. Another issue in topic detection is the scope, which is often limited to specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since they are up to date, user-generated, sophisticated articles that span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail.

  4. Identifying Topics in Microblogs Using Wikipedia

    PubMed Central

    Yıldırım, Ahmet; Üsküdarlı, Suzan; Özgür, Arzucan

    2016-01-01

    Twitter is an extremely high volume platform for user generated contributions regarding any topic. The wealth of content created at real-time in massive quantities calls for automated approaches to identify the topics of the contributions. Such topics can be utilized in numerous ways, such as public opinion mining, marketing, entertainment, and disaster management. Towards this end, approaches to relate single or partial posts to knowledge base items have been proposed. However, in microblogging systems like Twitter, topics emerge from the culmination of a large number of contributions. Therefore, identifying topics based on collections of posts, where individual posts contribute to some aspect of the greater topic is necessary. Models, such as Latent Dirichlet Allocation (LDA), propose algorithms for relating collections of posts to sets of keywords that represent underlying topics. In these approaches, figuring out what the specific topic(s) the keyword sets represent remains as a separate task. Another issue in topic detection is the scope, which is often limited to specific domain, such as health. This work proposes an approach for identifying domain-independent specific topics related to sets of posts. In this approach, individual posts are processed and then aggregated to identify key tokens, which are then mapped to specific topics. Wikipedia article titles are selected to represent topics, since they are up to date, user-generated, sophisticated articles that span topics of human interest. This paper describes the proposed approach, a prototype implementation, and a case study based on data gathered during the heavily contributed periods corresponding to the four US election debates in 2012. The manually evaluated results (0.96 precision) and other observations from the study are discussed in detail. PMID:26991442

  5. Visual cues for data mining

    NASA Astrophysics Data System (ADS)

    Rogowitz, Bernice E.; Rabenhorst, David A.; Gerth, John A.; Kalin, Edward B.

    1996-04-01

    This paper describes a set of visual techniques, based on principles of human perception and cognition, which can help users analyze and develop intuitions about tabular data. Collections of tabular data are widely available, including, for example, multivariate time series data, customer satisfaction data, stock market performance data, multivariate profiles of companies and individuals, and scientific measurements. In our approach, we show how visual cues can help users perform a number of data mining tasks, including identifying correlations and interaction effects, finding clusters and understanding the semantics of cluster membership, identifying anomalies and outliers, and discovering multivariate relationships among variables. These cues are derived from psychological studies on perceptual organization, visual search, perceptual scaling, and color perception. These visual techniques are presented as a complement to the statistical and algorithmic methods more commonly associated with these tasks, and provide an interactive interface for the human analyst.

  6. Preventing spontaneous combustion after mine closing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lewicki, G.

    1987-11-01

    The author explains how the Northern Coal Company and a Houston-based firefighting firm developed an innovative technique to reduce the risk of spontaneous combustion after mine closing in its Rienau No. 2 Mine. The "Light Water" (TM) ATC series of firefighting foam concentrates were designed for extinguishing flammable liquid fires. By slightly altering the chemicals, the concentrates could be used to seal the coal ribs, floor, and roof, reducing the risk of combustion. Subsequent monitoring of the mine has identified no signs of heating.

  7. Mine-hunting dolphins of the Navy

    NASA Astrophysics Data System (ADS)

    Moore, Patrick W.

    1997-07-01

    Current counter-mine and obstacle avoidance technology is inadequate, and limits the Navy's capability to conduct shallow water (SW) and very shallow water (VSW) MCM in support of beach assaults by Marine Corps forces. Without information as to the location or density of mined beach areas, it must be assumed that if mines are present in one area then they are present in all areas. Marine mammal systems (MMS) are an unusual, effective and unique solution to current problems of mine and obstacle hunting. In the US Navy Mine Warfare Plan for 1994-1995 Marine Mammal Systems are explicitly identified as the Navy's only means of countering buried mines and the best means for dealing with close-tethered mines. The dolphins in these systems possess a biological sonar specifically adapted for their shallow and very shallow water habitat. Research has demonstrated that the dolphin biosonar outperforms any current hardware system available for SW and VSW applications. This presentation will cover current Fleet MCM systems and future technology application to the littoral region.

  8. Acid-base accounting to predict post-mining drainage quality on surface mines.

    PubMed

    Skousen, J; Simmons, J; McDonald, L M; Ziemkiewicz, P

    2002-01-01

    Acid-base accounting (ABA) is an analytical procedure that provides values to help assess the acid-producing and acid-neutralizing potential of overburden rocks prior to coal mining and other large-scale excavations. This procedure was developed by West Virginia University scientists during the 1960s. After the passage of laws requiring an assessment of surface mining on water quality, ABA became a preferred method to predict post-mining water quality, and permitting decisions for surface mines are largely based on the values determined by ABA. To predict the post-mining water quality, the amount of acid-producing rock is compared with the amount of acid-neutralizing rock, and a prediction of the water quality at the site (whether acid or alkaline) is obtained. We gathered geologic and geographic data for 56 mined sites in West Virginia, which allowed us to estimate total overburden amounts, and values were determined for maximum potential acidity (MPA), neutralization potential (NP), net neutralization potential (NNP), and NP to MPA ratios for each site based on ABA. These values were correlated to post-mining water quality from springs or seeps on the mined property. Overburden mass was determined by three methods, with the method used by Pennsylvania researchers showing the most accurate results for overburden mass. A poor relationship existed between MPA and post-mining water quality, NP was intermediate, and NNP and the NP to MPA ratio showed the best prediction accuracy. In this study, NNP and the NP to MPA ratio gave identical water quality prediction results. Therefore, with NP to MPA ratios, values were separated into categories: <1 should produce acid drainage, between 1 and 2 can produce either acid or alkaline water conditions, and >2 should produce alkaline water. On our 56 surface mined sites, NP to MPA ratios varied from 0.1 to 31, and six sites (11%) did not fit the expected pattern using this category approach. Two sites with ratios <1 did not
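
    The NP to MPA ratio categories described in the abstract above can be sketched as a small classifier. This is an illustrative sketch only: the function names are hypothetical, NNP is computed with the commonly used definition NP minus MPA (which the abstract does not spell out), and treating ratios of exactly 1 and 2 as part of the middle band is an assumption.

```python
def net_neutralization_potential(np_value: float, mpa: float) -> float:
    """NNP under the common definition NP - MPA (assumed, not stated in the abstract)."""
    return np_value - mpa


def classify_np_mpa_ratio(np_value: float, mpa: float) -> str:
    """Map an NP:MPA ratio to the expected post-mining drainage category."""
    if mpa <= 0:
        raise ValueError("MPA must be positive to form a ratio")
    ratio = np_value / mpa
    if ratio < 1:
        return "acid"
    if ratio <= 2:  # boundary handling is an assumption
        return "acid or alkaline"
    return "alkaline"


# Hypothetical overburden values, in matching units.
print(classify_np_mpa_ratio(np_value=5.0, mpa=10.0))
print(classify_np_mpa_ratio(np_value=15.0, mpa=10.0))
print(classify_np_mpa_ratio(np_value=31.0, mpa=1.0))
```

    As the abstract notes, some sites did not fit the expected pattern, so a rule like this is a screening aid rather than a guarantee of drainage quality.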

  9. A life-cycle description of underground coal mining

    NASA Technical Reports Server (NTRS)

    Lavin, M. L.; Borden, C. S.; Duda, J. R.

    1978-01-01

    An initial effort to relate the major technological and economic variables which impact conventional underground coal mining systems, in order to help identify promising areas for advanced mining technology is described. The point of departure is a series of investment analyses published by the United States Bureau of Mines, which provide both the analytical framework and guidance on a choice of variables.

  10. Manpower for the coal mining industry: an assessment of adequacy through the year 2000. Volume II. Technical approach. Final technical report. [USA; forecasting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mendis, M.S.; Rosenberg, J.I.; Medville, D.M.

    1980-03-01

    This report presents a summary of the analytical approach taken and the conclusions reached in an assessment of the supply and demand for manpower in the coal mining industry through the year 2000. A hybrid system dynamics/econometric model of the coal mining industry was developed which incorporates relationships between technological change, labor productivity, production costs, wages, graduation rates, and other key variables in estimating imbalances between labor supply and demand. Study results indicate that while the supply of production workers is expected to be sufficient under most future demand scenarios, periodic shortages of experienced workers, especially in the Northern Great Plains, can be expected. Other study findings are that the supply of mining engineers will be sufficient under all but the highest coal demand scenario, a shortage of faculty will affect the supply of mining engineers in the near-term, and the employment of mining technicians is expected to exhibit the largest increase in any labor category studied. In this volume the nature of the coal mining manpower problem is discussed, a detailed description of the analysis conducted and the sources of data used is provided, and the findings of the study are presented.

  11. Compositional mining of multiple object API protocols through state abstraction.

    PubMed

    Dai, Ziying; Mao, Xiaoguang; Lei, Yan; Qi, Yuhua; Wang, Rui; Gu, Bin

    2013-01-01

    API protocols specify correct sequences of method invocations. Despite their usefulness, API protocols are often unavailable in practice because writing them is cumbersome and error prone. Multiple object API protocols are more expressive than single object API protocols. However, the huge number of objects of typical object-oriented programs poses a major challenge to the automatic mining of multiple object API protocols: besides maintaining scalability, it is important to capture various object interactions. Current approaches utilize various heuristics to focus on small sets of methods. In this paper, we present a general, scalable, multiple object API protocols mining approach that can capture all object interactions. Our approach uses abstract field values to label object states during the mining process. We first mine single object typestates as finite state automata whose transitions are annotated with states of interacting objects before and after the execution of the corresponding method and then construct multiple object API protocols by composing these annotated single object typestates. We implement our approach for Java and evaluate it through a series of experiments.
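
    To make the idea of a single object typestate concrete, the sketch below encodes a protocol as a finite state automaton and checks whether a sequence of method calls follows it. The states, method names, and transition table are hypothetical and chosen for illustration; the sketch does not reproduce the paper's mining algorithm or its composition of multiple object protocols.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Hypothetical protocol for a resource-like object: open, then any number
# of reads, then close. Keys are (current state, method); values are next states.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("created", "open"): "opened",
    ("opened", "read"): "opened",
    ("opened", "close"): "closed",
}


def follows_protocol(
    calls: Iterable[str],
    start: str = "created",
    accepting: FrozenSet[str] = frozenset({"closed"}),
) -> bool:
    """Return True if the call sequence is accepted by the automaton."""
    state = start
    for method in calls:
        key = (state, method)
        if key not in TRANSITIONS:
            return False  # method not allowed in the current state
        state = TRANSITIONS[key]
    return state in accepting


print(follows_protocol(["open", "read", "read", "close"]))
print(follows_protocol(["read", "close"]))
```

    A mined protocol would supply the transition table automatically; here it is written by hand to show what such a specification constrains.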

  12. Compositional Mining of Multiple Object API Protocols through State Abstraction

    PubMed Central

    Mao, Xiaoguang; Qi, Yuhua; Wang, Rui; Gu, Bin

    2013-01-01

    API protocols specify correct sequences of method invocations. Despite their usefulness, API protocols are often unavailable in practice because writing them is cumbersome and error prone. Multiple object API protocols are more expressive than single object API protocols. However, the huge number of objects of typical object-oriented programs poses a major challenge to the automatic mining of multiple object API protocols: besides maintaining scalability, it is important to capture various object interactions. Current approaches utilize various heuristics to focus on small sets of methods. In this paper, we present a general, scalable, multiple object API protocols mining approach that can capture all object interactions. Our approach uses abstract field values to label object states during the mining process. We first mine single object typestates as finite state automata whose transitions are annotated with states of interacting objects before and after the execution of the corresponding method and then construct multiple object API protocols by composing these annotated single object typestates. We implement our approach for Java and evaluate it through a series of experiments. PMID:23844378

  13. Determining Plant – Leaf Miner – Parasitoid Interactions: A DNA Barcoding Approach

    PubMed Central

    Derocles, Stéphane A. P.; Evans, Darren M.; Nichols, Paul C.; Evans, S. Aifionn; Lunt, David H.

    2015-01-01

    A major challenge in network ecology is to describe the full-range of species interactions in a community to create highly-resolved food-webs. We developed a molecular approach based on DNA full barcoding and mini-barcoding to describe difficult to observe plant – leaf miner – parasitoid interactions, consisting of animals commonly regarded as agricultural pests and their natural enemies. We tested the ability of universal primers to amplify the remaining DNA inside leaf miner mines after the emergence of the insect. We compared the results of a) morphological identification of adult specimens; b) identification based on the shape of the mines; c) the COI Mini-barcode (130 bp) and d) the COI full barcode (658 bp) fragments to accurately identify the leaf-miner species. We used the molecular approach to build and analyse a tri-partite ecological network of plant – leaf miner – parasitoid interactions. We were able to detect the DNA of leaf-mining insects within their feeding mines on a range of host plants using mini-barcoding primers: 6% for the leaves collected empty and 33% success after we observed the emergence of the leaf miner. We suggest that the low amplification success of leaf mines collected empty was mainly due to the time since the adult emerged and discuss methodological improvements. Nevertheless our approach provided new species-interaction data for the ecological network. We found that the 130 bp fragment is variable enough to identify all the species included in this study. Both COI fragments reveal that some leaf miner species could be composed of cryptic species. The network built using the molecular approach was more accurate in describing tri-partite interactions compared with traditional approaches based on morphological criteria. PMID:25710377

  14. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    PubMed

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. Pellet quality was expressed by shape, using the aspect ratio value. A data matrix for chemometric analysis consisted of 224 pellet formulations prepared with eight different active pharmaceutical ingredients and several excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate have been recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about the pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. A Review of Mine Rescue Ensembles for Underground Coal Mining in the United States.

    PubMed

    Kilinc, F Selcen; Monaghan, William D; Powell, Jeffrey B

    and regulatory agencies have been more restrictive by requiring additional post disaster information regarding atmospheric conditions and other hazards before exposing rescue workers and others in the aftermath of a mine disaster. In light of some of the more recent mine rescuer fatalities such as the Crandall Canyon Mine and Jim Walters Resources in the past years, the direction of reducing exposure is preferred. This review provides a historical perspective on ensembles used during mine rescue operations and summarizes environmental hazards, critical elements of mine rescue ensembles, and key problems with these elements. This study also identifies domains for improved mine rescue ensembles. Furthermore, field observations from several coal mine rescue teams were added to provide the information on the currently used mine rescue ensembles in the U.S.

  16. A Review of Mine Rescue Ensembles for Underground Coal Mining in the United States

    PubMed Central

    Kilinc, F. Selcen; Monaghan, William D.; Powell, Jeffrey B.

    2016-01-01

    and regulatory agencies have been more restrictive by requiring additional post disaster information regarding atmospheric conditions and other hazards before exposing rescue workers and others in the aftermath of a mine disaster. In light of some of the more recent mine rescuer fatalities such as the Crandall Canyon Mine and Jim Walters Resources in the past years, the direction of reducing exposure is preferred. This review provides a historical perspective on ensembles used during mine rescue operations and summarizes environmental hazards, critical elements of mine rescue ensembles, and key problems with these elements. This study also identifies domains for improved mine rescue ensembles. Furthermore, field observations from several coal mine rescue teams were added to provide the information on the currently used mine rescue ensembles in the U.S. PMID:27065231

  17. Text mining to decipher free-response consumer complaints: insights from the NHTSA vehicle owner's complaint database.

    PubMed

    Ghazizadeh, Mahtab; McDonald, Anthony D; Lee, John D

    2014-09-01

    This study applies text mining to extract clusters of vehicle problems and associated trends from free-response data in the National Highway Traffic Safety Administration's vehicle owner's complaint database. As the automotive industry adopts new technologies, it is important to systematically assess the effect of these changes on traffic safety. Driving simulators, naturalistic driving data, and crash databases all contribute to a better understanding of how drivers respond to changing vehicle technology, but other approaches, such as automated analysis of incident reports, are needed. Free-response data from incidents representing two severity levels (fatal incidents and incidents involving injury) were analyzed using a text mining approach: latent semantic analysis (LSA). LSA and hierarchical clustering identified clusters of complaints for each severity level, which were compared and analyzed across time. Cluster analysis identified eight clusters of fatal incidents and six clusters of incidents involving injury. Comparisons showed that although the airbag clusters across the two severity levels have the same most frequent terms, the circumstances around the incidents differ. The time trends show clear increases in complaints surrounding the Ford/Firestone tire recall and the Toyota unintended acceleration recall. Increases in complaints may be partially driven by these recall announcements and the associated media attention. Text mining can reveal useful information from free-response databases that would otherwise be prohibitively time-consuming and difficult to summarize manually. Text mining can extend human analysis capabilities for large free-response databases to support earlier detection of problems and more timely safety interventions.

  18. Exploring Online Students' Self-Regulated Learning with Self-Reported Surveys and Log Files: A Data Mining Approach

    ERIC Educational Resources Information Center

    Cho, Moon-Heum; Yoo, Jin Soung

    2017-01-01

    Many researchers who are interested in studying students' online self-regulated learning (SRL) have heavily relied on self-reported surveys. Data mining is an alternative technique that can be used to discover students' SRL patterns from large data logs saved on a course management system. The purpose of this study was to identify students' online…

  19. Modelling the sensory space of varietal wines: Mining of large, unstructured text data and visualisation of style patterns.

    PubMed

    Valente, Carlo C; Bauer, Florian F; Venter, Fritz; Watson, Bruce; Nieuwoudt, Hélène H

    2018-03-21

    The increasingly large volumes of publicly available sensory descriptions of wine raise the question of whether this source of data can be mined to extract meaningful domain-specific information about the sensory properties of wine. We introduce a novel application of formal concept lattices, in combination with traditional statistical tests, to visualise the sensory attributes of a big data set of some 7,000 Chenin blanc and Sauvignon blanc wines. Complexity was identified as an important driver of style in hitherto uncharacterised Chenin blanc, and the sensory cues for specific styles were identified. This is the first study to apply these methods for the purpose of identifying styles within varietal wines. More generally, our interactive data visualisation and mining driven approach opens up new investigations towards better understanding of the complex field of sensory science.

  20. Systematic review of community health impacts of mountaintop removal mining.

    PubMed

    Boyles, Abee L; Blain, Robyn B; Rochester, Johanna R; Avanasi, Raghavendhran; Goldhaber, Susan B; McComb, Sofie; Holmgren, Stephanie D; Masten, Scott A; Thayer, Kristina A

    2017-10-01

    The objective of this evaluation is to understand the human health impacts of mountaintop removal (MTR) mining, the major method of coal mining in and around Central Appalachia. MTR mining impacts the air, water, and soil and raises concerns about potential adverse health effects in neighboring communities; exposures associated with MTR mining include particulate matter (PM), polycyclic aromatic hydrocarbons (PAHs), metals, hydrogen sulfide, and other recognized harmful substances. A systematic review was conducted of published studies of MTR mining and community health, occupational studies of MTR mining, and any available animal and in vitro experimental studies investigating the effects of exposures to MTR-mining-related chemical mixtures. Six databases (Embase, PsycINFO, PubMed, Scopus, Toxline, and Web of Science) were searched with customized terms, and no restrictions on publication year or language, through October 27, 2016. The eligibility criteria included all human population studies and animal models of human health, direct and indirect measures of MTR-mining exposure, any health-related effect or change in physiological response, and any study design type. Risk of bias was assessed for observational and experimental studies using an approach developed by the National Toxicology Program (NTP) Office of Health Assessment and Translation (OHAT). To provide context for these health effects, a summary of the exposure literature is included that focuses on describing findings for outdoor air, indoor air, and drinking water. From a literature search capturing 3088 studies, 33 human studies (29 community, four occupational), four experimental studies (two in rat, one in vitro and in mice, one in C. elegans), and 58 MTR mining exposure studies were identified. A number of health findings were reported in observational human studies, including cardiopulmonary effects, mortality, and birth defects. However, concerns for risk of bias were identified, especially

  1. A Novel Continuous Blood Pressure Estimation Approach Based on Data Mining Techniques.

    PubMed

    Miao, Fen; Fu, Nan; Zhang, Yuan-Ting; Ding, Xiao-Rong; Hong, Xi; He, Qingyun; Li, Ye

    2017-11-01

    Continuous blood pressure (BP) estimation using pulse transit time (PTT) is a promising method for unobtrusive BP measurement. However, the accuracy of this approach must be improved for it to be viable for a wide range of applications. This study proposes a novel continuous BP estimation approach that combines data mining techniques with a traditional mechanism-driven model. First, 14 features derived from simultaneous electrocardiogram and photoplethysmogram signals were extracted for beat-to-beat BP estimation. A genetic algorithm-based feature selection method was then used to select BP indicators for each subject. Multivariate linear regression and support vector regression were employed to develop the BP model. The accuracy and robustness of the proposed approach were validated for static, dynamic, and follow-up performance. Experimental results based on 73 subjects showed that the proposed approach exhibited excellent accuracy in static BP estimation, with a correlation coefficient and mean error of 0.852 and -0.001 ± 3.102 mmHg for systolic BP, and 0.790 and -0.004 ± 2.199 mmHg for diastolic BP. Similar performance was observed for dynamic BP estimation. The robustness results indicated that the estimation accuracy was somewhat lower one day after model construction but was relatively stable from one day to six months after construction. The proposed approach is superior to the state-of-the-art PTT-based model, with an approximately 2-mmHg reduction in the standard deviation at different time intervals, thus providing potentially novel insights for cuffless BP estimation.
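
    The accuracy figures above are a correlation coefficient and a mean error ± spread. A minimal sketch of how such figures can be computed, assuming hypothetical paired readings and assuming the spread is the sample standard deviation of the errors (the abstract does not say which estimator was used):

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def error_summary(reference, estimate):
    """Mean and sample standard deviation of (estimate - reference)."""
    errors = [e - r for r, e in zip(reference, estimate)]
    return mean(errors), stdev(errors)


# Hypothetical systolic readings in mmHg, invented for illustration.
reference_sbp = [118.0, 122.0, 130.0, 125.0, 121.0]
estimated_sbp = [119.0, 121.0, 128.0, 127.0, 121.0]

r = pearson_r(reference_sbp, estimated_sbp)
mean_err, sd_err = error_summary(reference_sbp, estimated_sbp)
print(f"r = {r:.3f}, error = {mean_err:.3f} ± {sd_err:.3f} mmHg")
```

    A mean error near zero with a small standard deviation indicates little bias and tight agreement, which is how the reported -0.001 ± 3.102 mmHg should be read.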

  2. Identifying Drug-Drug Interactions by Data Mining: A Pilot Study of Warfarin-Associated Drug Interactions.

    PubMed

    Hansen, Peter Wæde; Clemmensen, Line; Sehested, Thomas S G; Fosbøl, Emil Loldrup; Torp-Pedersen, Christian; Køber, Lars; Gislason, Gunnar H; Andersson, Charlotte

    2016-11-01

    Knowledge about drug-drug interactions commonly arises from preclinical trials, from adverse drug reports, or based on knowledge of mechanisms of action. Our aim was to investigate whether drug-drug interactions were discoverable without prior hypotheses using data mining. We focused on warfarin-drug interactions as the prototype. We analyzed altered prothrombin time (measured as international normalized ratio [INR]) after initiation of a novel prescription in previously INR-stable warfarin-treated patients with nonvalvular atrial fibrillation. Data sets were retrieved from clinical work. Random forest (a machine-learning method) was set up to predict altered INR levels after novel prescriptions. The most important drug groups from the analysis were further investigated using logistic regression in a new data set. Two hundred and twenty drug groups were analyzed in 61 190 novel prescriptions. We rediscovered 2 drug groups having known interactions (β-lactamase-resistant penicillins [dicloxacillin] and carboxamide derivatives) and 3 antithrombotic/anticoagulant agents (platelet aggregation inhibitors excluding heparin, direct thrombin inhibitors [dabigatran etexilate], and heparins) causing decreasing INR. Six drug groups with known interactions were rediscovered causing increasing INR (antiarrhythmics class III [amiodarone], other opioids [tramadol], glucocorticoids, triazole derivatives, and combinations of penicillins, including β-lactamase inhibitors) and two had a known interaction in a closely related drug group (oripavine derivatives [buprenorphine] and natural opium alkaloids). Antipropulsives had an unknown signal of increasing INR. We were able to identify known warfarin-drug interactions without a prior hypothesis using clinical registries. Additionally, we discovered a few potentially novel interactions. This opens the way to using data mining to discover unknown drug-drug interactions in cardiovascular medicine. © 2016 American Heart Association

  3. Real-time diesel particulate monitor for underground mines.

    PubMed

    Noll, James; Janisko, Samuel; Mischler, Steven E

    The standard method for determining diesel particulate matter (DPM) exposures in underground metal/nonmetal mines provides the average exposure concentration for an entire working shift, and several weeks might pass before results are obtained. The main problem with this approach is that it only indicates that an overexposure has occurred rather than providing the ability to prevent an overexposure or detect its cause. Conversely, real-time measurement would provide miners with timely information to allow engineering controls to be deployed immediately and to identify the major factors contributing to any overexposures. Toward this purpose, the National Institute for Occupational Safety and Health (NIOSH) developed a laser extinction method to measure real-time elemental carbon (EC) concentrations (EC is a DPM surrogate). To employ this method, NIOSH developed a person-wearable instrument that was commercialized in 2011. This paper evaluates this commercial instrument, including the calibration curve, limit of detection, accuracy, and potential interferences. The instrument was found to meet the NIOSH accuracy criteria and to be capable of measuring DPM concentrations at levels observed in underground mines. In addition, it was found that a submicron size selector was necessary to avoid interference from mine dust and that cigarette smoke can be an interference when sampling in enclosed cabs.

  4. Using a Data Mining Approach to Develop a Student Engagement-Based Institutional Typology. IR Applications, Volume 18, February 8, 2009

    ERIC Educational Resources Information Center

    Luan, Jing; Zhao, Chun-Mei; Hayek, John C.

    2009-01-01

    Data mining provides both systematic and systemic ways to detect patterns of student engagement among students at hundreds of institutions. Using traditional statistical techniques alone, the task would be significantly difficult--if not impossible--considering the size and complexity in both data and analytical approaches necessary for this…

  5. From IHE Audit Trails to XES Event Logs Facilitating Process Mining.

    PubMed

    Paster, Ferdinand; Helm, Emmanuel

    2015-01-01

    Business Intelligence approaches such as process mining have recently been applied to the healthcare domain. The goal of process mining is to gain process knowledge, check compliance and find room for improvement by investigating recorded event data. Previous approaches focused on process discovery from the event data of various specific systems. IHE, as a globally recognized basis for healthcare information systems, defines in its ATNA profile how real-world events must be recorded in centralized event logs. The following approach presents how audit trails collected by means of ATNA can be transformed to enable process mining. Using the standardized audit trails provides the ability to apply these methods to all IHE-based information systems.
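
    A minimal sketch of the kind of transformation described, assuming audit records have already been parsed into flat dictionaries. The field names `case_id`, `timestamp`, `event_type` and `actor` are hypothetical stand-ins, not the ATNA message schema, and grouping events by a patient-based case identifier is an assumption rather than the authors' stated rule. The output uses the standard XES attribute keys `concept:name`, `time:timestamp` and `org:resource`.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def audit_to_xes(audit_records):
    """Group flat audit records into XES traces, one trace per case."""
    log = ET.Element("log", {"xes.version": "1.0"})
    traces = defaultdict(list)
    for rec in audit_records:
        traces[rec["case_id"]].append(rec)

    for case_id, records in traces.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        # ISO 8601 timestamps in one time zone sort correctly as strings.
        for rec in sorted(records, key=lambda r: r["timestamp"]):
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string",
                          {"key": "concept:name", "value": rec["event_type"]})
            ET.SubElement(event, "date",
                          {"key": "time:timestamp", "value": rec["timestamp"]})
            ET.SubElement(event, "string",
                          {"key": "org:resource", "value": rec["actor"]})
    return log


# Hypothetical, already-parsed audit records (invented for illustration).
records = [
    {"case_id": "patient-1", "timestamp": "2015-01-01T10:05:00+00:00",
     "event_type": "Retrieve Document Set", "actor": "workstation-A"},
    {"case_id": "patient-1", "timestamp": "2015-01-01T10:00:00+00:00",
     "event_type": "Registry Stored Query", "actor": "workstation-A"},
    {"case_id": "patient-2", "timestamp": "2015-01-01T11:00:00+00:00",
     "event_type": "Registry Stored Query", "actor": "workstation-B"},
]

xes = audit_to_xes(records)
print(ET.tostring(xes, encoding="unicode"))
```

    The choice of case identifier determines what a "process instance" means in the mined model, so it is the main design decision when adapting this to real audit data.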

  6. Geochemical Characterization of Mine Waste, Mine Drainage, and Stream Sediments at the Pike Hill Copper Mine Superfund Site, Orange County, Vermont

    USGS Publications Warehouse

    Piatak, Nadine M.; Seal, Robert R.; Hammarstrom, Jane M.; Kiah, Richard G.; Deacon, Jeffrey R.; Adams, Monique; Anthony, Michael W.; Briggs, Paul H.; Jackson, John C.

    2006-01-01

    The Pike Hill Copper Mine Superfund Site in the Vermont copper belt consists of the abandoned Smith, Eureka, and Union mines, all of which exploited Besshi-type massive sulfide deposits. The site was listed on the U.S. Environmental Protection Agency (USEPA) National Priorities List in 2004 due to aquatic ecosystem impacts. This study was intended to be a precursor to a formal remedial investigation by the USEPA, and it focused on the characterization of mine waste, mine drainage, and stream sediments. A related study investigated the effects of the mine drainage on downstream surface waters. The potential for mine waste and drainage to have an adverse impact on aquatic ecosystems, drinking-water supplies, and human health was assessed on the basis of mineralogy, chemical concentrations, acid generation, and potential for metals to be leached from mine waste and soils. The results were compared to those from analyses of other Vermont copper belt Superfund sites, the Elizabeth Mine and Ely Copper Mine, to evaluate whether the waste material at the Pike Hill Copper Mine was sufficiently similar to that of the other mine sites that USEPA can streamline the evaluation of remediation technologies. Mine-waste samples consisted of oxidized and unoxidized sulfidic ore and waste rock, and flotation-mill tailings. These samples contained as much as 16 weight percent sulfides that included chalcopyrite, pyrite, pyrrhotite, and sphalerite. During oxidation, sulfides weather and may release potentially toxic trace elements and may produce acid. In addition, soluble efflorescent sulfate salts were identified at the mines; during rain events, the dissolution of these salts contributes acid and metals to receiving waters. Mine waste contained concentrations of cadmium, copper, and iron that exceeded USEPA Preliminary Remediation Goals. The concentrations of selenium in mine waste were higher than the average composition of eastern United States soils. Most mine waste was

  7. A methodological toolkit for field assessments of artisanally mined alluvial diamond deposits

    USGS Publications Warehouse

    Chirico, Peter G.; Malpeli, Katherine C.

    2014-01-01

    This toolkit provides a standardized checklist of critical issues relevant to artisanal mining-related field research. An integrated sociophysical geographic approach to collecting data at artisanal mine sites is outlined. The implementation and results of a multistakeholder approach to data collection, carried out in the assessment of Guinea’s artisanally mined diamond deposits, also are summarized. This toolkit, based on recent and successful field campaigns in West Africa, has been developed as a reference document to assist other government agencies or organizations in collecting the data necessary for artisanal diamond mining or similar natural resource assessments.

  8. Prospective data mining of six products in the US FDA Adverse Event Reporting System: disposition of events identified and impact on product safety profiles.

    PubMed

    Bailey, Steven; Singh, Ajay; Azadian, Robert; Huber, Peter; Blum, Michael

    2010-02-01

    The use of data mining has increased among regulators and pharmaceutical companies. The incremental value of data mining as an adjunct to traditional pharmacovigilance methods has yet to be demonstrated. Specifically, the utility in identifying new safety signals and the resources required to do so have not been elucidated. The objective was to analyse the number and types of disproportionately reported product-event combinations (DRPECs), as well as the final disposition of each, in order to understand the potential utility and resource implications of routinely conducting data mining in the US FDA Adverse Event Reporting System (AERS). We generated DRPECs from AERS for six of Wyeth's products, prospectively tracked their dispositions and evaluated the appropriate DRPECs in the company's safety database. We chose EB05 (the lower bound of the 90% confidence interval around the Empirical Bayes Geometric Mean) ≥2 as the appropriate metric, employing stratification based on age, sex and year of report. A total of 861 DRPECs were identified; the average number of DRPECs was 144 per product. The proportion of unique preferred terms (PTs) in AERS for each drug with an EB05 ≥2 was similar across the six products (5.1-8.5%). Overall, 64.0% (551) of the DRPECs were closed after the initial screening (44.8% labelled, 14.3% indication related, 4.9% non-interpretable). An additional 9.9% (85) had been reviewed within the prior year and were not further reviewed. The remaining 26.1% (225) required full case review. After review of all pertinent reports and additional data, it was determined which of the DRPECs necessitated a formal review by the company's ongoing Safety Review Team (SRT) process. In total, 3.6% (31/861) of the DRPECs, yielding 16 medical concepts, were reviewed by the SRT, leading to seven labelling changes. These labelling changes involved 1.9% of all DRPECs generated. Four of the six compounds reviewed as part of this pilot had an identified labelling change. The
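
    The disposition percentages can be checked directly from the counts given in the abstract. The sketch below uses only those reported counts; the dictionary labels are shorthand chosen here for illustration.

```python
# Counts as reported in the abstract above.
total_drpecs = 861
products = 6

dispositions = {
    "closed after initial screening": 551,
    "reviewed within the prior year": 85,
    "required full case review": 225,
}

# The three dispositions should account for every DRPEC.
accounted_for = sum(dispositions.values())

# Share of each disposition, rounded to one decimal place as in the abstract.
shares = {label: round(100 * count / total_drpecs, 1)
          for label, count in dispositions.items()}

# DRPECs that reached the Safety Review Team.
srt_share = round(100 * 31 / total_drpecs, 1)

for label, pct in shares.items():
    print(f"{label}: {pct}%")
print(f"reviewed by SRT: {srt_share}%")
print(f"mean DRPECs per product: {total_drpecs / products}")
```

    The three counts sum to 861, and the computed shares reproduce the reported 64.0%, 9.9%, 26.1% and 3.6%; the unrounded mean per product is 143.5, which the abstract reports as 144.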

  9. Multisource geological data mining and its utilization of uranium resources exploration

    NASA Astrophysics Data System (ADS)

    Zhang, Jie-lin

    2009-10-01

    Nuclear energy, as one of the clean energy sources, plays an important role in economic development in China. According to the national long-term development strategy, many more nuclear power plants will be built in the next few years, which poses a great challenge for uranium resource exploration. Research and practice in mineral exploration demonstrate that utilizing modern Earth Observing System (EOS) technology and developing new multi-source geological data mining methods are effective approaches to uranium deposit prospecting. Based on data mining and knowledge discovery technology, this paper uses multi-source geological data to characterize the electromagnetic spectral, geophysical and spatial information of uranium mineralization factors, and provides technical support for uranium prospecting integrated with field remote sensing geological surveys. The multi-source geological data used in this paper include satellite hyperspectral imagery (Hyperion), high spatial resolution remote sensing data, uranium geological information, airborne radiometric data, and aeromagnetic and gravity data. Related data mining methods have been developed, such as data fusion of optical data and Radarsat imagery and information integration of remote sensing and geophysical data. Based on the above approaches, multi-geoscience information on uranium mineralization factors, including complex polystage rock masses, mineralization-controlling faults and hydrothermal alterations, has been identified, the metallogenic potential of uranium has been evaluated, and some prospective areas have been located.

  10. Texas lignite mining: Groundwater and slope stability control in the nineties and beyond

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lawrence J.

    As lignite mining in Texas approaches and exceeds depths of 200 feet below ground level, rising costs demand that innovative mining approaches be used in order to maintain the economic viability of lignite mining. Groundwater and slope stability problems multiply at these depths, resulting in increasing focus on how to control these costs. Dewatering costs are consistently rising for the lignite industry, as deeper mining encounters more and larger saturated sand bodies. These sands require dewatering in order to improve slope stability. Planning and analysis become more important as the number of wells grows beyond what can be managed with a simple "cookie-cutter" approach. Slope stability plays an increasing role in mining concerns as deeper lignite is recovered. Slope stability causes several problems, including loss of lignite, increased rehandle, and hazards to personnel and equipment. Traditional lignite mine planning involved a fairly "generic" pit design with one design highwall angle, one design spoil angle, and little geotechnical evaluation of the deposit. This "one mine-one design" approach, while cost-effective in the past, is now being replaced by a more critical analysis of the design requirements of each area. Geotechnical evaluation plays an increasing role in the planning and operational aspects of lignite mining. Laboratory core sample test results can be used for slope stability modeling, in order to obtain more accurate design and operational information.

  11. Mining the SDSS SkyServer SQL queries log

    NASA Astrophysics Data System (ADS)

    Hirota, Vitor M.; Santos, Rafael; Raddick, Jordan; Thakar, Ani

    2016-05-01

    SkyServer, the Internet portal for the Sloan Digital Sky Survey (SDSS) astronomical catalog, provides a set of tools that allows data access for astronomers and scientific education. One of SkyServer's data access interfaces allows users to enter ad-hoc SQL statements to query the catalog. SkyServer also presents some template queries that can be used as a basis for more complex queries. This interface has logged over 330 million queries submitted since 2001. It is expected that analysis of this data can be used to investigate usage patterns, identify potential new classes of queries, find similar queries, etc., and to shed some light on how users interact with the Sloan Digital Sky Survey data and how scientists have adopted the new paradigm of e-Science, which could in turn lead to enhancements to the user interfaces and experience in general. In this paper we review some approaches to SQL query mining, apply the traditional techniques used in the literature and present lessons learned, namely, that the general text mining approach for feature extraction and clustering does not seem to be adequate for this type of data, and, most importantly, we find that this type of analysis can result in very different queries being clustered together.
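
    One way to see why token-based text mining can group very different queries together is to compare queries by their word sets. The sketch below is an illustration under stated assumptions, not the feature extraction used in the paper: the tokenizer, the Jaccard measure and the three example queries are all chosen here for demonstration.

```python
import re


def tokens(sql):
    """Lower-cased identifier and keyword tokens; numeric literals are dropped."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))


def jaccard(a, b):
    """Jaccard similarity of the token sets of two SQL strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


# Illustrative queries: q1 and q2 select different object types and opposite
# brightness ranges, yet share every token once the literals are ignored.
q1 = "SELECT ra, dec FROM PhotoObj WHERE type = 6 AND r < 17"
q2 = "SELECT ra, dec FROM PhotoObj WHERE type = 3 AND r > 22"
q3 = "SELECT COUNT(*) FROM SpecObj"

sim_12 = jaccard(q1, q2)
sim_13 = jaccard(q1, q3)
print(f"q1 vs q2: {sim_12:.2f}")
print(f"q1 vs q3: {sim_13:.2f}")
```

    Under this representation q1 and q2 are indistinguishable even though they ask for different populations of objects, which is the kind of failure the authors report for generic text features.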

  12. Application-Specific Graph Sampling for Frequent Subgraph Mining and Community Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Purohit, Sumit; Choudhury, Sutanay; Holder, Lawrence B.

    Graph mining is an important data analysis methodology, but struggles as the input graph size increases. The scalability and usability challenges posed by such large graphs make it imperative to sample the input graph and reduce its size. The critical challenge in sampling is to identify the appropriate algorithm to ensure the resulting analysis does not suffer heavily from the data reduction. Predicting the expected performance degradation for a given graph and sampling algorithm is also useful. In this paper, we present different sampling approaches for graph mining applications such as Frequent Subgraph Mining (FSM) and Community Detection (CD). We explore graph metrics such as PageRank, Triangles, and Diversity to sample a graph and conclude that for heterogeneous graphs Triangles and Diversity perform better than degree-based metrics. We also present two new sampling variations for targeted graph mining applications. We present empirical results to show that knowledge of the target application, along with input graph properties, can be used to select the best sampling algorithm. We also conclude that performance degradation is an abrupt, rather than gradual, phenomenon as the sample size decreases. We present the empirical results to show that the performance degradation follows a logistic function.
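
    A logistic curve explains why degradation looks abrupt rather than gradual: quality stays near its plateau over a wide range of sample sizes and then collapses within a narrow band. The sketch below evaluates a generic logistic function; the parameter values (`upper`, `steepness`, `midpoint`) and the use of sample fraction as the input are hypothetical and are not fitted to the paper's results.

```python
from math import exp


def logistic(x, upper=1.0, steepness=20.0, midpoint=0.3):
    """Generic logistic curve: upper / (1 + exp(-steepness * (x - midpoint)))."""
    return upper / (1.0 + exp(-steepness * (x - midpoint)))


# Retained analysis quality (0..1) at several hypothetical sample fractions.
for fraction in (0.1, 0.2, 0.3, 0.4, 0.5, 1.0):
    print(f"sample fraction {fraction:.1f}: quality {logistic(fraction):.3f}")
```

    With these illustrative parameters, quality is above 0.98 at a sample fraction of 0.5 and below 0.02 at 0.1, so nearly all of the loss happens between those two sizes.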

  13. A network-based approach for semi-quantitative knowledge mining and its application to yield variability

    NASA Astrophysics Data System (ADS)

    Schauberger, Bernhard; Rolinski, Susanne; Müller, Christoph

    2016-12-01

    Variability of crop yields is detrimental to food security. Under climate change its amplitude is likely to increase, thus it is essential to understand the underlying causes and mechanisms. Crop models are the primary tool to project future changes in crop yields under climate change. A systematic overview of drivers and mechanisms of crop yield variability (YV) can thus inform crop model development and facilitate improved understanding of climate change impacts on crop yields. Yet there is a vast body of literature on crop physiology and YV, which makes a prioritization of mechanisms for implementation in models challenging. Therefore this paper takes a novel approach to systematically mine and organize existing knowledge from the literature. The aim is to identify important mechanisms lacking in models, which can help to set priorities in model improvement. We structure knowledge from the literature in a semi-quantitative network. This network consists of complex interactions between growing conditions, plant physiology and crop yield. We utilize the resulting network structure to assign relative importance to causes of YV and related plant physiological processes. As expected, our findings confirm existing knowledge, in particular on the dominant role of temperature and precipitation, but also highlight other important drivers of YV. More importantly, our method allows for identifying the relevant physiological processes that transmit variability in growing conditions to variability in yield. We can identify explicit targets for the improvement of crop models. The network can additionally guide model development by outlining complex interactions between processes and by easily retrieving quantitative information for each of the 350 interactions. We show the validity of our network method as a structured, consistent and scalable dictionary of literature. The method can easily be applied to many other research fields.

  14. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  15. Large Mine Permitting - Div. of Mining, Land, and Water

    Science.gov Websites

    Pebble Project; Pogo Mine; Red Dog Mine; Rock Creek Project; True North Mine; OPMP Canadian Large Projects. Contact: Kyle Moselle Large Mine

  16. Data Integration and Mining for Synthetic Biology Design.

    PubMed

    Mısırlı, Göksel; Hallinan, Jennifer; Pocock, Matthew; Lord, Phillip; McLaughlin, James Alastair; Sauro, Herbert; Wipat, Anil

    2016-10-21

    One aim of synthetic biologists is to create novel and predictable biological systems from simpler modular parts. This approach is currently hampered by a lack of well-defined and characterized parts and devices. However, there is a wealth of existing biological information in the literature and in numerous biological databases, which can be used to identify and characterize biological parts and their design constraints. This information, though, is spread among these databases in many different formats. New computational approaches are required to make this information available in an integrated format that is more amenable to data mining. A tried and tested approach to this problem is to map disparate data sources into a single data set, with common syntax and semantics, to produce a data warehouse or knowledge base. Ontologies have been used extensively in the life sciences, providing this common syntax and semantics as a model for a given biological domain, in a fashion that is amenable to computational analysis and reasoning. Here, we present an ontology for applications in synthetic biology design, SyBiOnt, which facilitates the modeling of information about biological parts and their relationships. SyBiOnt was used to create the SyBiOntKB knowledge base, incorporating and building upon existing life sciences ontologies and standards. The reasoning capabilities of ontologies were then applied to automate the mining of biological parts from this knowledge base. We propose that this approach will be useful to speed up synthetic biology design and ultimately help facilitate the automation of the biological engineering life cycle.

  17. Study of application of ERTS-A imagery to fracture-related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.; Leshendok, T.

    1973-01-01

    The author has identified the following significant results. The Kings Station Mine in Gibson County, Indiana has experienced considerable roof fall problems. Detailed fracture mapping of the mine area was done with ERTS-1 and aircraft imagery, and a prediction map of roof problem areas was produced in advance of a visit. The visit to the mine and discussions with the operator indicated that of four zones mapped as potential problem areas, three coincided with areas of excessive roof fall. This positive correlation of 75% lends confidence to the validity of the technique being applied in the investigation. The mine officials expressed an interest in the project and are anxious to see the final product maps which are forthcoming.

  18. Use of Lead Isotopes to Identify Sources of Metal and Metalloid Contaminants in Atmospheric Aerosol from Mining Operations

    PubMed Central

    Félix, Omar I.; Csavina, Janae; Field, Jason; Rine, Kyle P.; Sáez, A. Eduardo; Betterton, Eric A.

    2014-01-01

    Mining operations are a potential source of metal and metalloid contamination by atmospheric particulate generated from smelting activities, as well as from erosion of mine tailings. In this work, we show how lead isotopes can be used for source apportionment of metal and metalloid contaminants from the site of an active copper mine. Analysis of atmospheric aerosol shows two distinct isotopic signatures: one prevalent in fine particles (< 1 μm aerodynamic diameter) while the other corresponds to coarse particles as well as particles in all size ranges from a nearby urban environment. The lead isotopic ratios found in the fine particles are equal to those of the mine that provides the ore to the smelter. Topsoil samples at the mining site show concentrations of Pb and As decreasing with distance from the smelter. Isotopic ratios for the sample closest to the smelter (650 m) and from topsoil at all sample locations, extending to more than 1 km from the smelter, were similar to those found in fine particles in atmospheric dust. The results validate the use of lead isotope signatures for source apportionment of metal and metalloid contaminants transported by atmospheric particulate. PMID:25496740

  19. Automation and robotics technology for intelligent mining systems

    NASA Technical Reports Server (NTRS)

    Welsh, Jeffrey H.

    1989-01-01

    The U.S. Bureau of Mines is approaching the problems of accidents and efficiency in the mining industry through the application of automation and robotics to mining systems. This technology can increase safety by removing workers from hazardous areas of the mines or from performing hazardous tasks. The short-term goal of the Automation and Robotics program is to develop technology that can be implemented in the form of an autonomous mining machine using current continuous mining machine equipment. In the longer term, the goal is to conduct research that will lead to new intelligent mining systems that capitalize on the capabilities of robotics. The Bureau of Mines Automation and Robotics program has been structured to produce the technology required for the short- and long-term goals. The short-term goal of application of automation and robotics to an existing mining machine, resulting in autonomous operation, is expected to be accomplished within five years. Key technology elements required for an autonomous continuous mining machine are well underway and include machine navigation systems, coal-rock interface detectors, machine condition monitoring, and intelligent computer systems. The Bureau of Mines program is described, including status of key technology elements for an autonomous continuous mining machine, the program schedule, and future work. Although the program is directed toward underground mining, much of the technology being developed may have applications for space systems or mining on the Moon or other planets.

  20. Challenges in recovering resources from acid mine drainage

    USGS Publications Warehouse

    Nordstrom, D. Kirk; Bowell, Robert J.; Campbell, Kate M.; Alpers, Charles N.

    2017-01-01

    Metal recovery from mine waters and effluents is not a new approach but one that has occurred largely opportunistically over the last four millennia. Due to the need for low-cost resources and increasingly stringent environmental conditions, mine waters are being considered in a fresh light with a designed, deliberate approach to resource recovery often as part of a larger water treatment evaluation. Mine water chemistry is highly dependent on many factors including geology, ore deposit composition and mineralogy, mining methods, climate, site hydrology, and others. Mine waters are typically Ca-Mg-SO4±Al±Fe with a broad range in pH and metal content. The main issue in recovering components of these waters having potential economic value, such as base metals or rare earth elements, is the separation of these from more reactive metals such as Fe and Al. Broad categories of methods for separating and extracting substances from acidic mine drainage are chemical and biological. Chemical methods include solution, physicochemical, and electrochemical technologies. Advances in membrane techniques such as reverse osmosis have been substantial and the technique is both physical and chemical. Biological methods may be further divided into microbiological and macrobiological, but only the former is considered here as a recovery method, as the latter is typically used as a passive form of water treatment.

  1. Microbial and geochemical assessment of bauxitic un-mined and post-mined chronosequence soils from Mocho Mountains, Jamaica.

    PubMed

    Lewis, Dawn E; Chauhan, Ashvini; White, John R; Overholt, Will; Green, Stefan J; Jasrotia, Puja; Wafula, Denis; Jagoe, Charles

    2012-10-01

    Microorganisms are very sensitive to environmental change and can be used to gauge anthropogenic impacts and even predict restoration success of degraded environments. Here, we report assessment of bauxite mining activities on soil biogeochemistry and microbial community structure using un-mined and three post-mined sites in Jamaica. The post-mined soils represent a chronosequence, undergoing restoration since 1987, 1997, and 2007. Soils were collected during dry and wet seasons and analyzed for pH, organic matter (OM), total carbon (TC), nitrogen (TN), and phosphorus. The microbial community structure was assessed through quantitative PCR and massively parallel bacterial ribosomal RNA (rRNA) gene sequencing. Edaphic factors and microbial community composition were analyzed using multivariate statistical approaches and revealed a significant, negative impact of mining on soil that persisted even after greater than 20 years of restoration. Seasonal fluctuations contributed to variation in measured soil properties and community composition, but they were minor in comparison to long-term effects of mining. In both seasons, post-mined soils were higher in pH but OM, TC, and TN decreased. Bacterial rRNA gene analyses demonstrated a general decrease in diversity in post-mined soils and up to a 3-log decrease in rRNA gene abundance. Community composition analyses demonstrated that bacteria from the Proteobacteria (α, β, γ, δ), Acidobacteria, and Firmicutes were abundant in all soils. The abundance of Firmicutes was elevated in newer post-mined soils relative to the un-mined soil, and this contrasted a decrease, relative to un-mined soils, in proteobacterial and acidobacterial rRNA gene abundances. Our study indicates long-lasting impacts of mining activities to soil biogeochemical and microbial properties with impending loss in soil productivity.

  2. An Improved Approach to Estimate Methane Emissions from Coal Mining in China.

    PubMed

    Zhu, Tao; Bian, Wenjing; Zhang, Shuqing; Di, Pingkuan; Nie, Baisheng

    2017-11-07

    China, the largest coal producer in the world, is responsible for over 50% of the total global methane (CH4) emissions from coal mining. However, the current emission inventory of CH4 from coal mining has large uncertainties because of the lack of localized emission factors (EFs). In this study, province-level CH4 EFs from coal mining in China were developed based on the data analysis of coal production and corresponding discharged CH4 emissions from 787 coal mines distributed in 25 provinces with different geological and operation conditions. Results show that the spatial distribution of CH4 EFs is highly variable with values as high as 36 m3/t and as low as 0.74 m3/t. Based on newly developed CH4 EFs and activity data, an inventory of the province-level CH4 emissions was built for 2005-2010. Results reveal that the total CH4 emissions in China increased from 11.5 Tg in 2005 to 16.0 Tg in 2010. By constructing a gray forecasting model for CH4 EFs and a regression model for activity, the province-level CH4 emissions from coal mining in China are forecasted for the years of 2011-2020. The estimates are compared with other published inventories. Our results have a reasonable agreement with USEPA's inventory and are lower by a factor of 1-2 than those estimated using the IPCC default EFs. This study could help guide CH4 mitigation policies and practices in China.
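
    An inventory of this kind multiplies an emission factor by activity data. The following is a minimal sketch of that arithmetic, not the authors' method: the province names and production figures are hypothetical, and the CH4 density of 0.67 kg/m3 (roughly 20 °C and 1 atm) is an assumed conversion that the abstract does not state. Only the two EF values (36 and 0.74 m3/t) come from the abstract.

```python
# Assumed conversion: density of CH4 at about 20 degC and 1 atm (not given in the abstract).
CH4_DENSITY_KG_PER_M3 = 0.67


def ch4_emissions_tg(ef_m3_per_t, coal_production_t):
    """Emissions in Tg from an emission factor (m3 CH4 per tonne of coal)
    and activity data (tonnes of coal produced)."""
    mass_kg = ef_m3_per_t * coal_production_t * CH4_DENSITY_KG_PER_M3
    return mass_kg / 1e9  # 1 Tg = 1e9 kg


# Hypothetical provinces: (EF in m3/t, coal production in tonnes).
provinces = {
    "province_a": (36.0, 50e6),
    "province_b": (0.74, 200e6),
}

inventory = {
    name: ch4_emissions_tg(ef, production)
    for name, (ef, production) in provinces.items()
}
total_tg = sum(inventory.values())
```

    With these made-up inputs, the high-EF province dominates the total even though it produces a quarter of the coal, which illustrates why localized EFs matter for the inventory.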

  3. Optimizing data collection for public health decisions: a data mining approach.

    PubMed

    Partington, Susan N; Papakroni, Vasil; Menzies, Tim

    2014-06-12

    Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R2 values of 92% and 94% for restaurant and grocery store data, respectively. While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost.
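
    The reduced-item score described above can be sketched as a weighted sum. This is an illustration only: the item names, weights, and constant are hypothetical, and treating the "error term" as an additive constant is an assumption about the abstract's wording.

```python
def reduced_score(item_values, weights, constant):
    """Weighted sum over the retained survey items plus an additive constant.

    Items that are not in `weights` (dropped by feature selection) are ignored.
    """
    return constant + sum(weights[name] * item_values[name] for name in weights)


# Hypothetical regression coefficients for two retained items.
weights = {"item_fruit": 2.0, "item_milk": 1.5}
constant = 3.0

# One hypothetical outlet; "item_unused" was removed by feature selection.
outlet = {"item_fruit": 2, "item_milk": 1, "item_unused": 4}
score = reduced_score(outlet, weights, constant)
```

    Scores computed this way for every outlet could then be compared against the full-survey scores, as the study does with a rank-sum test.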

  4. Data Mining Techniques Applied to Hydrogen Lactose Breath Test.

    PubMed

    Rubio-Escudero, Cristina; Valverde-Fernández, Justo; Nepomuceno-Chamorro, Isabel; Pontes-Balanza, Beatriz; Hernández-Mendoza, Yoedusvany; Rodríguez-Herrera, Alfonso

    2017-01-01

    The objective was to analyze a set of hydrogen breath test data using data mining tools and to identify new patterns of H2 production. K-means clustering was applied as the data mining technique to hydrogen breath test data sets from 2571 patients. Six different patterns have been extracted upon analysis of the hydrogen breath test data. We have also shown the relevance of each of the samples taken throughout the test. Analysis of the hydrogen breath test data sets using data mining techniques has identified new patterns of hydrogen generation upon lactose absorption. We can see the potential of application of data mining techniques to clinical data sets. These results offer promising data for future research on the relations between gut microbiota produced hydrogen and its link to clinical symptoms.
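
    K-means groups curves by repeatedly assigning each one to its nearest centroid and then recomputing the centroids. A minimal sketch of that loop (Lloyd's algorithm) follows; the H2 readings are invented, k is set to 2 rather than the six patterns reported, and the deterministic initialization from the first k points is a simplification chosen for reproducibility, not something the abstract specifies.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; centroids start at the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: mean of each cluster's members, per time point.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical H2 readings (ppm) at four sampling times per patient.
curves = [
    [5, 6, 5, 7],     # flat response
    [4, 30, 60, 80],  # rising response
    [6, 5, 6, 6],     # flat response
    [5, 28, 65, 75],  # rising response
]
labels, centroids = kmeans(curves, k=2)
```

    On this toy input the flat and rising curves end up in separate clusters; real data would need a considered choice of k and initialization.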

  5. DrugQuest - a text mining workflow for drug association discovery.

    PubMed

    Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Vizirianakis, Ioannis S; Iliopoulos, Ioannis

    2016-06-06

    Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of bio-medical literature and information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases. Herein, we apply a text mining approach on the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques on these fields to identify chemicals, proteins, genes, pathways, diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a plethora of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible scenarios why these records are clustered together. Different views such as clustered chemicals based on their textual information, tag clouds consisting of Significant Terms along with the terms that were used for clustering are delivered to the user through a user-friendly web interface. DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest.

  6. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells

    PubMed Central

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J.; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-01-01

    The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication

  7. A Network Biology Approach Identifies Molecular Cross-Talk between Normal Prostate Epithelial and Prostate Carcinoma Cells.

    PubMed

    Trevino, Victor; Cassese, Alberto; Nagy, Zsuzsanna; Zhuang, Xiaodong; Herbert, John; Antczak, Philipp; Clarke, Kim; Davies, Nicholas; Rahman, Ayesha; Campbell, Moray J; Guindani, Michele; Bicknell, Roy; Vannucci, Marina; Falciani, Francesco

    2016-04-01

    The advent of functional genomics has enabled the genome-wide characterization of the molecular state of cells and tissues, virtually at every level of biological organization. The difficulty in organizing and mining this unprecedented amount of information has stimulated the development of computational methods designed to infer the underlying structure of regulatory networks from observational data. These important developments had a profound impact in biological sciences since they triggered the development of a novel data-driven investigative approach. In cancer research, this strategy has been particularly successful. It has contributed to the identification of novel biomarkers, to a better characterization of disease heterogeneity and to a more in depth understanding of cancer pathophysiology. However, so far these approaches have not explicitly addressed the challenge of identifying networks representing the interaction of different cell types in a complex tissue. Since these interactions represent an essential part of the biology of both diseased and healthy tissues, it is of paramount importance that this challenge is addressed. Here we report the definition of a network reverse engineering strategy designed to infer directional signals linking adjacent cell types within a complex tissue. The application of this inference strategy to prostate cancer genome-wide expression profiling data validated the approach and revealed that normal epithelial cells exert an anti-tumour activity on prostate carcinoma cells. Moreover, by using a Bayesian hierarchical model integrating genetics and gene expression data and combining this with survival analysis, we show that the expression of putative cell communication genes related to focal adhesion and secretion is affected by epistatic gene copy number variation and it is predictive of patient survival. Ultimately, this study represents a generalizable approach to the challenge of deciphering cell communication networks

  8. Collective feature selection to identify crucial epistatic variants.

    PubMed

    Verma, Shefali S; Lucas, Anastasia; Zhang, Xinyuan; Veturi, Yogasudha; Dudek, Scott; Li, Binglan; Li, Ruowang; Urbanowicz, Ryan; Moore, Jason H; Kim, Dokyoon; Ritchie, Marylyn D

    2018-01-01

    Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~ 44,000 samples obtained from Geisinger
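
    The "union" idea described above can be sketched in a few lines: each method ranks the variants, a user-defined fraction is kept per method, and the kept sets are merged. The method names, variant names, and scores below are hypothetical placeholders, and the sketch does not reproduce any of the specific tools named in the abstract.

```python
def top_fraction(scores, fraction):
    """Names of the highest-scoring features, keeping at least one."""
    n_keep = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def collective_selection(method_scores, fraction):
    """Union of the top `fraction` of features chosen by each method."""
    selected = set()
    for scores in method_scores.values():
        selected |= top_fraction(scores, fraction)
    return selected


# Hypothetical importance scores from two feature selection methods.
method_scores = {
    "method_a": {"snp1": 0.9, "snp2": 0.1, "snp3": 0.5, "snp4": 0.2},
    "method_b": {"snp1": 0.2, "snp2": 0.8, "snp3": 0.7, "snp4": 0.1},
}
selected = collective_selection(method_scores, fraction=0.5)
```

    Taking the union rather than the intersection keeps a variant that only one method ranks highly, which is the point of combining methods that excel under different conditions.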

  9. Respiratory Emergencies and Management of Mining Accidents

    PubMed Central

    Özmen, İpek; Aksoy, Emine

    2015-01-01

    The rapid detection of the reasons for mining accidents that lead to emergency situations is vital for search and rescue work. The control of fire and gas leakage provides an immediate approach for rescue works for deaths or injuries and the detection of who needs resuscitation outside of the mine. The evacuation and recovery operations should be directed by continuous monitoring of the mine environment due to fire and explosion risks. The main toxic gases in mines are carbon monoxide (CO) and carbon dioxide (CO2); the flammable gases are methane (CH4), CO, and hydrogen (H2); the suffocating gases are CO2, nitrogen (N2), and CH4; and the toxic gases are CO, nitrogen oxides (NOx), and hydrogen sulfide (H2S). PMID:29404110

  10. Estimating natural background groundwater chemistry, Questa molybdenum mine, New Mexico

    USGS Publications Warehouse

    Verplanck, Phillip L.; Nordstrom, D. Kirk; Plumlee, Geoffrey S.; Walker, Bruce M.; Morgan, Lisa A.; Quane, Steven L.

    2010-01-01

    This 2 1/2 day field trip will present an overview of a U.S. Geological Survey (USGS) project whose objective was to estimate pre-mining groundwater chemistry at the Questa molybdenum mine, New Mexico. Because of intense debate among stakeholders regarding pre-mining groundwater chemistry standards, the New Mexico Environment Department and Chevron Mining Inc. (formerly Molycorp) agreed that the USGS should determine pre-mining groundwater quality at the site. In 2001, the USGS began a 5-year, multidisciplinary investigation to estimate pre-mining groundwater chemistry, using a detailed assessment of a proximal natural analog site and an interdisciplinary approach to infer pre-mining conditions. The trip will include a surface tour of the Questa mine and key locations in the erosion scar areas and along the Red River. The trip will provide participants with a detailed understanding of geochemical processes that influence pre-mining environmental baselines in mineralized areas and estimation techniques for determining pre-mining baseline conditions.

  11. Mining the human gut microbiota for effector strains that shape the immune system

    PubMed Central

    Ahern, Philip P.; Faith, Jeremiah J.; Gordon, Jeffrey I.

    2014-01-01

    The gut microbiota co-develops with the immune system beginning at birth. Mining the microbiota for bacterial strains responsible for shaping the structure and dynamic operations of the innate and adaptive arms of the immune system represents a formidable combinatorial problem but one that needs to be overcome to advance mechanistic understanding of microbial community-immune system co-regulation, and in order to develop new diagnostic and therapeutic approaches that promote health. Here, we discuss a scalable, less biased approach for identifying effector strains in complex microbial communities that impact immune function. The approach begins by identifying uncultured human fecal microbiota samples that transmit immune phenotypes to germ-free mice. Clonally-arrayed sequenced collections of bacterial strains are constructed from representative donor microbiota. If the collection transmits phenotypes, effector strains are identified by testing randomly generated subsets with overlapping membership in individually-housed germ-free animals. Detailed mechanistic studies of effector strain-host interactions can then be performed. PMID:24950201

  12. The environmental assessment of a contemporary coal mining system

    NASA Technical Reports Server (NTRS)

    Dutzi, E. J.; Sullivan, P. J.; Hutchinson, C. F.; Stevens, C. M.

    1980-01-01

    A contemporary underground coal mine in eastern Kentucky was assessed in order to determine potential off-site and on-site environmental impacts associated with the mining system in the given environmental setting. A 4-section, continuous room and pillar mine plan was developed for an appropriate site in eastern Kentucky. Potential environmental impacts were identified, and mitigation costs determined. The major potential environmental impacts were determined to be: acid water drainage from the mine and refuse site, uneven subsidence of the surface as a result of mining activity, and alteration of ground water aquifers in the subsidence zone. In the specific case examined, the costs of environmental impact mitigation to levels prescribed by regulations would not exceed $1/ton of coal mined, and post mining land values would not be affected.

  13. Study on perception and control layer of mine CPS with mixed logic dynamic approach

    NASA Astrophysics Data System (ADS)

    Li, Jingzhao; Ren, Ping; Yang, Dayu

    2017-01-01

    The inclined roadway transportation system of a mine cyber physical system (CPS) is a hybrid system consisting of a continuous-time system and a discrete-time system, which can be divided into inclined roadway signal subsystems, error-proofing channel subsystems, anti-car subsystems, and frequency control subsystems. First, to ensure stable operation and improve efficiency and production safety, a hybrid system model with n inputs and m outputs is constructed and analyzed in detail, and its steady schedule state is then solved. Second, on the basis of formal modeling for real-time systems, we use the hybrid toolbox for system security verification. Third, the practical application of the mine cyber physical system shows that the method for real-time simulation of the mine cyber physical system is effective.

  14. Metal(loid) levels in biological matrices from human populations exposed to mining contamination--Panasqueira Mine (Portugal).

    PubMed

    Coelho, Patrícia; Costa, Solange; Silva, Susana; Walter, Alan; Ranville, James; Sousa, Ana C A; Costa, Carla; Coelho, Marta; García-Lestón, Julia; Pastorinho, M Ramiro; Laffon, Blanca; Pásaro, Eduardo; Harrington, Chris; Taylor, Andrew; Teixeira, João Paulo

    2012-01-01

    Mining activities may affect the health of miners and communities living near mining sites, and these health effects may persist even when the mine is abandoned. During mining processes various toxic wastes are produced and released into the surrounding environment, resulting in contamination of air, drinking water, rivers, plants, and soils. In a geochemical sampling campaign undertaken in the Panasqueira Mine area of central Portugal, an anomalous distribution of several metals and arsenic (As) was identified in various environmental media. Several potentially harmful elements, including As, cadmium (Cd), chromium (Cr), manganese (Mn), nickel (Ni), lead (Pb), and selenium (Se), were quantified in blood, urine, hair, and nails (toe and finger) from a group of individuals living near the Panasqueira Mine who were environmentally and occupationally exposed. A group with similar demographic characteristics without known exposure to mining activities was also compared. Genotoxicity was evaluated by means of T-cell receptor (TCR) mutation assay, and percentages of different lymphocyte subsets were selected as immunotoxicity biomarkers. Inductively coupled plasma-mass spectrometry (ICP-MS) and inductively coupled plasma-atomic emission spectrometry (ICP-AES) analysis showed elevated levels of As, Cd, Cr, Mn, and Pb in all biological samples taken from populations living close to the mine compared to controls. Genotoxic and immunotoxic differences were also observed. The results provide evidence of an elevated potential risk to the health of populations, with environmental and occupational exposures resulting from mining activities. Further, the results emphasize the need to implement preventive measures, remediation, and rehabilitation plans for the region.

  15. Open Pit Mine 3d Mapping by Tls and Digital Photogrammetry: 3d Model Update Thanks to a Slam Based Approach

    NASA Astrophysics Data System (ADS)

    Vassena, G.; Clerici, A.

    2018-05-01

    State-of-the-art 3D surveying technologies, if correctly applied, make it possible to obtain 3D coloured models of large open pit mines using different technologies such as terrestrial laser scanning (TLS) with images, combined with UAV-based digital photogrammetry. GNSS and/or total station are also currently used to georeference the model. The University of Brescia has carried out a project to map in 3D an open pit mine located in Botticino, a famous location of marble extraction close to Brescia in northern Italy. Terrestrial laser scanner 3D point clouds combined with RGB images and digital photogrammetry from UAV have been used to map a large part of the quarry. Using rigorous and well-known procedures, a 3D point cloud and mesh model have been obtained. After the description of the combined mapping process, the paper describes the innovative process proposed for the daily/weekly update of the model itself. To accomplish this task, a SLAM-based approach is described, relying on an innovative instrument capable of running an automatic localization process and real-time, in-the-field change detection analysis.

  16. A Data Mining Approach to Reveal Representative Collaboration Indicators in Open Collaboration Frameworks

    ERIC Educational Resources Information Center

    Anaya, Antonio R.; Boticario, Jesus G.

    2009-01-01

    Data mining methods are successful in educational environments to discover new knowledge or learner skills or features. Unfortunately, they have not been used in depth with collaboration. We have developed a scalable data mining method, whose objective is to infer information on the collaboration during the collaboration process in a…

  17. A review of contrast pattern based data mining

    NASA Astrophysics Data System (ADS)

    Zhu, Shiwei; Ju, Meilong; Yu, Junfeng; Cai, Binlei; Wang, Aiping

    2015-07-01

    Contrast pattern based data mining is concerned with the mining of patterns and models that contrast two or more datasets. Contrast patterns can describe similarities or differences between the datasets. They represent strong contrast knowledge and have been shown to be very successful for constructing accurate and robust clusters and classifiers. The increasing use of contrast pattern data mining has initiated a great deal of research and development in the field of data mining. A comprehensive review of the existing contrast pattern based data mining research is given in this paper. The research is broadly categorized into background and representation, definitions and mining algorithms, contrast pattern based classification, clustering, and other applications, and future research trends. The primary aim of this paper is to serve as a glossary that gives interested researchers an overall picture of current contrast pattern based data mining development and helps them identify potential research directions for future investigation.
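
    One common way to quantify how strongly a pattern contrasts two datasets is the ratio of its support in each, often called the growth rate. The sketch below assumes that measure for illustration; the abstract does not say which measures the review covers, and the transactions are invented.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def growth_rate(itemset, background, target):
    """Support in `target` divided by support in `background`."""
    s_background = support(itemset, background)
    s_target = support(itemset, target)
    if s_background == 0:
        # Pattern is absent from the background dataset.
        return float("inf") if s_target > 0 else 0.0
    return s_target / s_background


# Two hypothetical transaction datasets to contrast.
background = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}]
target = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]

rate = growth_rate({"a", "b"}, background, target)
```

    Here the itemset {a, b} appears in one of four background transactions and three of four target transactions, so it is three times as frequent in the target set; a pattern with a high ratio is a candidate contrast pattern.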

  18. Automation of the longwall mining system

    NASA Technical Reports Server (NTRS)

    Zimmerman, W.; Aster, R. W.; Harris, J.; High, J.

    1982-01-01

    Cost effective, safe, and technologically sound applications of automation technology to underground coal mining were identified. The longwall analysis commenced with a general search for government and industry experience of mining automation technology. A brief industry survey was conducted to identify longwall operational, safety, and design problems. The prime automation candidates resulting from the industry experience and survey were: (1) the shearer operation, (2) shield and conveyor pan line advance, (3) a management information system to allow improved mine logistics support, and (4) component fault isolation and diagnostics to reduce untimely maintenance delays. A system network analysis indicated that a 40% improvement in productivity was feasible if system delays associated with all of the above four areas were removed. A technology assessment and conceptual system design of each of the four automation candidate areas showed that state of the art digital computer, servomechanism, and actuator technologies could be applied to automate the longwall system.

  19. Refining adverse drug reaction signals by incorporating interaction variables identified using emergent pattern mining.

    PubMed

    Reps, Jenna M; Aickelin, Uwe; Hubbard, Richard B

    2016-02-01

    To develop a framework for identifying and incorporating candidate confounding interaction terms into a regularised Cox regression analysis to refine adverse drug reaction signals obtained via longitudinal observational data. We considered six drug families that are commonly associated with myocardial infarction in observational healthcare data, but where the causal relationship ground truth is known (adverse drug reaction or not). We applied emergent pattern mining to find itemsets of drugs and medical events that are associated with the development of myocardial infarction. These are the candidate confounding interaction terms. We then implemented a cohort study design using regularised Cox regression that incorporated and accounted for the candidate confounding interaction terms. The methodology was able to account for signals generated due to confounding, and a Cox regression with elastic net regularisation correctly ranked the drug families known to be true adverse drug reactions above those that are not. This was not the case without the inclusion of the candidate confounding interaction terms, where confounding led to a non-adverse drug reaction being ranked highest. The methodology is efficient, can identify high-order confounding interactions and does not require expert input to specify outcome-specific confounders, so it can be applied to any outcome of interest to quickly refine its signals. The proposed method shows excellent potential to overcome some forms of confounding and therefore reduce the false positive rate for signal analysis using longitudinal data. Copyright © 2015 Elsevier Ltd. All rights reserved.
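
    The pattern-mining step in this record can be illustrated with a minimal sketch. It assumes the common definition of an emerging pattern as an itemset whose support grows by at least a chosen ratio from a background dataset to a target dataset. The function names, the threshold, and the toy transactions are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def growth_rate(itemset, background, target):
    """Support in the target dataset divided by support in the background."""
    s_bg = support(itemset, background)
    s_tg = support(itemset, target)
    if s_tg == 0:
        return 0.0
    if s_bg == 0:
        return float("inf")  # pattern occurs only in the target dataset
    return s_tg / s_bg


def emerging_patterns(background, target, min_growth=2.0, max_size=2):
    """Brute-force search over small itemsets (fine for toy data only)."""
    items = sorted(set().union(*target))
    found = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            gr = growth_rate(frozenset(combo), background, target)
            if gr >= min_growth:
                found[combo] = gr
    return found


# Hypothetical records: patients who developed the outcome vs. those who did not.
cases = [frozenset(s) for s in (
    {"drug_a", "event_x"}, {"drug_a", "event_x"},
    {"drug_a", "event_y"}, {"drug_b"},
)]
controls = [frozenset(s) for s in (
    {"drug_a"}, {"drug_b", "event_x"}, {"drug_b"}, {"event_y"},
)]

patterns = emerging_patterns(controls, cases)
for itemset, gr in patterns.items():
    print(itemset, gr)
```

    In this toy data the pair (drug_a, event_x) never occurs among the controls, so its growth rate is infinite. Itemsets of that kind are what the abstract treats as candidate interaction terms for the regression.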

  20. Geochemistry of Standard Mine Waters, Gunnison County, Colorado, July 2009

    USGS Publications Warehouse

    Verplanck, Philip L.; Manning, Andrew H.; Graves, Jeffrey T.; McCleskey, R. Blaine; Todorov, Todor I.; Lamothe, Paul J.

    2009-01-01

    In many hard-rock-mining districts water flowing from abandoned mine adits is a primary source of metals to receiving streams. Understanding the generation of adit discharge is an important step in developing remediation plans. In 2006, the U.S. Environmental Protection Agency listed the Standard Mine in the Elk Creek drainage basin near Crested Butte, Colorado as a superfund site because drainage from the Standard Mine enters Elk Creek, contributing dissolved and suspended loads of zinc, cadmium, copper, and other metals to the stream. Elk Creek flows into Coal Creek, which is a source of drinking water for the town of Crested Butte. In 2006 and 2007, the U.S. Geological Survey undertook a hydrogeologic investigation of the Standard Mine and vicinity and identified areas of the underground workings for additional work. Mine drainage, underground-water samples, and selected spring water samples were collected in July 2009 for analysis of inorganic solutes as part of a follow-up study. Water analyses are reported for mine-effluent samples from Levels 1 and 5 of the Standard Mine, underground samples from Levels 2 and 3 of the Standard Mine, two spring samples, and an Elk Creek sample. Reported analyses include field measurements (pH, specific conductance, water temperature, dissolved oxygen, and redox potential), major constituents and trace elements, and oxygen and hydrogen isotopic determinations. Overall, water samples collected in 2009 at the same sites as were collected in 2006 have similar chemical compositions. Similar to 2006, water in Level 3 did not flow out the portal but was observed to flow into open workings to lower parts of the mine. Many dissolved constituent concentrations, including calcium, magnesium, sulfate, manganese, zinc, and cadmium, in Level 3 waters are substantially lower than in Level 1 effluent. Concentrations of these dissolved constituents in water samples collected from Level 2 approach or exceed concentrations of Level 1 effluent

  1. Predicting the disease of Alzheimer with SNP biomarkers and clinical data using data mining classification approach: decision tree.

    PubMed

    Erdoğan, Onur; Aydin Son, Yeşim

    2014-01-01

    Single Nucleotide Polymorphisms (SNPs) are the most common genomic variations where only a single nucleotide differs between individuals. Individual SNPs and SNP profiles associated with diseases can be utilized as biological markers. But there is a need to determine the SNP subsets and patients' clinical data that are informative for the diagnosis. Data mining approaches have the highest potential for extracting knowledge from genomic datasets and selecting the representative SNPs as well as the most effective and informative clinical features for the clinical diagnosis of diseases. In this study, we have applied one of the widely used data mining classification methodologies, the "decision tree", for associating the SNP biomarkers and significant clinical data with Alzheimer's disease (AD), which is the most common form of "dementia". Different tree construction parameters have been compared for the optimization, and the most accurate tree for predicting AD is presented.

  2. Poker Flats Mine - Div. of Mining, Land, and Water

    Science.gov Websites


  3. Numerical Modeling Tools for the Prediction of Solution Migration Applicable to Mining Site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martell, M.; Vaughn, P.

    1999-01-06

    Mining has always had an important influence on cultures and traditions of communities around the globe and throughout history. Today, because mining legislation places heavy emphasis on environmental protection, there is great interest in having a comprehensive understanding of ancient mining and mining sites. Multi-disciplinary approaches (i.e., Pb isotopes as tracers) are being used to explore the distribution of metals in natural environments. Another successful approach is to model solution migration numerically. A proven method to simulate solution migration in natural rock salt has been applied to project through time for 10,000 years the system performance and solution concentrations surrounding a proposed nuclear waste repository. This capability is readily adaptable to simulate solution migration around mining.

  4. Mining Available Data from the United States Environmental ...

    EPA Pesticide Factsheets

    Demands for quick and accurate life cycle assessments create a need for methods to rapidly generate reliable life cycle inventories (LCI). Data mining is a suitable tool for this purpose, especially given the large amount of available governmental data. These data are typically applied to LCIs on a case-by-case basis. As linked open data becomes more prevalent, it may be possible to automate LCI using data mining by establishing a reproducible approach for identifying, extracting, and processing the data. This work proposes a method for standardizing and eventually automating the discovery and use of publicly available data at the United States Environmental Protection Agency for chemical-manufacturing LCI. The method is developed using a case study of acetic acid. The data quality and gap analyses for the generated inventory found that the selected data sources can provide information with equal or better reliability and representativeness on air, water, hazardous waste, on-site energy usage, and production volumes but with key data gaps including material inputs, water usage, purchased electricity, and transportation requirements. A comparison of the generated LCI with existing data revealed that the data mining inventory is in reasonable agreement with existing data and may provide a more-comprehensive inventory of air emissions and water discharges. The case study highlighted challenges for current data management practices that must be overcome to successfu

  5. Optimizing data collection for public health decisions: a data mining approach

    PubMed Central

    2014-01-01

    Background: Collecting data can be cumbersome and expensive. Lack of relevant, accurate and timely data for research to inform policy may negatively impact public health. The aim of this study was to test if the careful removal of items from two community nutrition surveys guided by a data mining technique called feature selection, can (a) identify a reduced dataset, while (b) not damaging the signal inside that data. Methods: The Nutrition Environment Measures Surveys for stores (NEMS-S) and restaurants (NEMS-R) were completed on 885 retail food outlets in two counties in West Virginia between May and November of 2011. A reduced dataset was identified for each outlet type using feature selection. Coefficients from linear regression modeling were used to weight items in the reduced datasets. Weighted item values were summed with the error term to compute reduced item survey scores. Scores produced by the full survey were compared to the reduced item scores using a Wilcoxon rank-sum test. Results: Feature selection identified 9 store and 16 restaurant survey items as significant predictors of the score produced from the full survey. The linear regression models built from the reduced feature sets had R² values of 92% and 94% for restaurant and grocery store data, respectively. Conclusions: While there are many potentially important variables in any domain, the most useful set may only be a small subset. The use of feature selection in the initial phase of data collection to identify the most influential variables may be a useful tool to greatly reduce the amount of data needed thereby reducing cost. PMID:24919484

  6. Mining method selection by integrated AHP and PROMETHEE method.

    PubMed

    Bogdanovic, Dejan; Nikolic, Djordje; Ilic, Ivana

    2012-03-01

    Selecting the best mining method among many alternatives is a multicriteria decision making problem. The aim of this paper is to demonstrate the implementation of an integrated approach that employs AHP and PROMETHEE together for selecting the most suitable mining method for the "Coka Marin" underground mine in Serbia. The problem includes five possible mining methods and eleven criteria to evaluate them. Criteria are carefully chosen to cover the most important parameters that affect mining method selection, such as geological and geotechnical properties, economic parameters and geographical factors. The AHP is used to analyze the structure of the mining method selection problem and to determine the weights of the criteria, and the PROMETHEE method is used to obtain the final ranking and to perform a sensitivity analysis by changing the weights. The results have shown that the proposed integrated method can be successfully used in solving mining engineering problems.
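
    The division of labour described in this record (AHP for criterion weights, PROMETHEE for the ranking) can be sketched as follows. This is only an illustration under stated assumptions: the pairwise comparison matrix, the three criteria, the three alternatives, and their scores are hypothetical, all criteria are assumed to be maximised, and the simple "usual" preference function is assumed because the abstract does not say which preference functions the authors used.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the principal eigenvector of an AHP pairwise
    comparison matrix by power iteration, normalised to sum to 1."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [v / total for v in nxt]
    return w


def promethee_net_flows(scores, weights):
    """PROMETHEE II net outranking flows with the 'usual' preference
    function: a criterion counts fully for a over b whenever a scores higher."""
    names = list(scores)
    n = len(names)

    def pref(a, b):
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    flows = {}
    for a in names:
        plus = sum(pref(a, b) for b in names if b != a) / (n - 1)   # leaving flow
        minus = sum(pref(b, a) for b in names if b != a) / (n - 1)  # entering flow
        flows[a] = plus - minus
    return flows


# Hypothetical, perfectly consistent comparison of three criteria.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical scores of three alternatives on the three criteria.
scores = {
    "method_a": [7, 5, 3],
    "method_b": [6, 8, 4],
    "method_c": [4, 6, 9],
}
flows = promethee_net_flows(scores, weights)
ranking = sorted(flows, key=flows.get, reverse=True)
print(weights)
print(ranking)
```

    Changing the entries of `weights` and re-running `promethee_net_flows` is one simple way to mimic the weight-based sensitivity analysis that the abstract mentions.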

  7. Environmental hazard assessment of a marine mine tailings deposit site and potential implications for deep-sea mining.

    PubMed

    Mestre, Nélia C; Rocha, Thiago L; Canals, Miquel; Cardoso, Cátia; Danovaro, Roberto; Dell'Anno, Antonio; Gambi, Cristina; Regoli, Francesco; Sanchez-Vidal, Anna; Bebianno, Maria João

    2017-09-01

    Portmán Bay is a heavily contaminated area resulting from decades of metal mine tailings disposal, and is considered a suitable shallow-water analogue to investigate the potential ecotoxicological impact of deep-sea mining. Resuspension plumes were artificially created by removing the top layer of the mine tailings deposit by bottom trawling. Mussels were deployed at three sites: i) off the mine tailings deposit area; ii) on the mine tailings deposit beyond the influence from the resuspension plumes; iii) under the influence of the artificially generated resuspension plumes. Surface sediment samples were collected at the same sites for metal analysis and ecotoxicity assessment. Metal concentrations and a battery of biomarkers (oxidative stress, metal exposure, biotransformation and oxidative damage) were measured in different mussel tissues. The environmental hazard posed by the resuspension plumes was investigated by a quantitative weight of evidence (WOE) model that integrated all the data. The resuspension of sediments loaded with metal mine tailings demonstrated that chemical contaminants were released by trawling, subsequently inducing ecotoxicological impacts on mussel health. Considering as sediment quality guidelines (SQGs) those indicated in Spanish action level B for the disposal of dredged material at sea, the WOE model indicates that the hazard is slight off the mine tailings deposit, moderate on the mine tailings deposit without the influence from the resuspension plumes, and major under the influence of the resuspension plumes. Portmán Bay mine tailings deposit is a by-product of sulphide mining, and despite differences in environmental setting, it can reflect the potential ecotoxic effects to marine fauna from the impact of resuspension of plumes created by deep-sea mining of polymetallic sulphides. A similar approach as in this study could be applied in other areas affected by sediment resuspension and for testing future deep-sea mining sites in

  8. Literature mining, gene-set enrichment and pathway analysis for target identification in Behçet's disease.

    PubMed

    Wilson, Paul; Larminie, Christopher; Smith, Rona

    2016-01-01

    To use literature mining to catalogue Behçet's associated genes, and advanced computational methods to improve the understanding of the pathways and signalling mechanisms that lead to the typical clinical characteristics of Behçet's patients. To extend this technique to identify potential treatment targets for further experimental validation. Text mining methods combined with gene enrichment tools, pathway analysis and causal analysis algorithms. This approach identified 247 human genes associated with Behçet's disease and the resulting disease map, comprising 644 nodes and 19220 edges, captured important details of the relationships between these genes and their associated pathways, as described in diverse data repositories. Pathway analysis has identified how Behçet's associated genes are likely to participate in innate and adaptive immune responses. Causal analysis algorithms have identified a number of potential therapeutic strategies for further investigation. Computational methods have captured pertinent features of the prominent disease characteristics presented in Behçet's disease and have highlighted NOD2, ICOS and IL18 signalling as potential therapeutic strategies.

  9. Geophysical model of the Cu-Mo porphyry ore deposit at Copper Flat Mine, Hillsboro, Sierra County, New Mexico

    NASA Astrophysics Data System (ADS)

    Gutierrez, Adrian Emmanuel Gutierrez

    A 3D gravity model of the Copper Flat Mine was produced as part of the exploration for new resources at the mine. The project is located in the Las Animas Mining District in Sierra County, New Mexico. The mine has been producing ore since 1877 and is currently owned by the New Mexico Copper Corporation, which plans on bringing the closed copper mine back into production with innovation and a sustainable approach to mining development. The project is located on the eastern side of the Arizona-Sonora-New Mexico porphyry copper belt of Cretaceous age. Copper Flat is predominantly a Cretaceous-age stratovolcano composed mostly of quartz monzonite. The quartz monzonite was intruded by a block of andesite, after which a series of latite dikes created veining along the topography where the majority of the deposit lies. The Copper Flat deposit is mineralized along a breccia pipe where the breccia is the result of auto-brecciation due to the pore pressure. There have been a number of geophysical studies conducted at the site. The most recent survey was a gravity profile on the area. The purpose of the new study is the reinterpretation of the IP Survey and emphasizes the practical use of the gravity geophysical method in evaluating the validity of the previous survey results. The primary method used to identify the deposit is gravity, in which four Talwani models were created in order to create a 3D model of the ore body. The Talwani models have numerical integration approaches that were used to divide every model into polygons. The profiles were sectioned into polygons; each polygon was assigned a specific density depending on the body being drawn. Three different gridding techniques with three different filtering methods were used, producing ten maps prior to the modeling; these maps were created to establish the best map to fit the models. The calculation of the polygons used an exact formula instead of the numerical integration of the profile made with a Talwani approach. A

  10. The risk of collapse in abandoned mine sites: the issue of data uncertainty

    NASA Astrophysics Data System (ADS)

    Longoni, Laura; Papini, Monica; Brambilla, Davide; Arosio, Diego; Zanzi, Luigi

    2016-04-01

    Ground collapses over abandoned underground mines constitute a new environmental risk in the world. The high risk associated with subsurface voids, together with lack of knowledge of the geometric and geomechanical features of mining areas, makes abandoned underground mines one of the current challenges for countries with a long mining history. In this study, a stability analysis of the Montevecchia marl mine is performed in order to validate a general approach that takes into account the poor local information and the variability of the input data. The collapse risk was evaluated through a numerical approach that, starting with some simplifying assumptions, is able to provide an overview of the collapse probability. The final result is an easily accessible, transparent summary graph that shows the collapse probability. This approach may be useful for public administrators called upon to manage this environmental risk. The approach tries to simplify this complex problem in order to achieve a rough risk assessment, but, since it relies on just a small amount of information, any final user should be aware that a comprehensive and detailed risk scenario can be generated only through more exhaustive investigations.

  11. Development and application of biotechnologies in the metal mining industry.

    PubMed

    Johnson, D Barrie

    2013-11-01

    Metal mining faces a number of significant economic and environmental challenges in the twenty-first century for which established and emerging biotechnologies may, at least in part, provide the answers. Bioprocessing of mineral ores and concentrates is already used in variously engineered formats to extract base (e.g., copper, cobalt, and nickel) and precious (gold and silver) metals in mines throughout the world, though it remains a niche technology. However, current projections of an increasing future need to use low-grade primary metal ores, to reprocess mine wastes, and to develop in situ leaching technologies to extract metals from deep-buried ore bodies, all of which are economically more amenable to bioprocessing than conventional approaches (e.g., pyrometallurgy), would suggest that biomining will become more extensively utilized in the future. Recent research has also shown that bioleaching could be used to process a far wider range of metal ores (e.g., oxidized ores) than has previously been the case. Biotechnologies are also being developed to control mine-related pollution, including securing mine wastes (rocks and tailings) by using "ecological engineering" approaches, and also to remediate and recover metals from waste waters, such as acid mine drainage. This article reviews the current status of biotechnologies within the mining sector and considers how these may be developed and applied in future years.

  12. Report of investigation on underground limestone mines in the Ohio region. [Jonathan Mine, Alpha Portland Cement Mine, and Lewisburg Mine]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Byerly, D.W.

    1976-06-01

    The following is a report of investigation on the geologic setting of several underground limestone mines in Ohio other than the PPG mine at Barberton, Ohio. Due to the element of available time, the writer is only able to deliver a brief synopsis of the geology of three sites visited. These three sites and the Barberton, Ohio site are the only underground limestone mines in Ohio to the best of the writer's knowledge. The sites visited include: (1) the Jonathan Mine located near Zanesville, Ohio, and currently operated by the Columbia Cement Corporation; (2) the abandoned Alpha Portland Cement Mine located near Ironton, Ohio; and (3) the Lewisburg Mine located at Lewisburg, Ohio, and currently being utilized as an underground storage facility. Other remaining possibilities where limestone is being mined underground are located in middle Ordovician strata near Carntown and Maysville, Kentucky. These are drift mines into a thick sequence of carbonates. The writer predicts, however, that these mines would have some problems with water due to the preponderance of carbonate rocks and the proximity of the mines to the Ohio River. None of the sites visited nor the sites in Kentucky have conditions comparable to the deep mine at Barberton, Ohio.

  13. Two modelling approaches to water-quality simulation in a flooded iron-ore mine (Saizerais, Lorraine, France): a semi-distributed chemical reactor model and a physically based distributed reactive transport pipe network model.

    PubMed

    Hamm, V; Collon-Drouaillet, P; Fabriol, R

    2008-02-19

    The flooding of abandoned mines in the Lorraine Iron Basin (LIB) over the past 25 years has degraded the quality of the groundwater tapped for drinking water. High concentrations of dissolved sulphate have made the water unsuitable for human consumption. This problem has led to the development of numerical tools to support water-resource management in mining contexts. Here we examine two modelling approaches using different numerical tools that we tested on the Saizerais flooded iron-ore mine (Lorraine, France). A first approach considers the Saizerais Mine as a network of two chemical reactors (NCR). The second approach is based on a physically distributed pipe network model (PNM) built with EPANET 2 software. This approach considers the mine as a network of pipes defined by their geometric and chemical parameters. Each reactor in the NCR model includes a detailed chemical model built to simulate quality evolution in the flooded mine water. However, in order to obtain a robust PNM, we simplified the detailed chemical model into a specific sulphate dissolution-precipitation model that is included as sulphate source/sink in both an NCR model and a pipe network model. Both the NCR model and the PNM, based on different numerical techniques, give good post-calibration agreement between the simulated and measured sulphate concentrations in the drinking-water well and overflow drift. The NCR model incorporating the detailed chemical model is useful when a detailed chemical behaviour at the overflow is needed. The PNM incorporating the simplified sulphate dissolution-precipitation model provides better information on the physics controlling the effect of flow and low-flow zones and on the time of solid sulphate removal, whereas the NCR model will underestimate clean-up time due to the complete mixing assumption. In conclusion, the detailed NCR model will give a first assessment of chemical processes at overflow, and in a second step, the PNM model will provide more

  14. A three-step approach to minimise the impact of a mining site on vicuña (Vicugna vicugna) and to restore landscape connectivity.

    PubMed

    Mata, Cristina; Malo, Juan E; Galaz, José Luis; Cadorzo, César; Lagunas, Héctor

    2016-07-01

    Resource extraction projects generate a diversity of negative effects on the environment that are difficult to predict and mitigate. Consequently, adaptive management approaches have been advocated to develop effective responses to impacts that were not predicted. Mammal populations living in or around mine sites are frequently of management concern; yet, there is a dearth of published information on how to minimise the negative effects of different phases of mining operations on them. Here, we present the case study of a copper mine in the Chilean Altiplano, which caused roadkills of the protected vicuña (Vicugna vicugna). This issue led to a three-step solution being implemented: (1) the initial identification of the problem and implementation of an emergency response, (2) the scientific analysis for decision making and (3) the planning and informed implementation of responses for different future scenarios and timescales. The measures taken under each of these steps provide examples of environmental management approaches that make use of scientific information to develop integrated management responses. In brief, our case study showed how (1) the timescale and the necessity/urgency of the case were addressed, (2) the various stakeholders involved were taken into account and (3) changes were included into the physical, human and organisational elements of the company to achieve the stated objectives.

  15. Macromolecule mass spectrometry: citation mining of user documents.

    PubMed

    Kostoff, Ronald N; Bedford, Clifford D; del Río, J Antonio; Cortes, Héctor D; Karypis, George

    2004-03-01

    Identifying research users, applications, and impact is important for research performers, managers, evaluators, and sponsors. Identification of the user audience and the research impact is complex and time consuming due to the many indirect pathways through which fundamental research can impact applications. This paper identified the literature pathways through which two highly-cited papers of 2002 Chemistry Nobel Laureates Fenn and Tanaka impacted research, technology development, and applications. Citation Mining, an integration of citation bibliometrics and text mining, was applied to the >1600 first generation Science Citation Index (SCI) citing papers to Fenn's 1989 Science paper on Electrospray Ionization for Mass Spectrometry, and to the >400 first generation SCI citing papers to Tanaka's 1988 Rapid Communications in Mass Spectrometry paper on Laser Ionization Time-of-Flight Mass Spectrometry. Bibliometrics was performed on the citing papers to profile the user characteristics. Text mining was performed on the citing papers to identify the technical areas impacted by the research, and the relationships among these technical areas.

  16. Geochemical Characteristics of TP3 Mine Wastes at the Elizabeth Copper Mine Superfund Site, Orange County, Vermont

    USGS Publications Warehouse

    Hammarstrom, Jane M.; Piatak, Nadine M.; Seal, Robert R.; Briggs, Paul H.; Meier, Allen L.; Muzik, Timothy L.

    2003-01-01

    Remediation of the Elizabeth mine Superfund site in the Vermont copper belt poses challenges for balancing environmental restoration goals with issues of historic preservation while adopting cost-effective strategies for site cleanup and long-term maintenance. The waste-rock pile known as TP3, at the headwaters of Copperas Brook, is especially noteworthy in this regard because it is the worst source of surface- and ground-water contamination identified to date, while also being the area of greatest historical significance. The U.S. Geological Survey (USGS) conducted a study of the historic mine-waste piles known as TP3 at the Elizabeth mine Superfund site near South Strafford, Orange County, VT. TP3 is a 12.3-acre (49,780 m²) subarea of the Elizabeth mine site. It is a focus area for historic preservation because it encompasses an early 19th century copperas works as well as waste from late 19th- and 20th-century copper mining (Kierstead, 2001). Surface runoff and seeps from TP3 form the headwaters of Copperas Brook. The stream flows down a valley onto flotation tailings from 20th century copper mining operations and enters the West Branch of the Ompompanoosuc River approximately 1 kilometer downstream from the mine site. Shallow drinking water wells down gradient from TP3 exceed drinking water standards for copper and cadmium (Hathaway and others, 2001). The Elizabeth mine was listed as a Superfund site in 2001, mainly because of impacts of acid-mine drainage on the Ompompanoosuc River.

  17. Kinetics of bed fracturing around mine workings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Veksler, Yu.A.

    1988-03-01

    A failure of the bed near the walls of the workings of a mine away from the face occurs gradually over time and in this paper the authors take a kinetic approach to evaluating its development. The influence of certain mine engineering factors on the pattern of bed fracturing is discussed. The effect of the depth of mining is shown. Cracking occurs in the portion of the seam at the face near the ground at some distance from it on the interface between soft and hard coal. The density of the fractured rocks and their response affect the bed fracturing near the stope face.

  18. Determination of pre-mining geochemical conditions and paleoecology in the Animas River Watershed, Colorado

    USGS Publications Warehouse

    Church, S.E.; Fey, D.L.; Brouwers, E.M.; Holmes, C.W.; Blair, Robert

    1999-01-01

    Determination of the pre-mining geochemical baseline in bed sediments and the paleoecology in a watershed impacted by historical mining activity is of utmost importance in establishing watershed restoration goals. We have approached this problem in the Animas River watershed using geomorphologic mapping methods to identify old pre-mining sediments. A systematic evaluation of possible sites resulted in collection of a large number of samples of pre-mining sediments, overbank sediments, and fluvial tailings deposits from more than 50 sites throughout the watershed. Chemical analysis of individual stratigraphic layers has resulted in a chemical stratigraphy that can be tied to the historical record through geochronological and dendrochronological studies at these sites. Preliminary analysis of geochemical data from more than 500 samples from this study, when coupled with both the historical and geochronological record, clearly shows that there has been a major impact by historical mining activities on the geochemical record preserved in these fluvial bed sediments. Historical mining activity has resulted in a substantial increase in metals in the very fine sand to clay sized component of the bed sediment of the upper Animas River, and Cement and Mineral Creeks. Enrichment factors for metals in modern bed sediments, relative to the pre-mining sediments, range from a factor of 2 to 6 for arsenic, 4 to more than 10 for cadmium, 2 to more than 10 for lead, 2 to 5 for silver, and 2 to more than 15 for zinc. However, the pre-mining bed sediment geochemical baseline is high relative to crustal abundance levels of many ore-related metals and the watershed would readily be identified as a highly mineralized area suitable for mineral exploration if it had not been disturbed by historical mining activity. We infer from these data that the water chemistry in the streams was less acidic prior to historical mining activity in the watershed. Paleontologic evidence does not indicate a
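
    The enrichment factors quoted in this record are ratios of a metal's concentration in modern bed sediment to its concentration in pre-mining sediment. A minimal sketch of that arithmetic follows; the concentrations are hypothetical placeholders chosen only to fall inside the reported ranges, not measurements from the study.

```python
def enrichment_factors(modern, premining):
    """Ratio of modern to pre-mining concentration for each metal
    present in both inputs (same units assumed for both)."""
    return {
        metal: modern[metal] / premining[metal]
        for metal in modern
        if metal in premining and premining[metal] > 0
    }


# Hypothetical concentrations in parts per million.
modern_ppm = {"arsenic": 60.0, "lead": 800.0, "zinc": 3000.0}
premining_ppm = {"arsenic": 20.0, "lead": 100.0, "zinc": 250.0}

factors = enrichment_factors(modern_ppm, premining_ppm)
for metal, factor in factors.items():
    print(f"{metal}: enriched {factor:.1f}x relative to pre-mining sediment")
```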

  19. Utility of hyperspectral imagers in the mining industry: Italy's gypsum reserves

    NASA Astrophysics Data System (ADS)

    Wilson, Janette H.; Greenberger, Rebecca N.

    2014-05-01

    The mining industry is plagued with socioeconomic and safety roadblocks, with few solutions available in the midst of a demanding market. As more and more geologic research using hyperspectral technology has been performed, along with an affordable price point for commercial use of hyperspectral technology, the benefits of hyperspectral imaging to the mining industry have become apparent. This study identifies the key areas of use for hyperspectral imaging in the mining industry through a case study of gypsum mine samples obtained from a mine in central Tuscany.

  20. Baseline and premining geochemical characterization of mined sites

    USGS Publications Warehouse

    Nordstrom, D. Kirk

    2015-01-01

    A rational goal for environmental restoration of new, active, or inactive mine sites would be ‘natural background’ or the environmental conditions that existed before any mining activities or other related anthropogenic activities. In a strictly technical sense, there is no such thing as natural background (or entirely non-anthropogenic) existing today because there is no part of the planet earth that has not had at least some chemical disturbance from anthropogenic activities. Hence, the terms ‘baseline’ and ‘pre-mining’ are preferred to describe these conditions. Baseline conditions are those that existed at the time of the characterization which could be pre-mining, during mining, or post-mining. Protocols for geochemically characterizing pre-mining conditions are not well-documented for sites already mined but there are two approaches that seem most direct and least ambiguous. One is characterization of analog sites along with judicious application of geochemical modeling. The other is reactive-transport modeling (based on careful synoptic sampling with tracer-injection) and subtracting inputs from known mining and mineral processing. Several examples of acidic drainage are described from around the world documenting the range of water compositions produced from pyrite oxidation in the absence of mining. These analog sites provide insight to the processes forming mineralized waters in areas untouched by mining. Natural analog water-chemistry data is compared with the higher metal concentrations, metal fluxes, and weathering rates found in mined areas in the few places where comparisons are possible. The differences are generally 1–3 orders of magnitude higher for acid mine drainage.

  1. A Computer Vision Approach to Identify Einstein Rings and Arcs

    NASA Astrophysics Data System (ADS)

    Lee, Chien-Hsiu

    2017-03-01

    Einstein rings are rare gems of strong lensing phenomena; the ring images can be used to probe the underlying lens gravitational potential at every position angle, tightly constraining the lens mass profile. In addition, the magnified images also enable us to probe high-z galaxies with enhanced resolution and signal-to-noise ratios. However, only a handful of Einstein rings have been reported, either from serendipitous discoveries or visual inspections of hundreds of thousands of massive galaxies or galaxy clusters. In the era of large sky surveys, an automated approach to identify ring patterns in the big data to come is in high demand. Here, we present an Einstein ring recognition approach based on computer vision techniques. The workhorse is the circle Hough transform, which recognises circular patterns or arcs in the images. We propose a two-tier approach by first pre-selecting massive galaxies associated with multiple blue objects as possible lenses, then using the Hough transform to identify circular patterns. As a proof-of-concept, we apply our approach to SDSS, with a high completeness, albeit with low purity. We also apply our approach to other lenses in the DES, HSC-SSP, and UltraVISTA surveys, illustrating the versatility of our approach.
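
    The circle Hough transform named in this abstract can be illustrated with a minimal sketch. The code below is not the author's pipeline: the function names, the synthetic ring, and the trial radii are all assumptions made for illustration. Each edge pixel votes for every centre that would place it on a circle of a trial radius, and the accumulator cell with the most votes is taken as the detected circle.

```python
import math
from collections import Counter

def circle_hough(edge_points, radii, n_angles=72):
    """Accumulate votes for (centre_x, centre_y, radius) cells.

    Each edge point (x, y) votes once for every candidate centre that lies
    at distance r from it, for each trial radius r (hypothetical helper).
    """
    votes = Counter()
    for x, y in edge_points:
        for r in radii:
            cells = set()  # a point votes at most once per cell
            for k in range(n_angles):
                theta = 2.0 * math.pi * k / n_angles
                a = round(x - r * math.cos(theta))
                b = round(y - r * math.sin(theta))
                cells.add((a, b, r))
            for cell in cells:
                votes[cell] += 1
    return votes

def make_ring(cx, cy, r, n_points=72):
    """Synthetic test data: integer pixel positions on a circle."""
    pts = set()
    for k in range(n_points):
        theta = 2.0 * math.pi * k / n_points
        pts.add((round(cx + r * math.cos(theta)), round(cy + r * math.sin(theta))))
    return sorted(pts)

ring = make_ring(20, 15, 8)
accumulator = circle_hough(ring, radii=[6, 8, 10])
best_cell, best_votes = accumulator.most_common(1)[0]
print(best_cell, best_votes)
```

    On survey images the edge points would come from an edge detector applied to a cutout, and partial arcs would simply collect fewer votes than full rings.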

  2. Promoter Sequences Prediction Using Relational Association Rule Mining

    PubMed Central

    Czibula, Gabriela; Bocicor, Maria-Iuliana; Czibula, Istvan Gergely

    2012-01-01

    In this paper we approach, from a computational perspective, the problem of promoter sequence prediction, an important problem within the field of bioinformatics. As the conditions for a DNA sequence to function as a promoter are not known, machine learning based classification models are still being developed to approach the problem of promoter identification in DNA. We propose a classification model based on relational association rule mining. Relational association rules are a particular type of association rule and describe numerical orderings between attributes that commonly occur over a data set. Our classifier is based on the discovery of relational association rules for predicting whether or not a DNA sequence contains a promoter region. An experimental evaluation of the proposed model and a comparison with similar existing approaches are provided. The obtained results show that our classifier outperforms the existing techniques for identifying promoter sequences, confirming the potential of our proposal. PMID:22563233
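
    To make the idea of "numerical orderings between attributes that commonly occur over a data set" concrete, here is a minimal sketch restricted to rules between two attributes. It is a simplification for illustration, not the authors' algorithm or classifier; the function name, the support threshold, and the toy records are assumptions.

```python
from itertools import combinations
import operator

# Ordering relations considered between two numeric attributes.
RELATIONS = {"<": operator.lt, "=": operator.eq, ">": operator.gt}

def mine_binary_relational_rules(records, min_support=0.9):
    """Return (i, relation, j, support) for attribute pairs whose ordering
    holds in at least min_support of the records (hypothetical helper)."""
    n_attrs = len(records[0])
    rules = []
    for i, j in combinations(range(n_attrs), 2):
        for symbol, rel in RELATIONS.items():
            holds = sum(1 for rec in records if rel(rec[i], rec[j]))
            support = holds / len(records)
            if support >= min_support:
                rules.append((i, symbol, j, support))
    return rules

# Toy numeric records with three attributes each (made-up values).
records = [(1, 2, 3), (2, 5, 4), (0, 1, 1), (3, 4, 9)]
for rule in mine_binary_relational_rules(records):
    print(rule)
```

    A classifier in this spirit could mine one rule set per class and score a new sequence by how many rules from each set it satisfies; that step is left out here because the abstract does not describe it.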

  3. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    PubMed Central

    Chang, Qingliang; Zhou, Huaqiang; Bai, Jianbiao

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application. PMID:25258737

  4. Implementation of paste backfill mining technology in Chinese coal mines.

    PubMed

    Chang, Qingliang; Chen, Jianhang; Zhou, Huaqiang; Bai, Jianbiao

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  5. Effective integrated frameworks for assessing mining sustainability.

    PubMed

    Virgone, K M; Ramirez-Andreotta, M; Mainhagu, J; Brusseau, M L

    2018-05-28

    The objectives of this research are to review existing methods used for assessing mining sustainability, analyze the limited prior research that has evaluated the methods, and identify key characteristics that would constitute an enhanced sustainability framework that would serve to improve sustainability reporting in the mining industry. Five of the most relevant frameworks were selected for comparison in this analysis, and the results show that there are many commonalities among the five, as well as some disparities. In addition, relevant components are missing from all five. An enhanced evaluation system and framework were created to provide a more holistic, comprehensive method for sustainability assessment and reporting. The proposed framework has five components that build from and encompass the twelve evaluation characteristics used in the analysis. The components include Foundation, Focus, Breadth, Quality Assurance, and Relevance. The enhanced framework promotes a comprehensive, location-specific reporting approach with a concise set of well-defined indicators. Built into the framework is quality assurance, as well as a defined method to use information from sustainability reports to inform decisions. The framework incorporates human health and socioeconomic aspects via initiatives such as community-engaged research, economic valuations, and community-initiated environmental monitoring.

  6. Biogeochemical behaviour and bioremediation of uranium in waters of abandoned mines.

    PubMed

    Mkandawire, Martin

    2013-11-01

    The discharges of uranium and associated radionuclides as well as heavy metals and metalloids from waste and tailing dumps in abandoned uranium mining and processing sites pose contamination risks to surface and groundwater. Although many more are being planned for nuclear energy purposes, most of the abandoned uranium mines are a legacy of uranium production that fuelled the arms race during the Cold War of the last century. Since the end of the Cold War, there have been efforts to rehabilitate the mining sites, initially using classical remediation techniques based on intensive chemical and civil engineering. Recently, bioremediation technology has been sought as an alternative to the classical approach for reasons that include: (a) the high demand of sites requiring remediation; (b) the economic implication of running and maintaining the facilities due to high energy and work force demand; and (c) the pattern and characteristics of contaminant discharges in most of the former uranium mining and processing sites, which prevent the use of classical methods. This review discusses risks of uranium contamination from abandoned uranium mines from the biogeochemical point of view and the potential and limitations of uranium bioremediation techniques as an alternative to the classical approach in abandoned uranium mining and processing sites.

  7. The Spatial Assessment of the Current Seismic Hazard State for Hard Rock Underground Mines

    NASA Astrophysics Data System (ADS)

    Wesseloo, Johan

    2018-06-01

    Mining-induced seismic hazard assessment is an important component in the management of safety and financial risk in mines. As the seismic hazard is a response to the mining activity, it is non-stationary and variable both in space and time. This paper presents an approach for implementing a probabilistic seismic hazard assessment to assess the current hazard state of a mine. Each of the components of the probabilistic seismic hazard assessment is considered within the context of hard rock underground mines. The focus of this paper is the assessment of the in-mine hazard distribution and does not consider the hazard to nearby public or structures. A rating system and methodologies to present hazard maps, for the purpose of communicating to different stakeholders in the mine, i.e. mine managers, technical personnel and the work force, are developed. The approach allows one to update the assessment with relative ease and within short time periods as new data become available, enabling the monitoring of the spatial and temporal change in the seismic hazard.

  8. Distributed communications and control network for robotic mining

    NASA Technical Reports Server (NTRS)

    Schiffbauer, William H.

    1989-01-01

    The application of robotics to coal mining machines is one approach pursued to increase productivity while providing enhanced safety for the coal miner. Toward that end, a network composed of microcontrollers, computers, expert systems, real time operating systems, and a variety of program languages are being integrated that will act as the backbone for intelligent machine operation. Actual mining machines, including a few customized ones, have been given telerobotic semiautonomous capabilities by applying the described network. Control devices, intelligent sensors and computers onboard these machines are showing promise of achieving improved mining productivity and safety benefits. Current research using these machines involves navigation, multiple machine interaction, machine diagnostics, mineral detection, and graphical machine representation. Guidance sensors and systems employed include: sonar, laser rangers, gyroscopes, magnetometers, clinometers, and accelerometers. Information on the network of hardware/software and its implementation on mining machines are presented. Anticipated coal production operations using the network are discussed. A parallelism is also drawn between the direction of present day underground coal mining research to how the lunar soil (regolith) may be mined. A conceptual lunar mining operation that employs a distributed communication and control network is detailed.

  9. Mine waste management legislation. Gold mining areas in Romania

    NASA Astrophysics Data System (ADS)

    Maftei, Raluca-Mihaela; Filipciuc, Constantina; Tudor, Elena

    2014-05-01

    Problems in the post-mining regions of Eastern Europe range from degraded land and landscapes, huge insecure dumps, surface cracks, soil pollution, lowering groundwater table, deforestation, and damaged cultural potentials to socioeconomic problems like unemployment or population decline. There is no common prescription for tackling the development of post-mining regions after mine closure nor is there a common definition of good practices or policy in this field. Key words: waste management, legislation, EU Directive, post mining. Rosia Montana is a commune of 16 villages; one of them is also called Rosia Montana, a traditional mining community, located in the Apuseni Mountains in north-western Romania. Beneath part of the village area lies one of the largest gold and silver deposits in Europe. Mining in the Rosia Montana area dates back to the height of the Roman Empire. While the modern approach to mining demands careful remediation of environmental impacts, historically disused mines in this region have been abandoned, leaving widespread environmental damage. General legislative framework: Strict regulations and procedures govern modern mining activity, including mitigation of all environmental impacts. Precious metals exploitation is put under GO no. 190/2000 re-published in 2004. The institutional framework was established and organized based on specific regulations, being represented by the following bodies: • The Ministry of Economy and Commerce (MEC), a public institution which develops the Government policy in the mining area, also provides the management of the public property in the mineral resources area; • The National Agency for the development and implementation of the mining Regions Reconstruction Programs (NAD), responsible for promotion of social mitigation measures and actions; • The Office for Industry Privatization, within the Education Ministry, responsible for privatization of companies under the CEM; • The National

  10. Air pollutant intrusion into the Wieliczka Salt Mine

    USGS Publications Warehouse

    Salmon, L.G.; Cass, G.R.; Kozlowski, R.; Hejda, A.; Spiker, E. C.; Bates, A.L.

    1996-01-01

    The Wieliczka Salt Mine World Cultural Heritage Site contains many rock salt sculptures that are threatened by water vapor condensation from the mine ventilation air. Gaseous and particulate air pollutant concentrations have been measured both outdoors and within the Wieliczka Salt Mine, along with pollutant deposition fluxes to surfaces within the mine. One purpose of these measurements was to determine whether or not low deliquescence point ionic materials (e.g., NH4NO3) are accumulating on surfaces to an extent that would exacerbate the water vapor condensation problems in the mine. It was found that pollutant gases including SO2 and HNO3 present in outdoor air are removed rapidly and almost completely from the air within the mine by deposition to surfaces. Sulfur isotope analyses confirm the accumulation of air pollutant-derived sulfur in liquid dripping from surfaces within the mine. Particle deposition onto interior surfaces in the mine is apparent, with resulting soiling of some of those sculptures that have been carved from translucent rock salt. Water accumulation by salt sculpture surfaces was studied both experimentally and by approximate thermodynamic calculations. Both approaches suggest that the pollutant deposits on the sculpture surfaces lower the relative humidity (RH) at which a substantial amount of liquid water will accumulate by 1% to several percent. The extraordinarily low SO2 concentrations within the mine may explain the apparent success of a respiratory sanatorium located deep within the mine.

  11. Mining biological databases for candidate disease genes

    NASA Astrophysics Data System (ADS)

    Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

    2001-07-01

    The publicly-funded effort to sequence the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has currently produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of this data identified approximately 30,000 - 40,000 human 'genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).

  12. Application of ERTS-A imagery to fracture related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. The most important result to date is the demonstration of the special value of repetitive ERTS-1 multiband coverage for detecting previously unknown fracture lineaments despite the presence of a deep glacial overburden. The Illinois Basin is largely covered with glacial drift and few rock outcrops are present. A contribution to the geological understanding of Illinois and Indiana has been made. Analysis of ERTS-1 imagery has provided useful information to the State of Indiana concerning the surface mined lands. The contrast between healthy vegetation and bare ground as imaged by Band 7 is sharp and substantial detail can be obtained concerning the extent of disturbed lands, associated water bodies, large haul roads, and extent of mined lands revegetation. Preliminary results of analysis suggest a reasonable correlation between image-detected fractures and mine roof fall accidents for a few areas investigated. ERTS-1 applications to surface mining operations appear probable, but further investigations are required. The likelihood of applying ERTS-1 derived fracture data to improve coal mine safety in the entire Illinois Basin is suggested from studies conducted in Indiana.

  13. GTA: a game theoretic approach to identifying cancer subnetwork markers.

    PubMed

    Farahmand, S; Goliaei, S; Ansari-Pour, N; Razaghi-Moghadam, Z

    2016-03-01

    The identification of genetic markers (e.g. genes, pathways and subnetworks) for cancer has been one of the most challenging research areas in recent years. A subset of these studies attempts to analyze genome-wide expression profiles to identify markers with high reliability and reusability across independent whole-transcriptome microarray datasets. Therefore, the functional relationships of genes are integrated with their expression data. However, for a more accurate representation of the functional relationships among genes, utilization of the protein-protein interaction network (PPIN) seems to be necessary. Herein, a novel game theoretic approach (GTA) is proposed for the identification of cancer subnetwork markers by integrating genome-wide expression profiles and PPIN. The GTA method was applied to three distinct whole-transcriptome breast cancer datasets to identify the subnetwork markers associated with metastasis. To evaluate the performance of our approach, the identified subnetwork markers were compared with gene-based, pathway-based and network-based markers. We show that GTA is not only capable of identifying robust metastatic markers but also provides a higher classification performance. In addition, based on these GTA-based subnetworks, we identified a new bona fide candidate gene for breast cancer susceptibility.

  14. Using data mining to segment healthcare markets from patients' preference perspectives.

    PubMed

    Liu, Sandra S; Chen, Jie

    2009-01-01

    This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.
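
    The conventional procedure mentioned in this abstract, hierarchical clustering with average linkage and a Pearson correlation measure, can be sketched as follows. This is an illustrative pure-Python version, not the authors' analysis: the function names and the toy preference profiles are assumptions, distance is taken as 1 minus the Pearson correlation, and profiles are assumed to be non-constant so that the correlation is defined.

```python
import math

def pearson_distance(u, v):
    """1 - Pearson correlation; assumes neither profile is constant."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)

def average_linkage_clusters(profiles, k):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until k clusters remain."""
    n = len(profiles)
    dist = {(i, j): pearson_distance(profiles[i], profiles[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        pairs = [(x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up ratings of four healthcare attributes by six respondents.
profiles = [
    [1, 2, 3, 4], [2, 4, 6, 8.5],
    [4, 3, 2, 1], [9, 7, 4, 2],
    [1, 5, 1, 5], [2, 6, 2, 7],
]
print(average_linkage_clusters(profiles, k=3))
```

    Using correlation rather than Euclidean distance groups respondents by the shape of their preference profile instead of by how high their ratings are overall.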

  15. 30 CFR 819.21 - Auger mining: Protection of underground mining.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 3 2011-07-01 2011-07-01 false Auger mining: Protection of underground mining. 819.21 Section 819.21 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT... STANDARDS-AUGER MINING § 819.21 Auger mining: Protection of underground mining. Auger holes shall not extend...

  16. Study of application of ERTS-A imagery to fracture-related mine safety hazards in the coal mining industry. [Indiana

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.; Leshendok, T.

    1973-01-01

    The author has identified the following significant results. The Mined Land Inventory map of Pike, Gibson, and Warrick Counties, Indiana, prepared from ERTS-1 imagery, was included in the 1973 Annual Report of the President's Council on Environmental Quality as an example of ERTS applications to mined lands. Increasing numbers of inquiries have been received from coal producing states and coal companies interested in the Indiana Program.

  17. Large-Scale Overlays and Trends: Visually Mining, Panning and Zooming the Observable Universe.

    PubMed

    Luciani, Timothy Basil; Cherinka, Brian; Oliphant, Daniel; Myers, Sean; Wood-Vasey, W Michael; Labrinidis, Alexandros; Marai, G Elisabeta

    2014-07-01

    We introduce a web-based computing infrastructure to assist the visual integration, mining and interactive navigation of large-scale astronomy observations. Following an analysis of the application domain, we design a client-server architecture to fetch distributed image data and to partition local data into a spatial index structure that allows prefix-matching of spatial objects. In conjunction with hardware-accelerated pixel-based overlays and an online cross-registration pipeline, this approach allows the fetching, displaying, panning and zooming of gigabit panoramas of the sky in real time. To further facilitate the integration and mining of spatial and non-spatial data, we introduce interactive trend images, compact visual representations for identifying outlier objects and for studying trends within large collections of spatial objects of a given class. In a demonstration, images from three sky surveys (SDSS, FIRST and simulated LSST results) are cross-registered and integrated as overlays, allowing cross-spectrum analysis of astronomy observations. Trend images are interactively generated from catalog data and used to visually mine astronomy observations of similar type. The front-end of the infrastructure uses the web technologies WebGL and HTML5 to enable cross-platform, web-based functionality. Our approach attains interactive rendering framerates; its power and flexibility enable it to serve the needs of the astronomy community. Evaluation on three case studies, as well as feedback from domain experts, emphasizes the benefits of this visual approach to the observational astronomy field and its potential benefits to large-scale geospatial visualization in general.
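
    The abstract does not say which spatial index the authors use. One common way to obtain "prefix-matching of spatial objects" is a quadtree-style key, in which each digit of the key selects a quadrant at successively finer scales, so that all objects in a tile share the tile's key as a prefix. The sketch below assumes that scheme; the function names, the unit-square coordinates, and the object names are hypothetical.

```python
def quadkey(x, y, depth=8):
    """Encode a point in the unit square [0, 1) x [0, 1) as a string of
    quadrant digits (0..3), coarsest level first."""
    x0, y0, size = 0.0, 0.0, 1.0
    digits = []
    for _ in range(depth):
        size /= 2.0
        qx = 1 if x >= x0 + size else 0
        qy = 1 if y >= y0 + size else 0
        digits.append(str(qx + 2 * qy))
        x0 += qx * size
        y0 += qy * size
    return "".join(digits)

def build_index(objects, depth=8):
    """Return a list of (key, name) pairs sorted by key."""
    return sorted((quadkey(x, y, depth), name) for name, (x, y) in objects.items())

def query_tile(index, prefix):
    """All objects whose key starts with the tile's key prefix."""
    return sorted(name for key, name in index if key.startswith(prefix))

# Made-up catalogue positions, normalised to the unit square.
objects = {
    "galaxy_a": (0.10, 0.10),
    "galaxy_b": (0.20, 0.05),
    "quasar_c": (0.90, 0.10),
    "star_d": (0.60, 0.70),
}
index = build_index(objects)
print(query_tile(index, "0"))   # objects in the lower-left quadrant
```

    Because the index is sorted by key, a production version could replace the linear scan with a binary search for the prefix range, which is what makes panning and zooming over large catalogues cheap.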

  18. Identifying candidate drivers of drug response in heterogeneous cancer by mining high throughput genomics data.

    PubMed

    Nabavi, Sheida

    2016-08-15

    With advances in technologies, huge amounts of multiple types of high-throughput genomics data are available. These data have tremendous potential to identify new and clinically valuable biomarkers to guide the diagnosis, assessment of prognosis, and treatment of complex diseases, such as cancer. Integrating, analyzing, and interpreting big and noisy genomics data to obtain biologically meaningful results, however, remains highly challenging. Mining genomics datasets by utilizing advanced computational methods can help to address these issues. To facilitate the identification of a short list of biologically meaningful genes as candidate drivers of anti-cancer drug resistance from an enormous amount of heterogeneous data, we employed statistical machine-learning techniques and integrated genomics datasets. We developed a computational method that integrates gene expression, somatic mutation, and copy number aberration data of sensitive and resistant tumors. In this method, an integrative method based on module network analysis is applied to identify potential driver genes. This is followed by cross-validation and a comparison of the results of the sensitive and resistant groups to obtain the final list of candidate biomarkers. We applied this method to the ovarian cancer data from The Cancer Genome Atlas. The final result contains biologically relevant genes, such as COL11A1, which has been reported as a cis-platinum resistance biomarker for epithelial ovarian carcinoma in several recent studies. The described method yields a short list of aberrant genes that also control the expression of their co-regulated genes. The results suggest that the unbiased data-driven computational method can identify biologically relevant candidate biomarkers. It can be utilized in a wide range of applications that compare two conditions with highly heterogeneous datasets.

  19. Resilience of benthic deep-sea fauna to mining activities.

    PubMed

    Gollner, Sabine; Kaiser, Stefanie; Menzel, Lena; Jones, Daniel O B; Brown, Alastair; Mestre, Nelia C; van Oevelen, Dick; Menot, Lenaick; Colaço, Ana; Canals, Miquel; Cuvelier, Daphne; Durden, Jennifer M; Gebruk, Andrey; Egho, Great A; Haeckel, Matthias; Marcon, Yann; Mevenkamp, Lisa; Morato, Telmo; Pham, Christopher K; Purser, Autun; Sanchez-Vidal, Anna; Vanreusel, Ann; Vink, Annemiek; Martinez Arbizu, Pedro

    2017-08-01

    With increasing demand for mineral resources, extraction of polymetallic sulphides at hydrothermal vents, cobalt-rich ferromanganese crusts at seamounts, and polymetallic nodules on abyssal plains may be imminent. Here, we briefly introduce ecosystem characteristics of mining areas, report on recent mining developments, and identify potential stress and disturbances created by mining. We analyze species' potential resistance to future mining and perform meta-analyses on population density and diversity recovery after disturbances most similar to mining: volcanic eruptions at vents, fisheries on seamounts, and experiments that mimic nodule mining on abyssal plains. We report wide variation in recovery rates among taxa, size, and mobility of fauna. While densities and diversities of some taxa can recover to or even exceed pre-disturbance levels, community composition remains affected after decades. The loss of hard substrata or alteration of substrata composition may cause substantial community shifts that persist over geological timescales at mined sites. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Study of application of ERTS-1 imagery to fracture-related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E. (Principal Investigator); Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.

    1972-01-01

    The author has identified the following significant results. Numerous fractures are identifiable on the 1:120,000 color infrared photography. Some of these fractures are in the proximity of operating open pit mines and should provide opportunities for field checking and confirmation.

  1. Wild flora of mine tailings: perspectives for use in phytoremediation of potentially toxic elements in a semi-arid region in Mexico.

    PubMed

    Sánchez-López, Ariadna S; Del Carmen A González-Chávez, Ma; Carrillo-González, Rogelio; Vangronsveld, Jaco; Díaz-Garduño, Margarita

    2015-01-01

    The aim of this research was to identify wild plant species applicable for remediation of mine tailings in arid soils. Plants growing on two mine tailings were identified and evaluated for their potential use in phytoremediation based on the concentration of potentially toxic elements (PTEs) in roots and shoots, and on bioconcentration (BCF) and translocation factors (TF). Total, water-soluble and DTPA-extractable concentrations of Pb, Cd, Zn, Cu, Co and Ni in rhizospheric and bulk soil were determined. Twelve species can grow on mine tailings, accumulate PTE concentrations above the commonly accepted phytotoxicity levels, and are suitable for establishing a vegetation cover on barren mine tailings in the Zimapan region. Pteridium sp. is suitable for Zn and Cd phytostabilization. Aster gymnocephalus is a potential phytoextractor for Zn, Cd, Pb and Cu; Gnaphalium sp. for Cu and Crotalaria pumila for Zn. The species play different roles according to the specific conditions where they are growing, behaving at one site as a PTE accumulator and at another as a stabilizer. For this reason, and due to the lack of a unified approach for calculation and interpretation of bioaccumulation factors, considering only BCF and TF may not be practical in all cases.
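
    As the abstract itself notes, there is no unified way to calculate these factors. The sketch below uses one common convention, assumed here rather than taken from the paper: BCF as the root concentration divided by the soil concentration, and TF as the shoot concentration divided by the root concentration. The threshold of 1 used to suggest a role is a widely used rule of thumb, not the authors' criterion, and the concentrations are made-up values in mg/kg.

```python
def bioconcentration_factor(root_mg_kg, soil_mg_kg):
    """BCF under the assumed convention: root concentration / soil concentration."""
    return root_mg_kg / soil_mg_kg

def translocation_factor(shoot_mg_kg, root_mg_kg):
    """TF under the assumed convention: shoot concentration / root concentration."""
    return shoot_mg_kg / root_mg_kg

def screen_species(shoot_mg_kg, root_mg_kg, soil_mg_kg):
    """Return (BCF, TF, suggested role) using a rule-of-thumb threshold of 1."""
    bcf = bioconcentration_factor(root_mg_kg, soil_mg_kg)
    tf = translocation_factor(shoot_mg_kg, root_mg_kg)
    if tf > 1.0:
        role = "phytoextraction candidate"      # element moves to the shoots
    elif bcf > 1.0:
        role = "phytostabilization candidate"   # element is retained in the roots
    else:
        role = "no clear role"
    return bcf, tf, role

# Hypothetical Zn concentrations for one plant at one site.
print(screen_species(shoot_mg_kg=150.0, root_mg_kg=300.0, soil_mg_kg=100.0))
```

    The same species can fall on either side of the threshold at different sites, which is the abstract's reason for cautioning against relying on BCF and TF alone.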

  2. Contribution to understanding the post-mining landscape - Application of airborn LiDAR and historical maps at the example from Silesian Upland (Poland)

    NASA Astrophysics Data System (ADS)

    Gawior, D.; Rutkiewicz, P.; Malik, I.; Wistuba, M.

    2017-11-01

    LiDAR data provide new insights into the historical development of the mining industry recorded in the topography and landscape. In the study on lead ore mining in the 13th-17th century we identified remnants of mining activity in relief that are normally obscured by dense vegetation. The industry in the Tarnowice Plateau was based on exploitation of galena from the bedrock. New technologies, including DEMs from airborne LiDAR, show that the present landscape and relief of the post-mining area under study developed during several subsequent phases of exploitation, when different techniques of exploitation were used and probably different types of ores were exploited. The study conducted on the Tarnowice Plateau proved that combining GIS visualization techniques with historical maps, including geological maps, is a promising approach in reconstructing the development of anthropogenic relief and landscape.

  3. Identifying novel biomarkers in sarcoidosis using genome-based approaches

    PubMed Central

    Knox, Kenneth S.; Garcia, Joe G.N.

    2015-01-01

    Synopsis We briefly review conventional biomarkers used clinically to 1) support a diagnosis and 2) monitor disease progression in patients with sarcoidosis. We describe potential new biomarkers identified by genome-wide screening and the approaches to discover these biomarkers. PMID:26593137

  4. The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews

    PubMed Central

    Zhang, Kunpeng

    2016-01-01

    experience of finding doctors, doctors’ technical skills and bedside manner, general appreciation from patients, and description of various symptoms. Conclusions To the best of our knowledge, our work is the first study using an automated text-mining approach to analyze a large amount of unstructured textual data of Web-based physician reviews in China. Based on our analysis, we found that Chinese reviewers mainly concentrate on a few popular topics. This is consistent with the goal of Chinese online health platforms and demonstrates the health care focus in China’s health care system. Our text-mining approach reveals a new research area on how to use big data to help health care providers, health care administrators, and policy makers hear patient voices, target patient concerns, and improve the quality of care in this age of patient-centered care. Also, on the health care consumer side, our text mining technique helps patients make more informed decisions about which specialists to see without reading thousands of reviews, which is simply not feasible. In addition, our comparison analysis of Web-based physician reviews in China and the United States also indicates some cultural differences. PMID:27165558

  5. The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews.

    PubMed

    Hao, Haijing; Zhang, Kunpeng

    2016-05-10

    skills and bedside manner, general appreciation from patients, and description of various symptoms. To the best of our knowledge, our work is the first study using an automated text-mining approach to analyze a large amount of unstructured textual data of Web-based physician reviews in China. Based on our analysis, we found that Chinese reviewers mainly concentrate on a few popular topics. This is consistent with the goal of Chinese online health platforms and demonstrates the health care focus in China's health care system. Our text-mining approach reveals a new research area on how to use big data to help health care providers, health care administrators, and policy makers hear patient voices, target patient concerns, and improve the quality of care in this age of patient-centered care. Also, on the health care consumer side, our text mining technique helps patients make more informed decisions about which specialists to see without reading thousands of reviews, which is simply not feasible. In addition, our comparison analysis of Web-based physician reviews in China and the United States also indicates some cultural differences.

  6. The LANL/LLNL/AFTAC Black Thunder Coal Mine regional mine monitoring experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pearson, D.C.; Stump, B.W.; Baker, D.F.

    Cast blasting operations associated with near surface coal recovery provide relatively large explosive sources that generate regional seismograms of interest in monitoring a Comprehensive Test Ban Treaty (CTBT). This paper describes preliminary results of a series of experiments currently being conducted at the Black Thunder Coal Mine in northeast Wyoming as part of the DOE CTBT Research and Development Program. These experiments are intended to provide an integrated set of near-source and regional seismic data for the purposes of quantifying the coupling and source characterization of the explosions. The focus of this paper is on the types of data being recovered with some preliminary implications. The Black Thunder experiments are designed to assess three major questions: (1) how many mining explosions produce seismograms at regional distances that will have to be detected, located and ultimately identified by the National Data Center and what are the waveform characteristics of these particular mining explosions; (2) can discrimination techniques based on empirical studies be placed on a firm physical basis so that they can be applied to other regions where there is little monitoring experience; (3) can large scale chemical explosions (possibly mining explosions) be used to calibrate source and propagation path effects to regional stations, can source depth of burial and decoupling effects be studied in such a controlled environment? With these key questions in mind and given the cooperation of the Black Thunder Mine, a suite of experiments has been and is currently being conducted. This paper will describe the experiments and their relevance to CTBT issues.

  7. Mine Waste at The Kherzet Youcef Mine : Environmental Characterization

    NASA Astrophysics Data System (ADS)

    Issaad, Mouloud; Boutaleb, Abdelhak; Kolli, Omar

    2017-04-01

    Mining activity in Algeria has existed since antiquity, but it has been especially important since the 20th century. This activity has virtually ceased since the beginning of the 1990s, leaving many mine sites abandoned (so-called orphan mines). The abandonment of mining today poses many environmental problems (soil pollution, contamination of surface water, mine collapses...). Mining wastes often occupy large volumes and pose hazards to the environment and human health that were often neglected in the past: faulty geotechnical implementation, acid mine drainage (AMD), alkalinity, and the presence of pollutants and toxic substances (heavy metals, cyanide...). The study started six years ago and covers all mines located in NE Algeria, almost all of which have been closed for more than thirty years. The most important step is therefore to obtain an overview of the whole study area. After the inventory of the abandoned mines, rock drainage prediction will help us classify sites according to their acid-generating potential.

  8. BOUNDS ON SUBSURFACE MERCURY FLUX FROM THE SULPHUR BANK MERCURY MINE, LAKE COUNTY, CALIFORNIA

    EPA Science Inventory

    The Sulphur Bank Mercury Mine (SBMM) in Lake County, California has been identified as a significant source of mercury to Clear Lake. The mine was operated from the 1860s through the 1950s. Mining started with surface operations, progressed to shaft mining, and later to open p...

  9. Effective application of improved profit-mining algorithm for the interday trading model.

    PubMed

    Hsieh, Yu-Lung; Yang, Don-Lin; Wu, Jungpin

    2014-01-01

    Many real world applications of association rule mining from large databases help users make better decisions. However, they do not work well in financial markets at this time. In addition to a high profit, an investor also looks for a low risk trading with a better rate of winning. The traditional approach of using minimum confidence and support thresholds needs to be changed. Based on an interday model of trading, we proposed effective profit-mining algorithms which provide investors with profit rules including information about profit, risk, and winning rate. Since profit-mining in the financial market is still in its infant stage, it is important to detail the inner working of mining algorithms and illustrate the best way to apply them. In this paper we go into details of our improved profit-mining algorithm and showcase effective applications with experiments using real world trading data. The results show that our approach is practical and effective with good performance for various datasets.
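
    For readers unfamiliar with the "traditional approach" this abstract refers to, the sketch below shows how the support and confidence of an association rule are computed. It illustrates only the classic thresholds that the authors argue need to change, not their profit-mining algorithm. The transactions and item names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical trading days, each described by the events observed that day.
DAYS = [
    frozenset({"volume_up", "price_up_next_day"}),
    frozenset({"volume_up", "price_up_next_day", "gap_down"}),
    frozenset({"volume_up", "gap_down"}),
    frozenset({"gap_down"}),
]

rule_support = support(DAYS, {"volume_up", "price_up_next_day"})
rule_confidence = confidence(DAYS, {"volume_up"}, {"price_up_next_day"})
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

    Neither measure says anything about how much is gained or lost when the rule fires, which is consistent with the abstract's point that profit, risk and winning rate need to be reported alongside them.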

  10. Effective Application of Improved Profit-Mining Algorithm for the Interday Trading Model

    PubMed Central

    Wu, Jungpin

    2014-01-01

    Many real world applications of association rule mining from large databases help users make better decisions. However, they do not work well in financial markets at this time. In addition to a high profit, an investor also looks for a low risk trading with a better rate of winning. The traditional approach of using minimum confidence and support thresholds needs to be changed. Based on an interday model of trading, we proposed effective profit-mining algorithms which provide investors with profit rules including information about profit, risk, and winning rate. Since profit-mining in the financial market is still in its infant stage, it is important to detail the inner working of mining algorithms and illustrate the best way to apply them. In this paper we go into details of our improved profit-mining algorithm and showcase effective applications with experiments using real world trading data. The results show that our approach is practical and effective with good performance for various datasets. PMID:24688442

  11. An Outbreak of Lymphocutaneous Sporotrichosis among Mine-Workers in South Africa.

    PubMed

    Govender, Nelesh P; Maphanga, Tsidiso G; Zulu, Thokozile G; Patel, Jaymati; Walaza, Sibongile; Jacobs, Charlene; Ebonwu, Joy I; Ntuli, Sindile; Naicker, Serisha D; Thomas, Juno

    2015-09-01

    The largest outbreak of sporotrichosis occurred between 1938 and 1947 in the gold mines of Witwatersrand in South Africa. Here, we describe an outbreak of lymphocutaneous sporotrichosis that was investigated in a South African gold mine in 2011. Employees working at a reopened section of the mine were recruited for a descriptive cross-sectional study. Informed consent was sought for interview, clinical examination and medical record review. Specimens were collected from participants with active or partially-healed lymphocutaneous lesions. Environmental samples were collected from underground mine levels. Sporothrix isolates were identified by sequencing of the internal transcribed spacer region of the ribosomal gene and the nuclear calmodulin gene. Of 87 male miners, 81 (93%) were interviewed and examined, of whom 29 (36%) had skin lesions; specimens were collected from 17 (59%). Sporotrichosis was laboratory-confirmed among 10 patients and seven had clinically-compatible lesions. Of 42 miners with known HIV status, 11 (26%) were HIV-infected. No cases of disseminated disease were detected. Participants with ≤ 3 years' mining experience had a four times greater odds of developing sporotrichosis than those who had been employed for >3 years (adjusted OR 4.0, 95% CI 1.2-13.1). Isolates from 8 patients were identified as Sporothrix schenckii sensu stricto by calmodulin gene sequencing while environmental isolates were identified as Sporothrix mexicana. S. schenckii sensu stricto was identified as the causative pathogen. Although genetically distinct species were isolated from clinical and environmental sources, it is likely that the source was contaminated soil and untreated wood underground. No cases occurred following recommendations to close sections of the mine, treat timber and encourage consistent use of personal protective equipment. Sporotrichosis is a potentially re-emerging disease where traditional, rather than heavily mechanised, mining techniques are

  12. An Outbreak of Lymphocutaneous Sporotrichosis among Mine-Workers in South Africa

    PubMed Central

    Govender, Nelesh P.; Maphanga, Tsidiso G.; Zulu, Thokozile G.; Patel, Jaymati; Walaza, Sibongile; Jacobs, Charlene; Ebonwu, Joy I.; Ntuli, Sindile; Naicker, Serisha D.; Thomas, Juno

    2015-01-01

    Background The largest outbreak of sporotrichosis occurred between 1938 and 1947 in the gold mines of Witwatersrand in South Africa. Here, we describe an outbreak of lymphocutaneous sporotrichosis that was investigated in a South African gold mine in 2011. Methodology Employees working at a reopened section of the mine were recruited for a descriptive cross-sectional study. Informed consent was sought for interview, clinical examination and medical record review. Specimens were collected from participants with active or partially-healed lymphocutaneous lesions. Environmental samples were collected from underground mine levels. Sporothrix isolates were identified by sequencing of the internal transcribed spacer region of the ribosomal gene and the nuclear calmodulin gene. Principal Findings Of 87 male miners, 81 (93%) were interviewed and examined, of whom 29 (36%) had skin lesions; specimens were collected from 17 (59%). Sporotrichosis was laboratory-confirmed among 10 patients and seven had clinically-compatible lesions. Of 42 miners with known HIV status, 11 (26%) were HIV-infected. No cases of disseminated disease were detected. Participants with ≤3 years’ mining experience had a four times greater odds of developing sporotrichosis than those who had been employed for >3 years (adjusted OR 4.0, 95% CI 1.2–13.1). Isolates from 8 patients were identified as Sporothrix schenckii sensu stricto by calmodulin gene sequencing while environmental isolates were identified as Sporothrix mexicana. Conclusions/Significance S. schenckii sensu stricto was identified as the causative pathogen. Although genetically distinct species were isolated from clinical and environmental sources, it is likely that the source was contaminated soil and untreated wood underground. No cases occurred following recommendations to close sections of the mine, treat timber and encourage consistent use of personal protective equipment. Sporotrichosis is a potentially re-emerging disease where

  13. Recent developments in the reclamation of surface mined lands

    USGS Publications Warehouse

    Sharma, K.D.; Gough, L.P.; Kumar, S.; Sharma, B.K.; Saxena, S.K.

    1997-01-01

    A broad review of mine land reclamation problems and challenges in arid lands is presented with special emphasis on work recently completed in India. The economics of mining in the Indian Desert is second only to agriculture in importance. Lands disturbed by mining, however, have only recently been the focus of reclamation attempts. Studies were made and results compiled of problems associated with germplasm selection, soil, plant and overburden characterization and manipulation, plant establishment methods utilized, soil amendment needs, use and conservation of available water and the evaluation of ecosystem sustainability. Emphasis is placed on the need for multi-disciplinary approaches to mine land reclamation research and for the long-term monitoring of reclamation success.

  14. Study of acid mine drainage management with evaluating climate and rainfall in East Pit 3 West Banko coal mine

    NASA Astrophysics Data System (ADS)

    Rochyani, Neny

    2017-11-01

    Acid mine drainage is a major problem for the mining environment. The main factor in the formation of acid mine drainage is the volume of rainfall. Therefore, it is important to understand the main rainfall and seasonal patterns for the management of acid mine drainage. This study focuses on the effects of rainfall on acid mine water management. Daily rainfall data, together with monthly and seasonal patterns analyzed using the Gumbel approach, give the amount of rainfall that occurred in the East Pit 3 West Banko area. The data also show that the highest maximum daily rainfall was 165 mm/day and the lowest 76.4 mm/day. During the period 2007-2016, high rainfall occurred from November to April, so less lime was used, while low rainfall occurred from May to October, when more lime was used. Based on the calculation of the lime requirement for each return period, the total lime and financial requirements for treatment can be determined for each return period.
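
    The abstract does not say how the Gumbel approach was applied. One common variant, assumed here, is the method of moments with a frequency factor: the rainfall for return period T is estimated as the mean of the annual maxima plus K_T times their standard deviation. In the sketch below only the values 76.4 and 165.0 mm/day come from the abstract; the other eight annual maxima are hypothetical placeholders.

```python
import math
import statistics


def gumbel_frequency_factor(return_period: float) -> float:
    """Frequency factor K_T for the Gumbel (EV1) distribution, T > 1 years."""
    euler_gamma = 0.5772156649
    reduced = math.log(math.log(return_period / (return_period - 1.0)))
    return -(math.sqrt(6.0) / math.pi) * (euler_gamma + reduced)


def gumbel_return_level(annual_maxima, return_period: float) -> float:
    """Method-of-moments estimate: mean + K_T * sample standard deviation."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    return mean + gumbel_frequency_factor(return_period) * std


# Annual maximum daily rainfall in mm/day for ten years.
# Only 76.4 and 165.0 are reported in the abstract; the rest are made up.
ANNUAL_MAXIMA = [76.4, 98.0, 110.5, 87.2, 165.0, 120.3, 95.8, 132.1, 104.6, 90.0]

for period in (2, 5, 10, 25):
    level = gumbel_return_level(ANNUAL_MAXIMA, period)
    print(f"T = {period:>2} years: {level:.1f} mm/day")
```

    A lime requirement per return period would then follow from each rainfall estimate, but that step depends on site chemistry not given in the abstract and is left out.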

  15. The human factor in mining reclamation

    USGS Publications Warehouse

    Arbogast, Belinda F.; Knepper, Daniel H.; Langer, William H.

    2000-01-01

    Rapid urbanization of the landscape results in less space available for wildlife habitat, agriculture, and recreation. Mineral resources (especially nonmetallic construction materials) become unrecoverable due to inaccessibility caused by development. This report both describes mine sites with serious problems and draws attention to thoughtful reclamation projects for better future management. It presents information from selected sites in terms of their history, landform, design approach, and visual discernment. Examples from Colorado are included to introduce the broader issue of regions soundly developing mining sites, permitting the best utilization of natural resources, and respecting the landscape.

  16. Application of techniques to identify coal-mine and power-generation effects on surface-water quality, San Juan River basin, New Mexico and Colorado

    USGS Publications Warehouse

    Goetz, C.L.; Abeyta, Cynthia G.; Thomas, E.V.

    1987-01-01

    Numerous analytical techniques were applied to determine water quality changes in the San Juan River basin upstream of Shiprock, New Mexico. Eight techniques were used to analyze hydrologic data such as precipitation, water quality, and streamflow. The eight methods used are: (1) Piper diagram, (2) time-series plot, (3) frequency distribution, (4) box-and-whisker plot, (5) seasonal Kendall test, (6) Wilcoxon rank-sum test, (7) SEASRS procedure, and (8) analysis of flow adjusted, specific conductance data and smoothing. Post-1963 changes in dissolved solids concentration, dissolved potassium concentration, specific conductance, suspended sediment concentration, or suspended sediment load in the San Juan River downstream from the surface coal mines were examined to determine if coal mining was having an effect on the quality of surface water. None of the analytical methods used to analyze the data showed any increase in dissolved solids concentration, dissolved potassium concentration, or specific conductance in the river downstream from the mines; some of the analytical methods used showed a decrease in dissolved solids concentration and specific conductance. Chaco River, an ephemeral stream tributary to the San Juan River, undergoes changes in water quality due to effluent from a power generation facility. The discharge in the Chaco River contributes about 1.9% of the average annual discharge at the downstream station, San Juan River at Shiprock, NM. The changes in water quality detected at the Chaco River station were not detected at the downstream Shiprock station. It was not possible, with the available data, to identify any effects of the surface coal mines on water quality that were separable from those of urbanization, agriculture, and other cultural and natural changes. In order to determine the specific causes of changes in water quality, it would be necessary to collect additional data at strategically located stations. (Author's abstract)

  17. Review of Recent Development of Dynamic Wind Farm Equivalent Models Based on Big Data Mining

    NASA Astrophysics Data System (ADS)

    Wang, Chenggen; Zhou, Qian; Han, Mingzhe; Lv, Zhan’ao; Hou, Xiao; Zhao, Haoran; Bu, Jing

    2018-04-01

    Recently, the big data mining method has been applied in dynamic wind farm equivalent modeling. In this paper, its recent development, covering present research both domestic and overseas, is reviewed. Firstly, studies of wind speed prediction, equivalence and its distribution in the wind farm are summarized. Secondly, two typical approaches used in the big data mining method are introduced. Single wind turbine equivalent modeling focuses on how to choose and identify equivalent parameters. Multiple wind turbine equivalent modeling concentrates on three aspects: aggregation of different wind turbine clusters, the parameters in the same cluster, and equivalence of the collector system. Thirdly, an outlook on the future development of dynamic wind farm equivalent models is discussed.

  18. Identifying influential factors of business process performance using dependency analysis

    NASA Astrophysics Data System (ADS)

    Wetzstein, Branimir; Leitner, Philipp; Rosenberg, Florian; Dustdar, Schahram; Leymann, Frank

    2011-02-01

    We present a comprehensive framework for identifying influential factors of business process performance. In particular, our approach combines monitoring of process events and Quality of Service (QoS) measurements with dependency analysis to effectively identify influential factors. The framework uses data mining techniques to construct tree structures to represent dependencies of a key performance indicator (KPI) on process and QoS metrics. These dependency trees allow business analysts to determine how process KPIs depend on lower-level process metrics and QoS characteristics of the IT infrastructure. The structure of the dependencies enables a drill-down analysis of single factors of influence to gain a deeper knowledge why certain KPI targets are not met.

  19. Pattern mining of user interaction logs for a post-deployment usability evaluation of a radiology PACS client.

    PubMed

    Jorritsma, Wiard; Cnossen, Fokie; Dierckx, Rudi A; Oudkerk, Matthijs; van Ooijen, Peter M A

    2016-01-01

    To perform a post-deployment usability evaluation of a radiology Picture Archiving and Communication System (PACS) client based on pattern mining of user interaction log data, and to assess the usefulness of this approach compared to a field study. All user actions performed on the PACS client were logged for four months. A data mining technique called closed sequential pattern mining was used to automatically extract frequently occurring interaction patterns from the log data. These patterns were used to identify usability issues with the PACS. The results of this evaluation were compared to the results of a field study based usability evaluation of the same PACS client. The interaction patterns revealed four usability issues: (1) the display protocols do not function properly, (2) the line measurement tool stays active until another tool is selected, rather than being deactivated after one use, (3) the PACS's built-in 3D functionality does not allow users to effectively perform certain 3D-related tasks, (4) users underuse the PACS's customization possibilities. All usability issues identified based on the log data were also found in the field study, which identified 48 issues in total. Post-deployment usability evaluation based on pattern mining of user interaction log data provides useful insights into the way users interact with the radiology PACS client. However, it reveals few usability issues compared to a field study and should therefore not be used as the sole method of usability evaluation. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. Applying Data Mining Techniques to Improve Breast Cancer Diagnosis.

    PubMed

    Diz, Joana; Marreiros, Goreti; Freitas, Alberto

    2016-09-01

    In the field of breast cancer research, more than ever, new computer aided diagnosis based systems have been developed aiming to reduce false positives in diagnostic tests. Within this work, we present a data mining based approach which might support oncologists in the process of breast cancer classification and diagnosis. The present study aims to compare two breast cancer datasets and find the best methods in predicting benign/malignant lesions, breast density classification, and even for finding identification (mass / microcalcification distinction). To carry out these tasks, two matrices of texture features extraction were implemented using Matlab, and classified using data mining algorithms, on WEKA. Results revealed good percentages of accuracy for each class: 89.3 to 64.7 % - benign/malignant; 75.8 to 78.3 % - dense/fatty tissue; 71.0 to 83.1 % - finding identification. Among the different classifiers tested, Naive Bayes was the best at identifying mass texture, and Random Forests was the first or second best classifier for the majority of tested groups.

  1. Data Mining Research with the LSST

    NASA Astrophysics Data System (ADS)

    Borne, Kirk D.; Strauss, M. A.; Tyson, J. A.

    2007-12-01

    The LSST catalog database will exceed 10 petabytes, comprising several hundred attributes for 5 billion galaxies, 10 billion stars, and over 1 billion variable sources (optical variables, transients, or moving objects), extracted from over 20,000 square degrees of deep imaging in 5 passbands with thorough time domain coverage: 1000 visits over the 10-year LSST survey lifetime. The opportunities are enormous for novel scientific discoveries within this rich time-domain ultra-deep multi-band survey database. Data Mining, Machine Learning, and Knowledge Discovery research opportunities with the LSST are now under study, with a potential for new collaborations to develop to contribute to these investigations. We will describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. We also give some illustrative examples of current scientific data mining research in astronomy, and point out where new research is needed. In particular, the data mining research community will need to address several issues in the coming years as we prepare for the LSST data deluge. The data mining research agenda includes: scalability (at petabytes scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; visual data mining algorithms for visual exploration of the data; indexing of multi-attribute multi-dimensional astronomical databases (beyond RA-Dec spatial indexing) for rapid querying of petabyte databases; and more. Finally, we will identify opportunities for synergistic collaboration between the data mining research group and the LSST Data Management and Science

  2. Cinnamon gulch revisited: Another look at separating natural and mining-impacted contributions to instream metal load

    USGS Publications Warehouse

    Runkel, Robert L.; Verplanck, Philip; Kimball, Briant; Walton-Day, Katie

    2018-01-01

    Baseline, premining data for streams draining abandoned mine lands are virtually nonexistent, and indirect methods for estimating premining conditions are needed to establish realistic, cost-effective cleanup goals. One such indirect method is the proximal analog approach, in which premining conditions are estimated using data from nearby mineralized areas that are unaffected by mining. In this paper, we combine the proximal analog approach with a quantitative mass balance framework using data from a spatially-detailed synoptic sampling campaign. The combined approach is applied to Cinnamon Gulch, a headwater stream with numerous draining adits. Synoptic sampling results indicate that three of the top five metal sources are affected by mining activities, and stream segments draining these sources account for a large percentage of overall metal loading within the study reach. These initial calculations overestimate the effects of mining, as the affected stream segments were likely acidic and metal rich prior to mining. Premining loads and concentrations were therefore determined through a replacement approach in which the chemistry of each mining-affected stream segment is revised based on proximal analog concentrations. The revised loading profiles indicate that 15–17% of the Al, Cd, Cu, Mn, Ni, and Zn loads are attributable to mining, whereas the mining contribution for Pb is 40%. Premining concentrations of Al, Cd, Cu, Mn, and Zn are estimated to be in excess of aquatic life standards over the length of the study reach.
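
    The replacement idea in this abstract can be sketched with a much simplified mass balance, assuming that load is discharge times concentration and that the mining contribution is the share of observed load that disappears when mining-affected inflows are given analog concentrations. This is not the authors' framework, and every number and field name below is hypothetical.

```python
def load(discharge_l_s: float, conc_mg_l: float) -> float:
    """Metal load in mg/s from discharge (L/s) and concentration (mg/L)."""
    return discharge_l_s * conc_mg_l


def mining_fraction(segments) -> float:
    """Share of the observed load attributed to mining under replacement."""
    observed = 0.0
    premining = 0.0
    for seg in segments:
        observed += load(seg["q"], seg["c_obs"])
        # Mining-affected inflows get the proximal analog concentration;
        # unaffected inflows keep their observed concentration.
        c_pre = seg["c_analog"] if seg["mining_affected"] else seg["c_obs"]
        premining += load(seg["q"], c_pre)
    return 1.0 - premining / observed


# Hypothetical inflows to a study reach (q in L/s, concentrations in mg/L).
SEGMENTS = [
    {"q": 2.0, "c_obs": 10.0, "c_analog": 6.0, "mining_affected": True},
    {"q": 5.0, "c_obs": 4.0, "c_analog": 4.0, "mining_affected": False},
]

print(f"mining share of load: {mining_fraction(SEGMENTS):.0%}")
```

    Setting the analog concentration to zero instead would attribute the whole load of an affected segment to mining, which is the overestimate the abstract warns about.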

  3. A novel approach for acid mine drainage pollution biomonitoring using rare earth elements bioaccumulated in the freshwater clam Corbicula fluminea.

    PubMed

    Bonnail, Estefanía; Pérez-López, Rafael; Sarmiento, Aguasanta M; Nieto, José Miguel; DelValls, T Ángel

    2017-09-15

    Lanthanide series have been used as a record of the water-rock interaction and work as a tool for identifying impacts of acid mine drainage (lixiviate residue derived from sulphide oxidation). The application of North-American Shale Composite-normalized rare earth elements patterns to these minority elements allows determining the origin of the contamination. In the current study, geochemical patterns were applied to rare earth elements bioaccumulated in the soft tissue of the freshwater clam Corbicula fluminea after exposure to different acid mine drainage contaminated environments. Results show significant bioaccumulation of rare earth elements in soft tissue of the clam after 14 days of exposure to acid mine drainage contaminated sediment (ΣREE = 1.3-8 μg/g dw). Furthermore, it was possible to biomonitor different degrees of contamination based on rare earth elements in tissue. The pattern of this type of contamination describes a particular curve characterized by an enrichment in the middle rare earth elements; a homologous pattern (E_MREE = 0.90) has also been observed when NASC normalization is applied to clam tissues. Results of lanthanides found in clams were contrasted with the paucity of toxicity studies, determining risk caused by light rare earth elements in the Odiel River close to the Estuary. The current study proposes the use of the clam as an innovative "bio-tool" for the biogeochemical monitoring of pollution inputs that determines the acid mine drainage networks affection. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Geochemical characterisation of seepage and drainage water quality from two sulphide mine tailings impoundments: Acid mine drainage versus neutral mine drainage

    USGS Publications Warehouse

    Heikkinen, P.M.; Raisanen, M.L.; Johnson, R.H.

    2009-01-01

    Seepage water and drainage water geochemistry (pH, EC, O2, redox, alkalinity, dissolved cations and trace metals, major anions, total element concentrations) were studied at two active sulphide mine tailings impoundments in Finland (the Hitura Ni mine and Luikonlahti Cu mine/talc processing plant). The data were used to assess the factors influencing tailings seepage quality and to identify constraints for water treatment. Changes in seepage water quality after equilibration with atmospheric conditions were evaluated based on geochemical modelling. At Luikonlahti, annual and seasonal changes were also studied. Seepage quality was largely influenced by the tailings mineralogy, and the serpentine-rich, low sulphide Hitura tailings produced neutral mine drainage with high Ni. In contrast, drainage from the high sulphide, multi-metal tailings of Luikonlahti represented typical acid mine drainage with elevated contents of Zn, Ni, Cu, and Co. Other factors affecting the seepage quality included weathering of the tailings along the seepage flow path, process water input, local hydrological settings, and structural changes in the tailings impoundment. Geochemical modelling showed that pH increased and some heavy metals were adsorbed to Fe precipitates after net alkaline waters equilibrated with the atmosphere. In the net acidic waters, pH decreased and no adsorption occurred. A combination of aerobic and anaerobic treatments is proposed for Hitura seepages to decrease the sulphate and metal loading. For Luikonlahti, prolonged monitoring of the seepage quality is suggested instead of treatment, since the water quality is still adjusting to recent modifications to the tailings impoundment.

  5. POST-MINING DEVELOPMENT USING RESOURCES FROM FLOODED UNDERGROUND MINE WORKINGS

    EPA Science Inventory

    Post-mining issues of land and surface utilization now serve to accentuate how important it is to incorporate sustainable development aspects into hard rock mining. In an effort to revitalize lands degraded by historic mining, 10 acres of mine tailings near the Belmont Mine have...

  6. Mining concepts of health responsibility using text mining and exploratory graph analysis.

    PubMed

    Kjellström, Sofia; Golino, Hudson

    2018-05-24

    Occupational therapists need to know about people's beliefs about personal responsibility for health to help them pursue everyday activities. The study aims to employ state-of-the-art quantitative approaches to understand people's views of health and responsibility at different ages. A mixed method approach was adopted, using text mining to extract information from 233 interviews with participants aged 5 to 96 years, and then exploratory graph analysis to estimate the number of latent variables. The fit of the structure estimated via the exploratory graph analysis was verified using confirmatory factor analysis. Exploratory graph analysis estimated three dimensions of health responsibility: (1) creating good health habits and feeling good; (2) thinking about one's own health and wanting to improve it; and (3) adopting explicitly normative attitudes to take care of one's health. The comparison between the three dimensions among age groups showed, in general, that children and adolescents, as well as the old elderly (>73 years old) expressed ideas about personal responsibility for health less than young adults, adults and young elderly. Occupational therapists' knowledge of the concepts of health responsibility is of value when working with a patient's health, but an identified challenge is how to engage children and older persons.

  7. Mars methane analogue mission: Mission simulation and rover operations at Jeffrey Mine and Norbestos Mine Quebec, Canada

    NASA Astrophysics Data System (ADS)

    Qadi, A.; Cloutis, E.; Samson, C.; Whyte, L.; Ellery, A.; Bell, J. F.; Berard, G.; Boivin, A.; Haddad, E.; Lavoie, J.; Jamroz, W.; Kruzelecky, R.; Mack, A.; Mann, P.; Olsen, K.; Perrot, M.; Popa, D.; Rhind, T.; Sharma, R.; Stromberg, J.; Strong, K.; Tremblay, A.; Wilhelm, R.; Wing, B.; Wong, B.

    2015-05-01

    The Canadian Space Agency (CSA), through its Analogue Missions program, supported a microrover-based analogue mission designed to simulate a Mars rover mission geared toward identifying and characterizing methane emissions on Mars. The analogue mission included two, progressively more complex, deployments in open-pit asbestos mines where methane can be generated from the weathering of olivine into serpentine: the Jeffrey mine deployment (June 2011) and the Norbestos mine deployment (June 2012). At the Jeffrey Mine, testing was conducted over 4 days using a modified off-the-shelf Pioneer rover and scientific instruments including Raman spectrometer, Picarro methane detector, hyperspectral point spectrometer and electromagnetic induction sounder for testing rock and gas samples. At the Norbestos Mine, we used the research Kapvik microrover which features enhanced autonomous navigation capabilities and a wider array of scientific instruments. This paper describes the rover operations in terms of planning, deployment, communication and equipment setup, rover path parameters and instrument performance. Overall, the deployments suggest that a search strategy of “follow the methane” is not practical given the mechanisms of methane dispersion. Rather, identification of features related to methane sources based on image tone/color and texture from panoramic imagery is more profitable.

  8. Identifying high-cost patients using data mining techniques and a small set of non-trivial attributes.

    PubMed

    Izad Shenas, Seyed Abdolmotalleb; Raahemi, Bijan; Hossein Tekieh, Mohammad; Kuziemsky, Craig

    2014-10-01

    In this paper, we use data mining techniques, namely neural networks and decision trees, to build predictive models to identify very high-cost patients in the top 5 percentile among the general population. A large empirical dataset from the Medical Expenditure Panel Survey with 98,175 records was used in our study. After pre-processing, partitioning and balancing the data, the refined dataset of 31,704 records was modeled by Decision Trees (including C5.0 and CHAID), and Neural Networks. The performances of the models are analyzed using various measures including accuracy, G-mean, and Area under ROC curve. We concluded that the CHAID classifier returns the best G-mean and AUC measures for top performing predictive models ranging from 76% to 85%, and 0.812 to 0.942 units, respectively. We also identify a small set of 5 non-trivial attributes among a primary set of 66 attributes to identify the top 5% of the high cost population. The attributes are the individual's overall health perception, age, history of blood cholesterol check, history of physical/sensory/mental limitations, and history of colonic prevention measures. The small set of attributes is what we call non-trivial: it does not include visits to care providers, doctors or hospitals, which are highly correlated with expenditures and do not offer new insight into the data. The results of this study can be used by healthcare data analysts, policy makers, insurers, and healthcare planners to improve the delivery of health services. Copyright © 2014 Elsevier Ltd. All rights reserved.
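
    The G-mean named in the abstract above is commonly defined as the geometric mean of sensitivity and specificity. The following is a minimal sketch under that assumption; the function name and the confusion-matrix counts are hypothetical and are not taken from the study. It illustrates why G-mean is a useful companion to accuracy on imbalanced data such as a top-5% cost group.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def accuracy(tp, fn, tn, fp):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical counts: 100 high-cost patients among 1,100 records.
balanced_score = g_mean(tp=80, fn=20, tn=900, fp=100)

# A degenerate model that labels everyone "not high-cost" still looks
# accurate, but its G-mean collapses to zero.
trivial_accuracy = accuracy(tp=0, fn=100, tn=1000, fp=0)
trivial_score = g_mean(tp=0, fn=100, tn=1000, fp=0)
```

    In this sketch the degenerate model keeps an accuracy above 0.9 while its G-mean is 0, which is the kind of failure the metric is meant to expose.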

  9. Geochemistry and mercury contamination in receiving environments of artisanal mining wastes and identified concerns for food safety

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reichelt-Brushett, Amanda J.

    Artisanal small-scale gold mining (ASGM) using mercury (Hg) amalgamation has been occurring on Buru Island, Indonesia since early 2012, and has caused rapid accumulation of high Hg concentrations in river, estuary and marine sediments. In this study, sediment samples were collected from several sites downstream of the Mount Botak ASGM site, as well as in the vicinity of the more recently established site at Gogrea where no sampling had previously been completed. All sediment samples had total Hg (THg) concentrations exceeding Indonesian sediment quality guidelines and were up to 82 times this limit at one estuary site. The geochemistry of sediments in receiving environments indicates the potential for Hg-methylation to form highly bioavailable Hg species. To assess the current contamination threat from consumption of local seafood, samples of fish, molluscs and crustaceans were collected from the Namlea fish market and analysed for THg concentrations. The majority of edible tissue samples had elevated THg concentrations, which raises concerns for food safety. This study shows that river, estuary and marine ecosystems downstream of ASGM operations on Buru Island are exposed to dangerously high Hg concentrations, which are impacting aquatic food chains, and fisheries resources. Considering the high dietary dependence on marine protein in the associated community and across the Mollucas Province, and the short time period since ASGM operations commenced in this region, the results warrant urgent further investigation, risk mitigation, and community education. - Highlights: • Mercury contamination of sediments and seafood due to artisanal gold mining. • Considerable risks to human and ecosystem health are identified. • Results emphasise the urgent need for risk mitigation and community education.

  10. Sampling and monitoring for the mine life cycle

    USGS Publications Warehouse

    McLemore, Virginia T.; Smith, Kathleen S.; Russell, Carol C.

    2014-01-01

    Sampling and Monitoring for the Mine Life Cycle provides an overview of sampling for environmental purposes and monitoring of environmentally relevant variables at mining sites. It focuses on environmental sampling and monitoring of surface water, and also considers groundwater, process water streams, rock, soil, and other media including air and biological organisms. The handbook includes an appendix of technical summaries written by subject-matter experts that describe field measurements, collection methods, and analytical techniques and procedures relevant to environmental sampling and monitoring. The sixth of a series of handbooks on technologies for management of metal mine and metallurgical process drainage, this handbook supplements and enhances current literature and provides an awareness of the critical components and complexities involved in environmental sampling and monitoring at the mine site. It differs from most information sources by providing an approach to address all types of mining influenced water and other sampling media throughout the mine life cycle. Sampling and Monitoring for the Mine Life Cycle is organized into a main text and six appendices that are an integral part of the handbook. Sidebars and illustrations are included to provide additional detail about important concepts, to present examples and brief case studies, and to suggest resources for further information. Extensive references are included.

  11. 30 CFR 77.1712 - Reopening mines; notification; inspection prior to mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... to mining. 77.1712 Section 77.1712 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... prior to mining. Prior to reopening any surface coal mine after it has been abandoned or declared... an authorized representative of the Secretary before any mining operations in such mine are...

  12. 30 CFR 77.1712 - Reopening mines; notification; inspection prior to mining.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... to mining. 77.1712 Section 77.1712 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... prior to mining. Prior to reopening any surface coal mine after it has been abandoned or declared... an authorized representative of the Secretary before any mining operations in such mine are...

  13. Stochastic production phase design for an open pit mining complex with multiple processing streams

    NASA Astrophysics Data System (ADS)

    Asad, Mohammad Waqar Ali; Dimitrakopoulos, Roussos; van Eldert, Jeroen

    2014-08-01

    In a mining complex, the mine is a source of supply of valuable material (ore) to a number of processes that convert the raw ore to a saleable product or a metal concentrate for production of the refined metal. In this context, expected variation in metal content throughout the extent of the orebody defines the inherent uncertainty in the supply of ore, which impacts the subsequent ore and metal production targets. Traditional optimization methods for designing production phases and ultimate pit limit of an open pit mine not only ignore the uncertainty in metal content, but, in addition, commonly assume that the mine delivers ore to a single processing facility. A stochastic network flow approach is proposed that jointly integrates uncertainty in supply of ore and multiple ore destinations into the development of production phase design and ultimate pit limit. An application at a copper mine demonstrates the intricacies of the new approach. The case study shows a 14% higher discounted cash flow when compared to the traditional approach.

  14. Genome mining for ribosomally synthesized natural products.

    PubMed

    Velásquez, Juan E; van der Donk, Wilfred A

    2011-02-01

    In recent years, the number of known peptide natural products that are synthesized via the ribosomal pathway has rapidly grown. Taking advantage of sequence homology among genes encoding precursor peptides or biosynthetic proteins, in silico mining of genomes combined with molecular biology approaches has guided the discovery of a large number of new ribosomal natural products, including lantipeptides, cyanobactins, linear thiazole/oxazole-containing peptides, microviridins, lasso peptides, amatoxins, cyclotides, and conopeptides. In this review, we describe the strategies used for the identification of these ribosomally synthesized and posttranslationally modified peptides (RiPPs) and the structures of newly identified compounds. The increasing number of chemical entities and their remarkable structural and functional diversity may lead to novel pharmaceutical applications. Copyright © 2010 Elsevier Ltd. All rights reserved.

  15. Genome Mining for Ribosomally Synthesized Natural Products

    PubMed Central

    Velásquez, Juan E.; van der Donk, Wilfred

    2011-01-01

    In recent years, the number of known peptide natural products that are synthesized via the ribosomal pathway has rapidly grown. Taking advantage of sequence homology among genes encoding precursor peptides or biosynthetic proteins, in silico mining of genomes combined with molecular biology approaches has guided the discovery of a large number of new ribosomal natural products, including lantipeptides, cyanobactins, linear thiazole/oxazole-containing peptides, microviridins, lasso peptides, amatoxins, cyclotides, and conopeptides. In this review, we describe the strategies used for the identification of these ribosomally-synthesized and posttranslationally modified peptides (RiPPs) and the structures of newly identified compounds. The increasing number of chemical entities and their remarkable structural and functional diversity may lead to novel pharmaceutical applications. PMID:21095156

  16. Identified Palliative Care Approach Needs with SPICT in Family Practice: A Preliminary Observational Study.

    PubMed

    Hamano, Jun; Oishi, Ai; Kizawa, Yoshiyuki

    2018-02-09

    Identifying patients who require palliative care approach is challenging for family physicians, even though several identification tools have been developed for this purpose. To explore the prevalence and characteristics of family practice patients who need palliative care approach as determined using Supportive and Palliative Care Indicators Tool (SPICT™, April 2015) in Japan. Single-center cross-sectional study. We enrolled all patients ≥65 years of age who visited the chief researcher's outpatient clinic in October 2016. We used Japanese version of SPICT (SPICT-J) to identify patients who need palliative care approach. We assessed patients' backgrounds and whether they had undergone advance care planning with their family physicians. This study included 87 patients (61 females) with a mean age of 79.0 ± 7.4 years. Eight patients (9.2%) were identified as needing palliative care approach. The mean age of patients who needed this approach was 82.3 ± 8.3 years and main underlying conditions were heart/vascular disease (37.5%), dementia/frailty (25.0%), and respiratory disease (12.5%). Only two of eight patients identified as needing palliative care approach had discussed advance care planning with their family physicians. In family practice, 9.2% of outpatients ≥65 years of age were identified as needing palliative care approach. Family physicians should carefully evaluate whether outpatients need palliative care approach.

  17. Moment tensor clustering: a tool to monitor mining induced seismicity

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Dahm, Torsten; Tolga Sen, Ali

    2013-04-01

    Automated moment tensor inversion routines have been set up in the last decades for the analysis of global and regional seismicity. Recent developments could be used to analyse smaller events and larger datasets. In particular, applications to microseismicity, e.g. in mining environments, have then led to the generation of large moment tensor catalogues. Moment tensor catalogues provide valuable information about the earthquake source and details of rupturing processes taking place in the seismogenic region. Earthquake focal mechanisms can be used to discuss the local stress field, possible orientations of the fault system or to evaluate the presence of shear and/or tensile cracks. Focal mechanism and moment tensor solutions are typically analysed for selected events, and quick and robust tools for the automated analysis of larger catalogues are needed. We propose here a method to perform cluster analysis for large moment tensor catalogues and identify families of events which characterize the studied microseismicity. Clusters include events with similar focal mechanisms, first requiring the definition of distance between focal mechanisms. Different metrics are here proposed, both for the case of pure double couple, constrained moment tensor and full moment tensor catalogues. Different clustering approaches are implemented and discussed. The method is here applied to synthetic and real datasets from mining environments to demonstrate its potential: the proposed clustering techniques prove to be able to automatically recognise major clusters. An important application for mining monitoring concerns the early identification of anomalous rupture processes, which is relevant for the hazard assessment. This study is funded by the project MINE, which is part of the R&D-Programme GEOTECHNOLOGIEN. The project MINE is funded by the German Ministry of Education and Research (BMBF), Grant of project BMBF03G0737.
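
    Clustering of the kind described above first needs a distance between source mechanisms. The abstract does not spell out its metrics, so the sketch below uses one simple and widely used choice, a distance derived from the normalized inner product of two full moment tensors. This is an assumption for illustration, not necessarily one of the metrics the authors propose, and the function name and example tensors are hypothetical.

```python
import math


def mt_distance(m1, m2):
    """Distance in [0, 1] between two 3x3 moment tensors.

    0 means identical mechanism (regardless of magnitude),
    1 means the same mechanism with reversed polarity.
    """
    dot = sum(m1[i][j] * m2[i][j] for i in range(3) for j in range(3))
    norm1 = math.sqrt(sum(v * v for row in m1 for v in row))
    norm2 = math.sqrt(sum(v * v for row in m2 for v in row))
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))  # guard rounding
    return 0.5 * (1.0 - cosine)


# Hypothetical tensors: a double couple, a larger event with the same
# mechanism, its polarity reversal, and an unrelated double couple.
dc = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
dc_larger = [[0, 5, 0], [5, 0, 0], [0, 0, 0]]
dc_reversed = [[0, -1, 0], [-1, 0, 0], [0, 0, 0]]
dc_other = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]

d_same = mt_distance(dc, dc_larger)
d_reversed = mt_distance(dc, dc_reversed)
d_other = mt_distance(dc, dc_other)
```

    A pairwise matrix of such distances could then be passed to any standard clustering routine; which routine suits a given catalogue is left open here.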

  18. A primer to frequent itemset mining for bioinformatics

    PubMed Central

    Naulaerts, Stefan; Meysman, Pieter; Bittremieux, Wout; Vu, Trung Nghia; Vanden Berghe, Wim; Goethals, Bart

    2015-01-01

    Over the past two decades, pattern mining techniques have become an integral part of many bioinformatics solutions. Frequent itemset mining is a popular group of pattern mining techniques designed to identify elements that frequently co-occur. An archetypical example is the identification of products that often end up together in the same shopping basket in supermarket transactions. A number of algorithms have been developed to address variations of this computationally non-trivial problem. Frequent itemset mining techniques are able to efficiently capture the characteristics of (complex) data and succinctly summarize it. Owing to these and other interesting properties, these techniques have proven their value in biological data analysis. Nevertheless, information about the bioinformatics applications of these techniques remains scattered. In this primer, we introduce frequent itemset mining and their derived association rules for life scientists. We give an overview of various algorithms, and illustrate how they can be used in several real-life bioinformatics application domains. We end with a discussion of the future potential and open challenges for frequent itemset mining in the life sciences. PMID:24162173
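
    The shopping-basket example in the abstract above can be made concrete with a brute-force sketch. It counts every itemset in every transaction and keeps those that reach a minimum support. The baskets and the function name are hypothetical, and real algorithms such as Apriori avoid this exhaustive enumeration by pruning, so this is only meant to show what "frequent itemset" means.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets seen in >= min_support transactions."""
    counts = {}
    for basket in transactions:
        items = sorted(set(basket))  # canonical order, duplicates removed
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical supermarket transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

result = frequent_itemsets(baskets, min_support=2)
```

    With these baskets each single item and each pair appears at least twice, while the triple appears only once and is dropped. In a bioinformatics setting the "items" might be genes or annotations and the "baskets" samples or experiments.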

  19. Hydrologic Investigations Concerning Lead Mining Issues in Southeastern Missouri

    USGS Publications Warehouse

    Kleeschulte, Michael J.

    2008-01-01

    Good stewardship of our Nation's natural resources demands that the extraction of exploitable, minable ore deposits be conducted in harmony with the protection of the environment, a dilemma faced by many land and water management agencies in the Nation's mining areas. As ore is mined, milled, and sent to the smelter, it leaves footprints where it has been in the form of residual trace metals. Often these footprints become remnants that can be detrimental to other natural resources. This emphasizes the importance of understanding the earth's complex physical and biological processes and their interactions at increasingly smaller scales because subtle changes in one component can substantially affect others. Understanding these changes and resulting effects requires an integrated, multidisciplinary scientific approach. As ore reserves are depleted in one area, additional exploitable deposits are required to replace them, and at times these new deposits are discovered in previously unmined areas. Informed decisions concerning resource management in these new, proposed mining areas require an understanding of the potential consequences of the planned mining actions. This understanding is usually based on knowledge that has been accumulated from studying previously mined areas with similar geohydrologic and biologic conditions. If the two areas experience similar mining practices, the information should be transferable. Lead and zinc mining along the Viburnum Trend Subdistrict of southeastern Missouri has occurred for more than 40 years. Additional potentially exploitable deposits have been discovered 30 miles to the south, within the Mark Twain National Forest. It is anticipated that the observation of current (2008) geohydrologic conditions in the Viburnum Trend can provide insight to land managers that will help reasonably anticipate the potential mining effects should additional mining occur in the exploration area. The purpose of this report is to present a

  20. Seamount characteristics and mine-site model applied to exploration- and mining-lease-block selection for cobalt-rich ferromanganese crusts

    USGS Publications Warehouse

    Hein, James R.; Conrad, Tracey A.; Dunham, Rachel E.

    2009-01-01

    Regulations are being developed through the International Seabed Authority (ISBA) for the exploration and mining of cobalt-rich ferromanganese crusts. This paper lays out geologic and geomorphologic criteria that can be used to determine the size and number of exploration and mine-site blocks that will be the focus of much discussion within the ISBA Council deliberations. The surface areas of 155 volcanic edifices in the central equatorial Pacific were measured and used to develop a mine-site model. The mine-site model considers areas above 2,500 m water depth as permissive, and narrows the general area available for exploration and mining to 20% of that permissive area. It is calculated that about eighteen 100 km2 exploration blocks, each composed of five 20 km2 contiguous sub-blocks, would be adequate to identify a 260 km2 20-year-mine site; the mine site would be composed of thirteen of the 20 km2 sub-blocks. In this hypothetical example, the 260 km2 mine site would be spread over four volcanic edifices and comprise 3.7% of the permissive area of the four edifices and 0.01% of the total area of those four edifices. The eighteen 100 km2 exploration blocks would be selected from a limited geographic area. That confinement area is defined as having a long dimension of not more than 1,000 km and an area of not more than 300,000 km2.
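
    The block arithmetic in the abstract above can be checked in a few lines. The block sizes and counts are those stated in the abstract; the variable names, the retained fraction and the implied permissive area are derived here for illustration and are not figures reported by the authors.

```python
# Figures stated in the abstract.
SUB_BLOCK_KM2 = 20
SUB_BLOCKS_PER_EXPLORATION_BLOCK = 5
EXPLORATION_BLOCKS = 18
MINE_SITE_SUB_BLOCKS = 13
MINE_SITE_SHARE_OF_PERMISSIVE = 0.037  # 3.7%

# Each exploration block is five contiguous 20 km2 sub-blocks.
exploration_block_km2 = SUB_BLOCK_KM2 * SUB_BLOCKS_PER_EXPLORATION_BLOCK

# Total area explored and area finally mined.
total_exploration_km2 = exploration_block_km2 * EXPLORATION_BLOCKS
mine_site_km2 = SUB_BLOCK_KM2 * MINE_SITE_SUB_BLOCKS

# Derived (not stated in the abstract): share of explored area retained,
# and the permissive area implied by the 3.7% figure.
retained_fraction = mine_site_km2 / total_exploration_km2
implied_permissive_km2 = mine_site_km2 / MINE_SITE_SHARE_OF_PERMISSIVE
```

    Under these figures roughly one seventh of the explored area ends up in the mine site, and the four edifices would carry on the order of 7,000 km2 of permissive area.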

  1. Improving the Method of Roof Fall Susceptibility Assessment based on Fuzzy Approach

    NASA Astrophysics Data System (ADS)

    Ghasemi, Ebrahim; Ataei, Mohammad; Shahriar, Kourosh

    2017-03-01

    Retreat mining is always accompanied by a large number of accidents, most of them due to roof fall. Therefore, development of methodologies to evaluate the roof fall susceptibility (RFS) seems essential. Ghasemi et al. (2012) proposed a systematic methodology to assess the roof fall risk during retreat mining based on the classic risk assessment approach. The main defect of this method is that it ignores subjective uncertainties due to linguistic input values of some factors, low resolution, fixed weighting, sharp class boundaries, etc. To remove this defect and improve the method, this paper presents a novel methodology to assess the RFS using a fuzzy approach. The application of the fuzzy approach provides an effective tool to handle the subjective uncertainties. Furthermore, the fuzzy analytical hierarchy process (AHP) is used to structure and prioritize various risk factors and sub-factors during development of this method. This methodology is applied to identify the susceptibility of roof fall occurrence in the main panel of Tabas Central Mine (TCM), Iran. The results indicate that this methodology is effective and efficient in assessing RFS.

  2. Toward edge minability for role mining in bipartite networks

    NASA Astrophysics Data System (ADS)

    Dong, Lijun; Wang, Yi; Liu, Ran; Pi, Benjie; Wu, Liuyi

    2016-11-01

    Bipartite network models have been extensively used in information security to automatically generate role-based access control (RBAC) from datasets. This process is called role mining. However, not all the topologies of bipartite networks are suitable for role mining; some edges may even reduce the quality of role mining. This causes unnecessary time consumption as role mining is NP-hard. Therefore, to promote the quality of role mining results, the capability of an edge to compose roles with other edges, called the minability of the edge, needs to be identified. We tackle the problem from the angle of edge importance in complex networks; that is, an edge easily covered by roles is considered to be more important. Based on this idea, the k-shell decomposition of complex networks is extended to reveal the different minability of edges. In this way, a bipartite network can be quickly purified by excluding the low-minability edges from role mining, and thus the quality of role mining can be effectively improved. Extensive experiments on real-world datasets are conducted to confirm the above claims.
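
    The abstract above builds on k-shell decomposition. The sketch below shows only the standard node-level k-shell decomposition that the authors say they extend; their edge-minability extension is not described in enough detail here to reproduce. The user-permission graph and all names are hypothetical.

```python
def k_shell(adjacency):
    """Map each node to its k-shell index by peeling low-degree nodes."""
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    while remaining:
        # Current shell index: smallest degree still present.
        k = min(degree[n] for n in remaining)
        stack = [n for n in remaining if degree[n] <= k]
        while stack:
            node = stack.pop()
            if node not in remaining:
                continue  # already peeled via another path
            remaining.discard(node)
            shell[node] = k
            for nbr in adjacency[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        stack.append(nbr)
    return shell


# Hypothetical bipartite user-permission network: u1 and u2 share both
# permissions, u3 holds only p2.
graph = {
    "u1": ["p1", "p2"],
    "u2": ["p1", "p2"],
    "u3": ["p2"],
    "p1": ["u1", "u2"],
    "p2": ["u1", "u2", "u3"],
}

shells = k_shell(graph)
```

    Here the pendant user u3 lands in the 1-shell while the densely shared block of u1, u2, p1 and p2 forms the 2-shell, which gives some intuition for why peripheral assignments might be treated as less minable.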

  3. Three-dimensional organic Dirac-line materials due to nonsymmorphic symmetry: A data mining approach

    NASA Astrophysics Data System (ADS)

    Geilhufe, R. Matthias; Bouhon, Adrien; Borysov, Stanislav S.; Balatsky, Alexander V.

    2017-01-01

    A data mining study of electronic Kohn-Sham band structures was performed to identify Dirac materials within the Organic Materials Database. Out of that, the three-dimensional organic crystal 5,6-bis(trifluoromethyl)-2-methoxy-1H-1,3-diazepine was found to host different Dirac-line nodes within the band structure. From a group theoretical analysis, it is possible to distinguish between Dirac-line nodes occurring due to twofold degenerate energy levels protected by the monoclinic crystalline symmetry and twofold degenerate accidental crossings protected by the topology of the electronic band structure. The obtained results can be generalized to all materials having the space group P2_1/c (No. 14, C_{2h}^5) by introducing three distinct topological classes.

  4. Mining large heterogeneous data sets in drug discovery.

    PubMed

    Wild, David J

    2009-10-01

    Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. This paper provides a review of the publicly-available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them and discusses recent work in integrative data mining that looks for associations that span multiple sources, including the use of Semantic Web techniques. The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.

  5. Development of a data-mining algorithm to identify ages at reproductive milestones in electronic medical records.

    PubMed

    Malinowski, Jennifer; Farber-Eger, Eric; Crawford, Dana C

    2014-01-01

    Electronic medical records (EMRs) are becoming more widely implemented following directives from the federal government and incentives for supplemental reimbursements for Medicare and Medicaid claims. Replete with rich phenotypic data, EMRs offer a unique opportunity for clinicians and researchers to identify potential research cohorts and perform epidemiologic studies. Notable limitations to the traditional epidemiologic study include cost, time to complete the study, and limited ancestral diversity; EMR-based epidemiologic studies offer an alternative. The Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study, as part of the Population Architecture using Genomics and Epidemiology (PAGE) I Study, has genotyped more than 15,000 patients of diverse ancestry in BioVU, the Vanderbilt University Medical Center's biorepository linked to the EMR (EAGLE BioVU). We report here the development and performance of data-mining techniques used to identify the age at menarche (AM) and age at menopause (AAM), important milestones in the reproductive lifespan, in women from EAGLE BioVU for genetic association studies. In addition, we demonstrate the ability to discriminate age at naturally-occurring menopause (ANM) from medically-induced menopause. Unusual timing of these events may indicate underlying pathologies and increased risk for some complex diseases and cancer; however, they are not consistently recorded in the EMR. Our algorithm offers a mechanism by which to extract these data for clinical and research goals.

  6. Computer-aided visual assessment in mine planning and design

    Treesearch

    Michael Hatfield; A. J. LeRoy Balzer; Roger E. Nelson

    1979-01-01

    A computer modeling technique is described for evaluating the visual impact of a proposed surface mine located within the viewshed of a national park. A computer algorithm analyzes digitized USGS baseline topography and identifies areas subject to surface disturbance visible from the park. Preliminary mine and reclamation plan information is used to describe how the...

  7. A Systematic Approach to Determining the Identifiability of Multistage Carcinogenesis Models.

    PubMed

    Brouwer, Andrew F; Meza, Rafael; Eisenberg, Marisa C

    2017-07-01

    Multistage clonal expansion (MSCE) models of carcinogenesis are continuous-time Markov process models often used to relate cancer incidence to biological mechanism. Identifiability analysis determines what model parameter combinations can, theoretically, be estimated from given data. We use a systematic approach, based on differential algebra methods traditionally used for deterministic ordinary differential equation (ODE) models, to determine identifiable combinations for a generalized subclass of MSCE models with any number of preinitiation stages and one clonal expansion. Additionally, we determine the identifiable combinations of the generalized MSCE model with up to four clonal expansion stages, and conjecture the results for any number of clonal expansion stages. The results improve upon previous work in a number of ways and provide a framework to find the identifiable combinations for further variations on the MSCE models. Finally, our approach, which takes advantage of the Kolmogorov backward equations for the probability generating functions of the Markov process, demonstrates that identifiability methods used in engineering and mathematics for systems of ODEs can be applied to continuous-time Markov processes. © 2016 Society for Risk Analysis.

  8. Microalgae-bacteria biofilms: a sustainable synergistic approach in remediation of acid mine drainage.

    PubMed

    Abinandan, Sudharsanam; Subashchandrabose, Suresh R; Venkateswarlu, Kadiyala; Megharaj, Mallavarapu

    2018-02-01

    Microalgae and bacteria offer a huge potential in delving interest to study and explore various mechanisms under extreme environments. Acid mine drainage (AMD) is one such environment which is extremely acidic containing copious amounts of heavy metals and poses a major threat to the ecosystem. Despite its extreme conditions, AMD is the habitat for several microbes and their activities. The use of various chemicals in prevention of AMD formation and conventional treatment in a larger scale is not feasible under different geological conditions. It implies that microbe-mediated approach is a viable and sustainable alternative technology for AMD remediation. Microalgae in biofilms play a pivotal role in such bioremediation as they maintain mutualism with heterotrophic bacteria. Synergistic approach of using microalgae-bacteria biofilms provides supportive metabolites from algal biomass for growth of bacteria and mediates remediation of AMD. However, by virtue of their physiology and capabilities of metal removal, non-acidophilic microalgae can be acclimated for use in AMD remediation. A combination of selective acidophilic and non-acidophilic microalgae together with bacteria, all in the form of biofilms, may be very effective for bioremediation of metal-contaminated waters. The present review critically examines the nature of mutualistic interactions established between microalgae and bacteria in biofilms and their role in removal of metals from AMDs, and consequent biomass production for the yield of biofuel. Integration of microalgal-bacterial consortia in fuel cells would be an attractive emerging approach of microbial biotechnology for AMD remediation.

  9. Text mining for the biocuration workflow

    PubMed Central

    Hirschman, Lynette; Burns, Gully A. P. C; Krallinger, Martin; Arighi, Cecilia; Cohen, K. Bretonnel; Valencia, Alfonso; Wu, Cathy H.; Chatr-Aryamontri, Andrew; Dowell, Karen G.; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on ‘Text Mining for the BioCuration Workflow’ at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community. PMID:22513129

  10. Text mining for the biocuration workflow.

    PubMed

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  11. Mitigation planning for raptors during mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Platt, S.W.; Hargis, N.E.

    1990-12-31

    Birds of prey and their eggs, young and nests are protected by state and federal laws and regulations. Surface mining operators may experience conflicts with raptors when expanding into nesting areas or when raptors are attracted into mining areas. State and federal permits are required for disturbance or manipulation of birds of prey. Mitigation planning for raptors begins before mining and continues through mining. As conflict situations change, so must the mitigation plan. Before each nesting season the mining schedule should be compared to areas of known raptor nesting activity. If overlap occurs, nest protection measures may be needed. Areas of potential conflict should be patrolled regularly to identify the presence of a raptor pair and nest starts. Should a raptor nest be built and eggs laid, a change in the mining schedule or an egg or brood manipulation may resolve the conflict. Bridger Coal Company has successfully mitigated conflicts with 3 raptor species. A ferruginous hawk (Buteo regalis) nest with brood was successfully relocated across a pit. Red-tailed hawk (B. jamaicensis) egg clutches were removed from 2 highwall nests and transported in a portable incubator to a commercial raptor propagator where they were hatched, fed and conspecifically imprinted until achieving self-thermoregulation. All chicks were returned to the mine and successfully placed into foster nests. A metal artificial nest ledge for a prairie falcon (Falco mexicanus) was constructed in a cliff and a traditional nesting ledge rendered inaccessible. The falcon pair successfully nested in the artificial ledge.

  12. Identifying Immune Drivers of Gulf War Illness Using a Novel Daily Sampling Approach

    DTIC Science & Technology

    2017-10-01

    AWARD NUMBER: W81XWH-12-1-0557 TITLE: Identifying Immune Drivers of Gulf War Illness Using a Novel Daily Sampling Approach PRINCIPAL...TITLE AND SUBTITLE Identifying Immune Drivers of Gulf War Illness Using A Novel 5a. CONTRACT NUMBER Daily Sampling Approach 5b. GRANT NUMBER...INTRODUCTION: The major aim of this research project is to identify aspects of the immune system that are dysregulated in veterans with Gulf War Illness

  13. Data Mining and Machine Learning in Astronomy

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Brunner, Robert J.

    We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

  14. Analysis of water control in an underground mine under strong karst media influence (Vazante mine, Brazil)

    NASA Astrophysics Data System (ADS)

    Ninanya, Hugo; Guiguer, Nilson; Vargas, Eurípedes A.; Nascimento, Gustavo; Araujo, Edmar; Cazarin, Caroline L.

    2018-05-01

    This work presents analysis of groundwater flow conditions and groundwater control measures for Vazante underground mine located in the state of Minas Gerais, Brazil. According to field observations, groundwater flow processes in this mine are highly influenced by the presence of karst features located in the near-surface terrain next to Santa Catarina River. The karstic features, such as caves, sinkholes, dolines and conduits, have direct contact with the aquifer and tend to increase water flow into the mine. These effects are more acute in areas under the influence of groundwater-level drawdown by pumping. Numerical analyses of this condition were carried out using the computer program FEFLOW. This program represents karstic features as one-dimensional discrete flow conduits inside a three-dimensional finite element structure representing the geologic medium following a combined discrete-continuum approach for representing the karst system. These features create preferential flow paths between the river and mine; their incorporation into the model is able to more realistically represent the hydrogeological environment of the mine surroundings. In order to mitigate the water-inflow problems, impermeabilization of the river through construction of a reinforced concrete channel was incorporated in the developed hydrogeological model. Different scenarios for channelization lengths for the most critical zones along the river were studied. Obtained results were able to compare effectiveness of different river channelization scenarios. It was also possible to determine whether the use of these impermeabilization measures would be able to reduce, in large part, the elevated costs of pumping inside the mine.

  15. Mercury Mining in Mexico: I. Community Engagement to Improve Health Outcomes from Artisanal Mining.

    PubMed

    Camacho, Andrea; Van Brussel, Evelyn; Carrizales, Leticia; Flores-Ramírez, Rogelio; Verduzco, Beatriz; Huerta, Selene Ruvalcaba-Aranda; Leon, Mauricio; Díaz-Barriga, Fernando

    2016-01-01

    Mercury is an element that cannot be destroyed and is a global threat to human and environmental health. In Latin America and the Caribbean, artisanal and small-scale gold mining represents the main source of mercury emissions, releases, and consumption. However, another source of concern is the primary production of mercury. In the case of Mexico, in the past 2 years the informal production of mercury mining has increased 10-fold. Considering this scenario, an intervention program was initiated to reduce health risks in the mining communities. The program's final goal is to introduce different alternatives in line to stop the mining of mercury, but introducing at the same time, a community-based development program. The aim of this study was to present results from a preliminary study in the community of Plazuela, located in the municipality of Peñamiller in the State of Queretaro, Mexico. Total mercury was measured in urine and environmental samples using atomic absorption spectrometry by cold vapor technique. Urine samples were collected from children aged 6-14 years and who had lived in the selected area from birth. Urine samples were also collected from miners who were currently working in the mine. To confirm the presence of mercury in the community, mining waste, water, soil, and sediment samples were collected from those high-risk areas identified by members of the community. Children, women, and miners were heavily exposed to mercury (urine samples); and in agreement, we registered high concentrations of mercury in soils and sediments. Considering these results and taking into account that the risk perception toward mercury toxicity is very low in the community (mining is the only economic activity), an integral intervention program has started. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  16. Fungal diversity in major oil-shale mines in China.

    PubMed

    Jiang, Shaoyan; Wang, Wenxing; Xue, Xiangxin; Cao, Chengyou; Zhang, Ying

    2016-03-01

    As an insufficiently utilized energy resource, oil shale is conducive to the formation of characteristic microbial communities due to its special geological origins. However, little is known about fungal diversity in oil shale. Polymerase chain reaction cloning was used to construct the fungal ribosomal deoxyribonucleic acid internal transcribed spacer (rDNA ITS) clone libraries of Huadian Mine in Jilin Province, Maoming Mine in Guangdong Province, and Fushun Mine in Liaoning Province. Pure culture and molecular identification were applied for the isolation of cultivable fungi in fresh oil shale of each mine. Results of clone libraries indicated that each mine had over 50% Ascomycota (58.4%-98.9%) and 1.1%-13.5% unidentified fungi. Fushun Mine and Huadian Mine had 5.9% and 28.1% Basidiomycota, respectively. Huadian Mine showed the highest fungal diversity, followed by Fushun Mine and Maoming Mine. Jaccard indexes showed that the similarities between any two of three fungal communities at the genus level were very low, indicating that fungi in each mine developed independently during the long geological adaptation and formed a community composition fitting the environment. In the fresh oil-shale samples of the three mines, cultivable fungal phyla were consistent with the results of clone libraries. Fifteen genera and several unidentified fungi were identified as Ascomycota and Basidiomycota using pure culture. Penicillium was the only genus found in all three mines. These findings contributed to gaining a clear understanding of current fungal resources in major oil-shale mines in China and provided useful information for relevant studies on isolation of indigenous fungi carrying functional genes from oil shale. Copyright © 2015. Published by Elsevier B.V.
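    The Jaccard index mentioned in the abstract above compares two communities by the share of taxa they have in common. A minimal sketch follows, assuming genus-level presence/absence lists; the genus sets below are hypothetical placeholders, not data from the study.

```python
def jaccard_index(a, b):
    """Jaccard similarity |A & B| / |A | B| between two collections of labels."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical genus lists for two mines (illustration only).
mine_a = {"Penicillium", "Aspergillus", "Cladosporium"}
mine_b = {"Penicillium", "Fusarium", "Trichoderma", "Mucor"}

print(round(jaccard_index(mine_a, mine_b), 3))
```

    A value near 0 means the two communities share few genera; a value of 1 means the genus lists are identical.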

  17. Effectively Engaging in Tribal Consultation to protect Traditional Cultural Properties while navigating the 1872 Mining Law - Tonto National Forest, Western Apache Tribes, & Resolution Copper Mine

    NASA Astrophysics Data System (ADS)

    Nez, N.

    2017-12-01

    By effectively engaging in government-to-government consultation the Tonto National Forest is able to consider oral histories and tribal cultural knowledge in decision making. These conversations often have the potential to lead to the protection and preservation of public lands. Discussed here is one example of successful tribal consultation and how it led to the protection of Traditional Cultural Properties (TCPs). One hour east of Phoenix, Arizona on the Tonto National Forest, Resolution Copper Mine is working to access a rich copper vein more than 7,000 feet deep. As part of the mining plan of operation they are investigating viable locations to store the earth removed from the mine site. One proposed storage location required hydrologic and geotechnical studies to determine viability. This constituted a significant amount of ground disturbance in an area that is of known importance to local Indian tribes. To ensure proper consideration of tribal concerns, the Forest engaged nine local tribes in government-to-government consultation. Consultation resulted in the identification of five springs in the project area considered TCPs by the Western Apache tribes. Due to the presence of identified TCPs, the Forest asked tribes to assist in the development of mitigation measures to minimize effects of this project on the TCPs identified. The goal of this partnership was to find a way for the Mine to still be able to gather data, while protecting TCPs. During field visits and consultations, a wide range of concerns were shared which were recorded and considered by Tonto National Forest. The Forest developed a proposed mitigation approach to protect springs, which would prevent (not permit) the installation of water monitoring wells, geotechnical borings or trench excavations within 1,200 feet of perennial springs in the project area. As an added mitigation measure, a cultural resources specialist would be on-site during all ground-disturbing activities.
Diligent work on

  18. Characterizing the hydrological system in Rosia Montana mining area (Romania) for AMD mitigation

    NASA Astrophysics Data System (ADS)

    Cozma, Alexandra; Baciu, Calin; Olenici, Adriana; Brahaita, Dorian; Pop, Cristian; Lazar, Laura; Roba, Carmen; Popita, Gabriela

    2015-04-01

    Keywords: mining, AMD mitigation, isotopic analyses, Romania

    Rosia Montana is one of the most important European gold fields, with a long history of mining. The extraction of gold started on site during the Roman age, and the mining operations that spanned over almost two millennia have produced a visible environmental footprint. More than 140 km of mining galleries are documented by historical sources and recent surveys. Water streams are the main vectors spreading the pollution outside the mining area. The main streams, Rosia, Corna, and Saliste, tributaries of Abruzel River, are significantly impacted by the acid waters issued by adits, exposed rock surfaces, rock waste heaps, and tailings depots. Low contamination has been observed in the streams outside the mining area, artificial ponds, and shallow groundwater. Except for the shallow groundwater system, which can be sampled in domestic wells and some springs, the circulation of groundwater is largely unknown. An important amount of the infiltration water is channelled through galleries. The waters sampled at the gallery outlets have low pH, generally between 2 and 3, and very high content of heavy metals. A systematic approach based on monthly sampling, chemical analyses, and isotopic measurements has been initiated in order to better understand the underground itinerary of water and the chemical transformations that occur. A sampling network of 28 water points, including streams, ponds, dug wells, springs, and gallery outlets, has been set up. Beyond producing a water circulation model in the mining area, the main purpose of the research is to identify ways of decreasing the acid water production and to design low cost techniques for the AMD mitigation. The deposit still hosts about 300 tonnes of gold, and 1600 tonnes of silver. A new large scale mining project is currently under permitting.
Cost-effective solutions for the water treatment would be beneficial, especially for the post-mining stage of any

  19. Mining problems caused by tectonic stress in Illinois basin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, W.J.

    1991-08-01

    The Illinois basin coalfield is subject to a contemporary tectonic stress field in which the principal compressive stress axis (σ1) is horizontal and strikes N60°E to east-west. This stress is responsible for widespread development of kink zones and directional roof failures in mine headings driven perpendicular to σ1. Also, small thrust faults perpendicular to σ1 and joints parallel to σ1 weaken the mine roof and occasionally admit water and gas to workings, depending upon geologic setting. The direction and magnitude of stress have been identified by a variety of techniques that can be applied both prior to mining and during development. Mining experience shows that the best method of minimizing stress-related problems is to drive mine headings at about 45° to σ1.

  20. Lunar resource evaluation and mine site selection

    NASA Technical Reports Server (NTRS)

    Bence, A. Edward

    1992-01-01

    Two scenarios in this evaluation of lunar mineral resources and the selection of possible mining and processing sites are considered. The first scenario assumes that no new surface or near-surface data will be available before site selection (presumably one of the Apollo sites). The second scenario assumes that additional surface geology data will have been obtained by a lunar orbiter mission, an unmanned sample return mission (or missions), and followup manned missions. Regardless of the scenario, once a potentially favorable mine site has been identified, a minimum amount of fundamental data is needed to assess the resources at that site and to evaluate its suitability for mining and downstream processing. Since much of the required data depends on the target mineral(s), information on the resource, its beneficiation, and the refining, smelting, and fabricating processes must be factored into the evaluation. The annual capacity and producing lifetime of the mine and its associated processing plant must be estimated before the resource reserves can be assessed. The available market for the product largely determines the capacity and lifetime of the mine. The Apollo 17 site is described as a possible mining site. The use of new sites is briefly addressed.

  1. Text mining of rheumatoid arthritis and diabetes mellitus to understand the mechanisms of Chinese medicine in different diseases with same treatment.

    PubMed

    Zhao, Ning; Zheng, Guang; Li, Jian; Zhao, Hong-Yan; Lu, Cheng; Jiang, Miao; Zhang, Chi; Guo, Hong-Tao; Lu, Ai-Ping

    2018-01-09

    To identify the commonalities between rheumatoid arthritis (RA) and diabetes mellitus (DM) to understand the mechanisms of Chinese medicine (CM) in different diseases with the same treatment. A text mining approach was adopted to analyze the commonalities between RA and DM according to CM and biological elements. The major commonalities were subsequently verified in RA and DM rat models, in which herbal formula for the treatment of both RA and DM identified via text mining was used as the intervention. Similarities were identified between RA and DM regarding the CM approach used for diagnosis and treatment, as well as the networks of biological activities affected by each disease, including the involvement of adhesion molecules, oxidative stress, cytokines, T-lymphocytes, apoptosis, and inflammation. The Ramulus Cinnamomi-Radix Paeoniae Alba-Rhizoma Anemarrhenae is an herbal combination used to treat RA and DM. This formula demonstrated similar effects on oxidative stress and inflammation in rats with collagen-induced arthritis, which supports the text mining results regarding the commonalities between RA and DM. Commonalities between the biological activities involved in RA and DM were identified through text mining, and both RA and DM might be responsive to the same intervention at a specific stage.

  2. Mining Adverse Drug Reactions in Social Media with Named Entity Recognition and Semantic Methods.

    PubMed

    Chen, Xiaoyi; Deldossi, Myrtille; Aboukhamis, Rim; Faviez, Carole; Dahamna, Badisse; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Girardeau, Yannick; Guillemin-Lanne, Sylvie; Lillo-Le-Louët, Agnès; Texier, Nathalie; Burgun, Anita; Katsahian, Sandrine

    2017-01-01

    Suspected adverse drug reactions (ADR) reported by patients through social media can be a complementary source to current pharmacovigilance systems. However, the performance of text mining tools applied to social media text data to discover ADRs needs to be evaluated. In this paper, we introduce the approach developed to mine ADR from French social media. A protocol of evaluation is highlighted, which includes a detailed sample size determination and evaluation corpus constitution. Our text mining approach provided very encouraging preliminary results with F-measures of 0.94 and 0.81 for recognition of drugs and symptoms respectively, and with F-measure of 0.70 for ADR detection. Therefore, this approach is promising for downstream pharmacovigilance analysis.

  3. A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast

    DTIC Science & Technology

    2004-05-01

    AD Award Number: DAMD17-03-1-0232 TITLE: A Functional Genomics Approach to Identify Novel Breast Cancer Gene Targets in Yeast PRINCIPAL INVESTIGATOR...Approach to Identify Novel Breast DAMD17-03-1-0232 Cancer Gene Targets in Yeast 6. AUTHOR(S) Craig Bennett, Ph.D. 7. PERFORMING ORGANIZATION NAME(S...Unlimited 13. ABSTRACT (Maximum 200 Words) We are using the yeast Saccharomyces cerevisiae to identify new cancer gene targets that interact with the

  4. Risk factors and prediction of very short term versus short/intermediate term post-stroke mortality: a data mining approach.

    PubMed

    Easton, Jonathan F; Stephens, Christopher R; Angelova, Maia

    2014-11-01

    Data mining and knowledge discovery as an approach to examining medical data can limit some of the inherent bias in the hypothesis assumptions that can be found in traditional clinical data analysis. In this paper we illustrate the benefits of a data mining inspired approach to statistically analysing a bespoke data set, the academic multicentre randomised control trial, U.K Glucose Insulin in Stroke Trial (GIST-UK), with a view to discovering new insights distinct from the original hypotheses of the trial. We consider post-stroke mortality prediction as a function of days since stroke onset, showing that the time scales that best characterise changes in mortality risk are most naturally defined by examination of the mortality curve. We show that certain risk factors differentiate between very short term and intermediate term mortality. In particular, we show that age is highly relevant for intermediate term risk but not for very short or short term mortality. We suggest that this is due to the concept of frailty. Other risk factors are highlighted across a range of variable types including socio-demographics, past medical histories and admission medication. Using the most statistically significant risk factors we build predictive classification models for very short term and short/intermediate term mortality. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.

  5. A GIS and statistical approach to identify variables that control water quality in hydrothermally altered and mineralized watersheds, Silverton, Colorado, USA

    USGS Publications Warehouse

    Yager, Douglas B.; Johnson, Raymond H.; Rockwell, Barnaby W.; Caine, Jonathan S.; Smith, Kathleen S.

    2013-01-01

    Hydrothermally altered bedrock in the Silverton mining area, southwest Colorado, USA, contains sulfide minerals that weather to produce acidic and metal-rich leachate that is toxic to aquatic life. This study utilized a geographic information system (GIS) and statistical approach to identify watershed-scale geologic variables in the Silverton area that influence water quality. GIS analysis of mineral maps produced using remote sensing datasets including Landsat Thematic Mapper, advanced spaceborne thermal emission and reflection radiometer, and a hybrid airborne visible infrared imaging spectrometer and field-based product enabled areas of alteration to be quantified. Correlations between water quality signatures determined at watershed outlets, and alteration types intersecting both total watershed areas and GIS-buffered areas along streams were tested using linear regression analysis. Despite remote sensing datasets having varying watershed area coverage due to vegetation cover and differing mineral mapping capabilities, each dataset was useful for delineating acid-generating bedrock. Areas of quartz–sericite–pyrite mapped by AVIRIS have the highest correlations with acidic surface water and elevated iron and aluminum concentrations. Alkalinity was only correlated with area of acid neutralizing, propylitically altered bedrock containing calcite and chlorite mapped by AVIRIS. Total watershed area of acid-generating bedrock is more significantly correlated with acidic and metal-rich surface water when compared with acid-generating bedrock intersected by GIS-buffered areas along streams. This methodology could be useful in assessing the possible effects that alteration type area has in either generating or neutralizing acidity in unmined watersheds and in areas where new mining is planned.

  6. Evaluation of the environmental contamination at an abandoned mining site using multivariate statistical techniques--the Rodalquilar (Southern Spain) mining district.

    PubMed

    Bagur, M G; Morales, S; López-Chicano, M

    2009-11-15

    Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples collected in the Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored by determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation was used to bring the data set closer to normal form and so reduce the non-normality of the geochemical data. The environmental impact is affected mainly by the mining activity developed in the zone, the acid drainage and finally by the chemical treatment used for gold beneficiation.
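    The Box-Cox step named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the transform is the standard one, the exponent is chosen by a simple grid search over the profile log-likelihood (an assumption made here for brevity), and the concentration values are hypothetical.

```python
import math


def box_cox(xs, lam):
    """Box-Cox transform: (x**lam - 1) / lam, or ln(x) when lam is 0."""
    if any(x <= 0 for x in xs):
        raise ValueError("Box-Cox requires strictly positive data")
    if abs(lam) < 1e-12:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1.0) / lam for x in xs]


def profile_log_likelihood(xs, lam):
    """Normal-theory profile log-likelihood of lam (up to a constant).

    Needs non-constant data, otherwise the variance is zero.
    """
    ys = box_cox(xs, lam)
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in xs)


def best_lambda(xs, grid=None):
    """Pick the lam on a coarse grid that maximizes the profile likelihood."""
    if grid is None:
        grid = [i / 20 for i in range(-40, 41)]  # -2.0 ... 2.0 in steps of 0.05
    return max(grid, key=lambda lam: profile_log_likelihood(xs, lam))


# Hypothetical right-skewed concentrations (illustration only).
concentrations = [0.8, 1.1, 1.5, 2.3, 4.0, 9.5, 21.0]
lam = best_lambda(concentrations)
transformed = box_cox(concentrations, lam)
print(lam, [round(y, 3) for y in transformed])
```

    The transformed values would then feed the cluster, principal component, or discriminant analyses, which generally behave better on roughly symmetric data.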

  7. Mine safety assessment using gray relational analysis and bow tie model

    PubMed Central

    2018-01-01

    Mine safety assessment is a precondition for ensuring orderly and safe production. The main purpose of this study was to prevent mine accidents more effectively by proposing a composite risk analysis model. First, the weights of the assessment indicators were determined by the revised integrated weight method, in which the objective weights were determined by a variation coefficient method and the subjective weights by the Delphi method. A new formula was then adopted to calculate the integrated weights based on the subjective and objective weights. Second, after the assessment indicator weights were determined, gray relational analysis was used to evaluate the safety of mine enterprises. Mine enterprise safety was ranked according to the gray relational degree, and weak links of mine safety practices were identified based on gray relational analysis. Third, to validate the revised integrated weight method adopted in the process of gray relational analysis, the fuzzy evaluation method was applied to the safety assessment of mine enterprises. Fourth, for the first time, the bow tie model was adopted to identify the causes and consequences of weak links and allow corresponding safety measures to be taken to guarantee the mine’s safe production. A case study of mine safety assessment was presented to demonstrate the effectiveness and rationality of the proposed composite risk analysis model, which can be applied to other related industries for safety evaluation. PMID:29561875
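    The weighting and ranking steps described above can be sketched roughly as below. The abstract does not give the paper's integrated weight formula, so this sketch uses only objective weights from the coefficient of variation; the ideal reference series, the distinguishing coefficient rho = 0.5, and the indicator scores are all assumptions made for illustration.

```python
import statistics


def cv_weights(matrix):
    """Objective weights proportional to each indicator's coefficient of variation."""
    columns = list(zip(*matrix))
    cvs = [statistics.pstdev(col) / statistics.mean(col) for col in columns]
    total = sum(cvs)
    return [cv / total for cv in cvs]


def min_max_normalize(matrix):
    """Scale every indicator (column) to [0, 1]; larger is assumed to be safer."""
    columns = list(zip(*matrix))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 1.0 for v, lo, hi in zip(row, lows, highs)]
        for row in matrix
    ]


def gray_relational_degrees(matrix, weights, rho=0.5):
    """Weighted gray relational degree of each row against the ideal series (all 1s)."""
    deltas = [[abs(1.0 - v) for v in row] for row in min_max_normalize(matrix)]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * len(matrix)
    return [
        sum(w * (d_min + rho * d_max) / (d + rho * d_max) for w, d in zip(weights, row))
        for row in deltas
    ]


# Hypothetical scores: rows are mine enterprises, columns are safety indicators.
scores = [
    [0.90, 0.80, 0.95],
    [0.60, 0.70, 0.50],
    [0.75, 0.90, 0.70],
]
weights = cv_weights(scores)
degrees = gray_relational_degrees(scores, weights)
ranking = sorted(range(len(scores)), key=lambda i: degrees[i], reverse=True)
print(ranking, [round(d, 3) for d in degrees])
```

    Enterprises with a higher degree lie closer to the ideal series; the indicators with the lowest relational coefficients for a given enterprise are candidates for its weak links.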

  8. Littoral Assessment of Mine Burial Signatures (LAMBS) buried land mine/background spectral signature analyses

    USGS Publications Warehouse

    Kenton, A.C.; Geci, D.M.; Ray, K.J.; Thomas, C.M.; Salisbury, J.W.; Mars, J.C.; Crowley, J.K.; Witherspoon, N.H.; Holloway, J.H.; Harmon, R.S.; Broach, J.T.; Holloway, J.H., Jr.

    2004-01-01

    The objective of the Office of Naval Research (ONR) Rapid Overt Reconnaissance (ROR) program and the Airborne Littoral Reconnaissance Technologies (ALRT) project's LAMBS effort is to determine if electro-optical spectral discriminants exist that are useful for the detection of land mines in littoral regions. Statistically significant buried mine overburden and background signature data were collected over a wide spectral range (0.35 to 14 µm) to identify robust spectral features that might serve as discriminants for new airborne sensor concepts. LAMBS has expanded previously collected databases to littoral areas - primarily dry and wet sandy soils - where tidal, surf, and wind conditions can severely modify spectral signatures. At AeroSense 2003, we reported completion of three buried mine collections at an inland bay, Atlantic and Gulf of Mexico beach sites. We now report LAMBS spectral database analyses results using metrics which characterize the detection performance of general types of spectral detection algorithms. These metrics include mean contrast, spectral signal-to-clutter, covariance, information content, and spectral matched filter analyses. Detection performance of the buried land mines was analyzed with regard to burial age, background type, and environmental conditions. These analyses considered features observed due to particle size differences, surface roughness, surface moisture, and compositional differences.

  9. Comparing digital data processing techniques for surface mine and reclamation monitoring

    NASA Technical Reports Server (NTRS)

    Witt, R. G.; Bly, B. G.; Campbell, W. J.; Bloemer, H. H. L.; Brumfield, J. O.

    1982-01-01

    The results of three techniques used for processing Landsat digital data are compared for their utility in delineating areas of surface mining and subsequent reclamation. An unsupervised clustering algorithm (ISOCLS), a maximum-likelihood classifier (CLASFY), and a hybrid approach utilizing canonical analysis (ISOCLS/KLTRANS/ISOCLS) were compared by means of a detailed accuracy assessment with aerial photography at NASA's Goddard Space Flight Center. Results show that the hybrid approach was superior to the traditional techniques in distinguishing strip mined and reclaimed areas.

  10. Using ontology network structure in text mining.

    PubMed

    Berndt, Donald J; McCart, James A; Luther, Stephen L

    2010-11-13

    Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
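
    The idea of scoring term importance over an ontology graph can be sketched with a plain PageRank power iteration in Python. This is only an illustration under stated assumptions: the toy "is-a" edges and term names are hypothetical, the damping factor of 0.85 is the conventional default rather than a value from the abstract, and the abstract does not specify how the authors adapted PageRank or HITS or how the scores were injected into the text mining process.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Plain PageRank by power iteration over a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Every node receives the teleportation share first.
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for src in nodes:
            # Dangling nodes (no outgoing links) spread their rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Hypothetical toy ontology: each edge points from a child term to its parent ("is-a").
ontology_edges = [
    ("cigarette", "tobacco product"),
    ("cigar", "tobacco product"),
    ("pipe", "tobacco product"),
    ("tobacco product", "substance"),
]

term_importance = pagerank(ontology_edges)
for term, score in sorted(term_importance.items(), key=lambda item: -item[1]):
    print(f"{term}: {score:.3f}")
```

    In a hybrid pipeline of the kind the abstract describes, scores like these could, for example, rescale term frequencies before classification; that particular use is an assumption, not a detail taken from the source.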

  11. Exploration of geo-mineral compounds in granite mining soils using XRD pattern data analysis

    NASA Astrophysics Data System (ADS)

    Koteswara Reddy, G.; Yarakkula, Kiran

    2017-11-01

    The purpose of the study was to investigate the major minerals present in granite mining waste and in agricultural soils near and away from mining areas. The minerals in representative sub-soil samples were identified by X-Ray Diffractometer (XRD) pattern data analysis. The morphological features and quantitative elemental analysis were obtained by Scanning Electron Microscopy-Energy Dispersive Spectroscopy (SEM-EDS). The XRD pattern data revealed that the major minerals in granite waste are Quartz, Albite, Anorthite, K-Feldspars, Muscovite, Annite, Lepidolite, Illite, Enstatite and Ferrosilite. In agricultural farm soils, however, the major minerals are Gypsum, Calcite, Magnetite, Hematite, Muscovite, K-Feldspars and Quartz. Moreover, the agricultural soils neighbouring mining areas were found to be enriched in Mica group minerals (Lepidolite and Illite) and Orthopyroxene group minerals (Ferrosilite and Enstatite). The Mica and Orthopyroxene group minerals are present in agricultural farm soils neighbouring mining areas and absent in agricultural farm soils away from mining areas. The study demonstrated that chemical migration takes place in agricultural farm lands in the vicinity of the granite mining areas.

  12. Solutions for Mining Distributed Scientific Data

    NASA Astrophysics Data System (ADS)

    Lynnes, C.; Pham, L.; Graves, S.; Ramachandran, R.; Maskey, M.; Keiser, K.

    2007-12-01

    Researchers at the University of Alabama in Huntsville (UAH) and the Goddard Earth Sciences Data and Information Services Center (GES DISC) are working on approaches and methodologies facilitating the analysis of large amounts of distributed scientific data. Despite the existence of full-featured analysis tools, such as the Algorithm Development and Mining (ADaM) toolkit from UAH, and data repositories, such as the GES DISC, that provide online access to large amounts of data, there remain obstacles to getting the analysis tools and the data together in a workable environment. Does one bring the data to the tools or deploy the tools close to the data? The large size of many current Earth science datasets incurs significant overhead in network transfer for analysis workflows, even with the advanced networking capabilities that are available between many educational and government facilities. The UAH and GES DISC team are developing a capability to define analysis workflows using distributed services and online data resources. We are developing two solutions for this problem that address different analysis scenarios. The first is a Data Center Deployment of the analysis services for large data selections, orchestrated by a remotely defined analysis workflow. The second is a Data Mining Center approach of providing a cohesive analysis solution for smaller subsets of data. The two approaches can be complementary and thus provide flexibility for researchers to exploit the best solution for their data requirements. The Data Center Deployment of the analysis services has been implemented by deploying ADaM web services at the GES DISC so they can access the data directly, without the need of network transfers. Using the Mining Workflow Composer, a user can define an analysis workflow that is then submitted through a Web Services interface to the GES DISC for execution by a processing engine. The workflow definition is composed, maintained and executed at a distributed

  13. Haneş and Valea Vinului (Romania) closed mines Acid Mine Drainages (AMDs)--actual condition and passive treatment remediation proposal.

    PubMed

    Măicăneanu, Andrada; Bedelean, Horea; Ardelean, Marius; Burcă, Silvia; Stanca, Maria

    2013-10-01

    Acid Mine Drainages (AMDs) from Haneş and Valea Vinului (Romania) closed mines were considered for characterization and treatment using a local zeolitic volcanic tuff, ZVT, (Măcicaş, Cluj County, Romania). Water samples were collected from two locations, before and after the discharge point in the case of Haneş mine, and on three horizons in the case of Valea Vinului mine. Physico-chemical (pH, total solid, heavy metal ions concentration) analyses showed that the environment is strongly affected by these AMD discharges even though the mines were closed years ago. Iron, manganese and zinc were the main pollutants identified in Haneş mine AMD, while zinc is the one mainly present in the case of Valea Vinului AMD. A batch technique (no stirring) in which the ZVT was put in contact with the AMD sample was proposed as a passive remediation technique. The ZVT successfully removed heavy metal ions from the AMD. According to heavy metal ion concentrations, removal efficiencies reached up to 100%, varying in the order Fe(2+)>Zn(2+)>Mn(2+). When the ZVT was compared with two cationic resins (strong, SAR and weak acid, WAR), the following series was obtained: SAR>ZVT>WAR. Copyright © 2013 Elsevier Ltd. All rights reserved.

  14. Systematic review: Lost-time injuries in the US mining industry.

    PubMed

    Nowrouzi-Kia, B; Sharma, B; Dignard, C; Kerekes, Z; Dumond, J; Li, A; Larivière, M

    2017-08-01

    The mining industry is associated with high levels of accidents, injuries and illnesses. Lost-time injuries are useful measures of health and safety in mines, and of the effectiveness of their safety programmes. This review aimed to identify the types of lost-time injuries in the US mining workforce and to examine predictors of these occupational injuries. Primary papers on lost-time injuries in the US mining sector were identified through a literature search in eight health, geology and mining databases, using a systematic review protocol tailored to each database. The Critical Appraisal Skills Programme (CASP), Framework of Quality Assurance for Administrative Data Source and the Cochrane Collaboration 'Risk of bias' assessment tools were used to assess study quality. A total of 1736 articles were retrieved before duplicates were removed. Fifteen articles were ultimately included with a CASP mean score of 6.33 (SD 0.62) out of 10. Predictors of lost-time injuries included slips and falls, electric injuries, use of mining equipment, working in underground mining, worker's age and occupational experience. This is the first systematic review of lost-time injuries in the US mining sector. The results support the need for further research on factors that contribute to workplace lost-time injuries as there is limited literature on the topic. Safety analytics should also be applied to uncover new trends and predict the likelihood of future incidents before they occur. New insights will allow employers to prevent injuries and foster a safer workplace environment by implementing successful occupational health and safety programmes. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved.

  15. Utilization of volume correlation filters for underwater mine identification in LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Walls, Bradley

    2008-04-01

    Underwater mine identification persists as a critical technology pursued aggressively by the Navy for fleet protection. As such, new and improved techniques must continue to be developed in order to provide measurable increases in mine identification performance and noticeable reductions in false alarm rates. In this paper we show how recent advances in the Volume Correlation Filter (VCF) developed for ground based LIDAR systems can be adapted to identify targets in underwater LIDAR imagery. Current automated target recognition (ATR) algorithms for underwater mine identification employ spatial based three-dimensional (3D) shape fitting of models to LIDAR data to identify common mine shapes consisting of the box, cylinder, hemisphere, truncated cone, wedge, and annulus. VCFs provide a promising alternative to these spatial techniques by correlating 3D models against the 3D rendered LIDAR data.

  16. Environmental Improvement Of Opencast Mining

    NASA Astrophysics Data System (ADS)

    Prokopenko, S.; Sushko, A.; Filatov, Yu; Kislyakov, M.; Kislyakov, I.

    2017-01-01

    Existing classifications of waste dumps in the quarries are given and their phenomenological nature is clarified. The need to identify the essence of the term "dump" is shown as well as the idea of "dump" as an artificial formation with everted and mixed rocks distanced from the quarry. Essential classification of man-made rock formations in quarries is developed. Characteristic of variations of man-made waste formations in quarries is developed. To reduce harmful effects of open-pit mining, dumps should be substituted with strat-lays - man-made structures relevant to natural stratification of litho-substances. Construction of strat-lays would improve ecological and technological culture of open cast mining.

  17. Understanding Genetic Toxicity Through Data Mining: The ...

    EPA Pesticide Factsheets

    This paper demonstrates the usefulness of representing a chemical by its structural features and the use of these features to profile a battery of tests rather than relying on a single toxicity test of a given chemical. This paper presents data mining/profiling methods applied in a weight-of-evidence approach to assess potential for genetic toxicity, and to guide the development of intelligent testing strategies.

  18. Advantages and difficulties of implementation of the international GNA standards in sustainable mining development. (Invited)

    NASA Astrophysics Data System (ADS)

    Masaitis, A.

    2013-12-01

    Conflicts in the development of mining projects are now common between the mining proponents, NGOs and communities. These conflicts can sometimes be alleviated by early development of modes of communication, and a formal discussion format that allows airing of concerns and potential resolution of problems. One of the methods that can formalize this process is to establish a Good Neighbor Agreement (GNA), which deals specifically with challenges in relationships between mining operations and the local communities. It is a new practice related to mining operations that are oriented toward social needs and concerns of local communities that arise during the normal life of a mine, which can achieve sustainable mining practices in both developing and developed countries. The GNA project being currently developed at the University of Nevada, Reno in cooperation with the Newmont Mining Corporation has a goal to create an open company/community dialog that is based on the international standards and that will help identify and address sociological and environmental concerns associated with mining, as well as find methods for communication and conflict resolution. GNA standards should be based on trust doctrine, open information access, and community involvement in the decision making process. They should include the following components: emergency response and community communications; environmental issues, including air and water quality standards; reclamation and recultivation; socio-economic issues: transportation, safety, training, and local hiring; and financial issues, particularly related to mitigation offsets and community needs. The GNA standards help identify and evaluate conflict criteria in mining/community relationships; determine the status of concerns; focus on the local political and government systems; separate the acute and the chronic concerns; determine the role and responsibilities of stakeholders; analyze problem resolution feasibility; maintain the

  19. Advanced superconducting gradiometers for mine detection

    NASA Astrophysics Data System (ADS)

    Clem, Ted R.

    1996-05-01

    Sensors incorporating superconducting quantum interference devices provide the greatest sensitivity for magnetic anomaly detection available with current technology. During the 1980s, the Coastal Systems Station (CSS) developed a superconducting magnetic gradiometer capable of operation outside of the laboratory environment. With this sensor, the CSS was able to demonstrate buried mine detection for the U.S. Navy. Subsequently, the sensor was incorporated into a multisensor suite onboard an underwater towed vehicle to provide a robust mine hunting capability for the Magnetic and Acoustic Detection of Mines Project. This sensor using thin film niobium and a new liquid helium cooling concept was developed to provide significant increases in sensitivity and detection range. In the late 1980s, a new class of 'high-Tc' superconductors was discovered with critical temperatures above the boiling point of liquid nitrogen (77 K). This advance has opened up new opportunities for mine reconnaissance and hunting, especially for operation onboard small unmanned underwater vehicles. A high-Tc sensor concept using liquid nitrogen refrigeration has been developed and a test article of that concept is currently being evaluated for its applicability to mobile operation. The design principles for the two new sensor approaches and the results of their evaluations will be described. Finally, the implications of these advances for mine reconnaissance and hunting will be discussed.

  20. Mining Patients' Narratives in Social Media for Pharmacovigilance: Adverse Effects and Misuse of Methylphenidate.

    PubMed

    Chen, Xiaoyi; Faviez, Carole; Schuck, Stéphane; Lillo-Le-Louët, Agnès; Texier, Nathalie; Dahamna, Badisse; Huot, Charles; Foulquié, Pierre; Pereira, Suzanne; Leroux, Vincent; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Katsahian, Sandrine; Bousquet, Cédric; Burgun, Anita

    2018-01-01

    Background: The Food and Drug Administration (FDA) in the United States and the European Medicines Agency (EMA) have recognized social media as a new data source to strengthen their activities regarding drug safety. Objective: Our objective in the ADR-PRISM project was to provide text mining and visualization tools to explore a corpus of posts extracted from social media. We evaluated this approach on a corpus of 21 million posts from five patient forums, and conducted a qualitative analysis of the data available on methylphenidate in this corpus. Methods: We applied text mining methods based on named entity recognition and relation extraction in the corpus, followed by signal detection using proportional reporting ratio (PRR). We also used topic modeling based on the Correlated Topic Model to obtain the list of thematics in the corpus and classify the messages based on their topics. Results: We automatically identified 3443 posts about methylphenidate published between 2007 and 2016, among which 61 adverse drug reactions (ADR) were automatically detected. Two pharmacovigilance experts manually evaluated the quality of automatic identification, and an F-measure of 0.57 was reached. Patients' reports were mainly of neuro-psychiatric effects. Applying PRR, 67% of the ADRs were signals, including most of the neuro-psychiatric symptoms but also palpitations. Topic modeling showed that the most represented topics were related to Childhood and Treatment initiation, but also Side effects. Cases of misuse were also identified in this corpus, including recreational use and abuse. Conclusion: Named entity recognition combined with signal detection and topic modeling have demonstrated their complementarity in mining social media data. An in-depth analysis focused on methylphenidate showed that this approach was able to detect potential signals and to provide better understanding of patients' behaviors regarding drugs, including misuse.
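
    For readers unfamiliar with the proportional reporting ratio mentioned in the Methods, the following Python sketch shows the standard 2x2 contingency-table definition. The counts are invented for illustration and do not come from the study, the function name is hypothetical, and the abstract does not state which threshold or additional criteria the authors used to declare a signal.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of report counts.

    a: reports mentioning the drug of interest AND the event of interest
    b: reports mentioning the drug of interest with any other event
    c: reports mentioning other drugs AND the event of interest
    d: reports mentioning other drugs with any other event
    """
    rate_for_drug = a / (a + b)      # share of the drug's reports that mention the event
    rate_for_others = c / (c + d)    # same share among all other drugs
    return rate_for_drug / rate_for_others

# Invented counts, for illustration only.
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9800)
print(round(prr, 2))
```

    A PRR well above 1 means the event is reported disproportionately often with the drug of interest; in practice it is usually combined with a minimum case count or a significance test before a signal is declared.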

  1. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    PubMed Central

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. PMID:28025348

  2. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE PAGES

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; ...

    2016-12-26

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  3. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  4. Kernel Methods for Mining Instance Data in Ontologies

    NASA Astrophysics Data System (ADS)

    Bloehdorn, Stephan; Sure, York

    The number of ontologies and the amount of meta data available on the Web are constantly growing. The successful application of machine learning techniques for learning of ontologies from textual data, i.e. mining for the Semantic Web, contributes to this trend. However, no principled approaches exist so far for mining from the Semantic Web. We investigate how machine learning algorithms can be made amenable for directly taking advantage of the rich knowledge expressed in ontologies and associated instance data. Kernel methods have been successfully employed in various learning tasks and provide a clean framework for interfacing between non-vectorial data and machine learning algorithms. In this spirit, we express the problem of mining instances in ontologies as the problem of defining valid corresponding kernels. We present a principled framework for designing such kernels by means of decomposing the kernel computation into specialized kernels for selected characteristics of an ontology which can be flexibly assembled and tuned. Initial experiments on real world Semantic Web data show promising results and demonstrate the usefulness of our approach.

  5. Integrated approach to assess the environmental impact of mining activities: estimation of the spatial distribution of soil contamination (Panasqueira mining area, Central Portugal).

    PubMed

    Candeias, Carla; Ávila, Paula F; Ferreira da Silva, Eduardo; Teixeira, João Paulo

    2015-03-01

    Through the years, mining and beneficiation processes in Panasqueira Sn-W mine (Central Portugal) produced large amounts of As-rich mine wastes laid up in huge tailings and open-air impoundments (Barroca Grande and Rio tailings) that are the main source of pollution in the surrounding area once they are exposed to the weathering conditions leading to the formation of acid mine drainage (AMD) and consequently to the contamination of the surrounding environments, particularly soils. The active mine started the exploration during the nineteenth century. This study aims to look at the extension of the soil pollution due to mining activities and tailing erosion by combining data on the degree of soil contamination that allows a better understanding of the dynamics inherent to leaching, transport, and accumulation of some potential toxic elements in soil and their environmental relevance. Soil samples were collected in the surrounding soils of the mine, were digested in aqua regia, and were analyzed for 36 elements by inductively coupled plasma mass spectrometry (ICP-MS). Selected results are that (a) an association of elements like Ag, As, Bi, Cd, Cu, W, and Zn strongly correlated and controlled by the local sulfide mineralization geochemical signature was revealed; (b) the global area discloses significant concentrations of As, Bi, Cd, and W linked to the exchangeable and acid-soluble bearing phases; and (c) wind promotes the mechanical dispersion of the rejected materials, from the milled waste rocks and the mineral processing plant, with subsequent deposition on soils and waters. Arsenic- and sulfide-related heavy metals (such as Cu and Cd) are associated to the fine materials that are transported in suspension by surface waters or associated to the acidic waters, draining these sites and contaminating the local soils. 
    Part of this fraction, especially for As, Cd, and Cu, is temporally retained in solid phases by precipitation of soluble secondary minerals (through

  6. Trust Mines

    EPA Pesticide Factsheets

    The United States and the Navajo Nation entered into settlement agreements that provide funds to conduct investigations and any needed cleanup at 16 of the 46 priority mines, including six mines in the Northern Abandoned Uranium Mine Region.

  7. LLNL electro-optical mine detection program

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, C.; Aimonetti, W.; Barth, M.

    1994-09-30

    Under funding from the Advanced Research Projects Agency (ARPA) and the US Marine Corps (USMC), Lawrence Livermore National Laboratory (LLNL) has directed a program aimed at improving detection capabilities against buried mines and munitions. The program has provided a national test facility for buried mines in arid environments, compiled and distributed an extensive data base of infrared (IR), ground penetrating radar (GPR), and other measurements made at that site, served as a host for other organizations wishing to make measurements, made considerable progress in the use of ground penetrating radar for mine detection, and worked on the difficult problem of sensor fusion as applied to buried mine detection. While the majority of our effort has been concentrated on the buried mine problem, LLNL has worked with the U.S.M.C. on surface mine problems as well, providing data and analysis to support the COBRA (Coastal Battlefield Reconnaissance and Analysis) program. The original aim of the experimental aspect of the program was the utilization of multiband infrared approaches for the detection of buried mines. Later the work was extended to a multisensor investigation, including sensors other than infrared imagers. After an early series of measurements, it was determined that further progress would require a larger test facility in a natural environment, so the Buried Object Test Facility (BOTF) was constructed at the Nevada Test Site. After extensive testing, with sensors spanning the electromagnetic spectrum from the near ultraviolet to radio frequencies, possible paths for improvement were: improved spatial resolution providing better ground texture discrimination; analysis which involves more complicated spatial queueing and filtering; additional IR bands using imaging spectroscopy; the use of additional sensors other than IR and the use of data fusion techniques with multi-sensor data; and utilizing time dependent observables like temperature.

  8. Text Mining in Cancer Gene and Pathway Prioritization

    PubMed Central

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes. PMID:25392685

  9. Text mining in cancer gene and pathway prioritization.

    PubMed

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes.

  10. Prioritizing abandoned coal mine reclamation projects within the contiguous United States using geographic information system extrapolation.

    PubMed

    Gorokhovich, Yuri; Reid, Matthew; Mignone, Erica; Voros, Andrew

    2003-10-01

    Coal mine reclamation projects are very expensive and require coordination of local and federal agencies to identify resources for the most economic way of reclaiming mined land. Location of resources for mine reclamation is a spatial problem. This article presents a methodology that allows the combination of spatial data on resources for coal mine reclamation and uses GIS analysis to develop a priority list of potential mine reclamation sites within the contiguous United States using the method of extrapolation. The extrapolation method in this study was based on the Bark Camp reclamation project. The mine reclamation project at Bark Camp, Pennsylvania, USA, provided an example of the beneficial use of fly ash and dredged material to reclaim 402,600 sq mi of a mine abandoned in the 1980s. Railroads provided transportation of dredged material and fly ash to the site. Therefore, four spatial elements contributed to the reclamation project at Bark Camp: dredged material, abandoned mines, fly ash sources, and railroads. Using the spatial distribution of these data in the contiguous United States, it was possible to utilize GIS analysis to prioritize areas where reclamation projects similar to Bark Camp are feasible. GIS analysis identified unique occurrences of all four spatial elements used in the Bark Camp case for each 1 km of the United States territory within 20, 40, 60, 80, and 100 km radii from abandoned mines. The results showed the number of abandoned mines for each state and identified their locations. The federal or state governments can use these results in mine reclamation planning.

  11. Real-time intelligent decision making with data mining

    NASA Astrophysics Data System (ADS)

    Gupta, Deepak P.; Gopalakrishnan, Bhaskaran

    2004-03-01

    Database mining, widely known as knowledge discovery and data mining (KDD), has attracted a lot of attention in recent years. With the rapid growth of databases in commercial, industrial, administrative and other applications, it is necessary and interesting to extract knowledge automatically from huge amounts of data. Almost all organizations are generating data and information at an unprecedented rate, and they need to get some useful information from this data. Data mining is the extraction of non-trivial, previously unknown and potentially useful patterns, trends, dependences and correlations, known as association rules, among data values in large databases. In the last ten to fifteen years, data mining has spread from one company to another to help them understand more about customers' aspect of quality and response and also to distinguish the customers they want from those they do not. A credit-card company found that customers who complete their applications in pencil rather than pen are more likely to default. There is a program that identifies callers by purchase history: the bigger the spender, the quicker the call will be answered. If you feel your call is being answered in the order in which it was received, think again. Many algorithms assume that data is static in nature and mine the rules and relations in that data. But for a dynamic database, e.g. in most manufacturing industries, the rules and relations thus developed among the variables/items no longer hold true. A simple approach may be to mine the associations among the variables after every fixed period of time. But how long this period should be is a question that remains to be answered. The next problem with static data mining is that some of the relationships that might be of interest from one period to the other may be lost after a new set of data is used. To reflect the effect of a new data set and the current status of the association rules where some of the strong rules might become

  12. Identifying and Further Understanding the Role of Bacteria and Archaea in a Basic Mine Drainage Remediation Site in Tanoma, PA

    NASA Astrophysics Data System (ADS)

    Sharp, G.; Mount, G.

    2017-12-01

    Acid mine drainage pollutes over 3000 miles of streams and ground water in Pennsylvania alone, and in response many solutions have been developed to counteract the effects of acidic mine drainage. It is estimated by the USGS that restoring these watersheds would cost $5 billion to $15 billion in total. As economic conditions place limits on expenditures, cost-effective means of remediation will be of critical importance. One such method is passive bioremediation, and in the case of metal contamination, self-sustaining oxygenation. Our location of interest is the Tanoma Acid Mine Drainage engineered wetland near Tanoma, Pennsylvania. It is estimated that up to 5,000 gallons per minute is currently being discharged into the site. While most local remediation sites are acidic (pH <4), the Tanoma wetland allows for the study of bioremediation in a more neutral pH setting (pH of 5.5-7.5). In this study, we look to further understand the biologic, chemical, and hydrologic controls that contribute to the efficiency of the wetland. Our research will focus on the spatial and temporal distribution of biomass through the wetland system as well as changes in water and soil chemistry. Local biofilms (Leptothrix discophora) are an important part of the remediation process, using iron from the water as an energy source. The bacteria reduce the iron content of the water, precipitating it onto the pond bed as Terraced Iron Formations (TIFs). TIFs are correlated with localized biofilm-archaea densities where archaea thrive in iron-rich sediments. By determining bacteria densities in the wetland through gram stain analysis, we can further understand their role in terraced iron formation creation, find localized TIFs that occur, and correlate methane production due to archaea in that location. Mapping TIF locations and identifying bacteria densities will help determine the bioremediation effects on the overall efficiency of iron reduction throughout the Tanoma AMD passive

  13. To build a mine: Prospect to product

    NASA Technical Reports Server (NTRS)

    Gertsch, Richard E.

    1992-01-01

    The terrestrial definition of ore is a quantity of earth materials containing a mineral that can be extracted at a profit. While a space-based resource-gathering operation may well be driven by other motives, such an operation should have the most favorable cost-benefit ratio possible. To this end, principles and procedures already tested by the stringent requirements of the profit motive should guide the selection, design, construction, and operation of a space-based mine. Proceeding from project initiation to a fully operational mine requires several interacting and overlapping steps, which are designed to facilitate the decision process and insure economic viability. The steps to achieve a fully operational mine are outlined. Presuming that the approach to developing nonterrestrial resources will parallel that for developing mineral resources on Earth, we can speculate on some of the problems associated with developing lunar and asteroidal resources. The baseline for our study group was a small lunar mine and oxygen extraction facility. The development of this facility is described in accordance with the steps outlined.

  14. Two-step web-mining approach to study geology/geophysics-related open-source software projects

    NASA Astrophysics Data System (ADS)

    Behrends, Knut; Conze, Ronald

    2013-04-01

    Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to their own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest authorship-record being the unstructured "abstract" field. We processed texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. Analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), several key pieces of information were extracted. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified. In a second step, based on the intermediate results found by processing the conference-abstracts, any new hypotheses can be tested in another webmining subproject: by merging the dataset with open data from github
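
    The term-document matrix mentioned in this abstract can be illustrated with a minimal sketch. The authors worked in R with the "tm" package; the Python below is only a stand-in for the idea, and the toy documents and the helper names `term_document_matrix` and `cooccurrence` are assumptions made for illustration, not part of the study.

```python
import re
from collections import Counter


def term_document_matrix(docs):
    """Return (vocabulary, matrix): one row per term, one column per document."""
    # Tokenize each document into lowercase alphabetic words and count them.
    counts = [Counter(re.findall(r"[a-z]+", d.lower())) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


def cooccurrence(vocab, matrix, a, b):
    """Number of documents in which terms a and b both appear."""
    row_a = matrix[vocab.index(a)]
    row_b = matrix[vocab.index(b)]
    return sum(1 for x, y in zip(row_a, row_b) if x and y)


# Hypothetical toy abstracts, not taken from the AGU data set.
docs = [
    "open source software for seismic data",
    "seismic data processing with open software",
    "mineral physics experiments",
]
vocab, tdm = term_document_matrix(docs)
# Overall word frequencies are the row sums of the matrix.
freq = {term: sum(row) for term, row in zip(vocab, tdm)}
```

    Word frequencies fall out as row sums, and co-occurrence counts compare two rows column by column; weighting, clustering, and classification would start from the same matrix.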

  15. Current OCT Approaches Do Not Reliably Identify TCFAs

    PubMed Central

    Brezinski, Mark E.; Harjai, Kishore J

    2017-01-01

    It is now clearly established that Thin-Capped Fibroatheromas (TCFAs) lead to most Acute Coronary Syndromes (ACSs). The ability to selectively intervene on TCFAs predisposed to rupture and ACSs would dramatically alter the practice of cardiology. While the ability of OCT to identify thin walled plaques at micron scale resolutions has represented a major advance, it is a misconception that it can reliably identify TCFAs. One major reason is that the ‘diffuse border’ criterion currently used to determine ‘lipid plaque’ is almost undoubtedly from high scattering in the intima and not because of core composition (necrotic core). A second reason is that, rather than looking at lipid collections, studies need to be focused on identifying necrotic cores with OCT. Necrotic cores are characteristic of TCFAs and not lipid collections. Numerous other OCT approaches are available which can potentially accurately assess TCFAs, but these have not been aggressively pursued, which we believe likely stems in part from the misconceptions over the efficacy of ‘diffuse borders’. PMID:29250457

  16. A cross-species bi-clustering approach to identifying conserved co-regulated genes.

    PubMed

    Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun; Bi, Jinbo

    2016-06-15

    A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regulated genes that are conserved across species. The most sophisticated method to date for identifying conserved co-regulated genes is a two-step approach. This approach first identifies gene clusters for each species by a cluster analysis of gene expression data, and subsequently computes the overlaps of clusters identified from different species to reveal common subgroups. This approach is ineffective in dealing with the noise in the expression data introduced by the complicated procedures in quantifying gene expression. Furthermore, due to the sequential nature of the approach, the gene clusters identified in the first step may have little overlap among different species in the second step, making it difficult to detect conserved co-regulated genes. We propose a cross-species bi-clustering approach which first denoises the gene expression data of each species into a data matrix. The rows of the data matrices of different species represent the same set of genes that are characterized by their expression patterns over the developmental stages of each species as columns. A novel bi-clustering method is then developed to cluster genes into subgroups by a joint sparse rank-one factorization of all the data matrices. This method decomposes a data matrix into a product of a column vector and a row vector where the column vector is a consistent indicator across the matrices (species) to identify the same gene cluster and the row vector specifies for each species the developmental stages that the clustered genes co-regulate. An efficient optimization algorithm has been developed with convergence analysis. This approach was first validated on synthetic data and compared

  17. Semi-automated knowledge discovery: identifying and profiling human trafficking

    NASA Astrophysics Data System (ADS)

    Poelmans, Jonas; Elzinga, Paul; Ignatov, Dmitry I.; Kuznetsov, Sergei O.

    2012-11-01

    We propose an iterative and human-centred knowledge discovery methodology based on formal concept analysis. The proposed approach recognizes the important role of the domain expert in mining real-world enterprise applications and makes use of specific domain knowledge, including human intelligence and domain-specific constraints. Our approach was empirically validated at the Amsterdam-Amstelland police to identify suspects and victims of human trafficking in 266,157 suspicious activity reports. Based on guidelines of the Attorney Generals of the Netherlands, we first defined multiple early warning indicators that were used to index the police reports. Using concept lattices, we revealed numerous unknown human trafficking and loverboy suspects. In-depth investigation by the police resulted in a confirmation of their involvement in illegal activities, resulting in actual arrests being made. Our human-centred approach was embedded into operational policing practice and is now successfully used on a daily basis to cope with the vastly growing amount of unstructured information.
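
    Formal concept analysis, on which this methodology is based, groups objects by the attributes they share. The sketch below enumerates the formal concepts of a tiny context by brute force; the report and indicator names are hypothetical placeholders, and this is a textbook illustration rather than the authors' tooling, which the abstract does not describe.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts of a small context.

    context maps each object to the set of attributes it has.
    Returns a set of (extent, intent) pairs of frozensets.
    Brute force over object subsets, so only suitable for toy inputs.
    """
    objects = list(context)
    attributes = set().union(*context.values())
    concepts = set()
    for size in range(len(objects) + 1):
        for group in combinations(objects, size):
            # Intent: the attributes shared by every object in the group.
            intent = set(attributes)
            for obj in group:
                intent &= context[obj]
            # Extent: every object that has all attributes of the intent.
            extent = {o for o in objects if intent <= context[o]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical context: three reports indexed by three indicators.
context = {
    "report_1": {"indicator_a", "indicator_b"},
    "report_2": {"indicator_b", "indicator_c"},
    "report_3": {"indicator_b"},
}
concepts = formal_concepts(context)
```

    Ordering these concepts by inclusion of their extents gives the concept lattice; in a setting like the one described, a domain expert would inspect the concepts whose intents combine several warning indicators.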

  18. A Critical Study on the Underground Environment of Coal Mines in India-an Ergonomic Approach

    NASA Astrophysics Data System (ADS)

    Dey, Netai Chandra; Sharma, Gourab Dhara

    2013-04-01

    The application of ergonomics to underground miners' health plays a great role in controlling the efficiency of miners. Work in underground mines is still physically demanding, and continuous stress due to certain postures or movements of miners during work leads to localized muscle fatigue, creating musculo-skeletal disorders. A good working environment can change the degree of job heaviness, and thermal stress (WBGT values) can directly affect the stretch of work of miners. Among the many unit operations in an underground mine, roof bolting makes an important contribution to the safety of the mine and miners. The occupational stress of roof bolters is discussed in the paper from an ergonomic standpoint.

  19. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    PubMed

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease
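
    The F1-measure quoted in this abstract is the harmonic mean of precision and recall, and the improvement is stated relative to the baseline score. The sketch below shows both calculations; the function names are illustrative assumptions, and only the two F1 values (0.62 and 0.79) come from the abstract. With those rounded values the relative gain works out to roughly 27%, so the quoted 28% was presumably computed from unrounded scores.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline, new):
    """Improvement of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# F1 values as rounded in the abstract: state of the art vs. the full approach.
gain = relative_improvement(0.62, 0.79)
print(f"relative F1 improvement: {gain:.1%}")
```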

  20. A discrete wavelet spectrum approach for identifying non-monotonic trends in hydroclimate data

    NASA Astrophysics Data System (ADS)

    Sang, Yan-Fang; Sun, Fubao; Singh, Vijay P.; Xie, Ping; Sun, Jian

    2018-01-01

    The hydroclimatic process is changing non-monotonically and identifying its trends is a great challenge. Building on the discrete wavelet transform theory, we developed a discrete wavelet spectrum (DWS) approach for identifying non-monotonic trends in hydroclimate time series and evaluating their statistical significance. After validating the DWS approach using two typical synthetic time series, we examined annual temperature and potential evaporation over China from 1961-2013 and found that the DWS approach detected both the warming and the warming hiatus in temperature, and the reversed changes in potential evaporation. Further, the identified non-monotonic trends showed stable significance when the time series was longer than 30 years or so (i.e. the widely defined climate timescale). The significance of trends in potential evaporation measured at 150 stations in China, with an obvious non-monotonic trend, was underestimated and was not detected by the Mann-Kendall test. Comparatively, the DWS approach overcame the problem and detected those significant non-monotonic trends at 380 stations, which helped understand and interpret the spatiotemporal variability in the hydroclimatic process. Our results suggest that non-monotonic trends of hydroclimate time series and their significance should be carefully identified, and the DWS approach proposed has the potential for wide use in the hydrological and climate sciences.
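
    Why a rank-based test such as Mann-Kendall can miss a non-monotonic trend is easy to see from its S statistic, which sums the signs of all pairwise differences in time order. The sketch below computes S only (not the variance or p-value) on two made-up series; it illustrates the limitation the abstract refers to and does not implement the DWS approach itself.

```python
def mann_kendall_s(series):
    """Mann-Kendall S statistic: sum of sign(x[j] - x[i]) over all i < j."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # +1, 0 or -1
    return s


# Hypothetical series: a steady rise, and a rise followed by an equal fall.
rising = list(range(10))
hump = [0, 1, 2, 3, 4, 4, 3, 2, 1, 0]

s_rising = mann_kendall_s(rising)
s_hump = mann_kendall_s(hump)
```

    For the steadily rising series every pair contributes +1, so S reaches its maximum of n(n-1)/2. For the symmetric rise-and-fall series the positive and negative pairs cancel and S is zero, even though the series clearly changes over time.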

  1. Mining Stable Roles in RBAC

    NASA Astrophysics Data System (ADS)

    Colantonio, Alessandro; di Pietro, Roberto; Ocello, Alberto; Verde, Nino Vincenzo

    In this paper we address the problem of generating a candidate role-set for an RBAC configuration that enjoys the following two key features: it minimizes the administration cost, and it is a stable candidate role-set. To achieve these goals, we implement a three-step methodology: first, we associate a weight to roles; second, we identify and remove the user-permission assignments that cannot belong to a role that has a weight exceeding a given threshold; third, we restrict the problem of finding a candidate role-set for the given system configuration using only the user-permission assignments that have not been removed in the second step, that is, user-permission assignments that belong to roles with a weight exceeding the given threshold. We formally show (the proofs of our results are rooted in graph theory) that this methodology achieves the intended goals. Finally, we discuss practical applications of our approach to the role mining problem.

  2. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions

    PubMed Central

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from the Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants’ municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  3. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions.

    PubMed

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from the Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants' municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar

  4. Distinct Urban Mines: Exploiting secondary resources in unique anthropogenic spaces.

    PubMed

    Ongondo, F O; Williams, I D; Whitlock, G

    2015-11-01

    Fear of scarcity of resources highlights the need to exploit secondary materials from urban mines in the anthroposphere. Analogous to primary mines rich in one type of material (e.g. copper, gold, etc.), some urban mines are unique/distinct. We introduce, illustrate and discuss the concept of Distinct Urban Mines (DUM). Using the example of a university DUM in the UK, analogous to a primary mine, we illustrate potential product/material yields in respect of size, concentration and spatial location of the mine. Product ownership and replacement cycles for 17 high-value electrical and electronic equipment (EEE) among students showed that 20 tonnes of valuable e-waste were in stockpile in this DUM and a further 87 tonnes would 'soon' be available for exploitation. We address the opportunities and challenges of exploiting DUMs and conclude that they are readily available reservoirs for resource recovery. Two original contributions arise from this work: (i) a novel approach to urban mining with a potential for maximising resource recovery within the anthroposphere is conceptualised; and (ii) previously unavailable data for high-value products for a typical university DUM are presented and analysed. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. OntoGene web services for biomedical text mining.

    PubMed

    Rinaldi, Fabio; Clematide, Simon; Marques, Hernani; Ellendorff, Tilia; Romacker, Martin; Rodriguez-Esteban, Raul

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them.

  6. OntoGene web services for biomedical text mining

    PubMed Central

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them. PMID:25472638

  7. Identification of quality markers of Yuanhu Zhitong tablets based on integrative pharmacology and data mining.

    PubMed

    Li, Ke; Li, Junfang; Su, Jin; Xiao, Xuefeng; Peng, Xiujuan; Liu, Feng; Li, Defeng; Zhang, Yi; Chong, Tao; Xu, Haiyu; Liu, Changxiao; Yang, Hongjun

    2018-03-07

    The quality evaluation of traditional Chinese medicine (TCM) formulations is needed to guarantee the safety and efficacy. In our laboratory, we established interaction rules between chemical quality control and biological activity evaluations to study Yuanhu Zhitong tablets (YZTs). Moreover, a quality marker (Q-marker) has recently been proposed as a new concept in the quality control of TCM. However, no appropriate methods are available for the identification of Q-markers from the complex TCM systems. We aimed to use an integrative pharmacological (IP) approach to further identify Q-markers from YZTs through the integration of multidisciplinary knowledge. In addition, data mining was used to determine the correlation between multiple constituents of this TCM and its bioactivity to improve quality control. The IP approach was used to identify the active constituents of YZTs and elucidate the molecular mechanisms by integrating chemical and biosynthetic analyses, drug metabolism, and network pharmacology. Data mining methods including grey relational analysis (GRA) and least squares support vector machine (LS-SVM) regression techniques, were used to establish the correlations among the constituents and efficacy, and dose efficacy in multiple dimensions. Seven constituents (tetrahydropalmatine, α-allocryptopine, protopine, corydaline, imperatorin, isoimperatorin, and byakangelicin) were identified as Q-markers of YZT using IP based on their high abundance, specific presence in the individual herbal constituents and the product, appropriate drug-like properties, and critical contribution to the bioactivity of the mixture of YZT constituents. Moreover, three Q-markers (protopine, α-allocryptopine, and corydaline) were highly correlated with the multiple bioactivities of the YZTs, as found using data mining. Finally, three constituents (tetrahydropalmatine, corydaline, and imperatorin) were chosen as minimum combinations that both distinguished the authentic

  8. Study of application of ERTS-A imagery to fracture-related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J.; Russell, O. R.; Martin, K. R. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. Mined land reclamation analysis procedures developed within the Indiana portion of the Illinois Coal Basin were independently tested in Ohio utilizing 1:80,000 scale enlargements of ERTS-1 image 1029-15361-7 (dated August 21, 1972). An area in Belmont County was selected for analysis due to the extensive surface mining and the different degrees of reclamation occurring in this area. Contour mining in this area provided the opportunity to extend techniques developed for analysis of relatively flat mining areas in Indiana to areas of rolling topography in Ohio. The analysts had no previous experience in the area. Field investigations largely confirmed office analysis results although in a few areas estimates of vegetation percentages were found to be too high. In one area this error approximated 25%. These results suggest that systematic ERTS-1 analysis in combination with selective field sampling can provide reliable vegetation percentage estimates in excess of 25% accuracy with minimum equipment investment and training. The utility of ERTS-1 for practical and reasonably reliable update of mined lands information for groups with budget limitations is suggested. Many states can benefit from low cost updates using ERTS-1 imagery from public sources.

  9. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges.

    PubMed

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system 'accuracy' remains a challenge and identify several additional common difficulties and potential research directions including (i) the 'scalability' issue due to the increasing need of mining information from millions of full-text articles, (ii) the 'interoperability' issue of integrating various text-mining systems into existing curation workflows and (iii) the 'reusability' issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  10. Mining the human gut microbiome for novel stress resistance genes

    PubMed Central

    Culligan, Eamonn P.; Marchesi, Julian R.; Hill, Colin; Sleator, Roy D.

    2012-01-01

    With the rapid advances in sequencing technologies in recent years, the human genome is now considered incomplete without the complementing microbiome, which outnumbers human genes by a factor of one hundred. The human microbiome, and more specifically the gut microbiome, has received considerable attention and research efforts over the past decade. Many studies have identified and quantified “who is there?,” while others have determined some of their functional capacity, or “what are they doing?” In a recent study, we identified novel salt-tolerance loci from the human gut microbiome using combined functional metagenomic and bioinformatics based approaches. Herein, we discuss the identified loci, their role in salt-tolerance and their importance in the context of the gut environment. We also consider the utility and power of functional metagenomics for mining such environments for novel genes and proteins, as well as the implications and possible applications for future research. PMID:22688726

  11. Co-evolutionary data mining for fuzzy rules: automatic fitness function creation phase space, and experiments

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Blank, Joseph A.

    2003-03-01

    An approach is being explored that involves embedding a fuzzy logic based resource manager in an electronic game environment. Game agents can function under their own autonomous logic or human control. This approach automates the data mining problem. The game automatically creates a cleansed database reflecting the domain expert's knowledge; it calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information extracted. The co-evolutionary fitness functions, chromosomes and stopping criteria for ending the game are discussed. Genetic algorithm and genetic program based data mining procedures are discussed that automatically discover new fuzzy rules and strategies. The strategy tree concept and its relationship to co-evolutionary data mining are examined as well as the associated phase space representation of fuzzy concepts. The overlap of fuzzy concepts in phase space reduces the effective strategies available to adversaries. Co-evolutionary data mining alters the geometric properties of the overlap region, known as the admissible region of phase space, significantly enhancing the performance of the resource manager. Procedures for validation of the information data mined are discussed and significant experimental results provided.

  12. Chapter 7: Selecting tree species for reforestation of Appalachian mined lands

    Treesearch

    V. Davis; J.A. Burger; R. Rathfon; C.E. Zipper

    2017-01-01

    The Forestry Reclamation Approach (FRA) is a method for reclaiming coal-mined land to forested postmining land uses under the federal Surface Mining Control and Reclamation Act of 1977 (SMCRA) (Chapter 2, this volume). Step 4 of the FRA is to plant native trees for commercial timber value, wildlife habitat, soil stability, watershed protection, and other environmental...

  13. Review of Lead-Zinc Mining Impact on Landscape in the Tri-State Mining District using Small Unmanned Aerial Vehicles.

    NASA Astrophysics Data System (ADS)

    Bhakta, K. D.; Yeboah-Forson, A.

    2015-12-01

    The Tri-State lead and zinc mining district in SW Missouri, SE Kansas, and NE Oklahoma encompasses nearly 2,500 sq. miles of land and at its peak accounted for half of the US zinc (23,000,000 tons) production that surpassed one billion dollars in economic value. Once these lead and zinc rich ores were extracted, mining and milling sites were abandoned, leaving behind a new landscape with numerous environmental challenges. Since 1970, most of the sites have been targeted for remediation and reclamation by federal and state agencies including the EPA. In order to capture the full extent of the impact of lead and zinc mining in the Tri-State area, numerous geoscientific approaches, including data from small unmanned aerial vehicles (UAVs), were employed to investigate the influence of mining in the study area. The study presented here is focused on observational assessment of the existing landscape using multiple commercial high-definition data from UAVs to study different sites across areas of concern in the three states. Primary results (images) gathered, together with analyzed DEM and GIS data from abandoned mines, showed the potential to provide a quick snapshot of successful or unsuccessful remediated areas. Although research and remediation of the Tri-State mining district are a continuous process, evidence from this geomorphic study suggests that UAVs can provide a quick overview of the remediated landscape or serve as a primary background tool for a more detailed site-specific environmental study.

  14. Data Mining.

    ERIC Educational Resources Information Center

    Benoit, Gerald

    2002-01-01

    Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…

  15. Dietary patterns analysis using data mining method. An application to data from the CYKIDS study.

    PubMed

    Lazarou, Chrystalleni; Karaolis, Minas; Matalas, Antonia-Leda; Panagiotakos, Demosthenes B

    2012-11-01

    Data mining is a computational method that permits the extraction of patterns from large databases. We applied the data mining approach to data from 1140 children (9-13 years), in order to derive dietary habits related to children's obesity status. Rules that emerged via the data mining approach revealed the detrimental influence of the increased consumption of soft drinks, delicatessen meat, sweets, fried and junk food. For example, frequent (3-5 times/week) consumption of all these foods increases the risk for being obese by 75%, whereas in children who have a similar dietary pattern, but eat >2 times/week fish and seafood the risk for obesity is reduced by 33%. In conclusion, patterns revealed by the data mining technique refer to specific groups of children and demonstrate the effect on the risk associated with obesity status when a single dietary habit might be modified. Thus, a more individualized approach when translating public health messages could be achieved. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  16. Improving risk-stratification of Diabetes complications using temporal data mining.

    PubMed

    Sacchi, Lucia; Dagliati, Arianna; Segagni, Daniele; Leporati, Paola; Chiovato, Luca; Bellazzi, Riccardo

    2015-01-01

    Understanding which factors trigger worsened disease control is a crucial step in Type 2 Diabetes (T2D) patient management. The MOSAIC project, funded by the European Commission under the FP7 program, has been designed to integrate heterogeneous data sources and provide decision support in chronic T2D management through patients' continuous stratification. In this work we show how temporal data mining can be fruitfully exploited to improve risk stratification. In particular, we exploit administrative data on drug purchases to divide patients into meaningful groups. The detection of drug consumption patterns allows stratifying the population on the basis of subjects' purchasing attitude. Merging these findings with clinical values indicates the relevance of the applied methods while showing significant differences in the identified groups. This extensive approach emphasized the exploitation of administrative data to identify patterns able to explain clinical conditions.

  17. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis.
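
    The final classification step above relies on Dempster-Shafer theory. As a rough illustration only, the sketch below applies Dempster's rule of combination to two hypothetical evidence sources. The hypothesis names, the mass values, and the function name combine are assumptions made for this example and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        overlap = b & c
        if overlap:
            # Agreeing evidence supports the intersection of the two sets.
            combined[overlap] = combined.get(overlap, 0.0) + mass_b * mass_c
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by 1 - K so the combined masses sum to one.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment: the object is a mine or a rock.
MINE = frozenset({"mine"})
ROCK = frozenset({"rock"})
EITHER = MINE | ROCK

# Hypothetical masses from two feature sources (illustrative values only).
shadow_evidence = {MINE: 0.6, EITHER: 0.4}
highlight_evidence = {MINE: 0.5, ROCK: 0.3, EITHER: 0.2}

fused = combine(shadow_evidence, highlight_evidence)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 3))
```

    Mass assigned to the whole frame (EITHER) represents ignorance, which is what distinguishes this scheme from a plain Bayesian update.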

  18. 30 CFR 49.4 - Alternative mine rescue capability for special mining conditions.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Alternative mine rescue capability for special mining conditions. 49.4 Section 49.4 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and...

  19. 30 CFR 49.4 - Alternative mine rescue capability for special mining conditions.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Alternative mine rescue capability for special mining conditions. 49.4 Section 49.4 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and...

  20. 30 CFR 49.4 - Alternative mine rescue capability for special mining conditions.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Alternative mine rescue capability for special mining conditions. 49.4 Section 49.4 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and...

  1. Safety survey of Iran's mines and comparison to some other countries.

    PubMed

    Bagherpour, Raheb; Yarahmadi, Reza; Khademian, Amir; Almasi, Seied Najmedin

    2017-03-01

    The increasing development of mining activities in Iran makes it necessary to have a closer look at the safety issues. Analysis of different incidents and damages in mines can be helpful for the adoption of suitable approaches to prevent the incidents. In this study, safety statistics of Iran's mines in 2011 and 2012 were assessed and important incidents and injuries happening to employees for 12 different groups of minerals were evaluated and eventually compared to the situation of some other countries. According to the obtained results, the average incidence probability in Iran's mines was calculated to be 0.18 for 2011 and the incidence probability of coal, copper and iron ore mines was greater than others. The injury rate of Iran's mines was 106 and 164 out of 10,000 persons for 2011 and 2012, respectively, and the maximum values of injury rate belonged to coal, dimension stone and aggregate mines. Also, it turned out that the fatal rate per 100 tons of production had the highest values in chromite and coal mines. Besides, comparison of injury rate and the fatal rate in Iran and some countries showed that the safety situation in Iran's mines was in a fair condition.
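
    The injury rates above are expressed per 10,000 persons. A minimal sketch of that normalization follows; the function name rate_per and the input counts are hypothetical and are not data from the study.

```python
def rate_per(events, population, base=10_000):
    """Return the number of events per `base` persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events * base / population


# Hypothetical counts, chosen only to illustrate the arithmetic.
example_rate = rate_per(events=53, population=5_000)
print(example_rate)
```

    Normalizing to a common base is what makes rates comparable across mineral groups or countries with very different workforce sizes.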

  2. Data mining for multiagent rules, strategies, and fuzzy decision tree structure

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Rhyne, Robert D., II; Fisher, Kristin

    2002-03-01

    A fuzzy logic based resource manager (RM) has been developed that automatically allocates electronic attack resources in real-time over many dissimilar platforms. Two different data mining algorithms have been developed to determine rules, strategies, and fuzzy decision tree structure. The first data mining algorithm uses a genetic algorithm as a data mining function and is called from an electronic game. The game allows a human expert to play against the resource manager in a simulated battlespace with each of the defending platforms being exclusively directed by the fuzzy resource manager and the attacking platforms being controlled by the human expert or operating autonomously under their own logic. This approach automates the data mining problem. The game automatically creates a database reflecting the domain expert's knowledge. It calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information mined in the second step. The criterion for re-optimization is discussed as well as experimental results. Then a second data mining algorithm that uses a genetic program as a data mining function is introduced to automatically discover fuzzy decision tree structures. Finally, a fuzzy decision tree generated through this process is discussed.

  3. Use of electrical resistivity to detect underground mine voids in Ohio

    USGS Publications Warehouse

    Sheets, Rodney A.

    2002-01-01

    Electrical resistivity surveys were completed at two sites along State Route 32 in Jackson and Vinton Counties, Ohio. The surveys were done to determine whether the electrical resistivity method could identify areas where coal was mined, leaving air- or water-filled voids. These voids can be local sources of potable water or acid mine drainage. They could also result in potentially dangerous collapse of roads or buildings that overlie the voids. The resistivity response of air- or water-filled voids compared to the surrounding bedrock may allow electrical resistivity surveys to delineate areas underlain by such voids. Surface deformation along State Route 32 in Jackson County led to a site investigation, which included electrical resistivity surveys. Several highly resistive areas were identified using axial dipole-dipole and Wenner resistivity surveys. Subsequent drilling and excavation led to the discovery of several air-filled abandoned underground mine tunnels. A site along State Route 32 in Vinton County, Ohio, was drilled as part of a mining permit application process. A mine void under the highway was instrumented with a pressure transducer to monitor water levels. During a period of high water level, electrical resistivity surveys were completed. The electrical response was dominated by a thin, low-resistivity layer of iron ore above where the coal was mined out. Nearby overhead powerlines also affected the results.
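
    For readers unfamiliar with the two array types named above, the sketch below computes apparent resistivity from a single reading using the standard geometric factors for the Wenner and axial dipole-dipole arrays. The function names and the example readings are assumptions for illustration and do not come from the report.

```python
import math


def wenner_apparent_resistivity(spacing_m, voltage_v, current_a):
    """Apparent resistivity (ohm-m) for a Wenner array with electrode spacing a."""
    # Geometric factor for Wenner: 2 * pi * a
    return 2.0 * math.pi * spacing_m * voltage_v / current_a


def dipole_dipole_apparent_resistivity(dipole_length_m, n, voltage_v, current_a):
    """Apparent resistivity (ohm-m) for an axial dipole-dipole array.

    dipole_length_m is the dipole length a; n is the dipole separation
    expressed as a multiple of a.
    """
    # Geometric factor for dipole-dipole: pi * n * (n + 1) * (n + 2) * a
    factor = math.pi * n * (n + 1) * (n + 2) * dipole_length_m
    return factor * voltage_v / current_a


# Hypothetical readings: 5 m spacing, 0.12 V measured, 0.05 A injected.
print(wenner_apparent_resistivity(5.0, 0.12, 0.05))
print(dipole_dipole_apparent_resistivity(5.0, 2, 0.004, 0.05))
```

    An air-filled void would be expected to raise the apparent resistivity relative to the surrounding bedrock, which is consistent with the highly resistive areas reported at the Jackson County site.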

  4. Mining nonterrestrial resources: Information needs and research topics

    NASA Technical Reports Server (NTRS)

    Daemen, Jaak J. K.

    1992-01-01

    An outline of topics we need to understand better in order to apply mining technology to a nonterrestrial environment is presented. The proposed list is not intended to be complete. It aims to identify representative topics that suggest productive research. Such research will reduce the uncertainties associated with extrapolating from conventional earthbound practice to nonterrestrial applications. One objective is to propose projects that should put future discussions of nonterrestrial mining on a firmer, less speculative basis.

  5. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH INCORPORATING GEOGRAPHIC INFORMATION SYSTEMS

    EPA Science Inventory

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  6. Elevated rates of gold mining in the Amazon revealed through high-resolution monitoring.

    PubMed

    Asner, Gregory P; Llactayo, William; Tupayachi, Raul; Luna, Ernesto Ráez

    2013-11-12

    Gold mining has rapidly increased in western Amazonia, but the rates and ecological impacts of mining remain poorly known and potentially underestimated. We combined field surveys, airborne mapping, and high-resolution satellite imaging to assess road- and river-based gold mining in the Madre de Dios region of the Peruvian Amazon from 1999 to 2012. In this period, the geographic extent of gold mining increased 400%. The average annual rate of forest loss as a result of gold mining tripled in 2008 following the global economic recession, closely associated with increased gold prices. Small clandestine operations now comprise more than half of all gold mining activities throughout the region. These rates of gold mining are far higher than previous estimates that were based on traditional satellite mapping techniques. Our results prove that gold mining is growing more rapidly than previously thought, and that high-resolution monitoring approaches are required to accurately quantify human impacts on tropical forests.

  7. Elevated rates of gold mining in the Amazon revealed through high-resolution monitoring

    PubMed Central

    Asner, Gregory P.; Llactayo, William; Tupayachi, Raul; Luna, Ernesto Ráez

    2013-01-01

    Gold mining has rapidly increased in western Amazonia, but the rates and ecological impacts of mining remain poorly known and potentially underestimated. We combined field surveys, airborne mapping, and high-resolution satellite imaging to assess road- and river-based gold mining in the Madre de Dios region of the Peruvian Amazon from 1999 to 2012. In this period, the geographic extent of gold mining increased 400%. The average annual rate of forest loss as a result of gold mining tripled in 2008 following the global economic recession, closely associated with increased gold prices. Small clandestine operations now comprise more than half of all gold mining activities throughout the region. These rates of gold mining are far higher than previous estimates that were based on traditional satellite mapping techniques. Our results prove that gold mining is growing more rapidly than previously thought, and that high-resolution monitoring approaches are required to accurately quantify human impacts on tropical forests. PMID:24167281

  8. 30 CFR 49.4 - Alternative mine rescue capability for special mining conditions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Alternative mine rescue capability for special mining conditions. 49.4 Section 49.4 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS § 49.4 Alternative mine rescue capability for...

  9. 30 CFR 49.4 - Alternative mine rescue capability for special mining conditions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Alternative mine rescue capability for special mining conditions. 49.4 Section 49.4 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS § 49.4 Alternative mine rescue capability for...

  10. Mining in New Caledonia: environmental stakes and restoration opportunities.

    PubMed

    Losfeld, Guillaume; L'Huillier, Laurent; Fogliani, Bruno; Jaffré, Tanguy; Grison, Claude

    2015-04-01

    New Caledonia is a widely recognised marine and terrestrial biodiversity hot spot. However, this unique environment is under increasing anthropogenic pressure. Major threats are related to land cover change and include fire, urban sprawl and mining. Resulting habitat loss and fragmentation end up in serious erosion of the local biodiversity. Mining is of particular concern due to its economic significance for the island. Open cast mines have been exploited there since 1873, and scraping out soil to access ores wipes out flora. Resulting perturbations on water flows and dramatic soil erosion lead to metal-rich sediment transport downstream into rivers and the lagoon. Conflicting environmental and economic aspects of mining are discussed in this paper. However, mining practices are also improving, and where impacts are inescapable ecological restoration is now considered. Past and ongoing experiences in the restoration of New Caledonian terrestrial ecosystems are presented and discussed here. Economic use of the local floristic diversity could also promote conservation and restoration, while providing alternative incomes. In this regard, Ecocatalysis, an innovative approach to make use of metal hyperaccumulating plants, is of particular interest.

  11. Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track

    DTIC Science & Technology

    2015-11-20

    Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track Paul N. Bennett Microsoft Research Redmond, USA pauben...anchor text graph has proven useful in the general realm of query reformulation [2], we sought to quantify the value of extracting key phrases from...anchor text in the broader setting of the task understanding track. Given a query, our approach considers a simple method for identifying a relevant

  12. PREVENTION OF ACID MINE DRAINAGE GENERATION FROM OPEN-PIT MINE HIGHWALLS

    EPA Science Inventory



    Exposed, open pit mine highwalls contribute significantly to the production of acid mine drainage (AMD) thus causing environmental concerns upon closure of an operating mine. Available information on the generation of AMD from open-pit mine highwalls is very limit...

  13. 30 CFR 780.27 - Reclamation plan: Surface mining near underground mining.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 3 2011-07-01 2011-07-01 false Reclamation plan: Surface mining near underground mining. 780.27 Section 780.27 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR SURFACE COAL MINING AND RECLAMATION OPERATIONS PERMITS AND COAL...

  14. Risk Management Interventions to Reduce Injuries and Maximize Economic Benefits in U.S. Mining.

    PubMed

    Griffin, Stephanie C; Bui, David P; Gowrisankaran, Gautam; Lutz, Eric A; He, Charles; Hu, Chengcheng; Burgess, Jefferey L

    2018-03-01

    Risk management (RM) is a cyclical process of identifying and ranking risks, implementing controls, and evaluating their effectiveness. This study aims to identify effective RM interventions in the U.S. mining industry. RM interventions were identified in four companies representing metal, aggregate, and coal mining sectors. Injury rates were determined using Mine Safety and Health Administration (MSHA) data and changes in injury rates identified through change point analysis. Program implementation costs and associated changes in injury costs were evaluated for select interventions. Six of 20 RM interventions were associated with a decline in all injuries and one with a reduction in lost-time injuries, all with a positive return on investment. Reductions in injuries and associated costs were observed following implementation of a limited number of specific RM interventions.
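
    The abstract does not say which change point method was used. As a generic illustration only, the sketch below finds a single shift in the mean of a series by least squares. The function name single_change_point and the injury-rate values are hypothetical.

```python
def single_change_point(series):
    """Return the split index k that minimizes the total squared error
    when series[:k] and series[k:] are each modeled by their own mean."""
    if len(series) < 2:
        raise ValueError("need at least two observations")

    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    # Try every split that leaves at least one point on each side.
    return min(
        range(1, len(series)),
        key=lambda k: sse(series[:k]) + sse(series[k:]),
    )


# Hypothetical quarterly injury rates with a drop after the fourth value.
rates = [5.1, 4.9, 5.3, 5.0, 3.1, 2.9, 3.2, 3.0]
change_index = single_change_point(rates)
print(change_index)
```

    In an evaluation like the one described, a detected change point would then be compared with the date an intervention was implemented. The sketch makes no claim about statistical significance.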

  15. The psychosocial impacts of fly-in fly-out and drive-in drive-out mining on mining employees: a qualitative study.

    PubMed

    Torkington, Amanda May; Larkins, Sarah; Gupta, Tarun Sen

    2011-06-01

    To explore how fly-in fly-out (FIFO) and drive-in drive-out (DIDO) mining affects the psychosocial well-being of miners resident in a rural north Queensland town as well as the sources of support miners identify and use in managing these effects. A descriptive qualitative study, using semistructured interviews. Charters Towers, a rural town in north Queensland, and a remote north-western Queensland mine. Eleven people, resident in or near Charters Towers, currently or formerly employed in FIFO or DIDO mining. Self-reported effects on psychosocial well-being and sources of support. Participants reported positive and negative psychosocial impacts across domains including family life, relationships, social life, work satisfaction, mood, sleep and financial situation. Concerns about the impact on participants' partners were described. Awareness of onsite support, such as Employee Assistance Programs, varied. Other supports included administration staff and nurses or medics. Trusted friends or colleagues at the mine site were considered a preferred means of support. Some, but not most, had experienced coworkers discussing problems with them. A reluctance to seek support was described, with a number of barriers identified. Those having problems might not recognise their own stress and thus not seek support. This study identifies numerous psychosocial impacts on FIFO/DIDO miners and their partners, and provides insights into preferences regarding support. Employee Assistance Programs cannot be relied upon as the sole means of support. Further studies exploring the impact upon and supports for FIFO/DIDO workers and their partners will assist in better understanding these issues. © 2011 The Authors. Australian Journal of Rural Health © National Rural Health Alliance Inc.

  16. Quantitative and qualitative approaches to identifying migration chronology in a continental migrant

    USGS Publications Warehouse

    Beatty, William S.; Kesler, Dylan C.; Webb, Elisabeth B.; Raedeke, Andrew H.; Naylor, Luke W.; Humburg, Dale D.

    2013-01-01

    The degree to which extrinsic factors influence migration chronology in North American waterfowl has not been quantified, particularly for dabbling ducks. Previous studies have examined waterfowl migration using various methods, however, quantitative approaches to define avian migration chronology over broad spatio-temporal scales are limited, and the implications for using different approaches have not been assessed. We used movement data from 19 female adult mallards (Anas platyrhynchos) equipped with solar-powered global positioning system satellite transmitters to evaluate two individual level approaches for quantifying migration chronology. The first approach defined migration based on individual movements among geopolitical boundaries (state, provincial, international), whereas the second method modeled net displacement as a function of time using nonlinear models. Differences in migration chronologies identified by each of the approaches were examined with analysis of variance. The geopolitical method identified mean autumn migration midpoints at 15 November 2010 and 13 November 2011, whereas the net displacement method identified midpoints at 15 November 2010 and 14 November 2011. The mean midpoints for spring migration were 3 April 2011 and 20 March 2012 using the geopolitical method and 31 March 2011 and 22 March 2012 using the net displacement method. The duration, initiation date, midpoint, and termination date for both autumn and spring migration did not differ between the two individual level approaches. Although we did not detect differences in migration parameters between the different approaches, the net displacement metric offers broad potential to address questions in movement ecology for migrating species. Ultimately, an objective definition of migration chronology will allow researchers to obtain a comprehensive understanding of the extrinsic factors that drive migration at the individual and population levels. As a result, targeted

  17. Quantitative and qualitative approaches to identifying migration chronology in a continental migrant.

    PubMed

    Beatty, William S; Kesler, Dylan C; Webb, Elisabeth B; Raedeke, Andrew H; Naylor, Luke W; Humburg, Dale D

    2013-01-01

    The degree to which extrinsic factors influence migration chronology in North American waterfowl has not been quantified, particularly for dabbling ducks. Previous studies have examined waterfowl migration using various methods, however, quantitative approaches to define avian migration chronology over broad spatio-temporal scales are limited, and the implications for using different approaches have not been assessed. We used movement data from 19 female adult mallards (Anas platyrhynchos) equipped with solar-powered global positioning system satellite transmitters to evaluate two individual level approaches for quantifying migration chronology. The first approach defined migration based on individual movements among geopolitical boundaries (state, provincial, international), whereas the second method modeled net displacement as a function of time using nonlinear models. Differences in migration chronologies identified by each of the approaches were examined with analysis of variance. The geopolitical method identified mean autumn migration midpoints at 15 November 2010 and 13 November 2011, whereas the net displacement method identified midpoints at 15 November 2010 and 14 November 2011. The mean midpoints for spring migration were 3 April 2011 and 20 March 2012 using the geopolitical method and 31 March 2011 and 22 March 2012 using the net displacement method. The duration, initiation date, midpoint, and termination date for both autumn and spring migration did not differ between the two individual level approaches. Although we did not detect differences in migration parameters between the different approaches, the net displacement metric offers broad potential to address questions in movement ecology for migrating species. Ultimately, an objective definition of migration chronology will allow researchers to obtain a comprehensive understanding of the extrinsic factors that drive migration at the individual and population levels. As a result, targeted

  18. Orapa Diamond Mine, Botswana

    NASA Image and Video Library

    2015-11-16

    This image from NASA's Terra spacecraft shows the Orapa diamond mine, the world's largest diamond mine by area. The mine is located in Botswana. It is the oldest of four mines operated by the same company, having begun operations in 1971. Orapa is an open pit style of mine, located on two kimberlite pipes. Currently, the Orapa mine annually produces approximately 11 million carats (2200 kg) of diamonds. The Letlhakane diamond mine is also an open pit construction. In 2003, the Letlhakane mine produced 1.06 million carats of diamonds. The Damtshaa diamond mine is the newest of the four mines, located on top of four distinct kimberlite pipes of varying ore grade. The mine is forecast to produce about 5 million carats of diamonds over its projected 31-year life. The image was acquired October 5, 2014, covers an area of 28 by 45 km, and is located at 21.3 degrees south, 25.4 degrees east. http://photojournal.jpl.nasa.gov/catalog/PIA20104

  19. Using Statistics and Data Mining Approaches to Analyze Male Sexual Behaviors and Use of Erectile Dysfunction Drugs Based on Large Questionnaire Data.

    PubMed

    Qiao, Zhi; Li, Xiang; Liu, Haifeng; Zhang, Lei; Cao, Junyang; Xie, Guotong; Qin, Nan; Jiang, Hui; Lin, Haocheng

    2017-01-01

    The prevalence of erectile dysfunction (ED) has been extensively studied worldwide. Erectile dysfunction drugs have shown great efficacy in preventing male erectile dysfunction. To help doctors understand patients' drug-taking preferences and prescribe better, it is crucial to analyze who actually takes erectile dysfunction drugs and the relation between sexual behaviors and drug use. Existing clinical studies have usually used descriptive statistics and regression analysis based on small volumes of data. In this paper, based on a large volume of data (48,630 questionnaires), we use data mining approaches in addition to statistics and regression analysis to comprehensively analyze the relation between male sexual behaviors and use of erectile dysfunction drugs, in order to characterize the patients who take these drugs. We first analyze the impact of multiple sexual behavior factors on whether erectile dysfunction drugs are used. Then, we mine Decision Rules for Stratification to discover patients who are more likely to take drugs. Based on the decision rules, the patients can be partitioned into four potential groups for use of erectile dysfunction drugs: high potential group, intermediate potential-1 group, intermediate potential-2 group and low potential group. Experimental results show that 1) two sexual behavior factors, erectile hardness and time length to prepare (how long one prepares for sexual behaviors ahead of time), have the largest impacts both in correlation analysis and in discovering potential drug-taking patients; and 2) the odds ratio between patients identified as low potential and high potential was 6.098 (95% confidence interval, 5.159-7.209), with statistically significant differences in drug-taking potential detected between all potential groups.
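The odds ratio and confidence interval quoted in this abstract can be sanity-checked without access to the underlying counts. The sketch below assumes (this is not stated in the abstract) that the interval is a standard Wald interval built on the log scale; under that assumption the point estimate should sit at the geometric midpoint of the bounds, and the standard error of log(OR) can be recovered from the interval width.

```python
import math

# Values reported in the abstract.
odds_ratio, ci_low, ci_high = 6.098, 5.159, 7.209

# Assumption: the interval is a Wald interval, exp(log(OR) +/- z * SE).
Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# A Wald interval is symmetric on the log scale, so the geometric mean of
# the two bounds should reproduce the reported point estimate.
geometric_mid = math.sqrt(ci_low * ci_high)

# The implied standard error of log(OR) is half the log-width divided by z.
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)

print(f"geometric midpoint of CI: {geometric_mid:.3f}")
print(f"implied SE of log(OR):    {se_log_or:.4f}")
```

If the geometric midpoint differed noticeably from the reported odds ratio, that would suggest either a different interval construction or a transcription error in one of the three numbers.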

  20. A Review of Recent Advancement in Integrating Omics Data with Literature Mining towards Biomedical Discoveries

    PubMed Central

    Raja, Kalpana; Patrick, Matthew; Gao, Yilin; Madu, Desmond; Yang, Yuyang

    2017-01-01

    In the past decade, the volume of “omics” data generated by the different high-throughput technologies has expanded exponentially. Managing, storing, and analyzing this big data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of the high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline the downstream analyses by providing independent information to interpret and provide biological inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review paper, we discuss recent advances in approaches that integrate results from omics data with information generated by text mining to uncover novel biomedical information. PMID:28331849

  1. Comparative analysis of data mining techniques for business data

    NASA Astrophysics Data System (ADS)

    Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd

    2014-12-01

    Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict what products customers will buy and how frequently, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental results reveal that all the data mining techniques accomplish their goals perfectly, but each technique has its own characteristics and specifications that demonstrate its accuracy, proficiency and preference.

  2. Surface mining

    Treesearch

    Robert Leopold; Bruce Rowland; Reed Stalder

    1979-01-01

    The surface mining process consists of four phases: (1) exploration; (2) development; (3) production; and (4) reclamation. A variety of surface mining methods has been developed, including strip mining, auger, area strip, open pit, dredging, and hydraulic. Sound planning and design techniques are essential to implement alternatives to meet the myriad of laws,...

  3. 30 CFR 780.27 - Reclamation plan: Surface mining near underground mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... RECLAMATION AND OPERATION PLAN § 780.27 Reclamation plan: Surface mining near underground mining. For surface... 30 Mineral Resources 3 2010-07-01 2010-07-01 false Reclamation plan: Surface mining near... ENFORCEMENT, DEPARTMENT OF THE INTERIOR SURFACE COAL MINING AND RECLAMATION OPERATIONS PERMITS AND COAL...

  4. Data mining for personal navigation

    NASA Astrophysics Data System (ADS)

    Hariharan, Gurushyam; Franti, Pasi; Mehta, Sandeep

    2002-03-01

    Relevance is the key in defining what data is to be extracted from the Internet. Traditionally, relevance has been defined mainly by keywords and user profiles. In this paper we discuss a fairly untouched dimension to relevance: location. Any navigational information sought by a user at large on earth is evidently governed by his location. We believe that task oriented data mining of the web amalgamated with location information is the key to providing relevant information for personal navigation. We explore the existential hurdles and propose novel approaches to tackle them. We also present naive, task-oriented data mining based approaches and their implementations in Java, to extract location based information. Ad-hoc pairing of data with coordinates (x, y) is very rare on the web. But if the same co-ordinates are converted to a logical address (state/city/street), a wide spectrum of location-based information base opens up. Hence, given the coordinates (x, y) on the earth, the scheme points to the logical address of the user. Location based information could either be picked up from fixed and known service providers (e.g. Yellow Pages) or from any arbitrary website on the Web. Once the web servers providing information relevant to the logical address are located, task oriented data mining is performed over these sites keeping in mind what information is interesting to the contemporary user. After all this, a simple data stream is provided to the user with information scaled to his convenience. The scheme has been implemented for cities of Finland.
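The first step of the scheme described in this abstract, turning coordinates (x, y) into a logical address, can be sketched as a nearest-neighbour lookup. This is only an illustration in Python, not the authors' Java implementation: the `GAZETTEER` table, its address strings, and the function names are hypothetical, and a real system would resolve down to street level.

```python
import math

# Hypothetical gazetteer: logical address -> (latitude, longitude) in degrees.
GAZETTEER = {
    "Finland/Helsinki": (60.1699, 24.9384),
    "Finland/Tampere": (61.4978, 23.7610),
    "Finland/Joensuu": (62.6010, 29.7636),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def logical_address(position):
    """Return the gazetteer entry closest to the given (lat, lon) position."""
    return min(GAZETTEER, key=lambda name: haversine_km(position, GAZETTEER[name]))


user_position = (60.20, 24.90)  # illustrative coordinates near Helsinki
print(logical_address(user_position))
```

Once a logical address is available, it can be used as a search key against location-aware sources, which is the step the abstract describes as task-oriented data mining.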

  5. Identifying the Educationally Influential Physician: A Systematic Review of Approaches

    ERIC Educational Resources Information Center

    Kronberger, Matthew P.; Bakken, Lori L.

    2011-01-01

    Introduction: Previous studies have indicated that educationally influential physicians' (EIPs) interactions with peers can lead to practice changes and improved patient outcomes. However, multiple approaches have been used to identify and investigate EIPs' informal or formal influence on practice, which creates study outcomes that are difficult…

  6. Exploring patterns of epigenetic information with data mining techniques.

    PubMed

    Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián

    2013-01-01

    Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could be then used to extract the previous patterns. This work reviews some of the most important applications of data mining to epigenetics.

  7. Geotechnical approaches to coal ash content control in mining of complex structure deposits

    NASA Astrophysics Data System (ADS)

    Batugin, SA; Gavrilov, VL; Khoyutanov, EA

    2017-02-01

    Coal deposits with complex structure and coal reserves of nonuniform quality require improved processes of production quality control. The paper proposes a method that represents coal ash content as components of natural and technological dilution. The studies were carried out on the western site of the Elginsk coal deposit, which is composed of four coal beds of complex structure. The reported estimates of coal ash content in the beds with respect to five components point to the need to account for such data in confirmation exploration, mine planning and actual mining. Basic means of analysis and control of overall ash content and its components are discussed.

  8. The node-weighted Steiner tree approach to identify elements of cancer-related signaling pathways.

    PubMed

    Sun, Yahui; Ma, Chenkai; Halgamuge, Saman

    2017-12-28

    Cancer constitutes a momentous health burden in our society. Critical information on cancer may be hidden in its signaling pathways. However, even though a large amount of money has been spent on cancer research, some critical information on cancer-related signaling pathways still remains elusive. Hence, new work towards a complete understanding of cancer-related signaling pathways will greatly benefit the prevention, diagnosis, and treatment of cancer. We propose the node-weighted Steiner tree approach to identify important elements of cancer-related signaling pathways at the level of proteins. This new approach has advantages over previous approaches since it is fast in processing large protein-protein interaction networks. We apply this new approach to identify important elements of two well-known cancer-related signaling pathways: PI3K/Akt and MAPK. First, we generate a node-weighted protein-protein interaction network using protein and signaling pathway data. Second, we modify and use two preprocessing techniques and a state-of-the-art Steiner tree algorithm to identify a subnetwork in the generated network. Third, we propose two new metrics to select important elements from this subnetwork. On a commonly used personal computer, this new approach takes less than 2 s to identify the important elements of PI3K/Akt and MAPK signaling pathways in a large node-weighted protein-protein interaction network with 16,843 vertices and 1,736,922 edges. We further analyze and demonstrate the significance of these identified elements to cancer signal transduction by exploring previously reported experimental evidence. Our node-weighted Steiner tree approach is shown to be both fast and effective in identifying important elements of cancer-related signaling pathways. Furthermore, it may provide new perspectives into the identification of signaling pathways for other human diseases.
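To make the node-weighted Steiner tree idea concrete: given a graph whose nodes carry costs and a set of required terminal nodes, the goal is a connected subgraph containing all terminals with low total node cost. The sketch below is a simple shortest-path heuristic (grow a tree by repeatedly attaching the cheapest-to-reach remaining terminal). It is not the preprocessing or the state-of-the-art algorithm used in the paper, and the toy graph, node names, and costs are hypothetical. Node costs are assumed to be non-negative.

```python
import heapq


def cheapest_path_to_tree(graph, cost, tree, terminals_left):
    """Multi-source Dijkstra from the current tree; entering node v costs cost[v].

    Returns the path from the first remaining terminal reached back to the tree.
    """
    dist = {v: 0.0 for v in tree}
    prev = {}
    heap = [(0.0, v) for v in tree]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in terminals_left:
            path = [u]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return path
        for v in graph[u]:
            nd = d + cost[v]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    raise ValueError("terminals are not connected")


def steiner_heuristic(graph, cost, terminals):
    """Greedy approximation: attach one remaining terminal at a time."""
    terminals = list(terminals)
    tree = {terminals[0]}
    left = set(terminals[1:])
    while left:
        tree.update(cheapest_path_to_tree(graph, cost, tree, left))
        left -= tree
    return tree


# Hypothetical toy network: A, B, C are terminals (cost 0); H is a cheap hub,
# X and Y are expensive intermediates.
graph = {
    "A": ["H", "X"], "B": ["H", "X", "Y"], "C": ["H", "Y"],
    "H": ["A", "B", "C"], "X": ["A", "B"], "Y": ["B", "C"],
}
cost = {"A": 0, "B": 0, "C": 0, "H": 1, "X": 5, "Y": 5}

selected = steiner_heuristic(graph, cost, ["A", "B", "C"])
print(sorted(selected), "total cost:", sum(cost[v] for v in selected))
```

In this toy case the heuristic routes all three terminals through the cheap hub rather than through the two expensive intermediates. On large interaction networks a greedy heuristic like this gives no optimality guarantee, which is one reason dedicated Steiner tree solvers exist.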

  9. A New Data Mining Scheme Using Artificial Neural Networks

    PubMed Central

    Kamruzzaman, S. M.; Jehad Sarkar, A. M.

    2011-01-01

    Classification is one of the data mining problems receiving enormous attention in the database community. Although artificial neural networks (ANNs) have been successfully applied in a wide range of machine learning applications, they are often regarded as black boxes, i.e., their predictions cannot be explained. To make ANNs more explainable, this paper proposes a novel algorithm to extract symbolic rules from ANNs. ANN methods have not been effectively utilized for data mining tasks because how the classifications were made is not explicitly stated as symbolic rules that are suitable for verification or interpretation by human experts. With the proposed approach, concise, easily explainable symbolic rules with high accuracy can be extracted from the trained ANNs. The extracted rules are comparable with those of other methods in terms of number of rules, average number of conditions per rule, and accuracy. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of benchmark data mining classification problems. PMID:22163866

  10. The impact of mining activities on Mongolia's protected areas: a status report with policy recommendations.

    PubMed

    Farrington, John D

    2005-07-01

    Mongolia's protected areas cover 20.5 million ha or 13.1% of its national territory. Existing and proposed protected areas, however, are threatened by mining. Mining impacts on Mongolia's protected areas are diverse and include licensed and unlicensed mineral activities in protected areas, buffer zone disturbance, and prevention of the establishment of proposed protected areas. Review of United States, Canadian, and Australian policies revealed 9 basic approaches to resolving conflicts between protected areas and mining. Four approaches suitable for Mongolia are granting land trades and special dispensations in exchange for mineral licenses in protected areas; granting protected status to all lapsed mineral licenses in protected areas; voluntary forfeiting of mineral licenses in protected areas in exchange for positive corporate publicity; and prohibiting all new mineral activities in existing and proposed protected areas. Mining is Mongolia's most important industry, however, and the long-term benefits of preserving Mongolia's natural heritage must be considered and weighed against the economic benefits and costs of mining activities.

  11. Effects of human and organizational deficiencies on workers' safety behavior in a mining site in Iran.

    PubMed

    Mirzaei Aliabadi, Mostafa; Aghaei, Hamed; Kalatpour, Omid; Soltanian, Ali Reza; SeyedTabib, Maryam

    2018-05-18

    Mines are a dangerous workplace worldwide with a high accident rate. According to the Statistical Center of Iran, the number of occupational accidents in Iranian mines has increased in recent years. This study determined and explained human and organizational deficiencies influencing Iranian mining accidents. In this study, the data associated with 305 mining accidents were investigated. The data were analyzed based on a systems analysis approach to identify critical deficiencies in organizational influences, unsafe supervision, preconditions for unsafe acts, and workers' unsafe acts. Partial Least Square Structural Equation Modeling [PLS-SEM] was utilized for modeling the interactions between these deficiencies. It was demonstrated that organizational deficiencies had a direct positive effect on workers' violations (path coefficient=0.16) and workers' errors (path coefficient=0.23). The effect of unsafe supervision on workers' violations and workers' errors was also significant with the path coefficients of 0.14 and 0.20. Likewise, preconditions for unsafe acts also had a significant effect on both workers' violations (path coefficient=0.16) and workers' errors (path coefficient=0.21). Moreover, organizational deficiencies had an indirect positive effect on workers' unsafe acts mediated by unsafe supervision and preconditions for unsafe acts. Among the variables examined in the current study, organizational influences had the strongest impacts on workers' unsafe acts. Organizational deficiencies are the main cause of accidents in the mining sector and affect all other aspects of system safety. To prevent occupational accidents, organizational deficiencies should be addressed first.

  12. The Mechanization of Mining.

    ERIC Educational Resources Information Center

    Marovelli, Robert L.; Karhnak, John M.

    1982-01-01

    Mechanization of mining is explained in terms of its effect on the mining of coal, focusing on, among others, types of mining, productivity, machinery, benefits to retired miners, fatality rate in underground coal mines, and output of U.S. mining industry. (Author/JN)

  13. Text and Structural Data Mining of Influenza Mentions in Web and Social Media

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corley, Courtney D.; Cook, Diane; Mikler, Armin R.

    Text and structural data mining of Web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC) to assure wide dissemination of pertinent information. WSM that mention influenza are harvested over a 24-week period, 5-October-2008 to 21-March-2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate to real-world influenza-like-illness patient report data. We also bring to bear a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user-tags.

  14. Study of application of ERTS-A imagery to fracture related mine safety hazards in the coal mining industry

    NASA Technical Reports Server (NTRS)

    Wier, C. E.; Wobber, F. J. (Principal Investigator); Russell, O. R.; Amato, R. V.

    1973-01-01

    The author has identified the following significant results. The 70mm black and white infrared photography acquired in March 1973 at an approximate scale of 1:115,000 permits the identification of areas of mine subsidence not readily evident on other films. This is largely due to the high contrast rendition of water and land by this film and the excessive surface moisture conditions prevalent in the area at the time of photography. Subsided areas consist of shallow depressions which have impounded water. Patterns with a regularity indicative of the room and pillar configuration used in subsurface coal mining are evident.

  15. Machine learning approaches to analysing textual injury surveillance data: a systematic review.

    PubMed

    Vallmuur, Kirsten

    2015-06-01

    To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data. Systematic review. The electronic databases which were searched included PubMed, Cinahl, Medline, Google Scholar, and Proquest. The bibliography of all relevant articles was examined and associated articles were identified using a snowballing technique. For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, and (c) used machine learning approaches to analyse textual data. The papers identified through the search were screened, resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strengths and limitations of different techniques, and quality assurance approaches used. Due to heterogeneity between studies, meta-analysis was not performed. Occupational injuries were the focus of half of the machine learning studies and the most common methods described were Bayesian probability or Bayesian network based methods to either predict injury categories or extract common injury scenarios. Models were evaluated through either comparison with gold standard data or content expert evaluation or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, and was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalizability, source data quality, complexity of models and integration of content and technical knowledge were discussed. The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, and involvement of computer scientists in the injury prevention field, along with more comprehensive use and description of quality

  16. Biomedical text mining for research rigor and integrity: tasks, challenges, directions.

    PubMed

    Kilicoglu, Halil

    2017-06-13

    An estimated quarter of a trillion US dollars is invested in the biomedical research enterprise annually. There is growing alarm that a significant portion of this investment is wasted because of problems in reproducibility of research findings and in the rigor and integrity of research conduct and reporting. Recent years have seen a flurry of activities focusing on standardization and guideline development to enhance the reproducibility and rigor of biomedical research. Research activity is primarily communicated via textual artifacts, ranging from grant applications to journal publications. These artifacts can be both the source and the manifestation of practices leading to research waste. For example, an article may describe a poorly designed experiment, or the authors may reach conclusions not supported by the evidence presented. In this article, we pose the question of whether biomedical text mining techniques can assist the stakeholders in the biomedical research enterprise in doing their part toward enhancing research integrity and rigor. In particular, we identify four key areas in which text mining techniques can make a significant contribution: plagiarism/fraud detection, ensuring adherence to reporting guidelines, managing information overload and accurate citation/enhanced bibliometrics. We review the existing methods and tools for specific tasks, if they exist, or discuss relevant research that can provide guidance for future work. With the exponential increase in biomedical research output and the ability of text mining approaches to perform automatic tasks at large scale, we propose that such approaches can support tools that promote responsible research practices, providing significant benefits for the biomedical research enterprise. Published by Oxford University Press 2017. This work is written by a US Government employee and is in the public domain in the US.

  17. Combining QSAR Modeling and Text-Mining Techniques to Link Chemical Structures and Carcinogenic Modes of Action.

    PubMed

    Papamokos, George; Silins, Ilona

    2016-01-01

    There is an increasing need for new reliable non-animal based methods to predict and test toxicity of chemicals. Quantitative structure-activity relationship (QSAR), a computer-based method linking chemical structures with biological activities, is used in predictive toxicology. In this study, we tested the approach to combine QSAR data with literature profiles of carcinogenic modes of action automatically generated by a text-mining tool. The aim was to generate data patterns to identify associations between chemical structures and biological mechanisms related to carcinogenesis. Using these two methods, individually and combined, we evaluated 96 rat carcinogens of the hematopoietic system, liver, lung, and skin. We found that skin and lung rat carcinogens were mainly mutagenic, while the group of carcinogens affecting the hematopoietic system and the liver also included a large proportion of non-mutagens. The automatic literature analysis showed that mutagenicity was a frequently reported endpoint in the literature of these carcinogens, however, less common endpoints such as immunosuppression and hormonal receptor-mediated effects were also found in connection with some of the carcinogens, results of potential importance for certain target organs. The combined approach, using QSAR and text-mining techniques, could be useful for identifying more detailed information on biological mechanisms and the relation with chemical structures. The method can be particularly useful in increasing the understanding of structure and activity relationships for non-mutagens.

  18. Combining QSAR Modeling and Text-Mining Techniques to Link Chemical Structures and Carcinogenic Modes of Action

    PubMed Central

    Papamokos, George; Silins, Ilona

    2016-01-01

    There is an increasing need for new reliable non-animal based methods to predict and test toxicity of chemicals. Quantitative structure-activity relationship (QSAR), a computer-based method linking chemical structures with biological activities, is used in predictive toxicology. In this study, we tested the approach to combine QSAR data with literature profiles of carcinogenic modes of action automatically generated by a text-mining tool. The aim was to generate data patterns to identify associations between chemical structures and biological mechanisms related to carcinogenesis. Using these two methods, individually and combined, we evaluated 96 rat carcinogens of the hematopoietic system, liver, lung, and skin. We found that skin and lung rat carcinogens were mainly mutagenic, while the group of carcinogens affecting the hematopoietic system and the liver also included a large proportion of non-mutagens. The automatic literature analysis showed that mutagenicity was a frequently reported endpoint in the literature of these carcinogens, however, less common endpoints such as immunosuppression and hormonal receptor-mediated effects were also found in connection with some of the carcinogens, results of potential importance for certain target organs. The combined approach, using QSAR and text-mining techniques, could be useful for identifying more detailed information on biological mechanisms and the relation with chemical structures. The method can be particularly useful in increasing the understanding of structure and activity relationships for non-mutagens. PMID:27625608

  19. The Application of LANDSAT Multi-Temporal Thermal Infrared Data to Identify Coal Fire in the Khanh Hoa Coal Mine, Thai Nguyen province, Vietnam

    NASA Astrophysics Data System (ADS)

    Trinh, Le Hung; Zablotskii, V. R.

    2017-12-01

    The Khanh Hoa coal mine is a surface coal mine in Thai Nguyen province, which is one of the largest deposits of coal in Vietnam. Numerous factors, such as improper mining techniques and policy as well as unauthorized mining, have caused surface and subsurface coal fires in this area. Coal fire is a dangerous phenomenon that seriously affects the environment by releasing toxic fumes and by causing forest fires and subsidence of surface infrastructure. This article presents a study on the application of LANDSAT multi-temporal thermal infrared images to the detection of coal fire. The results obtained in this study can be used to monitor fire zones so as to give warnings and solutions to prevent coal fire.
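Detecting coal fires from thermal infrared imagery usually starts by converting raw pixel values to at-sensor brightness temperature and then looking for unusually hot pixels. The sketch below shows that two-step idea under stated assumptions: the rescaling and thermal constants are those commonly published for Landsat 8 TIRS band 10 (the abstract does not say which sensors were used, and real values should be read from each scene's metadata file), and the mean-plus-k-standard-deviations threshold is a hypothetical rule, not the authors' method.

```python
import math
import statistics

# Assumed Landsat 8 TIRS band 10 coefficients; in practice, read these
# from the scene's MTL metadata rather than hard-coding them.
RADIANCE_MULT = 3.3420e-4   # multiplicative rescaling factor
RADIANCE_ADD = 0.1          # additive rescaling factor
K1 = 774.8853               # thermal conversion constant K1
K2 = 1321.0789              # thermal conversion constant K2 (kelvin)


def brightness_temperature_k(dn):
    """Convert a digital number to at-sensor brightness temperature in kelvin."""
    radiance = RADIANCE_MULT * dn + RADIANCE_ADD
    return K2 / math.log(K1 / radiance + 1.0)


def hot_pixels(temps_k, k=2.0):
    """Indices of pixels hotter than mean + k * standard deviation.

    The threshold rule is an illustrative assumption, not a published method.
    """
    mean = statistics.mean(temps_k)
    spread = statistics.pstdev(temps_k)
    return [i for i, t in enumerate(temps_k) if t > mean + k * spread]


# Hypothetical digital numbers for six pixels; the last one is much brighter.
dns = [30000, 30100, 29900, 30050, 29950, 36000]
temps = [brightness_temperature_k(dn) for dn in dns]
print([round(t, 1) for t in temps])
print("candidate fire pixels:", hot_pixels(temps))
```

Brightness temperature is not the same as land surface temperature; a fuller workflow would also correct for surface emissivity and atmosphere, and a multi-temporal analysis would compare the flagged zones across acquisition dates.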

  20. Mines, Quarries and Landscape. Visuality and Transformation

    NASA Astrophysics Data System (ADS)

    Jimeno, Carlos López; Torrijos, Ignacio Díez; González, Carmen Mataix

    2016-06-01

    This paper reviews two basic concepts, scenery and landscape integration, and proposes a new concept, "visuality", as an alternative to the classical "visibility" used in landscape studies related to mining activity; visuality explores the qualitative aspects that define the visual relationships between observer and environment. Regarding landscape integration studies, some reflections are offered on substantive issues that induce certain prejudices when addressing the landscape integration of mining operations, and some guidance and integration strategies are formulated. The second part of the text puts forward a new approach to the landscape integration of mines and quarries, closely linked to the concept of visuality and based on one basic goal: the re-qualification of the place. It gives innovative answers for re-qualifying the place and shows how to seize the opportunity offered by the deep transformation that mining activities generate. In conclusion, the last section presents a case study: the landscape integration study conducted on the Coto Pinos marble exploitations (Alicante, Spain), considered the largest ornamental rock quarry in Europe.