Sample records for context mining tool

  1. Data Mining in Child Welfare.

    ERIC Educational Resources Information Center

    Schoech, Dick; Quinn, Andrew; Rycraft, Joan R.

    2000-01-01

    Examines the historical and larger context of data mining and describes data mining processes, techniques, and tools. Illustrates these using a child welfare dataset concerning the employee turnover that is mined, using logistic regression and a Bayesian neural network. Discusses the data mining process, the resulting models, their predictive…

  2. Data Mining and Knowledge Management in Higher Education - Potential Applications.

    ERIC Educational Resources Information Center

    Luan, Jing

    This paper introduces a new decision support tool, data mining, in the context of knowledge management. The most striking features of data mining techniques are clustering and prediction. The clustering aspect of data mining offers comprehensive characteristics analysis of students, while the predicting function estimates the likelihood for a…

  3. A Visualization Tool for Integrating Research Results at an Underground Mine

    NASA Astrophysics Data System (ADS)

    Boltz, S.; Macdonald, B. D.; Orr, T.; Johnson, W.; Benton, D. J.

    2016-12-01

    Researchers with the National Institute for Occupational Safety and Health are conducting research at a deep, underground metal mine in Idaho to develop improvements in ground control technologies that reduce the effects of dynamic loading on mine workings, thereby decreasing the risk to miners. This research is multifaceted and includes: photogrammetry, microseismic monitoring, geotechnical instrumentation, and numerical modeling. When managing research involving such a wide range of data, understanding how the data relate to each other and to the mining activity quickly becomes a daunting task. In an effort to combine this diverse research data into a single, easy-to-use system, a three-dimensional visualization tool was developed. The tool was created using the Unity3d video gaming engine and includes the mine development entries, production stopes, important geologic structures, and user-input research data. The tool provides the user with a first-person, interactive experience where they are able to walk through the mine as well as navigate the rock mass surrounding the mine to view and interpret the imported data in the context of the mine and as a function of time. The tool was developed using data from a single mine; however, it is intended to be a generic tool that can be easily extended to other mines. For example, a similar visualization tool is being developed for an underground coal mine in Colorado. The ultimate goal is for NIOSH researchers and mine personnel to be able to use the visualization tool to identify trends that may not otherwise be apparent when viewing the data separately. This presentation highlights the features and capabilities of the mine visualization tool and explains how it may be used to more effectively interpret data and reduce the risk of ground fall hazards to underground miners.

  4. Data mining in child welfare.

    PubMed

    Schoech, D; Quinn, A; Rycraft, J R

    2000-01-01

    Data mining is the sifting through of voluminous data to extract knowledge for decision making. This article illustrates the context, concepts, processes, techniques, and tools of data mining, using statistical and neural network analyses on a dataset concerning employee turnover. The resulting models and their predictive capability, advantages and disadvantages, and implications for decision support are highlighted.

  5. Environmental Decision Making on Acid Mine Drainage Issues in South Africa: An Argument for the Precautionary Principle.

    PubMed

    Morodi, T J; Mpofu, Charles

    2017-06-28

    This paper examines the issue of acid mine drainage in South Africa and environmental decision making processes that could be taken to mitigate the problem in the context of both conventional risk assessment and the precautionary principle. It is argued that conventional risk assessment protects the status quo and hence cannot be entirely relied upon as an effective tool to resolve environmental problems in the context of South Africa, a developing country with complex environmental health concerns. The complexity of the environmental issues is discussed from historical and political perspectives. An argument is subsequently made that the precautionary principle is an alternative tool, and its adoption can be used to empower local communities. This work, therefore, adds to new knowledge by problematising conventional risk assessment and proposing the framing of the acid mine drainage issues in a complex and contextual scenario of a developing country, South Africa.

  6. Definition of redox and pH influence in the AMD mine system using a fuzzy qualitative tool (Iberian Pyrite Belt, SW Spain).

    PubMed

    de la Torre, M L; Grande, J A; Valente, T; Perez-Ostalé, E; Santisteban, M; Aroba, J; Ramos, I

    2016-03-01

    Poderosa Mine is an abandoned pyrite mine, located in the Iberian Pyrite Belt, which pours its acid mine drainage (AMD) waters into the Odiel river (South-West Spain). This work focuses on establishing possible reasons for interdependence between the redox potential and pH and the load of metals and sulfates, as well as a set of variables that define the physical chemistry of the water (conductivity, temperature, TDS, and dissolved oxygen) transported by a channel from Poderosa mine affected by acid mine drainage, through the use of artificial intelligence techniques: fuzzy logic and data mining. The sampling campaign was carried out in May 2012. There were a total of 16 sites, the first inside the tunnel and the last at the mouth of the river Odiel, with a distance of approximately 10 m between each pair of measuring stations. While the tools of classical statistics, which are widely used in this context, prove useful for defining proximity ratios between variables based on Pearson's correlations, in addition to making it easier to handle large volumes of data and producing easier-to-understand graphs, the use of fuzzy logic tools and data mining results in better definition of the variations produced by external stimuli on the set of variables. This tool is adaptable and can be extrapolated to any system polluted by acid mine drainage using simple, intuitive reasoning.

  7. Mutation extraction tools can be combined for robust recognition of genetic variants in the literature

    PubMed Central

    Jimeno Yepes, Antonio; Verspoor, Karin

    2014-01-01

    As the cost of genomic sequencing continues to fall, the amount of data being collected and studied for the purpose of understanding the genetic basis of disease is increasing dramatically. Much of the source information relevant to such efforts is available only from unstructured sources such as the scientific literature, and significant resources are expended in manually curating and structuring the information in the literature. As such, there have been a number of systems developed to target automatic extraction of mutations and other genetic variation from the literature using text mining tools. We have performed a broad survey of the existing publicly available tools for extraction of genetic variants from the scientific literature. We consider not just one tool but a number of different tools, individually and in combination, and apply the tools in two scenarios. First, they are compared in an intrinsic evaluation context, where the tools are tested for their ability to identify specific mentions of genetic variants in a corpus of manually annotated papers, the Variome corpus. Second, they are compared in an extrinsic evaluation context based on our previous study of text mining support for curation of the COSMIC and InSiGHT databases. Our results demonstrate that no single tool covers the full range of genetic variants mentioned in the literature. Rather, several tools have complementary coverage and can be used together effectively. In the intrinsic evaluation on the Variome corpus, the combined performance is above 0.95 in F-measure, while in the extrinsic evaluation the combined recall performance is above 0.71 for COSMIC and above 0.62 for InSiGHT, a substantial improvement over the performance of any individual tool. Based on the analysis of these results, we suggest several directions for the improvement of text mining tools for genetic variant extraction from the literature. PMID:25285203
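    The combination result reported above (pooling several extraction tools and scoring the pooled output by recall and F-measure) can be illustrated with a minimal sketch. The variant strings, the two tool outputs, and the function names below are hypothetical and are not taken from the Variome, COSMIC, or InSiGHT evaluations; the sketch only shows how a union of tool outputs is scored against a gold annotation set.

```python
def precision_recall_f1(predicted, gold):
    """Score a set of predicted mentions against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def combine(*tool_outputs):
    """Pool the outputs of several tools by taking their set union."""
    return set().union(*tool_outputs)


# Hypothetical gold annotations and tool outputs, for illustration only.
gold = {"V600E", "c.76A>T", "rs12345", "p.Gly12Asp"}
tool_a = {"V600E", "p.Gly12Asp"}               # precise, but misses half
tool_b = {"c.76A>T", "rs12345", "rs99999"}     # complementary, one error

for name, output in [("tool_a", tool_a), ("tool_b", tool_b),
                     ("combined", combine(tool_a, tool_b))]:
    p, r, f = precision_recall_f1(output, gold)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

    In this toy case the union recovers every gold mention at the cost of one false positive, which is the kind of complementary coverage the abstract describes.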

  8. Data mining through simulation.

    PubMed

    Lytton, William W; Stewart, Mark

    2007-01-01

    Data integration is particularly difficult in neuroscience; we must organize vast amounts of data around only a few fragmentary functional hypotheses. It has often been noted that computer simulation, by providing explicit hypotheses for a particular system and bridging across different levels of organization, can provide an organizational focus, which can be leveraged to form substantive hypotheses. Simulations lend meaning to data and can be updated and adapted as further data come in. The use of simulation in this context suggests the need for simulator adjuncts to manage and evaluate data. We have developed a neural query system (NQS) within the NEURON simulator, providing a relational database system, a query function, and basic data-mining tools. NQS is used within the simulation context to manage, verify, and evaluate model parameterizations. More importantly, it is used for data mining of simulation data and comparison with neurophysiology.

  9. Facilitating knowledge discovery and visualization through mining contextual data from published studies: lessons from JournalMap

    USDA-ARS's Scientific Manuscript database

    Valuable information on the location and context of ecological studies is locked up in publications in myriad formats that are not easily machine readable. This presents significant challenges to building geographic-based tools to search for and visualize sources of ecological knowledge. JournalMap...

  10. A Decision Support System for Predicting Students' Performance

    ERIC Educational Resources Information Center

    Livieris, Ioannis E.; Mikropoulos, Tassos A.; Pintelas, Panagiotis

    2016-01-01

    Educational data mining is an emerging research field concerned with developing methods for exploring the unique types of data that come from educational context. These data allow the educational stakeholders to discover new, interesting and valuable knowledge about students. In this paper, we present a new user-friendly decision support tool for…

  11. On the unsupervised analysis of domain-specific Chinese texts

    PubMed Central

    Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.

    2016-01-01

    With the growing availability of digitized text data both publicly and privately, there is a great need for effective computational tools to automatically extract information from texts. Because the Chinese language differs most significantly from alphabet-based languages in not specifying word boundaries, most existing Chinese text-mining methods require a prespecified vocabulary and/or a large relevant training corpus, which may not be available in some applications. We introduce an unsupervised method, top-down word discovery and segmentation (TopWORDS), for simultaneously discovering and segmenting words and phrases from large volumes of unstructured Chinese texts, and propose ways to order discovered words and conduct higher-level context analyses. TopWORDS is particularly useful for mining online and domain-specific texts where the underlying vocabulary is unknown or the texts of interest differ significantly from available training corpora. When outputs from TopWORDS are fed into context analysis tools such as topic modeling, word embedding, and association pattern finding, the results are as good as or better than that from using outputs of a supervised segmentation method. PMID:27185919

  12. Data mining and visualization from planetary missions: the VESPA-Europlanet2020 activity

    NASA Astrophysics Data System (ADS)

    Longobardo, Andrea; Capria, Maria Teresa; Zinzi, Angelo; Ivanovski, Stavro; Giardino, Marco; di Persio, Giuseppe; Fonte, Sergio; Palomba, Ernesto; Antonelli, Lucio Angelo; Fonte, Sergio; Giommi, Paolo; Europlanet VESPA 2020 Team

    2017-06-01

    This paper presents the VESPA (Virtual European Solar and Planetary Access) activity, developed in the context of the Europlanet 2020 Horizon project, aimed at providing tools for analysis and visualization of planetary data provided by space missions. In particular, the activity is focused on minor bodies of the Solar System. The structure of the computation node, the algorithms developed for analysis of planetary surfaces and cometary comae and the tools for data visualization are presented.

  13. GEOGLE: context mining tool for the correlation between gene expression and the phenotypic distinction.

    PubMed

    Yu, Yao; Tu, Kang; Zheng, Siyuan; Li, Yun; Ding, Guohui; Ping, Jie; Hao, Pei; Li, Yixue

    2009-08-25

    In the post-genomic era, the development of high-throughput gene expression detection technology provides huge amounts of experimental data, which challenges the traditional pipelines for data processing and analysis in scientific research. In our work, we integrated gene expression information from Gene Expression Omnibus (GEO), biomedical ontology from Medical Subject Headings (MeSH) and signaling pathway knowledge from sigPathway entries to develop a context mining tool for gene expression analysis - GEOGLE. GEOGLE offers a rapid and convenient way of searching for relevant experimental datasets, pathways and biological terms according to multiple types of queries, including biomedical vocabularies, GDS IDs, gene IDs, pathway names and signature lists. Moreover, GEOGLE summarizes the signature genes from a subset of GDSes and estimates the correlation between gene expression and the phenotypic distinction with an integrated p value. This approach, which performs a global search of expression data, may expand the traditional way of collecting heterogeneous gene expression experiment data. GEOGLE is a novel tool that provides researchers with a quantitative way to understand the correlation between gene expression and phenotypic distinction through meta-analysis of gene expression datasets from different experiments, as well as the biological meaning behind it. The web site and user guide of GEOGLE are available at: http://omics.biosino.org:14000/kweb/workflow.jsp?id=00020.

  14. Genomics Portals: integrative web-platform for mining genomics data.

    PubMed

    Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

    2010-01-13

    A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  15. Genomics Portals: integrative web-platform for mining genomics data

    PubMed Central

    2010-01-01

    Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org. PMID:20070909

  16. Analysis of Occupational Accidents in Underground and Surface Mining in Spain Using Data-Mining Techniques.

    PubMed

    Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M; Anticoi, Hernán Francisco; Guash, Eduard

    2018-03-07

    An analysis of occupational accidents in the mining sector was conducted using the data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data was processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining its characteristics and context. This study exposes the 20 most important association rules in the sector, either surface or underground mining, based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident are different between the two scenarios. Data-mining techniques were chosen as a useful tool to find out the root cause of the accidents.
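    The ranking of association rules by confidence mentioned above can be illustrated with a small sketch. The accident records and attribute names below are invented for illustration and do not come from the Spanish accident database, and the code is plain Python rather than Weka. It uses the standard definitions: support is the fraction of records containing an itemset, and the confidence of a rule A -> C is support(A and C) divided by support(A).

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, antecedent | consequent) / base


# Hypothetical accident records, each a set of attribute=value items.
records = [
    {"site=underground", "cause=overexertion", "type=overexertion"},
    {"site=underground", "cause=overexertion", "type=overexertion"},
    {"site=underground", "cause=fall", "type=fall"},
    {"site=surface", "cause=overexertion", "type=overexertion"},
    {"site=surface", "cause=vehicle", "type=collision"},
]

rule_confidence = confidence(records, {"site=underground"}, {"cause=overexertion"})
print(f"underground -> overexertion: confidence {rule_confidence:.2f}")
```

    A rule miner such as Apriori enumerates candidate rules above a minimum support and then sorts them by a metric like this one; the sketch only evaluates a single hand-picked rule.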

  17. Advances in Machine Learning and Data Mining for Astronomy

    NASA Astrophysics Data System (ADS)

    Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.

    2012-03-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

  18. Incorporating ecosystem services into environmental management of deep-seabed mining

    NASA Astrophysics Data System (ADS)

    Le, Jennifer T.; Levin, Lisa A.; Carson, Richard T.

    2017-03-01

    Accelerated exploration of minerals in the deep sea over the past decade has raised the likelihood that commercial mining of the deep seabed will commence in the near future. Environmental concerns create a growing urgency for development of environmental regulations under commercial exploitation. Here, we consider an ecosystem services approach to the environmental policy and management of deep-sea mineral resources. Ecosystem services link the environment and human well-being, and can help improve sustainability and stewardship of the deep sea by providing a quantitative basis for decision-making. This paper briefly reviews ecosystem services provided by habitats targeted for deep-seabed mining (hydrothermal vents, seamounts, nodule provinces, and phosphate-rich margins), and presents practical steps to incorporate ecosystem services into deep-seabed mining regulation. The linkages and translation between ecosystem structure, ecological function (including supporting services), and ecosystem services are highlighted as generating human benefits. We consider criteria for identifying which ecosystem services are vulnerable to potential mining impacts, the role of ecological functions in providing ecosystem services, development of ecosystem service indicators, valuation of ecosystem services, and implementation of ecosystem services concepts. The first three steps put ecosystem services into a deep-seabed mining context; the last two steps help to incorporate ecosystem services into a management and decision-making framework. Phases of environmental planning discussed in the context of ecosystem services include conducting strategic environmental assessments, collecting baseline data, monitoring, establishing marine protected areas, assessing cumulative impacts, identifying thresholds and triggers, and creating an environmental damage compensation regime. We also identify knowledge gaps that need to be addressed in order to operationalize ecosystem services concepts in deep-seabed mining regulation and propose potential tools to fill them.

  19. Environmental management in North American mining sector.

    PubMed

    Asif, Zunaira; Chen, Zhi

    2016-01-01

    This paper reviews the environmental issues and management practices in the mining sector in North America. Sustainable waste management is recognized as one of the most serious environmental concerns in the mining industry. For mining activities, it is no surprise that metal recovery reagents and acid effluents are a threat to the ecosystem as well as a hazard to human health. In addition, poor air quality and ventilation in underground mines can lead to occupational illness and death of workers. Electricity usage and fuel consumption are major factors that contribute to greenhouse gases. On the other hand, many sustainability challenges are faced in the management of tailings and disposal of waste rock. This paper aims to highlight the problems that arise due to poor air quality and acid mine drainage. The paper also addresses some of the advantages and limitations of tailing and waste rock management that still have to be studied in the context of the mining sector. This paper suggests that implementing suitable environmental management tools such as life cycle assessment (LCA), cleaner production technologies (CPTs), and multicriteria decision analysis (MCD) is important, as it ultimately leads to improved environmental performance and enables a mine to focus on the next stage of sustainability.

  20. Collaborative Data Mining Tool for Education

    ERIC Educational Resources Information Center

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; Gea, Miguel; de Castro, Carlos

    2009-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the continuous improvement of e-learning courses, allowing teachers with similar course profiles to share and score the discovered information. This mining tool is intended for use by instructors who are not experts in data mining such that, its…

  1. Analysis of Occupational Accidents in Underground and Surface Mining in Spain Using Data-Mining Techniques

    PubMed Central

    Sanmiquel, Lluís; Bascompta, Marc; Rossell, Josep M.; Anticoi, Hernán Francisco; Guash, Eduard

    2018-01-01

    An analysis of occupational accidents in the mining sector was conducted using the data from the Spanish Ministry of Employment and Social Safety between 2005 and 2015, and data-mining techniques were applied. Data was processed with the software Weka. Two scenarios were chosen from the accidents database: surface and underground mining. The most important variables involved in occupational accidents and their association rules were determined. These rules are composed of several predictor variables that cause accidents, defining its characteristics and context. This study exposes the 20 most important association rules in the sector—either surface or underground mining—based on the statistical confidence levels of each rule as obtained by Weka. The outcomes display the most typical immediate causes, along with the percentage of accidents with a basis in each association rule. The most important immediate cause is body movement with physical effort or overexertion, and the type of accident is physical effort or overexertion. On the other hand, the second most important immediate cause and type of accident are different between the two scenarios. Data-mining techniques were chosen as a useful tool to find out the root cause of the accidents. PMID:29518921

  2. A Collaborative Educational Association Rule Mining Tool

    ERIC Educational Resources Information Center

    Garcia, Enrique; Romero, Cristobal; Ventura, Sebastian; de Castro, Carlos

    2011-01-01

    This paper describes a collaborative educational data mining tool based on association rule mining for the ongoing improvement of e-learning courses and allowing teachers with similar course profiles to share and score the discovered information. The mining tool is oriented to be used by non-expert instructors in data mining so its internal…

  3. Mining and integration of pathway diagrams from imaging data.

    PubMed

    Kozhenkov, Sergey; Baitaluk, Michael

    2012-03-01

    Pathway diagrams from PubMed and World Wide Web (WWW) contain valuable highly curated information difficult to reach without tools specifically designed and customized for the biological semantics and high-content density of the images. There is currently no search engine or tool that can analyze pathway images, extract their pathway components (molecules, genes, proteins, organelles, cells, organs, etc.) and indicate their relationships. Here, we describe a resource of pathway diagrams retrieved from article and web-page images through optical character recognition, in conjunction with data mining and data integration methods. The recognized pathways are integrated into the BiologicalNetworks research environment linking them to a wealth of data available in the BiologicalNetworks' knowledgebase, which integrates data from >100 public data sources and the biomedical literature. Multiple search and analytical tools are available that allow the recognized cellular pathways, molecular networks and cell/tissue/organ diagrams to be studied in the context of integrated knowledge, experimental data and the literature. BiologicalNetworks software and the pathway repository are freely available at www.biologicalnetworks.org. Supplementary data are available at Bioinformatics online.

  4. Temporal data mining for the quality assessment of hemodialysis services.

    PubMed

    Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto

    2005-05-01

    This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for the assessment of the clinical performance of hemodialysis (HD) services, on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge on the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. Such methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters and the measured variables were examined by the domain experts, who were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians to improve their understanding of the patients' behavior.

  5. Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature.

    PubMed

    Müller, H-M; Van Auken, K M; Li, Y; Sternberg, P W

    2018-03-09

    The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. Textpresso Central URL: http://www.textpresso.org/tpc.

  6. A sentence sliding window approach to extract protein annotations from biomedical articles

    PubMed Central

    Krallinger, Martin; Padron, Maria; Valencia, Alfonso

    2005-01-01

    Background: Within the emerging field of text mining and statistical natural language processing (NLP) applied to biomedical articles, a broad variety of techniques have been developed during the past years. Nevertheless, there is still a great need for comparative assessment of the performance of the proposed methods and for the development of common evaluation criteria. This issue was addressed by the Critical Assessment of Text Mining Methods in Molecular Biology (BioCreative) contest. The aim of this contest was to assess the performance of text mining systems applied to biomedical texts including tools which recognize named entities such as genes and proteins, and tools which automatically extract protein annotations. Results: The "sentence sliding window" approach proposed here was found to efficiently extract text fragments from full text articles containing annotations on proteins, providing the highest number of correctly predicted annotations. Moreover, the number of correct extractions of individual entities (i.e. proteins and GO terms) involved in the relationships used for the annotations was significantly higher than the correct extractions of the complete annotations (protein-function relations). Conclusion: We explored the use of averaging sentence sliding windows for information extraction, especially in a context where conventional training data is unavailable. The combination of our approach with more refined statistical estimators and machine learning techniques might be a way to improve annotation extraction for future biomedical text mining applications. PMID:15960831
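
    The abstract gives no implementation details, so the following is only a minimal sketch of what an averaging sentence sliding window could look like. The naive sentence splitter, the term-counting score, and the names split_sentences, sentence_score and best_window are all assumptions made for illustration, not the authors' method.

```python
import re
from typing import List, Sequence, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_score(sentence: str, terms: Sequence[str]) -> int:
    """Count case-insensitive occurrences of any term in one sentence."""
    low = sentence.lower()
    return sum(low.count(term.lower()) for term in terms)


def best_window(text: str,
                protein_terms: Sequence[str],
                function_terms: Sequence[str],
                size: int = 3) -> Tuple[float, str]:
    """Slide a window of `size` sentences over the text and return the
    window with the highest average per-sentence score. A window only
    qualifies if it mentions both a protein term and a function term."""
    sentences = split_sentences(text)
    best_score, best_text = 0.0, ""
    last_start = max(len(sentences) - size, 0)
    for start in range(last_start + 1):
        window = sentences[start:start + size]
        protein_hits = sum(sentence_score(s, protein_terms) for s in window)
        function_hits = sum(sentence_score(s, function_terms) for s in window)
        if protein_hits == 0 or function_hits == 0:
            continue  # no protein-function pair can be supported here
        score = (protein_hits + function_hits) / len(window)
        if score > best_score:
            best_score, best_text = score, " ".join(window)
    return best_score, best_text
```

    Averaging over the window, rather than scoring single sentences, lets a protein name in one sentence support a function term in a neighbouring one, which is the situation a sliding window is meant to capture.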

  7. PubRunner: A light-weight framework for updating text mining results.

    PubMed

    Anekalla, Kishore R; Courneya, J P; Fiorini, Nicolas; Lever, Jake; Muchow, Michael; Busby, Ben

    2017-01-01

    Biomedical text mining promises to assist biologists in quickly navigating the combined knowledge in their domain. This would allow improved understanding of the complex interactions within biological systems and faster hypothesis generation. New biomedical research articles are published daily and text mining tools are only as good as the corpus from which they work. Many text mining tools are underused because their results are static and do not reflect the constantly expanding knowledge in the field. In order for biomedical text mining to become an indispensable tool used by researchers, this problem must be addressed. To this end, we present PubRunner, a framework for regularly running text mining tools on the latest publications. PubRunner is lightweight, simple to use, and can be integrated with an existing text mining tool. The workflow involves downloading the latest abstracts from PubMed, executing a user-defined tool, pushing the resulting data to a public FTP or Zenodo dataset, and publicizing the location of these results on the public PubRunner website. We illustrate the use of this tool by re-running the commonly used word2vec tool on the latest PubMed abstracts to generate up-to-date word vector representations for the biomedical domain. This shows a proof of concept that we hope will encourage text mining developers to build tools that truly will aid biologists in exploring the latest publications.

  8. Software tool for data mining and its applications

    NASA Astrophysics Data System (ADS)

    Yang, Jie; Ye, Chenzhou; Chen, Nianyi

    2002-03-01

    A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), and computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, and visualization. The principle and knowledge representation of some function models of data mining are described. The software tool is implemented in Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool has been satisfactorily applied in the prediction of regularities of the formation of ternary intermetallic compounds in alloy systems, and in the diagnosis of brain glioma.

  9. Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

    PubMed Central

    Skrzypek, Marek S.; Hirschman, Jodi

    2011-01-01

    Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739

  10. Computational tools for exploring sequence databases as a resource for antimicrobial peptides.

    PubMed

    Porto, W F; Pires, A S; Franco, O L

    Data mining has been recognized by many researchers as a hot topic in different areas. In the post-genomic era, the growing number of sequences deposited in databases has been the reason why these databases have become a resource for novel biological information. In recent years, the identification of antimicrobial peptides (AMPs) in databases has gained attention. The identification of unannotated AMPs has shed some light on the distribution and evolution of AMPs and, in some cases, indicated suitable candidates for developing novel antimicrobial agents. The data mining process has been performed mainly by local alignments and/or regular expressions. Nevertheless, for the identification of distant homologous sequences, other techniques such as antimicrobial activity prediction and molecular modelling are required. In this context, this review addresses the tools and techniques, and also their limitations, for mining AMPs from databases. These methods could be helpful not only for the development of novel AMPs, but also for other kinds of proteins, at a higher level of structural genomics. Moreover, solving the problem of unannotated proteins could bring immeasurable benefits to society, especially in the case of AMPs, which could be helpful for developing novel antimicrobial agents and combating resistant bacteria. Copyright © 2017 Elsevier Inc. All rights reserved.
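
    As a rough illustration of the regular-expression style of mining mentioned above, the sketch below scans short protein sequences for a cysteine-spacing pattern. The pattern, the length cutoff and the function names are hypothetical and chosen only for demonstration; they are not a validated AMP signature and do not come from the review.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Illustrative pattern only (assumption): three cysteines with loosely
# constrained spacing. Real AMP families need curated signatures.
CYS_PATTERN = re.compile(r"C[A-Z]{2,4}C[A-Z]{3,9}C")


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # identifier is the first token
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def find_candidates(records: Dict[str, str],
                    pattern: Pattern[str] = CYS_PATTERN,
                    max_length: int = 100) -> List[Tuple[str, int, str]]:
    """Return (identifier, match start, matched text) for each short
    sequence that contains the pattern at least once."""
    hits = []
    for name, seq in records.items():
        if len(seq) > max_length:
            continue  # keep only peptide-sized sequences
        match = pattern.search(seq)
        if match:
            hits.append((name, match.start(), match.group()))
    return hits
```

    A pattern search of this kind only finds sequences that resemble known motifs, which is why the review points to activity prediction and molecular modelling for distant homologues.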

  11. Data mining techniques for scientific computing: Application to asymptotic paraxial approximations to model ultrarelativistic particles

    NASA Astrophysics Data System (ADS)

    Assous, Franck; Chaskalovic, Joël

    2011-06-01

    We propose a new approach that consists in using data mining techniques for scientific computing. Indeed, data mining has proved to be efficient in other contexts which deal with huge data, such as biology, medicine, marketing, advertising and communications. Our aim here is to deal with the important problem of exploiting the results produced by any numerical method. Indeed, more and more data are created today by numerical simulations. Thus, it seems necessary to look for efficient tools to analyze them. In this work, we focus our presentation on a test case dedicated to an asymptotic paraxial approximation to model ultrarelativistic particles. Our method deals directly with the numerical results of simulations and tries to understand what each order of the asymptotic expansion brings to the simulation results over what could be obtained by other lower-order or less accurate means. This new heuristic approach offers new potential applications to treat numerical solutions to mathematical models.

  12. Data Mining and Knowledge Discovery tools for exploiting big Earth-Observation data

    NASA Astrophysics Data System (ADS)

    Espinoza Molina, D.; Datcu, M.

    2015-04-01

    The continuous increase in the size of the archives and in the variety and complexity of Earth-Observation (EO) sensors requires new methodologies and tools that allow the end-user to access a large image repository, to extract and to infer knowledge about the patterns hidden in the images, to retrieve dynamically a collection of relevant images, and to support the creation of emerging applications (e.g., change detection, global monitoring, disaster and risk management, image time series, etc.). In this context, we are concerned with providing a platform for data mining and knowledge discovery content from EO archives. The platform's goal is to implement a communication channel between Payload Ground Segments and the end-user who receives the content of the data coded in an understandable format associated with semantics that is ready for immediate exploitation. It will provide the user with automated tools to explore and understand the content of highly complex image archives. The challenge lies in the extraction of meaningful information and understanding observations of large extended areas, over long periods of time, with a broad variety of EO imaging sensors in synergy with other related measurements and data. The platform is composed of several components such as (1) ingestion of EO images and related data providing basic features for image analysis, (2) query engine based on metadata, semantics and image content, (3) data mining and knowledge discovery tools for supporting the interpretation and understanding of image content, and (4) semantic definition of the image content via machine learning methods. All these components are integrated and supported by a relational database management system, ensuring the integrity and consistency of Terabytes of Earth Observation data.

  13. Exploring context and content links in social media: a latent space method.

    PubMed

    Qi, Guo-Jun; Aggarwal, Charu; Tian, Qi; Ji, Heng; Huang, Thomas S

    2012-05-01

    Social media networks contain both content and context-specific information. Most existing methods work with either of the two for the purpose of multimedia mining and retrieval. In reality, both content and context information are rich sources of information for mining, and the full power of mining and processing algorithms can be realized only with the use of a combination of the two. This paper proposes a new algorithm which mines both context and content links in social media networks to discover the underlying latent semantic space. This mapping of the multimedia objects into latent feature vectors enables the use of any off-the-shelf multimedia retrieval algorithms. Compared to the state-of-the-art latent methods in multimedia analysis, this algorithm effectively solves the problem of sparse context links by mining the geometric structure underlying the content links between multimedia objects. Specifically for multimedia annotation, we show that an effective algorithm can be developed to directly construct annotation models by simultaneously leveraging both context and content information based on latent structure between correlated semantic concepts. We conduct experiments on the Flickr data set, which contains user tags linked with images. We illustrate the advantages of our approach over the state-of-the-art multimedia retrieval techniques.

  14. New Trends in E-Science: Machine Learning and Knowledge Discovery in Databases

    NASA Astrophysics Data System (ADS)

    Brescia, Massimo

    2012-11-01

    Data mining, or Knowledge Discovery in Databases (KDD), while being the main methodology to extract the scientific information contained in Massive Data Sets (MDS), needs to tackle crucial problems since it has to orchestrate complex challenges posed by transparent access to different computing environments, scalability of algorithms, reusability of resources. To achieve a leap forward for the progress of e-science in the data avalanche era, the community needs to implement an infrastructure capable of performing data access, processing and mining in a distributed but integrated context. The increasing complexity of modern technologies carried out a huge production of data, whose related warehouse management and the need to optimize analysis and mining procedures lead to a change in concept on modern science. Classical data exploration, based on local user own data storage and limited computing infrastructures, is no more efficient in the case of MDS, worldwide spread over inhomogeneous data centres and requiring teraflop processing power. In this context modern experimental and observational science requires a good understanding of computer science, network infrastructures, Data Mining, etc. i.e. of all those techniques which fall into the domain of the so called e-science (recently assessed also by the Fourth Paradigm of Science). Such understanding is almost completely absent in the older generations of scientists and this reflects in the inadequacy of most academic and research programs. A paradigm shift is needed: statistical pattern recognition, object oriented programming, distributed computing, parallel programming need to become an essential part of scientific background. A possible practical solution is to provide the research community with easy-to understand, easy-to-use tools, based on the Web 2.0 technologies and Machine Learning methodology. 
These would be tools in which almost all the complexity is hidden from the final user, but which are still flexible and able to produce efficient and reliable scientific results. All these considerations will be described in detail in the chapter. Moreover, examples of modern applications that offer a wide variety of e-science communities a large spectrum of computational facilities for exploiting the wealth of available massive data sets, together with powerful machine learning and statistical algorithms, will also be introduced.

  15. Two modelling approaches to water-quality simulation in a flooded iron-ore mine (Saizerais, Lorraine, France): a semi-distributed chemical reactor model and a physically based distributed reactive transport pipe network model.

    PubMed

    Hamm, V; Collon-Drouaillet, P; Fabriol, R

    2008-02-19

    The flooding of abandoned mines in the Lorraine Iron Basin (LIB) over the past 25 years has degraded the quality of the groundwater tapped for drinking water. High concentrations of dissolved sulphate have made the water unsuitable for human consumption. This problematic issue has led to the development of numerical tools to support water-resource management in mining contexts. Here we examine two modelling approaches using different numerical tools that we tested on the Saizerais flooded iron-ore mine (Lorraine, France). A first approach considers the Saizerais Mine as a network of two chemical reactors (NCR). The second approach is based on a physically distributed pipe network model (PNM) built with EPANET 2 software. This approach considers the mine as a network of pipes defined by their geometric and chemical parameters. Each reactor in the NCR model includes a detailed chemical model built to simulate quality evolution in the flooded mine water. However, in order to obtain a robust PNM, we simplified the detailed chemical model into a specific sulphate dissolution-precipitation model that is included as sulphate source/sink in both a NCR model and a pipe network model. Both the NCR model and the PNM, based on different numerical techniques, give good post-calibration agreement between the simulated and measured sulphate concentrations in the drinking-water well and overflow drift. The NCR model incorporating the detailed chemical model is useful when a detailed chemical behaviour at the overflow is needed. The PNM incorporating the simplified sulphate dissolution-precipitation model provides better information of the physics controlling the effect of flow and low flow zones, and the time of solid sulphate removal whereas the NCR model will underestimate clean-up time due to the complete mixing assumption. 
In conclusion, the detailed NCR model gives a first assessment of chemical processes at overflow, and the PNM then provides more detailed information on flow and chemical behaviour (dissolved sulphate concentrations, remaining mass of solid sulphate) in the network. Nevertheless, both modelling methods require hydrological and chemical parameters (recharge flow rate, outflows, volume of mine voids, mass of solids, kinetic constants of the dissolution-precipitation reactions), which are commonly not available for a mine and therefore call for calibration data.
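
    To make the complete mixing assumption concrete, the sketch below uses the textbook washout of a single, ideally mixed reactor with no internal sulphate source, C(t) = C_in + (C_0 - C_in) exp(-Q t / V). This is a deliberate simplification and not the NCR or PNM model of the paper: it ignores dissolution and precipitation of solid sulphate, and all parameter names and values are hypothetical.

```python
import math


def concentration(t: float, c0: float, c_in: float,
                  flow: float, volume: float) -> float:
    """Outlet concentration of an ideally mixed reactor at time t.

    c0     initial concentration in the reactor
    c_in   concentration of the recharge water
    flow   through-flow Q (volume per unit time)
    volume water volume V of the reactor
    """
    return c_in + (c0 - c_in) * math.exp(-flow * t / volume)


def time_to_threshold(c_target: float, c0: float, c_in: float,
                      flow: float, volume: float) -> float:
    """Time needed for the outlet concentration to fall to c_target."""
    if not (c_in < c_target < c0):
        raise ValueError("c_target must lie between c_in and c0")
    return (volume / flow) * math.log((c0 - c_in) / (c_target - c_in))
```

    Because every parcel of water is assumed to mix instantly with the whole volume, this kind of model has no slow, poorly flushed zones, which is consistent with the abstract's remark that a completely mixed description tends to underestimate clean-up time.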

  16. Open-source tools for data mining.

    PubMed

    Zupan, Blaz; Demsar, Janez

    2008-03-01

    With a growing volume of biomedical databases and repositories, the need to develop a set of tools to address their analysis and support knowledge discovery is becoming acute. The data mining community has developed a substantial set of techniques for computational treatment of these data. In this article, we discuss the evolution of open-source toolboxes that data mining researchers and enthusiasts have developed over the span of a few decades and review several currently available open-source data mining suites. The approaches we review are diverse in data mining methods and user interfaces and also demonstrate that the field and its tools are ready to be fully exploited in biomedical research.

  17. DEXTER: Disease-Expression Relation Extraction from Text.

    PubMed

    Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

    2018-01-01

    Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER) to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. 
Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNA in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.

  18. Hydrochemical characterization of a river affected by acid mine drainage in the Iberian Pyrite Belt.

    PubMed

    Grande, J A; Santisteban, M; Valente, T; de la Torre, M L; Gomes, P

    2017-06-01

    This paper addresses the modelling of the processes associated with acid mine drainage affecting the Trimpancho River basin, chosen for this purpose because of its location and paradigmatic hydrological, geological, mining and environmental contexts. By using physical-chemical indicators it is possible to define the contamination degree of the system from the perspective of an entire river basin, due to its reduced dimension. This allows an exhaustive monitoring of the study area, considering the particularity that the stream flows directly into a water dam used for human supply. With such a perspective, and in order to find global solutions, the present study seeks to develop methodologies and tools for expeditious and accurate diagnosis of the pollution level of the affected stream that feeds the water reservoir. The implemented methodology can be applied to other water systems affected by similar problems, while the results will contribute to the development of the state of the art in a representative basin of the Iberian Pyrite Belt, whose pollutants' contributions are incorporated into the reservoir.

  19. 30 CFR 56.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Hand-held electric tools. 56.12033 Section 56.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Electricity § 56...

  20. 30 CFR 56.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Hand-held electric tools. 56.12033 Section 56.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Electricity § 56...

  1. 30 CFR 56.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Hand-held electric tools. 56.12033 Section 56.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Electricity § 56...

  2. 30 CFR 56.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Hand-held electric tools. 56.12033 Section 56.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Electricity § 56...

  3. 30 CFR 56.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Hand-held electric tools. 56.12033 Section 56.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Electricity § 56...

  4. 30 CFR 75.214 - Supplemental support materials, equipment and tools.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... tools. 75.214 Section 75.214 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES Roof Support § 75... accessible location on each working section or within four crosscuts of each working section. (b) The...

  5. COPS: Detecting Co-Occurrence and Spatial Arrangement of Transcription Factor Binding Motifs in Genome-Wide Datasets

    PubMed Central

    Lohmann, Ingrid

    2012-01-01

    In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209
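
    The idea of counting motif pairs that co-occur within a short distance can be sketched as follows. This is a brute-force pair count, not the Frequent-Pattern tree used by COPS, and it omits the Markov chain based significance assessment; the input layout, thresholds and function name are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Site = Tuple[str, int]  # (motif name, position within the region)


def cooccurring_pairs(regions: List[List[Site]],
                      min_support: int = 2,
                      max_distance: int = 50) -> Dict[Tuple[str, str], int]:
    """Count, for each pair of distinct motifs, the number of regions in
    which the two motifs occur within max_distance of each other, and keep
    the pairs found in at least min_support regions."""
    counts: Counter = Counter()
    for sites in regions:
        pairs_here = set()
        for (m1, p1), (m2, p2) in combinations(sites, 2):
            if m1 != m2 and abs(p1 - p2) <= max_distance:
                pairs_here.add(tuple(sorted((m1, m2))))
        counts.update(pairs_here)  # each region contributes once per pair
    return {pair: n for pair, n in counts.items() if n >= min_support}
```

    The pairwise loop is quadratic in the number of sites per region, so it only suits small inputs; a frequent-pattern structure such as the one the abstract mentions is the usual choice for large datasets and for patterns of more than two motifs.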

  6. Deep learning with word embeddings improves biomedical named entity recognition.

    PubMed

    Habibi, Maryam; Weber, Leon; Neves, Mariana; Wiegandt, David Luis; Leser, Ulf

    2017-07-15

    Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult. We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, and often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall. The source code for LSTM-CRF is available at https://github.com/glample/tagger and the links to the corpora are available at https://corposaurus.github.io/corpora/. Contact: habibima@informatik.hu-berlin.de

  7. Deep learning with word embeddings improves biomedical named entity recognition

    PubMed Central

    Habibi, Maryam; Weber, Leon; Neves, Mariana; Wiegandt, David Luis; Leser, Ulf

    2017-01-01

    Abstract Motivation: Text mining has become an important tool for biomedical research. The most fundamental text-mining task is the recognition of biomedical named entities (NER), such as genes, chemicals and diseases. Current NER methods rely on pre-defined features which try to capture the specific surface properties of entity types, properties of the typical local context, background knowledge, and linguistic information. State-of-the-art tools are entity-specific, as dictionaries and empirically optimal feature sets differ between entity types, which makes their development costly. Furthermore, features are often optimized for a specific gold standard corpus, which makes extrapolation of quality measures difficult. Results: We show that a completely generic method based on deep learning and statistical word embeddings [called long short-term memory network-conditional random field (LSTM-CRF)] outperforms state-of-the-art entity-specific NER tools, and often by a large margin. To this end, we compared the performance of LSTM-CRF on 33 data sets covering five different entity classes with that of best-of-class NER tools and an entity-agnostic CRF implementation. On average, F1-score of LSTM-CRF is 5% above that of the baselines, mostly due to a sharp increase in recall. Availability and implementation: The source code for LSTM-CRF is available at https://github.com/glample/tagger and the links to the corpora are available at https://corposaurus.github.io/corpora/. Contact: habibima@informatik.hu-berlin.de PMID:28881963
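
    For readers unfamiliar with how precision, recall and F1 interact in NER evaluation, here is a minimal sketch of entity-level scoring under exact matching. Representing an entity as a (start, end, type) tuple and the function name prf1 are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import Set, Tuple

Entity = Tuple[int, int, str]  # (start offset, end offset, entity type)


def prf1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 with exact span and type match."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

    Since F1 is the harmonic mean of precision and recall, a tagger that recovers more of the gold entities raises F1 even when its precision is unchanged, which matches the abstract's remark that the gain comes mostly from recall.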

  8. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

    PubMed

    Gerlt, John A

    2017-08-22

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.

  9. Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

    PubMed Central

    2017-01-01

    The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems. PMID:28826221

  10. Decision support methods for the environmental assessment of contamination at mining sites.

    PubMed

    Jordan, Gyozo; Abdaal, Ahmed

    2013-09-01

    Polluting mine accidents and widespread environmental contamination associated with historic mining in Europe and elsewhere have triggered the improvement of related environmental legislation and of the environmental assessment and management methods for the mining industry. Mining has some unique features, such as natural background pollution associated with natural mineral deposits, industrial activities and contamination located in the three-dimensional sub-surface space, the problem of long-term remediation after mine closure, and the problem of secondary contaminated areas around mine sites and abandoned mines in historic regions like Europe. These mining-specific problems require special tools to address the complexity of the environmental problems of mining-related contamination. The objective of this paper is to review and evaluate some of the decision support methods that have been developed and applied to mining contamination. In this paper, only those methods that are both efficient decision support tools and provide a 'holistic' approach to the complex problem are considered. These tools are (1) landscape ecology, (2) industrial ecology, (3) landscape geochemistry, (4) geo-environmental models, (5) environmental impact assessment, (6) environmental risk assessment, (7) material flow analysis and (8) life cycle assessment. This unique inter-disciplinary study should enable both the researcher and the practitioner to obtain a broad view of the state of the art of decision support methods for the environmental assessment of contamination at mine sites. Documented examples and abundant references are also provided.

  11. Design Pattern Mining Using Distributed Learning Automata and DNA Sequence Alignment

    PubMed Central

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Context: Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. Objective: This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, together with a related tool, DLA-DNA, applied to all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. Method: The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other section of the code for parameter passing and modular programming. The proposed model detects design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. Results: The results demonstrate that, whether the source code is built in a standard or a non-standard way with respect to the design patterns, the result of the proposed method is close to that of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source codes and compared with other related models and available tools; the results show that the precision and recall of the proposed method are, on average, 20% and 9.6% higher than those of Pinot, 27% and 31% higher than those of PTIDEJ, and 3.3% and 2% higher than those of DPJF, respectively. Conclusion: The primary idea of the proposed method is organized in two steps: in the first step, elemental design patterns are identified, and in the second step, they are composed to recognize actual design patterns. PMID:25243670

  12. Identification and classification of known and putative antimicrobial compounds produced by a wide variety of Bacillales species.

    PubMed

    Zhao, Xin; Kuipers, Oscar P

    2016-11-07

    Gram-positive bacteria of the Bacillales are important producers of antimicrobial compounds that might be utilized for medical, food or agricultural applications. Thanks to the wide availability of whole genome sequence data and the development of specific genome mining tools, novel antimicrobial compounds, either ribosomally- or non-ribosomally produced, of various Bacillales species can be predicted and classified. Here, we provide a classification scheme of known and putative antimicrobial compounds in the specific context of Bacillales species. We identify and describe known and putative bacteriocins, non-ribosomally synthesized peptides (NRPs), polyketides (PKs) and other antimicrobials from 328 whole-genome sequenced strains of 57 species of Bacillales by using web based genome-mining prediction tools. We provide a classification scheme for these bacteriocins, update the findings of NRPs and PKs and investigate their characteristics and suitability for biocontrol by describing per class their genetic organization and structure. Moreover, we highlight the potential of several known and novel antimicrobials from various species of Bacillales. Our extended classification of antimicrobial compounds demonstrates that Bacillales provide a rich source of novel antimicrobials that can now readily be tapped experimentally, since many new gene clusters are identified.

  13. Trace elements and Pb isotopes in soils and sediments impacted by uranium mining.

    PubMed

    Cuvier, A; Pourcelot, L; Probst, A; Prunier, J; Le Roux, G

    2016-10-01

    The purpose of this study is to evaluate the contamination in As, Ba, Co, Cu, Mn, Ni, Sr, V, Zn and REE, in a high uranium activity (up to 21,000Bq∙kg(-1)) area, downstream of a former uranium mine. Different geochemical proxies like enrichment factor and fractions from a sequential extraction procedure are used to evaluate the level of contamination, the mobility and the availability of the potential contaminants. Pb isotope ratios are determined in the total samples and in the sequential leachates to identify the sources of the contaminants and to determine the mobility of radiogenic Pb in the context of uranium mining. In spite of the large uranium contamination measured in the soils and the sediments (EF≫40), trace element contamination is low to moderate (2

  14. Application of Modern Tools and Techniques for Mine Safety & Disaster Management

    NASA Astrophysics Data System (ADS)

    Kumar, Dheeraj

    2016-04-01

    The implementation of novel systems and adoption of improvised equipment in mines help mining companies in two important ways: enhanced mine productivity and improved worker safety. There is a substantial need for adoption of state-of-the-art automation technologies in mines to ensure the safety and protect the health of mine workers. With the advent of new autonomous equipment in mines, inefficiencies are reduced by limiting human inconsistencies and error. The desired increase in productivity at a mine can sometimes be achieved by changing only a few simple variables. Significant developments have been made in the areas of surface and underground communication, robotics, smart sensors, tracking systems, mine gas monitoring systems, ground movements, etc. Advances in information technology, in the form of the internet, GIS, remote sensing, satellite communication, etc., have proved to be important tools for hazard reduction and disaster management. This paper is mainly focused on issues pertaining to mine safety and disaster management and on some of the recent innovations in mine automation that could be deployed in mines for safe mining operations and for avoiding any unforeseen mine disaster.

  15. Application of Quality Management Tools for Evaluating the Failure Frequency of Cutter-Loader and Plough Mining Systems

    NASA Astrophysics Data System (ADS)

    Biały, Witold

    2017-06-01

    Failure frequency in the mining process, with a focus on the mining machine, has been presented and illustrated by the example of two coal-mines. Two mining systems have been subjected to analysis: a cutter-loader and a plough system. In order to reduce costs generated by failures, maintenance teams should regularly make sure that the machines are used and operated in a rational and effective way. Such activities will allow downtimes to be reduced, and, in consequence, will increase the effectiveness of a mining plant. The evaluation of mining machines' failure frequency contained in this study has been based on one of the traditional quality management tools - the Pareto chart.
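    The Pareto chart mentioned above ranks failure causes by frequency and tracks their cumulative share. A minimal Python sketch of the underlying table follows. The failure categories, the counts, and the 80% threshold are hypothetical values chosen for illustration and are not taken from the study.

```python
from itertools import accumulate


def pareto_table(failure_counts):
    """Return (cause, count, cumulative_percent) rows, most frequent cause first."""
    ordered = sorted(failure_counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ordered)
    running_totals = accumulate(count for _, count in ordered)
    return [
        (cause, count, 100.0 * running / total)
        for (cause, count), running in zip(ordered, running_totals)
    ]


def vital_few(failure_counts, threshold=80.0):
    """Shortest leading list of causes whose cumulative share reaches the threshold."""
    selected = []
    for cause, _, cumulative_percent in pareto_table(failure_counts):
        selected.append(cause)
        if cumulative_percent >= threshold:
            break
    return selected


# Hypothetical failure counts for one mining system (illustrative only).
example_failures = {
    "electrical": 40,
    "mechanical": 30,
    "hydraulic": 20,
    "mining-related": 7,
    "other": 3,
}
main_causes = vital_few(example_failures)
```

    With these illustrative counts, the first three categories account for 90% of failures, which is the kind of concentration a Pareto chart is meant to expose for maintenance teams.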

  16. miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases.

    PubMed

    Gupta, Samir; Ross, Karen E; Tudor, Catalina O; Wu, Cathy H; Schmidt, Carl J; Vijay-Shanker, K

    2016-04-29

    MicroRNAs are increasingly being appreciated as critical players in human diseases, and questions concerning the role of microRNAs arise in many areas of biomedical research. There are several manually curated databases of microRNA-disease associations gathered from the biomedical literature; however, it is difficult for curators of these databases to keep up with the explosion of publications in the microRNA-disease field. Moreover, automated literature mining tools that assist manual curation of microRNA-disease associations currently capture only one microRNA property (expression) in the context of one disease (cancer). Thus, there is a clear need to develop more sophisticated automated literature mining tools that capture a variety of microRNA properties and relations in the context of multiple diseases to provide researchers with fast access to the most recent published information and to streamline and accelerate manual curation. We have developed miRiaD (microRNAs in association with Disease), a text-mining tool that automatically extracts associations between microRNAs and diseases from the literature. These associations are often not directly linked, and the intermediate relations are often highly informative for the biomedical researcher. Thus, miRiaD extracts the miR-disease pairs together with an explanation for their association. We also developed a procedure that assigns scores to sentences, marking their informativeness, based on the microRNA-disease relation observed within the sentence. miRiaD was applied to the entire Medline corpus, identifying 8301 PMIDs with miR-disease associations. These abstracts and the miR-disease associations are available for browsing at http://biotm.cis.udel.edu/miRiaD . We evaluated the recall and precision of miRiaD with respect to information of high interest to public microRNA-disease database curators (expression and target gene associations), obtaining a recall of 88.46-90.78. 
    When we expanded the evaluation to include sentences with a wide range of microRNA-disease information that may be of interest to biomedical researchers, miRiaD also performed very well with an F-score of 89.4. The informativeness ranking of sentences was evaluated in terms of nDCG (0.977) and correlation metrics (0.678-0.727) when compared to an annotator's ranked list. miRiaD, a high performance system that can capture a wide variety of microRNA-disease related information, extends beyond the scope of existing microRNA-disease resources. It can be incorporated into manual curation pipelines and serve as a resource for biomedical researchers interested in the role of microRNAs in disease. In our ongoing work we are developing an improved miRiaD web interface that will facilitate complex queries about microRNA-disease relationships, such as "In what diseases does microRNA regulation of apoptosis play a role?" or "Is there overlap in the sets of genes targeted by microRNAs in different types of dementia?"
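    The sentence ranking above is evaluated with nDCG (normalized discounted cumulative gain). The Python sketch below shows one common formulation, with linear gain and a log2 rank discount. Whether the authors used this exact variant is an assumption, and the relevance grades in the example are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    return sum(
        relevance / math.log2(rank + 1)
        for rank, relevance in enumerate(relevances, start=1)
    )


def ndcg(relevances):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical annotator grades for sentences, in the order a system ranked them.
system_ranking_grades = [3, 1, 2, 0]
score = ndcg(system_ranking_grades)
```

    A score of 1.0 means the system order matches the ideal order by grade, and swapping highly graded items toward the bottom of the list lowers the score.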

  17. Data mining applications in the context of casemix.

    PubMed

    Koh, H C; Leong, S K

    2001-07-01

    In October 1999, the Singapore Government introduced casemix-based funding to public hospitals. The casemix approach to health care funding is expected to yield significant benefits, including equity and rationality in financing health care, the use of comparative casemix data for quality improvement activities, and the provision of information that enables hospitals to understand their cost behaviour and reinforces the drive for more cost-efficient services. However, there is some concern about the "quicker and sicker" syndrome (that is, the rapid discharge of patients with little regard for the quality of outcome). As it is likely that consequences of premature discharges will be reflected in the readmission data, an analysis of possible systematic patterns in readmission data can provide useful insight into the "quicker and sicker" syndrome. This paper explores potential data mining applications in the context of casemix by using readmission data as an illustration. In particular, it illustrates how data mining can be used to better understand readmission data and to detect systematic patterns, if any. From a technical perspective, data mining (which is capable of analysing complex non-linear and interaction relationships) supplements and complements traditional statistical methods in data analysis. From an applications perspective, data mining provides the technology and methodology to analyse mass volume of data to detect hidden patterns in data. Using readmission data as an illustrative data mining application, this paper explores potential data mining applications in the general casemix context.

  18. 30 CFR 57.9261 - Transporting tools and materials on locomotives.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Transporting tools and materials on locomotives. 57.9261 Section 57.9261 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR... MINES Loading, Hauling, and Dumping Transportation of Persons and Materials § 57.9261 Transporting tools...

  19. 30 CFR 57.9261 - Transporting tools and materials on locomotives.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Transporting tools and materials on locomotives. 57.9261 Section 57.9261 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR... MINES Loading, Hauling, and Dumping Transportation of Persons and Materials § 57.9261 Transporting tools...

  20. 30 CFR 56.14116 - Hand-held power tools.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Hand-held power tools. 56.14116 Section 56... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 56.14116 Hand-held power tools. (a) Power drills...

  1. 30 CFR 56.14116 - Hand-held power tools.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Hand-held power tools. 56.14116 Section 56... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 56.14116 Hand-held power tools. (a) Power drills...

  2. 30 CFR 56.14116 - Hand-held power tools.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Hand-held power tools. 56.14116 Section 56... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 56.14116 Hand-held power tools. (a) Power drills...

  3. 30 CFR 57.14116 - Hand-held power tools.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Hand-held power tools. 57.14116 Section 57... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 57.14116 Hand-held power tools. (a) Power drills...

  4. 30 CFR 56.14116 - Hand-held power tools.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Hand-held power tools. 56.14116 Section 56... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 56.14116 Hand-held power tools. (a) Power drills...

  5. 30 CFR 57.14116 - Hand-held power tools.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Hand-held power tools. 57.14116 Section 57... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 57.14116 Hand-held power tools. (a) Power drills...

  6. 30 CFR 57.14116 - Hand-held power tools.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Hand-held power tools. 57.14116 Section 57... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 57.14116 Hand-held power tools. (a) Power drills...

  7. 30 CFR 56.14116 - Hand-held power tools.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Hand-held power tools. 56.14116 Section 56... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 56.14116 Hand-held power tools. (a) Power drills...

  8. 30 CFR 57.14116 - Hand-held power tools.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Hand-held power tools. 57.14116 Section 57... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 57.14116 Hand-held power tools. (a) Power drills...

  9. 30 CFR 57.14116 - Hand-held power tools.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Hand-held power tools. 57.14116 Section 57... MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Machinery and Equipment Safety Devices and Maintenance Requirements § 57.14116 Hand-held power tools. (a) Power drills...

  10. LAND REBORN: TOOLS FOR THE 21ST CENTURY/NATIONAL ASSOCIATION OF ABANDONED MINE LAND PROGRAMS

    EPA Science Inventory

    Mining activities in the US (not counting coal) produce 1-2 billion tons of mine waste annually. Since many of the ore mines involve sulfide minerals, the production of acid mine drainage (AMD) is a common problem from these abandoned mine sites. The combination of acidity, heavy...

  11. Mining the Geophysical Research Abstracts Corpus: Mapping the impact of Free and Open Source Software on the EGU Divisions

    NASA Astrophysics Data System (ADS)

    Löwe, Peter; Klump, Jens; Robertson, Jesse

    2015-04-01

    Text mining is commonly employed as a tool in data science to investigate and chart emergent information from corpora of research abstracts, such as the Geophysical Research Abstracts (GRA) published by Copernicus. In this context current standards, such as persistent identifiers like DOI and ORCID, allow us to trace, cite and map links between journal publications, the underlying research data and scientific software. This network can be expressed as a directed graph which enables us to chart networks of cooperation and innovation, thematic foci and the locations of research communities in time and space. However, this approach of data science, focusing on the research process in a self-referential manner, rather than the topical work, is still in a developing stage. Scientific work presented at the EGU General Assembly is often the first step towards new approaches and innovative ideas to the geospatial community. It represents a rich, deep and heterogeneous source of geoscientific thought. This corpus is a significant data source for data science, which has not been analysed on this scale previously. In this work, the corpus of the Geophysical Research Abstracts is used for the first time as a data base for analyses of topical text mining. For this, we used a sturdy and customizable software framework, based on the work of Schmitt et al. [1]. For the analysis we used the High Performance Computing infrastructure of the German Research Centre for Geosciences GFZ in Potsdam, Germany. Here, we report on the first results from the analysis of the continuous spread of the use of Free and Open Source Software Tools (FOSS) within the EGU communities, mapping the general increase of FOSS-themed GRA articles in the last decade and the developing spatial patterns of involved parties and FOSS topics. References: [1] Schmitt, L. M., Christianson, K. T., Gupta, R.: Linguistic Computing with UNIX Tools, in Kao, A., Poteet, S. R. (Eds.): Natural Language Processing and Text Mining, Springer, 2007. doi:10.1007/978-1-84628-754-1_12.

  12. Data mining in radiology

    PubMed Central

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-01-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist, which need to be addressed to ensure the success of data mining. PMID:25024513

  13. Tools for Educational Data Mining: A Review

    ERIC Educational Resources Information Center

    Slater, Stefan; Joksimovic, Srecko; Kovanovic, Vitomir; Baker, Ryan S.; Gasevic, Dragan

    2017-01-01

    In recent years, a wide array of tools have emerged for the purposes of conducting educational data mining (EDM) and/or learning analytics (LA) research. In this article, we hope to highlight some of the most widely used, most accessible, and most powerful tools available for the researcher interested in conducting EDM/LA research. We will…

  14. Using Data Mining Techniques Examination of the Middle School Students' Attitude towards Mathematics in the Context of Some Variables

    ERIC Educational Resources Information Center

    Idil, Feriha Hande; Narli, Serkan; Aksoy, Esra

    2016-01-01

    The aim of this study is to examine middle school students' attitude towards mathematics in the context of their mathematics learning preferences using data mining, a data analysis methodology that has been successfully used in different areas, including educational domains. "How do I actually learn?" questionnaire and attitude scale…

  15. A case-based reasoning tool for breast cancer knowledge management with data mining concepts and techniques

    NASA Astrophysics Data System (ADS)

    Demigha, Souâd.

    2016-03-01

    The paper presents a Case-Based Reasoning Tool for Breast Cancer Knowledge Management to improve breast cancer screening. To develop this tool, we combine the concepts and techniques of Case-Based Reasoning (CBR) and Data Mining (DM). Physicians and radiologists ground their diagnosis on their expertise (past experience) based on clinical cases. Case-Based Reasoning is the process of solving new problems based on the solutions of similar past problems, structured as cases. CBR is suitable for medical use. On the other hand, existing traditional hospital information systems (HIS), Radiological Information Systems (RIS) and Picture Archiving Information Systems (PACS) do not allow medical information to be managed efficiently because of its complexity and heterogeneity. Data Mining is the process of mining information from a data set and transforming it into an understandable structure for further use. Combining CBR with Data Mining techniques will facilitate diagnosis and decision-making by medical experts.

  16. Static versus dynamic sampling for data mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    John, G.H.; Langley, P.

    1996-12-31

    As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the "Probably Close Enough" criterion to describe the desired properties of a sample. Sampling usually refers to the use of static statistical tests to decide whether a sample is sufficiently similar to the large database, in the absence of any knowledge of the tools the data miner intends to use. We discuss dynamic sampling methods, which take into account the mining tool being used and can thus give better samples. We describe dynamic schemes that observe a mining tool's performance on training samples of increasing size and use these results to determine when a sample is sufficiently large. We evaluate these sampling methods on data from the UCI repository and conclude that dynamic sampling is preferable.
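    The dynamic schemes described above observe a mining tool on samples of increasing size and stop once a sample is large enough. The Python sketch below illustrates that loop under simplifying assumptions: the stopping rule (stop when the score gain falls below `min_gain`) is a stand-in, not the authors' "Probably Close Enough" criterion, and `train_and_score`, `saturating_score`, and all parameter values are hypothetical.

```python
import random


def dynamic_sample_size(data, train_and_score, start=50, growth=2.0,
                        min_gain=0.01, seed=0):
    """Grow a random sample geometrically until the mining tool stops improving.

    train_and_score(sample) must return a quality score (higher is better) for
    the mining tool trained on that sample. Assumes start >= 1 and growth > 1.
    Returns the chosen sample size and the score observed at that size.
    """
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    size = min(start, len(shuffled))
    previous_score = train_and_score(shuffled[:size])
    while size < len(shuffled):
        next_size = min(int(size * growth), len(shuffled))
        score = train_and_score(shuffled[:next_size])
        if score - previous_score < min_gain:
            # The larger sample did not help enough: keep the smaller one.
            return size, previous_score
        size, previous_score = next_size, score
    return size, previous_score


def saturating_score(sample):
    """Stand-in for a real learner: quality rises with sample size, then flattens."""
    return 1.0 - 10.0 / len(sample)


chosen_size, chosen_score = dynamic_sample_size(list(range(10_000)), saturating_score)
```

    Unlike a static test that compares the sample with the full database, this loop only ever consults the behaviour of the tool itself, which is the distinction the abstract draws.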

  17. The Mining Minds digital health and wellness framework.

    PubMed

    Banos, Oresti; Bilal Amin, Muhammad; Ali Khan, Wajahat; Afzal, Muhammad; Hussain, Maqbool; Kang, Byeong Ho; Lee, Sungyong

    2016-07-15

    The provision of health and wellness care is undergoing an enormous transformation. A key element of this revolution consists in prioritizing prevention and proactivity based on the analysis of people's conducts and the empowerment of individuals in their self-management. Digital technologies are unquestionably destined to be the main engine of this change, with an increasing number of domain-specific applications and devices commercialized every year; however, there is an apparent lack of frameworks capable of orchestrating and intelligently leveraging all the data, information and knowledge generated through these systems. This work presents Mining Minds, a novel framework that builds on the core ideas of the digital health and wellness paradigms to enable the provision of personalized support. Mining Minds embraces some of the most prominent digital technologies, ranging from Big Data and Cloud Computing to Wearables and Internet of Things, as well as modern concepts and methods, such as context-awareness, knowledge bases or analytics, to holistically and continuously investigate people's lifestyles and provide a variety of smart coaching and support services. This paper comprehensively describes the efficient and rational combination and interoperation of these technologies and methods through Mining Minds, while meeting the essential requirements posed by a framework for personalized health and wellness support. Moreover, this work presents a realization of the key architectural components of Mining Minds, as well as various exemplary user applications and expert tools to illustrate some of the potential services supported by the proposed framework. Mining Minds constitutes an innovative holistic means to inspect human behavior and provide personalized health and wellness support. The principles behind this framework uncover new research ideas and may serve as a reference for similar initiatives.

  18. A software tool for determination of breast cancer treatment methods using data mining approach.

    PubMed

    Cakır, Abdülkadir; Demirel, Burçin

    2011-12-01

    In this work, breast cancer treatment methods are determined using data mining. For this purpose, software was developed to help oncologists suggest treatment methods for breast cancer patients. Data from 462 breast cancer patients, obtained from Ankara Oncology Hospital, are used to determine treatment methods for new patients. This dataset is processed with the Weka data mining tool. Classification algorithms are applied one by one to this dataset and the results are compared to find the proper treatment method. The developed software program, called "Treatment Assistant", has a Java NetBeans interface and uses different algorithms (IB1, Multilayer Perceptron and Decision Table) to find out which one gives the better result for each attribute to be predicted. Treatment methods following surgery are determined for breast cancer patients using this software tool. At the modeling step of the data mining process, different Weka algorithms are used for the output attributes. IB1 shows the best accuracy for the hormonotherapy output, Multilayer Perceptron for the tamoxifen and radiotherapy outputs, and the Decision Table algorithm for the chemotherapy output. In conclusion, this work shows that the data mining approach can be a useful tool for medical applications, particularly at the treatment decision step. Data mining helps the doctor to decide in a short time.

  19. Physics Mining of Multi-Source Data Sets

    NASA Technical Reports Server (NTRS)

    Helly, John; Karimabadi, Homa; Sipes, Tamara

    2012-01-01

    Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures than ever before of environmental parameters by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as Artificial Neural Nets, which yield a blackbox solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, handle multi-type data sets, and parallelize it.

  20. Managing biological networks by using text mining and computer-aided curation

    NASA Astrophysics Data System (ADS)

    Yu, Seok Jong; Cho, Yongseong; Lee, Min-Ho; Lim, Jongtae; Yoo, Jaesoo

    2015-11-01

    In order to understand a biological mechanism in a cell, a researcher should collect a huge number of protein interactions with experimental data from experiments and the literature. Text mining systems that extract biological interactions from papers have been used to construct biological networks for a few decades. Even though the text mining of literature is necessary to construct a biological network, few systems with a text mining tool are available for biologists who want to construct their own biological networks. We have developed a biological network construction system called BioKnowledge Viewer that can generate a biological interaction network by using a text mining tool and biological taggers. It also includes Boolean simulation software to provide a biological modeling system to simulate the model that is made with the text mining tool. A user can download PubMed articles and construct a biological network by using the Multi-level Knowledge Emergence Model (KMEM), MetaMap, and A Biomedical Named Entity Recognizer (ABNER) as a text mining tool. To evaluate the system, we constructed an aging-related biological network that consists of 9,415 nodes (genes) by using manual curation. With network analysis, we found that several genes, including JNK, AP-1, and BCL-2, were highly related in the aging biological network. We provide a semi-automatic curation environment so that users can obtain a graph database for managing text mining results that are generated in the server system and can navigate the network with BioKnowledge Viewer, which is freely available at http://bioknowledgeviewer.kisti.re.kr.
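
    The abstract does not say how the Boolean simulation software works. As a generic illustration of Boolean network simulation, the Python sketch below updates all nodes synchronously and stops when a state repeats. The node names `A`, `B`, `C` and their update rules are hypothetical and make no biological claim.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the previous state."""
    return {node: rule(state) for node, rule in rules.items()}


def simulate(state, rules, max_steps=50):
    """Run until a state repeats; return the trajectory and the attractor cycle."""
    seen = {}        # state fingerprint -> time step of first visit
    trajectory = []  # states in visiting order
    for t in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            return trajectory, trajectory[seen[key]:]
        seen[key] = t
        trajectory.append(state)
        state = step(state, rules)
    return trajectory, []  # no repeat found within max_steps


# Hypothetical three-node negative feedback loop: C represses A, A drives B, B drives C.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

trajectory, attractor = simulate({"A": True, "B": False, "C": False}, rules)
for t, s in enumerate(trajectory):
    print(t, s)
print("attractor length:", len(attractor))
```

    A fixed point would show up as an attractor of length one. The synchronous scheme is the simplest choice; asynchronous updating is a common alternative and can yield different attractors.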

  1. IT Data Mining Tool Uses in Aerospace

    NASA Technical Reports Server (NTRS)

    Monroe, Gilena A.; Freeman, Kenneth; Jones, Kevin L.

    2012-01-01

    Data mining has a broad spectrum of uses throughout the realms of aerospace and information technology. Each of these areas has useful methods for processing, distributing, and storing its corresponding data. This paper focuses on ways to leverage the data mining tools and resources used in NASA's information technology area to meet the similar data mining needs of aviation and aerospace domains. This paper details the searching, alerting, reporting, and application functionalities of the Splunk system, used by NASA's Security Operations Center (SOC), and their potential shared solutions to address aircraft and spacecraft flight and ground systems data mining requirements. This paper also touches on capacity and security requirements when addressing sizeable amounts of data across a large data infrastructure.

  2. HC StratoMineR: A Web-Based Tool for the Rapid Analysis of High-Content Datasets.

    PubMed

    Omta, Wienand A; van Heesbeen, Roy G; Pagliero, Romina J; van der Velden, Lieke M; Lelieveld, Daphne; Nellen, Mehdi; Kramer, Maik; Yeong, Marley; Saeidi, Amir M; Medema, Rene H; Spruit, Marco; Brinkkemper, Sjaak; Klumperman, Judith; Egan, David A

    2016-10-01

    High-content screening (HCS) can generate large multidimensional datasets and when aligned with the appropriate data mining tools, it can yield valuable insights into the mechanism of action of bioactive molecules. However, easy-to-use data mining tools are not widely available, with the result that these datasets are frequently underutilized. Here, we present HC StratoMineR, a web-based tool for high-content data analysis. It is a decision-supportive platform that guides even non-expert users through a high-content data analysis workflow. HC StratoMineR is built by using My Structured Query Language for storage and querying, PHP: Hypertext Preprocessor as the main programming language, and jQuery for additional user interface functionality. R is used for statistical calculations, logic and data visualizations. Furthermore, C++ and graphical processor unit power is diffusely embedded in R by using the rcpp and rpud libraries for operations that are computationally highly intensive. We show that we can use HC StratoMineR for the analysis of multivariate data from a high-content siRNA knock-down screen and a small-molecule screen. It can be used to rapidly filter out undesirable data; to select relevant data; and to perform quality control, data reduction, data exploration, morphological hit picking, and data clustering. Our results demonstrate that HC StratoMineR can be used to functionally categorize HCS hits and, thus, provide valuable information for hit prioritization.

  3. Quantification of Operational Risk Using A Data Mining

    NASA Technical Reports Server (NTRS)

    Perera, J. Sebastian

    1999-01-01

    What is Data Mining?
    - Data Mining is the process of finding actionable information hidden in raw data.
    - Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data.
    - Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process.

  4. Sentiments analysis at conceptual level making use of the Narrative Knowledge Representation Language.

    PubMed

    Zarri, Gian Piero

    2014-10-01

    This paper illustrates some of the knowledge representation structures and inference procedures proper to a high-level, fully implemented conceptual language, NKRL (Narrative Knowledge Representation Language). The aim is to show how these tools can be used to deal, in a sentiment analysis/opinion mining context, with some common types of human (and non-human) "behaviors". These behaviors correspond, in particular, to the concrete, mutual relationships among human and non-human characters that can be expressed under the form of non-fictional and real-time "narratives" (i.e., as logically and temporally structured sequences of "elementary events"). Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Using Perilog to Explore "Decision Making at NASA"

    NASA Technical Reports Server (NTRS)

    McGreevy, Michael W.

    2005-01-01

    Perilog, a context intensive text mining system, is used as a discovery tool to explore topics and concerns in "Decision Making at NASA," chapter 6 of the Columbia Accident Investigation Board (CAIB) Report, Volume I. Two examples illustrate how Perilog can be used to discover highly significant safety-related information in the text without prior knowledge of the contents of the document. A third example illustrates how "if-then" statements found by Perilog can be used in logical analysis of decision making. In addition, in order to serve as a guide for future work, the technical details of preparing a PDF document for input to Perilog are included in an appendix.

  6. Ten quick tips for machine learning in computational biology.

    PubMed

    Chicco, Davide

    2017-01-01

    Machine learning has become a pivotal tool for many projects in computational biology, bioinformatics, and health informatics. Nevertheless, beginners and biomedical researchers often do not have enough experience to run a data mining project effectively, and therefore can follow incorrect practices, that may lead to common mistakes or over-optimistic results. With this review, we present ten quick tips to take advantage of machine learning in any computational biology context, by avoiding some common errors that we observed hundreds of times in multiple bioinformatics projects. We believe our ten suggestions can strongly help any machine learning practitioner to carry on a successful project in computational biology and related sciences.

  7. A Survey of Educational Data-Mining Research

    ERIC Educational Resources Information Center

    Huebner, Richard A.

    2013-01-01

    Educational data mining (EDM) is an emerging discipline that focuses on applying data mining tools and techniques to educationally related data. The discipline focuses on analyzing educational data to develop models for improving learning experiences and improving institutional effectiveness. A literature review on educational data mining topics…

  8. NASA aviation safety program aircraft engine health management data mining tools roadmap

    DOT National Transportation Integrated Search

    2000-04-01

    Aircraft Engine Health Management Data Mining Tools is a project led by NASA Glenn Research Center in support of the NASA Aviation Safety Program's Aviation System Monitoring and Modeling Thrust. The objective of the Glenn-led effort is to develop en...

  9. Data Mining: A Hybrid Methodology for Complex and Dynamic Research

    ERIC Educational Resources Information Center

    Lang, Susan; Baehr, Craig

    2012-01-01

    This article provides an overview of the ways in which data and text mining have potential as research methodologies in composition studies. It introduces data mining in the context of the field of composition studies and discusses ways in which this methodology can complement and extend our existing research practices by blending the best of what…

  10. Data Mining in Course Management Systems: Moodle Case Study and Tutorial

    ERIC Educational Resources Information Center

    Romero, Cristobal; Ventura, Sebastian; Garcia, Enrique

    2008-01-01

    Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a survey of the specific application of data mining in learning management systems and a case study tutorial with the Moodle system. Our objective is to introduce it both…

  11. Understanding Teacher Users of a Digital Library Service: A Clustering Approach

    ERIC Educational Resources Information Center

    Xu, Beijie; Recker, Mimi

    2011-01-01

    This article describes the Knowledge Discovery and Data Mining (KDD) process and its application in the field of educational data mining (EDM) in the context of a digital library service called the Instructional Architect (IA.usu.edu). In particular, the study reported in this article investigated a certain type of data mining problem, clustering,…

  12. Ergonomics in the arctic - a study and checklist for heavy machinery in open pit mining.

    PubMed

    Reiman, Arto; Sormunen, Erja; Morris, Drew

    2016-11-22

    Heavy mining vehicle operators at arctic mines have a high risk of discomfort, musculoskeletal disorders and occupational accidents. There is a need for tailored approaches and safety management tools that take into account the specific characteristics of arctic work environments. The aim of this study was to develop a holistic evaluation tool for heavy mining vehicles and operator well-being in arctic mine environments. Data collection was based on design science principles and included literature review, expert observations and participatory ergonomic sessions. As a result of this study, a systemic checklist was developed and tested by eight individuals in a 350-employee mining environment. The checklist includes sections for evaluating vehicle specific ergonomic and safety aspects from a technological point of view and for checking if the work has been arranged so that it can be performed safely and fluently from an employee's point of view.

  13. Data Mining in Health and Medical Information.

    ERIC Educational Resources Information Center

    Bath, Peter A.

    2004-01-01

    Presents a literature review that covers the following topics related to data mining (DM) in health and medical information: the potential of DM in health and medicine; statistical methods; evaluation of methods; DM tools for health and medicine; inductive learning of symbolic rules; application of DM tools in diagnosis and prognosis; and…

  14. A review of genomic data warehousing systems.

    PubMed

    Triplet, Thomas; Butler, Gregory

    2014-07-01

    To facilitate the integration and querying of genomics data, a number of generic data warehousing frameworks have been developed. They differ in their design and capabilities, as well as their intended audience. We provide a comprehensive and quantitative review of those genomic data warehousing frameworks in the context of large-scale systems biology. We reviewed in detail four genomic data warehouses (BioMart, BioXRT, InterMine and PathwayTools) freely available to the academic community. We quantified 20 aspects of the warehouses, covering the accuracy of their responses, their computational requirements and development efforts. Performance of the warehouses was evaluated under various hardware configurations to help laboratories optimize hardware expenses. Each aspect of the benchmark may be dynamically weighted by scientists using our online tool BenchDW (http://warehousebenchmark.fungalgenomics.ca/benchmark/) to build custom warehouse profiles and tailor our results to their specific needs.

  15. Combining complex networks and data mining: Why and how

    NASA Astrophysics Data System (ADS)

    Zanin, M.; Papo, D.; Sousa, P. A.; Menasalvas, E.; Nicchi, A.; Kubik, E.; Boccaletti, S.

    2016-05-01

    The increasing power of computer technology does not dispense with the need to extract meaningful information out of data sets of ever growing size, and indeed typically exacerbates the complexity of this task. To tackle this general problem, two methods have emerged, at chronologically different times, that are now commonly used in the scientific community: data mining and complex network theory. Not only do complex network analysis and data mining share the same general goal, that of extracting information from complex systems to ultimately create a new compact quantifiable representation, but they also often address similar problems too. In the face of that, a surprisingly low number of researchers turn out to resort to both methodologies. One may then be tempted to conclude that these two fields are either largely redundant or totally antithetic. The starting point of this review is that this state of affairs should be put down to contingent rather than conceptual differences, and that these two fields can in fact advantageously be used in a synergistic manner. An overview of both fields is first provided, some fundamental concepts of which are illustrated. A variety of contexts in which complex network theory and data mining have been used in a synergistic manner are then presented. Contexts in which the appropriate integration of complex network metrics can lead to improved classification rates with respect to classical data mining algorithms and, conversely, contexts in which data mining can be used to tackle important issues in complex network theory applications are illustrated. Finally, ways to achieve a tighter integration between complex networks and data mining, and open lines of research are discussed.

  16. PubMedMiner: Mining and Visualizing MeSH-based Associations in PubMed.

    PubMed

    Zhang, Yucan; Sarkar, Indra Neil; Chen, Elizabeth S

    2014-01-01

    The exponential growth of biomedical literature provides the opportunity to develop approaches for facilitating the identification of possible relationships between biomedical concepts. Indexing by Medical Subject Headings (MeSH) provides high-quality summaries of much of this literature that can be used to support hypothesis generation and knowledge discovery tasks using techniques such as association rule mining. Based on a survey of literature mining tools, a tool implemented using Ruby and R - PubMedMiner - was developed in this study for mining and visualizing MeSH-based associations for a set of MEDLINE articles. To demonstrate PubMedMiner's functionality, a case study was conducted that focused on identifying and comparing comorbidities for asthma in children and adults. Relative to the tools surveyed, the initial results suggest that PubMedMiner provides complementary functionality for summarizing and comparing topics as well as identifying potentially new knowledge.
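
    Association rule mining, mentioned in this abstract, scores a rule "X implies Y" by its support, confidence and lift. The Python sketch below computes these three standard measures over sets of index terms. It is not PubMedMiner's implementation (which the abstract says uses Ruby and R), and the five "articles" and the placeholder terms `ComorbidityA` and `ComorbidityB` are invented for illustration.

```python
def rule_metrics(articles, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    `articles` is a list of sets of index terms; `antecedent` and
    `consequent` are sets of terms.
    """
    n = len(articles)
    n_x = sum(1 for terms in articles if antecedent <= terms)
    n_y = sum(1 for terms in articles if consequent <= terms)
    n_xy = sum(1 for terms in articles if (antecedent | consequent) <= terms)

    support = n_xy / n              # share of articles carrying both sides
    confidence = n_xy / n_x         # share of X-articles that also carry Y
    lift = confidence / (n_y / n)   # above 1: X and Y co-occur more than by chance
    return support, confidence, lift


# Invented toy corpus: each set holds the index terms of one article.
articles = [
    {"Asthma", "ComorbidityA"},
    {"Asthma", "ComorbidityA"},
    {"Asthma", "ComorbidityB"},
    {"ComorbidityA"},
    {"Asthma"},
]

support, confidence, lift = rule_metrics(articles, {"Asthma"}, {"ComorbidityA"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

    Comparing groups, as in the case study on children and adults, would amount to running the same calculation on two article sets and contrasting the resulting values.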

  17. tmBioC: improving interoperability of text-mining tools with BioC.

    PubMed

    Khare, Ritu; Wei, Chih-Hsuan; Mao, Yuqing; Leaman, Robert; Lu, Zhiyong

    2014-01-01

    The lack of interoperability among biomedical text-mining tools is a major bottleneck in creating more complex applications. Despite the availability of numerous methods and techniques for various text-mining tasks, combining different tools requires substantial efforts and time owing to heterogeneity and variety in data formats. In response, BioC is a recent proposal that offers a minimalistic approach to tool interoperability by stipulating minimal changes to existing tools and applications. BioC is a family of XML formats that define how to present text documents and annotations, and also provides easy-to-use functions to read/write documents in the BioC format. In this study, we introduce our text-mining toolkit, which is designed to perform several challenging and significant tasks in the biomedical domain, and repackage the toolkit into BioC to enhance its interoperability. Our toolkit consists of six state-of-the-art tools for named-entity recognition, normalization and annotation (PubTator) of genes (GenNorm), diseases (DNorm), mutations (tmVar), species (SR4GN) and chemicals (tmChem). Although developed within the same group, each tool is designed to process input articles and output annotations in a different format. We modify these tools and enable them to read/write data in the proposed BioC format. We find that, using the BioC family of formats and functions, only minimal changes were required to build the newer versions of the tools. The resulting BioC wrapped toolkit, which we have named tmBioC, consists of our tools in BioC, an annotated full-text corpus in BioC, and a format detection and conversion tool. Furthermore, through participation in the 2013 BioCreative IV Interoperability Track, we empirically demonstrate that the tools in tmBioC can be more efficiently integrated with each other as well as with external tools: Our experimental results show that using BioC reduces >60% in lines of code for text-mining tool integration. 
    The tmBioC toolkit is publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  18. Planetary science and exploration in the deep subsurface: results from the MINAR Program, Boulby Mine, UK

    NASA Astrophysics Data System (ADS)

    Payler, Samuel J.; Biddle, Jennifer F.; Coates, Andrew J.; Cousins, Claire R.; Cross, Rachel E.; Cullen, David C.; Downs, Michael T.; Direito, Susana O. L.; Edwards, Thomas; Gray, Amber L.; Genis, Jac; Gunn, Matthew; Hansford, Graeme M.; Harkness, Patrick; Holt, John; Josset, Jean-Luc; Li, Xuan; Lees, David S.; Lim, Darlene S. S.; McHugh, Melissa; McLuckie, David; Meehan, Emma; Paling, Sean M.; Souchon, Audrey; Yeoman, Louise; Cockell, Charles S.

    2017-04-01

    The subsurface exploration of other planetary bodies can be used to unravel their geological history and assess their habitability. On Mars in particular, present-day habitable conditions may be restricted to the subsurface. Using a deep subsurface mine, we carried out a program of extraterrestrial analog research - MINe Analog Research (MINAR). MINAR aims to carry out the scientific study of the deep subsurface and test instrumentation designed for planetary surface exploration by investigating deep subsurface geology, whilst establishing the potential this technology has to be transferred into the mining industry. An integrated multi-instrument suite was used to investigate samples of representative evaporite minerals from a subsurface Permian evaporite sequence, in particular to assess mineral and elemental variations which provide small-scale regions of enhanced habitability. The instruments used were the Panoramic Camera emulator, Close-Up Imager, Raman spectrometer, Small Planetary Linear Impulse Tool, Ultrasonic drill and handheld X-ray diffraction (XRD). We present science results from the analog research and show that these instruments can be used to investigate in situ the geological context and mineralogical variations of a deep subsurface environment, and thus habitability, from millimetre to metre scales. We also show that these instruments are complementary. For example, the identification of primary evaporite minerals such as NaCl and KCl, which are difficult to detect by portable Raman spectrometers, can be accomplished with XRD. By contrast, Raman is highly effective at locating and detecting mineral inclusions in primary evaporite minerals. MINAR demonstrates the effective use of a deep subsurface environment for planetary instrument development, understanding the habitability of extreme deep subsurface environments on Earth and other planetary bodies, and advancing the use of space technology in economic mining.

  19. Tools of Realization of Social Responsibility of Industrial Business for Sustainable Socio-economic Development of Mining Region's Rural Territory

    NASA Astrophysics Data System (ADS)

    Jurzina, Tatyana; Egorova, Natalia; Zaruba, Natalia; Kosinskij, Peter

    2017-11-01

    Under the modern conditions of the Russian economy, questions of the social responsibility of industrial business in the mining region for the sustainable social and economic development of rural territories are especially relevant. This calls for a search for new strategies, tools, and ways of positioning and increasing the competitiveness of the enterprises that carry out entrepreneurial activity in this territory. The article discusses the problems of the influence of industrial enterprises on the territory where they operate, and substantiates the theoretical basis for forming practical tools (a mechanism) that realize the social responsibility of business for the sustainable social and economic development of rural territories of the mining region.

  20. A study of unstable rock failures using finite difference and discrete element methods

    NASA Astrophysics Data System (ADS)

    Garvey, Ryan J.

    Case histories in mining have long described pillars or faces of rock failing violently with an accompanying rapid ejection of debris and broken material into the working areas of the mine. These unstable failures have resulted in large losses of life and collapses of entire mine panels. Modern mining operations take significant steps to reduce the likelihood of unstable failure, however eliminating their occurrence is difficult in practice. Researchers over several decades have supplemented studies of unstable failures through the application of various numerical methods. The direction of the current research is to extend these methods and to develop improved numerical tools with which to study unstable failures in underground mining layouts. An extensive study is first conducted on the expression of unstable failure in discrete element and finite difference methods. Simulated uniaxial compressive strength tests are run on brittle rock specimens. Stable or unstable loading conditions are applied onto the brittle specimens by a pair of elastic platens with ranging stiffnesses. Determinations of instability are established through stress and strain histories taken for the specimen and the system. Additional numerical tools are then developed for the finite difference method to analyze unstable failure in larger mine models. Instability identifiers are established for assessing the locations and relative magnitudes of unstable failure through measures of rapid dynamic motion. An energy balance is developed which calculates the excess energy released as a result of unstable equilibria in rock systems. These tools are validated through uniaxial and triaxial compressive strength tests and are extended to models of coal pillars and a simplified mining layout. 
    The results of the finite difference simulations reveal that the instability identifiers and excess energy calculations provide a generalized methodology for assessing unstable failures within potentially complex mine models. These combined numerical tools may be applied in future studies to design primary and secondary supports in bump-prone conditions, evaluate retreat mining cut sequences, assess pillar de-stressing techniques, or perform back-analyses on unstable failures in select mining layouts.

  1. A Data Warehouse Architecture for DoD Healthcare Performance Measurements.

    DTIC Science & Technology

    1999-09-01

    design, develop, implement, and apply statistical analysis and data mining tools to a Data Warehouse of healthcare metrics. With the DoD healthcare...framework, this thesis defines a methodology to design, develop, implement, and apply statistical analysis and data mining tools to a Data Warehouse...21 F. INABILITY TO CONDUCT HEALTHCARE ANALYSIS

  2. The research of structure and mechanical properties of superhard electro-spark coatings for hardwearing mining tools

    NASA Astrophysics Data System (ADS)

    Bajin, P. A.; Chijikov, A. P.; Leybo, D. V.; Chuprunov, K. O.; Yudin, A. G.; Alymov, M. A.; Kuznetsov, D. V.

    2016-01-01

    The development of low cost and hardwearing mining tools is one of the most important areas in mining industry. It is especially important for technologies of rare and rare earth metals mining due to high hardness of related ores. Coatings for electrodes, produced by extrusion of self-propagating high temperature synthesis (SHS) products from hard-alloyed materials with nanosized structure, for further application in processes of electrospark alloying and deposition were studied in this work. The results of microstructure and properties of deposited layers, interaction of support with SHS produced electrodes, comparison of frictional properties of obtained materials as well as some industrial testing results are presented in this work.

  3. Applying Web Usage Mining for Personalizing Hyperlinks in Web-Based Adaptive Educational Systems

    ERIC Educational Resources Information Center

    Romero, Cristobal; Ventura, Sebastian; Zafra, Amelia; de Bra, Paul

    2009-01-01

    Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in…

  4. Text Mining in Cancer Gene and Pathway Prioritization

    PubMed Central

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes. PMID:25392685

  5. Text mining in cancer gene and pathway prioritization.

    PubMed

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes.

  6. Rare earth metal-containing ionic liquids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prodius, Denis; Mudring, Anja-Verena

    As an innovative tool, ionic liquids (ILs) are widely employed as an alternative, smart, reaction media (vs. traditional solvents) offering interesting technology solutions for dissolving, processing and recycling of metal-containing materials. The costly mining and refining of rare earths (RE), combined with increasing demand for high-tech and energy-related applications around the world, urgently requires effective approaches to improve the efficiency of rare earth separation and recovery. In this context, ionic liquids appear as an attractive technology solution. Finally, this paper addresses the structural and coordination chemistry of ionic liquids comprising rare earth metals with the aim to add to understanding prospects of ionic liquids in the chemistry of rare earths.

  7. Rare earth metal-containing ionic liquids

    DOE PAGES

    Prodius, Denis; Mudring, Anja-Verena

    2018-03-07

    As an innovative tool, ionic liquids (ILs) are widely employed as alternative, smart reaction media (vs. traditional solvents) offering interesting technology solutions for dissolving, processing and recycling of metal-containing materials. The costly mining and refining of rare earths (RE), combined with increasing demand for high-tech and energy-related applications around the world, urgently requires effective approaches to improve the efficiency of rare earth separation and recovery. In this context, ionic liquids appear as an attractive technology solution. Finally, this paper addresses the structural and coordination chemistry of ionic liquids comprising rare earth metals with the aim to add to understanding prospects of ionic liquids in the chemistry of rare earths.

  8. How does the mining industry contribute to sexual and reproductive health in developing countries? A narrative synthesis of current evidence to inform practice.

    PubMed

    Dawson, Angela J; Homer, Caroline S

    2013-12-01

    To explore client and provider experiences and related health outcomes of sexual and reproductive health interventions that have been led by or that have involved mining companies. Miners, and those living in communities surrounding mines in developing countries, are a vulnerable population with a high sexual and reproductive health burden. People in these communities require specific healthcare services although the exact delivery needs are unclear. There are no systematic reviews of evidence to guide delivery of sexual and reproductive health interventions to best address the needs of men and women in mining communities. A narrative synthesis. A search of peer-reviewed literature from 2000-2012 was undertaken with retrieved documents assessed using an inclusion/exclusion criterion and quality appraisal guided by critical assessment tools. Concepts were analysed thematically. A desire for HIV testing and treatment was associated with the recognition of personal vulnerability, but this was affected by fear of stigma. Regular on-site services facilitated access to voluntary counselling and testing and HIV care, but concerns for confidentiality were a serious barrier. The provision of HIV and sexually transmitted infection clinical and promotive services revealed mixed health outcomes. Recommended service improvements included rapid HIV testing, the integration of sexual and reproductive health into regular health services also available to family members and culturally competent, ethical, providers who are better supported to involve consumers in health promotion. There is a need for research to better inform health interventions so that they build on local cultural norms and values and address social needs. A holistic approach to sexual and reproductive health beyond a focus on HIV may better engage community members, mining companies and governments in healthcare delivery. 
    Nurses may require appropriate workplace support and incentives to deliver sexual and reproductive health interventions in developing mining contexts where task shifting exists. © 2013 John Wiley & Sons Ltd.

  9. Transient states of air parameters after a stoppage and re-start of the main fan / Stany przejściowe parametrów powietrza po postoju i załączeniu wentylatora głównego

    NASA Astrophysics Data System (ADS)

    Wasilewski, Stanisław

    2012-12-01

    A stoppage of the main ventilation fan constitutes a disturbance of ventilation conditions of a deep mine and its effects can cause serious hazards by generating transient states of air and gas flow. Main ventilation fans are the basic deep-mine facilities; therefore, under mining regulations it is only allowed to stop them with the consent and under the conditions specified by the mine maintenance manager. The stoppage of the main ventilation fan may be accompanied by transient air parameters, including the air pressure and flow patterns. There is even the likelihood of reversing the direction of air flow, which, in case of methane mines, can pose a major hazard, particularly in sections of the mine with fire fields or large goaf areas. At the same time, stoppages of deep-mine main ventilation fans create interesting research conditions, which if conducted under the supervision of the monitoring systems, can provide much information about the transient processes of pressure, air and gas flow in underground workings. This article is a discussion of air parameter observations in mine workings made as part of such experiments. It also presents the procedure of the experiments, conducted in three mines. They involved the observation of transient processes of mine air parameters, and most interestingly, the recording of pressure and air and gas flow in the workings of the mine ventilation networks by mine monitoring systems and using specialist recording instruments. In mining practice, both in Poland and elsewhere, software tools and computer modelling methods are used to try and reproduce the conditions prior to and during disasters based on the existing network model and monitoring system data. The use of these tools to simulate the alternatives of combating and liquidation of the gas-fire hazard after its occurrence is an important issue. Measurement data collected during the experiments provides interesting research material for the verification and validation of the software tools used for the simulation of processes occurring in deep-mine ventilation systems.

  10. Using data mining to segment healthcare markets from patients' preference perspectives.

    PubMed

    Liu, Sandra S; Chen, Jie

    2009-01-01

    This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.

  11. Data Mining Tools Make Flights Safer, More Efficient

    NASA Technical Reports Server (NTRS)

    2014-01-01

    A small data mining team at Ames Research Center developed a set of algorithms ideal for combing through flight data to find anomalies. Dallas-based Southwest Airlines Co. signed a Space Act Agreement with Ames in 2011 to access the tools, helping the company refine its safety practices, improve its safety reviews, and increase flight efficiencies.

  12. The Effectiveness of Web-Based Learning Environment: A Case Study of Public Universities in Kenya

    ERIC Educational Resources Information Center

    Kirui, Paul A.; Mutai, Sheila J.

    2010-01-01

    Web mining is emerging in many aspects of e-learning, aiming at improving online learning and teaching processes and making them more transparent and effective. Researchers using Web mining tools and techniques are challenged to learn more about the online students, reshape online courses and educational websites, and create tools for…

  13. Learning in the context of distribution drift

    DTIC Science & Technology

    2017-05-09

    published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative...learner; Low-bias learner; Aggregated classifier. Figure 7: Architecture for learning from streaming data in the context of variable or unknown...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD

  14. Data Mining in Social Media

    NASA Astrophysics Data System (ADS)

    Barbier, Geoffrey; Liu, Huan

    The rise of online social media is providing a wealth of social network data. Data mining techniques provide researchers and practitioners the tools needed to analyze large, complex, and frequently changing social media data. This chapter introduces the basics of data mining, reviews social media, discusses how to mine social media data, and highlights some illustrative examples with an emphasis on social networking sites and blogs.

  15. Prioritization of malaria endemic zones using self-organizing maps in the Manipur state of India.

    PubMed

    Murty, Upadhyayula Suryanarayana; Srinivasa Rao, Mutheneni; Misra, Sunil

    2008-09-01

    A huge amount of epidemiological and public health data is available and requires analysis and interpretation with appropriate mathematical tools to support existing methods of controlling mosquitoes and mosquito-borne diseases more effectively; data-mining tools are used to make sense of the chaos. Using data-mining tools, one can develop predictive models, patterns, association rules, and clusters of diseases, which can help decision-makers in controlling the diseases. This paper mainly focuses on the application of data-mining tools, used here for the first time, to prioritize the malaria endemic regions in Manipur state by using Self Organizing Maps (SOM). The SOM results (in two-dimensional images called Kohonen maps) clearly show the visual classification of malaria endemic zones into high, medium and low in the different districts of Manipur, and will be discussed in the paper.

  16. Genomic research and data-mining technology: implications for personal privacy and informed consent.

    PubMed

    Tavani, Herman T

    2004-01-01

    This essay examines issues involving personal privacy and informed consent that arise at the intersection of information and communication technology (ICT) and population genomics research. I begin by briefly examining the ethical, legal, and social implications (ELSI) program requirements that were established to guide researchers working on the Human Genome Project (HGP). Next I consider a case illustration involving deCODE Genetics, a privately owned genetic company in Iceland, which raises some ethical concerns that are not clearly addressed in the current ELSI guidelines. The deCODE case also illustrates some ways in which an ICT technique known as data mining has both aided and posed special challenges for researchers working in the field of population genomics. On the one hand, data-mining tools have greatly assisted researchers in mapping the human genome and in identifying certain "disease genes" common in specific populations (which, in turn, has accelerated the process of finding cures for diseases that affect those populations). On the other hand, this technology has significantly threatened the privacy of research subjects participating in population genomics studies, who may, unwittingly, contribute to the construction of new groups (based on arbitrary and non-obvious patterns and statistical correlations) that put those subjects at risk for discrimination and stigmatization. In the final section of this paper I examine some ways in which the use of data mining in the context of population genomics research poses a critical challenge for the principle of informed consent, which traditionally has played a central role in protecting the privacy interests of research subjects participating in epidemiological studies.

  17. Distributed data mining on grids: services, tools, and applications.

    PubMed

    Cannataro, Mario; Congiusta, Antonio; Pugliese, Andrea; Talia, Domenico; Trunfio, Paolo

    2004-12-01

    Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called Knowledge Grid. This paper describes the Knowledge Grid framework and presents the toolset provided by the Knowledge Grid for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the Knowledge Grid tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.

  18. Application of text mining in the biomedical domain.

    PubMed

    Fleuren, Wilco W M; Alkema, Wynand

    2015-03-01

    In recent years the amount of experimental data that is produced in biomedical research and the number of papers that are being published in this field have grown rapidly. In order to keep up to date with developments in their field of interest and to interpret the outcome of experiments in light of all available literature, researchers turn more and more to the use of automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and nowadays can be used to address a variety of research questions ranging from de novo drug target discovery to enhanced biological interpretation of the results from high throughput experiments. In this paper we introduce the most important techniques that are used for text mining and give an overview of the text mining tools that are currently being used and the types of problems to which they are typically applied. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. Empirical Models of Zones Protecting Against Coal Dust Explosion

    NASA Astrophysics Data System (ADS)

    Prostański, Dariusz

    2017-09-01

    The paper presents the anticipated use of research results to specify the relations between the volume of dust deposition and changes in its concentration in air. These relations were used to shape zones protecting against coal dust explosion. The research methodology is presented, including methods for measuring dust concentration and deposition. Measurements were taken in the Brzeszcze Mine within the framework of MEZAP, co-financed by The National Centre for Research and Development (NCBR) and performed by the Institute of Mining Technology KOMAG, the Central Mining Institute (GIG) and the Coal Company PLC. The project enables research on measuring the volume of dust deposition and its concentration in air in protective zones in a number of mine workings in the Brzeszcze Mine. The developed model may serve as a supporting tool, either as a system located directly in protective zones or as an operator tool warning of an increasing coal dust explosion hazard.

  20. Implementation of a Flexible Tool for Automated Literature-Mining and Knowledgebase Development (DevToxMine)

    EPA Science Inventory

    Deriving novel relationships from the scientific literature is an important adjunct to datamining activities for complex datasets in genomics and high-throughput screening activities. Automated text-mining algorithms can be used to extract relevant content from the literature and...

  1. Creating, generating and comparing random network models with NetworkRandomizer.

    PubMed

    Tosadori, Gabriele; Bestvina, Ivan; Spoto, Fausto; Laudanna, Carlo; Scardoni, Giovanni

    2016-01-01

    Biological networks are becoming a fundamental tool for the investigation of high-throughput data in several fields of biology and biotechnology. With the increasing amount of information, network-based models are gaining more and more interest and new techniques are required in order to mine the information and to validate the results. To fill the validation gap we present an app, for the Cytoscape platform, which aims at creating randomised networks and randomising existing, real networks. Since there is a lack of tools that allow performing such operations, our app aims at enabling researchers to exploit different, well known random network models that could be used as a benchmark for validating real, biological datasets. We also propose a novel methodology for creating random weighted networks, i.e. the multiplication algorithm, starting from real, quantitative data. Finally, the app provides a statistical tool that compares real versus randomly computed attributes, in order to validate the numerical findings. In summary, our app aims at creating a standardised methodology for the validation of the results in the context of the Cytoscape platform.

  2. Micron to Mine: Synchrotron Science for Mineral Exploration, Production, and Remediation

    NASA Astrophysics Data System (ADS)

    Banerjee, N.; Van Loon, L.; Flynn, T.

    2017-12-01

    Synchrotron science for mineral exploration, production, and remediation studies is a powerful tool that provides industry with relevant micron to macro geochemical information. Synchrotron micro X-ray fluorescence (SR-µXRF) offers a direct, high-resolution, rapid, and cost-effective chemical analysis while preserving the context of the sample by mapping ore minerals with ppm detection limits. Speciation of trace and deleterious elements can then be probed using X-ray absorption near-edge structure (XANES) spectroscopy. Large-scale (tens of cm) µXRF mapping and XANES analysis of samples collected at various mine locations have been undertaken to address questions regarding mineralization history to develop novel trace element exploration vectors. This information provides integral insights into trace element associations with ore minerals, local redox conditions responsible for mineralization, and mineralizing mechanisms. Gold is commonly intimately associated with sulfide mineralization (e.g., pyrite, arsenopyrite, etc.) and is present both as inclusions and filling fractures in sulfide grains. Gold may also occur as nanoparticles and/or in the sulfide mineral crystal lattice, known as "invisible gold". Understanding the nature and distribution of invisible gold in ore is integral to processing efficiency. The high flux and energy of a synchrotron light source allows for the detection of invisible gold by µXRF, and can probe its nature (metallic Au0 vs. lattice bound Au1+) using XANES spectroscopy. The long-term containment and management of arsenic is necessary to protect the health of both humans and the environment. Understanding the relationship of arsenic mineralization to gold deposits can lead to more sophisticated planning for mineral processing and the eventual storage of gangue materials. µXANES spectroscopy is an excellent tool for determining arsenic speciation within the context of the sample. 
    Mineral phases such as arsenopyrite, scorodite, and arsenic trioxide can be accurately identified and their relative amounts determined. With this information the oxidation-reduction of arsenic-bearing compounds can be monitored to optimize management practices for the long-term capture of arsenic contaminants.

  3. Mining Student Data Captured from a Web-Based Tutoring Tool: Initial Exploration and Results

    ERIC Educational Resources Information Center

    Merceron, Agathe; Yacef, Kalina

    2004-01-01

    In this article we describe the initial investigations that we have conducted on student data collected from a web-based tutoring tool. We have used some data mining techniques such as association rule and symbolic data analysis, as well as traditional SQL queries to gain further insight on the students' learning and deduce information to improve…

  4. Data Mining and Optimization Tools for Developing Engine Parameters Tools

    NASA Technical Reports Server (NTRS)

    Dhawan, Atam P.

    1998-01-01

    This project was awarded for understanding the problem and developing a plan for Data Mining tools for use in designing and implementing an Engine Condition Monitoring System. Tricia Erhardt and I studied the problem domain for developing an Engine Condition Monitoring system using the sparse and non-standardized datasets to be available through a consortium at NASA Lewis Research Center. We visited NASA three times to discuss additional issues related to the dataset, which was not made available to us. We discussed and developed a general framework of data mining and optimization tools to extract useful information from sparse and non-standard datasets. These discussions led to the training of Tricia Erhardt to develop Genetic Algorithm (GA) based search programs, which were written in C++ and used to demonstrate the capability of the GA in searching for an optimal solution in noisy datasets. From the study and discussion with NASA LeRC personnel, we then prepared a proposal, which is being submitted to NASA for future work on the development of data mining algorithms for engine condition monitoring. The proposed set of algorithms uses wavelet processing for creating a multi-resolution pyramid of the data for GA based multi-resolution optimal search.

  5. DMT-TAFM: a data mining tool for technical analysis of futures market

    NASA Astrophysics Data System (ADS)

    Stepanov, Vladimir; Sathaye, Archana

    2002-03-01

    Technical analysis of financial markets describes many patterns of market behavior. For practical use, all these descriptions need to be adjusted for each particular trading session. In this paper, we develop a data mining tool for technical analysis of the futures markets (DMT-TAFM), which dynamically generates rules based on the notion of the price pattern similarity. The tool consists of three main components. The first component provides visualization of data series on a chart with different ranges, scales, and chart sizes and types. The second component constructs pattern descriptions using sets of polynomials. The third component specifies the training set for mining, defines the similarity notion, and searches for a set of similar patterns. DMT-TAFM is useful to prepare the data, and then reveal and systemize statistical information about similar patterns found in any type of historical price series. We performed experiments with our tool on three decades of trading data for a hundred types of futures. Our results for this data set show that we can prove or disprove many well-known patterns based on real data, as well as reveal new ones, and use the set of relatively consistent patterns found during data mining for developing better futures trading strategies.

  6. Text mining for search term development in systematic reviewing: A discussion of some methods and challenges.

    PubMed

    Stansfield, Claire; O'Mara-Eves, Alison; Thomas, James

    2017-09-01

    Using text mining to aid the development of database search strings for topics described by diverse terminology has potential benefits for systematic reviews; however, methods and tools for accomplishing this are poorly covered in the research methods literature. We briefly review the literature on applications of text mining for search term development for systematic reviewing. We found that the tools can be used in 5 overarching ways: improving the precision of searches; identifying search terms to improve search sensitivity; aiding the translation of search strategies across databases; searching and screening within an integrated system; and developing objectively derived search strategies. Using a case study and selected examples, we then reflect on the utility of certain technologies (term frequency-inverse document frequency and Termine, term frequency, and clustering) in improving the precision and sensitivity of searches. Challenges in using these tools are discussed. The utility of these tools is influenced by the different capabilities of the tools, the way the tools are used, and the text that is analysed. Increased awareness of how the tools perform facilitates the further development of methods for their use in systematic reviews. Copyright © 2017 John Wiley & Sons, Ltd.
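    The term frequency-inverse document frequency weighting named in the abstract above can be sketched in a few lines. This is only an illustrative sketch, not the authors' method or any tool they evaluated: the function names (`tokenize`, `tfidf_scores`, `top_terms`) and the three sample records are hypothetical, and the weighting assumed here is the plain tf × log(N/df) form.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(documents):
    """Return one {term: tf-idf} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(documents, k=3):
    """Return the k highest-scoring terms per document (ties broken alphabetically)."""
    return [
        [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for s in tfidf_scores(documents)
    ]


# Hypothetical records, invented for illustration only.
records = [
    "text mining tools for systematic reviews",
    "clustering tools for market segmentation",
    "text mining for search term development",
]
candidate_terms = top_terms(records, k=2)
```

    Terms that occur in every record (here "for") receive a weight of zero, which is why this kind of weighting can help surface distinctive candidate search terms rather than common words.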

  7. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    PubMed Central

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. PMID:28025348

  8. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE PAGES

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; ...

    2016-12-26

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  9. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  10. An Evaluation of Text Mining Tools as Applied to Selected Scientific and Engineering Literature.

    ERIC Educational Resources Information Center

    Trybula, Walter J.; Wyllys, Ronald E.

    2000-01-01

    Addresses an approach to the discovery of scientific knowledge through an examination of data mining and text mining techniques. Presents the results of experiments that investigated knowledge acquisition from a selected set of technical documents by domain experts. (Contains 15 references.) (Author/LRW)

  11. Visual Based Retrieval Systems and Web Mining--Introduction.

    ERIC Educational Resources Information Center

    Iyengar, S. S.

    2001-01-01

    Briefly discusses Web mining and image retrieval techniques, and then presents a summary of articles in this special issue. Articles focus on Web content mining, artificial neural networks as tools for image retrieval, content-based image retrieval systems, and personalizing the Web browsing experience using media agents. (AEF)

  12. Applying WEPP technologies to western alkaline surface coal mines

    Treesearch

    J. Q. Wu; S. Dun; H. Rhee; X. Liu; W. J. Elliot; T. Golnar; J. R. Frankenberger; D. C. Flanagan; P. W. Conrad; R. L. McNearny

    2011-01-01

    One aspect of planning surface mining operations, regulated by the National Pollutant Discharge Elimination System (NPDES), is estimating potential environmental impacts during mining operations and the reclamation period that follows. Practical computer simulation tools are effective for evaluating site-specific sediment control and reclamation plans for the NPDES....

  13. ANCAC: amino acid, nucleotide, and codon analysis of COGs--a tool for sequence bias analysis in microbial orthologs.

    PubMed

    Meiler, Arno; Klinger, Claudia; Kaufmann, Michael

    2012-09-08

    The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills.
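    As a rough illustration of the quantities this abstract mentions (GC-content and codon frequencies), the following minimal Python sketch computes both for a single nucleotide sequence. It is not ANCAC's implementation, which the abstract describes as a web tool backed by a MySQL database; the function names and the example sequence are assumptions made here for illustration only.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each complete codon, reading from position 0.

    Trailing bases that do not form a full codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical example sequence, not taken from the COG or NUCOCOG data.
example = "ATGGCCATGTAA"
print(gc_content(example))
print(codon_frequencies(example))
```

    A real analysis of this kind would aggregate such counts over all sequences in a COG and over the selected phylogenetic pattern; the sketch only shows the per-sequence arithmetic.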

  14. ANCAC: amino acid, nucleotide, and codon analysis of COGs – a tool for sequence bias analysis in microbial orthologs

    PubMed Central

    2012-01-01

    Background The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Results Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC’s NUCOCOG dataset as the largest one available for that purpose thus far. Conclusions Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills. PMID:22958836

  15. Unsupervised user similarity mining in GSM sensor networks.

    PubMed

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are not trivial tasks. In this paper, we present a user similarity mining-based methodology through user mobility profile building by using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining.

  16. 3D Reconstruction and Modeling of Subterranean Landscapes in Collaborative Mining Archeology Projects: Techniques, Applications and Experiences

    NASA Astrophysics Data System (ADS)

    Arles, A.; Clerc, P.; Sarah, G.; Téreygeol, F.; Bonnamour, G.; Heckes, J.; Klein, A.

    2013-07-01

    Mining and underground archaeology are two domains of expertise where three-dimensional data take an important part in the associated researches. Up to now, archaeologists study mines and underground networks from line-plot surveys, cross-section of galleries, and from tool marks surveys. All this kind of information can be clearly recorded back from the field from three-dimensional models with a more cautious and extensive approach. Besides, the volumes of the underground structures that are very important data to explain the mining activities are difficult to evaluate from "traditional" hand-made recordings. They can now be calculated more accurately from a 3D model. Finally, reconstructed scenes are a powerful tool as thinking aid to look back again to a structure in the office or in future times. And the recorded models, rendered photo-realistically, can also be used for cultural heritage documentation presenting inaccessible and sometimes dangerous places to the public. Nowadays, thanks to modern computer technologies and highly developed software tools paired with sophisticated digital camera equipment, complex photogrammetric processes are available for moderate costs for research teams. Recognizing these advantages the authors develop and utilize image-based workflows in order to document ancient mining monuments and underground sites as a basis for further historical and archaeological researches, performed in collaborative partnership during recent projects on medieval silver mines and preventive excavations of undergrounds in France.

  17. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task

    PubMed Central

    Arighi, Cecilia N.; Carterette, Ben; Cohen, K. Bretonnel; Krallinger, Martin; Wilbur, W. John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E.; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L.; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P.; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O.; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy

    2013-01-01

    In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. 
The analysis of this survey highlights how important task completion is to the biocurators’ overall experience of a system, regardless of the system’s high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV. PMID:23327936

  18. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.

    PubMed

    Arighi, Cecilia N; Carterette, Ben; Cohen, K Bretonnel; Krallinger, Martin; Wilbur, W John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy

    2013-01-01

    In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. 
The analysis of this survey highlights how important task completion is to the biocurators' overall experience of a system, regardless of the system's high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV.

  19. Management of the water balance and quality in mining areas

    NASA Astrophysics Data System (ADS)

    Pasanen, Antti; Krogerus, Kirsti; Mroueh, Ulla-Maija; Turunen, Kaisa; Backnäs, Soile; Vento, Tiia; Veijalainen, Noora; Hentinen, Kimmo; Korkealaakso, Juhani

    2015-04-01

    Although mining companies have long been conscious of water related risks they still face environmental management problems. These problems mainly emerge because mine sites' water balances have not been adequately assessed in the stage of the planning of mines. A more consistent approach is required to help mining companies identify risks and opportunities related to the management of water resources in all stages of mining. This approach requires that the water cycle of a mine site is interconnected with the general hydrologic water cycle. In addition to knowledge on hydrological conditions, the control of the water balance in the mining processes requires knowledge of mining processes, the ability to adjust process parameters to variable hydrological conditions, adaptation of suitable water management tools and systems, systematic monitoring of amounts and quality of water, adequate capacity in water management infrastructure to handle the variable water flows, best practices to assess the dispersion, mixing and dilution of mine water and pollutant loading to receiving water bodies, and dewatering and separation of water from tailing and precipitates. The WaterSmart project aims to improve the awareness of actual quantities of water, and water balances in mine areas to improve the forecasting and the management of the water volumes. The study is executed through hydrogeological and hydrological surveys and online monitoring procedures. One of the aims is to exploit on-line water quantity and quality monitoring for the better management of the water balances. The target is to develop practical and end-user-specific on-line input and output procedures. The second objective is to develop mathematical models to calculate combined water balances including the surface, ground and process waters. WSFS, the Hydrological Modeling and Forecasting System of SYKE is being modified for mining areas. 
New modelling tools are developed on spreadsheet and system dynamics platforms to systematically integrate all water balance components (groundwater, surface water, infiltration, precipitation, mine water facilities and operations etc.) into overall dynamic mine site considerations. After coupling the surface and ground water models (e.g. Feflow and WSFS) with each other, they are compared with Goldsim. The third objective is to integrate the monitoring and modelling tools into the mine management system and process control. The modelling and predictive process control can prevent flood situations, ensure water adequacy, and enable the controlled mine water treatment. The project will develop a constantly updated management system for water balance including both natural waters and process waters.

  20. Underground coal mine instrumentation and test

    NASA Technical Reports Server (NTRS)

    Burchill, R. F.; Waldron, W. D.

    1976-01-01

    The need to evaluate mechanical performance of mine tools and to obtain test performance data from candidate systems dictates that an engineering data recording system be built. Because of the wide range of test parameters which would be evaluated, a general purpose data gathering system was designed and assembled to permit maximum versatility. A primary objective of this program was to provide a specific operating evaluation of a longwall mining machine vibration response under normal operating conditions. A number of mines were visited and a candidate for test evaluation was selected, based upon management cooperation, machine suitability, and mine conditions. Actual mine testing took place in a West Virginia mine.

  1. High area rate reconnaissance (HARR) and mine reconnaissance/hunter (MR/H) exploratory development programs

    NASA Astrophysics Data System (ADS)

    Lathrop, John D.

    1995-06-01

    This paper describes the sea mine countermeasures developmental context, technology goals, and progress to date of the two principal Office of Naval Research exploratory development programs addressing sea mine reconnaissance and minehunting technology development. The first of these programs, High Area Rate Reconnaissance, is developing toroidal volume search sonar technology, sidelooking sonar technology, and associated signal processing technologies (motion compensation, beamforming, and computer-aided detection and classification) for reconnaissance and hunting against volume mines and proud bottom mines from 21-inch diameter vehicles operating in deeper waters. The second of these programs, Amphibious Operation Area Mine Reconnaissance/Hunter, is developing a suite of sensor technologies (synthetic aperture sonar, ahead-looking sonar, superconducting magnetic field gradiometer, and electro-optic sensor) and associated signal processing technologies for reconnaissance and hunting against all mine types (including buried mines) in shallow water and very shallow water from 21-inch diameter vehicles. The technologies under development by these two programs must provide excellent capabilities for mine detection, mine classification, and discrimination against false targets.

  2. An overview of the biocreative 2012 workshop track III: Interactive text mining task

    USDA-ARS's Scientific Manuscript database

    An important question is how to make use of text mining to enhance the biocuration workflow. A number of groups have developed tools for text mining from a computer science/linguistics perspective and there are many initiatives to curate some aspect of biology from the literature. In some cases the ...

  3. Cataloging the biomedical world of pain through semi-automated curation of molecular interactions

    PubMed Central

    Jamieson, Daniel G.; Roberts, Phoebe M.; Robertson, David L.; Sidders, Ben; Nenadic, Goran

    2013-01-01

    The vast collection of biomedical literature and its continued expansion has presented a number of challenges to researchers who require structured findings to stay abreast of and analyze molecular mechanisms relevant to their domain of interest. By structuring literature content into topic-specific machine-readable databases, the aggregate data from multiple articles can be used to infer trends that can be compared and contrasted with similar findings from topic-independent resources. Our study presents a generalized procedure for semi-automatically creating a custom topic-specific molecular interaction database through the use of text mining to assist manual curation. We apply the procedure to capture molecular events that underlie ‘pain’, a complex phenomenon with a large societal burden and unmet medical need. We describe how existing text mining solutions are used to build a pain-specific corpus, extract molecular events from it, add context to the extracted events and assess their relevance. The pain-specific corpus contains 765 692 documents from Medline and PubMed Central, from which we extracted 356 499 unique normalized molecular events, with 261 438 single protein events and 93 271 molecular interactions supplied by BioContext. Event chains are annotated with negation, speculation, anatomy, Gene Ontology terms, mutations, pain and disease relevance, which collectively provide detailed insight into how that event chain is associated with pain. The extracted relations are visualized in a wiki platform (wiki-pain.org) that enables efficient manual curation and exploration of the molecular mechanisms that underlie pain. Curation of 1500 grouped event chains ranked by pain relevance revealed 613 accurately extracted unique molecular interactions that in the future can be used to study the underlying mechanisms involved in pain. 
Our approach demonstrates that combining existing text mining tools with domain-specific terms and wiki-based visualization can facilitate rapid curation of molecular interactions to create a custom database. Database URL: ••• PMID:23707966

  4. Cataloging the biomedical world of pain through semi-automated curation of molecular interactions.

    PubMed

    Jamieson, Daniel G; Roberts, Phoebe M; Robertson, David L; Sidders, Ben; Nenadic, Goran

    2013-01-01

    The vast collection of biomedical literature and its continued expansion has presented a number of challenges to researchers who require structured findings to stay abreast of and analyze molecular mechanisms relevant to their domain of interest. By structuring literature content into topic-specific machine-readable databases, the aggregate data from multiple articles can be used to infer trends that can be compared and contrasted with similar findings from topic-independent resources. Our study presents a generalized procedure for semi-automatically creating a custom topic-specific molecular interaction database through the use of text mining to assist manual curation. We apply the procedure to capture molecular events that underlie 'pain', a complex phenomenon with a large societal burden and unmet medical need. We describe how existing text mining solutions are used to build a pain-specific corpus, extract molecular events from it, add context to the extracted events and assess their relevance. The pain-specific corpus contains 765 692 documents from Medline and PubMed Central, from which we extracted 356 499 unique normalized molecular events, with 261 438 single protein events and 93 271 molecular interactions supplied by BioContext. Event chains are annotated with negation, speculation, anatomy, Gene Ontology terms, mutations, pain and disease relevance, which collectively provide detailed insight into how that event chain is associated with pain. The extracted relations are visualized in a wiki platform (wiki-pain.org) that enables efficient manual curation and exploration of the molecular mechanisms that underlie pain. Curation of 1500 grouped event chains ranked by pain relevance revealed 613 accurately extracted unique molecular interactions that in the future can be used to study the underlying mechanisms involved in pain. 
Our approach demonstrates that combining existing text mining tools with domain-specific terms and wiki-based visualization can facilitate rapid curation of molecular interactions to create a custom database. Database URL: •••

  5. The requirements for implementing Sustainable Development Goals (SDGs) and for planning and implementing Integrated Territorial Investments (ITI) in mining areas

    NASA Astrophysics Data System (ADS)

    Florkowska, Lucyna; Bryt-Nitarska, Izabela

    2018-04-01

    The notion of Integrated Territorial Investments (ITI) appears more and more frequently in contemporary regional development strategies. Formulating the main assumptions of ITI is a response to a growing need for a co-ordinated, multi-dimensional regional development suitable for the characteristics of a given area. Activities are mainly aimed at improving people's quality of life with their significant participation. These activities include implementing the Sustainable Development Goals (SDGs). Territorial investments include, among others, projects in areas where land and building use is governed not only by general regulations (Spatial Planning and Land Development Act) but also by separate legal acts. This issue also concerns areas with active mines and post-mining areas undergoing revitalization. For the areas specified above, land development, and in particular making building investments, is subject to the requirements set forth in the Geological and Mining Law and in the general regulations. In practice this means that factors connected with the present and future mining impacts must be taken into consideration in planning the investment process. This article discusses the role of proper assessment of local geological conditions as well as the current and future mining situation in the context of proper planning and performance of the Integrated Territorial Investment programme and also in the context of implementing the SDGs. It also describes the technical and legislative factors which need to be taken into consideration in areas where mining is planned or where it took place in the past.

  6. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges.

    PubMed

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system 'accuracy' remains a challenge and identify several additional common difficulties and potential research directions including (i) the 'scalability' issue due to the increasing need of mining information from millions of full-text articles, (ii) the 'interoperability' issue of integrating various text-mining systems into existing curation workflows and (iii) the 'reusability' issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  7. Finding Spatio-Temporal Patterns in Large Sensor Datasets

    ERIC Educational Resources Information Center

    McGuire, Michael Patrick

    2010-01-01

    Spatial or temporal data mining tasks are performed in the context of the relevant space, defined by a spatial neighborhood, and the relevant time period, defined by a specific time interval. Furthermore, when mining large spatio-temporal datasets, interesting patterns typically emerge where the dataset is most dynamic. This dissertation is…

  8. Contextual Text Mining

    ERIC Educational Resources Information Center

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  9. Sewer-mining: A water reuse option supporting circular economy, public service provision and entrepreneurship.

    PubMed

    Makropoulos, C; Rozos, E; Tsoukalas, I; Plevri, A; Karakatsanis, G; Karagiannidis, L; Makri, E; Lioumis, C; Noutsopoulos, C; Mamais, D; Rippis, C; Lytras, E

    2018-06-15

    Water scarcity, either due to increased urbanisation or climatic variability, has motivated societies to reduce pressure on water resources mainly by reducing water demand. However, this practice alone is not sufficient to guarantee the quality of life that high quality water services underpin, especially within a context of increased urbanisation. As such, the idea of water reuse has been gaining momentum for some time and has recently found a more general context within the idea of the Circular Economy. This paper is set within the context of an ongoing discussion between centralized and decentralized water reuse techniques and the investigation of trade-offs between efficiency and economic viability of reuse at different scales. Specifically, we argue for an intermediate scale of a water reuse option termed 'sewer-mining', which could be considered a reuse scheme at the neighbourhood scale. We suggest that sewer mining (a) provides a feasible alternative reuse option when the geography of the wastewater treatment plant is problematic, (b) relies on mature treatment technologies and (c) presents an opportunity for Small Medium Enterprises (SME) to be involved in the water market, securing environmental, social and economic benefits. To support this argument, we report on a pilot sewer-mining application in Athens, Greece. The pilot integrates two subsystems: a packaged treatment unit and an information and communications technology (ICT) infrastructure. The paper reports on the pilot's overall performance and critically evaluates the potential of the sewer-mining idea to become a significant piece of the circular economy puzzle for water. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Application of the Biosonar Measurement Tool (BMT) and Instrumented Mine Simulators (IMS) to Exploration of Dolphin Echolocation During Free-Swimming, Bottom-Object Searches

    DTIC Science & Technology

    2003-09-01

    0-933957-31-9 311 Application of the Biosonar Measurement Tool (BMT) and Instrumented...dolphin biosonar (echolocation). Research work conducted by the Navy has addressed the characteristics of echolocation clicks, mechanisms of...information on dolphin echolocation that can be data mined for biosonar search strategies under real-world conditions. Results can be applied to the

  11. Test operation of a pneumatic vibrating-blade planer in phosphate and coal: a progress report on planer-mining research, 1958--1960

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, W.S.

    1962-01-01

    Third report in a series describes progress in research with the pneumatic vibrating-blade planer: Tests conducted in the Arickaree phosphate mine in Utah and in the Roslyn No. 9 coal mine in Washington. After the Arickaree mine tests, the bit design was improved, and tests were conducted in the Roslyn No. 9 mine to check the modifications. The redesigned cutting tool was an improvement, and the possibility of planing coal as well as phosphate was proved.

  12. Mining Context-Aware Association Rules Using Grammar-Based Genetic Programming.

    PubMed

    Luna, Jose Maria; Pechenizkiy, Mykola; Del Jesus, Maria Jose; Ventura, Sebastian

    2017-09-25

    Real-world data usually comprise features whose interpretation depends on some contextual information. Such context-sensitive features and patterns are of high interest and need to be discovered and analyzed in order to obtain the right meaning. This paper formulates the problem of mining context-aware association rules, which refers to the search for associations between itemsets such that the strength of their implication depends on a contextual feature. For the discovery of this type of association, a model that restricts the search space and includes syntax constraints by means of a grammar-based genetic programming methodology is proposed. Grammars can be considered a useful way of introducing subjective knowledge to the pattern mining process, as they are highly related to the background knowledge of the user. The performance and usefulness of the proposed approach are examined by considering synthetically generated datasets. A posteriori analysis on different domains is also carried out to demonstrate the utility of this kind of association. For example, in educational domains, it is essential to identify and understand contextual and context-sensitive factors that affect overall and individual student behavior and performance. The results of the experiments suggest that the approach is feasible and that it automatically identifies interesting context-aware associations from real-world datasets.
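
    The notion that the strength of a rule "depends on a contextual feature" can be illustrated with a small sketch. This is not the grammar-based genetic programming method of the paper; it only computes the support and confidence of one fixed rule separately for each context value. The function name, the toy transactions and the context labels are all hypothetical.

```python
from collections import defaultdict


def rule_strength_by_context(transactions, antecedent, consequent):
    """Return {context: (support, confidence)} for the rule antecedent -> consequent.

    `transactions` is an iterable of (context, items) pairs (hypothetical layout).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    totals = defaultdict(int)     # transactions seen per context
    ante_hits = defaultdict(int)  # transactions containing the antecedent
    both_hits = defaultdict(int)  # transactions containing antecedent and consequent
    for context, items in transactions:
        items = set(items)
        totals[context] += 1
        if antecedent <= items:
            ante_hits[context] += 1
            if consequent <= items:
                both_hits[context] += 1
    result = {}
    for context, n in totals.items():
        support = both_hits[context] / n
        # Confidence is undefined when the antecedent never occurs; report 0.0 here.
        confidence = both_hits[context] / ante_hits[context] if ante_hits[context] else 0.0
        result[context] = (support, confidence)
    return result


# Toy data: the same rule {coffee} -> {bagel} is stronger on weekdays than on weekends.
transactions = [
    ("weekday", {"coffee", "bagel"}),
    ("weekday", {"coffee", "bagel"}),
    ("weekday", {"coffee"}),
    ("weekend", {"coffee"}),
    ("weekend", {"coffee", "juice"}),
    ("weekend", {"coffee", "bagel"}),
    ("weekend", {"juice"}),
]
stats = rule_strength_by_context(transactions, {"coffee"}, {"bagel"})
for context, (support, confidence) in sorted(stats.items()):
    print(f"{context}: support={support:.2f} confidence={confidence:.2f}")
```

    A rule whose confidence differs markedly between context values, as in this toy data, is the kind of pattern a context-aware miner would try to surface.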

  13. Organizational-Legal and Technological Aspects of Ensuring Environmental Safety of Mining Enterprises: Perspective Analysis in the Context of the General Enhancement of Environmental Problem

    NASA Astrophysics Data System (ADS)

    Vorontsova, Elena; Vorontsov, Andrey; Drozdenko, Yuriy

    2017-11-01

    The article analyzes the problems of maintaining environmental safety at mining enterprises. The aim of the work was to formulate proposals whose implementation, in the opinion of the authors, is capable of raising the level of environmental safety of the mining industry and ultimately ensuring the environmentally oriented growth of the Russian economy.

  14. Data mining: sophisticated forms of managed care modeling through artificial intelligence.

    PubMed

    Borok, L S

    1997-01-01

    Data mining is a recent development in computer science that combines artificial intelligence algorithms and relational databases to discover patterns automatically, without the use of traditional statistical methods. Work with data mining tools in health care is in a developmental stage that holds great promise, given the combination of demographic and diagnostic information.

  15. Closedure - Mine Closure Technologies Resource

    NASA Astrophysics Data System (ADS)

    Kauppila, Päivi; Kauppila, Tommi; Pasanen, Antti; Backnäs, Soile; Liisa Räisänen, Marja; Turunen, Kaisa; Karlsson, Teemu; Solismaa, Lauri; Hentinen, Kimmo

    2015-04-01

    Closure of mining operations is an essential part of the development of eco-efficient mining and the Green Mining concept in Finland to reduce the environmental footprint of mining. Closedure is a 2-year joint research project between Geological Survey of Finland and Technical Research Centre of Finland that aims at developing accessible tools and resources for planning, executing and monitoring mine closure. The main outcome of the Closedure project is an updatable wiki technology-based internet platform (http://mineclosure.gtk.fi) in which comprehensive guidance on the mine closure is provided and main methods and technologies related to mine closure are evaluated. Closedure also provides new data on the key issues of mine closure, such as performance of passive water treatment in Finland, applicability of test methods for evaluating cover structures for mining wastes, prediction of water effluents from mine wastes, and isotopic and geophysical methods to recognize contaminant transport paths in crystalline bedrock.

  16. Human Behavior Analysis by Means of Multimodal Context Mining

    PubMed Central

    Banos, Oresti; Villalonga, Claudia; Bang, Jaehun; Hur, Taeho; Kang, Donguk; Park, Sangbeom; Huynh-The, Thien; Le-Ba, Vui; Amin, Muhammad Bilal; Razzaq, Muhammad Asif; Khan, Wahajat Ali; Hong, Choong Seon; Lee, Sungyoung

    2016-01-01

    There is sufficient evidence proving the impact that negative lifestyle choices have on people’s health and wellness. Changing unhealthy behaviours requires raising people’s self-awareness and also providing healthcare experts with a thorough and continuous description of the user’s conduct. Several monitoring techniques have been proposed in the past to track users’ behaviour; however, these approaches are either subjective and prone to misreporting, such as questionnaires, or only focus on a specific component of context, such as activity counters. This work presents an innovative multimodal context mining framework to inspect and infer human behaviour in a more holistic fashion. The proposed approach extends beyond the state-of-the-art, since it not only explores a sole type of context, but also combines diverse levels of context in an integral manner. Namely, low-level contexts, including activities, emotions and locations, are identified from heterogeneous sensory data through machine learning techniques. Low-level contexts are combined using ontological mechanisms to derive a more abstract representation of the user’s context, here referred to as high-level context. An initial implementation of the proposed framework supporting real-time context identification is also presented. The developed system is evaluated for various realistic scenarios making use of a novel multimodal context open dataset and data on-the-go, demonstrating prominent context-aware capabilities at both low and high levels. PMID:27517928

  17. Human Behavior Analysis by Means of Multimodal Context Mining.

    PubMed

    Banos, Oresti; Villalonga, Claudia; Bang, Jaehun; Hur, Taeho; Kang, Donguk; Park, Sangbeom; Huynh-The, Thien; Le-Ba, Vui; Amin, Muhammad Bilal; Razzaq, Muhammad Asif; Khan, Wahajat Ali; Hong, Choong Seon; Lee, Sungyoung

    2016-08-10

    There is sufficient evidence proving the impact that negative lifestyle choices have on people's health and wellness. Changing unhealthy behaviours requires raising people's self-awareness and also providing healthcare experts with a thorough and continuous description of the user's conduct. Several monitoring techniques have been proposed in the past to track users' behaviour; however, these approaches are either subjective and prone to misreporting, such as questionnaires, or only focus on a specific component of context, such as activity counters. This work presents an innovative multimodal context mining framework to inspect and infer human behaviour in a more holistic fashion. The proposed approach extends beyond the state-of-the-art, since it not only explores a sole type of context, but also combines diverse levels of context in an integral manner. Namely, low-level contexts, including activities, emotions and locations, are identified from heterogeneous sensory data through machine learning techniques. Low-level contexts are combined using ontological mechanisms to derive a more abstract representation of the user's context, here referred to as high-level context. An initial implementation of the proposed framework supporting real-time context identification is also presented. The developed system is evaluated for various realistic scenarios making use of a novel multimodal context open dataset and data on-the-go, demonstrating prominent context-aware capabilities at both low and high levels.

  18. Automatic detection of referral patients due to retinal pathologies through data mining.

    PubMed

    Quellec, Gwenolé; Lamard, Mathieu; Erginay, Ali; Chabouis, Agnès; Massin, Pascale; Cochener, Béatrice; Cazuguel, Guy

    2016-04-01

    With the increased prevalence of retinal pathologies, automating the detection of these pathologies is becoming more and more relevant. In the past few years, many algorithms have been developed for the automated detection of a specific pathology, typically diabetic retinopathy, using eye fundus photography. No matter how good these algorithms are, we believe many clinicians would not use automatic detection tools focusing on a single pathology and ignoring any other pathology present in the patient's retinas. To solve this issue, an algorithm for characterizing the appearance of abnormal retinas, as well as the appearance of the normal ones, is presented. This algorithm does not focus on individual images: it considers examination records consisting of multiple photographs of each retina, together with contextual information about the patient. Specifically, it relies on data mining in order to learn diagnosis rules from characterizations of fundus examination records. The main novelty is that the content of examination records (images and context) is characterized at multiple levels of spatial and lexical granularity: 1) spatial flexibility is ensured by an adaptive decomposition of composite retinal images into a cascade of regions, 2) lexical granularity is ensured by an adaptive decomposition of the feature space into a cascade of visual words. This multigranular representation allows for great flexibility in automatically characterizing normality and abnormality: it is possible to generate diagnosis rules whose precision and generalization ability can be traded off depending on data availability. A variation on usual data mining algorithms, originally designed to mine static data, is proposed so that contextual and visual data at adaptive granularity levels can be mined. This framework was evaluated in e-ophtha, a dataset of 25,702 examination records from the OPHDIAT screening network, as well as in the publicly-available Messidor dataset. It was successfully applied to the detection of patients that should be referred to an ophthalmologist and also to the specific detection of several pathologies. Copyright © 2016 Elsevier B.V. All rights reserved.

  19. To Walk the Earth in Safety: The United States’ Commitment to Humanitarian Mine Action and Conventional Weapons Destruction

    DTIC Science & Technology

    2008-06-01

    Organization of American States 42 Mine Action Information Center 46 Mine Detection Dog Center 48 for Southeast Europe International Trust Fund for...probes. Mine detecting dogs (MDDs), and mechanical demining tools, such as flails and tillers, are also used. Other bio-sensors such as bees and...members; the survivor must overcome both physical difficulties and feelings of inadequacy or worthlessness to regain a productive life. For these

  20. QuadBase2: web server for multiplexed guanine quadruplex mining and visualization

    PubMed Central

    Dhapola, Parashar; Chowdhury, Shantanu

    2016-01-01

    DNA guanine quadruplexes or G4s are non-canonical DNA secondary structures which affect genomic processes like replication, transcription and recombination. G4s are computationally identified by specific nucleotide motifs which are also called putative G4 (PG4) motifs. Despite the general relevance of these structures, there is currently no tool available that can allow batch queries and genome-wide analysis of these motifs in a user-friendly interface. QuadBase2 (quadbase.igib.res.in) presents a completely reinvented web server version of the previously published QuadBase database. QuadBase2 enables users to mine PG4 motifs in up to 178 eukaryotes through the EuQuad module. This module interfaces with the Ensembl Compara database to allow users to mine PG4 motifs in the orthologues of genes of interest across eukaryotes. PG4 motifs can be mined across genes and their promoter sequences in 1719 prokaryotes through the ProQuad module. This module includes a feature that allows genome-wide mining of PG4 motifs and their visualization as circular histograms. TetraplexFinder, the module for mining PG4 motifs in user-provided sequences, is now capable of handling up to 20 MB of data. QuadBase2 is a comprehensive PG4 motif mining tool that further expands the configurations and algorithms for mining PG4 motifs in a user-friendly way. PMID:27185890
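
    As a rough illustration of what "mining PG4 motifs" by nucleotide pattern can look like, the sketch below scans a sequence with a regular expression. The abstract does not state which motif definitions QuadBase2 uses; the pattern here (four runs of at least three guanines separated by loops of 1 to 7 nucleotides) is a commonly used definition and is an assumption, as are the names `PG4_PATTERN` and `find_pg4`. Only the given strand is scanned and overlapping matches are not reported.

```python
import re

# Assumed motif: G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}.
# This is an illustrative definition, not necessarily the one used by QuadBase2.
PG4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_pg4(sequence):
    """Return (start, end, motif) tuples for non-overlapping matches on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PG4_PATTERN.finditer(seq)]


# Toy input containing four GGG runs separated by TTA loops.
hits = find_pg4("aaGGGTTAGGGTTAGGGTTAGGGaa")
for start, end, motif in hits:
    print(start, end, motif)
```

    Scanning the reverse strand would amount to running the same function on the reverse complement of the sequence, or using the equivalent cytosine-based pattern.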

  1. Mining Land Subsidence Monitoring Using SENTINEL-1 SAR Data

    NASA Astrophysics Data System (ADS)

    Yuan, W.; Wang, Q.; Fan, J.; Li, H.

    2017-09-01

    In this paper, the DInSAR technique was used to monitor land subsidence in a mining area. The study area is a coal mining area located in Yuanbaoshan District, Chifeng City, and Sentinel-1 data were used to carry out the DInSAR processing. We analyzed the interferometric results from Sentinel-1 data acquired between December 2015 and May 2016. A comparison of the DInSAR results with the location of the mine on optical images shows that the DInSAR technique can effectively monitor the land subsidence caused by underground mining, and that it is an effective tool for law enforcement against over-mining.
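
    For readers unfamiliar with DInSAR, the core conversion from unwrapped differential phase to line-of-sight displacement can be sketched as follows. This is a generic textbook relation, not the processing chain used in the paper. The radar frequency is the nominal Sentinel-1 C-band value; the sign convention (which direction counts as positive) varies between processors and is an assumption here, as is the function name.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s
SENTINEL1_FREQUENCY = 5.405e9    # Hz, nominal C-band centre frequency
WAVELENGTH = SPEED_OF_LIGHT / SENTINEL1_FREQUENCY  # roughly 0.0555 m


def los_displacement(unwrapped_phase_rad, wavelength=WAVELENGTH):
    """Convert unwrapped differential phase (radians) to line-of-sight displacement (metres).

    Assumed convention: d = -(wavelength / (4 * pi)) * phase. The factor 4*pi
    reflects the two-way travel path of the radar signal.
    """
    return -(wavelength / (4.0 * math.pi)) * unwrapped_phase_rad


# One full interferometric fringe (2*pi) corresponds to half a wavelength of motion.
one_fringe = los_displacement(2.0 * math.pi)
print(f"wavelength = {WAVELENGTH * 100:.2f} cm, one fringe = {abs(one_fringe) * 100:.2f} cm")
```

    Each fringe in a Sentinel-1 interferogram therefore represents slightly under 3 cm of line-of-sight motion, which is why subsidence bowls over active workings show up as concentric fringe patterns.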

  2. Financing Renewable Energy Projects on Contaminated Lands, Landfills, and Mine Sites

    EPA Pesticide Factsheets

    Provides information concerning financing tools and structures, as well as federal financial incentives that may be available for redeveloping potentially contaminated sites, landfills, or mine sites for renewable energy for site owners.

  3. Software Tools Streamline Project Management

    NASA Technical Reports Server (NTRS)

    2009-01-01

    Three innovative software inventions from Ames Research Center (NETMARK, Program Management Tool, and Query-Based Document Management) are finding their way into NASA missions as well as industry applications. The first, NETMARK, is a program that enables integrated searching of data stored in a variety of databases and documents, meaning that users no longer have to look in several places for related information. NETMARK allows users to search and query information across all of these sources in one step. This cross-cutting capability in information analysis has exponentially reduced the amount of time needed to mine data from days or weeks to mere seconds. NETMARK has been used widely throughout NASA, enabling this automatic integration of information across many documents and databases. NASA projects that use NETMARK include the internal reporting system and project performance dashboard, Erasmus, NASA's enterprise management tool, which enhances organizational collaboration and information sharing through document routing and review; the Integrated Financial Management Program; International Space Station Knowledge Management; Mishap and Anomaly Information Reporting System; and management of the Mars Exploration Rovers. Approximately $1 billion worth of NASA's projects are currently managed using Program Management Tool (PMT), which is based on NETMARK. PMT is a comprehensive, Web-enabled application tool used to assist program and project managers within NASA enterprises in monitoring, disseminating, and tracking the progress of program and project milestones and other relevant resources. The PMT consists of an integrated knowledge repository built upon advanced enterprise-wide database integration techniques and the latest Web-enabled technologies. The current system is in a pilot operational mode allowing users to automatically manage, track, define, update, and view customizable milestone objectives and goals. The third software invention, Query-Based Document Management (QBDM), is a tool that enables content or context searches, either simple or hierarchical, across a variety of databases. The system enables users to specify notification subscriptions where they associate "contexts of interest" and "events of interest" to one or more documents or collection(s) of documents. Based on these subscriptions, users receive notification when the events of interest occur within the contexts of interest for associated document or collection(s) of documents. Users can also associate at least one notification time as part of the notification subscription, with at least one option for the time period of notifications.

  4. Text mining resources for the life sciences.

    PubMed

    Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable: those that have the crucial ability to share information, enabling smooth integration and reusability. © The Author(s) 2016. Published by Oxford University Press.

  5. Text mining resources for the life sciences

    PubMed Central

    Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231

  6. Indoor metallic pollution and children exposure in a mining city.

    PubMed

    Barbieri, Enio; Fontúrbel, Francisco E; Herbas, Cristian; Barbieri, Flavia L; Gardon, Jacques

    2014-07-15

    Mining industries are known for causing strong environmental contamination. In most developing countries, the management of mining wastes is not adequate, usually contaminating soil, water and air. This situation is a source of concern for human settlements located near mining centers, especially for vulnerable populations such as children. The aim of this study was to assess the correlations of the metallic concentrations between household dust and children's hair, comparing these associations in two different contamination contexts: a mining district and a suburban non-mining area. We collected 113 hair samples from children between 7 and 12 years of age in elementary schools in the mining city of Oruro, Bolivia. We collected 97 indoor dust samples from their households, as well as information about the children's behavior. Analyses of hair and dust samples were conducted to measure As, Cd, Pb, Sb, Sn, Cu and Zn contents. In the mining district, there were significant correlations between non-essential metallic elements (As, Cd, Pb, Sb and Sn) in dust and hair, but not for essential elements (Cu and Zn), which remained after adjusting for children's habits. Children who played with dirt had higher dust-hair correlations for Pb, Sb, and Cu (P=0.006; 0.022 and 0.001 respectively) and children who put hands or toys in their mouths had higher dust-hair correlations of Cd (P=0.011). On the contrary, in the suburban area, no significant correlations were found between metallic elements in dust and children's hair, and neither children's behavior nor gender modified this lack of associations. Our results suggest that, in a context of high metallic contamination, indoor dust becomes an important exposure pathway for children, modulated by their playing behavior. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. Recommendation in Higher Education Using Data Mining Techniques

    ERIC Educational Resources Information Center

    Vialardi, Cesar; Bravo, Javier; Shafti, Leila; Ortigosa, Alvaro

    2009-01-01

    One of the main problems faced by university students is to take the right decision in relation to their academic itinerary based on available information (for example courses, schedules, sections, classrooms and professors). In this context, this work proposes the use of a recommendation system based on data mining techniques to help students to…

  8. Target-Based Maintenance of Privacy Preserving Association Rules

    ERIC Educational Resources Information Center

    Ahluwalia, Madhu V.

    2011-01-01

    In the context of association rule mining, the state-of-the-art in privacy preserving data mining provides solutions for categorical and Boolean association rules but not for quantitative association rules. This research fills this gap by describing a method based on discrete wavelet transform (DWT) to protect input data privacy while preserving…

  9. RADSS: an integration of GIS, spatial statistics, and network service for regional data mining

    NASA Astrophysics Data System (ADS)

    Hu, Haitang; Bao, Shuming; Lin, Hui; Zhu, Qing

    2005-10-01

    Regional data mining, which aims at the discovery of knowledge about spatial patterns, clusters or association between regions, has wide applications nowadays in the social sciences, such as sociology, economics, epidemiology, crime, and so on. Many applications in the regional or other social sciences are more concerned with the spatial relationship than with the precise geographical location. Based on the spatial continuity rule derived from Tobler's first law of geography (observations at two sites tend to be more similar to each other if the sites are close together than if far apart), spatial statistics, as an important means for spatial data mining, allow the users to extract interesting and useful information such as spatial pattern, spatial structure, spatial association, spatial outlier and spatial interaction from the vast amount of spatial or non-spatial data. Therefore, by integrating with spatial statistical methods, geographical information systems will become more powerful in gaining further insights into the nature of the spatial structure of regional systems, and will help researchers to be more careful when selecting appropriate models. However, the lack of such tools holds back the application of spatial data analysis techniques and the development of new methods and models (e.g., spatio-temporal models). Herein, we make an attempt to develop such integrated software and apply it to complex system analysis for the Poyang Lake Basin. This paper presents a framework for integrating GIS, spatial statistics and network service in regional data mining, as well as their implementation. After discussing the spatial statistics methods involved in regional complex system analysis, we introduce RADSS (Regional Analysis and Decision Support System), our new regional data mining tool, which integrates GIS, spatial statistics and network service. RADSS includes the functions of spatial data visualization, exploratory spatial data analysis, and spatial statistics. The tool also includes some fundamental spatial and non-spatial databases on regional population and environment, which can be updated from an external database via CD or network. Utilizing this data mining and exploratory analytical tool, the users can easily and quickly analyse the huge amount of interrelated regional data, and better understand the spatial patterns and trends of regional development, so as to make credible and scientific decisions. Moreover, it can be used as an educational tool for spatial data analysis and environmental studies. In this paper, we also present a case study on the Poyang Lake Basin as an application of the tool and of spatial data mining in complex environmental studies. Finally, several concluding remarks are given.
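
    Tobler's first law, cited in the abstract above, is often quantified with a global spatial autocorrelation statistic. The sketch below computes Moran's I for values observed in a handful of regions. The abstract does not say which statistics RADSS implements, so the choice of Moran's I is an assumption, and the function name, the binary adjacency weights and the toy values are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weight matrix `weights`.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where W is the sum of all weights. Values near +1 indicate that neighbouring
    regions are similar; values near -1 indicate that neighbours are dissimilar.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / weight_total) * (cross / variance_sum)


# Four regions arranged in a row; neighbours share a border (binary adjacency).
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain_weights)       # gradual trend: positive I
alternating = morans_i([1, 4, 1, 4], chain_weights)  # checkerboard: negative I
print(f"smooth trend: {smooth:.3f}, alternating: {alternating:.3f}")
```

    In practice the weight matrix would be derived from region adjacency or distance in the GIS layer, and significance would be assessed by permutation; both steps are omitted here.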

  10. Transplantation of epiphytic bioaccumulators (Tillandsia capillaris) for high spatial resolution biomonitoring of trace elements and point sources deconvolution in a complex mining/smelting urban context

    NASA Astrophysics Data System (ADS)

    Goix, Sylvaine; Resongles, Eléonore; Point, David; Oliva, Priscia; Duprey, Jean Louis; de la Galvez, Erika; Ugarte, Lincy; Huayta, Carlos; Prunier, Jonathan; Zouiten, Cyril; Gardon, Jacques

    2013-12-01

    Monitoring atmospheric trace element (TE) levels and tracing their source origin is essential for exposure assessment and human health studies. Epiphytic Tillandsia capillaris plants were used as bioaccumulators of TE in a complex polymetallic mining/smelting urban context (Oruro, Bolivia). Specimens collected from a pristine reference site were transplanted at a high spatial resolution (~1 sample/km²) throughout the urban area. About twenty-seven elements were measured after a 4-month exposure, also providing new information values for reference material BCR482. Statistical power analysis for this biomonitoring mapping approach against classical aerosol surveys performed on the same site showed the better aptitude of T. capillaris to detect geographical trends, and to deconvolute multiple contamination sources using geostatistical principal component analysis. Transplanted specimens in the vicinity of the mining and smelting areas were characterized by extreme TE accumulation (Sn > Ag > Sb > Pb > Cd > As > W > Cu > Zn). Three contamination sources were identified: mining (Ag, Pb, Sb), smelting (As, Sn) and road traffic (Zn) emissions, confirming results of a previous aerosol survey.

  11. Unsupervised User Similarity Mining in GSM Sensor Networks

    PubMed Central

    Shad, Shafqat Ali; Chen, Enhong

    2013-01-01

    Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature, where this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, user's actual movement prediction, and context awareness. However, significant places extraction and user's actual movement prediction for mobility profile building are a trivial task. In this paper, we present a user similarity mining-based methodology through user mobility profile building, using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wi-Fi for profile mining and user similarity mining. PMID:23576905

  12. Development of a multilevel health and safety climate survey tool within a mining setting.

    PubMed

    Parker, Anthony W; Tones, Megan J; Ritchie, Gabrielle E

    2017-09-01

    This study aimed to design, implement and evaluate the reliability and validity of a multifactorial and multilevel health and safety climate survey (HSCS) tool with utility in the Australian mining setting. An 84-item questionnaire was developed and pilot tested on a sample of 302 Australian miners across two open cut sites. A 67-item, 10 factor solution was obtained via exploratory factor analysis (EFA) representing prioritization and attitudes to health and safety across multiple domains and organizational levels. Each factor demonstrated a high level of internal reliability, and a series of ANOVAs determined a high level of consistency in responses across the workforce, and generally irrespective of age, experience or job category. Participants tended to hold favorable views of occupational health and safety (OH&S) climate at the management, supervisor, workgroup and individual level. The survey tool demonstrated reliability and validity for use within an open cut Australian mining setting and supports a multilevel, industry specific approach to OH&S climate. Findings suggested a need for mining companies to maintain high OH&S standards to minimize risks to employee health and safety. Future research is required to determine the ability of this measure to predict OH&S outcomes and its utility within other mine settings. As this tool integrates health and safety, it may have benefits for assessment, monitoring and evaluation in the industry, and improving the understanding of how health and safety climate interact at multiple levels to influence OH&S outcomes. Copyright © 2017 National Safety Council and Elsevier Ltd. All rights reserved.

  13. Neural Networks In Mining Sciences - General Overview And Some Representative Examples

    NASA Astrophysics Data System (ADS)

    Tadeusiewicz, Ryszard

    2015-12-01

    The many difficult problems that must now be addressed in mining sciences make us search for ever newer and more efficient computer tools that can be used to solve those problems. Among the numerous tools of this type are the neural networks presented in this article, which, although not yet widely used in mining sciences, are certainly worth consideration. Neural networks are a technique which belongs to so-called artificial intelligence, and originates from the attempts to model the structure and functioning of biological nervous systems. Initially constructed and tested exclusively out of scientific curiosity, as computer models of parts of the human brain, neural networks have become a surprisingly effective calculation tool in many areas: in technology, medicine, economics, and even social sciences. Unfortunately, they are relatively rarely used in mining sciences and mining technology. The article is intended to convince the readers that neural networks can be very useful also in mining sciences. It contains information on how modern neural networks are built, how they operate and how one can use them. The preliminary discussion presented in this paper can help the reader form an opinion on whether this is a tool with handy properties, useful for him, and what it might be useful for. Of course, the brief introduction to neural networks contained in this paper will not be enough for the readers who are convinced by the arguments contained here and want to use neural networks. They will still need a considerable portion of detailed knowledge so that they can begin to independently create and build such networks, and use them in practice. However, an interested reader who decides to try out the capabilities of neural networks will also find here links to references that will allow him to start exploring neural networks quickly, and then work with this handy tool efficiently. This will be easy, because there are currently quite a few ready-made computer programs, easily available, which allow their user to quickly and effortlessly create artificial neural networks, run them, train them and use them in practice. The key issue is the question of how to use these networks in mining sciences. The fact that this is possible and desirable is shown by convincing examples included in the second part of this study. From the very rich literature on the various applications of neural networks, we have selected several works that show how and for what neural networks are used in the mining industry, and what has been achieved thanks to their use. The review of applications will continue in the next article, already filed for publication in the journal "Archives of Mining Sciences". Only studying these two articles will provide sufficient knowledge for initial guidance in the area of issues under consideration here.

  14. An Integrated Suite of Text and Data Mining Tools - Phase II

    DTIC Science & Technology

    2005-08-30

    Riverside, CA, USA Mazda Motor Corp, Jpn Univ of Darmstadt, Darmstadt, Ger Navy Center for Applied Research in Artificial Intelligence Univ of...with Georgia Tech Research Corporation developed a desktop text-mining software tool named TechOASIS (known commercially as VantagePoint). By the...of this dataset and groups the Corporate Source items that co-occur with the found items. He decides he is only interested in the institutions

  15. Graphics-based intelligent search and abstracting using Data Modeling

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger M.; Handley, James W.; Case, Carl T.; Songy, Claude G.

    2002-11-01

    This paper presents an autonomous text and context-mining algorithm that converts text documents into point clouds for visual search cues. This algorithm is applied to the task of data-mining a scriptural database comprised of the Old and New Testaments from the Bible and the Book of Mormon, Doctrine and Covenants, and the Pearl of Great Price. Results are generated which graphically show the scripture that represents the average concept of the database and the mining of the documents down to the verse level.

  16. A Tools-Based Approach to Teaching Data Mining Methods

    ERIC Educational Resources Information Center

    Jafar, Musa J.

    2010-01-01

    Data mining is an emerging field of study in Information Systems programs. Although the course content has been streamlined, the underlying technology is still in a state of flux. The purpose of this paper is to describe how we utilized Microsoft Excel's data mining add-ins as a front-end to Microsoft's Cloud Computing and SQL Server 2008 Business…

  17. Data Mining and Knowledge Management: A System Analysis for Establishing a Tiered Knowledge Management Model.

    ERIC Educational Resources Information Center

    Luan, Jing; Willett, Terrence

    This paper discusses data mining--an end-to-end (ETE) data analysis tool that is used by researchers in higher education. It also relates data mining and other software programs to a brand new concept called "Knowledge Management." The paper culminates in the Tier Knowledge Management Model (TKMM), which seeks to provide a stable…

  18. A simplified economic filter for open-pit mining and heap-leach recovery of copper in the United States

    USGS Publications Warehouse

    Long, Keith R.; Singer, Donald A.

    2001-01-01

    Determining the economic viability of mineral deposits of various sizes and grades is a critical task in all phases of mineral supply, from land-use management to mine development. This study evaluates two simple tools for estimating the economic viability of porphyry copper deposits mined by open-pit, heap-leach methods when only limited information on these deposits is available. These two methods are useful for evaluating deposits that either (1) are undiscovered deposits predicted by a mineral resource assessment, or (2) have been discovered but for which little data has been collected or released. The first tool uses ordinary least-squared regression analysis of cost and operating data from selected deposits to estimate a predictive relationship between mining rate, itself estimated from deposit size, and capital and operating costs. The second method uses cost models developed by the U.S. Bureau of Mines (Camm, 1991) updated using appropriate cost indices. We find that the cost model method works best for estimating capital costs and the empirical model works best for estimating operating costs for mines to be developed in the United States.

  19. Updated regulation curation model at the Saccharomyces Genome Database

    PubMed Central

    Engel, Stacia R; Skrzypek, Marek S; Hellerstedt, Sage T; Wong, Edith D; Nash, Robert S; Weng, Shuai; Binkley, Gail; Sheppard, Travis K; Karra, Kalpana; Cherry, J Michael

    2018-01-01

    The Saccharomyces Genome Database (SGD) provides comprehensive, integrated biological information for the budding yeast Saccharomyces cerevisiae, along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms. We have recently expanded our data model for regulation curation to address regulation at the protein level in addition to transcription, and are presenting the expanded data on the ‘Regulation’ pages at SGD. These pages include a summary describing the context under which the regulator acts, manually curated and high-throughput annotations showing the regulatory relationships for that gene and a graphical visualization of its regulatory network and connected networks. For genes whose products regulate other genes or proteins, the Regulation page includes Gene Ontology enrichment analysis of the biological processes in which those targets participate. For DNA-binding transcription factors, we also provide other information relevant to their regulatory function, such as DNA binding site motifs and protein domains. As with other data types at SGD, all regulatory relationships and accompanying data are available through YeastMine, SGD’s data warehouse based on InterMine. Database URL: http://www.yeastgenome.org PMID:29688362

  20. Data mining for blood glucose prediction and knowledge discovery in diabetic patients: the METABO diabetes modeling and management system.

    PubMed

    Georga, Eleni; Protopappas, Vasilios; Guillen, Alejandra; Fico, Giuseppe; Ardigo, Diego; Arredondo, Maria Teresa; Exarchos, Themis P; Polyzos, Demosthenes; Fotiadis, Dimitrios I

    2009-01-01

    METABO is a diabetes monitoring and management system which aims to record and interpret the patient's context and to provide decision support to both the patient and the doctor. The METABO system consists of (a) a Patient's Mobile Device (PMD), (b) different types of unobtrusive biosensors, (c) a Central Subsystem (CS) located remotely at the hospital and (d) the Control Panel (CP), from which physicians can follow up on their patients and also gain access to the CS. METABO provides a multi-parametric monitoring system which facilitates the efficient and systematic recording of dietary, physical activity, medication and medical information (continuous and discontinuous glucose measurements). Based on all recorded contextual information, data mining schemes that run in the PMD are responsible for modeling patients' metabolism, predicting hypo/hyper-glycaemic events, and providing the patient with short- and long-term alerts. In addition, all past and recently recorded data are analyzed to extract patterns of behavior, discover new knowledge and provide explanations to the physician through the CP. Advanced tools in the CP allow the physician to prescribe personalized treatment plans and frequently quantify the patient's adherence to treatment.

  1. Analysis of Nature of Science Included in Recent Popular Writing Using Text Mining Techniques

    NASA Astrophysics Data System (ADS)

    Jiang, Feng; McComas, William F.

    2014-09-01

    This study examined the inclusion of nature of science (NOS) in popular science writing to determine whether it could serve as a supplementary resource for teaching NOS and to evaluate the accuracy of text mining and classification as a viable research tool in science education research. Four groups of documents published from 2001 to 2010 were analyzed: Scientific American, Discover magazine, winners of the Royal Society Winton Prize for Science Books, and books from NSTA's list of Outstanding Science Trade Books. Computer analysis categorized passages in the selected documents based on their inclusions of NOS. Human analysis assessed the frequency, context, coverage, and accuracy of the inclusions of NOS within computer identified NOS passages. NOS was rarely addressed in selected document sets but somewhat more frequently addressed in the letters section of the two magazines. This result suggests that readers seem interested in the discussion of NOS-related themes. In the popular science books analyzed, NOS presentations were found more likely to be aggregated in the beginning and the end of the book, rather than scattered throughout. The most commonly addressed NOS elements in the analyzed documents are science and society and empiricism in science. Only one inaccurate presentation of NOS was identified in all analyzed documents. The text mining technique demonstrated exciting performance, which invites more applications of the technique to analyze other aspects of science textbooks, popular science writing, or other materials involved in science teaching and learning.

  2. Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov.

    PubMed

    Su, Eric Wen; Sanger, Todd M

    2017-01-01

    Drug repositioning (i.e., drug repurposing) is the process of discovering new uses for marketed drugs. Historically, such discoveries were serendipitous. However, the rapid growth in electronic clinical data and text mining tools makes it feasible to systematically identify drugs with the potential to be repurposed. Described here is a novel method of drug repositioning by mining ClinicalTrials.gov. The text mining tools I2E (Linguamatics) and PolyAnalyst (Megaputer) were utilized. An I2E query extracts "Serious Adverse Events" (SAE) data from randomized trials in ClinicalTrials.gov. Through a statistical algorithm, a PolyAnalyst workflow ranks the drugs where the treatment arm has fewer predefined SAEs than the control arm, indicating that potentially the drug is reducing the level of SAE. Hypotheses could then be generated for the new use of these drugs based on the predefined SAE that is indicative of disease (for example, cancer).

  3. Information Mining Technologies to Enable Discovery of Actionable Intelligence to Facilitate Maritime Situational Awareness: I-MINE

    DTIC Science & Technology

    2013-01-01

    website). Data mining tools are in-house code developed in Python, C++ and Java. • NGA The National Geospatial-Intelligence Agency (NGA) performs data...as PostgreSQL (with PostGIS), MySQL, Microsoft SQL Server, SQLite, etc. using the appropriate JDBC driver. 14 The documentation and ease to learn are...written in Java that is able to perform various types of regressions, classifications, and other data mining tasks. There is also a commercial version

  4. Metal mining and the environment

    USGS Publications Warehouse

    Hudson, Travis L.; Fox, Frederick D.; Plumlee, Geoffrey S.

    1999-01-01

    The booklet, Metal Mining and the Environment, and the colorful companion poster offer new tools for raising awareness and understanding of the impact and issues surrounding metal mining and the environment. The 64-page full-color booklet contains a copy of the poster which includes a student activity on the back. This booklet and poster can help you: illustrate the importance of our natural and environmental resources; provide a geoscience perspective on metal mining and the environment; improve Earth science literacy; and increase student understandings of Earth resources and systems.

  5. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining.

    PubMed

    Hero, Alfred O; Rajaratnam, Bala

    2016-01-01

    When can reliable inference be drawn in the "Big Data" context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for "Big Data". Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exascale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
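
    As a reading aid, the three sampling regimes named in this abstract can be written in terms of the sample size n and the variable dimension p. This is only a sketch and not the authors' notation: the fixed values n_0 and p_0 and the limiting ratio c are symbols introduced here, and "comparable rates" is interpreted, as an assumption, to mean that p/n converges to a finite positive constant.

```latex
% Requires the amsmath package. n = number of samples, p = number of variables.
% n_0, p_0 and c are illustrative symbols, not taken from the paper.
\begin{align*}
\text{(1) classical:} \quad & p = p_0 \ \text{fixed}, \quad n \to \infty \\
\text{(2) mixed:} \quad & n \to \infty, \ p \to \infty, \quad p/n \to c \in (0, \infty) \\
\text{(3) purely high dimensional:} \quad & n = n_0 \ \text{fixed}, \quad p \to \infty
\end{align*}
```

    Under this reading, the variable-rich, sample-starved setting described above (n far smaller than p) corresponds to regime (3).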

  6. U-Compare: share and compare text mining tools with UIMA.

    PubMed

    Kano, Yoshinobu; Baumgartner, William A; McCrohon, Luke; Ananiadou, Sophia; Cohen, K Bretonnel; Hunter, Lawrence; Tsujii, Jun'ichi

    2009-08-01

    Due to the increasing number of text mining resources (tools and corpora) available to biologists, interoperability issues between these resources are becoming significant obstacles to using them effectively. UIMA, the Unstructured Information Management Architecture, is an open framework designed to aid in the construction of more interoperable tools. U-Compare is built on top of the UIMA framework, and provides both a concrete framework for out-of-the-box text mining and a sophisticated evaluation platform allowing users to run specific tools on any target text, generating both detailed statistics and instance-based visualizations of outputs. U-Compare is a joint project, providing the world's largest, and still growing, collection of UIMA-compatible resources. These resources, originally developed by different groups for a variety of domains, include many famous tools and corpora. U-Compare can be launched straight from the web, without needing to be manually installed. All U-Compare components are provided ready-to-use and can be combined easily via a drag-and-drop interface without any programming. External UIMA components can also simply be mixed with U-Compare components, without distinguishing between locally and remotely deployed resources. http://u-compare.org/

  7. Disk Rock Cutting Tool for the Implementation of Resource-Saving Technologies of Mining of Solid Minerals

    NASA Astrophysics Data System (ADS)

    Manietyev, Leonid; Khoreshok, Aleksey; Tsekhin, Alexander; Borisov, Andrey

    2017-11-01

    The directions of a resource and energy saving when creating a boom-type effectors of roadheaders of selective action with disc rock cutting tools on a multi-faceted prisms for the destruction of formation of minerals and rocks pricemax are presented. Justified reversing the modes of the crowns and booms to improve the efficiency of mining works. Parameters of destruction of coal and rock faces by the disk tool of a biconical design with the unified fastening knots to many-sided prisms on effectors of extraction mining machines are determined. Parameters of tension of the interfaced elements of knots of fastening of the disk tool at static interaction with the destroyed face of rocks are set. The technical solutions containing the constructive and kinematic communications realizing counter and reverse mode of rotation of two radial crowns with the disk tool on trihedral prisms and cases of booms with the disk tool on tetrahedral prisms in internal space between two axial crowns with the cutter are proposed. Reserves of expansion of the front of loading outside a table of a feeder of the roadheader of selective action, including side zones in which loading corridors by blades of trihedral prisms in internal space between two radial crowns are created are revealed.

  8. Accessing Cloud Properties and Satellite Imagery: A tool for visualization and data mining

    NASA Astrophysics Data System (ADS)

    Chee, T.; Nguyen, L.; Minnis, P.; Spangenberg, D.; Palikonda, R.

    2016-12-01

    Providing public access to imagery of cloud macro and microphysical properties and the underlying satellite imagery is a key concern for the NASA Langley Research Center Cloud and Radiation Group. This work describes a tool and system that allows end users to easily browse cloud information and satellite imagery that is otherwise difficult to acquire and manipulate. The tool has two uses, one to visualize the data and the other to access the data directly. It uses a widely used access protocol, the Open Geospatial Consortium's Web Map and Processing Services, to encourage users to access the data we produce. Internally, we leverage our practical experience with large, scalable application practices to develop a system that has the largest potential for scalability as well as the ability to be deployed on the cloud. One goal of the tool is to provide a demonstration of the back end capability to end users so that they can use the dynamically generated imagery and data as an input to their own work flows or to set up data mining constraints. We build upon NASA Langley Cloud and Radiation Group's experience with making real-time and historical satellite cloud product information and satellite imagery accessible and easily searchable. Increasingly, information is used in a "mash-up" form where multiple sources of information are combined to add value to disparate but related information. In support of NASA strategic goals, our group aims to make as much cutting edge scientific knowledge, observations and products available to the citizen science, research and interested communities for these kinds of "mash-ups" as well as provide a means for automated systems to data mine our information. This tool and access method provides a valuable research tool to a wide audience both as a standalone research tool and also as an easily accessed data source that can easily be mined or used with existing tools.

  9. Knowledge modeling of coal mining equipments based on ontology

    NASA Astrophysics Data System (ADS)

    Zhang, Baolong; Wang, Xiangqian; Li, Huizong; Jiang, Miaomiao

    2017-06-01

    The problems of information redundancy and sharing are universal in coal mining equipment management. In order to improve the efficiency with which knowledge of coal mining equipment is used, this paper proposes a new method of knowledge modeling based on ontology. On the basis of analyzing the structures and internal relations of coal mining equipment knowledge, and taking OWL as the ontology construction language, the ontology model of coal mining equipment knowledge is built with the help of the Protégé 4.3 software tools. The knowledge description method will lay the foundation for highly effective knowledge management and sharing, which is very significant for improving the production management level of coal mining enterprises.

  10. TREC 2010 legal track: method and results of the ELK collaboration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Spearing, Shelly; Roman, Jorge; Mc Kay, Bain

    The ELK team ([E]WA-IIT, [L]os Alamos National Laboratory (LANL), and [K]ayvium Corporation (ELK)) used the legal Track task 302 as an opportunity to compare and integrate advanced semantic-automation strategies. The team members believe that enabling parties to discover, consume, analyze, and make decisions in a noisy and information-overloaded environment requires new tools. Together, as well as independently, they are actively developing these tools and view the TREC exercise as an opportunity to test, compare, and complement tools and approaches. Our collaboration is new to TREC, brought together by a shared interest in document relevance, concept-in-context identification and annotation, and the recognition that words out-of-context do not a match make. The team's intent was to lay the foundation for automating the mining and analysis of large volumes of electronic information by litigants and their lawyers, not only in the context of document discovery, but also to support litigation strategy, motion practice, deposition, trial tactics, etc. The premise was that a Subject Matter Expert- (SME-) built model can be automatically mapped onto various search engines for document retrieval, organization, relevance scoring, analysis and decision support. In the end, we ran nearly a dozen models, mostly, but not exclusively, with Kayvium Corporation's knowledge automation technology. The Sal Database Search Engine we used had a bug in its proximity feature, requiring that we develop a workaround. While the work-around was successful, it left us with insufficient time to converge the models to achieve expected quality. However, with optimized proximity processing in place, we would be able to run the model many more times, and believe repeatable quality would be a matter of working through a few requests to get the approach right. We believe that with more time, the results we would achieve might point towards a new way of processing documents for litigation support, so litigators can be confident that results are complete but not overly inclusive.

  11. Leaf-Mining and Gall-Forming Insects: Tools for Teaching Population Ecology.

    ERIC Educational Resources Information Center

    Brown, Valerie K.

    1984-01-01

    Discusses the use of leaf mines (formed by larvae of small moths or flies) and galls (wasps' larvae) in various insect population studies. Also considers the advantages of using these structures for instructional purposes. (DH)

  12. Alkemio: association of chemicals with biomedical topics by text and data mining

    PubMed Central

    Gijón-Correas, José A.; Andrade-Navarro, Miguel A.; Fontaine, Jean F.

    2014-01-01

    The PubMed® database of biomedical citations allows the retrieval of scientific articles studying the function of chemicals in biology and medicine. Mining millions of available citations to search reported associations between chemicals and topics of interest would require substantial human time. We have implemented the Alkemio text mining web tool and SOAP web service to help in this task. The tool uses biomedical articles discussing chemicals (including drugs), predicts their relatedness to the query topic with a naïve Bayesian classifier and ranks all chemicals by P-values computed from random simulations. Benchmarks on seven human pathways showed good retrieval performance (areas under the receiver operating characteristic curves ranged from 73.6 to 94.5%). Comparison with existing tools to retrieve chemicals associated to eight diseases showed the higher precision and recall of Alkemio when considering the top 10 candidate chemicals. Alkemio is a high performing web tool ranking chemicals for any biomedical topic and it is free to non-commercial users. Availability: http://cbdm.mdc-berlin.de/~medlineranker/cms/alkemio. PMID:24838570

  13. Towards the Geospatial Web: Media Platforms for Managing Geotagged Knowledge Repositories

    NASA Astrophysics Data System (ADS)

    Scharl, Arno

    International media have recognized the visual appeal of geo-browsers such as NASA World Wind and Google Earth, for example, when Web and television coverage on Hurricane Katrina used interactive geospatial projections to illustrate its path and the scale of destruction in August 2005. Yet these early applications only hint at the true potential of geospatial technology to build and maintain virtual communities and to revolutionize the production, distribution and consumption of media products. This chapter investigates this potential by reviewing the literature and discussing the integration of geospatial and semantic reference systems, with an emphasis on extracting geospatial context from unstructured text. A content analysis of news coverage based on a suite of text mining tools (webLyzard) sheds light on the popularity and adoption of geospatial platforms.

  14. Assessing Lost Ecosystem Service Benefits Due to Mining-Induced Stream Degradation in the Appalachian Region: Economic Approaches to Valuing Recreational Fishing Impacts

    EPA Science Inventory

    Sport fishing is a popular activity for Appalachian residents and visitors. The region’s coldwater streams support a strong regional outdoor tourism industry. We examined the influence of surface coal mining, in the context of other stressors, on freshwater sport fishing in...

  15. Interstitial Lung Diseases in the U.S. Mining Industry: Using MSHA Data to Examine Trends and the Prevention Effects of Compliance with Health Regulations, 1996-2015.

    PubMed

    Yorio, Patrick L; Laney, A Scott; Halldin, Cara N; Blackley, David J; Moore, Susan M; Wizner, Kerri; Radonovich, Lewis J; Greenawald, Lee A

    2018-04-12

    Given the recent increase in dust-induced lung disease among U.S. coal miners and the respiratory hazards encountered across the U.S. mining industry, it is important to enhance an understanding of lung disease trends and the organizational contexts that precede these events. In addition to exploring overall trends reported to the Mine Safety and Health Administration (MSHA), the current study uses MSHA's enforcement database to examine whether or not compliance with health regulations resulted in fewer mine-level counts of these diseases over time. The findings suggest that interstitial lung diseases were more prevalent in coal mines compared to other mining commodities, in Appalachian coal mines compared to the rest of the United States, and in underground compared to surface coal mines. Mines that followed a relevant subset of MSHA's health regulations were less likely to report a lung disease over time. The findings are discussed from a lung disease prevention strategy perspective. © 2018 Society for Risk Analysis.

  16. The Spatial Assessment of the Current Seismic Hazard State for Hard Rock Underground Mines

    NASA Astrophysics Data System (ADS)

    Wesseloo, Johan

    2018-06-01

    Mining-induced seismic hazard assessment is an important component in the management of safety and financial risk in mines. As the seismic hazard is a response to the mining activity, it is non-stationary and variable both in space and time. This paper presents an approach for implementing a probabilistic seismic hazard assessment to assess the current hazard state of a mine. Each of the components of the probabilistic seismic hazard assessment is considered within the context of hard rock underground mines. The focus of this paper is the assessment of the in-mine hazard distribution and does not consider the hazard to nearby public or structures. A rating system and methodologies to present hazard maps, for the purpose of communicating to different stakeholders in the mine, i.e. mine managers, technical personnel and the work force, are developed. The approach allows one to update the assessment with relative ease and within short time periods as new data become available, enabling the monitoring of the spatial and temporal change in the seismic hazard.

  17. Human factors model concerning the man-machine interface of mining crewstations

    NASA Technical Reports Server (NTRS)

    Rider, James P.; Unger, Richard L.

    1989-01-01

    The U.S. Bureau of Mines is developing a computer model to analyze the human factors aspect of mining machine operator compartments. The model will be used as a research tool and as a design aid. It will have the capability to perform the following: simulated anthropometric or reach assessment, visibility analysis, illumination analysis, structural analysis of the protective canopy, operator fatigue analysis, and computation of an ingress-egress rating. The model will make extensive use of graphics to simplify data input and output. Two dimensional orthographic projections of the machine and its operator compartment are digitized and the data rebuilt into a three dimensional representation of the mining machine. Anthropometric data from either an individual or any size population may be used. The model is intended for use by equipment manufacturers and mining companies during initial design work on new machines. In addition to its use in machine design, the model should prove helpful as an accident investigation tool and for determining the effects of machine modifications made in the field on the critical areas of visibility and control reach ability.

  18. Analysis of the current rib support practices and techniques in U.S. coal mines

    PubMed Central

    Mohamed, Khaled M.; Murphy, Michael M.; Lawson, Heather E.; Klemetti, Ted

    2016-01-01

    Design of rib support systems in U.S. coal mines is based primarily on local practices and experience. A better understanding of current rib support practices in U.S. coal mines is crucial for developing a sound engineering rib support design tool. The objective of this paper is to analyze the current practices of rib control in U.S. coal mines. Twenty underground coal mines were studied representing various coal basins, coal seams, geology, loading conditions, and rib control strategies. The key findings are: (1) any rib design guideline or tool should take into account external rib support as well as internal bolting; (2) rib bolts on their own cannot contain rib spall, especially in soft ribs subjected to significant load—external rib control devices such as mesh are required in such cases to contain rib sloughing; (3) the majority of the studied mines follow the overburden depth and entry height thresholds recommended by the Program Information Bulletin 11-29 issued by the Mine Safety and Health Administration; (4) potential rib instability occurred when certain geological features prevailed—these include draw slate and/or bone coal near the rib/roof line, claystone partings, and soft coal bench overlain by rock strata; (5) 47% of the studied rib spall was classified as blocky—this could indicate a high potential of rib hazards; and (6) rib injury rates of the studied mines for the last three years emphasize the need for more rib control management for mines operating at overburden depths between 152.4 m and 304.8 m. PMID:27648341

  19. Data Mining and Homeland Security: An Overview

    DTIC Science & Technology

    2006-01-27

    which government agencies should use and mix commercial data with government data, whether data sources are being used for purposes other than those...example, a hardware store may compare their customers’ tool purchases with home ownership, type of CRS-2 3 John Makulowich, “ Government Data Mining...cleaning, data integration, data selection, data transformation , (data mining), pattern evaluation, and knowledge presentation.4 A number of advances in

  20. Biomedical text mining and its applications in cancer research.

    PubMed

    Zhu, Fei; Patumcharoenpol, Preecha; Zhang, Cheng; Yang, Yang; Chan, Jonathan; Meechai, Asawin; Vongsangnak, Wanwipa; Shen, Bairong

    2013-04-01

    Cancer is a malignant disease that has caused millions of human deaths. Its study has a long history of well over 100 years. There have been an enormous number of publications on cancer research. This integrated but unstructured biomedical text is of great value for cancer diagnostics, treatment, and prevention. The immense body and rapid growth of biomedical text on cancer has led to the appearance of a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining on cancer research is computationally automatic and high-throughput in nature. However, it is error-prone due to the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining and examine some frequently used algorithms, tools, and data sets, as well as assessing how much these algorithms have been utilized. We then discuss the current state-of-the-art text mining applications in cancer research and we also provide some resources for cancer text mining. With the development of systems biology, researchers tend to understand complex biomedical systems from a systems biology viewpoint. Thus, the full utilization of text mining to facilitate cancer systems biology research is fast becoming a major concern. To address this issue, we describe the general workflow of text mining in cancer systems biology and each phase of the workflow. We hope that this review can (i) provide a useful overview of the current work of this field; (ii) help researchers to choose text mining tools and datasets; and (iii) highlight how to apply text mining to assist cancer systems biology research. Copyright © 2012 Elsevier Inc. All rights reserved.

  1. An Overview of GIS-Based Modeling and Assessment of Mining-Induced Hazards: Soil, Water, and Forest

    PubMed Central

    Kim, Sung-Min; Yi, Huiuk; Choi, Yosoon

    2017-01-01

    In this study, current geographic information system (GIS)-based methods and their application for the modeling and assessment of mining-induced hazards were reviewed. Various types of mining-induced hazard, including soil contamination, soil erosion, water pollution, and deforestation were considered in the discussion of the strength and role of GIS as a viable problem-solving tool in relation to mining-induced hazards. The various types of mining-induced hazard were classified into two or three subtopics according to the steps involved in the reclamation procedure, or elements of the hazard of interest. Because GIS is appropriate for the handling of geospatial data in relation to mining-induced hazards, the application and feasibility of exploiting GIS-based modeling and assessment of mining-induced hazards within the mining industry could be expanded further. PMID:29186922

  2. Web mining in soft computing framework: relevance, state of the art and future directions.

    PubMed

    Pal, S K; Talwar, V; Mitra, P

    2002-01-01

    The paper summarizes the different characteristics of Web data, the basic components of Web mining and its different types, and the current state of the art. The reason for considering Web mining a separate field from data mining is explained. The limitations of some of the existing Web mining methods and tools are enunciated, and the significance of soft computing (comprising fuzzy logic (FL), artificial neural networks (ANNs), genetic algorithms (GAs), and rough sets (RSs)) is highlighted. A survey of the existing literature on "soft Web mining" is provided along with the commercially available systems. The prospective areas of Web mining where the application of soft computing needs immediate attention are outlined with justification. Scope for future research in developing "soft Web mining" systems is explained. An extensive bibliography is also provided.

  3. An Overview of GIS-Based Modeling and Assessment of Mining-Induced Hazards: Soil, Water, and Forest.

    PubMed

    Suh, Jangwon; Kim, Sung-Min; Yi, Huiuk; Choi, Yosoon

    2017-11-27

    In this study, current geographic information system (GIS)-based methods and their application for the modeling and assessment of mining-induced hazards were reviewed. Various types of mining-induced hazard, including soil contamination, soil erosion, water pollution, and deforestation were considered in the discussion of the strength and role of GIS as a viable problem-solving tool in relation to mining-induced hazards. The various types of mining-induced hazard were classified into two or three subtopics according to the steps involved in the reclamation procedure, or elements of the hazard of interest. Because GIS is appropriate for the handling of geospatial data in relation to mining-induced hazards, the application and feasibility of exploiting GIS-based modeling and assessment of mining-induced hazards within the mining industry could be expanded further.

  4. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    PubMed

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  5. Text Mining Genotype-Phenotype Relationships from Biomedical Literature for Database Curation and Precision Medicine.

    PubMed

    Singhal, Ayush; Simmons, Michael; Lu, Zhiyong

    2016-11-01

    The practice of precision medicine will ultimately require databases of genes and mutations for healthcare providers to reference in order to understand the clinical implications of each patient's genetic makeup. Although the highest quality databases require manual curation, text mining tools can facilitate the curation process, increasing accuracy, coverage, and productivity. However, to date there are no available text mining tools that offer high-accuracy performance for extracting such triplets from biomedical literature. In this paper we propose a high-performance machine learning approach to automate the extraction of disease-gene-variant triplets from biomedical literature. Our approach is unique because we identify the genes and protein products associated with each mutation from not just the local text content, but from a global context as well (from the Internet and from all literature in PubMed). Our approach also incorporates protein sequence validation and disease association using a novel text-mining-based machine learning approach. We extract disease-gene-variant triplets from all abstracts in PubMed related to a set of ten important diseases (breast cancer, prostate cancer, pancreatic cancer, lung cancer, acute myeloid leukemia, Alzheimer's disease, hemochromatosis, age-related macular degeneration (AMD), diabetes mellitus, and cystic fibrosis). We then evaluate our approach in two ways: (1) a direct comparison with the state of the art using benchmark datasets; (2) a validation study comparing the results of our approach with entries in a popular human-curated database (UniProt) for each of the previously mentioned diseases. In the benchmark comparison, our full approach achieves a 28% improvement in F1-measure (from 0.62 to 0.79) over the state-of-the-art results. For the validation study with UniProt Knowledgebase (KB), we present a thorough analysis of the results and errors. Across all diseases, our approach returned 272 triplets (disease-gene-variant) that overlapped with entries in UniProt and 5,384 triplets without overlap in UniProt. Analysis of the overlapping triplets and of a stratified sample of the non-overlapping triplets revealed accuracies of 93% and 80% for the respective categories (cumulative accuracy, 77%). We conclude that our process represents an important and broadly applicable improvement to the state of the art for curation of disease-gene-variant relationships.

  6. Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II.

    PubMed

    Lu, Zhiyong; Hirschman, Lynette

    2012-01-01

    Manual curation of data from the biomedical literature is a rate-limiting factor for many expert curated databases. Despite the continuing advances in biomedical text mining and the pressing needs of biocurators for better tools, few existing text-mining tools have been successfully integrated into production literature curation systems such as those used by the expert curated databases. To close this gap and better understand all aspects of literature curation, we invited submissions of written descriptions of curation workflows from expert curated databases for the BioCreative 2012 Workshop Track II. We received seven qualified contributions, primarily from model organism databases. Based on these descriptions, we identified commonalities and differences across the workflows, the common ontologies and controlled vocabularies used and the current and desired uses of text mining for biocuration. Compared to a survey done in 2009, our 2012 results show that many more databases are now using text mining in parts of their curation workflows. In addition, the workshop participants identified text-mining aids for finding gene names and symbols (gene indexing), prioritization of documents for curation (document triage) and ontology concept assignment as those most desired by the biocurators. DATABASE URL: http://www.biocreative.org/tasks/bc-workshop-2012/workflow/.

  7. Understanding social collaboration between actors and technology in an automated and digitised deep mining environment.

    PubMed

    Sanda, M-A; Johansson, J; Johansson, B; Abrahamsson, L

    2011-10-01

    The purpose of this article is to develop knowledge and learning on the best way to automate organisational activities in deep mines that could lead to the creation of harmony between the human, technical and the social system, towards increased productivity. The findings showed that though the introduction of high-level technological tools in the work environment disrupted the social relations developed over time amongst the employees in most situations, the technological tools themselves became substitute social collaborative partners to the employees. It is concluded that, in developing a digitised mining production system, knowledge of the social collaboration between the humans (miners) and the technology they use for their work must be developed. By implication, knowledge of the human's subject-oriented and object-oriented activities should be considered as an important integral resource for developing a better technological, organisational and human interactive subsystem when designing the intelligent automation and digitisation systems for deep mines. STATEMENT OF RELEVANCE: This study focused on understanding the social collaboration between humans and the technologies they use to work in underground mines. The learning provides an added knowledge in designing technologies and work organisations that could better enhance the human-technology interactive and collaborative system in the automation and digitisation of underground mines.

  8. Realizing modeling and mapping tools to study the upsurge of noise pollution as a result of open-cast mining and transportation activities.

    PubMed

    Lokhande, Satish K; Jain, Mohindra C; Dhawale, Satyajeet A; Gautam, Rakesh; Bodhe, Ghanshyam L

    2018-01-01

    In open-cast mines, noise pollution has become a serious concern due to the extensive use of heavy earth moving machinery (HEMM). This study measures and assesses the effects of the existing noise levels of major operational mines in the Keonjhar, Sundergadh, and Mayurbhanj districts of Odisha, India. Transportation noise levels were also considered and were predicted using the modified Federal Highway Administration (FHWA) model. It was observed that noise induced by HEMM such as rock breakers, jackhammers, dumpers, and excavators, blasting noise in the mining terrain, and associated transportation noise became a major source of annoyance to the inhabitants living in proximity to the mines. The noise produced by mechanized mining operations was observed between 74.3 and 115.2 dB(A), and its impact on residential areas was observed between 49.4 and 58.9 dB(A). In addition, noise contour maps of sound level dispersion were produced with advanced noise prediction software tools for better understanding. Finally, the predicted values for the residential zone and for traffic noise were correlated with observed values, and the coefficient of determination, R², was calculated to be 0.6891 and 0.5967, respectively.
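
    The coefficient of determination mentioned in this record can be illustrated with a short sketch. It assumes the common definition R² = 1 - SS_res/SS_tot between observed and predicted levels; the abstract does not say which definition the authors used (it may have been the squared correlation coefficient), and the sample values below are hypothetical, not measurements from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared gaps between observation and prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical noise levels in dB(A); NOT data from the study.
observed_db = [50.0, 52.0, 55.0, 58.0]
predicted_db = [51.0, 52.0, 54.0, 59.0]
print(round(r_squared(observed_db, predicted_db), 4))
```

    Under this definition a value near 1 means the predictions track the observations closely, while values around 0.6 to 0.7, as reported in the record, indicate a moderate fit.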

  9. Efficient management of marine resources in conflict: an empirical study of marine sand mining, Korea.

    PubMed

    Kim, Tae-Goun

    2009-10-01

    This article develops a dynamic model of efficient use of exhaustible marine sand resources in the context of marine mining externalities. The classical Hotelling extraction model is applied to sand mining in Ongjin, Korea and extended to include the estimated marginal external costs that mining imposes on marine fisheries. The socially efficient sand extraction plan is compared with the extraction paths suggested by scientific research. If marginal environmental costs are correctly estimated, the developed efficient extraction plan considering the resource rent may increase the social welfare and reduce the conflicts among the marine sand resource users. The empirical results are interpreted with an emphasis on guidelines for coastal resource management policy.

  10. Text Mining in Biomedical Domain with Emphasis on Document Clustering

    PubMed Central

    2017-01-01

    Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048

  11. Ecological effects of lead mining on Ozark streams: In-situ toxicity to woodland crayfish (Orconectes hylas)

    USGS Publications Warehouse

    Allert, A.L.; Fairchild, J.F.; DiStefano, R.J.; Schmitt, C.J.; Brumbaugh, W.G.; Besser, J.M.

    2009-01-01

    The Viburnum Trend mining district in southeast Missouri, USA is one of the largest producers of lead-zinc ore in the world. Previous stream surveys found evidence of increased metal exposure and reduced population densities of crayfish immediately downstream of mining sites. We conducted an in-situ 28-d exposure to assess toxicity of mining-derived metals to the woodland crayfish (Orconectes hylas). Crayfish survival and biomass were significantly lower at mining sites than at reference and downstream sites. Metal concentrations in water, detritus, macroinvertebrates, fish, and crayfish were significantly higher at mining sites, and were negatively correlated with caged crayfish survival. These results support previous field and laboratory studies that showed mining-derived metals negatively affect O. hylas populations in streams draining the Viburnum Trend, and that in-situ toxicity testing was a valuable tool for assessing the impacts of mining on crayfish populations.

  12. Poly(vinyl alcohol)/hydroxyapatite Monolithic In-Needle Extraction (MINE) device: Preparation and examination of drug affinity.

    PubMed

    Pietrzyńska, Monika; Czerwiński, Michał; Voelkel, Adam

    2017-07-15

    Polymer-ceramic materials based on poly(vinyl alcohol) (PVA) and hydroxyapatite were applied as sorption material in a Monolithic In-Needle Extraction (MINE) device. The presented device provides new possibilities for examining the affinity of bisphosphonates for bone and will be a helpful tool in evaluating the suitability of potential antiresorptive drugs. The ceramic part of the monoliths was prepared by incorporating hydroxyapatite (HA) into the reaction mixture or by a soaking method (mineralization of HA on the PVA). The synthesis conditions were optimized to obtain a monolithic material whose dimensions after the soaking process allowed it to be placed inside the needle. Furthermore, the material had to have optimal dimensions after the re-soaking process to fit the needle perfectly. Of the sixteen monolithic materials, eight were selected for further study, and four of these were then selected as sorbent material for the MINE device. The material properties were examined on the basis of several parameters: swelling ratio, initial mass reversion and initial diameter reversion, mass growth due to HA formation, and antiresorptive drug sorption. The MINE device might then be used as a tool for examining interactions between bisphosphonates and bone. Simulated body fluid containing sodium risedronate (RSD) as a standard compound was passed through the MINE device. The device sorbed about 0.38 mg of RSD. The desorption process was carried out in five steps, allowing detailed analysis. The MINE device proved to be a helpful tool for determining the affinity of bisphosphonates for the ceramic part of the sorbent (hydroxyapatite) and for assessing their usefulness as antiresorptive drugs in the future. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. VisualUrText: A Text Analytics Tool for Unstructured Textual Data

    NASA Astrophysics Data System (ADS)

    Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.

    2018-05-01

    The amount of unstructured text on the Internet is growing tremendously. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may contain interesting patterns and trends. Text mining is a well-known technique for discovering interesting patterns and trends, that is, non-trivial knowledge, in massive unstructured text data. Text mining spans multidisciplinary fields including information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that extracts, processes and analyzes unstructured text data and visualizes the cleaned text data in multiple forms such as a Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.
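
    A Document Term Matrix of the kind this record mentions can be sketched in a few lines. This is an illustrative assumption about what such a matrix looks like, not VisualUrText's implementation: the function names, the lowercase letters-only tokenizer, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # Sorted vocabulary gives every document row the same column order.
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[term] for term in vocabulary] for doc_counts in counts]
    return vocabulary, matrix


# Hypothetical sample documents.
docs = ["Text mining finds patterns", "Mining text, mining data"]
vocabulary, matrix = document_term_matrix(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

    Column sums of the matrix give the overall term frequencies that a frequency graph or word cloud would display; a real tool would typically also remove stop words and apply stemming before counting.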

  14. U-Compare: share and compare text mining tools with UIMA

    PubMed Central

    Kano, Yoshinobu; Baumgartner, William A.; McCrohon, Luke; Ananiadou, Sophia; Cohen, K. Bretonnel; Hunter, Lawrence; Tsujii, Jun'ichi

    2009-01-01

    Summary: Due to the increasing number of text mining resources (tools and corpora) available to biologists, interoperability issues between these resources are becoming significant obstacles to using them effectively. UIMA, the Unstructured Information Management Architecture, is an open framework designed to aid in the construction of more interoperable tools. U-Compare is built on top of the UIMA framework, and provides both a concrete framework for out-of-the-box text mining and a sophisticated evaluation platform allowing users to run specific tools on any target text, generating both detailed statistics and instance-based visualizations of outputs. U-Compare is a joint project, providing the world's largest, and still growing, collection of UIMA-compatible resources. These resources, originally developed by different groups for a variety of domains, include many famous tools and corpora. U-Compare can be launched straight from the web, without needing to be manually installed. All U-Compare components are provided ready-to-use and can be combined easily via a drag-and-drop interface without any programming. External UIMA components can also simply be mixed with U-Compare components, without distinguishing between locally and remotely deployed resources. Availability: http://u-compare.org/ Contact: kano@is.s.u-tokyo.ac.jp PMID:19414535

  15. Automatic target validation based on neuroscientific literature mining for tractography

    PubMed Central

    Vasques, Xavier; Richardet, Renaud; Hill, Sean L.; Slater, David; Chappelier, Jean-Cedric; Pralong, Etienne; Bloch, Jocelyne; Draganski, Bogdan; Cif, Laura

    2015-01-01

    Target identification for tractography studies requires solid anatomical knowledge validated by an extensive literature review across species for each seed structure to be studied. Manual literature review to identify targets for a given seed region is tedious and potentially subjective. Therefore, complementary approaches would be useful. We propose to use text-mining models to automatically suggest potential targets from the neuroscientific literature, full-text articles and abstracts, so that they can be used for anatomical connection studies and more specifically for tractography. We applied text-mining models to three structures: two well-studied structures that are validated deep brain stimulation targets, the internal globus pallidus and the subthalamic nucleus, and the nucleus accumbens, an exploratory target for treating psychiatric disorders. We performed a systematic review of the literature to document the projections of the three selected structures and compared it with the targets proposed by text-mining models, both in rat and primate (including human). We ran probabilistic tractography on the nucleus accumbens and compared the output with the results of the text-mining models and literature review. Overall, text-mining the literature could find three times as many targets as two man-weeks of curation could. The overall efficiency of the text-mining against literature review in our study was 98% recall (at 36% precision), meaning that over all the targets for the three selected seeds, only one target was missed by text-mining. We demonstrate that connectivity for a structure of interest can be extracted from a very large number of publications and abstracts. We believe this tool will be useful in helping the neuroscience community to facilitate connectivity studies of particular brain regions. The text mining tools used for the study are part of the HBP Neuroinformatics Platform, publicly available at http://connectivity-brainer.rhcloud.com/. PMID:26074781
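
    The recall and precision figures quoted in this record follow the usual set-based definitions, which the sketch below illustrates. It assumes the literature review is treated as the reference set and the text-mining output as the suggested set; the function name and the placeholder region labels are hypothetical and do not come from the study.

```python
def precision_recall(suggested, reference):
    """Precision and recall of suggested targets against a reference set."""
    suggested, reference = set(suggested), set(reference)
    true_positives = len(suggested & reference)
    # Precision: share of suggestions that the reference confirms.
    precision = true_positives / len(suggested) if suggested else 0.0
    # Recall: share of reference targets that were suggested.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder labels, not real anatomical findings.
reviewed_targets = {"region_A", "region_B", "region_C", "region_D"}
mined_targets = {"region_A", "region_B", "region_C", "region_X", "region_Y", "region_Z"}

precision, recall = precision_recall(mined_targets, reviewed_targets)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    High recall with low precision, as in the record, means the mined list contains nearly every reviewed target but also many candidates that a curator still has to check.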

  16. 30 CFR 57.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Hand-held electric tools. 57.12033 Section 57.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Surface and Underground § 57.12033 Hand-held electric tools. Hand-held electric tools shall not be...

  17. 30 CFR 57.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Hand-held electric tools. 57.12033 Section 57.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Surface and Underground § 57.12033 Hand-held electric tools. Hand-held electric tools shall not be...

  18. 30 CFR 57.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Hand-held electric tools. 57.12033 Section 57.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Surface and Underground § 57.12033 Hand-held electric tools. Hand-held electric tools shall not be...

  19. 30 CFR 57.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Hand-held electric tools. 57.12033 Section 57.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Surface and Underground § 57.12033 Hand-held electric tools. Hand-held electric tools shall not be...

  20. 30 CFR 57.12033 - Hand-held electric tools.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Hand-held electric tools. 57.12033 Section 57.12033 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Surface and Underground § 57.12033 Hand-held electric tools. Hand-held electric tools shall not be...

  1. Assimilating Text-Mining & Bio-Informatics Tools to Analyze Cellulase structures

    NASA Astrophysics Data System (ADS)

    Satyasree, K. P. N. V., Dr; Lalitha Kumari, B., Dr; Jyotsna Devi, K. S. N. V.; Choudri, S. M. Roy; Pratap Joshi, K.

    2017-08-01

    Text mining is one of the most promising ways of automatically extracting information from the vast biological literature. To exploit its potential, the knowledge encoded in the text should be converted to a semantic representation, such as entities and relations, that can be analyzed by machines. Large-scale practical systems for this purpose are rare, but text mining can be helpful for generating or validating predictions. Cellulases, the enzymes that degrade cellulose, have abundant applications in various industries. The cellulase-producing bacterium Bacillus subtilis and fungus Pseudomonas putida were isolated from the top soil of Guntur District, Andhra Pradesh, India. Absolute cultures were conserved on potato dextrose agar medium for molecular studies. In this paper, we present how text mining concepts can be used to analyze cellulase-producing bacteria and fungi; their comparative structures are also studied with the aid of well-established, high-quality standard bioinformatics tools such as Bioedit, Swissport, Protparam and EMBOSSwin, with which complete data on cellulases, such as the structure and constituents of the enzyme, have been obtained.

  2. Using software to predict occupational hearing loss in the mining industry.

    PubMed

    Azman, A S; Li, M; Thompson, J K

    2016-01-01

    Powerful mining systems typically generate high-level noise that can damage the hearing ability of miners. Engineering noise controls are the most desirable and effective control for overexposure to noise. However, the effects of these noise controls on the actual hearing status of workers are not easily measured. A tool that can provide guidance in assigning workers to jobs based on the noise levels to which they will be exposed is highly desirable. Therefore, the Pittsburgh Mining Research Division (PMRD) of the U.S. National Institute for Occupational Safety and Health (NIOSH) developed a tool to estimate in a systematic way the hearing loss due to occupational noise exposure and to evaluate the effectiveness of developed engineering controls. This computer program is based on the ISO 1999 standard and can be used to estimate the loss of hearing ability caused by occupational noise exposures. In this paper, the functionalities of this software are discussed and several case studies related to mining machinery are presented to demonstrate the functionalities of this software.

  3. Online Analytical Processing (OLAP): A Fast and Effective Data Mining Tool for Gene Expression Databases

    PubMed Central

    2005-01-01

    Gene expression databases contain a wealth of information, but current data mining tools are limited in their speed and effectiveness in extracting meaningful biological knowledge from them. Online analytical processing (OLAP) can be used as a supplement to cluster analysis for fast and effective data mining of gene expression databases. We used Analysis Services 2000, a product that ships with SQL Server 2000, to construct an OLAP cube that was used to mine a time series experiment designed to identify genes associated with resistance of soybean to the soybean cyst nematode, a devastating pest of soybean. The data for these experiments are stored in the soybean genomics and microarray database (SGMD). A number of candidate resistance genes and pathways were found. Compared to traditional cluster analysis of gene expression data, OLAP was more effective and faster in finding biologically meaningful information. OLAP is available from a number of vendors and can work with any relational database management system through OLE DB. PMID:16046824

  4. Arrangement for controlled engagement of the tools of a mining machine with a mine face

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Blumenthal, G.; Bollmann, A.

    1981-07-28

    An arrangement for controlled engagement of the tools of a coal planer with a mine face comprises a scraper conveyor, provided on its front face directed toward the mine face with a guide rail guiding the coal planer for reciprocation along the mine face and a mechanism for tilting the conveyor and the coal planer about a substantially horizontal axis. The tilting mechanism is connected to the rear face of the conveyor and extends in its entirety rearwardly of the rear face of the latter. The tilting mechanism comprises a guide linkage pivotally connected at its front end to the rear face of the scraper conveyor while its rear end portion forms a housing for a fluid operated cylinder and piston unit, the piston rod of which is connected to a connecting rod guided by the guide linkage for movement in longitudinal direction and having an upwardly extending front section pivotally connected at its upper free end to the rear face of the scraper conveyor. The fluid operated cylinder-and-piston unit is thus considerably spaced from the scraper conveyor and the material transported thereby and especially coal dust raised during transport of the mined coal by the conveyor, whereby maintenance of the tilting unit is reduced. The guide linkage, the connecting rod and the tilting unit are all in close vicinity to the sole of the mine gallery to leave a considerable free space between the arrangement and the roof of the mine gallery.

  5. Transforming data into usable knowledge: the CIRC experience

    NASA Astrophysics Data System (ADS)

    Mote, P.; Lach, D.; Hartmann, H.; Abatzoglou, J. T.; Stevenson, J.

    2017-12-01

    NOAA's northwest RISA, the Climate Impacts Research Consortium, emphasizes the transformation of data into usable knowledge. This effort involves physical scientists (e.g., Abatzoglou) building web-based tools with climate and hydrologic data and model output, a team performing data mining to link crop loss claims to droughts, social scientists (e.g., Lach, Hartmann) evaluating the effectiveness of such tools at communicating with end users, and two-way engagement with a wide variety of audiences who are interested in using and improving the tools. Unusual in this effort is the seamless integration across timescales past, present, and future; data mining; and the level of effort in evaluating the tools. We provide examples of agriculturally relevant climate variables (e.g., growing degree days, day of first fall freeze) and describe the iterative process of incorporating user feedback.

  6. The mine management professions in the twentieth-century Scottish coal mining industry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perchard, A.

    2007-07-01

    This book seeks to redress the exclusion of colliery managers and other mining professionals from the history of British, and particularly Scottish, coal industries. This is accomplished by examining these groups within the most crucial period of their ascendancy in the Scottish coal mining industry, 1930-1966. This work seeks to place such persons within their context and to examine their roles, statuses and behaviours through their relationships with employees and the execution of their functions, also examining their terms and conditions of employment, the outlook of their professional associations, and that of their union. Through all this, Dr. Perchard illustrates how this growing consciousness amongst managerial employees in the industry was accompanied by an intense public discussion, within the mining professions, over their future shape, principles and occupational standards.

  7. Improve Data Mining and Knowledge Discovery Through the Use of MatLab

    NASA Technical Reports Server (NTRS)

    Shaykhian, Gholam Ali; Martin, Dawn (Elliott); Beil, Robert

    2011-01-01

    Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. There are various algorithms, techniques and methods used to mine data, including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods used to detect patterns in a dataset have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of stored data is discovered. No one technique solves all data mining problems; the challenge is to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; they lack exploratory data mining and discovery of latent information. This presentation introduces MatLab(R) (MATrix LABoratory), an engineering and scientific data analysis tool, as a means to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing and visualization and its wide range of built-in functionality and toolboxes make it suitable for numerical computations and simulations as well as for use as a data mining tool. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.

  8. Improve Data Mining and Knowledge Discovery through the use of MatLab

    NASA Technical Reports Server (NTRS)

    Shaykahian, Gholan Ali; Martin, Dawn Elliott; Beil, Robert

    2011-01-01

    Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. There are various algorithms, techniques and methods used to mine data, including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods used to detect patterns in a dataset have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of stored data is discovered. No one technique solves all data mining problems; the challenge is to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; they lack exploratory data mining and discovery of latent information. This presentation introduces MatLab(TM) (MATrix LABoratory), an engineering and scientific data analysis tool, as a means to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing and visualization and its wide range of built-in functionality and toolboxes make it suitable for numerical computations and simulations as well as for use as a data mining tool. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.

  9. Method to Select Technical Terms for Glossaries in Support of Joint Task Force Operations

    DTIC Science & Technology

    2012-01-01

    have been prohibitively time-consuming. Instead, we identified two publicly available terminology extractor tools: TerMine (NaCTEM, 2011) and Alchemy ...and that from the latter, by high recall. The Alchemy approach contrasts with that used in TerMine in that Alchemy will process the text with...information categories, such as person, location, and organization, in addition to returning topic keywords. Output from both TerMine and Alchemy

  10. LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes

    PubMed Central

    Cañada, Andres; Rabal, Obdulia; Oyarzabal, Julen; Valencia, Alfonso

    2017-01-01

    A considerable effort has been devoted to systematically retrieving information on genes and proteins as well as the relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologies are currently accessible. Here we present LimTox (Literature Mining for Toxicology), a web-based online biomedical search tool with special focus on adverse hepatobiliary reactions. It integrates a range of text mining, named entity recognition and information extraction components. LimTox relies on machine-learning, rule-based, pattern-based and term lookup strategies. This system processes scientific abstracts, a set of full text articles and medical agency assessment reports. Although the main focus of LimTox is on adverse liver events, it also enables basic searches for other organ level toxicity associations (nephrotoxicity, cardiotoxicity, thyrotoxicity and phospholipidosis). This tool supports specialized search queries for: chemical compounds/drugs, genes (with additional emphasis on key enzymes in drug metabolism, namely P450 cytochromes—CYPs) and biochemical liver markers. The LimTox website is free and open to all users and there is no login requirement. LimTox can be accessed at: http://limtox.bioinfo.cnio.es PMID:28531339

  11. An Algorithm of Association Rule Mining for Microbial Energy Prospection

    PubMed Central

    Shaheen, Muhammad; Shahbaz, Muhammad

    2017-01-01

    The presence of hydrocarbons beneath earth’s surface produces some microbiological anomalies in soils and sediments. The detection of such microbial populations involves purely biochemical processes which are specialized, expensive and time consuming. This paper proposes a new algorithm for context based association rule mining on non-spatial data. The algorithm is a modified form of an already developed algorithm that was designed for spatial databases only. The algorithm is applied to mine context based association rules on a microbial database to extract interesting and useful associations of microbial attributes with the existence of a hydrocarbon reserve. The surface and soil manifestations caused by the presence of hydrocarbon oxidizing microbes are selected from existing literature and stored in a shared database. The algorithm is applied on the said database to generate direct and indirect associations among the stored microbial indicators. These associations are then correlated with the probability of hydrocarbon’s existence. The numerical evaluation shows better accuracy for non-spatial data as compared to conventional algorithms at generating reliable and robust rules. PMID:28393846
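
    For readers unfamiliar with association rules, the following is a minimal sketch of plain (not context based) rule mining using the standard support and confidence measures. It is not the algorithm proposed in the paper; the records, the indicator names and the `rules_for` helper are hypothetical and serve only to show what a rule of the form "indicators -> hydrocarbon" looks like.

```python
from itertools import combinations

# Hypothetical toy records: each set lists what was observed at one site.
records = [
    {"methane_oxidizers", "propane_oxidizers", "hydrocarbon"},
    {"methane_oxidizers", "hydrocarbon"},
    {"propane_oxidizers"},
    {"methane_oxidizers", "propane_oxidizers", "hydrocarbon"},
    {"methane_oxidizers"},
]


def count(itemset, records):
    """Number of records that contain every item in itemset."""
    return sum(1 for r in records if itemset <= r)


def rules_for(target, records, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, support, confidence) for rules antecedent -> target."""
    items = sorted(set().union(*records) - {target})
    found = []
    for size in (1, 2):  # antecedents of one or two indicators
        for antecedent in combinations(items, size):
            a = frozenset(antecedent)
            n_a = count(a, records)
            if n_a == 0:
                continue
            n_both = count(a | {target}, records)
            support = n_both / len(records)   # share of all records matching the rule
            confidence = n_both / n_a         # share of antecedent records with target
            if support >= min_support and confidence >= min_confidence:
                found.append((antecedent, support, confidence))
    return found


rules = rules_for("hydrocarbon", records)
for antecedent, support, confidence in rules:
    print(antecedent, "->", "hydrocarbon", round(support, 2), round(confidence, 2))
```

    The thresholds `min_support` and `min_confidence` are arbitrary here; in practice they are tuned to the size and noise level of the database.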

  12. Alkemio: association of chemicals with biomedical topics by text and data mining.

    PubMed

    Gijón-Correas, José A; Andrade-Navarro, Miguel A; Fontaine, Jean F

    2014-07-01

    The PubMed® database of biomedical citations allows the retrieval of scientific articles studying the function of chemicals in biology and medicine. Mining millions of available citations to search reported associations between chemicals and topics of interest would require substantial human time. We have implemented the Alkemio text mining web tool and SOAP web service to help in this task. The tool uses biomedical articles discussing chemicals (including drugs), predicts their relatedness to the query topic with a naïve Bayesian classifier and ranks all chemicals by P-values computed from random simulations. Benchmarks on seven human pathways showed good retrieval performance (areas under the receiver operating characteristic curves ranged from 73.6 to 94.5%). Comparison with existing tools to retrieve chemicals associated to eight diseases showed the higher precision and recall of Alkemio when considering the top 10 candidate chemicals. Alkemio is a high performing web tool ranking chemicals for any biomedical topics and it is free to non-commercial users. http://cbdm.mdc-berlin.de/∼medlineranker/cms/alkemio. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Process mining is an underutilized clinical research tool in transfusion medicine.

    PubMed

    Quinn, Jason G; Conrad, David M; Cheng, Calvino K

    2017-03-01

    To understand inventory performance, transfusion services commonly use key performance indicators (KPIs) as summary descriptors of inventory efficiency that are graphed, trended, and used to benchmark institutions. Here, we summarize current limitations in KPI-based evaluation of blood bank inventory efficiency and propose process mining as an ideal methodology for application to inventory management research to improve inventory flows and performance. The transit of a blood product from inventory receipt to final disposition is complex and relates to many internal and external influences, and KPIs may be inadequate to fully understand the complexity of the blood supply chain and how units interact with its processes. Process mining lends itself well to analysis of blood bank inventories, and modern laboratory information systems can track nearly all of the complex processes that occur in the blood bank. Process mining is an analytical tool already used in other industries and can be applied to blood bank inventory management and research through laboratory information systems data using commercial applications. Although the current understanding of real blood bank inventories is value-centric through KPIs, it potentially can be understood from a process-centric lens using process mining. © 2017 AABB.

  14. RESPIROMETRY AS A TOOL TO DETERMINE METAL TOXICITY IN A SULFATE REDUCING BACTERIAL CULTURE

    EPA Science Inventory

    A novel method under development for treatment of acid mine drainage waste uses biologically- generated hydrogen sulfide (H2S) to precipitate the metals in acid mine drainage (principally zinc, copper, aluminum, nickel, cadmium, arsenic, manganese, iron, and cobalt). The insolub...

  15. Phrase Mining of Textual Data to Analyze Extracellular Matrix Protein Patterns Across Cardiovascular Disease.

    PubMed

    Liem, David Alexandre; Murali, Sanjana; Sigdel, Dibakar; Shi, Yu; Wang, Xuan; Shen, Jiaming; Choi, Howard; Caufield, J Harry; Wang, Wei; Ping, Peipei; Han, Jiawei

    2018-05-18

    Extracellular matrix (ECM) proteins have been shown to play important roles regulating multiple biological processes in an array of organ systems, including the cardiovascular system. By using a novel bioinformatics text-mining tool, we studied six categories of cardiovascular disease (CVD), namely ischemic heart disease (IHD), cardiomyopathies (CM), cerebrovascular accident (CVA), congenital heart disease (CHD), arrhythmias (ARR), and valve disease (VD), anticipating novel ECM protein-disease and protein-protein relationships hidden within vast quantities of textual data. We conducted a phrase-mining analysis, delineating the relationships of 709 ECM proteins with the six groups of CVDs reported in 1,099,254 abstracts. The technology pipeline known as Context-aware Semantic Online Analytical Processing (CaseOLAP) was applied to semantically rank the association of proteins to each and all six CVDs, performing analyses to quantify each protein-disease relationship. We performed principal component analysis and hierarchical clustering of the data, where each protein is visualized as a six dimensional vector. We found that ECM proteins display variable degrees of association with the six CVDs; certain CVDs share groups of associated proteins whereas others have divergent protein associations. We identified 82 ECM proteins sharing associations with all six CVDs. Our bioinformatics analysis ascribed distinct ECM pathways (via Reactome) from this subset of proteins, namely insulin-like growth factor regulation and interleukin-4 and interleukin-13 signaling, suggesting their contribution to the pathogenesis of all six CVDs. Finally, we performed hierarchical clustering analysis and identified protein clusters associated with a targeted CVD; analyses revealed unexpected insights underlying ECM-pathogenesis of CVDs.

  16. An Integrated Economics Model for ISRU in Support of a Mars Colony - Initial Status Report

    NASA Technical Reports Server (NTRS)

    Shishko, Robert; Fradet, Rene; Saydam, Serkan; Tapia-Cortez, Carlos; Dempster, Andrew G.; Coulton, Jeff

    2015-01-01

    The aim of this effort is to develop an integrated set of risk-based financial and technical models to evaluate multiple Off-Earth Mining (OEM) scenarios. This quantitative, scenario- and simulation-based tool will help identify combinations of market variables, technical parameters, and policy levers that will enable the expansion of the global economy into the solar system and return economic benefits. Human ventures in space are entering a new phase in which missions formerly driven by government agencies are now being replaced by those led by commercial enterprises - in launch, satellite deployment, resupply of the International Space Station, and space tourism. In the not-too-distant future, commercial opportunities will also include the mining of asteroids, the Moon, and Mars. This investigation will examine the role of OEM in a growing space economy. (In this investigation, the term 'mining' is taken to embrace minerals, ice/water, and other in situ resources.) OEM can be the engine that drives the space economy, so it would be useful to understand what OEM market conditions and technology requirements are needed for that economy to prosper. These specific elements will be studied in the wider context of creating an economy that could ultimately support a sustainable Mars Colony. Such a colony will need in situ resources not only for its own survival, but to prosper and grow, it must create viable business ventures, essentially by fulfilling the demand for in situ resources from and on Mars. This investigation will focus on understanding the role and economic prospect for OEM associated with the Human Colonization of Mars (HCM).

  17. Sediment matrix characterization as a tool for evaluating the environmental impact of heavy metals in metal mining, smelting, and ore processing areas.

    PubMed

    Ružičková, Silvia; Remeteiová, Dagmar; Mičková, Vladislava; Dirner, Vojtech

    2018-02-21

    In this work, the matrix characterization (mineralogy, total and local chemical composition, and total organic (TOC) and inorganic carbon (TIC) contents) of different types of sediments from mining- and metallurgy-influenced areas and the assessment of the impact of the matrix on the association of potentially hazardous metals with the mineral phases of these samples, which affect their mobility in the environment, are presented. For these purposes, sediment samples with different origins and from different locations in the environment were analyzed. Anthropogenic sediments from metal-rich post-flotation tailings (Lintich, Slovakia) represent waste from ore processing, natural river sediments from the Hornád River (Košice, Slovakia) represent areas influenced predominantly by the metallurgical industry, and lake sediments from a water reservoir Ružín (inflow from the Hornád and Hnilec Rivers, Slovakia) represent the impact of the metallurgical and/or mining industries. The total metal contents were determined by X-ray fluorescence (XRF) analysis, the local chemical and morphological microanalysis by scanning electron microscopy with energy-dispersive spectroscopy (SEM-EDS), and the TOC and TIC contents by infrared (IR) spectrometry. The mobility/bioavailability of Cu, Pb, and Zn in/from sediments at the studied areas was assessed by ethylenediaminetetraacetic acid (EDTA) and acetic acid (AA) extraction and is discussed in the context of the matrix composition. The contents of selected potentially hazardous elements in the extracts were determined by the high-resolution continuum source flame atomic absorption spectrometry (HR-CS FAAS).

  18. Numerical Modeling Tools for the Prediction of Solution Migration Applicable to Mining Site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martell, M.; Vaughn, P.

    1999-01-06

    Mining has always had an important influence on cultures and traditions of communities around the globe and throughout history. Today, because mining legislation places heavy emphasis on environmental protection, there is great interest in having a comprehensive understanding of ancient mining and mining sites. Multi-disciplinary approaches (i.e., Pb isotopes as tracers) are being used to explore the distribution of metals in natural environments. Another successful approach is to model solution migration numerically. A proven method to simulate solution migration in natural rock salt has been applied to project through time for 10,000 years the system performance and solution concentrations surrounding a proposed nuclear waste repository. This capability is readily adaptable to simulate solution migration around mining.

  19. Mining Deployment Optimization

    NASA Astrophysics Data System (ADS)

    Čech, Jozef

    2016-09-01

    The deployment problem, researched primarily in the military sector, is emerging in some other industries, mining included. The principal decision is how to deploy some activities in space and time to achieve a desired outcome while complying with certain requirements or limits. Requirements and limits are on the side of constraints, while minimizing costs or maximizing some benefits are on the side of objectives. A model with application to mining of a polymetallic deposit is presented. The main intention of the computer-based tool is to give a mining engineer quick decision solutions together with the possibility to experiment. The task is to determine the strategic deployment of mining activities on a deposit, meeting planned output from the mine and at the same time complying with limited reserves and haulage capacities. Priorities and benefits can be formulated by the planner.

  20. e-IQ and IQ knowledge mining for generalized LDA

    NASA Astrophysics Data System (ADS)

    Jenkins, Jeffrey; van Bergem, Rutger; Sweet, Charles; Vietsch, Eveline; Szu, Harold

    2015-05-01

    How can the human brain uncover patterns, associations and features in real-time, real-world data? There must be a general strategy used to transform raw signals into useful features, but representing this generalization in the context of our information extraction tool set is lacking. In contrast to Big Data (BD), Large Data Analysis (LDA) has become a reachable multi-disciplinary goal in recent years due in part to high performance computers and algorithm development, as well as the availability of large data sets. However, the experience of Machine Learning (ML) and information communities has not been generalized into an intuitive framework that is useful to researchers across disciplines. The data exploration phase of data mining is a prime example of this unspoken, ad-hoc nature of ML - the Computer Scientist works with a Subject Matter Expert (SME) to understand the data, and then build tools (i.e. classifiers, etc.) which can benefit the SME and the rest of the researchers in that field. We ask, why is there not a tool to represent information in a meaningful way to the researcher asking the question? Meaning is subjective and contextual across disciplines, so to ensure robustness, we draw examples from several disciplines and propose a generalized LDA framework for independent data understanding of heterogeneous sources which contribute to Knowledge Discovery in Databases (KDD). Then, we explore the concept of adaptive Information resolution through a 6W unsupervised learning methodology feedback system. In this paper, we will describe the general process of man-machine interaction in terms of an asymmetric directed graph theory (digging for embedded knowledge), and model the inverse machine-man feedback (digging for tacit knowledge) as an ANN unsupervised learning methodology. 
Finally, we propose a collective learning framework which utilizes a 6W semantic topology to organize heterogeneous knowledge and diffuse information to entities within a society in a personalized way.

  1. Assessing human rights impacts in corporate development projects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Salcito, Kendyl, E-mail: kendyl.salcito@unibas.ch; University of Basel, P.O. Box, CH-4003 Basel; NomoGaia, 1900 Wazee Street, Suite 303, Denver, CO 80202

    Human rights impact assessment (HRIA) is a process for systematically identifying, predicting and responding to the potential impact on human rights of a business operation, capital project, government policy or trade agreement. Traditionally, it has been conducted as a desktop exercise to predict the effects of trade agreements and government policies on individuals and communities. In line with a growing call for multinational corporations to ensure they do not violate human rights in their activities, HRIA is increasingly incorporated into the standard suite of corporate development project impact assessments. In this context, the policy world's non-structured, desk-based approaches to HRIA are insufficient. Although a number of corporations have commissioned and conducted HRIA, no broadly accepted and validated assessment tool is currently available. The lack of standardisation has complicated efforts to evaluate the effectiveness of HRIA as a risk mitigation tool, and has caused confusion in the corporate world regarding company duties. Hence, clarification is needed. The objectives of this paper are (i) to describe an HRIA methodology, (ii) to provide a rationale for its components and design, and (iii) to illustrate implementation of HRIA using the methodology in two selected corporate development projects—a uranium mine in Malawi and a tree farm in Tanzania. We found that as a prognostic tool, HRIA could examine potential positive and negative human rights impacts and provide effective recommendations for mitigation. However, longer-term monitoring revealed that recommendations were unevenly implemented, dependent on market conditions and personnel movements. This instability in the approach to human rights suggests a need for on-going monitoring and surveillance. Highlights: • We developed a novel methodology for corporate human rights impact assessment. • We piloted the methodology on two corporate projects—a mine and a plantation. • Human rights impact assessment exposed impacts not foreseen in ESIA. • Corporations adopted the majority of findings, but not necessarily immediately. • Methodological advancements are expected for monitoring processes.

  2. A risk-based decision support framework for selection of appropriate safety measure system for underground coal mines.

    PubMed

    Samantra, Chitrasen; Datta, Saurav; Mahapatra, Siba Sankar

    2017-03-01

    In the context of the underground coal mining industry, the increased economic issues regarding implementation of additional safety measure systems, along with growing public awareness of the need to ensure a high level of workers' safety, have put great pressure on managers to find the best solution, that is, to select an alternative that is both safe and economically viable. A risk-based decision support system plays an important role in finding such solutions amongst candidate alternatives with respect to multiple decision criteria. Therefore, in this paper, a unified risk-based decision-making methodology has been proposed for selecting an appropriate safety measure system in relation to an underground coal mining industry with respect to multiple risk criteria such as financial risk, operating risk, and maintenance risk. The proposed methodology uses interval-valued fuzzy set theory for modelling vagueness and subjectivity in the estimates of fuzzy risk ratings for making an appropriate decision. The methodology is based on aggregative fuzzy risk analysis and multi-criteria decision making. The selection decisions are made within the context of understanding the total integrated risk that is likely to be incurred while adopting a particular safety system alternative. Effectiveness of the proposed methodology has been validated through a real-time case study. The resulting final priority ranking appears fairly consistent.

  3. Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art

    PubMed Central

    Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H.

    2014-01-01

    Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. Text mining is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources—such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs—that are amenable to text-mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance. PMID:25151493

  4. Monitoring Metal Pollution Levels in Mine Wastes around a Coal Mine Site Using GIS

    NASA Astrophysics Data System (ADS)

    Sanliyuksel Yucel, D.; Yucel, M. A.; Ileri, B.

    2017-11-01

    In this case study, metal pollution levels in mine wastes at the Etili coal mine (Can coal basin, NW Turkey) are evaluated using geographical information system (GIS) tools. Etili coal mine has been operated since the 1980s as an open pit. Acid mine drainage is the main environmental problem around the coal mine. The main environmental contamination source is mine wastes stored around the mine site. Mine wastes were dumped over an extensive area along the riverbeds, and are now abandoned. Mine waste samples were homogeneously taken at 10 locations within the sampling area of 102.33 ha. The paste pH and electrical conductivity values of mine wastes ranged from 2.87 to 4.17 and 432 to 2430 μS/cm, respectively. Maximum Al, Fe, Mn, Pb, Zn and Ni concentrations of wastes were measured as 109300, 70600, 309.86, 115.2, 38 and 5.3 mg/kg, respectively. The Al, Fe and Pb concentrations of mine wastes are higher than world surface rock average values. The geochemical analysis results from the study area were presented in the form of maps. The GIS based environmental database will serve as a reference study for our future work.

  5. Data mining for health executive decision support: an imperative with a daunting future!

    PubMed Central

    Glover, Saundra; Rivers, Patrick A; Asoh, Derek A; Piper, Crystal N; Murph, Keva

    2010-01-01

    Data mining is highly profiled. It has the potential to enhance executive information systems. Such enhancement would mean better decision-making by management, which in turn would mean better services for customers. While the future of data mining as technology should be exciting, some are worried about privacy concerns, which make the future of data mining daunting. This paper examines why data mining is highly profiled – the imperative toward data mining, data mining models and processes. Additionally, the paper examines some of the benefits and challenges of using data mining processes within the health-care arena. We cast the future of data mining by highlighting two of the many data mining tools available – one commercial and one freely available. Subsequently, we discuss a number of social and technical factors that may thwart the extensive deployment of data mining, especially when the intent is to know more about the people that organizations have to serve and cast a view of what the future holds for data mining. This component is especially important when attempting to determine the longevity of data mining within health-care organizations. It is hoped that our discussions would be useful to organizations as they engage data mining, strategies for executive information systems and information policy issues. PMID:20150610

  6. Framing Psychology as a Discipline (1950-1999): A Large-Scale Term Co-Occurrence Analysis of Scientific Literature in Psychology.

    PubMed

    Flis, Ivan; van Eck, Nees Jan

    2017-07-20

    This study investigated the structure of psychological literature as represented by a corpus of 676,393 articles in the period from 1950 to 1999. The corpus was extracted from 1,269 journals indexed by PsycINFO. The data in our analysis consisted of the relevant terms mined from the titles and abstracts of all of the articles in the corpus. Based on the co-occurrences of these terms, we developed a series of chronological visualizations using a bibliometric software tool called VOSviewer. These visualizations produced a stable structure through the 5 decades under analysis, and this structure was analyzed as a data-mined proxy for the disciplinary formation of scientific psychology in the second part of the 20th century. Considering the stable structure uncovered by our term co-occurrence analysis and its visualization, we discuss it in the context of Lee Cronbach's "Two Disciplines of Scientific Psychology" (1957) and conventional history of 20th-century psychology's disciplinary formation and history of methods. Our aim was to provide a comprehensive digital humanities perspective on the large-scale structural development of research in English-language psychology from 1950 to 1999. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
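
    The term co-occurrence counting that underlies an analysis of this kind can be illustrated with a minimal sketch. It assumes that each article has already been reduced to a list of extracted terms, and it is not VOSviewer's own procedure; the function name cooccurrence_counts and the toy terms are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of terms, how many documents contain both."""
    counts = Counter()
    for terms in documents:
        # De-duplicate and sort so each pair has one canonical ordering.
        unique_terms = sorted(set(terms))
        for pair in combinations(unique_terms, 2):
            counts[pair] += 1
    return counts


# Hypothetical toy corpus: one list of extracted terms per article.
docs = [
    ["memory", "learning", "behavior"],
    ["learning", "behavior"],
    ["memory", "perception"],
]
counts = cooccurrence_counts(docs)
print(counts[("behavior", "learning")])
```

    Pairs are stored as alphabetically sorted tuples, so a lookup must use the same ordering; a pair that never co-occurs simply returns a count of zero.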

  7. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hero, Alfred O.; Rajaratnam, Bala

    When can reliable inference be drawn in the “Big Data” context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data.” Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.
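
    The "variable rich but sample starved" regime can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' method: the helper names pearson and screen_correlations, the threshold of 0.95, and the sizes n = 4 and p = 200 are all hypothetical choices. Every variable is drawn independently, so any pair that passes the screen is a false discovery. For independent Gaussian variables and n = 4, the sample correlation is uniformly distributed on [-1, 1], so roughly 5% of pairs are expected to exceed 0.95 in absolute value.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def screen_correlations(variables, threshold):
    """Return (i, j, r) for every variable pair with |r| >= threshold."""
    hits = []
    for i, j in combinations(range(len(variables)), 2):
        r = pearson(variables[i], variables[j])
        if abs(r) >= threshold:
            hits.append((i, j, r))
    return hits


# Hypothetical setting: n = 4 samples of p = 200 mutually independent variables.
rng = random.Random(0)
n_samples, n_variables = 4, 200
variables = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)]
             for _ in range(n_variables)]

hits = screen_correlations(variables, threshold=0.95)
n_pairs = n_variables * (n_variables - 1) // 2
print(f"{len(hits)} of {n_pairs} independent pairs pass |r| >= 0.95")
```

    Increasing n_samples while holding n_variables fixed shrinks the number of spurious hits, which is the sample-complexity trade-off the abstract describes.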

  8. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    PubMed Central

    Hero, Alfred O.; Rajaratnam, Bala

    2015-01-01

    When can reliable inference be drawn in the “Big Data” context? This paper presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large scale inference. In large scale data applications like genomics, connectomics, and eco-informatics the dataset is often variable-rich but sample-starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data”. Sample complexity however has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; 3) the purely high dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high dimensional learning rates and sample complexity for different structured covariance models and different inference tasks. PMID:27087700

  9. Foundational Principles for Large-Scale Inference: Illustrations Through Correlation Mining

    DOE PAGES

    Hero, Alfred O.; Rajaratnam, Bala

    2015-12-09

    When can reliable inference be drawn in the “Big Data” context? This article presents a framework for answering this fundamental question in the context of correlation mining, with implications for general large-scale inference. In large-scale data applications like genomics, connectomics, and eco-informatics, the data set is often variable rich but sample starved: a regime where the number n of acquired samples (statistical replicates) is far fewer than the number p of observed variables (genes, neurons, voxels, or chemical constituents). Much of recent work has focused on understanding the computational complexity of proposed methods for “Big Data.” Sample complexity, however, has received relatively less attention, especially in the setting when the sample size n is fixed, and the dimension p grows without bound. To address this gap, we develop a unified statistical framework that explicitly quantifies the sample complexity of various inferential tasks. Sampling regimes can be divided into several categories: 1) the classical asymptotic regime where the variable dimension is fixed and the sample size goes to infinity; 2) the mixed asymptotic regime where both variable dimension and sample size go to infinity at comparable rates; and 3) the purely high-dimensional asymptotic regime where the variable dimension goes to infinity and the sample size is fixed. Each regime has its niche but only the latter regime applies to exa-scale data dimension. We illustrate this high-dimensional framework for the problem of correlation mining, where it is the matrix of pairwise and partial correlations among the variables that are of interest. Correlation mining arises in numerous applications and subsumes the regression context as a special case. We demonstrate various regimes of correlation mining based on the unifying perspective of high-dimensional learning rates and sample complexity for different structured covariance models and different inference tasks.

  10. Realizing Modeling and Mapping tools to Study the Upsurge of Noise Pollution as a Result of Open-Cast Mining and Transportation Activities

    PubMed Central

    Lokhande, Satish K.; Jain, Mohindra C.; Dhawale, Satyajeet A.; Gautam, Rakesh; Bodhe, Ghanshyam L.

    2018-01-01

    Introduction: In open-cast mines, noise pollution has become a serious concern due to the extreme use of heavy earth moving machinery (HEMM). Materials and Methods: This study focuses on measuring and assessing the effects of the existing noise levels of major operational mines in the Keonjhar, Sundergadh, and Mayurbhanj districts of Odisha, India. The transportation noise levels were also considered in this study and were predicted using the modified Federal Highway Administration (FHWA) model. Result and Discussion: It was observed that noise induced by HEMM such as rock breakers, jackhammers, dumpers, and excavators, blasting noise in the mining terrain, as well as associated transportation noise became a major source of annoyance to the inhabitants living in proximity to the mines. The noise produced by mechanized mining operations was observed between 74.3 and 115.2 dB(A), and its impact on residential areas was observed between 49.4 and 58.9 dB(A). In addition, the noise contour maps of sound level dispersion were demonstrated with the utilization of advanced noise prediction software tools for better understanding. Conclusion: Finally, the predicted values at residential zone and traffic noise are correlated with observed values, and the coefficient of determination, R², was calculated to be 0.6891 and 0.5967, respectively. PMID:29676297

  11. Platinum and Gold Mining in South Africa: The Context of the Marikana Massacre.

    PubMed

    Cairncross, Eugene; Kisting, Sophia

    2016-02-01

    Mining is a source of extraordinary wealth, but its benefits often do not accrue to the workers and communities most involved. This paper presents two case studies of mining in South Africa to reflect on the history and legacy of mining both through observation and through the voices of affected communities. Interviews and observations on field visits to the platinum and gold mining areas of South Africa in the immediate aftermath of the Marikana massacre highlight this legacy--including vast quantities of tailings dumps and waste rock, lakes of polluted water and a devastated physical and social environment, high unemployment, high rates of occupational injury and disease including silicosis with co-morbidities, absent social security, and disrupted rural and agricultural communities. Exploitative conditions of work and the externalization of the health and environmental costs of mining will require international solidarity, robust independent trade unions, and a commitment to human rights. © The Author(s) 2016.

  12. Stochastic production phase design for an open pit mining complex with multiple processing streams

    NASA Astrophysics Data System (ADS)

    Asad, Mohammad Waqar Ali; Dimitrakopoulos, Roussos; van Eldert, Jeroen

    2014-08-01

    In a mining complex, the mine is a source of supply of valuable material (ore) to a number of processes that convert the raw ore to a saleable product or a metal concentrate for production of the refined metal. In this context, expected variation in metal content throughout the extent of the orebody defines the inherent uncertainty in the supply of ore, which impacts the subsequent ore and metal production targets. Traditional optimization methods for designing production phases and ultimate pit limit of an open pit mine not only ignore the uncertainty in metal content, but, in addition, commonly assume that the mine delivers ore to a single processing facility. A stochastic network flow approach is proposed that jointly integrates uncertainty in supply of ore and multiple ore destinations into the development of production phase design and ultimate pit limit. An application at a copper mine demonstrates the intricacies of the new approach. The case study shows a 14% higher discounted cash flow when compared to the traditional approach.

  13. ABC for AIDS prevention in Guinea: migrant gold mining communities address their risks.

    PubMed

    Kis, Adam Daniel

    2010-04-01

    Contrary to expectation when compared with other migrant mining zones of sub-Saharan Africa, the nation of Guinea has a comparatively low and stable HIV rate. In addition, the regions with the largest gold, diamond, and bauxite mining operations report the lowest HIV rates within the country. This research set out to explain practices and beliefs within gold mining communities near Siguiri, Guinea--the highest-producing gold mining zone in the country--that may contribute to this phenomenon, particularly as they relate to the Abstinence, Be faithful, use a Condom approach to AIDS prevention. Structured interviews on a randomly selected sample of 460 adults and regular visitation to 16 pharmacies and health clinics within the mining zone yielded data showing that abstinence and condom use are minimally practiced for AIDS prevention. Instead, faithfulness to partners was overwhelmingly reported as the method of choice for AIDS avoidance. In addition, this research explored ways in which local conceptions of fidelity differed from those generally understood in other contexts, including engagement in short-term marriages at the gold mining sites.

  14. Characterizing Ground-Water Flow Paths in High-Altitude Fractured Rock Settings Impacted by Mining Activities

    NASA Astrophysics Data System (ADS)

    Wireman, M.; Williams, D.

    2003-12-01

    The Rocky Mountains of the western USA have tens of thousands of abandoned, inactive and active precious-metal (gold, silver, copper) mine sites. Most of these sites occur in fractured rock hydrogeologic settings. Mining activities often resulted in mobilization and transport of associated heavy metals (zinc, cadmium, lead) which pose a significant threat to aquatic communities in mountain streams. Transport of heavy metals from mine related sources (waste rock piles, tailings impoundments, underground workings, mine pits) can occur along numerous hydrological pathways including complex fracture controlled ground-water pathways. Since 1991, the United States Environmental Protection Agency, the Colorado Division of Minerals and Geology and the University of Colorado (INSTAAR) have been conducting applied hydrologic research at the Mary Murphy underground mine. The mine is in the Chalk Creek mining district which is located on the southwestern flanks of the Mount Princeton Batholith, a Tertiary age intrusive comprised primarily of quartz monzonite. The Mount Princeton batholith comprises a large portion of the southern part of the Collegiate Range west of Buena Vista in Chaffee County, CO. Chalk Creek and its 14 tributaries drain about 24,900 hectares of the eastern slopes of the Range including the mining district. Within the mining district, ground-water flow is controlled by the distribution, orientation and permeability of discontinuities within the bedrock. Important discontinuities include faults, joints and weathered zones. Local and intermediate flow systems are perturbed by extensive underground excavations associated with mining (adits, shafts, stopes, drifts, etc.). During the past 12 years numerous hydrological investigations have been completed. The investigations have been focused on developing tools for characterizing ground-water flow and contaminant transport in the vicinity of hard-rock mines in fractured-rock settings. In addition, the results from these investigations have been used to develop a sound conceptual model of ground-water flow and transport of heavy metals from the mine workings to Chalk Creek. Ground-water tracing techniques (using organic, fluorescent dyes) have been successfully used to delineate ground-water flow paths. Surface-water tracing techniques have been used to acquire very accurate stream flow measurements and to identify ground-water inflow zones to streams. Stable (O18/D) and radioactive (tritium, sulphur 35) isotope analysis of waters flowing into and out of underground workings has proved useful for conducting end member mixing analysis to determine which inflows and outflows are most significant with respect to metals loading. Hydrogeologic mapping, inverse geochemical modeling (using MINTEQAK code) and helium 3 analysis of ground water have also proven to be useful tools. These tools, used in combination, have provided multiple lines of evidence regarding the nature, timing and magnitude of ground-water inflow into underground mine workings and the distribution and types of hydrologic pathways that transport metals from the underground workings to Chalk Creek. This paper presents the results of some of the more important hydrologic investigations completed at the site and a conceptual model of ground-water flow in fractured rock settings that have been impacted by underground mining activities.

  15. Data Analysis and Data Mining: Current Issues in Biomedical Informatics

    PubMed Central

    Bellazzi, Riccardo; Diomidous, Marianna; Sarkar, Indra Neil; Takabayashi, Katsuhiko; Ziegler, Andreas; McCray, Alexa T.

    2011-01-01

    Summary Background Medicine and biomedical sciences have become data-intensive fields, which, at the same time, enable the application of data-driven approaches and require sophisticated data analysis and data mining methods. Biomedical informatics provides a proper interdisciplinary context to integrate data and knowledge when processing available information, with the aim of giving effective decision-making support in clinics and translational research. Objectives To reflect on different perspectives related to the role of data analysis and data mining in biomedical informatics. Methods On the occasion of the 50th year of Methods of Information in Medicine a symposium was organized, that reflected on opportunities, challenges and priorities of organizing, representing and analysing data, information and knowledge in biomedicine and health care. The contributions of experts with a variety of backgrounds in the area of biomedical data analysis have been collected as one outcome of this symposium, in order to provide a broad, though coherent, overview of some of the most interesting aspects of the field. Results The paper presents sections on data accumulation and data-driven approaches in medical informatics, data and knowledge integration, statistical issues for the evaluation of data mining models, translational bioinformatics and bioinformatics aspects of genetic epidemiology. Conclusions Biomedical informatics represents a natural framework to properly and effectively apply data analysis and data mining methods in a decision-making context. In the future, it will be necessary to preserve the inclusive nature of the field and to foster an increasing sharing of data and methods between researchers. PMID:22146916

  16. Symbolic solutions for deadly dilemmas: an analysis of federal coal mine health and safety legislation.

    PubMed

    Curran, D J

    1984-01-01

    Numerous studies of coal mine laws have argued that the passage of all significant health and safety legislation can be attributed to a succession of catastrophic disasters which heightened awareness and propelled lawmakers into action. This paper takes issue with this "disaster-law" argument because it obscures the intricacies of law creation by focusing on a single factor. More accurately, mining disasters represent one dimension of a process aimed at resolving conflicts occurring within a specific social context. Historically, legislation has been utilized to avert economic crises by addressing the demands of protesting miners. Unfortunately, while the "written law" assured improvements, the "law in action" did not meet these guarantees and the deaths in the mines continued. A case study of the Coal Mine Health and Safety Act of 1969 demonstrates how a law with apparently progressive standards can fail to effect change because of its dualistic nature and incomplete implementation.

  17. The Interaction of Economic Rewards and Moral Convictions in Predicting Attitudes toward Resource Use

    PubMed Central

    Bastian, Brock; Zhang, Airong; Moffat, Kieren

    2015-01-01

    When people are morally convicted regarding a specific issue, these convictions exert a powerful influence on their attitudes and behavior. In the current research we examined whether there are boundary conditions to the influence of this effect. Specifically, whether in the context of salient economic rewards, moral convictions may become weaker predictors of attitudes regarding resource use. Focusing on the issue of mining we gathered large-scale samples across three different continents (Australia, Chile, and China). We found that moral convictions against mining were related to a reduced acceptance of mining in each country, while perceived economic rewards from mining increased acceptance. These two motivations interacted, however, such that when perceived economic benefit from mining was high, the influence of moral conviction was weaker. The results highlight the importance of understanding the roles of both moral conviction and financial gain in motivating attitudes towards resource use. PMID:26267904

  18. Antimicrobial Stewardship in a Community Hospital: Attacking the More Difficult Problems

    PubMed Central

    Philmon, Carla L.; Johnson, Gregory D.; Ward, William S.; Rivers, LaToya L.; Williamson, Sharon A.; Goodman, Edward L.

    2014-01-01

    Background: Antibiotic stewardship has been proposed as an important way to reduce or prevent antibiotic resistance. In 2001, a community hospital implemented an antimicrobial management program. It was successful in reducing antimicrobial utilization and expenditure. In 2011, with the implementation of a data-mining tool, the program was expanded and its focus transitioned from control of antimicrobial use to guiding judicious antimicrobial prescribing. Objective: To test the hypothesis that adding a data-mining tool to an existing antimicrobial stewardship program will further increase appropriate use of antimicrobials. Design: Interventional study with historical comparison. Methods: Rules and alerts were built into the data-mining tool to aid in identifying inappropriate antibiotic utilization. Decentralized pharmacists acted on alerts for intravenous (IV) to oral conversion, perioperative antibiotic duration, and restricted antimicrobials. An Infectious Diseases (ID) Pharmacist and ID Physician/Hospital Epidemiologist focused on all other identified alert types such as antibiotic de-escalation, bug-drug mismatch, and double coverage. Electronic chart notes and phone calls to physicians were utilized to make recommendations. Results: During 2012, 2,003 antimicrobial interventions were made with a 90% acceptance rate. Targeted broad-spectrum antimicrobial use decreased by 15% in 2012 compared to 2010, which represented cost savings of $1,621,730. There were no statistically significant changes in antimicrobial resistance, and no adverse patient outcomes were noted. Conclusions: The addition of a data-mining tool to an antimicrobial stewardship program can further decrease inappropriate use of antimicrobials, provide a greater reduction in overall antimicrobial use, and provide increased cost savings without negatively affecting patient outcomes. PMID:25477615

  19. Data Mining and Optimization Tools for Developing Engine Parameters Tools

    NASA Technical Reports Server (NTRS)

    Dhawan, Atam P.

    1998-01-01

    This project was awarded for understanding the problem and developing a plan for Data Mining tools for use in designing and implementing an Engine Condition Monitoring System. From the total budget of $5,000, Tricia and I studied the problem domain for developing an Engine Condition Monitoring system using the sparse and non-standardized datasets to be available through a consortium at NASA Lewis Research Center. We visited NASA three times to discuss additional issues related to the dataset, which was not made available to us. We discussed and developed a general framework of data mining and optimization tools to extract useful information from sparse and non-standard datasets. These discussions led to the training of Tricia Erhardt to develop Genetic Algorithm based search programs which were written in C++ and used to demonstrate the capability of the GA algorithm in searching for an optimal solution in noisy datasets. From the study and discussion with NASA LERC personnel, we then prepared a proposal, which is being submitted to NASA for future work for the development of data mining algorithms for engine conditional monitoring. The proposed set of algorithms uses wavelet processing for creating a multi-resolution pyramid of the data for GA based multi-resolution optimal search. Wavelet processing is proposed to create a coarse resolution representation of data providing two advantages in GA based search: 1. We will have less data to begin with to make search sub-spaces. 2. It will have robustness against the noise because at every level of wavelet based decomposition, we will be decomposing the signal into low pass and high pass filters.
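
    The multi-resolution pyramid described above can be sketched in a few lines. The abstract does not say which wavelet was proposed, so the unnormalized Haar wavelet used here is an assumption chosen for simplicity, and the function names haar_step and haar_pyramid and the sample signal are hypothetical. Each level halves the amount of data (advantage 1) by keeping the low-pass averages and setting aside the high-pass differences, where much of the sample-to-sample noise ends up (advantage 2).

```python
def haar_step(signal):
    """One level of an unnormalized Haar decomposition.

    Returns (approx, detail): pairwise averages (low-pass output) and
    pairwise half-differences (high-pass output).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_pyramid(signal, levels):
    """Return [original, level-1 approx, level-2 approx, ...], coarsest last."""
    pyramid = [list(signal)]
    for _ in range(levels):
        approx, _detail = haar_step(pyramid[-1])
        pyramid.append(approx)
    return pyramid


# Hypothetical eight-sample signal standing in for an engine parameter trace.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
pyramid = haar_pyramid(signal, levels=3)
for level, data in enumerate(pyramid):
    print(level, data)
```

    A coarse-to-fine search would start on the last (shortest) entry of the pyramid and refine promising candidates on progressively finer levels; the genetic algorithm itself is outside the scope of this sketch.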

  20. Sustainable mineral resources management: from regional mineral resources exploration to spatial contamination risk assessment of mining

    NASA Astrophysics Data System (ADS)

    Jordan, Gyozo

    2009-07-01

    Wide-spread environmental contamination associated with historic mining in Europe has triggered social responses to improve related environmental legislation, the environmental assessment and management methods for the mining industry. Mining has some unique features such as natural background contamination associated with mineral deposits, industrial activities and contamination in the three-dimensional subsurface space, problem of long-term remediation after mine closure, problem of secondary contaminated areas around mine sites, land use conflicts and abandoned mines. These problems require special tools to address the complexity of the environmental problems of mining-related contamination. The objective of this paper is to show how regional mineral resources mapping has developed into the spatial contamination risk assessment of mining and how geological knowledge can be transferred to environmental assessment of mines. The paper provides a state-of-the-art review of the spatial mine inventory, hazard, impact and risk assessment and ranking methods developed by national and international efforts in Europe. It is concluded that geological knowledge on mineral resources exploration is essential and should be used for the environmental contamination assessment of mines. Also, sufficient methodological experience, knowledge and documented results are available, but harmonisation of these methods is still required for the efficient spatial environmental assessment of mine contamination.

  1. Use of a remotely piloted aircraft system for hazard assessment in a rocky mining area (Lucca, Italy)

    NASA Astrophysics Data System (ADS)

    Salvini, Riccardo; Mastrorocco, Giovanni; Esposito, Giuseppe; Di Bartolo, Silvia; Coggan, John; Vanneschi, Claudio

    2018-01-01

    The use of remote sensing techniques is now common practice in different working environments, including engineering geology. Moreover, in recent years the development of structure from motion (SfM) methods, together with rapid technological improvement, has allowed the widespread use of cost-effective remotely piloted aircraft systems (RPAS) for acquiring detailed and accurate geometrical information even in evolving environments, such as mining contexts. Indeed, the acquisition of remotely sensed data from hazardous areas provides accurate 3-D models and high-resolution orthophotos minimizing the risk for operators. The quality and quantity of the data obtainable from RPAS surveys can then be used for inspection of mining areas, audit of mining design, rock mass characterizations, stability analysis investigations and monitoring activities. Despite the widespread use of RPAS, its potential and limitations still have to be fully understood. In this paper a case study is shown where a RPAS was used for the engineering geological investigation of a closed marble mine area in Italy; direct ground-based techniques could not be applied for safety reasons. In view of the re-activation of mining operations, high-resolution images taken from different positions and heights were acquired and processed using SfM techniques to obtain an accurate and detailed 3-D model of the area. The geometrical and radiometrical information was subsequently used for a deterministic rock mass characterization, which led to the identification of two large marble blocks that pose a potential significant hazard issue for the future workforce. A preliminary stability analysis, with a focus on investigating the contribution of potential rock bridges, was then performed in order to demonstrate the potential use of RPAS information in engineering geological contexts for geohazard identification, awareness and reduction.

  2. Analysis of environmental-social changes in the surrounding area of KWB Turow in the historical context

    NASA Astrophysics Data System (ADS)

    Ciesłik, Tobiasz; Górniak-Zimroz, Justyna

    2018-01-01

    Opencast mining of large-area lignite deposits impacts the environment, and the health and life of people living in the vicinity of the conducted mining activity. Therefore, the attempt was made to develop a methodology for identification of environmental and social changes in the Bogatynia municipality (south-western Poland), resulting from functioning of Turow lignite mine within its area. During the study of changes occurring over the years, the development of mining pit was noticed, as well as the transformations of this area and impact of the mining plant on the selected elements of environment and surrounding areas. Analogue and digital data were used for the preparation of cartographic compilations, the usefulness of which was analyzed in accordance with the guidelines contained in the standard [1]. The conducted cartographic studies allowed to learn the history of the mine together with identification of changes taking place in the municipality Bogatynia. The obtained results show the form and condition of the objects in the analyzed year, allowing for the interpretation of changes that occurred in the surrounding areas of the Turow mine. Due to the conducted activity of the mine and Turow power plant, both negative and positive aspects were noted in connection with the carrying out of mining activity in the Bogatynia municipality.

  3. From chemical risk assessment to environmental resources management: the challenge for mining.

    PubMed

    Voulvoulis, Nikolaos; Skolout, John W F; Oates, Christopher J; Plant, Jane A

    2013-11-01

    On top of significant improvements and progress made through science and engineering in the last century to increase efficiency and reduce impacts of mining to the environment, risk assessment has an important role to play in further reducing such impacts and preventing and mitigating risks. This paper reflects on how risk assessment can improve planning, monitoring and management in mining and mineral processing operations focusing on the importance of better understanding source-pathway-receptor linkages for all stages of mining. However, in light of the ever-growing consumption and demand for raw materials from mining, the need to manage environmental resources more sustainably is becoming increasingly important. The paper therefore assesses how mining can form an integral part of wider sustainable resources management, with the need for re-assessing the potential of mining in the context of sustainable management of natural capital, and with a renewed focus on its role from a systems perspective. The need for understanding demand and pressure on resources, followed by appropriate pricing that is inclusive of all environmental costs, with new opportunities for mining in the wastes we generate, is also discussed. Findings demonstrate the need for a life cycle perspective in closing the loop between mining, production, consumption and waste generation as the way forward.

  4. Text mining and its potential applications in systems biology.

    PubMed

    Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi

    2006-12-01

    With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.

  5. From data mining rules to medical logical modules and medical advices.

    PubMed

    Gomoi, Valentin; Vida, Mihaela; Robu, Raul; Stoicu-Tivadar, Vasile; Bernad, Elena; Lupşe, Oana

    2013-01-01

    Using data mining in collaboration with Clinical Decision Support Systems adds new knowledge as support for medical diagnosis. The current work presents a tool which translates data mining rules supporting the generation of medical advice into the Arden Syntax formalism. The developed system was tested with data related to 2326 births that took place in 2010 at the Bega Obstetrics - Gynaecology Hospital, Timişoara. Based on processing these data, 14 medical rules regarding the Apgar score were generated and then translated into the Arden Syntax language.

  6. LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes.

    PubMed

    Cañada, Andres; Capella-Gutierrez, Salvador; Rabal, Obdulia; Oyarzabal, Julen; Valencia, Alfonso; Krallinger, Martin

    2017-07-03

    Considerable effort has been devoted to systematically retrieving information for genes and proteins as well as relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologies are currently accessible. Here we present LimTox (Literature Mining for Toxicology), a web-based online biomedical search tool with special focus on adverse hepatobiliary reactions. It integrates a range of text mining, named entity recognition and information extraction components. LimTox relies on machine-learning, rule-based, pattern-based and term lookup strategies. This system processes scientific abstracts, a set of full text articles and medical agency assessment reports. Although the main focus of LimTox is on adverse liver events, it also enables basic searches for other organ level toxicity associations (nephrotoxicity, cardiotoxicity, thyrotoxicity and phospholipidosis). This tool supports specialized search queries for: chemical compounds/drugs, genes (with additional emphasis on key enzymes in drug metabolism, namely P450 cytochromes-CYPs) and biochemical liver markers. The LimTox website is free and open to all users and there is no login requirement. LimTox can be accessed at: http://limtox.bioinfo.cnio.es. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Multimedia Exploratory Data Analysis for Geospatial Data Mining: The Case for Augmented Seriation.

    ERIC Educational Resources Information Center

    Gluck, Myke

    2001-01-01

    Reviews the role of exploratory data analysis (EDA) for spatial data mining and presents a case study addressing environmental risk assessments in New York State to illustrate the feasibility and usability of augmenting seriation for spatial data analysis. Describes augmentation with multimedia tools to understand relationships among spatial,…

  8. Using software to predict occupational hearing loss in the mining industry

    PubMed Central

    Azman, A.S.; Li, M.; Thompson, J.K.

    2017-01-01

    Powerful mining systems typically generate high-level noise that can damage the hearing ability of miners. Engineering noise controls are the most desirable and effective control for overexposure to noise. However, the effects of these noise controls on the actual hearing status of workers are not easily measured. A tool that can provide guidance in assigning workers to jobs based on the noise levels to which they will be exposed is highly desirable. Therefore, the Pittsburgh Mining Research Division (PMRD) of the U.S. National Institute for Occupational Safety and Health (NIOSH) developed a tool to estimate in a systematic way the hearing loss due to occupational noise exposure and to evaluate the effectiveness of developed engineering controls. This computer program is based on the ISO 1999 standard and can be used to estimate the loss of hearing ability caused by occupational noise exposures. In this paper, the functionalities of this software are discussed and several case studies related to mining machinery are presented to demonstrate the functionalities of this software. PMID:28596700

  9. Can abstract screening workload be reduced using text mining? User experiences of the tool Rayyan.

    PubMed

    Olofsson, Hanna; Brolund, Agneta; Hellberg, Christel; Silverstein, Rebecca; Stenström, Karin; Österberg, Marie; Dagerhamn, Jessica

    2017-09-01

    One time-consuming aspect of conducting systematic reviews is the task of sifting through abstracts to identify relevant studies. One promising approach for reducing this burden uses text mining technology to identify those abstracts that are potentially most relevant for a project, allowing those abstracts to be screened first. This study examined the effectiveness of the text mining functionality of the abstract screening tool Rayyan and collected user experiences. Rayyan was used to screen abstracts for 6 reviews in 2015. After screening 25%, 50%, and 75% of the abstracts, the screeners logged the relevant references identified. A survey was sent to users. After screening half of the search result with Rayyan, 86% to 99% of the references deemed relevant to the study were identified. Of those studies included in the final reports, 96% to 100% were already identified in the first half of the screening process. Users rated Rayyan 4.5 out of 5. The text mining function in Rayyan successfully helped reviewers identify relevant studies early in the screening process. Copyright © 2017 John Wiley & Sons, Ltd.

  10. PPInterFinder--a mining tool for extracting causal relations on human proteins from literature.

    PubMed

    Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar

    2013-01-01

    One of the most common and challenging problems in biomedical text mining is to mine protein-protein interactions (PPIs) from MEDLINE abstracts and full-text research articles, because PPIs play a major role in understanding the various biological processes and the impact of proteins in diseases. We implemented PPInterFinder, a web-based text mining tool to extract human PPIs from biomedical literature. PPInterFinder uses relation keyword co-occurrences with protein names to extract information on PPIs from MEDLINE abstracts and consists of three phases. First, it identifies the relation keyword using a parser with Tregex and a relation keyword dictionary. Next, it automatically identifies the candidate PPI pairs with a set of rules related to PPI recognition. Finally, it extracts the relations by matching the sentence with a set of 11 specific patterns based on the syntactic nature of the PPI pair. We find that PPInterFinder is capable of predicting PPIs with an accuracy of 66.05% on the AIMED corpus and outperforms most of the existing systems. DATABASE URL: http://www.biomining-bu.in/ppinterfinder/
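
    The keyword co-occurrence idea in this abstract can be illustrated with a minimal sketch. It is not PPInterFinder's implementation: the abstract's Tregex parsing, recognition rules, and 11 syntactic patterns are not reproduced, and the protein names and relation keywords below are hypothetical placeholders chosen for illustration.

```python
import re
from itertools import combinations

# Hypothetical dictionaries; a real system would load curated lexicons.
PROTEIN_NAMES = {"ProtA", "ProtB", "ProtC"}
RELATION_KEYWORDS = {"binds", "interacts", "phosphorylates"}


def candidate_ppis(sentence):
    """Return (protein, keyword, protein) triples for one sentence.

    A pair is proposed only when at least two distinct dictionary proteins
    and at least one relation keyword occur in the same sentence.
    """
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    proteins = {t for t in tokens if t in PROTEIN_NAMES}
    keywords = [t.lower() for t in tokens if t.lower() in RELATION_KEYWORDS]
    if not keywords:
        return []
    # Sort so the output order is deterministic.
    pairs = sorted(tuple(sorted(p)) for p in combinations(proteins, 2))
    # Assumption: the first keyword in the sentence labels every pair.
    return [(a, keywords[0], b) for a, b in pairs]


print(candidate_ppis("ProtA interacts with ProtB in vitro."))
```

    Plain co-occurrence over-generates pairs, which is presumably why the abstract describes additional rule-based and pattern-matching phases after this step.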

  11. PPInterFinder—a mining tool for extracting causal relations on human proteins from literature

    PubMed Central

    Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar

    2013-01-01

    One of the most common and challenging problems in biomedical text mining is to mine protein–protein interactions (PPIs) from MEDLINE abstracts and full-text research articles, because PPIs play a major role in understanding the various biological processes and the impact of proteins in diseases. We implemented PPInterFinder, a web-based text mining tool to extract human PPIs from biomedical literature. PPInterFinder uses relation keyword co-occurrences with protein names to extract information on PPIs from MEDLINE abstracts and consists of three phases. First, it identifies the relation keyword using a parser with Tregex and a relation keyword dictionary. Next, it automatically identifies the candidate PPI pairs with a set of rules related to PPI recognition. Finally, it extracts the relations by matching the sentence with a set of 11 specific patterns based on the syntactic nature of the PPI pair. We find that PPInterFinder is capable of predicting PPIs with an accuracy of 66.05% on the AIMED corpus and outperforms most of the existing systems. Database URL: http://www.biomining-bu.in/ppinterfinder/ PMID:23325628

  12. Service-based analysis of biological pathways

    PubMed Central

    Zheng, George; Bouguettaya, Athman

    2009-01-01

    Background: Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled by leveraging a Web service-based modeling and mining approach. Results: Inspired by molecular recognition and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation verifying the user's hypotheses. Conclusion: Web service modeling of biological processes allows easy access to and invocation of these processes on the Web. The Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. The algorithms presented in this paper for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using our simulation algorithm, also described in this paper. PMID:19796403

  13. The Positive Environmental Contribution of Jarosite by Retaining Lead in Acid Mine Drainage Areas

    PubMed Central

    Figueiredo, Maria-Ondina; da Silva, Teresa Pereira

    2011-01-01

    Jarosite, KFe3(SO4)2(OH)6, is a secondary iron sulphate often found in acid mine drainage (AMD) environments, particularly in mining wastes from polymetallic sulphide ore deposits. Despite the negative environmental connotation usually ascribed to secondary sulphate minerals due to the release of hazardous elements to aquifers and soils, jarosite acts as an efficient remover and immobilizer of such metals, particularly lead. The mineral chemistry of jarosite is reviewed and the results of a Fe K-edge XANES (X-Ray Absorption Near-Edge Structure) study of K-, Na- and Pb-jarosite are described and discussed within the context of the abandoned old mines of São Domingos and Aljustrel located in southern Portugal, in the Iberian Pyrite Belt (IPB). PMID:21655138

  14. Mining Large Scale Tandem Mass Spectrometry Data for Protein Modifications Using Spectral Libraries.

    PubMed

    Horlacher, Oliver; Lisacek, Frederique; Müller, Markus

    2016-03-04

    Experimental improvements in post-translational modification (PTM) detection by tandem mass spectrometry (MS/MS) have allowed the identification of vast numbers of PTMs. Open modification searches (OMSs) of MS/MS data, which do not require prior knowledge of the modifications present in the sample, further increased the diversity of detected PTMs. Despite much effort, there is still a lack of functional annotation of PTMs. One possibility to narrow the annotation gap is to mine MS/MS data deposited in public repositories and to correlate the PTM presence with biological meta-information attached to the data. Since the data volume can be quite substantial and contain tens of millions of MS/MS spectra, the data mining tools must be able to cope with big data. Here, we present two tools, Liberator and MzMod, which are built using the MzJava class library and the Apache Spark large scale computing framework. Liberator builds large MS/MS spectrum libraries, and MzMod searches them in an OMS mode. We applied these tools to a recently published set of 25 million spectra from 30 human tissues and present tissue specific PTMs. We also compared the results to the ones obtained with the OMS tool MODa and the search engine X!Tandem.

  15. Research of land resources comprehensive utilization of coal mining in plain area based on GIS: case of Panyi Coal Mine of Huainan Mining Group Corp.

    NASA Astrophysics Data System (ADS)

    Dai, Chunxiao; Wang, Songhui; Sun, Dian; Chen, Dong

    2007-06-01

    Land-use outcomes in coalfields are important to the sustainable development of resource-based cities. Because subsidence changes the surface morphology, mining subsidence becomes the main land-use problem, with negative effects on the ecological environment, production, and steady development in coal mining areas. Taking Panyi Coal Mine of Huainan Mining Group Corp as an example, this paper predicted and simulated the mining subsidence in the Matlab environment on the basis of the probability integral method. The change of land use types in the early, medium and long term was analyzed in accordance with the results of the mining subsidence prediction, with GIS as a spatial data management and spatial analysis tool. The analysis showed that 80% of the area of Panyi Coal Mine is affected by mining subsidence and that a 52 km² perennial waterlogged area gradually formed. The farmland ecosystem gradually turned into a wetland ecosystem in most of the study area. Taking into account the economic and social development and natural conditions of the mining area, as well as the ecological environment, production and people's livelihood, this paper provides a plan for comprehensive utilization of land resources. In this plan, intervention measures are taken during coal mining and during the formation and development of mining subsidence, and this method can solve the land-use problems at relatively low cost.

  16. Reliability and Validity of the Alberta Context Tool (ACT) with Professional Nurses: Findings from a Multi-Study Analysis

    PubMed Central

    Squires, Janet E.; Hayduk, Leslie; Hutchinson, Alison M.; Mallick, Ranjeeta; Norton, Peter G.; Cummings, Greta G.; Estabrooks, Carole A.

    2015-01-01

    Although organizational context is central to evidence-based practice, underdeveloped measurement hinders its assessment. The Alberta Context Tool, comprised of 59 items that tap 10 modifiable contextual concepts, was developed to address this gap. The purpose of this study was to examine the reliability and validity of scores obtained when the Alberta Context Tool is completed by professional nurses across different healthcare settings. Five separate studies (N = 2361 nurses across different care settings) comprised the study sample. Reliability and validity were assessed. Cronbach’s alpha exceeded 0.70 for 9/10 Alberta Context Tool concepts. Item-total correlations exceeded acceptable standards for 56/59 items. Confirmatory Factor Analyses coordinated acceptably with the Alberta Context Tool’s proposed latent structure. The mean values for each Alberta Context Tool concept increased from low to high levels of research utilization (as hypothesized), further supporting its validity. This study provides robust evidence for reliability and validity of scores obtained with the Alberta Context Tool when administered to professional nurses. PMID:26098857
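
    For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the toy scores are illustrative assumptions and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents.

    scores: list of rows, one per respondent, each row holding that
    respondent's score on every item (all rows the same length).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Invented toy data: three respondents, two items.
print(round(cronbach_alpha([[1, 2], [3, 3], [5, 6]]), 4))
```

    A concept would meet the 0.70 threshold mentioned in the abstract when this value exceeds 0.70 for the items belonging to that concept.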

  17. [Text mining, a method for computer-assisted analysis of scientific texts, demonstrated by an analysis of author networks].

    PubMed

    Hahn, P; Dullweber, F; Unglaub, F; Spies, C K

    2014-06-01

    Searching for relevant publications is becoming more difficult with the increasing number of scientific articles. Text mining as a specific form of computer-based data analysis may be helpful in this context. Highlighting relations between authors and finding relevant publications concerning a specific subject using text analysis programs are illustrated graphically by 2 performed examples. © Georg Thieme Verlag KG Stuttgart · New York.

  18. Application of Three Existing Stope Boundary Optimisation Methods in an Operating Underground Mine

    NASA Astrophysics Data System (ADS)

    Erdogan, Gamze; Yavuz, Mahmut

    2017-12-01

    The underground mine planning and design optimisation process has received little attention because of the complexity and variability of problems in underground mines. Although a number of optimisation studies and software tools are available, and some of them in particular have been implemented effectively to determine the ultimate-pit limits in an open pit mine, there is still a lack of studies on the optimisation of ultimate stope boundaries in underground mines. The approaches proposed for this purpose aim at maximizing the economic profit by selecting the best possible layout under operational, technical and physical constraints. In this paper, three existing heuristic techniques, the Floating Stope Algorithm, the Maximum Value Algorithm and the Mineable Shape Optimiser (MSO), are examined for optimisation of stope layout in a case study. Each technique is assessed in terms of applicability, algorithm capabilities and limitations considering the underground mine planning challenges. Finally, the results are evaluated and compared.

  19. Fuzzy linear model for production optimization of mining systems with multiple entities

    NASA Astrophysics Data System (ADS)

    Vujic, Slobodan; Benovic, Tomo; Miljanovic, Igor; Hudej, Marjan; Milutinovic, Aleksandar; Pavlovic, Petar

    2011-12-01

    Planning and production optimization in mining systems with multiple mines or several work sites (entities) using fuzzy linear programming (LP) was studied. LP is one of the most commonly used operations research methods in mining engineering. After an introductory review of the properties and limitations of applying LP, short reviews of the general settings of deterministic and fuzzy LP models are presented. For comparative analysis, the application of both LP models is presented using the example of the Bauxite Basin Niksic with five mines. The assessment shows that LP is an efficient mathematical modeling tool in production planning and in solving many other single-criteria optimization problems of mining engineering. After comparing the advantages and deficiencies of the deterministic and fuzzy LP models, the conclusion presents the benefits of the fuzzy LP model but also states that seeking the optimal production plan requires an overall analysis that encompasses both LP model approaches.

  20. Text mining for adverse drug events: the promise, challenges, and state of the art.

    PubMed

    Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H

    2014-10-01

    Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources-such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs-that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.

  1. EAGLE: 'EAGLE'Is an' Algorithmic Graph Library for Exploration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2015-01-16

    The Resource Description Framework (RDF) and SPARQL Protocol and RDF Query Language (SPARQL) were introduced about a decade ago to enable flexible schema-free data interchange on the Semantic Web. Today data scientists use the framework as a scalable graph representation for integrating, querying, exploring and analyzing data sets hosted at different sources. With increasing adoption, the need for graph mining capabilities for the Semantic Web has emerged. Today there are no tools to conduct "graph mining" on RDF standard data sets. We address that need through implementation of popular iterative Graph Mining algorithms (Triangle count, Connected component analysis, degree distribution, diversity degree, PageRank, etc.). We implement these algorithms as SPARQL queries, wrapped within Python scripts, and call our software tool EAGLE. In RDF style, EAGLE stands for "EAGLE 'Is an' algorithmic graph library for exploration." EAGLE is like 'MATLAB' for 'Linked Data.'
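
    Two of the graph mining measures named above, degree distribution and triangle count, can be sketched over a list of subject-predicate-object triples. This is a pure-Python stand-in written for illustration under stated assumptions: it does not use SPARQL, it is not EAGLE's code, and the function names and sample triples are invented. Triples are treated as undirected edges and predicates are ignored.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_adjacency(triples):
    """Undirected adjacency sets from (subject, predicate, object) triples."""
    adjacency = defaultdict(set)
    for subject, _predicate, obj in triples:
        if subject != obj:  # skip self-loops
            adjacency[subject].add(obj)
            adjacency[obj].add(subject)
    return adjacency


def degree_distribution(adjacency):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(neighbours) for neighbours in adjacency.values())


def triangle_count(adjacency):
    """Count triangles; each one is seen once per corner, hence // 3."""
    seen = 0
    for neighbours in adjacency.values():
        for a, b in combinations(neighbours, 2):
            if b in adjacency[a]:
                seen += 1
    return seen // 3


triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "knows", "d"),
]
graph = build_adjacency(triples)
print(degree_distribution(graph), triangle_count(graph))
```

    Expressing the same measures as SPARQL queries, as the abstract describes, would push the computation to the triple store instead of loading the edges into memory.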

  2. A Recommendation Algorithm for Automating Corollary Order Generation

    PubMed Central

    Klann, Jeffrey; Schadow, Gunther; McCoy, JM

    2009-01-01

    Manual development and maintenance of decision support content is time-consuming and expensive. We explore recommendation algorithms, e-commerce data-mining tools that use collective order history to suggest purchases, to assist with this. In particular, previous work shows corollary order suggestions are amenable to automated data-mining techniques. Here, an item-based collaborative filtering algorithm augmented with association rule interestingness measures mined suggestions from 866,445 orders made in an inpatient hospital in 2007, generating 584 potential corollary orders. Our expert physician panel evaluated the top 92 and agreed 75.3% were clinically meaningful. Also, at least one felt 47.9% would be directly relevant in guideline development. This automated generation of a rough-cut of corollary orders confirms prior indications about automated tools in building decision support content. It is an important step toward computerized augmentation to decision support development, which could increase development efficiency and content quality while automatically capturing local standards. PMID:20351875
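
    The general idea of mining corollary-order suggestions from order history can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' algorithm: it ranks co-occurring items by association-rule confidence with a minimum support count, and the order names, thresholds, and function name are hypothetical.

```python
from collections import Counter
from itertools import permutations


def corollary_candidates(sessions, min_support=2, min_confidence=0.5):
    """Suggest follow-up orders for each trigger order.

    sessions: iterable of order lists, one list per ordering session.
    Returns {trigger: [(suggested_item, confidence), ...]} sorted by
    descending confidence, where confidence = P(item | trigger).
    """
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        items = set(session)
        item_counts.update(items)
        # Ordered pairs, so (trigger, other) and (other, trigger) both count.
        pair_counts.update(permutations(sorted(items), 2))

    suggestions = {}
    for (trigger, other), n_both in pair_counts.items():
        confidence = n_both / item_counts[trigger]
        if n_both >= min_support and confidence >= min_confidence:
            suggestions.setdefault(trigger, []).append((other, confidence))
    for ranked in suggestions.values():
        ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return suggestions


# Invented sessions with placeholder order names.
sessions = [["drug_x", "lab_y"], ["drug_x", "lab_y"], ["drug_x", "lab_z"], ["lab_z"]]
print(corollary_candidates(sessions))
```

    Candidates produced this way would still need expert review, which matches the abstract's use of a physician panel to judge clinical meaningfulness.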

  3. A recommendation algorithm for automating corollary order generation.

    PubMed

    Klann, Jeffrey; Schadow, Gunther; McCoy, J M

    2009-11-14

    Manual development and maintenance of decision support content is time-consuming and expensive. We explore recommendation algorithms, e-commerce data-mining tools that use collective order history to suggest purchases, to assist with this. In particular, previous work shows corollary order suggestions are amenable to automated data-mining techniques. Here, an item-based collaborative filtering algorithm augmented with association rule interestingness measures mined suggestions from 866,445 orders made in an inpatient hospital in 2007, generating 584 potential corollary orders. Our expert physician panel evaluated the top 92 and agreed 75.3% were clinically meaningful. Also, at least one felt 47.9% would be directly relevant in guideline development. This automated generation of a rough-cut of corollary orders confirms prior indications about automated tools in building decision support content. It is an important step toward computerized augmentation to decision support development, which could increase development efficiency and content quality while automatically capturing local standards.

  4. Interpreter of maladies: redescription mining applied to biomedical data analysis.

    PubMed

    Waltman, Peter; Pearlman, Alex; Mishra, Bud

    2006-04-01

    Comprehensive, systematic and integrated data-centric statistical approaches to disease modeling can provide powerful frameworks for understanding disease etiology. Here, one such computational framework based on redescription mining in both its incarnations, static and dynamic, is discussed. The static framework provides bioinformatic tools applicable to multifaceted datasets, containing genetic, transcriptomic, proteomic, and clinical data for diseased patients and normal subjects. The dynamic redescription framework provides systems biology tools to model complex sets of regulatory, metabolic and signaling pathways in the initiation and progression of a disease. As an example, the case of chronic fatigue syndrome (CFS) is considered, which has so far remained intractable and unpredictable in its etiology and nosology. The redescription mining approaches can be applied to the Centers for Disease Control and Prevention's Wichita (KS, USA) dataset, integrating transcriptomic, epidemiological and clinical data, and can also be used to study how pathways in the hypothalamic-pituitary-adrenal axis affect CFS patients.

  5. 42 CFR 1007.1 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... context: Data mining is defined as the practice of electronically sorting Medicaid or other relevant data... and relationships within that data to identify aberrant utilization, billing, or other practices that...

  6. 42 CFR 1007.1 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... context: Data mining is defined as the practice of electronically sorting Medicaid or other relevant data... and relationships within that data to identify aberrant utilization, billing, or other practices that...

  7. Occupational accidents in artisanal mining in Katanga, D.R.C.

    PubMed

    Elenge, Myriam; Leveque, Alain; De Brouwer, Christophe

    2013-04-01

    This study focuses on accidents in artisanal mining, to support policies improving miners' employability. Based on a questionnaire administered in November 2009 to a sample of 180 miners from the artisanal mining of LUPOTO, in the Province of Katanga, we explored significant trends between the accidents and their consequences and behavioral or sociological variables. During the 12 months preceding the study, 392 accidents occurred, affecting 72.2% of miners. Tool handling accounted for 51.5% of the accidents' causes, followed by handling heavy loads (32.9%). Factors such as age, seniority or apprenticeship did not generate significant differences. Contusions were the most common injuries (50.2%), followed by wounds (44.4%). These injuries were located in the upper limbs (50.5%) and in the lower limbs (29.3%). 80.5% of miners were cared for by their colleagues and 50% of them could not work for more than 3 days. Physical sequelae were reported by 19% of the injured miners. Many surveys related to accidents in the area of artisanal mining report such high frequency. The unsuitability of tools for the jobs to be done is usually raised as one of the major causes of accidents. The lack of differentiation of the tasks carried out in relation to age is another factor explaining the lack of a protective effect of seniority, as it minimizes the contribution of experience to the worker's safety. The apprenticeship reported is inadequate; it is learning by doing rather than anything else, which is why it lacks a protective effect. Low income combined with the precariousness of artisanal mining is likely to explain the low level of work stoppages. Tool improvement combined with adequate training seems to be the basis of accident prevention. Availability of suitable medical care should improve artisanal miners' recovery after accidents.

  8. Data Mining Techniques Applied to Hydrogen Lactose Breath Test.

    PubMed

    Rubio-Escudero, Cristina; Valverde-Fernández, Justo; Nepomuceno-Chamorro, Isabel; Pontes-Balanza, Beatriz; Hernández-Mendoza, Yoedusvany; Rodríguez-Herrera, Alfonso

    2017-01-01

    The aims were to analyze a set of hydrogen breath test data using data mining tools and to identify new patterns of H2 production. K-means clustering was applied as the data mining technique to hydrogen breath test data sets from 2571 patients. Six different patterns were extracted upon analysis of the hydrogen breath test data. We have also shown the relevance of each of the samples taken throughout the test. Analysis of the hydrogen breath test data sets using data mining techniques has identified new patterns of hydrogen generation upon lactose absorption. We can see the potential of applying data mining techniques to clinical data sets. These results offer promising data for future research on the relations between hydrogen produced by gut microbiota and its link to clinical symptoms.
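
    K-means clustering, the technique named in this abstract, can be sketched in a few lines. The version below is a plain Lloyd's-algorithm illustration under stated assumptions: the curves are invented three-sample readings rather than study data, the function names are hypothetical, and nothing here reflects the authors' preprocessing or their choice of six clusters.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Invented curves: three flat profiles and three rising profiles.
curves = [[0, 1, 0], [0, 2, 0], [0, 1, 1], [10, 50, 30], [12, 48, 29], [11, 52, 31]]
labels, centroids = kmeans(curves, k=2)
print(labels)
```

    Each resulting centroid is itself a curve, which is how cluster centres can be read as characteristic patterns of hydrogen production over the course of a test.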

  9. Context Mining of Sedentary Behaviour for Promoting Self-Awareness Using a Smartphone.

    PubMed

    Fahim, Muhammad; Baker, Thar; Khattak, Asad Masood; Shah, Babar; Aleem, Saiqa; Chow, Francis

    2018-03-15

    Sedentary behaviour is increasing due to societal changes and is related to prolonged periods of sitting. There is sufficient evidence proving that sedentary behaviour has a negative impact on people's health and wellness. This paper presents our research findings on how to mine the temporal contexts of sedentary behaviour by utilizing the on-board sensors of a smartphone. We use the accelerometer sensor of the smartphone to recognize user situations (i.e., still or active). If our model confirms that the user context is still, then there is a high probability of being sedentary. Then, we process the environmental sound to recognize the micro-context, such as working on a computer or watching television during leisure time. Our goal is to reduce sedentary behaviour by suggesting preventive interventions to take short breaks during prolonged sitting to be more active. We achieve this goal by providing the visualization to the user, who wants to monitor his/her sedentary behaviour to reduce unhealthy routines for self-management purposes. The main contribution of this paper is two-fold: (i) an initial implementation of the proposed framework supporting real-time context identification; (ii) testing and evaluation of the framework, which suggest that our application is capable of substantially reducing sedentary behaviour and assisting users to be active.
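
    The first stage described above, deciding from accelerometer data whether the user is still or active, can be illustrated with a simple variance-threshold sketch. This is an assumption-laden stand-in rather than the authors' model: the window length, the threshold, and the function name are hypothetical, and the sound-based micro-context stage is not covered.

```python
import math
from statistics import pstdev


def classify_windows(samples, window=50, threshold=0.05):
    """Label consecutive windows of accelerometer samples.

    samples: list of (x, y, z) accelerations in g.
    A window is "still" when the standard deviation of the acceleration
    magnitude is below the threshold, otherwise "active".
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    labels = []
    # Non-overlapping windows; a trailing partial window is ignored.
    for start in range(0, len(magnitudes) - window + 1, window):
        chunk = magnitudes[start:start + window]
        labels.append("still" if pstdev(chunk) < threshold else "active")
    return labels


# Synthetic input: a phone at rest, then a phone being shaken.
at_rest = [(0.0, 0.0, 1.0)] * 50
moving = [(0.0, 0.0, 1.0 + 0.5 * (-1) ** i) for i in range(50)]
print(classify_windows(at_rest + moving))
```

    Long runs of consecutive "still" windows would then be the candidate periods of prolonged sitting on which a break reminder could be based.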

  10. Applications of multi-season hyperspectral remote sensing for acid mine water characterization and mapping of secondary iron minerals associated with acid mine drainage

    NASA Astrophysics Data System (ADS)

    Davies, Gwendolyn E.

    Acid mine drainage (AMD) resulting from the oxidation of sulfides in mine waste is a major environmental issue facing the mining industry today. Open pit mines, tailings ponds, ore stockpiles, and waste rock dumps can all be significant sources of pollution, primarily heavy metals. These large mining-induced footprints are often located across vast geographic expanses and are difficult to access. With the continuing advancement of imaging satellites, remote sensing may provide a useful monitoring tool for pit lake water quality and the rapid assessment of abandoned mine sites. This study explored the applications of laboratory spectroscopy and multi-season hyperspectral remote sensing for environmental monitoring of mine waste environments. Laboratory spectral experiments were first performed on acid mine waters and synthetic ferric iron solutions to identify and isolate the unique spectral properties of mine waters. These spectral characterizations were then applied to airborne hyperspectral imagery for identification of poor water quality in AMD ponds at the Leviathan Mine Superfund site, CA. Finally, imagery of varying temporal and spatial resolution was used to identify changes in mineralogy over weathering overburden piles and on dry AMD pond liner surfaces at the Leviathan Mine. Results show the utility of hyperspectral remote sensing for monitoring a diverse range of surfaces associated with AMD.

  11. Screening and prioritisation of chemical risks from metal mining operations, identifying exposure media of concern.

    PubMed

    Pan, Jilang; Oates, Christopher J; Ihlenfeld, Christian; Plant, Jane A; Voulvoulis, Nikolaos

    2010-04-01

    Metals have been central to the development of human civilisation from the Bronze Age to modern times, although in the past, metal mining and smelting have been the cause of serious environmental pollution with the potential to harm human health. Despite problems from artisanal mining in some developing countries, modern mining to Western standards now uses the best available mining technology combined with environmental monitoring, mitigation and remediation measures to limit emissions to the environment. This paper develops risk screening and prioritisation methods previously used for contaminated land on military and civilian sites and engineering systems for the analysis and prioritisation of chemical risks from modern metal mining operations. It uses hierarchical holographic modelling and multi-criteria decision making to analyse and prioritise the risks from potentially hazardous inorganic chemical substances released by mining operations. A case study of an active platinum group metals mine in South Africa is used to demonstrate the potential of the method. This risk-based methodology for identifying, filtering and ranking mining-related environmental and human health risks can be used to identify exposure media of greatest concern to inform risk management. It also provides a practical decision-making tool for mine acquisition and helps to communicate risk to all members of mining operation teams.

  12. Are Female Applicants Disadvantaged in National Institutes of Health Peer Review? Combining Algorithmic Text Mining and Qualitative Methods to Detect Evaluative Differences in R01 Reviewers' Critiques.

    PubMed

    Magua, Wairimu; Zhu, Xiaojin; Bhattacharya, Anupama; Filut, Amarette; Potvien, Aaron; Leatherberry, Renee; Lee, You-Geon; Jens, Madeline; Malikireddy, Dastagiri; Carnes, Molly; Kaatz, Anna

    2017-05-01

    Women are less successful than men in renewing R01 grants from the National Institutes of Health. Continuing to probe text mining as a tool to identify gender bias in peer review, we used algorithmic text mining and qualitative analysis to examine a sample of critiques from men's and women's R01 renewal applications previously analyzed by counting and comparing word categories. We analyzed 241 critiques from 79 Summary Statements for 51 R01 renewals awarded to 45 investigators (64% male, 89% white, 80% PhD) at the University of Wisconsin-Madison between 2010 and 2014. We used latent Dirichlet allocation to discover evaluative "topics" (i.e., words that co-occur with high probability). We then qualitatively examined the context in which evaluative words occurred for male and female investigators. We also examined sex differences in assigned scores controlling for investigator productivity. Text analysis results showed that male investigators were described as "leaders" and "pioneers" in their "fields," with "highly innovative" and "highly significant research." By comparison, female investigators were characterized as having "expertise" and working in "excellent" environments. Applications from men received significantly better priority, approach, and significance scores, which could not be accounted for by differences in productivity. Results confirm our previous analyses suggesting that gender stereotypes operate in R01 grant peer review. Reviewers may more easily view male than female investigators as scientific leaders with significant and innovative research, and score their applications more competitively. Such implicit bias may contribute to sex differences in award rates for R01 renewals.

  13. Are Female Applicants Disadvantaged in National Institutes of Health Peer Review? Combining Algorithmic Text Mining and Qualitative Methods to Detect Evaluative Differences in R01 Reviewers' Critiques

    PubMed Central

    Magua, Wairimu; Zhu, Xiaojin; Bhattacharya, Anupama; Filut, Amarette; Potvien, Aaron; Leatherberry, Renee; Lee, You-Geon; Jens, Madeline; Malikireddy, Dastagiri; Carnes, Molly

    2017-01-01

    Abstract Background: Women are less successful than men in renewing R01 grants from the National Institutes of Health. Continuing to probe text mining as a tool to identify gender bias in peer review, we used algorithmic text mining and qualitative analysis to examine a sample of critiques from men's and women's R01 renewal applications previously analyzed by counting and comparing word categories. Methods: We analyzed 241 critiques from 79 Summary Statements for 51 R01 renewals awarded to 45 investigators (64% male, 89% white, 80% PhD) at the University of Wisconsin-Madison between 2010 and 2014. We used latent Dirichlet allocation to discover evaluative “topics” (i.e., words that co-occur with high probability). We then qualitatively examined the context in which evaluative words occurred for male and female investigators. We also examined sex differences in assigned scores controlling for investigator productivity. Results: Text analysis results showed that male investigators were described as “leaders” and “pioneers” in their “fields,” with “highly innovative” and “highly significant research.” By comparison, female investigators were characterized as having “expertise” and working in “excellent” environments. Applications from men received significantly better priority, approach, and significance scores, which could not be accounted for by differences in productivity. Conclusions: Results confirm our previous analyses suggesting that gender stereotypes operate in R01 grant peer review. Reviewers may more easily view male than female investigators as scientific leaders with significant and innovative research, and score their applications more competitively. Such implicit bias may contribute to sex differences in award rates for R01 renewals. PMID:28281870

  14. Mining Available Data from the United States Environmental Protection Agency to Support Rapid Life Cycle Inventory Modeling of Chemical Manufacturing

    EPA Science Inventory

    Demands for quick and accurate life cycle assessments create a need for methods to rapidly generate reliable life cycle inventories (LCI). Data mining is a suitable tool for this purpose, especially given the large amount of available governmental data. These data are typically a...

  15. Using Data Mining for Predicting Relationships between Online Question Theme and Final Grade

    ERIC Educational Resources Information Center

    Abdous, M'hammed; He, Wu; Yen, Cherng-Jyh

    2012-01-01

    As higher education diversifies its delivery modes, our ability to use the predictive and analytical power of educational data mining (EDM) to understand students' learning experiences is a critical step forward. The adoption of EDM by higher education as an analytical and decision making tool is offering new opportunities to exploit the untapped…

  16. Application of Learning Analytics Using Clustering Data Mining for Students' Disposition Analysis

    ERIC Educational Resources Information Center

    Bharara, Sanyam; Sabitha, Sai; Bansal, Abhay

    2018-01-01

    Learning Analytics (LA) is an emerging field in which sophisticated analytic tools are used to improve learning and education. It draws from, and is closely tied to, a series of other fields of study like business intelligence, web analytics, academic analytics, educational data mining, and action analytics. The main objective of this research…

  17. Myth Busting: Using Data Mining to Refute Link between Transfer Students and Retention Risk

    ERIC Educational Resources Information Center

    McAleer, Brenda; Szakas, Joseph S.

    2010-01-01

    In the past few years, universities have become much more involved in outcomes assessment. Outside of the classroom analysis of learning outcomes, an investigation is performed into the use of current data mining tools to assess the issue of student retention within the Computer Information Systems (CIS) department. Utilizing both a historical…

  18. Use of amendments to restore ecosystem function to metal mining impacted sites; Tools to evaluate efficacy

    USDA-ARS's Scientific Manuscript database

    There is a long history of using residuals based soil amendments for restoration of disturbed sites. More recently, this approach has been tested for use on metal contaminated mining sites. For these sites, amendment mixtures are targeted to reduce metal availability in situ as well as restore eco...

  19. Mining Mathematics in Textbook Lessons

    ERIC Educational Resources Information Center

    Ronda, Erlina; Adler, Jill

    2017-01-01

    In this paper, we propose an analytic tool for describing the mathematics made available to learn in a "textbook lesson". The tool is an adaptation of the Mathematics Discourse in Instruction (MDI) analytic tool that we developed to analyze what is made available to learn in teachers' lessons. Our motivation to adapt the use of the MDI…

  20. Web Usage Mining Analysis of Federated Search Tools for Egyptian Scholars

    ERIC Educational Resources Information Center

    Mohamed, Khaled A.; Hassan, Ahmed

    2008-01-01

    Purpose: This paper aims to examine the behaviour of the Egyptian scholars while accessing electronic resources through two federated search tools. The main purpose of this article is to provide guidance for federated search tool technicians and support teams about user issues, including the need for training. Design/methodology/approach: Log…

  1. 30 CFR 56.7050 - Tool and drill steel racks.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Tool and drill steel racks. 56.7050 Section 56.7050 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL... Jet Piercing Drilling § 56.7050 Tool and drill steel racks. Receptacles or racks shall be provided for...

  2. New perspectives on a 140-year legacy of mining and abandoned mine cleanup in the San Juan Mountains, Colorado

    USGS Publications Warehouse

    Yager, Douglas B.; Fey, David L.; Chapin, Thomas; Johnson, Raymond H.

    2016-01-01

    The Gold King mine water release that occurred on 5 August 2015 near the historical mining community of Silverton, Colorado, highlights the environmental legacy that abandoned mines have on the environment. During reclamation efforts, a breach of collapsed workings at the Gold King mine sent 3 million gallons of acidic and metal-rich mine water into the upper Animas River, a tributary to the Colorado River basin. The Gold King mine is located in the scenic, western San Juan Mountains, a region renowned for its volcano-tectonic and gold-silver-base metal mineralization history. Prior to mining, acidic drainage from hydrothermally altered areas was a major source of metals and acidity to streams, and it continues to be so. In addition to abandoned hard rock metal mines, uranium mine waste poses a long-term storage and immobilization challenge in this area. Uranium resources are mined in the Colorado Plateau, which borders the San Juan Mountains on the west. Uranium processing and repository sites along the Animas River near Durango, Colorado, are a prime example of how the legacy of mining must be managed for the health and well-being of future generations. The San Juan Mountains are part of a geoenvironmental nexus where geology, mining, agriculture, recreation, and community issues converge. This trip will explore the geology, mining, and mine cleanup history in which a community-driven, watershed-based stakeholder process is an integral part. Research tools and historical data useful for understanding complex watersheds impacted by natural sources of metals and acidity overprinted by mining will also be discussed.

  3. Development of ergonomics audits for bagging, haul truck and maintenance and repair operations in mining.

    PubMed

    Dempsey, Patrick G; Pollard, Jonisha; Porter, William L; Mayton, Alan; Heberger, John R; Gallagher, Sean; Reardon, Leanna; Drury, Colin G

    2017-12-01

    The development and testing of ergonomics and safety audits for small and bulk bag filling, haul truck and maintenance and repair operations in coal preparation and mineral processing plants found at surface mine sites is described. The content for the audits was derived from diverse sources of information on ergonomics and safety deficiencies including: analysis of injury, illness and fatality data and reports; task analysis; empirical laboratory studies of particular tasks; field studies and observations at mine sites; and maintenance records. These diverse sources of information were utilised to establish construct validity of the modular audits that were developed for use by mine safety personnel. User and interrater reliability testing was carried out prior to finalising the audits. The audits can be implemented using downloadable paper versions or with a free mobile NIOSH-developed Android application called ErgoMine. Practitioner Summary: The methodology used to develop ergonomics audits for three types of mining operations is described. Various sources of audit content are compared and contrasted to serve as a guide for developing ergonomics audits for other occupational contexts.

  4. Surface Soil Preparetion for Leguminous Plants Growing in Degraded Areas by Mining Located in Amazon Forest-Brazil

    NASA Astrophysics Data System (ADS)

    Irio Ribeiro, Admilson; Hashimoto Fengler, Felipe; Araújo de Medeiros, Gerson; Márcia Longo, Regina; Frederici de Mello, Giovanna; José de Melo, Wanderley

    2015-04-01

    The revegetation of areas degraded by mining usually requires adequate mobilization of the surface soil so that the introduced species can develop. Unlike traditional tillage, which is periodic, the mobilization of degraded areas for revegetation can occur only at the beginning of the recovery stage. In this sense, revegetation aims to establish local native vegetation with the least possible use of inputs and superficial tillage, in order to catalyze natural ecological succession, promote the reintegration of the areas, and minimize the negative environmental impacts of mining activities. In this context, this work describes part of a study of the reclamation of land degraded by tin exploitation in the Amazon ecosystem, in the Jamari National Forest, Rondonia, Brazil. The influence of surface soil mobilization in pit mine areas and tailings was studied with a view to the introduction of legumes. The results show that surface mobilization has a significant effect on the growth of leguminous plants in both tailings and pit mine areas.

  5. Data Mining and Privacy of Social Network Sites' Users: Implications of the Data Mining Problem.

    PubMed

    Al-Saggaf, Yeslam; Islam, Md Zahidul

    2015-08-01

    This paper explores the potential of data mining as a technique that could be used by malicious data miners to threaten the privacy of social network sites (SNS) users. It applies a data mining algorithm to a real dataset to provide empirically-based evidence of the ease with which characteristics about the SNS users can be discovered and used in a way that could invade their privacy. One major contribution of this article is the use of the decision forest data mining algorithm (SysFor) to the context of SNS, which does not only build a decision tree but rather a forest allowing the exploration of more logic rules from a dataset. One logic rule that SysFor built in this study, for example, revealed that anyone having a profile picture showing just the face or a picture showing a family is less likely to be lonely. Another contribution of this article is the discussion of the implications of the data mining problem for governments, businesses, developers and the SNS users themselves.

  6. The application of high-resolution mass spectrometry-based data-mining tools in tandem to metabolite profiling of a triple drug combination in humans.

    PubMed

    Xing, Jie; Zang, Meitong; Zhang, Haiying; Zhu, Mingshe

    2015-10-15

    Patients are usually exposed to multiple drugs, and metabolite profiling of each drug in complex biological matrices is a big challenge. This study presents a new application of improved high-resolution mass spectrometry (HRMS)-based data-mining tools, used in tandem, for fast and comprehensive metabolite identification of combination drugs in humans. The model drug combination was metronidazole-pantoprazole-clarithromycin (MET-PAN-CLAR), which is widely used in the clinic to treat ulcers caused by Helicobacter pylori. First, mass defect filter (MDF), as a targeted data processing tool, was able to recover all relevant metabolites of MET-PAN-CLAR in human plasma and urine from the full-scan MS dataset when appropriate MDF templates for each drug were defined. Second, the accurate mass-based background subtraction (BS), as an untargeted data-mining tool, worked effectively except for several trace metabolites, which were buried in the remaining background signals. Third, an integrated strategy, i.e., untargeted BS followed by improved MDF, was effective for metabolite identification of MET-PAN-CLAR. Most metabolites except for trace ones were found in the first step of BS-processed datasets, and the results led to the setup of appropriate metabolite MDF templates for the subsequent MDF data processing. Trace metabolites were further recovered by MDF, which used both common MDF templates and the novel metabolite-based MDF templates. As a result, a total of 44 metabolites or related components were found for MET-PAN-CLAR in human plasma and urine using the integrated strategy. New metabolic pathways such as N-glucuronidation of PAN and dehydrogenation of CLAR were found. This study demonstrated that the combination of accurate mass-based multiple data-mining techniques in tandem, i.e., untargeted background subtraction followed by targeted mass defect filtering, can be a valuable tool for rapid metabolite profiling of combination drugs in vivo.
Copyright © 2015 Elsevier B.V. All rights reserved.
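
The idea behind mass defect filtering, as named in the abstract above, can be sketched in a few lines. This is a generic illustration and not the authors' improved MDF: it assumes the mass defect is the measured m/z minus its nearest integer mass, and that an ion is kept when both its mass defect and its m/z fall within fixed windows around a drug template. The window sizes, the function names, and all m/z values below are illustrative assumptions, not data from the study.

```python
def mass_defect(mz):
    """Return the mass defect: measured m/z minus its nearest integer mass."""
    return mz - round(mz)


def mass_defect_filter(ions, template_mz, defect_window=0.050, mass_window=50.0):
    """Keep ions whose mass defect and m/z are both close to the template.

    defect_window and mass_window are hypothetical tolerances (in Da) chosen
    for illustration only.
    """
    template_defect = mass_defect(template_mz)
    kept = []
    for mz in ions:
        close_in_defect = abs(mass_defect(mz) - template_defect) <= defect_window
        close_in_mass = abs(mz - template_mz) <= mass_window
        if close_in_defect and close_in_mass:
            kept.append(mz)
    return kept


# Illustrative full-scan ions: the template itself, a nearby ion with a
# similar mass defect, an ion with a very different mass defect, and an
# ion that is too far away in mass.
ions = [172.0717, 188.0666, 200.2010, 348.0720]
print(mass_defect_filter(ions, template_mz=172.0717))
```

Under these assumptions the filter keeps the first two ions and rejects the other two. In practice a separate template would be defined per drug, in line with the abstract's remark that appropriate MDF templates were needed for each drug.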

  7. MetalS(3), a database-mining tool for the identification of structurally similar metal sites.

    PubMed

    Valasatava, Yana; Rosato, Antonio; Cavallaro, Gabriele; Andreini, Claudia

    2014-08-01

    We have developed a database search tool to identify metal sites having structural similarity to a query metal site structure within the MetalPDB database of minimal functional sites (MFSs) contained in metal-binding biological macromolecules. MFSs describe the local environment around the metal(s) independently of the larger context of the macromolecular structure. Such a local environment has a determinant role in tuning the chemical reactivity of the metal, ultimately contributing to the functional properties of the whole system. The database search tool, which we called MetalS(3) (Metal Sites Similarity Search), can be accessed through a Web interface at http://metalweb.cerm.unifi.it/tools/metals3/ . MetalS(3) uses a suitably adapted version of an algorithm that we previously developed to systematically compare the structure of the query metal site with each MFS in MetalPDB. For each MFS, the best superposition is kept. All these superpositions are then ranked according to the MetalS(3) scoring function and are presented to the user in tabular form. The user can interact with the output Web page to visualize the structural alignment or the sequence alignment derived from it. Options to filter the results are available. Test calculations show that the MetalS(3) output correlates well with expectations from protein homology considerations. Furthermore, we describe some usage scenarios that highlight the usefulness of MetalS(3) to obtain mechanistic and functional hints regardless of homology.

  8. Comparative analysis of data mining techniques for business data

    NASA Astrophysics Data System (ADS)

    Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd

    2014-12-01

    Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict what product customers will buy and the frequency of purchase, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental results reveal that all the data mining techniques accomplish their goals perfectly, but each technique has its own characteristics and specifications that demonstrate its accuracy, proficiency and preference.

  9. Data mining for signals in spontaneous reporting databases: proceed with caution.

    PubMed

    Stephenson, Wendy P; Hauben, Manfred

    2007-04-01

    To provide commentary and points of caution to consider before incorporating data mining as a routine component of any Pharmacovigilance program, and to stimulate further research aimed at better defining the predictive value of these new tools as well as their incremental value as an adjunct to traditional methods of post-marketing surveillance. Commentary includes review of current data mining methodologies employed and their limitations, caveats to consider in the use of spontaneous reporting databases and caution against over-confidence in the results of data mining. Future research should focus on more clearly delineating the limitations of the various quantitative approaches as well as the incremental value that they bring to traditional methods of pharmacovigilance.

  10. Training and Employment of Land Mine and Booby Trap Detector Dogs. Volume II

    DTIC Science & Technology

    1976-09-01

    ...of injury, disease, and other physical abnormalities. All obligatory vaccinations should be current (canine distemper, infectious... as a procedures manual and reference text to be used during the training of initially naive canines for land mine and booby trap detection service... canine training contexts. The techniques and procedures elaborated in the present document were developed for the United States Army Mobility

  11. Development and application of the Safe Performance Index as a risk-based methodology for identifying major hazard-related safety issues in underground coal mines

    NASA Astrophysics Data System (ADS)

    Kinilakodi, Harisha

    The underground coal mining industry has been under constant watch due to the high risk involved in its activities, and scrutiny increased because of the disasters that occurred in 2006-07. In the aftermath of the incidents, the U.S. Congress passed the Mine Improvement and New Emergency Response Act of 2006 (MINER Act), which strengthened the existing regulations and mandated new laws to address the various issues related to a safe working environment in the mines. Risk analysis in any form should be done on a regular basis to tackle the possibility of unwanted major hazard-related events such as explosions, outbursts, airbursts, inundations, spontaneous combustion, and roof fall instabilities. One of the responses by the Mine Safety and Health Administration (MSHA) in 2007 involved a new pattern of violations (POV) process to target mines with a poor safety performance, specifically to improve their safety. However, the 2010 disaster (worst in 40 years) gave an impression that the collective effort of the industry, federal/state agencies, and researchers to achieve the goal of zero fatalities and serious injuries has gone awry. The Safe Performance Index (SPI) methodology developed in this research is a straight-forward, effective, transparent, and reproducible approach that can help in identifying and addressing some of the existing issues while targeting (poor safety performance) mines which need help. It combines three injury and three citation measures that are scaled to have an equal mean (5.0) in a balanced way with proportionate weighting factors (0.05, 0.15, 0.30) and overall normalizing factor (15) into a mine safety performance evaluation tool. It can be used to assess the relative safety-related risk of mines, including by mine-size category. 
Using 2008 and 2009 data, comparisons were made of SPI-associated, normalized safety performance measures across mine-size categories, with emphasis on small-mine safety performance as compared to large- and medium-sized mines. The accident rates (NDL IR, NFDL IR, SM/100) of very small and small mines in 2008 and 2009 were less than those of medium and large mines. The data indicates a heavy occurrence of very severe injuries in a number of very small and small mines. In another application which is a part of this research, the six normalized safety measures and the SPI are used to evaluate the risk that existed at mines in the two years preceding the occurrence of a fatality. This mine safety performance tracking method could have been helpful to the companies, state agency, or MSHA in recognizing and addressing emerging problems with actions that may have been able to prevent high-risk conditions, the fatality, and/or other serious injuries. The approach would have given scrutiny to the risk of mines that encompassed 74% of the fatalities during 2007-2010. In order to assess the SPI as a comparable risk measurement tool, a traditional risk approach is also developed using data embracing frequency and severity in the final equation to analyze the relative risk for all underground coal mines for the years 2007-2010. Then, the SPI is compared with this traditional risk analysis method to demonstrate that the results attained by either method provide the relative safety-related risk of underground coal mines regarding injuries and citations for violations of regulations. The comparison reveals that the SPI does emulate a traditional approach to risk analysis.
A correlation coefficient of -0.89 or more was observed between the results of these two methodologies and either can be used to assist companies, the Mine Safety and Health Administration (MSHA), or state agencies in targeting mines with high risk for serious injuries and elevated citations for remediation of their injury and/or violation experience. The SPI, however, provides a more understandable approach for mine operators to apply using measures compatible with MSHA's enforcement tools. These methodologies form an all-encompassing approach that can be used to assist companies, the MSHA, or state agencies in targeting mines with high risk for serious injuries and elevated citations. Once targeted as high risk, mines can then pursue appropriate intervention to remediate their violation and/or injury experience. This research may help in plugging the gap in the safety system and better pursue the goal of zero fatalities and serious injuries in the underground coal mines.

  12. A concept for the modernization of underground mining master maps based on the enrichment of data definitions and spatial database technology

    NASA Astrophysics Data System (ADS)

    Krawczyk, Artur

    2018-01-01

    In this article, topics regarding the technical and legal aspects of creating digital underground mining maps are described. Currently used technologies and solutions for creating, storing and making digital maps accessible are described in the context of the Polish mining industry. Also, some problems with the use of these technologies are identified and described. One of the identified problems is the need to expand the range of mining map data provided by survey departments to other mining departments, such as ventilation maintenance or geological maintenance. Three solutions are proposed and analyzed, and one is chosen for further analysis. The analysis concerns data storage and making survey data accessible not only from paper documentation, but also directly from computer systems. Based on enrichment data, new processing procedures are proposed for a new way of presenting information that allows the preparation of new cartographic representations (symbols) of data with regard to users' needs.

  13. Twitter as a tool for ophthalmologists.

    PubMed

    Micieli, Robert; Micieli, Jonathan A

    2012-10-01

    Twitter is a social media web site created in 2006 that allows users to post Tweets, which are text-based messages containing up to 140 characters. It has grown exponentially in popularity; now more than 340 million Tweets are sent daily, and there are more than 140 million users. Twitter has become an important tool in medicine in a variety of contexts, allowing medical journals to engage their audiences, conference attendees to interact with one another in real time, and physicians to have the opportunity to interact with politicians, organizations, and the media in a manner that can be freely observed. There are also tremendous research opportunities since Twitter contains a database of public opinion that can be mined by keywords and hashtags. This article serves as an introduction to Twitter and surveys the peer-reviewed literature concerning its various uses and original studies. Opportunities for use in ophthalmology are outlined, and a recommended list of ophthalmology feeds on Twitter is presented. Overall, Twitter is an underutilized resource in ophthalmology and has the potential to enhance professional collegiality, advocacy, and scientific research. Copyright © 2012 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.

  14. ChemicalTagger: A tool for semantic text-mining in chemistry.

    PubMed

    Hawizy, Lezan; Jessop, David M; Adams, Nico; Murray-Rust, Peter

    2011-05-16

    The primary method for scientific communication is in the form of published scientific articles and theses which use natural language combined with domain-specific terminology. As such, they contain free-flowing unstructured text. Given the usefulness of data extraction from unstructured literature, we aim to show how this can be achieved for the discipline of chemistry. The highly formulaic style of writing most chemists adopt makes their contributions well suited to high-throughput Natural Language Processing (NLP) approaches. We have developed the ChemicalTagger parser as a medium-depth, phrase-based semantic NLP tool for the language of chemical experiments. Tagging is based on a modular architecture and uses a combination of OSCAR, domain-specific regex and English taggers to identify parts-of-speech. The ANTLR grammar is used to structure this into tree-based phrases. Using a metric that allows for overlapping annotations, we achieved machine-annotator agreements of 88.9% for phrase recognition and 91.9% for phrase-type identification (Action names). It is possible to parse chemical experimental text using rule-based techniques in conjunction with a formal grammar parser. ChemicalTagger has been deployed for over 10,000 patents and has identified solvents from their linguistic context with >99.5% precision.

  15. ANDSystem: an Associative Network Discovery System for automated literature mining in the field of biology

    PubMed Central

    2015-01-01

    Background: Sufficient knowledge of molecular and genetic interactions, which comprise the entire basis of the functioning of living systems, is one of the necessary requirements for successfully answering almost any research question in the field of biology and medicine. To date, more than 24 million scientific papers can be found in PubMed, with many of them containing descriptions of a wide range of biological processes. The analysis of such tremendous amounts of data requires the use of automated text-mining approaches. Although a handful of tools have recently been developed to meet this need, none of them provide error-free extraction of highly detailed information. Results: The ANDSystem package was developed for the reconstruction and analysis of molecular genetic networks based on an automated text-mining technique. It provides a detailed description of the various types of interactions between genes, proteins, microRNAs, metabolites, cellular components, pathways and diseases, taking into account the specificity of cell lines and organisms. Although the accuracy of ANDSystem is comparable to other well-known text-mining tools, such as Pathway Studio and STRING, it outperforms them in having the ability to identify an increased number of interaction types. Conclusion: The use of ANDSystem, in combination with Pathway Studio and STRING, can improve the quality of the automated reconstruction of molecular and genetic networks. ANDSystem should provide a useful tool for researchers working in a number of different fields, including biology, biotechnology, pharmacology and medicine. PMID:25881313

  16. PubstractHelper: A Web-based Text-Mining Tool for Marking Sentences in Abstracts from PubMed Using Multiple User-Defined Keywords.

    PubMed

    Chen, Chou-Cheng; Ho, Chung-Liang

    2014-01-01

    While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, using pre-defined search keywords while lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper can accept a maximum of ten groups of keywords, with each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMID and the abstracts with color markings to be reviewed later. The PubstractHelper website can help users to identify relevant publications based on the presence of related keywords, which should be a handy tool for their research. http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx.
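
    The co-occurrence principle described in this abstract can be illustrated with a short sketch. This is not the PubstractHelper implementation: the naive sentence splitter, the case-insensitive substring matching, the function names and the sample abstract are all assumptions made for illustration.

```python
import re

def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def cooccurring_sentences(abstract, keyword_groups):
    """Return (sentence, matched_group_indices) for sentences that contain
    keywords from at least two different groups."""
    hits = []
    for sentence in split_sentences(abstract):
        lowered = sentence.lower()
        # A group counts as matched if any of its keywords appears in the sentence.
        matched = [i for i, group in enumerate(keyword_groups)
                   if any(kw.lower() in lowered for kw in group)]
        if len(matched) >= 2:
            hits.append((sentence, matched))
    return hits

# Made-up abstract and keyword groups, for illustration only.
abstract = ("BRCA1 mutations raise breast cancer risk. "
            "The cohort was recruited in 2010. "
            "Tamoxifen reduced tumor growth in BRCA1 carriers.")
groups = [["BRCA1", "BRCA2"], ["cancer", "tumor"], ["tamoxifen"]]
result = cooccurring_sentences(abstract, groups)
for sentence, matched in result:
    print(matched, sentence)
```

    A highlighting front end could then assign one color per matched group index; substring matching is the simplest choice here and would need word-boundary handling for short keywords.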

  17. Data Mining in the Context of Monitoring Mt Etna, Italy

    NASA Astrophysics Data System (ADS)

    Aliotta, Marco; Cassisi, Carmelo; D'Agostino, Marcello; Falsaperla, Susanna; Ferrari, Ferruccio; Langer, Horst; Messina, Alfio; Montalto, Placido; Reitano, Danilo; Spampinato, Salvatore

    2015-04-01

    The persistent volcanic activity of Mt Etna makes the continuous monitoring of multidisciplinary data a first-class issue. Indeed, the monitoring systems rapidly accumulate huge quantities of data, raising specific problems of handling and interpretation. In order to respond to these problems, the INGV staff has developed a number of software tools for data mining. These tools aim to identify structures in the data that can be related to volcanic activity, furnishing criteria for the identification of precursory scenarios. In particular, we use methods of clustering and classification in which data are divided into groups according to a-priori-defined measures of similarity or distance. Data groups may assume various shapes, such as convex clouds or complex concave bodies. The "KKAnalysis" software package is a basket of clustering methods. Currently, it is one of the key techniques of the tremor-based automatic alarm systems of INGV Osservatorio Etneo. It exploits both Self-Organizing Maps and Fuzzy Clustering. Besides seismic data, the software has been applied to the geochemical composition of eruptive products as well as a combined analysis of gas-emission (radon) and seismic data. The "DBSCAN" package exploits a concept based on density-based clustering. This method allows discovering clusters with arbitrary shape. Clusters are defined as dense regions of objects in the data space separated by regions of low density. In DBSCAN a cluster grows as long as the density within a group of objects exceeds some threshold. In the context of volcano monitoring, the method is particularly promising in the recognition of ash particles as they have a rather irregular shape. The "MOTIF" software allows us to identify typical waveforms in time series, outperforming methods like cross-correlation that entail a high computational effort. 
MOTIF can recognize the non-similarity of two patterns on a small number of data points without going through the whole length of data vectors. All the developments aforementioned come along with modules for feature extraction and post-processing. Specific attention is devoted to the robustness of the feature extraction to avoid misinterpretations due to the presence of disturbances from environmental noise or other undesired signals originating from the source, which are not relevant for the purpose of volcano surveillance.
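
    The density-based clustering idea described for the "DBSCAN" package (a cluster grows while the local density stays above a threshold) can be sketched in a few lines. This is a generic textbook-style DBSCAN in pure Python, not the INGV software; the parameter names eps and min_pts and the sample points are illustrative assumptions.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins the cluster, not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # j is also a core point: keep growing
    return labels

# Two dense groups and one isolated point (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

    Because clusters are grown by chaining neighborhoods rather than by distance to a centroid, this approach can follow irregular shapes, which is the property the abstract highlights for ash particles.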

  18. VRLane: a desktop virtual safety management program for underground coal mine

    NASA Astrophysics Data System (ADS)

    Li, Mei; Chen, Jingzhu; Xiong, Wei; Zhang, Pengpeng; Wu, Daozheng

    2008-10-01

    VR technologies, which generate immersive, interactive, and three-dimensional (3D) environments, are seldom applied to coal mine safety work management. In this paper, a new method that combined VR technologies with an underground mine safety management system was explored. A desktop virtual safety management program for underground coal mines, called VRLane, was developed. The paper mainly concerns current research advances in VR, system design, key techniques and system application. Two important techniques were introduced in the paper. Firstly, an algorithm was designed and implemented with which the 3D laneway models and equipment models can be built automatically on the basis of the latest 2D mine drawings, whereas common VR programs establish 3D environments by using 3DS Max or other 3D modeling software packages, with which laneway models are built manually and laboriously. Secondly, VRLane realized system integration with underground industrial automation. VRLane not only described a realistic 3D laneway environment, but also described the status of the coal mining, with functions for displaying the run states and related parameters of equipment, pre-alarming abnormal mining events, and animating mine cars, mine workers, or long-wall shearers. The system, with the advantages of being cheap, dynamic, and easy to maintain, provided a useful tool for safety production management in coal mines.

  19. Context and hand posture modulate the neural dynamics of tool-object perception.

    PubMed

    Natraj, Nikhilesh; Poole, Victoria; Mizelle, J C; Flumini, Andrea; Borghi, Anna M; Wheaton, Lewis A

    2013-02-01

    Prior research has linked visual perception of tools with plausible motor strategies. Thus, observing a tool activates the putative action-stream, including the left posterior parietal cortex. Observing a hand functionally grasping a tool involves the inferior frontal cortex. However, tool-use movements are performed in a contextual and grasp specific manner, rather than relative isolation. Our prior behavioral data has demonstrated that the context of tool-use (by pairing the tool with different objects) and varying hand grasp postures of the tool can interact to modulate subjects' reaction times while evaluating tool-object content. Specifically, perceptual judgment was delayed in the evaluation of functional tool-object pairings (Correct context) when the tool was non-functionally (Manipulative) grasped. Here, we hypothesized that this behavioral interference seen with the Manipulative posture would be due to increased and extended left parietofrontal activity possibly underlying motor simulations when resolving action conflict due to this particular grasp at time scales relevant to the behavioral data. Further, we hypothesized that this neural effect will be restricted to the Correct tool-object context wherein action affordances are at a maximum. 64-channel electroencephalography (EEG) was recorded from 16 right-handed subjects while viewing images depicting three classes of tool-object contexts: functionally Correct (e.g. coffee pot-coffee mug), functionally Incorrect (e.g. coffee pot-marker) and Spatial (coffee pot-milk). The Spatial context pairs a tool and object that would not functionally match, but may commonly appear in the same scene. These three contexts were modified by hand interaction: No Hand, Static Hand near the tool, Functional Hand posture and Manipulative Hand posture. The Manipulative posture is convenient for relocating a tool but does not afford a functional engagement of the tool on the target object. 
Subjects were instructed to visually assess whether the pictures displayed correct tool-object associations. EEG data was analyzed in time-voltage and time-frequency domains. Overall, Static Hand, Functional and Manipulative postures cause early activation (100-400ms post image onset) of parietofrontal areas, to varying intensity in each context, when compared to the No Hand control condition. However, when context is Correct, only the Manipulative Posture significantly induces extended neural responses, predominantly over right parietal and right frontal areas [400-600ms post image onset]. Significant power increase was observed in the theta band [4-8Hz] over the right frontal area, [0-500ms]. In addition, when context is Spatial, Manipulative posture alone significantly induces extended neural responses, over bilateral parietofrontal and left motor areas [400-600ms]. Significant power decrease occurred primarily in beta bands [12-16, 20-25Hz] over the aforementioned brain areas [400-600ms]. Here, we demonstrate that the neural processing of tool-object perception is sensitive to several factors. While both Functional and Manipulative postures in Correct context engage predominantly an early left parietofrontal circuit, the Manipulative posture alone extends the neural response and transitions to a late right parietofrontal network. This suggests engagement of a right neural system to evaluate action affordances when hand posture does not support action (Manipulative). Additionally, when tool-use context is ambiguous (Spatial context), there is increased bilateral parietofrontal activation and, extended neural response for the Manipulative posture. These results point to the existence of other networks evaluating tool-object associations when motoric affordances are not readily apparent and underlie corresponding delayed perceptual judgment in our prior behavioral data wherein Manipulative postures had exclusively interfered in judging tool-object content. 
Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Analysis of Nature of Science Included in Recent Popular Writing Using Text Mining Techniques

    ERIC Educational Resources Information Center

    Jiang, Feng; McComas, William F.

    2014-01-01

    This study examined the inclusion of nature of science (NOS) in popular science writing to determine whether it could serve as a supplementary resource for teaching NOS and to evaluate the accuracy of text mining and classification as a viable research tool in science education research. Four groups of documents published from 2001 to 2010 were…

  1. 75 FR 60694 - Taking and Importing Marine Mammals; Naval Explosive Ordnance Disposal School Training Operations...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-01

    ... Countermeasures (MCM) detonations is one function of the U.S. Navy EOD force, which involves mine-hunting and mine... of SRI for a portion of the NEODS class. The NEODS would utilize areas approximately one to three nmi... the training is to give NEODS students the tools and techniques to implement MCM through real...

  2. Use of Data Mining to Reveal Body Mass Index (BMI): Patterns among Pennsylvania Schoolchildren, Pre-K to Grade 12

    ERIC Educational Resources Information Center

    YoussefAgha, Ahmed H.; Lohrmann, David K.; Jayawardene, Wasantha P.

    2013-01-01

    Background: Health eTools for Schools was developed to assist school nurses with routine entries, including height and weight, on student health records, thus providing a readily accessible data base. Data-mining techniques were applied to this database to determine if clinically significant results could be generated. Methods: Body mass index…

  3. Developing an Intelligent Diagnosis and Assessment E-Learning Tool for Introductory Programming

    ERIC Educational Resources Information Center

    Huang, Chenn-Jung; Chen, Chun-Hua; Luo, Yun-Cheng; Chen, Hong-Xin; Chuang, Yi-Ta

    2008-01-01

    Recently, a lot of open source e-learning platforms have been offered for free on the Internet. We thus incorporate the intelligent diagnosis and assessment tool into an open software e-learning platform developed for programming language courses, wherein the proposed learning diagnosis assessment tools based on text mining and machine learning…

  4. Beyond accuracy: creating interoperable and scalable text-mining web services.

    PubMed

    Wei, Chih-Hsuan; Leaman, Robert; Lu, Zhiyong

    2016-06-15

    The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization. Unlike most text-mining software tools, our web services integrate several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem and tmVar) and offer a batch-processing mode able to process arbitrary text input (e.g. scholarly publications, patents and medical records) in multiple formats (e.g. BioC). We support multiple standards to make our service interoperable and allow simpler integration with other text-processing pipelines. To maximize scalability, we have preprocessed all PubMed articles, and use a computer cluster for processing large requests of arbitrary text. Our text-mining web service is freely available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#curl. Contact: Zhiyong.Lu@nih.gov. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  5. Visual information mining in remote sensing image archives

    NASA Astrophysics Data System (ADS)

    Pelizzari, Andrea; Descargues, Vincent; Datcu, Mihai P.

    2002-01-01

    The present article focuses on the development of interactive exploratory tools for visually mining the image content in large remote sensing archives. Two aspects are treated: the iconic visualization of the global information in the archive and the progressive visualization of the image details. The proposed methods are integrated in the Image Information Mining (I2M) system. The images and image structure in the I2M system are indexed based on a probabilistic approach. The resulting links are managed by a relational data base. Both the intrinsic complexity of the observed images and the diversity of user requests result in a great number of associations in the data base. Thus new tools have been designed to visualize, in iconic representation, the relationships created during a query or information mining operation: the visualization of the query results positioned on the geographical map, a quick-looks gallery, visualization of the measure of goodness of the query, and visualization of the image space for statistical evaluation purposes. Additionally, the I2M system is enhanced with progressive detail visualization in order to allow better access for operator inspection. I2M is a three-tier Java architecture and is optimized for the Internet.

  6. Data quality enhancement and knowledge discovery from relevant signals in acoustic emission

    NASA Astrophysics Data System (ADS)

    Mejia, Felipe; Shyu, Mei-Ling; Nanni, Antonio

    2015-10-01

    The increasing popularity of structural health monitoring has brought with it a growing need for automated data management and data analysis tools. Of great importance are filters that can systematically detect unwanted signals in acoustic emission datasets. This study presents a semi-supervised data mining scheme that detects data belonging to unfamiliar distributions. This type of outlier detection scheme is useful for detecting the presence of new acoustic emission sources, given a training dataset of unwanted signals. In addition to classifying new observations (herein referred to as "outliers") within a dataset, the scheme generates a decision tree that classifies sub-clusters within the outlier context set. The obtained tree can be interpreted as a series of characterization rules for newly-observed data, and they can potentially describe the basic structure of different modes within the outlier distribution. The data mining scheme is first validated on a synthetic dataset, and an attempt is made to confirm the algorithms' ability to discriminate outlier acoustic emission sources from a controlled pencil-lead-break experiment. Finally, the scheme is applied to data from two fatigue crack-growth steel specimens, where it is shown that extracted rules can adequately describe crack-growth related acoustic emission sources while filtering out background "noise." Results show promising performance in filter generation, thereby allowing analysts to extract, characterize, and focus only on meaningful signals.

  7. Open Research Challenges with Big Data - A Data-Scientist's Perspective

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas R

    In this paper, we discuss data-driven discovery challenges of the Big Data era. We observe that recent innovations in being able to collect, access, organize, integrate, and query massive amounts of data from a wide variety of data sources have brought statistical data mining and machine learning under more scrutiny and evaluation for gleaning insights from the data than ever before. In that context, we pose and debate the question - Are data mining algorithms scaling with the ability to store and compute? If yes, how? If not, why not? We survey recent developments in the state-of-the-art to discuss emerging and outstanding challenges in the design and implementation of machine learning algorithms at scale. We leverage experience from real-world Big Data knowledge discovery projects across domains of national security, healthcare and manufacturing to suggest our efforts be focused along the following axes: (i) the data science challenge - designing scalable and flexible computational architectures for machine learning (beyond just data-retrieval); (ii) the science of data challenge - the ability to understand characteristics of data before applying machine learning algorithms and tools; and (iii) the scalable predictive functions challenge - the ability to construct, learn and infer with increasing sample size, dimensionality, and categories of labels. We conclude with a discussion of opportunities and directions for future research.

  8. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    PubMed

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
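
    The text mining metrics named in this abstract (recall, precision and F-measure) follow standard definitions, sketched below for a set of extracted features compared against a gold-standard annotation. The function name and the sample feature sets are hypothetical and do not come from the VaeTM evaluation.

```python
def precision_recall_f1(predicted, gold):
    """Standard set-based precision, recall and balanced F-measure (F1)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up example: features a system extracted from one report vs. the annotation.
gold = {"fever", "rash", "anaphylaxis", "headache"}
predicted = {"fever", "rash", "nausea"}
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

    Here two of the three extracted features are correct (precision 2/3) and two of the four annotated features are found (recall 1/2), giving F1 = 4/7. A "lenient" F-measure, as reported in the abstract, would additionally count partial matches; that matching rule is not specified here and is not modeled.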

  9. Social Web mining and exploitation for serious applications: Technosocial Predictive Analytics and related technologies for public health, environmental and national security surveillance.

    PubMed

    Kamel Boulos, Maged N; Sanfilippo, Antonio P; Corley, Courtney D; Wheeler, Steve

    2010-10-01

    This paper explores Technosocial Predictive Analytics (TPA) and related methods for Web "data mining" where users' posts and queries are garnered from Social Web ("Web 2.0") tools such as blogs, micro-blogging and social networking sites to form coherent representations of real-time health events. The paper includes a brief introduction to commonly used Social Web tools such as mashups and aggregators, and maps their exponential growth as an open architecture of participation for the masses and an emerging way to gain insight about people's collective health status of whole populations. Several health related tool examples are described and demonstrated as practical means through which health professionals might create clear location specific pictures of epidemiological data such as flu outbreaks. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.

  10. Thinking in/through Movements; Working with/in Affect within the Context of Norwegian Early Years Education and Practice

    ERIC Educational Resources Information Center

    Rossholt, Nina

    2018-01-01

    This paper draws on data undertaken with very young children within the context of Norwegian kindergartens. Specifically, the paper focuses on non-human and human movements, mine included, that are undertaken in time and space. Following this, I argue that as the researcher I am always already entangled in inquiry and that there is no beginning. As a…

  11. Mining Adverse Drug Reactions in Social Media with Named Entity Recognition and Semantic Methods.

    PubMed

    Chen, Xiaoyi; Deldossi, Myrtille; Aboukhamis, Rim; Faviez, Carole; Dahamna, Badisse; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Girardeau, Yannick; Guillemin-Lanne, Sylvie; Lillo-Le-Louët, Agnès; Texier, Nathalie; Burgun, Anita; Katsahian, Sandrine

    2017-01-01

    Suspected adverse drug reactions (ADR) reported by patients through social media can be a complementary source to current pharmacovigilance systems. However, the performance of text mining tools applied to social media text data to discover ADRs needs to be evaluated. In this paper, we introduce the approach developed to mine ADR from French social media. A protocol of evaluation is highlighted, which includes a detailed sample size determination and evaluation corpus constitution. Our text mining approach provided very encouraging preliminary results with F-measures of 0.94 and 0.81 for recognition of drugs and symptoms respectively, and with F-measure of 0.70 for ADR detection. Therefore, this approach is promising for downstream pharmacovigilance analysis.

  12. Biota dose assessment of small mammals sampled near uranium mines in northern Arizona

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jannik, T.; Minter, K.; Kuhne, W.

    In 2015, the U. S. Geological Survey (USGS) collected approximately 50 small mammal carcasses from Northern Arizona uranium mines and other background locations. Based on the highest gross alpha results, 11 small mammal samples were selected for radioisotopic analyses. None of the background samples had significant gross alpha results. The 11 small mammals were identified relative to the three ‘indicator’ mines located south of Fredonia, AZ on the Kanab Plateau (Kanab North Mine, Pinenut Mine, and Arizona 1 Mine) (Figure 1-1) and are operated by Energy Fuels Resources Inc. (EFRI). EFRI annually reports soil analysis for uranium and radium-226 using Arizona Department of Environmental Quality (ADEQ)-approved Standard Operating Procedures for Soil Sampling (EFRI 2016a, 2016b, 2017). In combination with the USGS small mammal radioisotopic tissue analyses, a biota dose assessment was completed by Savannah River National Laboratory (SRNL) using the RESidual RADioactivity-BIOTA (RESRAD-BIOTA, V. 1.8) dose assessment tool provided by the Argonne National Laboratory (ANL 2017).

  13. Tracking acid mine-drainage in Southeast Arizona using GIS and sediment delivery models

    USGS Publications Warehouse

    Norman, L.M.; Gray, F.; Guertin, D.P.; Wissler, C.; Bliss, J.D.

    2008-01-01

    This study investigates the application of models traditionally used to estimate erosion and sediment deposition to assess the potential risk of water quality impairment resulting from metal-bearing materials related to mining and mineralization. An integrated watershed analysis using Geographic Information Systems (GIS) based tools was undertaken to examine erosion and sediment transport characteristics within the watersheds. Estimates of stream deposits of sediment from mine tailings were related to the chemistry of surface water to assess the effectiveness of the methodology to assess the risk of acid mine-drainage being dispersed downstream of abandoned tailings and waste rock piles. A watershed analysis was performed in the Patagonia Mountains in southeastern Arizona, which has seen substantial mining and where recent water quality samples have reported acidic surface waters. This research demonstrates an improvement of the ability to predict streams that are likely to have severely degraded water quality as a result of past mining activities. © Springer Science+Business Media B.V. 2007.

  14. Consultation and remediation in the north: meeting international commitments to safeguard health and well-being.

    PubMed

    Banfield, Laura; Jardine, Cynthia G

    2013-01-01

    International commitments exist for the safeguarding of health and the prevention of ill health. One of the earliest commitments is the Declaration of Alma-Ata (1978), which provides 5 principles guiding primary health care: equity, community participation, health promotion, intersectoral collaboration and appropriate technology. These broadly applicable international commitments are premised on the World Health Organization's multifaceted definition of health. The environment is one sector in which these commitments to safeguarding health can be applied. Giant Mine, a contaminated former gold mine in the Northwest Territories, Canada, represents potential threats to all aspects of health. Strategies for managing such threats usually involve an obligation to engage the affected communities through consultation. To examine the remediation and consultation process associated with Giant Mine within the context of commitments to safeguard health and well-being through adapting and applying the principles of primary health care. Semi-structured interviews with purposively selected key informants representing government proponents and community members were conducted. In reviewing themes which emerged from a series of interviews exploring the community consultation process for the remediation of Giant Mine, the principles guiding primary health care were mapped to consultation in the North: (a) "equity" is the capacity to fairly and meaningfully participate in the consultation; (b) "community participation" is the right to engage in the process through reciprocal dialogue; (c) "health promotion" represents the need for continued information sharing towards awareness; (d) "intersectoral collaboration" signifies the importance of including all stakeholders; and (e) "appropriate technology" is the need to employ the best remediation actions relevant to the site and the community. 
Within the context of mining remediation, these principles form an appropriate framework for viewing consultation as a means of meeting international obligations to safeguard health.

  15. Inferring transposons activity chronology by TRANScendence - TEs database and de-novo mining tool.

    PubMed

    Startek, Michał Piotr; Nogły, Jakub; Gromadka, Agnieszka; Grzebelus, Dariusz; Gambin, Anna

    2017-10-16

    The constant progress in sequencing technology leads to ever increasing amounts of genomic data. In the light of current evidence, transposable elements (TEs for short) are becoming useful tools for learning about the evolution of the host genome. Therefore, software for genome-wide detection and analysis of TEs is of great interest. Here we describe a computational tool for mining, classifying and storing TEs from newly sequenced genomes. This is an online, web-based, user-friendly service, enabling users to upload their own genomic data and perform de-novo searches for TEs. The detected TEs are automatically analyzed, compared to reference databases, annotated, clustered into families, and stored in a TEs repository. Also, the genome-wide nesting structure of the found elements is detected and analyzed by a new method for inferring the evolutionary history of TEs. We illustrate the functionality of our tool by performing a full-scale analysis of the TE landscape in the Medicago truncatula genome. TRANScendence is an effective tool for the de-novo annotation and classification of transposable elements in newly-acquired genomes. Its streamlined interface makes it well-suited for evolutionary studies.

  16. Make Mine a Metasearcher, Please!

    ERIC Educational Resources Information Center

    Repman, Judi; Carlson, Randal D.

    2000-01-01

    Describes metasearch tools and explains their value in helping library media centers improve students' Web searches. Discusses Boolean queries and the emphasis on speed at the expense of comprehensiveness; and compares four metasearch tools, including the number of search engines consulted, user control, and databases included. (LRW)

  17. New approach for reduction of diesel consumption by comparing different mining haulage configurations.

    PubMed

    Rodovalho, Edmo da Cunha; Lima, Hernani Mota; de Tomi, Giorgio

    2016-05-01

    The mining operations of loading and haulage have an energy source that is highly dependent on fossil fuels. In mining companies that select trucks for haulage, this input is the main component of mining costs. How can the impact of the operational aspects on the diesel consumption of haulage operations in surface mines be assessed? There are many studies relating the consumption of fuel trucks to several variables, but a methodology that prioritizes higher-impact variables under each specific condition is not available. Generic models may not apply to all operational settings presented in the mining industry. This study aims to create a method of analysis, identification, and prioritization of variables related to fuel consumption of haul trucks in open pit mines. For this purpose, statistical analysis techniques and mathematical modelling tools using multiple linear regressions will be applied. The model is shown to be suitable because the results generate a good description of the fuel consumption behaviour. In the practical application of the method, the reduction of diesel consumption reached 10%. The implementation requires no large-scale investments or very long deadlines and can be applied to mining haulage operations in other settings. Copyright © 2016 Elsevier Ltd. All rights reserved.
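
    The multiple linear regression step mentioned in this abstract can be sketched as an ordinary least-squares fit. The variables below (payload and road grade as predictors of fuel use), the data values and the function name are invented for illustration and are not taken from the study; the solver uses the normal equations with Gauss-Jordan elimination so that the sketch needs no external libraries.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.
    Returns [intercept, coef_1, ..., coef_k]."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot_row][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

# Hypothetical haul cycles: (payload in tonnes, road grade in %) -> fuel in litres.
# The fuel values were generated from fuel = 5 + 0.5 * payload + 3 * grade.
cycles = [(100, 2), (120, 5), (150, 3), (180, 8), (200, 6)]
fuel = [61.0, 80.0, 89.0, 119.0, 123.0]
intercept, per_tonne, per_grade = fit_linear_regression(cycles, fuel)
print(round(intercept, 3), round(per_tonne, 3), round(per_grade, 3))
```

    Comparing the magnitude of standardized coefficients is one simple way to rank which operational variables matter most; whether the study ranked variables this way is not stated in the abstract.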

  18. The production of consumption: addressing the impact of mineral mining on tuberculosis in southern Africa

    PubMed Central

    Basu, Sanjay; Stuckler, David; Gonsalves, Gregg; Lurie, Mark

    2009-01-01

    Background Miners in southern Africa experience incident rates of tuberculosis up to ten times greater than the general population. Migration to and from mines may be amplifying tuberculosis epidemics in the general population. Discussion Migration to and from mineral mines contributes to HIV risks and associated tuberculosis incidence. Health and safety conditions within mines also promote the risk of silicosis (a tuberculosis risk factor) and transmission of tuberculosis bacilli in close quarters. In the context of migration, current tuberculosis prevention and treatment strategies often fail to provide sufficient continuity of care to ensure appropriate tuberculosis detection and treatment. Reports from Lesotho and South Africa suggest that miners pose transmission risks to other household or community members as they travel home undetected or inadequately treated, particularly with drug-resistant forms of tuberculosis. Reducing risky exposures on the mines, enhancing the continuity of primary care services, and improving the enforcement of occupational health codes may mitigate the harmful association between mineral mining activities and tuberculosis incidence among affected communities. Summary Tuberculosis incidence appears to be amplified by mineral mining operations in southern Africa. A number of immediately-available measures to improve continuity of care for miners, change recruitment and compensation practices, and reduce the primary risk of infection may critically mitigate the negative association between mineral mining and tuberculosis. PMID:19785769

  19. Using Data Mining to Detect Health Care Fraud and Abuse: A Review of Literature

    PubMed Central

    Joudaki, Hossein; Rashidian, Arash; Minaei-Bidgoli, Behrouz; Mahmoodi, Mahmood; Geraili, Bijan; Nasiri, Mahdi; Arab, Mohammad

    2015-01-01

    Inappropriate payments by insurance organizations or third party payers occur because of errors, abuse and fraud. The scale of this problem is large enough to make it a priority issue for health systems. Traditional methods of detecting health care fraud and abuse are time-consuming and inefficient. Combining automated methods and statistical knowledge led to the emergence of a new interdisciplinary branch of science named Knowledge Discovery from Databases (KDD). Data mining is at the core of the KDD process. Data mining can help third-party payers such as health insurance organizations to extract useful information from thousands of claims and identify a smaller subset of the claims or claimants for further assessment. We reviewed studies that performed data mining techniques for detecting health care fraud and abuse, using supervised and unsupervised data mining approaches. Most available studies have focused on algorithmic data mining without an emphasis on or application to fraud detection efforts in the context of health service provision or health insurance policy. More studies are needed to connect sound and evidence-based diagnosis and treatment approaches toward fraudulent or abusive behaviors. Ultimately, based on available studies, we recommend seven general steps to data mining of health care claims. PMID:25560347

  20. Evidence of Historical Mining Impacts on Saltmarshes from east Cornwall, UK

    NASA Astrophysics Data System (ADS)

    Iurian, Andra-Rada; Taylor, Alex; Millward, Geoff; Blake, William

    2016-04-01

    In landscapes with extensive mining history, saltmarshes can become sinks for contaminants that are vulnerable to release with sea-level rise and increased storminess. Given the prolonged residence time of heavy metals in the environment, data is urgently required to contextualise the impacts of past and present mining and pollution events and provide a baseline against which to assess Water Framework Directive (WFD) (2000/60/EC) compliance within an integrated catchment management framework. The geology of east Cornwall, UK (with intrusions of granite into the surrounding sedimentary rocks) was favourable for a prosperous mining industry, although large scale operations did not start until about 1830. Tin, copper, lead and tungsten were the most important ores in the region. In order to quantify the spatial and temporal extent of contamination from past mining, sediment cores were collected from three saltmarshes, namely: Antony Marsh and Treluggan Marsh on the Lower Basin of River Lynher, and Port Eliot Marsh on the Lower Basin of River Tiddy. Core sections at 1 cm intervals were analysed by gamma-ray spectrometry for Pb-210, Ra-226, Cs-137 and Am-241, and the well-established Constant Rate of Supply (CRS) model was employed to derive Pb-210 geochronology with bomb-derived Cs-137 and Am-241 as independent chronological markers. The geochronological data provided the sedimentary accumulation and temporal context for the study. In terms of sediment quality with respect to mining pollution, core sections were analysed using Q-ICP-MS techniques and, additionally, WD-XRF instrumentation at Plymouth University. Measurements were performed for target elements that are normally associated with mining and smelting activities (e.g. Pb, Cu, Sn, Zn, Cr, Cd, etc.), and lithogenic elements (e.g. Fe, Al, Ti) that allow enrichment factors for the anthropogenically-derived elements to be determined. The grain size distribution was determined to identify storminess events and to detect discontinuities in the sediment record. Downcore trends in metal pollutants are discussed in the context of the chronological data, sediment composition and historic meteorological and river flow records. Acknowledgements: Andra-Rada Iurian acknowledges the support of a Marie Curie Fellowship (H2020-MSCA-IF-2014, Grant Agreement number: 658863) within the Horizon 2020.
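
    Two quantities named in this abstract can be written down compactly. The sketch below uses the usual textbook forms: an enrichment factor as the metal-to-lithogenic-element ratio in a sample divided by the same ratio in a background material, and a CRS age t = (1/lambda) ln(A(0)/A(x)), where A(0) is the total unsupported Pb-210 inventory and A(x) the inventory below depth x. The choice of Al as reference element, the half-life value, and every number are assumptions for illustration, not data or procedures from the study.

```python
import math

# Commonly quoted Pb-210 half-life in years; treated here as an assumption.
PB210_HALF_LIFE_YEARS = 22.3


def enrichment_factor(metal, reference, metal_background, reference_background):
    """EF = (metal/reference in sample) / (metal/reference in background)."""
    return (metal / reference) / (metal_background / reference_background)


def crs_age_years(total_inventory, inventory_below_depth,
                  half_life_years=PB210_HALF_LIFE_YEARS):
    """CRS age of a sediment horizon: t = (1/lambda) * ln(A(0) / A(x))."""
    decay_constant = math.log(2) / half_life_years
    return math.log(total_inventory / inventory_below_depth) / decay_constant


# Hypothetical concentrations in mg/kg: Pb as the metal, Al as the reference.
ef_pb = enrichment_factor(metal=120.0, reference=60000.0,
                          metal_background=20.0, reference_background=80000.0)

# Hypothetical unsupported Pb-210 inventories in Bq/m^2.
age = crs_age_years(total_inventory=4000.0, inventory_below_depth=1000.0)

print(f"Pb enrichment factor: {ef_pb:.1f}")
print(f"CRS age of horizon: {age:.1f} years")
```

    An enrichment factor well above 1 points to a source beyond the lithogenic background. In the age calculation, a horizon with one quarter of the total inventory remaining below it is two half-lives old.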

  1. PyGPlates - a GPlates Python library for data analysis through space and deep geological time

    NASA Astrophysics Data System (ADS)

    Williams, Simon; Cannon, John; Qin, Xiaodong; Müller, Dietmar

    2017-04-01

    A fundamental consideration for studying the Earth through deep time is that the configurations of the continents, tectonic plates, and plate boundaries are continuously changing. Within a diverse range of fields including geodynamics, paleoclimate, and paleobiology, the importance of considering geodata in their reconstructed context across previous cycles of supercontinent aggregation, dispersal and ocean basin evolution is widely recognised. Open-source software tools such as GPlates provide paleo-geographic information systems for geoscientists to combine a wide variety of geodata and examine them within tectonic reconstructions through time. The availability of such powerful tools also brings new challenges - we want to learn something about the key associations between reconstructed plate motions and the geological record, but the high-dimensional parameter space makes it difficult for a human being to visually comprehend and quantify these associations. To achieve true spatio-temporal data-mining, new tools are needed. Here, we present a further development of the GPlates ecosystem - a Python-based tool for geotectonic analysis. In contrast to existing GPlates tools that are built around a graphical user interface (GUI) and interactive visualisation, pyGPlates offers a programming interface for the automation of quantitative plate tectonic analysis of arbitrary complexity. The vast array of open-source Python-based tools for data-mining, statistics and machine learning can now be linked to pyGPlates, allowing spatial data to be seamlessly analysed in space and geological "deep time", and with the ability to spread large computations across multiple processors. The presentation will illustrate a range of example applications, both simple and advanced. Basic examples include data querying, filtering, and reconstruction, and file-format conversions. For the innovative study of plate kinematics, pyGPlates has been used to explore the relationships between absolute plate motions, subduction zone kinematics, and mid-ocean ridge migration and orientation through deep time; to investigate the systematics of continental rift velocity evolution during Pangea breakup; and to make connections between kinematics of the Andean subduction zone and ore deposit formation. To support the numerical modelling community, pyGPlates facilitates the connection between tectonic surface boundary conditions contained within plate tectonic reconstructions (plate boundary configurations and plate velocities) and simulations such as thermo-mechanical models of lithospheric deformation and mantle convection. To support the development of web-based applications that can serve the wider geoscience community, we will demonstrate how pyGPlates can be combined with other open-source tools to serve alternative reconstructions together with a diverse array of reconstructed data sets in a self-consistent framework over the internet. PyGPlates is available to the public via the GPlates web site and contains comprehensive documentation covering installation on Windows/Mac/Linux platforms, sample code, tutorials and a detailed reference of pyGPlates functions and classes.

  2. Identification of Social and Environmental Conflicts Resulting from Open-Cast Mining

    NASA Astrophysics Data System (ADS)

    Górniak-Zimroz, Justyna; Pactwa, Katarzyna

    2016-10-01

    Open-cast mining is related to interference in the natural environment. It also affects human health and quality of life. This influence is, among others, dependent on the type of extracted materials, size of deposit, methods of mining and mineral processing, as well as, equally important, sensitivity of the environment within which mining is planned. The negative effects of mining include deformations of land surface or contamination of soils, air and water. What is more, in many cases, mining for minerals leads to clearing of housing and transport infrastructures located within the mining area, a decrease in values of the properties in the immediate vicinity of a deposit, and an increase in stress levels in local residents exposed to noise. The awareness of negative consequences of taking up open-cast mining activity leads to conflicts between a mining entrepreneur and self-government authorities, society or nongovernment organisations. The article attempts to identify potential social and environmental conflicts that may occur in relation to a planned mining activity. The results of the analyses were interpreted with respect to the deposits which were or have been mined. That enabled one to determine which facilities exclude mineral mining and which allow it. The research took the non-energy mineral resources into consideration which are included in the group of solid minerals located in one of the districts of Lower Silesian Province (SW Poland). The spatial analyses used the tools available in the geographical information systems

  3. Health system context and implementation of evidence-based practices-development and validation of the Context Assessment for Community Health (COACH) tool for low- and middle-income settings.

    PubMed

    Bergström, Anna; Skeen, Sarah; Duc, Duong M; Blandon, Elmer Zelaya; Estabrooks, Carole; Gustavsson, Petter; Hoa, Dinh Thi Phuong; Källestål, Carina; Målqvist, Mats; Nga, Nguyen Thu; Persson, Lars-Åke; Pervin, Jesmin; Peterson, Stefan; Rahman, Anisur; Selling, Katarina; Squires, Janet E; Tomlinson, Mark; Waiswa, Peter; Wallin, Lars

    2015-08-15

    The gap between what is known and what is practiced results in health service users not benefitting from advances in healthcare, and in unnecessary costs. A supportive context is considered a key element for successful implementation of evidence-based practices (EBP). There were no tools available for the systematic mapping of aspects of organizational context influencing the implementation of EBPs in low- and middle-income countries (LMICs). Thus, this project aimed to develop and psychometrically validate a tool for this purpose. The development of the Context Assessment for Community Health (COACH) tool was premised on the context dimension in the Promoting Action on Research Implementation in Health Services framework, and is a derivative product of the Alberta Context Tool. Its development was undertaken in Bangladesh, Vietnam, Uganda, South Africa and Nicaragua in six phases: (1) defining dimensions and draft tool development, (2) content validity amongst in-country expert panels, (3) content validity amongst international experts, (4) response process validity, (5) translation and (6) evaluation of psychometric properties amongst 690 health workers in the five countries. The tool was validated for use amongst physicians, nurse/midwives and community health workers. The six phases of development resulted in a good fit between the theoretical dimensions of the COACH tool and its psychometric properties. The tool has 49 items measuring eight aspects of context: Resources, Community engagement, Commitment to work, Informal payment, Leadership, Work culture, Monitoring services for action and Sources of knowledge. Aspects of organizational context that were identified as influencing the implementation of EBPs in high-income settings were also found to be relevant in LMICs. However, there were additional aspects of context of relevance in LMICs, specifically Resources, Community engagement, Commitment to work and Informal payment. Use of the COACH tool will allow for systematic description of the local healthcare context prior to implementing healthcare interventions to allow for tailoring implementation strategies or as part of the evaluation of implementing healthcare interventions and thus allow for deeper insights into the process of implementing EBPs in LMICs.

  4. BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins.

    PubMed

    van Heel, Auke J; de Jong, Anne; Song, Chunxu; Viel, Jakob H; Kok, Jan; Kuipers, Oscar P

    2018-05-21

    Interest in secondary metabolites such as RiPPs (ribosomally synthesized and posttranslationally modified peptides) is increasing worldwide. To facilitate the research in this field we have updated our mining web server. BAGEL4 is faster than its predecessor and is now fully independent from ORF-calling. Gene clusters of interest are discovered using the core-peptide database and/or through HMM motifs that are present in associated context genes. The databases used for mining have been updated and extended with literature references and links to UniProt and NCBI. Additionally, we have included automated promoter and terminator prediction and the option to upload RNA expression data, which can be displayed along with the identified clusters. Further improvements include the annotation of the context genes, which is now based on a fast blast against the prokaryote part of the UniRef90 database, and the improved web-BLAST feature that dynamically loads structural data such as internal cross-linking from UniProt. Overall BAGEL4 provides the user with more information through a user-friendly web-interface which simplifies data evaluation. BAGEL4 is freely accessible at http://bagel4.molgenrug.nl.

  5. Applications of Geomatics in Surface Mining

    NASA Astrophysics Data System (ADS)

    Blachowski, Jan; Górniak-Zimroz, Justyna; Milczarek, Wojciech; Pactwa, Katarzyna

    2017-12-01

    In terms of the method of extracting minerals from a deposit, mining can be classified into: surface, underground, and borehole mining. Surface mining is a form of mining in which the soil and the rock covering the mineral deposits are removed. Types of surface mining include mainly strip and open-cast methods, as well as quarrying. Tasks associated with surface mining of minerals include: resource estimation and deposit documentation, mine planning and deposit access, mine plant development, extraction of minerals from deposits, mineral and waste processing, and reclamation of former mining grounds. At each stage of mining, geodata describing changes occurring in space during the entire life cycle of a surface mining project should be taken into consideration, i.e. collected, analysed, processed, examined, distributed. These data result from direct (e.g. geodetic) and indirect (i.e. remote or relative) measurements and observations including airborne and satellite methods, geotechnical, geological and hydrogeological data, and data from other types of sensors, e.g. located on mining equipment and infrastructure, mine plans and maps. Management of such vast sources and sets of geodata, as well as information resulting from processing, integrated analysis and examining such data can be facilitated with geomatic solutions. Geomatics is a discipline of gathering, processing, interpreting, storing and delivering spatially referenced information. Thus, geomatics integrates methods and technologies used for collecting, management, processing, visualizing and distributing spatial data. In other words, its meaning covers practically every method and tool from spatial data acquisition to distribution. In this work examples of application of geomatic solutions in surface mining on representative case studies in various stages of mine operation have been presented. These applications include: prospecting and documenting mineral deposits, assessment of land accessibility for a potential large-scale surface mining project, modelling mineral deposit (granite) management, concept of a system for management of conveyor belt network technical condition, project of a geoinformation system of former mining terrains and objects, and monitoring and control of impact of surface mining on mine surroundings with satellite radar interferometry.

  6. An open data mining framework for the analysis of medical images: application on obstructive nephropathy microscopy images.

    PubMed

    Doukas, Charalampos; Goudas, Theodosis; Fischer, Simon; Mierswa, Ingo; Chatziioannou, Aristotle; Maglogiannis, Ilias

    2010-01-01

    This paper presents an open image-mining framework that provides access to tools and methods for the characterization of medical images. Several image processing and feature extraction operators have been implemented and exposed through Web Services. Rapid-Miner, an open source data mining system, has been utilized for applying classification operators and creating the essential processing workflows. The proposed framework has been applied for the detection of salient objects in Obstructive Nephropathy microscopy images. Initial classification results are quite promising, demonstrating the feasibility of automated characterization of kidney biopsy images.

  7. An evolving computational platform for biological mass spectrometry: workflows, statistics and data mining with MASSyPup64.

    PubMed

    Winkler, Robert

    2015-01-01

    In biological mass spectrometry, crude instrumental data need to be converted into meaningful theoretical models. Several data processing and data evaluation steps are required to come to the final results. These operations are often difficult to reproduce because of overly specific computing platforms. This effect, known as 'workflow decay', can be diminished by using a standardized informatic infrastructure. Thus, we compiled an integrated platform, which contains ready-to-use tools and workflows for mass spectrometry data analysis. Apart from general unit operations, such as peak picking and identification of proteins and metabolites, we put a strong emphasis on the statistical validation of results and Data Mining. MASSyPup64 includes e.g., the OpenMS/TOPPAS framework, the Trans-Proteomic-Pipeline programs, the ProteoWizard tools, X!Tandem, Comet and SpiderMass. The statistical computing language R is installed with packages for MS data analyses, such as XCMS/metaXCMS and MetabR. The R package Rattle provides user-friendly access to multiple Data Mining methods. Further, we added the non-conventional spreadsheet program teapot for editing large data sets and a command line tool for transposing large matrices. Individual programs, console commands and modules can be integrated using the Workflow Management System (WMS) taverna. We explain the useful combination of the tools by practical examples: (1) A workflow for protein identification and validation, with subsequent Association Analysis of peptides, (2) Cluster analysis and Data Mining in targeted Metabolomics, and (3) Raw data processing, Data Mining and identification of metabolites in untargeted Metabolomics. Association Analyses reveal relationships between variables across different sample sets. We present its application for finding co-occurring peptides, which can be used for target proteomics, the discovery of alternative biomarkers and protein-protein interactions. Data Mining derived models displayed a higher robustness and accuracy for classifying sample groups in targeted Metabolomics than cluster analyses. Random Forest models do not only provide predictive models, which can be deployed for new data sets, but also the variable importance. We demonstrate that the latter is especially useful for tracking down significant signals and affected pathways in untargeted Metabolomics. Thus, Random Forest modeling supports the unbiased search for relevant biological features in Metabolomics. Our results clearly manifest the importance of Data Mining methods to disclose non-obvious information in biological mass spectrometry. The application of a Workflow Management System and the integration of all required programs and data in a consistent platform makes the presented data analysis strategies reproducible for non-expert users. The simple remastering process and the Open Source licenses of MASSyPup64 (http://www.bioprocess.org/massypup/) enable the continuous improvement of the system.
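
    The idea behind association analysis of co-occurring peptides can be sketched with the standard support, confidence and lift measures. The example below is plain Python rather than the R/Rattle tooling named in the abstract, and the peptide labels, sample sets, and support threshold are hypothetical.

```python
from itertools import combinations


def pair_rules(samples, min_support=0.5):
    """Return support, confidence and lift for item pairs above min_support."""
    n = len(samples)
    items = sorted(set().union(*samples))
    # Number of samples in which each single item was observed.
    count = {item: sum(item in s for s in samples) for item in items}
    rules = []
    for a, b in combinations(items, 2):
        both = sum(a in s and b in s for s in samples)
        support = both / n
        if support >= min_support:
            rules.append({
                "pair": (a, b),
                "support": support,                # P(a and b)
                "confidence_a_to_b": both / count[a],  # P(b | a)
                "lift": support / ((count[a] / n) * (count[b] / n)),
            })
    return rules


# Hypothetical peptide identifications per sample (not data from the paper).
samples = [
    {"pepA", "pepB", "pepC"},
    {"pepA", "pepB"},
    {"pepA", "pepB", "pepD"},
    {"pepC", "pepD"},
]

rules = pair_rules(samples, min_support=0.5)
for rule in rules:
    print(rule)
```

    A lift above 1 means the two peptides are observed together more often than independent occurrence would predict, which is the kind of signal that would flag a pair for closer inspection.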

  8. Visual Analysis as a design and decision-making tool in the development of a quarry

    Treesearch

    Randall Boyd Fitzgerald

    1979-01-01

    In order to obtain local and state government approvals, an environmental impact analysis of the mining and reclamation of a proposed hard rock quarry was required. High visibility of the proposed mining area from the adjacent community required a visual impact analysis in the planning and design of the project. The Visual Analysis defined design criteria for the...

  9. Data Mining in Institutional Economics Tasks

    NASA Astrophysics Data System (ADS)

    Kirilyuk, Igor; Kuznetsova, Anna; Senko, Oleg

    2018-02-01

    The paper discusses problems associated with the use of data mining tools to study discrepancies between countries with different types of institutional matrices by variety of potential explanatory variables: climate, economic or infrastructure indicators. An approach is presented which is based on the search of statistically valid regularities describing the dependence of the institutional type on a single variable or a pair of variables. Examples of regularities are given.

  10. Accounts receivable reports: underutilized mining tools.

    PubMed

    Wallace, R

    1999-01-01

    There is gold to be found in accounts receivable reports for those willing to mine the data. The key is to know how to interpret the information buried within the numbers and use it to recover monies owed. This article identifies seven reports that should be staples in every organization committed to improving its overall collection performance. Also included are tips on understanding reports and implementing changes.

  11. Resources for Functional Genomics Studies in Drosophila melanogaster

    PubMed Central

    Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

    2014-01-01

    Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003

  12. Data Mining and Machine Learning Tools for Combinatorial Material Science of All-Oxide Photovoltaic Cells.

    PubMed

    Yosipof, Abraham; Nahum, Oren E; Anderson, Assaf Y; Barad, Hannah-Noa; Zaban, Arie; Senderowitz, Hanoch

    2015-06-01

    Growth in energy demands, coupled with the need for clean energy, is likely to make solar cells an important part of future energy resources. In particular, cells entirely made of metal oxides (MOs) have the potential to provide clean and affordable energy if their power conversion efficiencies are improved. Such improvements require the development of new MOs which could benefit from combining combinatorial material sciences for producing solar cell libraries with data mining tools to direct synthesis efforts. In this work we developed a data mining workflow and applied it to the analysis of two recently reported solar cell libraries based on Titanium and Copper oxides. Our results demonstrate that QSAR models with good prediction statistics for multiple solar cell properties could be developed and that these models highlight important factors affecting these properties in accord with experimental findings. The resulting models are therefore suitable for designing better solar cells. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. MyWEST: my Web Extraction Software Tool for effective mining of annotations from web-based databanks.

    PubMed

    Masseroli, Marco; Stella, Andrea; Meani, Natalia; Alcalay, Myriam; Pinciroli, Francesco

    2004-12-12

    High-throughput technologies create the necessity to mine large amounts of gene annotations from diverse databanks, and to integrate the resulting data. Most databanks can be interrogated only via Web, for a single gene at a time, and query results are generally available only in the HTML format. Although some databanks provide batch retrieval of data via FTP, this requires expertise and resources for locally reimplementing the databank. We developed MyWEST, a tool aimed at researchers without extensive informatics skills or resources, which exploits user-defined templates to easily mine selected annotations from different Web-interfaced databanks, and aggregates and structures results in an automatically updated database. Using microarray results from a model system of retinoic acid-induced differentiation, MyWEST effectively gathered relevant annotations from various biomolecular databanks, highlighted significant biological characteristics and supported a global approach to the understanding of complex cellular mechanisms. MyWEST is freely available for non-profit use at http://www.medinfopoli.polimi.it/MyWEST/

  14. The application of data mining techniques to oral cancer prognosis.

    PubMed

    Tseng, Wan-Ting; Chiang, Wei-Fan; Liu, Shyun-Yeu; Roan, Jinsheng; Lin, Chun-Nan

    2015-05-01

    This study adopted an integrated procedure that combines the clustering and classification features of data mining technology to determine the differences between the symptoms shown in past cases where patients died from or survived oral cancer. Two data mining tools, namely decision tree and artificial neural network, were used to analyze the historical cases of oral cancer, and their performance was compared with that of logistic regression, the popular statistical analysis tool. Both decision tree and artificial neural network models showed superiority to the traditional statistical model. However, for clinicians, the trees created by the decision tree models are easier to interpret than the artificial neural network models. Cluster analysis also revealed that stage 4 patients who also have the following four characteristics have an extremely low survival rate: pN is N2b, level of RLNM is level I-III, AJCC-T is T4, and cell mutation situation (G) is moderate.

  15. Data mining in pharma sector: benefits.

    PubMed

    Ranjan, Jayanthi

    2009-01-01

    The amount of data being generated in any sector at present is enormous. The information flow in the pharma industry is huge. Pharma firms are moving toward increasingly technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms to discover patterns that improve the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making using pharmaceutical data. This conceptual paper is based on secondary study, research and observations from magazines, reports and notes. The author has listed the types of patterns that can be discovered using data mining in pharma data. Although much work can be produced for discovering knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data based on primary research using the available, well-known data mining tools. Research papers and conceptual papers related to data mining in the pharma industry are rare; this is the motivation for the paper.

  16. Exploring the effects of acid mine drainage on diatom teratology using geometric morphometry.

    PubMed

    Olenici, Adriana; Blanco, Saúl; Borrego-Ramos, María; Momeu, Laura; Baciu, Călin

    2017-10-01

    Metal pollution of aquatic habitats is a major and persistent environmental problem. Acid mine drainage (AMD) affects lotic systems in numerous and interactive ways. In the present work, a mining area (Roșia Montană) was chosen as study site, and we focused on two aims: (i) to find the set of environmental predictors leading to the appearance of the abnormal diatom individuals in the study area and (ii) to assess the relationship between the degree of valve outline deformation and AMD-derived pollution. In this context, morphological differences between populations of Achnanthidium minutissimum and A. macrocephalum, including normal and abnormal individuals, were evidenced by means of valve shape analysis. Geometric morphometry managed to capture and discriminate normal and abnormal individuals. Multivariate analyses (NMDS, PLS) separated the four populations of the two species mentioned and revealed the main physico-chemical parameters that influenced valve deformation in this context, namely conductivity, Zn, and Cu. ANOSIM test evidenced the presence of statistically significant differences between normal and abnormal individuals within both chosen Achnanthidium taxa. In order to determine the relative contribution of each of the measured physico-chemical parameters in the observed valve outline deformations, a PLS was conducted, confirming the results of the NMDS. The presence of deformed individuals in the study area can be attributed to the fact that the diatom communities were strongly affected by AMD released from old mining works and waste rock deposits.

  17. The United States and Vietnam Relationship: Benefits and Challenges for Vietnam

    DTIC Science & Technology

    2016-06-10

    ...reach the current stage in their bilateral relations. The U.S.-Vietnam relationship has been increasingly cemented in the context of the contemporary... Major Exports to Vietnam aircraft, mining equipment, electronic machinery, steel wire, raw cotton, plastics Source: Mark E. Manyin, The Vietnam

  18. A novel approach for acid mine drainage pollution biomonitoring using rare earth elements bioaccumulated in the freshwater clam Corbicula fluminea.

    PubMed

    Bonnail, Estefanía; Pérez-López, Rafael; Sarmiento, Aguasanta M; Nieto, José Miguel; DelValls, T Ángel

    2017-09-15

    The lanthanide series has been used as a record of water-rock interaction and works as a tool for identifying impacts of acid mine drainage (lixiviate residue derived from sulphide oxidation). The application of North-American Shale Composite (NASC)-normalized rare earth element patterns to these minority elements allows determining the origin of the contamination. In the current study, geochemical patterns were applied to rare earth elements bioaccumulated in the soft tissue of the freshwater clam Corbicula fluminea after exposure to different acid mine drainage contaminated environments. Results show significant bioaccumulation of rare earth elements in soft tissue of the clam after 14 days of exposure to acid mine drainage contaminated sediment (ΣREE = 1.3-8 μg/g dw). Furthermore, it was possible to biomonitor different degrees of contamination based on rare earth elements in tissue. The pattern of this type of contamination describes a particular curve characterized by an enrichment in the middle rare earth elements; a homologous pattern (E_MREE = 0.90) has also been observed when NASC normalization is applied to clam tissues. Results of lanthanides found in clams were contrasted with the paucity of toxicity studies, determining risk caused by light rare earth elements in the Odiel River close to the Estuary. The current study proposes the use of the clam as an innovative "bio-tool" for the biogeochemical monitoring of pollution inputs that determines the acid mine drainage networks affection. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. User Guidelines for the Brassica Database: BRAD.

    PubMed

    Wang, Xiaobo; Cheng, Feng; Wang, Xiaowu

    2016-01-01

    The genome sequence of Brassica rapa was first released in 2011. Since then, further Brassica genomes have been sequenced or are undergoing sequencing. It is therefore necessary to develop tools that help users to mine information from genomic data efficiently. This will greatly aid scientific exploration and breeding application, especially for those with low levels of bioinformatic training. Therefore, the Brassica database (BRAD) was built to collect, integrate, illustrate, and visualize Brassica genomic datasets. BRAD provides useful searching and data mining tools, and facilitates the search of gene annotation datasets, syntenic or non-syntenic orthologs, and flanking regions of functional genomic elements. It also includes genome-analysis tools such as BLAST and GBrowse. One of the important aims of BRAD is to build a bridge between Brassica crop genomes with the genome of the model species Arabidopsis thaliana, thus transferring the bulk of A. thaliana gene study information for use with newly sequenced Brassica crops.

  20. Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows.

    PubMed

    Fu, Xiao; Batista-Navarro, Riza; Rak, Rafal; Ananiadou, Sophia

    2015-01-01

    Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often "hidden" within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients. A corpus of 30 full-text papers was formed based on selection criteria informed by the expertise of COPD specialists. We developed an annotation scheme that is aimed at producing fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. When evaluated using gold standard (i.e., manually validated) annotations, the semi-automatic workflow was shown to obtain a micro-averaged F-score of 45.70% (with relaxed matching). Utilising the gold standard data to train new concept recognisers, we demonstrated that our corpus, although still a work in progress, can foster the development of significantly better performing COPD phenotype extractors. We describe in this work the means by which we aim to eventually support the process of COPD phenotype curation, i.e., by the application of various text mining tools integrated into an annotation workflow. 
Although the corpus being described is still under development, our results thus far are encouraging and show great potential in stimulating the development of further automatic COPD phenotype extractors.
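
    The micro-averaged F-score reported above pools true positives, false positives and false negatives over all annotation categories before computing precision and recall. The following is a minimal sketch of that calculation, not the authors' evaluation code: the category names and counts are hypothetical, and the relaxed matching used in the paper is not modelled.

```python
from typing import Dict, Tuple


def micro_f_score(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-category (true_pos, false_pos, false_neg) counts."""
    # Pool the counts across categories first, then compute one precision/recall pair.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two made-up annotation categories.
example = {"Problem": (8, 2, 4), "Treatment": (2, 3, 1)}
print(round(micro_f_score(example), 3))
```

    Because the counts are pooled, frequent categories dominate the score; a macro average would instead weight every category equally.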

  1. Seismic tomography as a tool for measuring stress in mines

    USGS Publications Warehouse

    Scott, Douglas F.; Williams, T.J.; Denton, D.K.; Friedel, M.J.

    1999-01-01

    Spokane Research Center personnel have been investigating the use of seismic tomography to monitor the behavior of a rock mass, detect hazardous ground conditions and assess the mechanical integrity of a rock mass affected by mining. Seismic tomography can be a valuable tool for determining relative stress in deep, >1,220-m (>4,000-ft), underground pillars. If high-stress areas are detected, they can be destressed prior to development or they can be avoided. High-stress areas can be monitored with successive seismic surveys to determine if stress decreases to a level where development can be initiated safely. There are several benefits to using seismic tomography to identify high stress in deep underground pillars. The technique is reliable, cost-effective, efficient and noninvasive. Also, investigators can monitor large rock masses, as well as monitor pillars during the mining cycle. By identifying areas of high stress, engineers will be able to assure that miners are working in a safer environment.

  2. Australia's proactive approach to radiation protection of the environment: how integrated is it with radiation protection of humans?

    PubMed

    Hirth, G A; Grzechnik, M; Tinker, R; Larsson, C M

    2018-01-01

    Australia's regulatory framework has evolved over the past decade from the assumption that protection of humans implies protection of the environment to the situation now where radiological impacts on non-human species (wildlife) are considered in their own right. In an Australian context, there was a recognised need for specific national guidance on protection of non-human species, for which the uranium mining industry provides the major backdrop. National guidance supported by publications of the Australian Radiation Protection and Nuclear Safety Agency (Radiation Protection Series) provides clear and consistent advice to operators and regulators on protection of non-human species, including advice on specific assessment methods and models, and how these might be applied in an Australian context. These approaches and the supporting assessment tools provide a mechanism for industry to assess and demonstrate compliance with the environmental protection objectives of relevant legislation, and to meet stakeholder expectations that radiological protection of the environment is taken into consideration in accordance with international best practice. Experiences from the past 5-10 years, and examples of where the approach to radiation protection of the environment has been well integrated or presented some challenges will be discussed. Future challenges in addressing protection of the environment in existing exposure situations will also be discussed.

  3. What the papers say: Text mining for genomics and systems biology

    PubMed Central

    2010-01-01

    Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining -- the automated extraction of information from (electronically) published sources -- could potentially fulfil an important role -- but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward. PMID:21106487

  4. Moment tensor clustering: a tool to monitor mining induced seismicity

    NASA Astrophysics Data System (ADS)

    Cesca, Simone; Dahm, Torsten; Tolga Sen, Ali

    2013-04-01

    Automated moment tensor inversion routines have been set up in the last decades for the analysis of global and regional seismicity. Recent developments could be used to analyse smaller events and larger datasets. In particular, applications to microseismicity, e.g. in mining environments, have led to the generation of large moment tensor catalogues. Moment tensor catalogues provide valuable information about the earthquake source and details of rupturing processes taking place in the seismogenic region. Earthquake focal mechanisms can be used to discuss the local stress field, possible orientations of the fault system or to evaluate the presence of shear and/or tensile cracks. Focal mechanism and moment tensor solutions are typically analysed for selected events, and quick and robust tools for the automated analysis of larger catalogues are needed. We propose here a method to perform cluster analysis for large moment tensor catalogues and identify families of events which characterize the studied microseismicity. Clusters include events with similar focal mechanisms, first requiring the definition of distance between focal mechanisms. Different metrics are here proposed, both for the case of pure double couple, constrained moment tensor and full moment tensor catalogues. Different clustering approaches are implemented and discussed. The method is here applied to synthetic and real datasets from mining environments to demonstrate its potential: the proposed clustering techniques prove to be able to automatically recognise major clusters. An important application for mining monitoring concerns the early identification of anomalous rupture processes, which is relevant for the hazard assessment. This study is funded by the project MINE, which is part of the R&D-Programme GEOTECHNOLOGIEN. The project MINE is funded by the German Ministry of Education and Research (BMBF), Grant of project BMBF03G0737.

  5. Phase I Contaminant Transport Parameters for the Groundwater Flow and Contaminant Transport Model of Corrective Action Unit 97: Yucca Flat/Climax Mine, Nevada Test Site, Nye County, Nevada, Revision 0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    John McCord

    2007-09-01

    This report documents transport data and data analyses for Yucca Flat/Climax Mine CAU 97. The purpose of the data compilation and related analyses is to provide the primary reference to support parameterization of the Yucca Flat/Climax Mine CAU transport model. Specific task objectives were as follows: • Identify and compile currently available transport parameter data and supporting information that may be relevant to the Yucca Flat/Climax Mine CAU. • Assess the level of quality of the data and associated documentation. • Analyze the data to derive expected values and estimates of the associated uncertainty and variability. The scope of this document includes the compilation and assessment of data and information relevant to transport parameters for the Yucca Flat/Climax Mine CAU subsurface within the context of unclassified source-term contamination. Data types of interest include mineralogy, aqueous chemistry, matrix and effective porosity, dispersivity, matrix diffusion, matrix and fracture sorption, and colloid-facilitated transport parameters.

  6. A data mining based approach to predict spatiotemporal changes in satellite images

    NASA Astrophysics Data System (ADS)

    Boulila, W.; Farah, I. R.; Ettabaa, K. Saheb; Solaiman, B.; Ghézala, H. Ben

    2011-06-01

    The interpretation of remotely sensed images in a spatiotemporal context is becoming a valuable research topic. However, the constant growth of data volume in remote sensing imaging makes reaching conclusions based on collected data a challenging task. Recently, data mining appears to be a promising research field leading to several interesting discoveries in various areas such as marketing, surveillance, fraud detection and scientific discovery. By integrating data mining and image interpretation techniques, accurate and relevant information (i.e. functional relation between observed parcels and a set of informational contents) can be automatically elicited. This study presents a new approach to predict spatiotemporal changes in satellite image databases. The proposed method exploits fuzzy sets and data mining concepts to build predictions and decisions for several remote sensing fields. It takes into account imperfections related to the spatiotemporal mining process in order to provide more accurate and reliable information about land cover changes in satellite images. The proposed approach is validated using SPOT images representing the Saint-Denis region, capital of Reunion Island. Results show good performances of the proposed framework in predicting change for the urban zone.

  7. Cooperative organic mine avoidance path planning

    NASA Astrophysics Data System (ADS)

    McCubbin, Christopher B.; Piatko, Christine D.; Peterson, Adam V.; Donnald, Creighton R.; Cohen, David

    2005-06-01

    The JHU/APL Path Planning team has developed path planning techniques to look for paths that balance the utility and risk associated with different routes through a minefield. Extending on previous years' efforts, we investigated real-world Naval mine avoidance requirements and developed a tactical decision aid (TDA) that satisfies those requirements. APL has developed new mine path planning techniques using graph based and genetic algorithms which quickly produce near-minimum risk paths for complicated fitness functions incorporating risk, path length, ship kinematics, and naval doctrine. The TDA user interface, a Java Swing application that obtains data via Corba interfaces to path planning databases, allows the operator to explore a fusion of historic and in situ mine field data, control the path planner, and display the planning results. To provide a context for the minefield data, the user interface also renders data from the Digital Nautical Chart database, a database created by the National Geospatial-Intelligence Agency containing charts of the world's ports and coastal regions. This TDA has been developed in conjunction with the COMID (Cooperative Organic Mine Defense) system. This paper presents a description of the algorithms, architecture, and application produced.

  8. The mine and the furnace: Francis Bacon, Thomas Russell, and early Stuart mining culture.

    PubMed

    Pastorino, Cesare

    2009-01-01

    Notwithstanding Francis Bacon's praise for the philosophical role of the mechanical arts, historians have often downplayed Bacon's connections with actual artisans and entrepreneurs. Addressing the specific context of mining culture, this study proposes a rather different picture. The analysis of a famous mining metaphor in The Advancement of Learning shows us how Bacon's project of reform of knowledge could find an apt correspondence in civic and entrepreneurial values of his time. Also, Bacon had interesting and so far unexplored links with the early modern English mining enterprises, like the Company of Mineral and Battery Works, of which he was a shareholder. Moreover, Bacon's notes in a private notebook, Commentarius Solutus, and records of patents of invention, allow us to start grasping Bacon's connections with the metallurgist and entrepreneur Thomas Russell. Lastly, this paper argues that, to fully understand Bacon's links with the world of Stuart technicians and entrepreneurs, it is necessary to consider a different and insufficiently studied aspect of Bacon's interests, namely his work as patents referee while a Commissioner of Suits.

  9. Rare disease diagnosis: A review of web search, social media and large-scale data-mining approaches.

    PubMed

    Svenstrup, Dan; Jørgensen, Henrik L; Winther, Ole

    2015-01-01

    Physicians and the general public are increasingly using web-based tools to find answers to medical questions. The field of rare diseases is especially challenging and important as shown by the long delay and many mistakes associated with diagnoses. In this paper we review recent initiatives on the use of web search, social media and data mining in data repositories for medical diagnosis. We compare the retrieval accuracy on 56 rare disease cases with known diagnosis for the web search tools google.com, pubmed.gov, omim.org and our own search tool findzebra.com. We give a detailed description of IBM's Watson system and make a rough comparison between findzebra.com and Watson on subsets of the Doctor's dilemma dataset. The recall@10 and recall@20 (fraction of cases where the correct result appears in top 10 and top 20) for the 56 cases are found to be 29%, 16%, 27% and 59% and 32%, 18%, 34% and 64%, respectively. Thus, FindZebra has a significantly (p < 0.01) higher recall than the other 3 search engines. When tested under the same conditions, Watson and FindZebra showed similar recall@10 accuracy. However, the tests were performed on different subsets of Doctor's dilemma questions. Advances in technology and access to high quality data have opened new possibilities for aiding the diagnostic process. Specialized search engines, data mining tools and social media are some of the areas that hold promise.
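
    The recall@k measure defined in the abstract above (the fraction of cases whose correct diagnosis appears among the top k results) can be sketched as follows. This is an illustrative sketch only: the function name, the ranked lists and the diagnosis labels are hypothetical and are not taken from the paper's data.

```python
from typing import Sequence


def recall_at_k(ranked_results: Sequence[Sequence[str]],
                correct: Sequence[str],
                k: int) -> float:
    """Fraction of cases whose correct answer is among the top-k ranked results."""
    hits = sum(
        1
        for results, answer in zip(ranked_results, correct)
        if answer in results[:k]
    )
    return hits / len(correct)


# Hypothetical ranked outputs of a search tool for four cases.
ranked = [
    ["a", "b", "c"],
    ["x", "y", "z"],
    ["p", "q", "r"],
    ["m", "n", "o"],
]
# Hypothetical known diagnoses for the same four cases.
truth = ["b", "z", "s", "m"]

for k in (1, 2, 3):
    print(k, recall_at_k(ranked, truth, k))
```

    Recall@k can only stay equal or grow as k increases, which matches the pattern of the recall@10 and recall@20 figures quoted in the abstract.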

  10. Rare disease diagnosis: A review of web search, social media and large-scale data-mining approaches

    PubMed Central

    Svenstrup, Dan; Jørgensen, Henrik L; Winther, Ole

    2015-01-01

    Physicians and the general public are increasingly using web-based tools to find answers to medical questions. The field of rare diseases is especially challenging and important as shown by the long delay and many mistakes associated with diagnoses. In this paper we review recent initiatives on the use of web search, social media and data mining in data repositories for medical diagnosis. We compare the retrieval accuracy on 56 rare disease cases with known diagnosis for the web search tools google.com, pubmed.gov, omim.org and our own search tool findzebra.com. We give a detailed description of IBM's Watson system and make a rough comparison between findzebra.com and Watson on subsets of the Doctor's dilemma dataset. The recall@10 and recall@20 (fraction of cases where the correct result appears in top 10 and top 20) for the 56 cases are found to be 29%, 16%, 27% and 59% and 32%, 18%, 34% and 64%, respectively. Thus, FindZebra has a significantly (p < 0.01) higher recall than the other 3 search engines. When tested under the same conditions, Watson and FindZebra showed similar recall@10 accuracy. However, the tests were performed on different subsets of Doctor's dilemma questions. Advances in technology and access to high quality data have opened new possibilities for aiding the diagnostic process. Specialized search engines, data mining tools and social media are some of the areas that hold promise. PMID:26442199

  11. CrosstalkNet: A Visualization Tool for Differential Co-expression Networks and Communities.

    PubMed

    Manem, Venkata; Adam, George Alexandru; Gruosso, Tina; Gigoux, Mathieu; Bertos, Nicholas; Park, Morag; Haibe-Kains, Benjamin

    2018-04-15

    Variations in physiological conditions can rewire molecular interactions between biological compartments, which can yield novel insights into gain or loss of interactions specific to perturbations of interest. Networks are a promising tool to elucidate intercellular interactions, yet exploration of these large-scale networks remains a challenge due to their high dimensionality. To retrieve and mine interactions, we developed CrosstalkNet, a user-friendly, web-based network visualization tool that provides a statistical framework to infer condition-specific interactions coupled with a community detection algorithm for bipartite graphs to identify significantly dense subnetworks. As a case study, we used CrosstalkNet to mine a set of 54 and 22 gene-expression profiles from breast tumor and normal samples, respectively, with epithelial and stromal compartments extracted via laser microdissection. We show how CrosstalkNet can be used to explore large-scale co-expression networks and to obtain insights into the biological processes that govern cross-talk between different tumor compartments. Significance: This web application enables researchers to mine complex networks and to decipher novel biological processes in tumor epithelial-stroma cross-talk as well as in other studies of intercompartmental interactions. Cancer Res; 78(8); 2140-3. ©2018 American Association for Cancer Research.

  12. Network-Centric Data Mining for Medical Applications

    ERIC Educational Resources Information Center

    Davis, Darcy A.

    2012-01-01

    Faced with unsustainable costs and enormous amounts of under-utilized data, health care needs more efficient practices, research, and tools to harness the benefits of data. These methods create a feedback loop where computational tools guide and facilitate research, leading to improved biological knowledge and clinical standards, which will in…

  13. Figure mining for biomedical research.

    PubMed

    Rodriguez-Esteban, Raul; Iossifov, Ivan

    2009-08-15

    Figures from biomedical articles contain valuable information difficult to reach without specialized tools. Currently, there is no search engine that can retrieve specific figure types. This study describes a retrieval method that takes advantage of principles in image understanding, text mining and optical character recognition (OCR) to retrieve figure types defined conceptually. A search engine was developed to retrieve tables and figure types to aid computational and experimental research. http://iossifovlab.cshl.edu/figurome/.

  14. Relevance of ERTS-1 to the state of Ohio

    NASA Technical Reports Server (NTRS)

    Sweet, D. C. (Principal Investigator); Wells, T. L.; Wukelic, G. E.

    1973-01-01

    The author has identified the following significant results. To date, only one significant result has been reported for the Ohio ERTS program. This result relates to the proven usefulness of ERTS-1 imagery for mapping and inventorying strip-mined areas in southeastern Ohio. ERTS provides a tool for rapidly and economically acquiring an up-to-date inventory of strip-mined lands for state planning purposes which was not previously possible.

  15. Stratification-Based Outlier Detection over the Deep Web.

    PubMed

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is to develop a new data mining method for outlier detection over deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed in this paper with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in deep web.

  16. Data Mining in the Exploration of Stressors Among NCAA Student Athletes.

    PubMed

    Hwang, Seunghyun; Choi, Youngjun

    2016-12-01

    Collegiate student athletes face psychological stressors in adjusting to campus life. This study used preexisting, nationally representative data administered by the National Collegiate Athletic Association for student athletes in 2010 to explore the conjunctive relationships among demographics, personal characteristics, social contexts, and physical condition in predicting perceived stress. The number of valid samples was 19,967 from 609 member institutions. A data mining methodology (i.e., SEARCH) was applied to model the distribution of the perceived stress. Results showed that significant stressors included the variables related to academics, physical well-being, and social contexts. Academic anxiety was the most important predictor, and its interactions with abusive coaching behavior and an inclusive team environment were shown to reduce perceived stress. Sufficient sleep was also found as a moderator in the positive relationship between perceived stress and academic anxiety. © The Author(s) 2016.

  17. Stratification-Based Outlier Detection over the Deep Web

    PubMed Central

    Xian, Xuefeng; Zhao, Pengpeng; Sheng, Victor S.; Fang, Ligang; Gu, Caidong; Yang, Yuanfeng; Cui, Zhiming

    2016-01-01

    For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection never considers the context of deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over deep web. In the context of deep web, users must submit queries through a query interface to retrieve corresponding data. Therefore, traditional data mining methods cannot be directly applied. The primary contribution of this paper is to develop a new data mining method for outlier detection over deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed in this paper with the goal of improving recall and precision based on stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in deep web. PMID:27313603

  18. Solutions for Mining Distributed Scientific Data

    NASA Astrophysics Data System (ADS)

    Lynnes, C.; Pham, L.; Graves, S.; Ramachandran, R.; Maskey, M.; Keiser, K.

    2007-12-01

    Researchers at the University of Alabama in Huntsville (UAH) and the Goddard Earth Sciences Data and Information Services Center (GES DISC) are working on approaches and methodologies facilitating the analysis of large amounts of distributed scientific data. Despite the existence of full-featured analysis tools, such as the Algorithm Development and Mining (ADaM) toolkit from UAH, and data repositories, such as the GES DISC, that provide online access to large amounts of data, there remain obstacles to getting the analysis tools and the data together in a workable environment. Does one bring the data to the tools or deploy the tools close to the data? The large size of many current Earth science datasets incurs significant overhead in network transfer for analysis workflows, even with the advanced networking capabilities that are available between many educational and government facilities. The UAH and GES DISC team are developing a capability to define analysis workflows using distributed services and online data resources. We are developing two solutions for this problem that address different analysis scenarios. The first is a Data Center Deployment of the analysis services for large data selections, orchestrated by a remotely defined analysis workflow. The second is a Data Mining Center approach of providing a cohesive analysis solution for smaller subsets of data. The two approaches can be complementary and thus provide flexibility for researchers to exploit the best solution for their data requirements. The Data Center Deployment of the analysis services has been implemented by deploying ADaM web services at the GES DISC so they can access the data directly, without the need of network transfers. Using the Mining Workflow Composer, a user can define an analysis workflow that is then submitted through a Web Services interface to the GES DISC for execution by a processing engine. 
The workflow definition is composed, maintained and executed at a distributed location, but most of the actual services comprising the workflow are available local to the GES DISC data repository. Additional refinements will ultimately provide a package that is easily implemented and configured at additional data centers for analysis of additional science data sets. Enhancements to the ADaM toolkit allow the staging of distributed data wherever the services are deployed, to support a Data Mining Center that can provide additional computational resources, large storage of output, easier addition and updates to available services, and access to data from multiple repositories. The Data Mining Center case provides researchers more flexibility to quickly try different workflow configurations and refine the process, using smaller amounts of data that may likely be transferred from distributed online repositories. This environment is sufficient for some analyses, but can also be used as an initial sandbox to test and refine a solution before staging the execution at a Data Center Deployment. Detection of airborne dust both over water and land in MODIS imagery using mining services for both solutions will be presented. The dust detection is just one possible example of the mining and analysis capabilities the proposed mining services solutions will provide to the science community. More information about the available services and the current status of this project is available at http://www.itsc.uah.edu/mws/

  19. Data Mining Methods for Recommender Systems

    NASA Astrophysics Data System (ADS)

    Amatriain, Xavier; Jaimes*, Alejandro; Oliver, Nuria; Pujol, Josep M.

    In this chapter, we give an overview of the main Data Mining techniques used in the context of Recommender Systems. We first describe common preprocessing methods such as sampling or dimensionality reduction. Next, we review the most important classification techniques, including Bayesian Networks and Support Vector Machines. We describe the k-means clustering algorithm and discuss several alternatives. We also present association rules and related algorithms for an efficient training process. In addition to introducing these techniques, we survey their uses in Recommender Systems and present cases where they have been successfully applied.
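
    As an illustration of the k-means clustering mentioned above, the following is a minimal sketch of Lloyd's algorithm in plain Python. It is not taken from the chapter: the function name `kmeans`, the toy `ratings` vectors, and the choice of squared Euclidean distance are all assumptions made for this example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster equal-length numeric tuples with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignment


# Hypothetical user rating vectors (two items rated by six users).
ratings = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
centroids, labels = kmeans(ratings, k=2)
print(labels)
```

    In a recommender setting, users falling in the same cluster could then be treated as a neighbourhood for generating recommendations; the number of clusters `k` has to be chosen by the analyst.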

  20. Understanding Information Uncertainty within the Context of a Net-Centric Data Model: A Mine Warfare Example

    DTIC Science & Technology

    2008-06-01

    key assumption in the calculation of the primary MIW MOEs of the estimated risk to a transitor and the expected time required to clear all of the mines...primary MOE of Risk, or Probability of Damage to a Ship Transitor, is calculated by using information in the highlighted circle on the left, to include...percent clearance achieved. $E(R) = \sum_{r=0}^{\infty} r \ast \Pr(r \mid m, p)$ (0.2) Risk can be calculated for each transitor given the expected number of

  1. Radioecological impacts of tin mining.

    PubMed

    Aliyu, Abubakar Sadiq; Mousseau, Timothy Alexander; Ramli, Ahmad Termizi; Bununu, Yakubu Aliyu

    2015-12-01

    The tin mining activities in the suburbs of Jos, Plateau State, Nigeria, have resulted in technical enhancement of the natural background radiation as well as higher activity concentrations of primordial radionuclides in the topsoil of mining sites and their environs. Several studies have considered the radiological human health risks of the mining activity; however, to our knowledge no documented study has investigated the radiological impacts on biota. Hence, an attempt is made to assess potential hazards using published data from the literature and the ERICA Tool. This paper considers the effects of mining and milling on terrestrial organisms like shrubs, large mammals, small burrowing mammals, birds (duck), arthropods (earth worm), grasses, and herbs. The dose rates and risk quotients to these organisms are computed using conservative values for activity concentrations of natural radionuclides reported in Bitsichi and Bukuru mining areas. The results suggest that grasses, herbs, lichens, bryophytes and shrubs receive total dose rates that are of potential concern. The effects of dose rates to specific indicator species of interest are highlighted and discussed. We conclude that further investigation and proper regulations should be set in place in order to reduce the risk posed by the tin mining activity on biota. This paper also presents a brief overview of the impact of mineral mining on biota based on documented literature for other countries.

  2. Breast Imaging in the Era of Big Data: Structured Reporting and Data Mining.

    PubMed

    Margolies, Laurie R; Pandey, Gaurav; Horowitz, Eliot R; Mendelson, David S

    2016-02-01

    The purpose of this article is to describe structured reporting and the development of large databases for use in data mining in breast imaging. The results of millions of breast imaging examinations are reported with structured tools based on the BI-RADS lexicon. Much of these data are stored in accessible media. Robust computing power creates great opportunity for data scientists and breast imagers to collaborate to improve breast cancer detection and optimize screening algorithms. Data mining can create knowledge, but the questions asked and their complexity require extremely powerful and agile databases. New data technologies can facilitate outcomes research and precision medicine.

  3. QTLTableMiner++: semantic mining of QTL tables in scientific articles.

    PubMed

    Singh, Gurnoor; Kuzniar, Arnold; van Mulligen, Erik M; Gavai, Anand; Bachem, Christian W; Visser, Richard G F; Finkers, Richard

    2018-05-25

    A quantitative trait locus (QTL) is a genomic region that correlates with a phenotype. Most of the experimental information about QTL mapping studies is described in tables of scientific publications. Traditional text mining techniques aim to extract information from unstructured text rather than from tables. We present QTLTableMiner++ (QTM), a table mining tool that extracts and semantically annotates QTL information buried in (heterogeneous) tables of plant science literature. QTM is a command line tool written in the Java programming language. This tool takes scientific articles from the Europe PMC repository as input and extracts QTL tables using keyword matching and ontology-based concept identification. The tables are further normalized using rules derived from table properties such as captions, column headers and table footers. Furthermore, table columns are classified into three categories, namely column descriptors, properties and values, based on column headers and data types of cell entries. Abbreviations found in the tables are expanded using the Schwartz and Hearst algorithm. Finally, the content of QTL tables is semantically enriched with domain-specific ontologies (e.g. Crop Ontology, Plant Ontology and Trait Ontology) using the Apache Solr search platform, and the results are stored in a relational database and a text file. The performance of the QTM tool was assessed by precision and recall based on the information retrieved from two manually annotated corpora of open access articles, i.e. QTL mapping studies in tomato (Solanum lycopersicum) and in potato (S. tuberosum). In summary, QTM detected QTL statements in tomato with 74.53% precision and 92.56% recall and in potato with 82.82% precision and 98.94% recall. QTM is a unique tool that aids in providing QTL information in machine-readable and semantically interoperable formats.
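
    The precision and recall figures quoted above compare extracted statements with a manually annotated gold standard. A minimal sketch of that computation follows; the function name and the toy (trait, chromosome) tuples are hypothetical and are not drawn from the QTM corpora.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of hashable items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (trait, chromosome) statements, for illustration only.
gold_statements = {("fruit weight", "chr2"), ("fruit shape", "chr8"), ("yield", "chr5")}
extracted = {("fruit weight", "chr2"), ("fruit shape", "chr8"),
             ("yield", "chr5"), ("yield", "chr9")}

p, r = precision_recall(extracted, gold_statements)
print(f"precision={p:.2f} recall={r:.2f}")
```

    A tool that extracts generously, as in this toy case, tends to show high recall at the cost of some precision, which is the same pattern as in the reported tomato and potato results.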

  4. An interactive web application for the dissemination of human systems immunology data.

    PubMed

    Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien

    2015-06-19

    Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State of the art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. This resource is publicly available online at the Gene Expression Browser landing page (https://gxb.benaroyaresearch.org/dm3/landing.gsp). The source code is also openly available (https://github.com/BenaroyaResearch/gxbrowser). We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.

  5. Trace metal depositional patterns from an open pit mining activity as revealed by archived avian gizzard contents.

    PubMed

    Bendell, L I

    2011-02-15

    Archived samples of blue grouse (Dendragapus obscurus) gizzard contents, inclusive of grit, collected yearly between 1959 and 1970 were analyzed for cadmium, lead, zinc, and copper content. Approximately halfway through the 12-year sampling period, an open-pit copper mine began activities, then ceased operations 2 years later. Thus the archived samples provided a unique opportunity to determine if avian gizzard contents, inclusive of grit, could reveal patterns in the anthropogenic deposition of trace metals associated with mining activities. Gizzard concentrations of cadmium and copper strongly coincided with the opening and the closing of the pit mining activity. Gizzard zinc and lead demonstrated significant among-year variation; however, maximum concentrations did not correlate with mining activity. The archived gizzard contents did provide a useful tool for documenting trends in metal depositional patterns related to an anthropogenic activity. Further, blue grouse ingesting grit particles during the period of active mining would have been exposed to toxicologically significant levels of cadmium. Gizzard lead concentrations were also of toxicological significance but not related to mining activity. This type of "pulse" toxic metal exposure as a consequence of open-pit mining activity would not necessarily have been revealed through a "snap-shot" of soil, plant or avian tissue trace metal analysis post-mining activity. Copyright © 2010 Elsevier B.V. All rights reserved.

  6. Assessing the Environmental and Socio-Economic Impacts of Artisanal Gold Mining on the Livelihoods of Communities in the Tarkwa Nsuaem Municipality in Ghana.

    PubMed

    Obiri, Samuel; Mattah, Precious A D; Mattah, Memuna M; Armah, Frederick A; Osae, Shiloh; Adu-kumi, Sam; Yeboah, Philip O

    2016-01-26

    Gold mining has played an important role in Ghana's economy; however, the negative environmental and socio-economic effects on the host communities associated with gold mining have overshadowed these economic gains. It is within this context that this paper assessed in an integrated manner the environmental and socio-economic impacts of artisanal gold mining in the Tarkwa Nsuaem Municipality from a natural and social science perspective. The natural science group collected 200 random samples on a bi-weekly basis from January to October 2013 from water bodies in the study area for analysis in line with methods outlined by the American Water Works Association, while the social science team interviewed 250 randomly selected residents on socio-economic issues associated with mining. Data from the socio-economic survey were analyzed using logistic regression with SPSS version 17. The results of the natural science investigation revealed that the levels of heavy metals in water samples from the study area in most cases exceeded GS 175-1/WHO permissible guideline values, which is consistent with the results of the survey of inhabitants' perceptions of water quality (83% of the respondents are of the view that water bodies in the study area are polluted). This calls for cost-benefit analysis of mining before new mining leases are granted by the relevant authorities.

  7. Assessing the Environmental and Socio-Economic Impacts of Artisanal Gold Mining on the Livelihoods of Communities in the Tarkwa Nsuaem Municipality in Ghana

    PubMed Central

    Obiri, Samuel; Mattah, Precious A. D.; Mattah, Memuna M.; Armah, Frederick A.; Osae, Shiloh; Adu-kumi, Sam; Yeboah, Philip O.

    2016-01-01

    Gold mining has played an important role in Ghana’s economy; however, the negative environmental and socio-economic effects on the host communities associated with gold mining have overshadowed these economic gains. It is within this context that this paper assessed in an integrated manner the environmental and socio-economic impacts of artisanal gold mining in the Tarkwa Nsuaem Municipality from a natural and social science perspective. The natural science group collected 200 random samples on a bi-weekly basis from January to October 2013 from water bodies in the study area for analysis in line with methods outlined by the American Water Works Association, while the social science team interviewed 250 randomly selected residents on socio-economic issues associated with mining. Data from the socio-economic survey were analyzed using logistic regression with SPSS version 17. The results of the natural science investigation revealed that the levels of heavy metals in water samples from the study area in most cases exceeded GS 175-1/WHO permissible guideline values, which is consistent with the results of the survey of inhabitants’ perceptions of water quality (83% of the respondents are of the view that water bodies in the study area are polluted). This calls for cost-benefit analysis of mining before new mining leases are granted by the relevant authorities. PMID:26821039

  8. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways

    PubMed Central

    Koumakis, Lefteris; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Vassou, Despoina; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-01-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, termed gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach.
In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers’ exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes. PMID:27832067

  9. MinePath: Mining for Phenotype Differential Sub-paths in Molecular Pathways.

    PubMed

    Koumakis, Lefteris; Kanterakis, Alexandros; Kartsaki, Evgenia; Chatzimina, Maria; Zervakis, Michalis; Tsiknakis, Manolis; Vassou, Despoina; Kafetzopoulos, Dimitris; Marias, Kostas; Moustakis, Vassilis; Potamias, George

    2016-11-01

    Pathway analysis methodologies couple traditional gene expression analysis with knowledge encoded in established molecular pathway networks, offering a promising approach towards the biological interpretation of phenotype differentiating genes. Early pathway analysis methodologies, termed gene set analysis (GSA), view pathways just as plain lists of genes without taking into account either the underlying pathway network topology or the involved gene regulatory relations. These approaches, even if they achieve computational efficiency and simplicity, consider pathways that involve the same genes as equivalent in terms of their gene enrichment characteristics. Most recent pathway analysis approaches take into account the underlying gene regulatory relations by examining their consistency with gene expression profiles and computing a score for each profile. Even with this approach, assessing and scoring single-relations limits the ability to reveal key gene regulation mechanisms hidden in longer pathway sub-paths. We introduce MinePath, a pathway analysis methodology that addresses and overcomes the aforementioned problems. MinePath facilitates the decomposition of pathways into their constituent sub-paths. Decomposition leads to the transformation of single-relations to complex regulation sub-paths. Regulation sub-paths are then matched with gene expression sample profiles in order to evaluate their functional status and to assess phenotype differential power. Assessment of differential power supports the identification of the most discriminant profiles. In addition, MinePath assesses the significance of the pathways as a whole, ranking them by their p-values. Comparison results with state-of-the-art pathway analysis systems are indicative of the soundness and reliability of the MinePath approach.
In contrast with many pathway analysis tools, MinePath is a web-based system (www.minepath.org) offering dynamic and rich pathway visualization functionality, with the unique characteristic to color regulatory relations between genes and reveal their phenotype inclination. This unique characteristic makes MinePath a valuable tool for in silico molecular biology experimentation as it serves the biomedical researchers' exploratory needs to reveal and interpret the regulatory mechanisms that underlie and putatively govern the expression of target phenotypes.
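
    The decomposition and matching steps described above can be sketched roughly as follows. This is an illustrative approximation only, not the MinePath implementation: the edge representation, the `max_len` cut-off, the binarised expression profile, and the rule used to call a sub-path "functional" (the source gene is on; an activated target is on; an inhibited target is off) are all assumptions made for this example.

```python
def sub_paths(edges, max_len=3):
    """Enumerate directed simple sub-paths with at most max_len relations.

    edges: list of (source_gene, relation, target_gene) tuples.
    """
    by_source = {}
    for edge in edges:
        by_source.setdefault(edge[0], []).append(edge)

    paths = []

    def extend(path, visited):
        paths.append(list(path))
        if len(path) == max_len:
            return
        last_target = path[-1][2]
        for edge in by_source.get(last_target, []):
            if edge[2] not in visited:  # avoid revisiting genes (no cycles)
                extend(path + [edge], visited | {edge[2]})

    for edge in edges:
        extend([edge], {edge[0], edge[2]})
    return paths


def is_functional(path, sample):
    """Assumed rule: every relation on the path must hold in the sample."""
    for source, relation, target in path:
        if not sample[source]:
            return False
        if relation == "activation" and not sample[target]:
            return False
        if relation == "inhibition" and sample[target]:
            return False
    return True


# Hypothetical toy pathway and one binarised expression profile (1 = expressed).
pathway = [("A", "activation", "B"), ("B", "inhibition", "C"), ("A", "activation", "D")]
profile = {"A": 1, "B": 1, "C": 0, "D": 0}

for path in sub_paths(pathway):
    print(path, is_functional(path, profile))
```

    Counting, per phenotype class, how often each sub-path is functional would give one simple way to rank sub-paths by how well they separate the classes.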

  10. The use of data mining by private health insurance companies and customers' privacy.

    PubMed

    Al-Saggaf, Yeslam

    2015-07-01

    This article examines privacy threats arising from the use of data mining by private Australian health insurance companies. Qualitative interviews were conducted with key experts, and Australian governmental and nongovernmental websites relevant to private health insurance were searched. Using Rationale, a critical thinking tool, the themes and considerations elicited through this empirical approach were developed into an argument about the use of data mining by private health insurance companies. The argument is followed by an ethical analysis guided by classical philosophical theories: utilitarianism, Mill's harm principle, Kant's deontological theory, and Helen Nissenbaum's contextual integrity framework. Both the argument and the ethical analysis find the use of data mining by private health insurance companies in Australia to be unethical. Although private health insurance companies in Australia cannot use data mining for risk rating to cherry-pick customers and cannot use customers' personal information for unintended purposes, this article nonetheless concludes that the secondary use of customers' personal information and the absence of customers' consent still suggest that the use of data mining by private health insurance companies is wrong.

  11. Statistical methods of estimating mining costs

    USGS Publications Warehouse

    Long, K.R.

    2011-01-01

    Until it was defunded in 1995, the U.S. Bureau of Mines maintained a Cost Estimating System (CES) for prefeasibility-type economic evaluations of mineral deposits and estimating costs at producing and non-producing mines. This system had a significant role in mineral resource assessments to estimate costs of developing and operating known mineral deposits and predicted undiscovered deposits. For legal reasons, the U.S. Geological Survey cannot update and maintain CES. Instead, statistical tools are under development to estimate mining costs from basic properties of mineral deposits such as tonnage, grade, mineralogy, depth, strip ratio, distance from infrastructure, rock strength, and work index. The first step was to reestimate "Taylor's Rule" which relates operating rate to available ore tonnage. The second step was to estimate statistical models of capital and operating costs for open pit porphyry copper mines with flotation concentrators. For a sample of 27 proposed porphyry copper projects, capital costs can be estimated from three variables: mineral processing rate, strip ratio, and distance from nearest railroad before mine construction began. Of all the variables tested, operating costs were found to be significantly correlated only with strip ratio.
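
    A relation of the kind "Taylor's Rule" describes, with operating rate as a power function of ore tonnage, can be estimated by ordinary least squares on log-transformed values. The sketch below shows only the fitting step. The tonnage values are synthetic, and the coefficient 0.5 and exponent 0.75 used to generate them are arbitrary illustration values, not the re-estimated parameters from the report.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by least squares on (log x, log y); return (a, b)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    b = (sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
         / sum((lx - mean_x) ** 2 for lx in log_x))
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic deposits: tonnage in tonnes, rate generated from an assumed power law.
tonnage = [1e6, 5e6, 2e7, 1e8]
rate = [0.5 * t ** 0.75 for t in tonnage]

a, b = fit_power_law(tonnage, rate)
print(f"rate = {a:.3f} * tonnage^{b:.3f}")
```

    With real project data the points would scatter around the fitted line, and the residuals would indicate how much of the variation in operating rate is left for other variables to explain.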

  12. Towards novel organic high-Tc superconductors: Data mining using density of states similarity search

    NASA Astrophysics Data System (ADS)

    Geilhufe, R. Matthias; Borysov, Stanislav S.; Kalpakchi, Dmytro; Balatsky, Alexander V.

    2018-02-01

    Identifying novel functional materials with desired key properties is an important part of bridging the gap between fundamental research and technological advancement. In this context, high-throughput calculations combined with data-mining techniques have greatly accelerated this process in different areas of research in recent years. The strength of a data-driven approach for materials prediction lies in narrowing down the search space of thousands of materials to a subset of prospective candidates. Recently, the open-access organic materials database OMDB was released, providing electronic structure data for thousands of previously synthesized three-dimensional organic crystals. Based on the OMDB, we report on the implementation of a novel density of states similarity search tool which is capable of retrieving materials with similar density of states to a reference material. The tool is based on the approximate nearest neighbor algorithm as implemented in the ANNOY library and can be applied via the OMDB web interface. The approach presented here is wide ranging and can be applied to various problems where the density of states is responsible for certain key properties of a material. As the first application, we report on materials exhibiting electronic structure similarities to the aromatic hydrocarbon p-terphenyl, which was recently discussed as a potential organic high-temperature superconductor exhibiting a transition temperature on the order of 120 K under strong potassium doping. Although the mechanism driving the remarkable transition temperature remains under debate, we argue that the density of states, reflecting the electronic structure of a material, might serve as a crucial ingredient for the observed high Tc.
To provide candidates which might exhibit comparable properties, we present 15 purely organic materials with similar features to p-terphenyl within the electronic structure, which also tend to have structural similarities with p-terphenyl such as space group symmetries, chemical composition, and molecular structure. The experimental verification of these candidates might lead to a better understanding of the underlying mechanism in case similar superconducting properties are revealed.
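
    The idea of a density-of-states similarity search can be sketched with an exact, brute-force nearest-neighbour ranking. This is a simplification: the tool described above uses approximate nearest-neighbour search from the ANNOY library, whereas the code below compares every entry directly. The cosine metric, the fixed four-point energy grid, and the material names are assumptions made for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(reference, database, top_n=2):
    """Return the names of the top_n entries most similar to the reference."""
    ranked = sorted(database.items(),
                    key=lambda item: cosine_similarity(reference, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_n]]


# Hypothetical DOS values sampled on a common energy grid.
reference_dos = [0.1, 0.8, 0.3, 0.0]
dos_database = {
    "material_A": [0.1, 0.7, 0.4, 0.0],
    "material_B": [0.9, 0.1, 0.0, 0.2],
    "material_C": [0.2, 0.9, 0.2, 0.1],
}

print(most_similar(reference_dos, dos_database))
```

    Brute force scales linearly with the number of materials per query; an approximate index trades a small loss in accuracy for much faster look-ups on databases with thousands of entries.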

  13. Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research.

    PubMed

    Eppig, Janan T; Smith, Cynthia L; Blake, Judith A; Ringwald, Martin; Kadin, James A; Richardson, Joel E; Bult, Carol J

    2017-01-01

    The Mouse Genome Informatics (MGI) resource (www.informatics.jax.org) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.

  14. The Prospects of Accounting at Mining Enterprises as a Factor of Ensuring their Sustainable Development

    NASA Astrophysics Data System (ADS)

    Tyuleneva, Tatiana

    2017-11-01

    One of the problems of sustainable development of mining companies is attracting additional investment. Solving it requires access to international capital markets; in this context, enterprises need to prepare financial statements that meet international requirements, based on the data generated by the accounting system. The article considers the basic problems of accounting in the extractive industries that arise from the nature of the industry, and evaluates how completely they are addressed within the framework of international financial reporting standards. In addition, it lists the characteristics of accounting for the mining industry, arising from the peculiarities of the production process, that need to be considered to solve these problems. This sector is extremely important for individual countries and on a global scale.

  15. Open data mining for Taiwan's dengue epidemic.

    PubMed

    Wu, ChienHsing; Kao, Shu-Chen; Shih, Chia-Hung; Kan, Meng-Hsuan

    2018-07-01

    By using a quantitative approach, this study examines the applicability of data mining techniques to discover knowledge from open data related to Taiwan's dengue epidemic. We compare results when Google Trends data are included or excluded. Data sources are government open data, climate data, and Google Trends data. Research findings from analysis of 70,914 cases are obtained. Location and time (month) in open data show the highest classification power followed by climate variables (temperature and humidity), whereas gender and age show the lowest values. Both prediction accuracy and simplicity decrease when Google Trends data are considered (respectively 0.94 and 0.37, compared to 0.96 and 0.46). The article demonstrates the value of open data mining in the context of public health care. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. An Expertise Recommender using Web Mining

    NASA Technical Reports Server (NTRS)

    Joshi, Anupam; Chandrasekaran, Purnima; ShuYang, Michelle; Ramakrishnan, Ramya

    2001-01-01

    This report explored techniques to mine web pages of scientists to extract information regarding their expertise, build expertise chains and referral webs, and semi-automatically combine this information with directory information services to create a recommender system that permits query by expertise. The approach included experimenting with existing techniques that have been reported in the research literature in the recent past and adapting them as needed. In addition, software tools were developed to capture and use this information.

  17. Translations on North Korea No. 622

    DTIC Science & Technology

    1978-10-13

    Pyongyang Power Station 5 July Electric Factory Hamhung Machine Tool Factory Kosan Plastic Pipe Factory Sog’wangea Plastic Pipe Factory 8...August Factory Double Chollima Hamhung Disabled Veterans’ Plastic Goods Factory Mangyongdae Machine Tool Factory Kangso Coal Mine Tongdaewon Garment...21 Jul 78 p 4) innovating in machine tool production (NC 21 Jul 78 p 2) in 40 days of the 蔴 days of combat" raised coal production 10 percent

  18. ChemicalTagger: A tool for semantic text-mining in chemistry

    PubMed Central

    2011-01-01

    Background: The primary method for scientific communication is in the form of published scientific articles and theses which use natural language combined with domain-specific terminology. As such, they contain free-flowing unstructured text. Given the usefulness of data extraction from unstructured literature, we aim to show how this can be achieved for the discipline of chemistry. The highly formulaic style of writing most chemists adopt makes their contributions well suited to high-throughput Natural Language Processing (NLP) approaches. Results: We have developed the ChemicalTagger parser as a medium-depth, phrase-based semantic NLP tool for the language of chemical experiments. Tagging is based on a modular architecture and uses a combination of OSCAR, domain-specific regex and English taggers to identify parts-of-speech. The ANTLR grammar is used to structure this into tree-based phrases. Using a metric that allows for overlapping annotations, we achieved machine-annotator agreements of 88.9% for phrase recognition and 91.9% for phrase-type identification (Action names). Conclusions: It is possible to parse chemical experimental text using rule-based techniques in conjunction with a formal grammar parser. ChemicalTagger has been deployed for over 10,000 patents and has identified solvents from their linguistic context with >99.5% precision. PMID:21575201

  19. Evaluating Handheld X-Ray Fluorescence (XRF) Technology in Planetary Exploration: Demonstrating Instrument Stability and Understanding Analytical Constraints and Limits for Basaltic Rocks

    NASA Technical Reports Server (NTRS)

    Young, K. E.; Hodges, K. V.; Evans, C. A.

    2012-01-01

    While large-footprint X-ray fluorescence (XRF) instruments are reliable providers of elemental information about geologic samples, handheld XRF instruments are currently being developed that enable the collection of geochemical data in the field in short time periods (approx. 60 seconds) [1]. These detectors are lightweight (1.3 kg) and can provide elemental abundances of major rock forming elements heavier than Na. While handheld XRF detectors were originally developed for use in mining, we are working with commercially available instruments as prototypes to explore how portable XRF technology may enable planetary field science [2,3,4]. If an astronaut or robotic explorer visited another planetary surface, the ability to obtain and evaluate geochemical data in real-time would be invaluable, especially in the high-grading of samples to determine which should be returned to Earth. We present our results on the evaluation of handheld XRF technology as a geochemical tool in the context of planetary exploration.

  20. Redefining the Practice of Peer Review Through Intelligent Automation-Part 3: Automated Report Analysis and Data Reconciliation.

    PubMed

    Reiner, Bruce I

    2018-02-01

    One method for addressing existing peer review limitations is the assignment of peer review cases on a completely blinded basis, in which the peer reviewer would create an independent report which can then be cross-referenced with the primary reader report of record. By leveraging existing computerized data mining techniques, one could in theory automate and objectify the process of report data extraction, classification, and analysis, while reducing time and resource requirements intrinsic to manual peer review report analysis. Once inter-report analysis has been performed, resulting inter-report discrepancies can be presented to the radiologist of record for review, along with the option to directly communicate with the peer reviewer through an electronic data reconciliation tool aimed at collaboratively resolving inter-report discrepancies and improving report accuracy. All associated report and reconciled data could in turn be recorded in a referenceable peer review database, which provides opportunity for context and user-specific education and decision support.

  1. Biblio-MetReS for user-friendly mining of genes and biological processes in scientific documents.

    PubMed

    Usie, Anabel; Karathia, Hiren; Teixidó, Ivan; Alves, Rui; Solsona, Francesc

    2014-01-01

    One way to initiate the reconstruction of molecular circuits is by using automated text-mining techniques. Developing more efficient methods for such reconstruction is a topic of active research, and those methods are typically included by bioinformaticians in pipelines used to mine and curate large literature datasets. Nevertheless, experimental biologists have a limited number of available user-friendly tools that use text-mining for network reconstruction and require no programming skills to use. One of these tools is Biblio-MetReS. Originally, this tool permitted an on-the-fly analysis of documents contained in a number of web-based literature databases to identify co-occurrence of proteins/genes. This approach ensured results that were always up-to-date with the latest live version of the databases. However, this 'up-to-dateness' came at the cost of large execution times. Here we report an evolution of the application Biblio-MetReS that permits constructing co-occurrence networks for genes, GO processes, Pathways, or any combination of the three types of entities and graphically representing those entities. We show that the performance of Biblio-MetReS in identifying gene co-occurrence is at least as good as that of other comparable applications (STRING and iHOP). In addition, we also show that the identification of GO processes is on par with that reported in the latest BioCreAtIvE challenge. Finally, we also report the implementation of a new strategy that combines on-the-fly analysis of new documents with preprocessed information from documents that were encountered in previous analyses. This combination simultaneously decreases program run time and maintains 'up-to-dateness' of the results. http://metres.udl.cat/index.php/downloads, metres.cmb@gmail.com.
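
    The co-occurrence idea described in this record can be illustrated with a minimal Python sketch. It is not the Biblio-MetReS implementation: the function name, the example sentences, and the naive case-insensitive substring matching are all assumptions made for illustration, and a real tool would need proper entity recognition.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, entities):
    """Count, for each unordered entity pair, how many documents mention both.

    Hypothetical helper: matching is a naive case-insensitive substring test.
    """
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(e for e in entities if e.lower() in lowered)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


# Made-up example sentences, for illustration only.
docs = [
    "TP53 interacts with MDM2 during apoptosis.",
    "MDM2 regulates TP53 stability.",
    "BRCA1 is involved in DNA repair.",
]
network = cooccurrence_counts(docs, ["TP53", "MDM2", "BRCA1"])
```

    Each key of `network` is an edge of the co-occurrence network and each value is its weight, so the result can be handed to any graph-drawing library.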

  2. A web-based genomic sequence database for the Streptomycetaceae: a tool for systematics and genome mining

    USDA-ARS's Scientific Manuscript database

    The ARS Microbial Genome Sequence Database (http://199.133.98.43), a web-based database server, was established utilizing the BIGSdb (Bacterial Isolate Genomics Sequence Database) software package, developed at Oxford University, as a tool to manage multi-locus sequence data for the family Streptomy...

  3. Mining Hidden Gems Beneath the Surface: A Look At the Invisible Web.

    ERIC Educational Resources Information Center

    Carlson, Randal D.; Repman, Judi

    2002-01-01

    Describes resources for researchers called the Invisible Web that are hidden from the usual search engines and other tools and contrasts them with those resources available on the surface Web. Identifies specialized search tools, databases, and strategies that can be used to locate credible in-depth information. (Author/LRW)

  4. Harnessing scientific literature reports for pharmacovigilance. Prototype software analytical tool development and usability testing.

    PubMed

    Sorbello, Alfred; Ripple, Anna; Tonning, Joseph; Munoz, Monica; Hasan, Rashedul; Ly, Thomas; Francis, Henry; Bodenreider, Olivier

    2017-03-22

    We seek to develop a prototype software analytical tool to augment FDA regulatory reviewers' capacity to harness scientific literature reports in PubMed/MEDLINE for pharmacovigilance and adverse drug event (ADE) safety signal detection. We also aim to gather feedback through usability testing to assess design, performance, and user satisfaction with the tool. A prototype, open source, web-based, software analytical tool generated statistical disproportionality data mining signal scores and dynamic visual analytics for ADE safety signal detection and management. We leveraged Medical Subject Heading (MeSH) indexing terms assigned to published citations in PubMed/MEDLINE to generate candidate drug-adverse event pairs for quantitative data mining. Six FDA regulatory reviewers participated in usability testing by employing the tool as part of their ongoing real-life pharmacovigilance activities to provide subjective feedback on its practical impact, added value, and fitness for use. All usability test participants cited the tool's ease of learning, ease of use, and generation of quantitative ADE safety signals, some of which corresponded to known established adverse drug reactions. Potential concerns included the comparability of the tool's automated literature search relative to a manual 'all fields' PubMed search, missing drugs and adverse event terms, interpretation of signal scores, and integration with existing computer-based analytical tools. Usability testing demonstrated that this novel tool can automate the detection of ADE safety signals from published literature reports. Various mitigation strategies are described to foster improvements in design, productivity, and end user satisfaction.
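
    The record does not say which disproportionality statistic the tool computes. As one common example of such a score, the proportional reporting ratio (PRR) can be sketched in Python as follows; the choice of PRR, the function name, and the counts are assumptions for illustration and are not taken from the study.

```python
def proportional_reporting_ratio(a, b, c, d):
    """Proportional reporting ratio from a 2x2 contingency table.

    a: citations indexed with the drug and the adverse event
    b: citations indexed with the drug but not the event
    c: citations indexed with the event but not the drug
    d: citations indexed with neither
    PRR = [a / (a + b)] / [c / (c + d)]
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    return (a / (a + b)) / (c / (c + d))


# Made-up counts, for illustration only.
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9800)
```

    A PRR well above 1 suggests the event is mentioned disproportionately often alongside the drug; in practice such scores are usually combined with a minimum count and a significance test before a pair is flagged.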

  5. Recent progress in automatically extracting information from the pharmacogenomic literature

    PubMed Central

    Garten, Yael; Coulet, Adrien; Altman, Russ B

    2011-01-01

    The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206

  6. A planetary nervous system for social mining and collective awareness

    NASA Astrophysics Data System (ADS)

    Giannotti, F.; Pedreschi, D.; Pentland, A.; Lukowicz, P.; Kossmann, D.; Crowley, J.; Helbing, D.

    2012-11-01

    We present a research roadmap of a Planetary Nervous System (PNS), capable of sensing and mining the digital breadcrumbs of human activities and unveiling the knowledge hidden in the big data for addressing the big questions about social complexity. We envision the PNS as a globally distributed, self-organizing, techno-social system for answering analytical questions about the status of world-wide society, based on three pillars: social sensing, social mining and the idea of trust networks and privacy-aware social mining. We discuss the ingredients of a science and a technology necessary to build the PNS upon the three mentioned pillars, beyond the limitations of their respective state of the art. Social sensing is aimed at developing better methods for harvesting the big data from the techno-social ecosystem and making them available for mining, learning and analysis at a properly high abstraction level. Social mining is the problem of discovering patterns and models of human behaviour from the sensed data across the various social dimensions by data mining, machine learning and social network analysis. Trusted networks and privacy-aware social mining is aimed at creating a new deal around the questions of privacy and data ownership, empowering individual persons with full awareness of and control over their own personal data, so that users may allow access and use of their data for their own good and the common good. The PNS will provide a goal-oriented knowledge discovery framework, made of technology and people, able to configure itself to the aim of answering questions about the pulse of global society. Given an analytical request, the PNS activates a process composed of a variety of interconnected tasks exploiting the social sensing and mining methods within the transparent ecosystem provided by the trusted network. The PNS we foresee is the key tool for individual and collective awareness for the knowledge society. We need such a tool for everyone to become fully aware of how powerful the knowledge of our society that we can achieve by leveraging our wisdom as a crowd is, and how important it is that everybody participates both as a consumer and as a producer of the social knowledge, for it to become a trustable, accessible, safe and useful public good.

  7. RE-Powering America's Land

    EPA Pesticide Factsheets

    Describes how contaminated lands, landfills, and mine sites can be reused as renewable energy installations. Also supplies best practices, tools and resources for screening properties for renewable energy potential.

  8. [Gender inequity in health in contexts of environmental risk from mining and industrial activity in Mexico].

    PubMed

    Catalán-Vázquez, Minerva; Riojas-Rodríguez, Horacio

    2015-06-01

    Analyze how gender inequity manifests in contexts of poverty in different environmental risk scenarios in Mexico. Qualitative design based on six discussion groups and 54 in-depth interviews with women from six exposed communities: two to environmental manganese in a mining district, two in an industrial corridor, and two bordering a sanitary landfill. A document review of environmental and health studies in each area was done to relate them to the women's perspective on the problem. In the three case studies, by gender roles, women stay at home and do housework and, therefore, are subject to intense environmental exposure when carrying out their daily tasks, such as house cleaning. Interview and discussion group results were found to be related to epidemiological study results. In the case of the mining district, women's perceptions are consistent with study comments on adverse cognitive effects of manganese exposure. In all three cases, there are serious limitations on women's political participation in environmental risk management. Due to conditions of inequity, women are highly exposed to environmental health risks and their social participation in solving environmental problems is quite limited. These results have social and environmental policy implications in the areas studied, especially with regard to risk assessment, management, and communication.

  9. Application of data mining approaches to drug delivery.

    PubMed

    Ekins, Sean; Shimada, Jun; Chang, Cheng

    2006-11-30

    Computational approaches play a key role in all areas of the pharmaceutical industry from data mining, experimental and clinical data capture to pharmacoeconomics and adverse events monitoring. They will likely continue to be indispensable assets along with a growing library of software applications. This is primarily due to the increasingly massive amount of biology, chemistry and clinical data, which is now entering the public domain mainly as a result of NIH and commercially funded projects. We are therefore in need of new methods for mining this mountain of data in order to enable new hypothesis generation. The computational approaches include, but are not limited to, database compilation, quantitative structure activity relationships (QSAR), pharmacophores, network visualization models, decision trees, machine learning algorithms and multidimensional data visualization software that could be used to improve drug delivery after mining public and/or proprietary data. We will discuss some areas of unmet needs in the area of data mining for drug delivery that can be addressed with new software tools or databases of relevance to future pharmaceutical projects.

  10. Corner-cutting mining assembly

    DOEpatents

    Bradley, J.A.

    1981-07-01

    This invention resulted from a contract with the United States Department of Energy and relates to a mining tool. More particularly, the invention relates to an assembly capable of drilling a hole having a square cross-sectional shape with radiused corners. In mining operations in which conventional auger-type drills are used to form a series of parallel, cylindrical holes in a coal seam, a large amount of coal remains in place in the seam because the shape of the holes leaves thick webs between the holes. A higher percentage of coal can be mined from a seam by a means capable of drilling holes having a substantially square cross section. It is an object of this invention to provide an improved mining apparatus by means of which the amount of coal recovered from a seam deposit can be increased. Another object of the invention is to provide a drilling assembly which cuts corners in a hole having a circular cross section. These objects and other advantages are attained by a preferred embodiment of the invention.

  11. Evaluation of Documentation Patterns of Trainees and Supervising Physicians Using Data Mining.

    PubMed

    Madhavan, Ramesh; Tang, Chi; Bhattacharya, Pratik; Delly, Fadi; Basha, Maysaa M

    2014-09-01

    The electronic health record (EHR) includes a rich data set that may offer opportunities for data mining and natural language processing to answer questions about quality of care, key aspects of resident education, or attributes of the residents' learning environment. We used data obtained from the EHR to report on inpatient documentation practices of residents and attending physicians at a large academic medical center. We conducted a retrospective observational study of deidentified patient notes entered over 7 consecutive months by a multispecialty university physician group at an urban hospital. A novel automated data mining technology was used to extract patient note-related variables. A sample of 26 802 consecutive patient notes was analyzed using the data mining and modeling tool Healthcare Smartgrid. Residents entered most of the notes (33%, 8178 of 24 787) between noon and 4 pm and 31% (7718 of 24 787) of notes between 8 am and noon. Attending physicians placed notes about teaching attestations within 24 hours in only 73% (17 843 of 24 443) of the records. Surgical residents were more likely to place notes before noon (P < .001). Nonsurgical faculty were more likely to provide attestation of resident notes within 24 hours (P < .001). Data related to patient note entry was successfully used to objectively measure current work flow of resident physicians and their supervising faculty, and the findings have implications for physician oversight of residents' clinical work. We were able to demonstrate the utility of a data mining model as an assessment tool in graduate medical education.

  12. Pattern mining of user interaction logs for a post-deployment usability evaluation of a radiology PACS client.

    PubMed

    Jorritsma, Wiard; Cnossen, Fokie; Dierckx, Rudi A; Oudkerk, Matthijs; van Ooijen, Peter M A

    2016-01-01

    To perform a post-deployment usability evaluation of a radiology Picture Archiving and Communication System (PACS) client based on pattern mining of user interaction log data, and to assess the usefulness of this approach compared to a field study. All user actions performed on the PACS client were logged for four months. A data mining technique called closed sequential pattern mining was used to automatically extract frequently occurring interaction patterns from the log data. These patterns were used to identify usability issues with the PACS. The results of this evaluation were compared to the results of a field study based usability evaluation of the same PACS client. The interaction patterns revealed four usability issues: (1) the display protocols do not function properly, (2) the line measurement tool stays active until another tool is selected, rather than being deactivated after one use, (3) the PACS's built-in 3D functionality does not allow users to effectively perform certain 3D-related tasks, (4) users underuse the PACS's customization possibilities. All usability issues identified based on the log data were also found in the field study, which identified 48 issues in total. Post-deployment usability evaluation based on pattern mining of user interaction log data provides useful insights into the way users interact with the radiology PACS client. However, it reveals few usability issues compared to a field study and should therefore not be used as the sole method of usability evaluation. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
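
    To give a feel for the pattern-mining step in this record, the Python sketch below mines frequent contiguous action sequences from session logs and keeps only the closed ones, meaning those with no longer frequent pattern of equal support. This is a deliberate simplification: closed sequential pattern mining as normally defined also allows gaps between actions, and the study's own algorithm is not given here. The action names, function names, and `max_len` cap are hypothetical.

```python
from collections import Counter


def frequent_contiguous_patterns(sessions, min_support, max_len=4):
    """Return {pattern: support} for contiguous action sequences.

    Support is the number of sessions that contain the pattern at least once.
    """
    support = Counter()
    for session in sessions:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(session) - n + 1):
                seen.add(tuple(session[i:i + n]))
        support.update(seen)  # each session counts a pattern once
    return {p: s for p, s in support.items() if s >= min_support}


def is_contiguous_subsequence(short, long):
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def closed_patterns(frequent):
    """Drop patterns absorbed by a longer pattern with identical support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and sq == s and is_contiguous_subsequence(p, q)
            for q, sq in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed


# Hypothetical interaction logs, one list of actions per session.
sessions = [
    ["open", "measure", "select_tool", "measure"],
    ["open", "measure", "select_tool"],
    ["open", "zoom"],
]
patterns = closed_patterns(frequent_contiguous_patterns(sessions, min_support=2))
```

    Keeping only closed patterns shortens the list an evaluator has to read, since every sub-pattern with the same support is implied by the longer one.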

  13. Short-term scheduling of an open-pit mine with multiple objectives

    NASA Astrophysics Data System (ADS)

    Blom, Michelle; Pearce, Adrian R.; Stuckey, Peter J.

    2017-05-01

    This article presents a novel algorithm for the generation of multiple short-term production schedules for an open-pit mine, in which several objectives, of varying priority, characterize the quality of each solution. A short-term schedule selects regions of a mine site, known as 'blocks', to be extracted in each week of a planning horizon (typically spanning 13 weeks). Existing tools for constructing these schedules use greedy heuristics, with little optimization. To construct a single schedule in which infrastructure is sufficiently utilized, with production grades consistently close to a desired target, a planner must often run these heuristics many times, adjusting parameters after each iteration. A planner's intuition and experience can evaluate the relative quality and mineability of different schedules in a way that is difficult to automate. Of interest to a short-term planner is the generation of multiple schedules, extracting available ore and waste in varying sequences, which can then be manually compared. This article presents a tool in which multiple, diverse, short-term schedules are constructed, meeting a range of common objectives without the need for iterative parameter adjustment.

  14. A GIS-based approach: Influence of the ventilation layout to the environmental conditions in an underground mine.

    PubMed

    Bascompta, Marc; Castañón, Ana María; Sanmiquel, Lluís; Oliva, Josep

    2016-11-01

    Gases such as CO, CO2 or NOx are constantly generated by the equipment in any underground mine and the ventilation layout can play an important role in keeping low concentrations in the working faces. Hence, a method able to control the workplace environment is crucial. This paper proposes a geographical information system (GIS) for this goal. The system created provides the necessary tools to manage and analyse an underground environment, connecting pollutants and temperatures with the ventilation characteristics over time. Data concerning the ventilation system, in a case study, has been taken every month since 2009 and integrated into the management system, which has quantified the gas concentrations throughout the mine due to the characteristics and evolution of the ventilation layout. Three different zones concerning CO, CO2, NOx and effective temperature have been found as well as some variations among workplaces within the same zone that suggest local airflow recirculations. The system proposed could be a useful tool to improve the workplace conditions and efficiency levels. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Data-driven decision support for radiologists: re-using the National Lung Screening Trial dataset for pulmonary nodule management.

    PubMed

    Morrison, James J; Hostetter, Jason; Wang, Kenneth; Siegel, Eliot L

    2015-02-01

    Real-time mining of large research trial datasets enables development of case-based clinical decision support tools. Several applicable research datasets exist including the National Lung Screening Trial (NLST), a dataset unparalleled in size and scope for studying population-based lung cancer screening. Using these data, a clinical decision support tool was developed which matches patient demographics and lung nodule characteristics to a cohort of similar patients. The NLST dataset was converted into Structured Query Language (SQL) tables hosted on a web server, and a web-based JavaScript application was developed which performs real-time queries. JavaScript is used for both the server-side and client-side language, allowing for rapid development of a robust client interface and server-side data layer. Real-time data mining of user-specified patient cohorts achieved a rapid return of cohort cancer statistics and lung nodule distribution information. This system demonstrates the potential of individualized real-time data mining using large high-quality clinical trial datasets to drive evidence-based clinical decision-making.

  16. Textpresso site-specific recombinases: A text-mining server for the recombinase literature including Cre mice and conditional alleles.

    PubMed

    Urbanski, William M; Condie, Brian G

    2009-12-01

    Textpresso Site Specific Recombinases (http://ssrc.genetics.uga.edu/) is a text-mining web server for searching a database of more than 9,000 full-text publications. The papers and abstracts in this database represent a wide range of topics related to site-specific recombinase (SSR) research tools. Included in the database are most of the papers that report the characterization or use of mouse strains that express Cre recombinase as well as papers that describe or analyze mouse lines that carry conditional (floxed) alleles or SSR-activated transgenes/knockins. The database also includes reports describing SSR-based cloning methods such as the Gateway or the Creator systems, papers reporting the development or use of SSR-based tools in systems such as Drosophila, bacteria, parasites, stem cells, yeast, plants, zebrafish, and Xenopus as well as publications that describe the biochemistry, genetics, or molecular structure of the SSRs themselves. Textpresso Site Specific Recombinases is the only comprehensive text-mining resource available for the literature describing the biology and technical applications of SSRs. (c) 2009 Wiley-Liss, Inc.

  17. Assessing the effectiveness of sustainable land management policies for combating desertification: A data mining approach.

    PubMed

    Salvati, L; Kosmas, C; Kairis, O; Karavitis, C; Acikalin, S; Belgacem, A; Solé-Benet, A; Chaker, M; Fassouli, V; Gokceoglu, C; Gungor, H; Hessel, R; Khatteli, H; Kounalaki, A; Laouina, A; Ocakoglu, F; Ouessar, M; Ritsema, C; Sghaier, M; Sonmez, H; Taamallah, H; Tezcan, L; de Vente, J; Kelly, C; Colantoni, A; Carlucci, M

    2016-12-01

    This study investigates the relationship between fine resolution, local-scale biophysical and socioeconomic contexts within which land degradation occurs, and the human responses to it. The research draws on experimental data collected under different territorial and socioeconomic conditions at 586 field sites in five Mediterranean countries (Spain, Greece, Turkey, Tunisia and Morocco). We assess the level of desertification risk under various land management practices (terracing, grazing control, prevention of wildland fires, soil erosion control measures, soil water conservation measures, sustainable farming practices, land protection measures and financial subsidies) taken as possible responses to land degradation. A data mining approach, incorporating principal component analysis, non-parametric correlations, multiple regression and canonical analysis, was developed to identify the spatial relationship between land management conditions, the socioeconomic and environmental context (described using 40 biophysical and socioeconomic indicators) and desertification risk. Our analysis identified a number of distinct relationships between the level of desertification experienced and the underlying socioeconomic context, suggesting that the effectiveness of responses to land degradation is strictly dependent on the local biophysical and socioeconomic context. Assessing the latent relationship between land management practices and the biophysical/socioeconomic attributes characterizing areas exposed to different levels of desertification risk proved to be an indirect measure of the effectiveness of field actions contrasting land degradation. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Open reading frames associated with cancer in the dark matter of the human genome.

    PubMed

    Delgado, Ana Paula; Brandao, Pamela; Chapado, Maria Julia; Hamid, Sheilin; Narayanan, Ramaswamy

    2014-01-01

    The uncharacterized proteins (open reading frames, ORFs) in the human genome offer an opportunity to discover novel targets for cancer. A systematic analysis of the dark matter of the human proteome for druggability and biomarker discovery is crucial to mining the genome. Numerous data mining tools are available to mine these ORFs to develop a comprehensive knowledge base for future target discovery and validation. Using the Genetic Association Database, the ORFs of the human dark matter proteome were screened for evidence of association with neoplasms. The Phenome-Genome Integrator tool was used to establish phenotypic association with disease traits including cancer. Batch analysis of the tools for protein expression analysis, gene ontology and motifs and domains was used to characterize the ORFs. Sixty-two ORFs were identified for neoplasm association. The expression Quantitative Trait Loci (eQTL) analysis identified thirteen ORFs related to cancer traits. Protein expression, motifs and domain analysis and genome-wide association studies verified the relevance of these OncoORFs in diverse tumors. The OncoORFs are also associated with a wide variety of human diseases and disorders. Our results link the OncoORFs to diverse diseases and disorders. This suggests a complex landscape of the uncharacterized proteome in human diseases. These results open the dark matter of the proteome to novel cancer target research. Copyright© 2014, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.

  19. Anthropocene landscape change and the legacy of nineteenth- and twentieth-century mining in the Fourmile Catchment, Colorado Front Range

    USGS Publications Warehouse

    Dethier, David P.; Ouimet, William B.; Murphy, Sheila F.; Kotikian, Maneh; Wicherski, Will; Samuels, Rachel M.

    2018-01-01

    Human impacts on earth surface processes and materials are fundamental to understanding the proposed Anthropocene epoch. This study examines the magnitude, distribution, and long-term context of nineteenth- and twentieth-century mining in the Fourmile Creek catchment, Colorado, coupling airborne LiDAR topographic analysis with historical documents and field studies of river banks exposed by 2013 flooding. Mining impacts represent the dominant Anthropocene landscape change for this basin. Mining activity, particularly placer operations, controls floodplain stratigraphy and waste rock piles related to mining cover >5% of hillslopes in the catchment. Total rates of surface disturbance on slopes from mining activities (prospecting, mining, and road building) exceed pre-nineteenth-century rates by at least fifty times. Recent flooding and the overprint of human impacts obscure the record of Holocene floodplain evolution. Stratigraphic relations indicate that the Fourmile valley floor was as much as two meters higher in the past 2,000 years and that placer reworking, lateral erosion, or minor downcutting dominated from the late Holocene to present. Concentrations of As and Au in the fine fraction of hillslope soil, mining-related deposits, and fluvial deposits serve as a geochemical marker of mining activity in the catchment; reducing As and Au values in floodplain sediment will take hundreds of years to millennia. Overall, the Fourmile Creek catchment provides a valuable example of Anthropocene landscape change for mountainous regions of the Western United States, where hillslope and floodplain markers of human activity vary, high rates of geomorphic processes affect mixing and preservation of marker deposits, and long-term impact varies by landscape location.

  20. Sustainability of uranium mining and milling: toward quantifying resources and eco-efficiency.

    PubMed

    Mudd, Gavin M; Diesendorf, Mark

    2008-04-01

    The mining of uranium has long been a controversial public issue, and a renewed debate has emerged on the potential for nuclear power to help mitigate against climate change. The central thesis of pro-nuclear advocates is the lower carbon intensity of nuclear energy compared to fossil fuels, although there remains very little detailed analysis of the true carbon costs of nuclear energy. In this paper, we compile and analyze a range of data on uranium mining and milling, including uranium resources as well as sustainability metrics such as energy and water consumption and carbon emissions with respect to uranium production-arguably the first time for modern projects. The extent of economically recoverable uranium resources is clearly linked to exploration, technology, and economics but also inextricably to environmental costs such as energy/water/chemicals consumption, greenhouse gas emissions, and social issues. Overall, the data clearly show the sensitivity of sustainability assessments to the ore grade of the uranium deposit being mined and that significant gaps remain in complete sustainability reporting and accounting. This paper is a case study of the energy, water, and carbon costs of uranium mining and milling within the context of the nuclear energy chain.

  1. Determining a pre-mining radiological baseline from historic airborne gamma surveys: a case study.

    PubMed

    Bollhöfer, Andreas; Beraldo, Annamarie; Pfitzner, Kirrilly; Esparon, Andrew; Doering, Che

    2014-01-15

    Knowing the baseline level of radioactivity in areas naturally enriched in radionuclides is important in the uranium mining context to assess radiation doses to humans and the environment both during and after mining. This information is particularly useful in rehabilitation planning and developing closure criteria for uranium mines as only radiation doses additional to the natural background are usually considered 'controllable' for radiation protection purposes. In this case study we have tested whether the method of contemporary groundtruthing of a historic airborne gamma survey could be used to determine the pre-mining radiological conditions at the Ranger mine in northern Australia. The airborne gamma survey was flown in 1976 before mining started and groundtruthed using ground gamma dose rate measurements made between 2007 and 2009 at an undisturbed area naturally enriched in uranium (Anomaly 2) located nearby the Ranger mine. Measurements of (226)Ra soil activity concentration and (222)Rn exhalation flux density at Anomaly 2 were made concurrent with the ground gamma dose rate measurements. Algorithms were developed to upscale the ground gamma data to the same spatial resolution as the historic airborne gamma survey data using a geographic information system, allowing comparison of the datasets. Linear correlation models were developed to estimate the pre-mining gamma dose rates, (226)Ra soil activity concentrations, and (222)Rn exhalation flux densities at selected areas in the greater Ranger region. The modelled levels agreed with measurements made at the Ranger Orebodies 1 and 3 before mining started, and at environmental sites in the region. The conclusion is that our approach can be used to determine baseline radiation levels, and provide a benchmark for rehabilitation of uranium mines or industrial sites where historical airborne gamma survey data are available and an undisturbed radiological analogue exists to groundtruth the data. © 2013.
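
    The linear correlation step in this record can be illustrated with an ordinary least-squares fit. The Python sketch below is only that: the variable names, units, and every number are synthetic placeholders chosen so the arithmetic is easy to follow, and none come from the Ranger study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic calibration pairs (not measured data): airborne survey signal
# versus ground-measured gamma dose rate at the same upscaled grid cells.
airborne_signal = [100.0, 200.0, 300.0, 400.0]
ground_dose_rate = [0.25, 0.45, 0.65, 0.85]

slope, intercept = fit_line(airborne_signal, ground_dose_rate)

# Apply the fitted model to an airborne value from an area of interest.
estimated_dose_rate = slope * 250.0 + intercept
```

    Once fitted at the undisturbed analogue site, the same line can be applied to the historic airborne values elsewhere in the survey, which is the general idea behind estimating a pre-mining baseline this way.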

  2. Proceedings: Fourth Workshop on Mining Scientific Datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, C

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics, etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is matched only by the opportunities that await a practitioner.

  3. Data Mining of Extremely Large Ad-Hoc Data Sets to Produce Reverse Web-Link Graphs

    DTIC Science & Technology

    2017-03-01

    in most of the MR cases. From these studies, we also learned that computing-optimized instances should be chosen for serialized/compressed input data...maximum 200 words) Data mining can be a valuable tool, particularly in the acquisition of military intelligence. As the second study within a larger Naval...open web crawler data set Common Crawl. Similar to previous studies, this research employs MapReduce (MR) for sorting and categorizing output value

  4. Surface mine planning and design implications and theory of a visual environmental quality predictive model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burley, J.B.

    1999-07-01

    Surface mine planners and designers are searching for scientifically based tools to assist in the pre-mine planning and post-mine development of surface mine sites. In this study, the author presents a science-based visual and environmental quality predictive model useful in preparing and assessing landscape treatments for surface mine sites. The equation explains 67 percent of respondent preference, with an overall p-value for the equation <0.0001 and a p-value <0.05 for each regressor. Regressors employed in the equation include an environmental quality index, foreground vegetation, distant nonvegetation, people, vehicles, utilities, foreground flowers, foreground erosion, wildlife, landscape openness, landscape mystery, and noosphericness (a measure of human disturbance). The equation can be explained with an Intrusion/Neutral Modifier/Temporal Enhancement Theory which suggests that human intrusions upon other humans result in landscapes of low preference and which also suggests that landscapes containing natural and special temporal features such as wildlife and flowers can enhance the value of a landscape scene. This research supports the importance of visual barriers such as berms and vegetation screens during mining operations and supports public perceptions concerning many types of industrial activities. In addition, the equation can be applied to study post-mining landscape development plans to maximize the efficiency and effectiveness of landscape treatments.

  5. Digital Workflows for a 3D Semantic Representation of an Ancient Mining Landscape

    NASA Astrophysics Data System (ADS)

    Hiebel, G.; Hanke, K.

    2017-08-01

    The ancient mining landscape of Schwaz/Brixlegg in the Tyrol, Austria witnessed mining from prehistoric times to modern times, creating a first-order cultural landscape when it comes to one of the most important inventions in human history: the production of metal. In 1991 a part of this landscape was lost due to an enormous landslide that reshaped part of the mountain. With our work we want to propose a digital workflow to create a 3D semantic representation of this ancient mining landscape with its mining structures to preserve it for posterity. First, we define a conceptual model to integrate the data. It is based on the CIDOC CRM ontology and CRMgeo for geometric data. To transform our information sources to a formal representation of the classes and properties of the ontology we applied semantic web technologies and created a knowledge graph in RDF (Resource Description Framework). Through the CRMgeo extension, coordinate information of mining features can be integrated into the RDF graph and thus related to the detailed digital elevation model that may be visualized together with the mining structures using Geoinformation systems or 3D visualization tools. The RDF network of the triple store can be queried using the SPARQL query language. We created a snapshot of mining, settlement and burial sites in the Bronze Age. The results of the query were loaded into a Geoinformation system and a visualization of known Bronze Age sites related to mining, settlement and burial activities was created.

  6. A survey on annotation tools for the biomedical literature.

    PubMed

    Neves, Mariana; Leser, Ulf

    2014-03-01

    New approaches to biomedical text mining crucially depend on the existence of comprehensive annotated corpora. Such corpora, commonly called gold standards, are important for learning patterns or models during the training phase, for evaluating and comparing the performance of algorithms and also for better understanding the information sought by means of examples. Gold standards depend on human understanding and manual annotation of natural language text. This process is very time-consuming and expensive because it requires high intellectual effort from domain experts. Accordingly, the lack of gold standards is considered one of the main bottlenecks for developing novel text mining methods. This situation led to the development of tools that support humans in annotating texts. Such tools should be intuitive to use, should support a range of different input formats, should include visualization of annotated texts and should generate an easy-to-parse output format. Today, a range of tools which implement some of these functionalities are available. In this survey, we present a comprehensive overview of tools for supporting annotation of biomedical texts. Altogether, we considered almost 30 tools, 13 of which were selected for an in-depth comparison. The comparison was performed using predefined criteria and was accompanied by hands-on experiences whenever possible. Our survey shows that current tools can support many of the tasks in biomedical text annotation in a satisfying manner, but also that no tool can be considered a truly comprehensive solution.

  7. Climate policy: Uncovering ocean-related priorities

    NASA Astrophysics Data System (ADS)

    Barkemeyer, Ralf

    2017-11-01

    Given the complexity and multi-faceted nature of policy processes, national-level policy preferences are notoriously difficult to capture. Now, research applying an automated text mining approach helps to shed light on country-level differences and priorities in the context of marine climate issues.

  8. SparkText: Biomedical Text Mining on Big Data Framework.

    PubMed

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.
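
    As an illustration of one of the classifiers named in this abstract, the following is a minimal single-machine sketch of multinomial Naïve Bayes with add-one smoothing. It is not SparkText and does not use Apache Spark or Cassandra; the function names and the toy training sentences are hypothetical and stand in for the PubMed articles used in the study.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count documents per label and words per label.

    docs is a list of (text, label) pairs.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set (made up for illustration only).
training_docs = [
    ("brca1 mutation breast tumor", "breast"),
    ("mammography screening breast", "breast"),
    ("psa level prostate biopsy", "prostate"),
    ("androgen therapy prostate tumor", "prostate"),
    ("smoking lung nodule", "lung"),
    ("egfr mutation lung adenocarcinoma", "lung"),
]
model = train_naive_bayes(training_docs)
predicted = classify(model, "prostate biopsy psa")
```

    A distributed framework would spread the counting step across many workers; the scoring rule itself stays the same.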

  9. Relationships between sources of acid mine drainage and the hydrochemistry of acid effluents during rainy season in the Iberian Pyrite Belt.

    PubMed

    Pérez-Ostalé, E; Grande, J A; Valente, T; de la Torre, M L; Santisteban, M; Fernández, P; Diaz-Curiel, J

    2016-01-01

    In the Iberian Pyrite Belt (IPB), southwest Spain, a prolonged and intense mining activity of more than 4,500 years has resulted in almost a hundred mines scattered through the region. After years of inactivity, these mines are still causing high levels of hydrochemical degradation in the fluvial network. This situation represents a unique scenario in the world, taking into consideration its magnitude and intensity of the contamination processes. In order to obtain a benchmark regarding the degree of acid mine drainage (AMD) pollution in the aquatic environment, the relationship between the areas occupied by the sulfide mines and the characteristics of the respective effluents after rainfall was analysed. The methodology developed, which includes the design of a sampling network, analytical treatment and cluster analysis, is a useful tool for diagnosing the contamination level by AMD in an entire metallogenic province, at the scale of each mining group. The results presented the relationship between sulfate, total dissolved solids and electrical conductivity, as well as other parameters that are typically associated with AMD and the major elements that compose the polymetallic sulfides of IPB. This analysis also indicates the low level of proximity between the affectation area and the other variables.

  10. SparkText: Biomedical Text Mining on Big Data Framework

    PubMed Central

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background: Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results: In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions: This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652

  11. A structural informatics approach to mine kinase knowledge bases.

    PubMed

    Brooijmans, Natasja; Mobilio, Dominick; Walker, Gary; Nilakantan, Ramaswamy; Denny, Rajiah A; Feyfant, Eric; Diller, David; Bikker, Jack; Humblet, Christine

    2010-03-01

    In this paper, we describe a combination of structural informatics approaches developed to mine data extracted from existing structure knowledge bases (Protein Data Bank and the GVK database) with a focus on kinase ATP-binding site data. In contrast to existing systems that retrieve and analyze protein structures, our techniques are centered on a database of ligand-bound geometries in relation to residues lining the binding site and transparent access to ligand-based SAR data. We illustrate the systems in the context of the Abelson kinase and related inhibitor structures.

  12. 30 CFR 75.1103-9 - Minimum requirements; fire suppression materials and location; maintenance of entries and...

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... tools and hardware required for its operation shall be stored at the foam generator. (2) Tools to open a...-expansion foam devices. 75.1103-9 Section 75.1103-9 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... and crosscuts; access doors; communications; fire crews; high-expansion foam devices. (a) The...

  13. 30 CFR 75.1103-9 - Minimum requirements; fire suppression materials and location; maintenance of entries and...

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... tools and hardware required for its operation shall be stored at the foam generator. (2) Tools to open a...-expansion foam devices. 75.1103-9 Section 75.1103-9 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... and crosscuts; access doors; communications; fire crews; high-expansion foam devices. (a) The...

  14. 30 CFR 75.1103-9 - Minimum requirements; fire suppression materials and location; maintenance of entries and...

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... tools and hardware required for its operation shall be stored at the foam generator. (2) Tools to open a...-expansion foam devices. 75.1103-9 Section 75.1103-9 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... and crosscuts; access doors; communications; fire crews; high-expansion foam devices. (a) The...

  15. 30 CFR 75.1103-9 - Minimum requirements; fire suppression materials and location; maintenance of entries and...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... tools and hardware required for its operation shall be stored at the foam generator. (2) Tools to open a...-expansion foam devices. 75.1103-9 Section 75.1103-9 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... and crosscuts; access doors; communications; fire crews; high-expansion foam devices. (a) The...

  16. 30 CFR 75.1103-9 - Minimum requirements; fire suppression materials and location; maintenance of entries and...

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... tools and hardware required for its operation shall be stored at the foam generator. (2) Tools to open a...-expansion foam devices. 75.1103-9 Section 75.1103-9 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... and crosscuts; access doors; communications; fire crews; high-expansion foam devices. (a) The...

  17. Analytics for Cyber Network Defense

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Plantenga, Todd; Kolda, Tamara Gibson

    2011-06-01

    This report provides a brief survey of analytics tools considered relevant to cyber network defense (CND). Ideas and tools come from fields such as statistics, data mining, and knowledge discovery. Some analytics are considered standard mathematical or statistical techniques, while others reflect current research directions. In all cases the report attempts to explain the relevance to CND with brief examples.

  18. Analysis of post-mining excavations as places for municipal waste

    NASA Astrophysics Data System (ADS)

    Górniak-Zimroz, Justyna

    2018-01-01

    Waste management planning is an interdisciplinary task covering a wide range of issues including costs, legal requirements, spatial planning, environmental protection, geography, demographics, and techniques used in collecting, transporting, processing and disposing of waste. Designing and analyzing this issue is difficult and requires the use of advanced analysis methods and tools available in geographic information systems (GIS): readily available graphical and descriptive databases, data analysis tools providing expert decision support while selecting the best-designed alternative, and simulation models that allow the user to simulate many variants of waste management together with graphical visualization of the results of the analyses performed. As part of the research study, work was undertaken on the use of multi-criteria data analysis in waste management in areas located in southwestern Poland. This work proposes including post-mining excavations in waste management as places for the final or temporary collection of waste, assessed in terms of their suitability with the tools available in GIS.

  19. Institutional challenges for mining and sustainability in Peru.

    PubMed

    Bebbington, Anthony J; Bury, Jeffrey T

    2009-10-13

    Global consumption continues to generate growth in mining. In lesser developed economies, this growth offers the potential to generate new resources for development, but also creates challenges to sustainability in the regions in which extraction occurs. This context leads to debate on the institutional arrangements most likely to build synergies between mining, livelihoods, and development, and on the socio-political conditions under which such institutions can emerge. Building from a multiyear, three-country program of research projects, Peru, a global center of mining expansion, serves as an exemplar for analyzing the effects of extractive industry on livelihoods and the conditions under which arrangements favoring local sustainability might emerge. This program is guided by three emergent hypotheses in human-environmental sciences regarding the relationships among institutions, knowledge, learning, and sustainability. The research combines in-depth and comparative case study analysis, and uses mapping and spatial analysis, surveys, in-depth interviews, participant observation, and our own direct participation in public debates on the regulation of mining for development. The findings demonstrate the pressures that mining expansion has placed on water resources, livelihood assets, and social relationships. These pressures are a result of institutional conditions that separate the governance of mineral expansion, water resources, and local development, and of relationships of power that prioritize large scale investment over livelihood and environment. A further problem is the poor communication between mining sector knowledge systems and those of local populations. These results are consistent with themes recently elaborated in sustainability science.

  20. Institutional challenges for mining and sustainability in Peru

    PubMed Central

    Bebbington, Anthony J.; Bury, Jeffrey T.

    2009-01-01

    Global consumption continues to generate growth in mining. In lesser developed economies, this growth offers the potential to generate new resources for development, but also creates challenges to sustainability in the regions in which extraction occurs. This context leads to debate on the institutional arrangements most likely to build synergies between mining, livelihoods, and development, and on the socio-political conditions under which such institutions can emerge. Building from a multiyear, three-country program of research projects, Peru, a global center of mining expansion, serves as an exemplar for analyzing the effects of extractive industry on livelihoods and the conditions under which arrangements favoring local sustainability might emerge. This program is guided by three emergent hypotheses in human-environmental sciences regarding the relationships among institutions, knowledge, learning, and sustainability. The research combines in-depth and comparative case study analysis, and uses mapping and spatial analysis, surveys, in-depth interviews, participant observation, and our own direct participation in public debates on the regulation of mining for development. The findings demonstrate the pressures that mining expansion has placed on water resources, livelihood assets, and social relationships. These pressures are a result of institutional conditions that separate the governance of mineral expansion, water resources, and local development, and of relationships of power that prioritize large scale investment over livelihood and environment. A further problem is the poor communication between mining sector knowledge systems and those of local populations. These results are consistent with themes recently elaborated in sustainability science. PMID:19805172

  1. Local sustainability and gender ratio: evaluating the impacts of mining and tourism on sustainable development in Yunnan, China.

    PubMed

    Huang, Ganlin; Ali, Saleem

    2015-01-19

    This study employed rapid evaluation methods to investigate how the leading industries of mining and tourism impact sustainability as manifest through social, economic and environmental dimensions in Yunnan, China. Within the social context, we also consider the differentiated impact on gender ratio, which is a salient feature of sustained development trajectories. Our results indicate that mining areas performed better than tourism areas in economic aspects but fell behind in social development, especially regarding the issue of gender balance. Conclusions on environmental status cannot be drawn due to a lack of data. The results from the environmental indicators are mixed. Our study demonstrates that rapid evaluation using currently available data can provide a means of greater understanding regarding local sustainability and highlights areas that need attention from policy makers, agencies and academia.

  2. Emergency Braking of a Mine Hoist in the Context of the Braking System Selection

    NASA Astrophysics Data System (ADS)

    Wolny, Stanisław

    2017-03-01

    The paper addresses selected aspects of the dynamic behaviour of mine hoists during the emergency braking phase. Based on the model of the hoist and supported by the theoretical background provided by the author (Wolny, 2016), analytical formulas are derived to determine the parameters of the braking system such that, during emergency braking, (1) the maximal loading of the hoisting ropes does not exceed the rope breaking force and (2) the deceleration of the conveyances being stopped does not exceed the admissible levels. The results of the dynamic analysis of mine hoist behaviour during the emergency braking phase summarised in this study can be used to support the design of conveyance and rope attachments by fatigue endurance methods, with the aim of adapting it to the specified operational parameters of the hoisting installation (Eurokod 3).

  3. Geospatial Image Mining For Nuclear Proliferation Detection: Challenges and New Opportunities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Bhaduri, Budhendra L; Cheriyadat, Anil M

    2010-01-01

    With increasing understanding and availability of nuclear technologies, and increasing pursuit of nuclear technologies by several new countries, it is becoming increasingly important to monitor nuclear proliferation activities. There is a great need for developing technologies to automatically or semi-automatically detect nuclear proliferation activities using remote sensing. Images acquired from earth observation satellites are an important source of information in detecting proliferation activities. High-resolution remote sensing images are highly useful in verifying the correctness, as well as completeness, of any nuclear program. DOE national laboratories are interested in detecting nuclear proliferation by developing advanced geospatial image mining algorithms. In this paper we describe the current understanding of geospatial image mining techniques, enumerate key gaps, and identify future research needs in the context of nuclear proliferation.

  4. SA-Search: a web tool for protein structure mining based on a Structural Alphabet

    PubMed Central

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-01-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search. PMID:15215446
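
    The exact-word search on a one-dimensional structural encoding can be sketched as follows. The abstract names a suffix tree; this sketch substitutes a suffix array, a simpler related index, built naively for readability. It is not the SA-Search implementation, and the encoded string and its letters are hypothetical stand-ins for Structural Alphabet states.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes in sorted order.

    Naive construction: fine for a short demo, too slow for
    large databases.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, word):
    """Return sorted start positions where word occurs in text."""
    m = len(word)
    lo, hi = 0, len(suffix_array)
    # Binary search for the first suffix whose m-letter prefix
    # is not smaller than the word.
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + m] < word:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    # Matching suffixes sit next to each other in the sorted order.
    while lo < len(suffix_array):
        start = suffix_array[lo]
        if text[start:start + m] != word:
            break
        hits.append(start)
        lo += 1
    return sorted(hits)


# Hypothetical 1D encoding of a protein backbone.
encoded = "AABBCAABBD"
index = build_suffix_array(encoded)
positions = find_occurrences(encoded, index, "AABB")
```

    Once the index is built, each query costs a binary search plus the number of matches, which is what makes repeated exact-word lookups fast.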

  5. SA-Search: a web tool for protein structure mining based on a Structural Alphabet.

    PubMed

    Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre

    2004-07-01

    SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.

  6. A New Tool for Industry

    NASA Technical Reports Server (NTRS)

    1981-01-01

    The ultrasonic P2L2 bolt monitor is a new industrial tool, developed at Langley Research Laboratory, which is lightweight, portable, extremely accurate because it is not subject to friction error, and cost-competitive with the least expensive of other types of accurate strain monitors. P2L2 is an acronym for Pulse Phase Locked Loop. The ultrasound system, which measures the stress that occurs when a bolt becomes elongated in the process of tightening, transmits sound waves to the bolt being fastened and receives a return signal indicating changes in bolt stress. Results are translated into a digital reading of the actual stress on the bolt. The device monitors the bolt tensioning process on mine roof bolts that provide increased safety within the mine. It also has utility in industrial applications.

  7. Integration of Geographical Information Systems and Geophysical Applications with Distributed Computing Technologies.

    NASA Astrophysics Data System (ADS)

    Pierce, M. E.; Aktas, M. S.; Aydin, G.; Fox, G. C.; Gadgil, H.; Sayar, A.

    2005-12-01

    We examine the application of Web Service Architectures and Grid-based distributed computing technologies to geophysics and geo-informatics. We are particularly interested in the integration of Geographical Information System (GIS) services with distributed data mining applications. GIS services provide the general purpose framework for building archival data services, real time streaming data services, and map-based visualization services that may be integrated with data mining and other applications through the use of distributed messaging systems and Web Service orchestration tools. Building upon our previous work in these areas, we present our current research efforts. These include fundamental investigations into increasing XML-based Web service performance, supporting real time data streams, and integrating GIS mapping tools with audio/video collaboration systems for shared display and annotation.

  8. Interactive text mining with Pipeline Pilot: a bibliographic web-based tool for PubMed.

    PubMed

    Vellay, S G P; Latimer, N E Miller; Paillard, G

    2009-06-01

    Text mining has become an integral part of all research in the medical field. Many text analysis software platforms support particular use cases and only those. We show an example of a bibliographic tool that can be used to support virtually any use case in an agile manner. Here we focus on a Pipeline Pilot web-based application that interactively analyzes and reports on PubMed search results. This will be of interest to any scientist to help identify the most relevant papers in a topical area more quickly and to evaluate the results of query refinement. Links with Entrez databases help both the biologist and the chemist alike. We illustrate this application with Leishmaniasis, a neglected tropical disease, as a case study.

  9. Using natural language processing techniques to inform research on nanotechnology.

    PubMed

    Lewinski, Nastassja A; McInnes, Bridget T

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.

  10. Interactive intelligent remote operations: application to space robotics

    NASA Astrophysics Data System (ADS)

    Dupuis, Erick; Gillett, G. R.; Boulanger, Pierre; Edwards, Eric; Lipsett, Michael G.

    1999-11-01

    A set of tools addressing the problems specific to the control and monitoring of remote robotic systems from extreme distances has been developed. The tools include the capability to model and visualize the remote environment, to generate and edit complex task scripts, to execute the scripts in supervisory control mode, and to monitor and diagnose equipment from multiple remote locations. Two prototype systems are implemented for demonstration. The first demonstration, using a prototype joint design called Dexter, shows the applicability of the approach to space robotic operation in low Earth orbit. The second demonstration uses a remotely controlled excavator in an operational open-pit tar sand mine. This demonstrates that the tools developed can be used for planetary exploration operations as well as for terrestrial mining applications.

  11. Fluvial transport and surface enrichment of arsenic in semi-arid mining regions: examples from the Mojave Desert, California.

    PubMed

    Kim, Christopher S; Stack, David H; Rytuba, James J

    2012-07-01

    As a result of extensive gold and silver mining in the Mojave Desert, southern California, mine wastes and tailings containing highly elevated arsenic (As) concentrations remain exposed at a number of former mining sites. Decades of weathering and erosion have contributed to the mobilization of As-enriched tailings, which now contaminate surrounding communities. Fluvial transport plays an intermittent yet important and relatively undocumented role in the migration and dispersal of As-contaminated mine wastes in semi-arid climates. Assessing the contribution of fluvial systems to tailings mobilization is critical in order to assess the distribution and long-term exposure potential of tailings in a mining-impacted environment. Extensive sampling, chemical analysis, and geospatial mapping of dry streambed (wash) sediments, tailings piles, alluvial fans, and rainwater runoff at multiple mine sites have aided the development of a conceptual model to explain the fluvial migration of mine wastes in semi-arid climates. Intense and episodic precipitation events mobilize mine wastes downstream and downslope as a series of discrete pulses, causing dispersion both down and lateral to washes with exponential decay behavior as distance from the source increases. Accordingly a quantitative model of arsenic concentrations in wash sediments, represented as a series of overlapping exponential power-law decay curves, results in the acceptable reproducibility of observed arsenic concentration patterns. Such a model can be transferable to other abandoned mine lands as a predictive tool for monitoring the fate and transport of arsenic and related contaminants in similar settings. Effective remediation of contaminated mine wastes in a semi-arid environment requires addressing concurrent changes in the amounts of potential tailings released through fluvial processes and the transport capacity of a wash.
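
    The idea of representing wash-sediment concentrations as overlapping decay curves can be sketched as follows. This is only an illustration, not the authors' model: it assumes a plain exponential decay for each pulse, and the function name, source positions, peak concentrations and decay constants are all hypothetical values chosen for the example.

```python
import math


def arsenic_profile(distance_m, pulses, background=0.0):
    """Sum the contribution of each pulse at a given downstream distance.

    Each pulse is (source_position_m, peak_concentration, decay_per_m).
    All values are hypothetical; the abstract gives no fitted parameters.
    """
    total = background
    for source_m, peak, decay in pulses:
        offset = distance_m - source_m
        if offset >= 0:  # a pulse only contributes downstream of its source
            total += peak * math.exp(-decay * offset)
    return total


# Two hypothetical overlapping pulses along one wash.
pulses = [(0.0, 1000.0, 0.01), (200.0, 400.0, 0.02)]

for x in (0, 100, 200, 400):
    print(x, round(arsenic_profile(x, pulses), 1))
```

    Summing the curves rather than taking their maximum reflects the "overlapping" wording of the abstract; a real application would fit the parameters to measured concentrations.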

  12. Mapping of the dilemma of mining against forest and conservation in the Lom and Djérem Division, Cameroon

    NASA Astrophysics Data System (ADS)

    Tchindjang, Mesmin; Voundi, Eric; Mbevo Fendoung, Philippes; Haman, Unusa; Saha, Frédéric; Casimir Njombissie Petcheu, Igor

    2018-05-01

    Mining in Cameroon dates back to the colonial period. Before independence, the artisanal mining sector contributed 11-20 % of GDP. Since 2000, the rich potential of the Cameroonian subsoil has attracted many foreign investors, with over 600 research and mining permits granted during the last decade. Cameroonian forests also have a long history, from the colonial period to the present. However, mining activities in forest environments are governed by two different legal frameworks: the mining code, i.e. Law No. 001 of 16 April 2001 organizing the mining industry, and Law No. 94-01 of 20 January 1994 governing forests, wildlife and fisheries. In the absence of detailed studies of these laws, overlapping interests, rights and obligations come into conflict, which calls for research and appropriate decisions. The objective of this research in the Lom and Djérem division is to study, apart from the proliferation of mining licenses and actors, the dilemma as well as the impact of the extension of mining activities on the degradation of forest cover. Using geospatial tools through multi-temporal and multisensor satellite images (Landsat from 1976 to 2015, IKONOS, GEOEYE, Google Earth) coupled with field investigations, we mapped the dynamics of different forms of land use (mining permits, FMU and protected areas of the permanent forest estate) and highlighted the paradoxical conflict of land use. We conclude that mining permits and authorizations are being issued in this forest zone so quickly that one can wonder whether any patch of forest will remain within 50 years.

  13. Fluvial transport and surface enrichment of arsenic in semi-arid mining regions: examples from the Mojave Desert, California

    USGS Publications Warehouse

    Kim, Christopher S.; Slack, David H.; Rytuba, James J.

    2012-01-01

    As a result of extensive gold and silver mining in the Mojave Desert, southern California, mine wastes and tailings containing highly elevated arsenic (As) concentrations remain exposed at a number of former mining sites. Decades of weathering and erosion have contributed to the mobilization of As-enriched tailings, which now contaminate surrounding communities. Fluvial transport plays an intermittent yet important and relatively undocumented role in the migration and dispersal of As-contaminated mine wastes in semi-arid climates. Assessing the contribution of fluvial systems to tailings mobilization is critical in order to assess the distribution and long-term exposure potential of tailings in a mining-impacted environment. Extensive sampling, chemical analysis, and geospatial mapping of dry streambed (wash) sediments, tailings piles, alluvial fans, and rainwater runoff at multiple mine sites have aided the development of a conceptual model to explain the fluvial migration of mine wastes in semi-arid climates. Intense and episodic precipitation events mobilize mine wastes downstream and downslope as a series of discrete pulses, causing dispersion both down and lateral to washes with exponential decay behavior as distance from the source increases. Accordingly a quantitative model of arsenic concentrations in wash sediments, represented as a series of overlapping exponential power-law decay curves, results in the acceptable reproducibility of observed arsenic concentration patterns. Such a model can be transferable to other abandoned mine lands as a predictive tool for monitoring the fate and transport of arsenic and related contaminants in similar settings. Effective remediation of contaminated mine wastes in a semi-arid environment requires addressing concurrent changes in the amounts of potential tailings released through fluvial processes and the transport capacity of a wash.

  14. Data mining in soft computing framework: a survey.

    PubMed

    Mitra, S; Pal, S K; Mitra, P

    2002-01-01

    The present article provides a survey of the available literature on data mining using soft computing. A categorization has been provided based on the different soft computing tools and their hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally fuzzy sets are suitable for handling the issues related to understandability of patterns, incomplete/noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms to select a model, from mixed media data, based on some preference criterion/objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and the application of soft computing methodologies are indicated. An extensive bibliography is also included.

  15. The use of unmanned aerial systems for the mapping of legacy uranium mines.

    PubMed

    Martin, P G; Payton, O D; Fardoulis, J S; Richards, D A; Scott, T B

    2015-05-01

    Historical mining of uranium mineral veins within Cornwall, England, has resulted in a significant amount of legacy radiological contamination spread across numerous long disused mining sites. Factors including the poorly documented and aged condition of these sites as well as the highly localised nature of radioactivity limit the success of traditional survey methods. A newly developed terrain-independent unmanned aerial system [UAS] carrying an integrated gamma radiation mapping unit was used for the radiological characterisation of a single legacy mining site. Using this instrument to produce high-spatial-resolution maps, it was possible to determine the radiologically contaminated land areas and to rapidly identify and quantify the degree of contamination and its isotopic nature. The instrument was demonstrated to be a viable tool for the characterisation of similar sites worldwide. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  16. Differentiation of closely related isomers: application of data mining techniques in conjunction with variable wavelength infrared multiple photon dissociation mass spectrometry for identification of glucose-containing disaccharide ions.

    PubMed

    Stefan, Sarah E; Ehsan, Mohammad; Pearson, Wright L; Aksenov, Alexander; Boginski, Vladimir; Bendiak, Brad; Eyler, John R

    2011-11-15

    Data mining algorithms have been used to analyze the infrared multiple photon dissociation (IRMPD) patterns of gas-phase lithiated disaccharide isomers irradiated with either a line-tunable CO2 laser or a free electron laser (FEL). The IR fragmentation patterns over the wavelength range of 9.2-10.6 μm have been shown in earlier work to correlate uniquely with the asymmetry at the anomeric carbon in each disaccharide. Application of data mining approaches for data analysis allowed unambiguous determination of the anomeric carbon configurations for each disaccharide isomer pair using fragmentation data at a single wavelength. In addition, the linkage positions were easily assigned. This combination of wavelength-selective IRMPD and data mining offers a powerful and convenient tool for differentiation of structurally closely related isomers, including those of gas-phase carbohydrate complexes.

  17. Hymenoptera Genome Database: integrating genome annotations in HymenopteraMine

    PubMed Central

    Elsik, Christine G.; Tayal, Aditi; Diesh, Colin M.; Unni, Deepak R.; Emery, Marianne L.; Nguyen, Hung N.; Hagen, Darren E.

    2016-01-01

    We report an update of the Hymenoptera Genome Database (HGD) (http://HymenopteraGenome.org), a model organism database for insect species of the order Hymenoptera (ants, bees and wasps). HGD maintains genomic data for 9 bee species, 10 ant species and 1 wasp, including the versions of genome and annotation data sets published by the genome sequencing consortiums and those provided by NCBI. A new data-mining warehouse, HymenopteraMine, based on the InterMine data warehousing system, integrates the genome data with data from external sources and facilitates cross-species analyses based on orthology. New genome browsers and annotation tools based on JBrowse/WebApollo provide easy genome navigation, and viewing of high throughput sequence data sets and can be used for collaborative genome annotation. All of the genomes and annotation data sets are combined into a single BLAST server that allows users to select and combine sequence data sets to search. PMID:26578564

  18. Light Stable Isotopes in Aquifers Affected by Mining Activities in a Brazilian Mining Province

    NASA Astrophysics Data System (ADS)

    Moreira, R. M.; de Carvalho, J. B.

    2013-05-01

    Iron ore is presently a main item in the Brazilian commercial agenda. Large reserves have converted this utility into an important source of export earnings and, secondarily, of raw materials for the domestic industry. Parallel to a boom in mining activities in the last years, environmental impacts and stress on natural resources have soared. A region exhibiting pronouncedly intensive mining activities lies in the central part of the State of Minas Gerais, the third economy of the federation. Mines are sited right beside the capital and neighboring towns, amounting to nearly five million inhabitants with a pronounced dependence on groundwater resources. Besides, this region is a water divide enclosing the sources of main contributors to the most strategic fluvial basins in the country. Iron ore is by far the main mineral, but other metals (including gold and uranium), as well as non-metals such as limestone, quartz and granite, also occur. Given the significance of this commodity in the country's trade balance and the demand for water resources of acceptable quality for human consumption, the scale of ensuing water use conflicts caused by its exploration is wide ranging and has to be coped with through well-grounded environmental assessment approaches. Tracer hydrology techniques might be a valuable tool in this context. The characteristics of the area being impacted have been surveyed, including climate and pluviometry, stratigraphic lithology, geological structure, use of soil, mineral resources and their exploration, surface and ground water hydrology and their sundry uses. Data to be processed have been procured at local public agencies, but as regards local hydrological features, particularly isotopic compositions, ad hoc surveys and methodologies were required. One instance concerns pluviometric isotopy: due to the alpine character of the surveyed region, altitude and temperature effects might take place.
Hence different sites were monitored; cumulative pluviometer samples collected on a monthly basis had to be stored in specially designed containers to avoid fractionation due to evaporation. Meteoric and groundwater samples were collected monthly over a whole year at a total of forty-seven stations including wells, springs and drainage sources, encompassing six aquifer units. Physical-chemical parameters, major ions, and both stable and radioactive nuclides were measured in the collected samples. Stable isotope measurements comprised the 2H/1H and 18O/16O ratios. Tritium levels were measured to evaluate the water residence time in the aquifers; since these levels are presently so low in the southern hemisphere, electrolytic enrichment was required.

  19. Data Mining and Machine Learning in Astronomy

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Brunner, Robert J.

    We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be little more than the black box application of complex computing algorithms that may give little physical insight, and provide questionable results. Here, we give an overview of the entire data mining process, from data collection through to the interpretation of results. We cover common machine learning algorithms, such as artificial neural networks and support vector machines, applications from a broad range of astronomy, emphasizing those in which data mining techniques directly contributed to improving science, and important current and future directions, including probability density functions, parallel algorithms, Peta-Scale computing, and the time domain. We conclude that, so long as one carefully selects an appropriate algorithm and is guided by the astronomical problem at hand, data mining can be very much the powerful tool, and not the questionable black box.

  20. Imaging informatics-based multimedia ePR system for data management and decision support in rehabilitation research

    NASA Astrophysics Data System (ADS)

    Wang, Ximing; Verma, Sneha; Qin, Yi; Sterling, Josh; Zhou, Alyssa; Zhang, Jeffrey; Martinez, Clarisa; Casebeer, Narissa; Koh, Hyunwook; Winstein, Carolee; Liu, Brent

    2013-03-01

    With the rapid development of science and technology, large-scale rehabilitation centers and clinical rehabilitation trials usually involve significant volumes of multimedia data. Due to the global aging crisis, millions of new patients with age-related chronic diseases will produce huge amounts of data and contribute to soaring costs of medical care. Hence, a solution for effective data management and decision support will significantly reduce the expenditure and ultimately improve patients' quality of life. Inspired by the concept of the electronic patient record (ePR), we developed a prototype system for the field of rehabilitation engineering. The system is subject- or patient-oriented and customized for specific projects. The system components include data entry modules, multimedia data presentation and data retrieval. To process the multimedia data, the system includes a DICOM viewer with annotation tools and a video/audio player. The system also serves as a platform for integrating decision-support tools and data mining tools. Based on the prototype system design, we developed two specific applications: 1) DOSE (a phase 1 randomized clinical trial to determine the optimal dose of therapy for rehabilitation of the arm and hand after stroke); and 2) the NEXUS project from the Rehabilitation Engineering Research Center (RERC, a NIDRR-funded center). Currently, the system is being evaluated in the context of the DOSE trial with a projected enrollment of 60 participants over 5 years, and will be evaluated by the NEXUS project with 30 subjects. By applying the ePR concept, we developed a system in order to improve the current research workflow, reduce the cost of managing data, and provide a platform for the rapid development of future decision-support tools.

  1. Empirical advances with text mining of electronic health records.

    PubMed

    Delespierre, T; Denormandie, P; Bar-Hen, A; Josseran, L

    2017-08-22

    Korian is a private group specializing in medical accommodations for elderly and dependent people. A professional data warehouse (DWH) established in 2010 hosts all of the residents' data. Inside this information system (IS), clinical narratives (CNs) were used only by medical staff as a residents' care linking tool. The objective of this study was to show that, through qualitative and quantitative textual analysis of a relatively small physiotherapy and well-defined CN sample, it was possible to build a physiotherapy corpus and, through this process, generate a new body of knowledge by adding relevant information to describe the residents' care and lives. Meaningful words were extracted through Structured Query Language (SQL) with the LIKE operator and wildcards to perform pattern matching, followed by text mining and a word cloud using R® packages. Another step involved principal components and multiple correspondence analyses, plus clustering on the same residents' sample as well as on other health data using a health model measuring the residents' care level needs. By combining these techniques, physiotherapy treatments could be characterized by a list of constructed keywords, and the residents' health characteristics were built. Feeding defects or health outlier groups could be detected, physiotherapy residents' data and their health data were matched, and differences in health situations showed qualitative and quantitative differences in physiotherapy narratives. This textual experiment using a textual process in two stages showed that text mining and data mining techniques provide convenient tools to improve residents' health and quality of care by adding new, simple, useable data to the electronic health record (EHR).
When used with a normalized physiotherapy problem list, text mining through information extraction (IE), named entity recognition (NER) and data mining (DM) can provide a real advantage to describe health care, adding new medical material and helping to integrate the EHR system into the health staff work environment.
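
    The first stage described above (SQL pattern matching with LIKE and wildcards, followed by word counting) might look roughly like the sketch below. It is not the study's code: Python's built-in sqlite3 module stands in for the group's data warehouse, a simple word count stands in for the R text-mining packages, and the table name, column names and sample notes are invented for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical in-memory table standing in for the clinical-narrative store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE narratives (resident_id INTEGER, note TEXT)")
conn.executemany(
    "INSERT INTO narratives VALUES (?, ?)",
    [
        (1, "Physiotherapy session: walking exercises with frame"),
        (2, "Resident refused lunch today"),
        (3, "Balance exercises after physio assessment"),
    ],
)


def notes_matching(pattern):
    """Return (resident_id, note) rows whose note matches a LIKE pattern."""
    rows = conn.execute(
        "SELECT resident_id, note FROM narratives "
        "WHERE note LIKE ? ORDER BY resident_id",
        (pattern,),
    )
    return rows.fetchall()


def word_counts(notes):
    """Count lower-cased words across the selected notes."""
    counts = Counter()
    for _, note in notes:
        for word in note.lower().split():
            counts[word.strip(".,:;")] += 1
    return counts


# '%' matches any run of characters, so this finds "physio" anywhere in a note.
physio_notes = notes_matching("%physio%")
print(word_counts(physio_notes).most_common(3))
```

    In SQLite, LIKE is case-insensitive for ASCII characters by default, which is why "Physiotherapy" is matched here; other database engines may behave differently.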

  2. Text Mining for Neuroscience

    NASA Astrophysics Data System (ADS)

    Tirupattur, Naveen; Lapish, Christopher C.; Mukhopadhyay, Snehasis

    2011-06-01

    Text mining, sometimes alternately referred to as text analytics, refers to the process of extracting high-quality knowledge from the analysis of textual data. Text mining has a wide variety of applications in areas such as biomedical science, news analysis, and homeland security. In this paper, we describe an approach and some relatively small-scale experiments which apply text mining to neuroscience research literature to find novel associations among a diverse set of entities. Neuroscience is a discipline which encompasses an exceptionally wide range of experimental approaches and rapidly growing interest. This combination results in an overwhelmingly large and often diffuse literature which makes a comprehensive synthesis difficult. Understanding the relations or associations among the entities appearing in the literature not only improves the researcher's current understanding of recent advances in their field, but also provides an important computational tool to formulate novel hypotheses and thereby assist in scientific discoveries. We describe a methodology to automatically mine the literature and form novel associations through direct analysis of published texts. The method first retrieves a set of documents from databases such as PubMed using a set of relevant domain terms. In the current study these terms yielded sets ranging from 160,909 to 367,214 documents. Each document is then represented in a numerical vector form from which an Association Graph is computed which represents relationships between all pairs of domain terms, based on co-occurrence. Association graphs can then be subjected to various graph theoretic algorithms such as transitive closure and cycle (circuit) detection to derive additional information, and can also be visually presented to a human researcher for understanding.
In this paper, we present three relatively small-scale problem-specific case studies to demonstrate that such an approach is very successful in replicating a neuroscience expert's mental model of object-object associations entirely by means of text mining. These preliminary results provide the confidence that this type of text mining based research approach provides an extremely powerful tool to better understand the literature and drive novel discovery for the neuroscience community.
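
    A co-occurrence association graph and its transitive closure can be sketched in a few lines. This is a simplified illustration and not the authors' pipeline: plain substring matching replaces the numerical document vectors, the graph is treated as undirected, and the function names, sample sentences and term list are hypothetical.

```python
from itertools import combinations


def association_graph(documents, terms):
    """Map each term pair to the number of documents containing both terms."""
    weights = {}
    for text in documents:
        lowered = text.lower()
        present = sorted(t for t in terms if t in lowered)
        for a, b in combinations(present, 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def transitive_closure(edges):
    """Return every ordered pair of distinct terms joined by some path."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    closure = set()
    for start in neighbours:
        seen, stack = {start}, [start]
        while stack:  # depth-first walk from each term
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        closure.update((start, other) for other in seen if other != start)
    return closure


# Hypothetical two-document corpus and term list.
docs = [
    "Dopamine release in the prefrontal cortex",
    "Prefrontal cortex activity and working memory",
]
terms = ["dopamine", "prefrontal cortex", "working memory"]

graph = association_graph(docs, terms)
direct = set(graph) | {(b, a) for a, b in graph}
indirect = transitive_closure(graph) - direct
print(sorted(indirect))
```

    Pairs that appear only in the closure, such as "dopamine" and "working memory" here, never co-occur in a single document; associations of that kind are the candidates for novel hypotheses that the abstract refers to.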

  3. The future of Yellowcake: a global assessment of uranium resources and mining.

    PubMed

    Mudd, Gavin M

    2014-02-15

    Uranium (U) mining remains controversial in many parts of the world, especially in a post-Fukushima context, and often in areas with significant U resources. Although nuclear proponents point to the relatively low carbon intensity of nuclear power compared to fossil fuels, opponents argue that this will be eroded in the future as ore grades decline and energy and greenhouse gas emissions (GGEs) intensity increases as a result. Invariably both sides fail to make use of the increasingly available data reported by some U mines through sustainability reporting - allowing a comprehensive assessment of recent trends in the energy and GGE intensity of U production, as well as combining this with reported mineral resources to allow more comprehensive modelling of future energy and GGEs intensity. In this study, detailed data sets are compiled on reported U resources by deposit type, as well as mine production, energy and GGE intensity. Some important aspects included are the relationship between ore grade, deposit type and recovery, which are crucial in future projections of U mining. Overall, the paper demonstrates that there are extensive U resources known to meet potential short to medium term demand, although the future of U mining remains uncertain due to the doubt about the future of nuclear power as well as a range of complex social, environmental, economic and some site-specific technical issues. Copyright © 2013 Elsevier B.V. All rights reserved.

  4. Planform: an application and database of graph-encoded planarian regenerative experiments.

    PubMed

    Lobo, Daniel; Malone, Taylor J; Levin, Michael

    2013-04-15

    Understanding the mechanisms governing the regeneration capabilities of many organisms is a fundamental interest in biology and medicine. An ever-increasing number of manipulation and molecular experiments are attempting to discover a comprehensive model for regeneration, with the planarian flatworm being one of the most important model species. Despite much effort, no comprehensive, constructive, mechanistic models exist yet, and it is now clear that computational tools are needed to mine this huge dataset. However, until now, there is no database of regenerative experiments, and the current genotype-phenotype ontologies and databases are based on textual descriptions, which are not understandable by computers. To overcome these difficulties, we present here Planform (Planarian formalization), a manually curated database and software tool for planarian regenerative experiments, based on a mathematical graph formalism. The database contains more than a thousand experiments from the main publications in the planarian literature. The software tool provides the user with a graphical interface to easily interact with and mine the database. The presented system is a valuable resource for the regeneration community and, more importantly, will pave the way for the application of novel artificial intelligence tools to extract knowledge from this dataset. The database and software tool are freely available at http://planform.daniel-lobo.com.

  5. Mining planer with pivotal tool holder

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Braun, E.; Braun, G.

    1983-10-04

    A planer assembly is disclosed for winning minerals from a mineral face comprising, a planer guide member, at least two planers slidably engaged on the guide member for movement in a travel direction along the face, a hinge interconnecting the two planers for transferring rotational moment applied to the planers and a planer tool holder mounted on each planer which carries a tool. The planer tool holder is positionable at an angle to make a cut of a selected depth with the depth increasing from planer to planer in a direction opposite the travel direction.

  6. A Data Mining Approach to Study the Impact of the Methodology Followed in Chemistry Lab Classes on the Weight Attributed by the Students to the Lab Work on Learning and Motivation

    ERIC Educational Resources Information Center

    Figueiredo, M.; Esteves, L.; Neves, J.; Vicente, H.

    2016-01-01

    This study reports the use of data mining tools in order to examine the influence of the methodology used in chemistry lab classes, on the weight attributed by the students to the lab work on learning and own motivation. The answer frequency analysis was unable to discriminate the opinions expressed by the respondents according to the type of the…

  7. From Geometry to Diagnosis: Experiences of Geomatics in Structural Engineering

    NASA Astrophysics Data System (ADS)

    Riveiro, B.; Arias, P.; Armesto, J.; Caamaño, J. C.; Solla, M.

    2012-07-01

    Terrestrial photogrammetry and laser scanning are technologies that have been successfully used for metric surveying and 3D modelling in many different fields (archaeological and architectural documentation, industrial retrofitting, mining, structural monitoring, road surveying, etc.). In the case of structural applications, these techniques have been successfully applied to 3D modelling and sometimes monitoring; but they have not been sufficiently implemented to date, as routine tools in infrastructure management systems, in terms of automation of data processing and integration in the condition assessment procedures. In this context, this paper presents a series of experiences in the usage of terrestrial photogrammetry and laser scanning in the context of dimensional and structural evaluation of structures. These experiences are particularly focused on historical masonry structures, but modern prestressed concrete bridges are also investigated. The development of methodological procedures for data collection, and data integration in some cases, is tackled for each particular structure (with access limitations, geometrical configuration, range of measurement, etc.). The accurate geometrical information provided by both terrestrial techniques motivates the implementation of such results in the complex, and sometimes slightly approximated, geometric scene that is frequently used in structural analysis. In this sense, quantitative evaluating of the influence of real and accurate geometry in structural analysis results must be carried out. As main result in this paper, a series of experiences based on the usage of photogrammetric and laser scanning to structural engineering are presented.

  8. Environmental consequences of the Retsof Salt Mine roof collapse

    USGS Publications Warehouse

    Yager, Richard M.

    2013-01-01

    In 1994, the largest salt mine in North America, which had been in operation for more than 100 years, catastrophically flooded when the mine ceiling collapsed. In addition to causing the loss of the mine and the mineral resources it provided, this event formed sinkholes, caused widespread subsidence to land, caused structures to crack and subside, and changed stream flow and erosion patterns. Subsequent flooding of the mine drained overlying aquifers, changed the groundwater salinity distribution (rendering domestic wells unusable), and allowed locally present natural gas to enter dwellings through water wells. Investigations including exploratory drilling, hydrologic and water-quality monitoring, geologic and geophysical studies, and numerical simulation of groundwater flow, salinity, and subsidence have been effective tools in understanding the environmental consequences of the mine collapse and informing decisions about management of those consequences for the future. Salt mines are generally dry, but are susceptible to leaks and can become flooded if groundwater from overlying aquifers or surface water finds a way downward into the mined cavity through hundreds of feet of rock. With its potential to flood the entire mine cavity, groundwater is a constant source of concern for mine operators. The problem is compounded by the viscous nature of salt and the fact that salt mines commonly lie beneath water-bearing aquifers. Salt (for example halite or potash) deforms and “creeps” into the mined openings over time spans that range from years to centuries. This movement of salt can destabilize the overlying rock layers and lead to their eventual sagging and collapse, creating permeable pathways for leakage of water and depressions or openings at land surface, such as sinkholes. Salt is also highly soluble in water; therefore, whenever water begins to flow into a salt mine, the channels through which it flows increase in diameter as the surrounding salt dissolves. 
Some mines leak at a slow rate for decades before a section of rock gives way, allowing what initially was a trickle of water to suddenly become a cascade and finally a torrent. Other mines become flooded and are destroyed when an errant drill hole punctures the mine ceiling, allowing water from overlying sources to flow into the mine. Either scenario can cause catastrophic flooding and permanent loss of the mine. Occasionally, a mine that has remained dry for a century will undergo a roof collapse that results in flooding.

  9. Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choudhary, Alok

    Computational scientists must understand results from experimental, observational and computational simulation-generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive I/O, storage, acceleration of data manipulation, analysis, and mining tools. Scientists require techniques, tools and infrastructure to facilitate better understanding of their data, in particular the ability to effectively perform complex data analysis, statistical analysis and knowledge discovery. The goal of this work is to enable more effective analysis of scientific datasets through the integration of enhancements in the I/O stack, from active storage support at the file system layer to MPI-IO and high-level I/O library layers. We propose to provide software components to accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing productivity of both scientists and the systems. Our approaches include 1) designing the interfaces in high-level I/O libraries, such as parallel netCDF, for applications to activate data mining operations at the lower I/O layers; 2) enhancing MPI-IO runtime systems to incorporate the functionality developed as a part of the runtime system design; 3) developing parallel data mining programs as part of the runtime library for the server-side file system in PVFS; and 4) prototyping an active storage cluster, which will utilize multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.

  10. Geochemical Analyses of Surface and Shallow Gas Flux and Composition Over a Proposed Carbon Sequestration Site in Eastern Kentucky

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Thomas Parris; Michael Solis; Kathryn Takacs

    2009-12-31

    Using soil gas chemistry to detect leakage from underground reservoirs (i.e. microseepage) requires that the natural range of soil gas flux and chemistry be fully characterized. To meet this need, soil gas flux (CO2, CH4) and the bulk (CO2, CH4) and isotopic chemistry (δ13C-CO2) of shallow soil gases (<1 m, 3.3 ft) were measured at 25 locations distributed among two active oil and gas fields, an active strip mine, and a relatively undisturbed research forest in eastern Kentucky. The measurements apportion the biologic, atmospheric, and geologic influences on soil gas composition under varying degrees of human surface disturbance. The measurements also highlight potential challenges in using soil gas chemistry as a monitoring tool where the surface cover consists of reclaimed mine land or is underlain by shallow coals. For example, enrichment of δ13C-CO2 and high CH4 concentrations in soils have been historically used as indicators of microseepage, but in the reclaimed mine lands similar soil chemistry characteristics likely result from dissolution of carbonate cement in siliciclastic clasts having δ13C values close to 0‰ and degassing of coal fragments. The gases accumulate in the reclaimed mine land soils because intense compaction reduces soil permeability, thereby impeding equilibration with the atmosphere. Consequently, the reclaimed mine lands provide a false microseepage anomaly. Further potential challenges arise from low permeability zones associated with compacted soils in reclaimed mine lands and shallow coals in undisturbed areas that might impede upward gas migration. To investigate the effect of these materials on gas migration and composition, four 10 m (33 ft) deep monitoring wells were drilled in reclaimed mine material and in undisturbed soils with and without coals. The wells, configured with sampling zones at discrete intervals, show the persistence of some of the aforementioned anomalies at depth. Moreover, high CO2 concentrations associated with coals in the vadose zone suggest a strong affinity for adsorbing CO2. Overall, the low permeability of reclaimed mine lands and coals and CO2 adsorption by the latter is likely to reduce the ability of surface geochemistry tools to detect a microseepage signal.

  11. Analog Tools in Digital History Classrooms: An Activity-Theory Case Study of Learning Opportunities in Digital Humanities

    ERIC Educational Resources Information Center

    Craig, Kalani

    2017-01-01

    Digital humanities is often presented as classroom savior, a narrative that competes against the idea that technology virtually guarantees student distraction. However, these arguments are often based on advocacy and anecdote, so we lack systematic research that explores the effect of digital-humanities tools and techniques such as text mining,…

  12. Phylogenetic Reconstruction as a Broadly Applicable Teaching Tool in the Biology Classroom: The Value of Data in Estimating Likely Answers

    ERIC Educational Resources Information Center

    Julius, Matthew L.; Schoenfuss, Heiko L.

    2006-01-01

    This laboratory exercise introduces students to a fundamental tool in evolutionary biology--phylogenetic inference. Students are required to create a data set via observation and through mining preexisting data sets. These student data sets are then used to develop and compare competing hypotheses of vertebrate phylogeny. The exercise uses readily…

  13. Saccharomyces cerevisiae as a tool for mining, studying and engineering fungal polyketide synthases

    PubMed Central

    Bond, Carly; Tang, Yi; Li, Li

    2016-01-01

    Small molecule secondary metabolites produced by organisms such as plants, bacteria, and fungi form a fascinating and important group of natural products, many of which have shown promise as medicines. Fungi in particular have been important sources of natural product polyketide pharmaceuticals. While the structural complexity of these polyketides makes them interesting and useful bioactive compounds, these same features also make them difficult and expensive to prepare and scale-up using synthetic methods. Currently, nearly all commercial polyketides are prepared through fermentation or semi-synthesis. However, elucidation and engineering of polyketide pathways in the native filamentous fungi hosts are often hampered due to a lack of established genetic tools and of understanding of the regulation of fungal secondary metabolisms. Saccharomyces cerevisiae has many advantages beneficial to the study and development of polyketide pathways from filamentous fungi due to its extensive genetic toolbox and well-studied metabolism. This review highlights the benefits S. cerevisiae provides as a tool for mining, studying, and engineering fungal polyketide synthases (PKSs), as well as notable insights this versatile tool has given us into the mechanisms and products of fungal PKSs. PMID:26850128

  14. Saccharomyces cerevisiae as a tool for mining, studying and engineering fungal polyketide synthases.

    PubMed

    Bond, Carly; Tang, Yi; Li, Li

    2016-04-01

    Small molecule secondary metabolites produced by organisms such as plants, bacteria, and fungi form a fascinating and important group of natural products, many of which have shown promise as medicines. Fungi in particular have been important sources of natural product polyketide pharmaceuticals. While the structural complexity of these polyketides makes them interesting and useful bioactive compounds, these same features also make them difficult and expensive to prepare and scale-up using synthetic methods. Currently, nearly all commercial polyketides are prepared through fermentation or semi-synthesis. However, elucidation and engineering of polyketide pathways in the native filamentous fungi hosts are often hampered due to a lack of established genetic tools and of understanding of the regulation of fungal secondary metabolisms. Saccharomyces cerevisiae has many advantages beneficial to the study and development of polyketide pathways from filamentous fungi due to its extensive genetic toolbox and well-studied metabolism. This review highlights the benefits S. cerevisiae provides as a tool for mining, studying, and engineering fungal polyketide synthases (PKSs), as well as notable insights this versatile tool has given us into the mechanisms and products of fungal PKSs. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Bioaccessibility and risk assessment of cadmium from uncooked rice using an in vitro digestion model.

    PubMed

    Yang, Lin-Sheng; Zhang, Xiu-Wu; Li, Yong-Hua; Li, Hai-Rong; Wang, Ying; Wang, Wu-Yi

    2012-01-01

    Cadmium (Cd)-contaminated rice is one of the most important sources of cadmium exposure for the general population in some Asian countries. This study was conducted to assess cadmium exposure from uncooked rice in rural mining areas based on the bioaccessible fraction of cadmium using an in vitro digestion model. The biotoxic effects of cadmium in uncooked rice from mining areas were much higher than those in the control area, based not only on the higher total concentration (52.49 vs. 7.93 μg kg⁻¹), but also on the higher bioaccessibility (16.94% vs. 2.38%). In the mining areas, the bioaccessible fraction of cadmium in uncooked rice had a significant positive correlation with the total concentration of cadmium in rice, and about a quarter of the rice was unsafe for the public. The results indicated that the in vitro digestion model could be a useful and economical tool for estimating the solubilization or bioaccessibility of cadmium in uncooked rice from the mining area. The results could be helpful in conducting future experiments on cooked rice with the in vitro model.
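
    The two comparisons quoted in the abstract (total concentration and bioaccessibility) can be combined into a single bioaccessible concentration. The sketch below is only illustrative: it assumes the reported mean concentration and mean bioaccessibility may simply be multiplied, which the study itself does not state, and the function and variable names are hypothetical.

```python
# Mean values quoted in the abstract (total Cd in ug/kg, bioaccessibility in %).
RICE = {
    "mining": {"total_ug_per_kg": 52.49, "bioaccessibility_pct": 16.94},
    "control": {"total_ug_per_kg": 7.93, "bioaccessibility_pct": 2.38},
}


def bioaccessible_cd(total_ug_per_kg, bioaccessibility_pct):
    """Bioaccessible Cd (ug/kg) = total concentration x bioaccessible fraction.

    Assumption: multiplying two reported means is an acceptable approximation.
    """
    return total_ug_per_kg * bioaccessibility_pct / 100.0


def summarize(rice=RICE):
    """Return the estimated bioaccessible Cd concentration for each area."""
    return {area: bioaccessible_cd(**values) for area, values in rice.items()}


if __name__ == "__main__":
    for area, value in summarize().items():
        print(f"{area}: {value:.2f} ug/kg bioaccessible Cd")
```

    Under that assumption the gap between mining and control areas is wider for bioaccessible cadmium than for total cadmium, because the two differences multiply.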

  16. [Research of bleeding volume and method in blood-letting acupuncture therapy based on data mining].

    PubMed

    Liu, Xin; Jia, Chun-Sheng; Wang, Jian-Ling; Du, Yu-Zhu; Zhang, Xiao-Xu; Shi, Jing; Li, Xiao-Feng; Sun, Yan-Hui; Zhang, Shen; Zhang, Xuan-Ping; Gang, Wei-Juan

    2014-03-01

    Using computer-based technology and data mining methods, association rule mining was applied to sample data consisting of treatment cases of blood-letting acupuncture therapy drawn from the collected literature. The data were entered, arranged and summarized on a self-built database platform, and the required data were then extracted to mine the bleeding volume and method in blood-letting acupuncture therapy, summarizing its application rules and clinical value to provide better guidance for clinical practice. There were 9 kinds of blood-letting tools in the literature, of which the three-edge needle was the most frequent, accounting for 84.4% (1239/1468). The bleeding volume was classified into six levels, of which a small volume (less than 0.1 mL) had the highest frequency (401 times). According to the results of the data mining, blood-letting acupuncture therapy is widely applied in the clinical practice of acupuncture, with the three-edge needle and a small bleeding volume (less than 0.1 mL) being the most common; in general, however, there was no central tendency.
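
    The abstract does not describe how its association rules were computed, so the following is only a generic sketch of the two standard measures, support and confidence. The four treatment records and their labels are hypothetical; only the 1239/1468 figure comes from the abstract.

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    if not records:
        raise ValueError("records must not be empty")
    return sum(itemset <= set(record) for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = support(records, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the records")
    return support(records, set(antecedent) | set(consequent)) / base


# Hypothetical treatment records (labels invented for illustration).
RECORDS = [
    {"tool:three-edge needle", "volume:low"},
    {"tool:three-edge needle", "volume:low"},
    {"tool:three-edge needle", "volume:high"},
    {"tool:other", "volume:high"},
]

if __name__ == "__main__":
    print(support(RECORDS, {"tool:three-edge needle"}))
    print(confidence(RECORDS, {"tool:three-edge needle"}, {"volume:low"}))
    # Share of the three-edge needle reported in the abstract.
    print(round(1239 / 1468 * 100, 1))
```

    The reported 84.4% is a support-style frequency for a single item; a rule such as "three-edge needle implies small volume" would additionally need the joint counts, which the abstract does not give.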

  17. Design pattern mining using distributed learning automata and DNA sequence alignment.

    PubMed

    Esmaeilpour, Mansour; Naderifar, Vahideh; Shukur, Zarina

    2014-01-01

    Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object-oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem. This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, and a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. DLA-DNA, which is based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, achieves acceptable precision and recall compared with the other evaluated tools. The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other section of the code for parameter passing and modular programming. The proposed model detects design patterns, and the strengths of their relationships, better than the other available tools, namely Pinot, PTIDEJ and DPJF. The results demonstrate that whether the source code is built in a standard or non-standard way with respect to the design patterns, the result of the proposed method is close to that of DPJF and better than those of Pinot and PTIDEJ. The proposed model was tested on several source codes and compared with other related models and available tools; the results show that the precision and recall of the proposed method are, on average, 20% and 9.6% higher than Pinot, 27% and 31% higher than PTIDEJ, and 3.3% and 2% higher than DPJF, respectively. The primary idea of the proposed method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, these are composed to recognize actual design patterns.

  18. GraphBench

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas R.; Hong, Seokyong; Lee, Sangkeun

    2016-06-01

    GraphBench is a benchmark suite for graph pattern mining and graph analysis systems. The benchmark suite is a significant addition for conducting apples-to-apples comparisons of graph analysis software (databases, in-memory tools, triple stores, etc.).

  19. Abandoned Mine Lands: Revitalization and Reuse

    EPA Pesticide Factsheets

    EPA recognizes that reuse opportunities at AMLs may provide the critical impetus to expedite environmental cleanup. EPA’s AML Team is dedicated to providing tools and resources to support the reuse of AMLs.

  20. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials

    PubMed Central

    2012-01-01

    Clinical trials are mandatory protocols describing medical research on humans and among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, that in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols. PMID:22595088

  1. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials.

    PubMed

    Korkontzelos, Ioannis; Mu, Tingting; Ananiadou, Sophia

    2012-04-30

    Clinical trials are mandatory protocols describing medical research on humans and among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, that in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols.

  2. Mining Security Pipe (TSM) with Underground GPS Global (RSPG) Escape Security Device in Underground Mining

    NASA Astrophysics Data System (ADS)

    Giménez, Rafael Barrionuevo

    2016-06-01

    TSM is an escape pipe for use in case of terrain collapse. The TSM is a passive security tool placed underground to connect the work area with a secure area (mainly a mining gallery). TSM is a light, easy-to-handle pipe made of aramid (Kevlar), carbon fibre, or another kind of new material. The TSM will be laid out as a pipeline network with many entrances and exits to reach problem work areas and connect them safely with other parts of the mine. Different levels of instrumentation could be added inside (such as micro-LED indication of the suggested escape route, and sensors for temperature, humidity, oxygen level, etc.). Open hardware and software such as Arduino will be the heart of the control and automation system.

  3. Gold-rush in a forested El Dorado: deforestation leakages and the need for regional cooperation

    NASA Astrophysics Data System (ADS)

    Dezécache, Camille; Faure, Emmanuel; Gond, Valéry; Salles, Jean-Michel; Vieilledent, Ghislain; Hérault, Bruno

    2017-03-01

    Tropical forests of the Guiana Shield are the most affected by gold-mining in South America, experiencing an exponential increase in deforestation since the early 2000’s. Using yearly deforestation data encompassing Guyana, Suriname, French Guiana and the Brazilian State of Amapá, we demonstrated a strong relationship between deforestation due to gold-mining and gold-prices at the regional scale. In order to assess additional drivers of deforestation due to gold-mining, we focused on the national scale and highlighted the heterogeneity of the response to gold-prices under different political contexts. Deforestation due to gold-mining over the Guiana Shield occurs mainly in Guyana and Suriname. On the contrary, past and current repressive policies in Amapá and French Guiana likely contribute to the decorrelation of deforestation and gold prices. In this work, we finally present a case study focusing on French Guiana and Suriname, two neighbouring countries with very different levels of law enforcement against illegal gold-mining. We developed a modelling framework to estimate potential deforestation leakages from French Guiana to Suriname in the border areas. Based on our assumptions, we estimated a decrease in deforestation due to gold-mining of approx. 4300 hectares in French Guiana and an increase of approx. 12 100 hectares in Suriname in response to the active military repression of illegal gold-mining launched in French Guiana. Gold-mining in the Guiana Shield provides challenging questions regarding REDD+ implementation. These questions are discussed at the end of this study and are important to policy makers who need to provide sustainable alternative employment to local populations in order to ensure the effectiveness of environmental policies.

  4. The persistence of lead from past gasoline emissions and mining drainage in a large riparian system: Evidence from lead isotopes in the Sacramento River, California

    USGS Publications Warehouse

    Dunlap, C.E.; Alpers, Charles N.; Bouse, R.; Taylor, Howard E.; Unruh, D.M.; Flegal, A.R.

    2008-01-01

    Lead concentrations and isotope ratios measured in river water colloids and streambed sediment samples along 426 km of the Sacramento River, California reveal that the influence of lead from the historical mining of massive sulfide deposits in the West Shasta Cu-mining district (at the headwaters of the Sacramento River) is confined to a 60 km stretch of river immediately downstream of that mining region, whereas inputs from past leaded gasoline emissions and historical hydraulic Au-mining in the Sierra Nevadan foothills are the dominant lead sources in the remaining 370 km of the river. Binary mixing calculations suggest that more than 50% of the lead in the Sacramento River outside of the region of influence of the West Shasta Cu-mining district is derived from past depositions of leaded gasoline emissions. This predominance is the first direct documentation of the geographic extent of gasoline lead persistence throughout a large riparian system (>160,000 km²) and corroborates previous observations based on samples taken at the mouth of the Sacramento River. In addition, new analyses of sediment samples from the hydraulic gold mines of the Sierra Nevada foothills confirm the present-day fluxes into the Sacramento River of contaminant metals derived from historical hydraulic Au-mining that occurred during the latter half of the 19th and early part of the 20th centuries. These fluxes occur predominantly during periods of elevated river discharge associated with heavy winter precipitation in northern California. In the broadest context, the study demonstrates the potential for altered precipitation patterns resulting from climate change to affect the mobility and transport of soil-bound contaminants in the surface environment. © 2008 Elsevier Ltd.
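
    The abstract does not give the form of its binary mixing calculation. A common two-endmember version, sketched here under the simplifying assumption that the measured isotope ratio mixes linearly between the two sources, solves for the fraction contributed by one endmember. All numeric ratios below are hypothetical placeholders, not values from the study.

```python
def endmember_fraction(r_mix, r_a, r_b):
    """Fraction of lead attributed to endmember A in a two-source mixture.

    Solves r_mix = f * r_a + (1 - f) * r_b for f. Assumption: the isotope
    ratio mixes linearly, which is an approximation when the two sources
    differ in the abundance of the ratio's denominator isotope.
    """
    if r_a == r_b:
        raise ValueError("endmember ratios must differ")
    return (r_mix - r_b) / (r_a - r_b)


if __name__ == "__main__":
    # Hypothetical isotope ratios for two sources and one river sample.
    source_a, source_b, sample = 1.20, 1.16, 1.19
    fraction_a = endmember_fraction(sample, source_a, source_b)
    print(f"source A contributes about {fraction_a:.0%} of the lead")
```

    A result outside the range 0 to 1 would indicate that the sample does not lie between the two chosen endmembers, for example because a third source contributes.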

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Solc, J.

    The reclamation effort typically deals with consequences of mining activity instead of being planned well before the mining. Detailed assessment of principal hydro- and geochemical processes participating in pore and groundwater chemistry evolution was carried out at three surface mine localities in North Dakota: the Fritz mine, the Indian Head mine, and the Velva mine. The geochemical model MINTEQUA2 and advanced statistical analysis coupled with traditional interpretive techniques were used to determine site-specific environmental characteristics and to compare the differences between study sites. Multivariate statistical analysis indicates that sulfate, magnesium, calcium, the gypsum saturation index, and sodium contribute the most to overall differences in groundwater chemistry between study sites. Soil paste extract pH and EC measurements performed on over 3700 samples document extremely acidic soils at the Fritz mine. The number of samples with pH <5.5 reaches 80%-90% of total samples from discrete depths near the top of the soil profile at the Fritz mine. Soil samples from Indian Head and Velva do not show acidity below the pH 5.5 limit. The percentage of samples with EC > 3 mS cm⁻¹ is between 20% and 40% at the Fritz mine and below 20% for samples from Indian Head and Velva. The results of geochemical modeling indicate an increased tendency for gypsum saturation within the vadose zone, particularly within the lands disturbed by mining activity. This trend is directly associated with increased concentrations of sulfate anions as a result of mineral oxidation. Geochemical modeling, statistical analysis, and soil extract pH and EC measurements proved to be reliable, fast, and relatively cost-effective tools for the assessment of soil acidity, the extent of the oxidation zone, and the potential for negative impact on pore and groundwater chemistry.

  6. Analysis, Mining and Visualization Service at NCSA

    NASA Astrophysics Data System (ADS)

    Wilhelmson, R.; Cox, D.; Welge, M.

    2004-12-01

    NCSA's goal is to create a balanced system that fully supports high-end computing as well as: 1) high-end data management and analysis; 2) visualization of massive, highly complex data collections; 3) large databases; 4) geographically distributed Grid computing; and 5) collaboratories, all based on a secure computational environment and driven with workflow-based services. To this end NCSA has defined a new technology path that includes the integration and provision of cyberservices in support of data analysis, mining, and visualization. NCSA has begun to develop and apply a data mining system, NCSA Data-to-Knowledge (D2K), in conjunction with both the application and research communities. NCSA D2K will enable the formation of model-based application workflows and visual programming interfaces for rapid data analysis. The Java-based D2K framework, which integrates analytical data mining methods with data management, data transformation, and information visualization tools, will be configurable from the cyberservices (web and grid services, tools, etc.) viewpoint to solve a wide range of important data mining problems. This effort will use modules, such as new classification methods for the detection of high-risk geoscience events, and existing D2K data management, machine learning, and information visualization modules. A D2K cyberservices interface will be developed to seamlessly connect client applications with remote back-end D2K servers, providing computational resources for data mining and integration with local or remote data stores. This work is being coordinated with SDSC's data and services efforts. The new NCSA Visualization embedded workflow environment (NVIEW) will be integrated with D2K functionality to tightly couple informatics and scientific visualization with the data analysis and management services. 
Visualization services will access and filter disparate data sources, simplifying tasks such as fusing related data from distinct sources into a coherent visual representation. This approach enables collaboration among geographically dispersed researchers via portals and front-end clients, and the coupling with data management services enables recording associations among datasets and building annotation systems into visualization tools and portals, giving scientists a persistent, shareable, virtual lab notebook. To facilitate provision of these cyberservices to the national community, NCSA will be providing a computational environment for large-scale data assimilation, analysis, mining, and visualization. This will be initially implemented on the new 512-processor shared-memory SGI systems recently purchased by NCSA. In addition to standard batch capabilities, NCSA will provide on-demand capabilities for those projects requiring rapid response (e.g., development of severe weather, earthquake events) for decision makers. It will also be used for non-sequential interactive analysis of data sets where it is important to have access to large data volumes over space and time.

  7. “Even if I were to consent, my family will never agree”: exploring autopsy services for posthumous occupational lung disease compensation among mineworkers in South Africa

    PubMed Central

    Banyini, Audrey V.; Rees, David; Gilbert, Leah

    2013-01-01

    Context: In the South African mining sector, cardiorespiratory-specific autopsies are conducted under the Occupational Diseases in Mines and Works Act (ODMWA) on deceased mineworkers to determine eligibility for compensation. However, low levels of autopsy utilisation undermine the value of the service. Objective: To explore enablers and barriers to consent that impact on ODMWA autopsy utilisation for posthumous monetary compensation. Methods: In-depth interviews were conducted with mineworkers, widows and relatives of deceased mineworkers as well as traditional healers and mine occupational health practitioners. Results: A range of socio-cultural barriers to consent for an autopsy was identified. These barriers were largely related to gendered power relations, traditional and religious beliefs, and communication and trust. Understanding these barriers presents opportunities to intervene so as to increase autopsy utilisation. Conclusions: Effective interventions could include engagement with healthy mine-workers and their families and re-evaluating the permanent removal of organs. The study adds to our understanding of utilisation of the autopsy services. PMID:23364088

  8. Mining Hesitation Information by Vague Association Rules

    NASA Astrophysics Data System (ADS)

    Lu, An; Ng, Wilfred

    In many online shopping applications, such as Amazon and eBay, traditional Association Rule (AR) mining has limitations as it only deals with the items that are sold but ignores the items that are almost sold (for example, those items that are put into the basket but not checked out). We say that those almost sold items carry hesitation information, since customers are hesitating to buy them. The hesitation information of items is valuable knowledge for the design of good selling strategies. However, there is no conceptual model that is able to capture different statuses of hesitation information. Herein, we apply and extend vague set theory in the context of AR mining. We define the concepts of attractiveness and hesitation of an item, which represent the overall information of a customer's intent on an item. Based on the two concepts, we propose the notion of Vague Association Rules (VARs). We devise an efficient algorithm to mine the VARs. Our experiments show that our algorithm is efficient and the VARs capture more specific and richer information than do the traditional ARs.
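
    The abstract names attractiveness and hesitation without defining them. One plausible reading, following the usual vague-set convention of an interval [alpha, 1 - beta], is sketched below: alpha is the share of customers who bought the item, beta the share who showed no interest, attractiveness is the interval midpoint and hesitation its width. These definitions and the status labels are assumptions made for illustration, and the paper's own definitions may differ.

```python
from collections import Counter


def vague_value(statuses):
    """Return the vague interval (alpha, 1 - beta) for one item.

    `statuses` holds one label per customer: "bought", "hesitated"
    (for example put in the basket but not checked out) or "ignored".
    The labels are hypothetical.
    """
    if not statuses:
        raise ValueError("statuses must not be empty")
    counts = Counter(statuses)
    alpha = counts["bought"] / len(statuses)   # evidence for the item
    beta = counts["ignored"] / len(statuses)   # evidence against the item
    return alpha, 1 - beta


def attractiveness(alpha, upper):
    """Midpoint of the vague interval (assumed definition)."""
    return (alpha + upper) / 2


def hesitation(alpha, upper):
    """Width of the vague interval (assumed definition)."""
    return upper - alpha


if __name__ == "__main__":
    item_statuses = ["bought", "bought", "hesitated", "ignored"]
    low, high = vague_value(item_statuses)
    print(attractiveness(low, high), hesitation(low, high))
```

    A traditional association rule would see only the two "bought" records here; the interval keeps the "hesitated" record as separate information instead of discarding it.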

  9. SPATIAL DISTRIBUTIONS OF ARSENIC EXPOSURE AND MINING COMMUNITIES FROM NHEXAS ARIZONA

    EPA Science Inventory

    Within the context of the National Human Exposure Assessment Survey (NHEXAS), metals were evaluated in the air, soil, dust, water, food, beverages, and urine of a single respondent. Potential doses were calculated for five metals including arsenic. In this paper, we seek to val...

  10. Review of Lead-Zinc Mining Impact on Landscape in the Tri-State Mining District using Small Unmanned Aerial Vehicles.

    NASA Astrophysics Data System (ADS)

    Bhakta, K. D.; Yeboah-Forson, A.

    2015-12-01

    The Tri-State lead and zinc mining district in SW Missouri, SE Kansas, and NE Oklahoma encompasses nearly 2,500 sq. miles of land and at its peak accounted for half of US zinc production (23,000,000 tons), which surpassed one billion dollars in economic value. Once these lead- and zinc-rich ores were extracted, mining and milling sites were abandoned, leaving behind a new landscape with numerous environmental challenges. Since 1970, most of the sites have been targeted for remediation and reclamation by federal and state agencies including the EPA. In order to capture the full extent of the impact of lead and zinc mining in the Tri-State area, numerous geoscientific approaches, including data from small unmanned aerial vehicles (UAVs), were employed to investigate the influence of mining in the study area. The study presented here is focused on observational assessment of the existing landscape using multiple commercial high-definition datasets from UAVs to study different sites across areas of concern in the three states. Primary results (images), together with analyzed DEM and GIS data from abandoned mines, showed the potential to provide a quick snapshot of successfully or unsuccessfully remediated areas. Although research and remediation of the Tri-State mining district are a continuous process, evidence from this geomorphic study suggests that UAVs can provide a quick overview of the remediated landscape or serve as a primary background tool for a more detailed site-specific environmental study.

  11. Assumption-aware tools and agency; an interrogation of the primary artifacts of the program evaluation and design profession in working with complex evaluands and complex contexts.

    PubMed

    Morrow, Nathan; Nkwake, Apollo M

    2016-12-01

    Like artisans in a professional guild, we evaluators create tools to suit our ever-evolving practice. The tools we use as evaluators are the primary artifacts of our profession; they reflect our practice and embody an amalgamation of paradigms and assumptions. With the increasing shifts in evaluation purposes from judging program worth to understanding how programs work, the evaluator's role is changing to that of facilitating stakeholders in a learning process. This involves clarifying purposes and choices, as well as unearthing critical assumptions. In such a role, evaluators become major tool-users and begin to innovate with small refinements or produce completely new tools to fit a specific challenge or context. We interrogate the form and function of 12 tools used by evaluators when working with complex evaluands and complex contexts. The form is described in terms of traditional qualitative techniques and particular characteristics of the elements, use and presentation of each tool. Then the function of each tool is analyzed with respect to articulating assumptions and affecting the agency of evaluators and stakeholders in complex contexts. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Neural networks in astronomy.

    PubMed

    Tagliaferri, Roberto; Longo, Giuseppe; Milano, Leopoldo; Acernese, Fausto; Barone, Fabrizio; Ciaramella, Angelo; De Rosa, Rosario; Donalek, Ciro; Eleuteri, Antonio; Raiconi, Giancarlo; Sessa, Salvatore; Staiano, Antonino; Volpicelli, Alfredo

    2003-01-01

    In the last decade, the use of neural networks (NN) and of other soft computing methods has also begun to spread in the astronomical community which, due to the required accuracy of the measurements, is usually reluctant to use automatic tools to perform even the most common tasks of data reduction and data mining. The federation of heterogeneous large astronomical databases, which is foreseen in the framework of the astrophysical virtual observatory and national virtual observatory projects, is, however, posing unprecedented data mining and visualization problems which will find a rather natural and user-friendly answer in artificial intelligence tools based on NNs, fuzzy sets or genetic algorithms. This review is aimed at both astronomers (who often have little knowledge of the methodological background) and computer scientists (who often know little about potentially interesting applications), and is therefore structured as follows: after giving a short introduction to the subject, we summarize the methodological background and focus our attention on some of the most interesting fields of application, namely: object extraction and classification, time series analysis, noise identification, and data mining. Most of the original work described in the paper has been performed in the framework of the AstroNeural collaboration (Napoli-Salerno).

  13. Detecting Diseases in Medical Prescriptions Using Data Mining Tools and Combining Techniques.

    PubMed

    Teimouri, Mehdi; Farzadfar, Farshad; Soudi Alamdari, Mahsa; Hashemi-Meshkini, Amir; Adibi Alamdari, Parisa; Rezaei-Darzi, Ehsan; Varmaghani, Mehdi; Zeynalabedini, Aysan

    2016-01-01

    Data about the prevalence of communicable and non-communicable diseases, as one of the most important categories of epidemiological data, are used for interpreting the health status of communities. This study aims to calculate the prevalence of outpatient diseases through the characterization of outpatient prescriptions. The data used in this study were collected from 1412 prescriptions for various types of diseases, of which we focused on the identification of ten diseases. In this study, data mining tools are used to identify the diseases for which prescriptions are written. In order to evaluate the performance of these methods, we compare the results with the Naïve method. Then, combining methods are used to improve the results. Results showed that Support Vector Machine, with an accuracy of 95.32%, shows better performance than the other methods. The result of the Naïve method, with an accuracy of 67.71%, is 20% worse than the Nearest Neighbor method, which has the lowest level of accuracy among the other classification algorithms. The results indicate that the implementation of data mining algorithms resulted in good performance in the characterization of outpatient diseases. These results can help to choose appropriate methods for the classification of prescriptions on larger scales.

  14. Possibilities of magnetotelluric methods in geophysical exploration for ore minerals

    NASA Astrophysics Data System (ADS)

    Varentsov, Iv. M.; Kulikov, V. A.; Yakovlev, A. G.; Yakovlev, D. V.

    2013-05-01

    In the past decade, applications of the magnetotelluric method in electric prospecting for ore bodies have been rapidly progressing. In the present work, we summarize the first results in this direction. We discuss the specificity of the geoelectrical models in the problems of mining prospecting for ore bodies. The state-of-the-art capabilities of the method, which rely on synchronous observation systems and the procedure of joint inversion of magnetotelluric and magnetovariational responses, are considered in the context of ore mineral exploration. The results of modeling a typical mining audio-magnetotelluric survey for ore minerals are presented. On the basis of these simulations and the data provided by in-situ soundings, efficient approaches to the processing, analysis, and inversion of these data are discussed and illustrated. The future trends in magnetotellurics as applied to mining prospecting are analyzed.

  15. The determination of methane resources from liquidated coal mines

    NASA Astrophysics Data System (ADS)

    Trenczek, Stanisław

    2017-11-01

    The article refers to methane present in hard coal seams, which may pose a serious risk to workers, as evidenced by examples of incidents, and may also be a high energy source. That second issue concerns the possibility of obtaining methane from liquidated coal mines. The current methodology for determination of methane resources from hard coal deposits is discussed. Methods of assessing methane emissions from hard coal deposits are given, including the degree of rock mass fracture, which is affected and not affected by mining. Additional criteria for methane recovery from the methane deposit are discussed using one example (of many types) of methane power generation equipment in the context of the estimation of potential viable resources. Finally, the concept of “methane resource exploitation from coal mine” refers to the potential for exploitation of the resource and the acquisition of methane for business purposes.

  16. Local Sustainability and Gender Ratio: Evaluating the Impacts of Mining and Tourism on Sustainable Development in Yunnan, China

    PubMed Central

    Huang, Ganlin; Ali, Saleem

    2015-01-01

    This study employed rapid evaluation methods to investigate how the leading industries of mining and tourism impact sustainability as manifest through social, economic and environmental dimensions in Yunnan, China. Within the social context, we also consider the differentiated impact on gender ratio, which is a salient feature of sustained development trajectories. Our results indicate that mining areas performed better than tourism areas in economic aspects but fell behind in social development, especially regarding the issue of gender balance. Conclusions on environmental status cannot be drawn due to a lack of data. The results from the environmental indicators are mixed. Our study demonstrates that rapid evaluation using currently available data can provide a means of greater understanding regarding local sustainability and highlights areas that need attention from policy makers, agencies and academia. PMID:25607602

  17. Integrated pathway-based transcription regulation network mining and visualization based on gene expression profiles.

    PubMed

    Kibinge, Nelson; Ono, Naoaki; Horie, Masafumi; Sato, Tetsuo; Sugiura, Tadao; Altaf-Ul-Amin, Md; Saito, Akira; Kanaya, Shigehiko

    2016-06-01

    Conventionally, workflows examining transcription regulation networks from gene expression data involve distinct analytical steps. There is a need for pipelines that unify data mining and inference deduction into a single framework to enhance interpretation and hypothesis generation. We propose a workflow that merges network construction with gene expression data mining, focusing on regulation processes in the context of transcription factor driven gene regulation. The pipeline implements pathway-based modularization of expression profiles into functional units to improve biological interpretation. The integrated workflow was implemented as a web application (TransReguloNet) with functions that enable pathway visualization and comparison of transcription factor activity between sample conditions defined in the experimental design. The pipeline merges differential expression, network construction, pathway-based abstraction, clustering and visualization. The framework was applied in the analysis of actual expression datasets related to lung, breast and prostate cancer. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. An Adaptive Sensor Mining Framework for Pervasive Computing Applications

    NASA Astrophysics Data System (ADS)

    Rashidi, Parisa; Cook, Diane J.

    Analyzing sensor data in pervasive computing applications brings unique challenges to the KDD community. The challenge is heightened when the underlying data source is dynamic and the patterns change. We introduce a new adaptive mining framework that detects patterns in sensor data, and more importantly, adapts to the changes in the underlying model. In our framework, the frequent and periodic patterns of data are first discovered by the Frequent and Periodic Pattern Miner (FPPM) algorithm; and then any changes in the discovered patterns over the lifetime of the system are discovered by the Pattern Adaptation Miner (PAM) algorithm, in order to adapt to the changing environment. This framework also captures vital context information present in pervasive computing applications, such as the startup triggers and temporal information. In this paper, we present a description of our mining framework and validate the approach using data collected in the CASAS smart home testbed.

  19. PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

    PubMed Central

    Droit, Arnaud; Hunter, Joanna M; Rouleau, Michèle; Ethier, Chantal; Picard-Cloutier, Aude; Bourgais, David; Poirier, Guy G

    2007-01-01

    Background: In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins, and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. These data must be archived and analysed using specialized bioinformatics tools. Description: We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion: Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5. PMID:18093328

  20. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Summary: Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848
