Sample records for text mining part

  1. 40 CFR 372.23 - SIC and NAICS codes to which this Part applies.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... facilities primarily engaged in reproducing text, drawings, plans, maps, or other copy, by blueprinting...)); 212324 Kaolin and Ball Clay Mining Limited to facilities operating without a mine or quarry and that are...)); 212393 Other Chemical and Fertilizer Mineral Mining Limited to facilities operating without a mine or quarry...

  2. 76 FR 40649 - Indiana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-11

    ... at 312 IAC 25-6-30 Surface mining; explosives; general requirements. The full text of the program... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 914... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed...

  3. 76 FR 12849 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... (underground mining). The text of the Kentucky regulations can be found in the administrative record and online... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 917 [KY-252-FOR; OSM-2009-0011] Kentucky Regulatory Program AGENCY: Office of Surface Mining Reclamation...

  4. Automatic target validation based on neuroscientific literature mining for tractography

    PubMed Central

    Vasques, Xavier; Richardet, Renaud; Hill, Sean L.; Slater, David; Chappelier, Jean-Cedric; Pralong, Etienne; Bloch, Jocelyne; Draganski, Bogdan; Cif, Laura

    2015-01-01

    Target identification for tractography studies requires solid anatomical knowledge validated by an extensive literature review across species for each seed structure to be studied. Manual literature review to identify targets for a given seed region is tedious and potentially subjective. Therefore, complementary approaches would be useful. We propose to use text-mining models to automatically suggest potential targets from the neuroscientific literature, full-text articles and abstracts, so that they can be used for anatomical connection studies and more specifically for tractography. We applied text-mining models to three structures: two well-studied structures that are validated deep brain stimulation targets (the internal globus pallidus and the subthalamic nucleus) and the nucleus accumbens, an exploratory target for treating psychiatric disorders. We performed a systematic review of the literature to document the projections of the three selected structures and compared it with the targets proposed by text-mining models, both in rat and primate (including human). We ran probabilistic tractography on the nucleus accumbens and compared the output with the results of the text-mining models and literature review. Overall, text-mining the literature could find three times as many targets as two man-weeks of curation could. The overall efficiency of the text-mining against literature review in our study was 98% recall (at 36% precision), meaning that over all the targets for the three selected seeds, only one target has been missed by text-mining. We demonstrate that connectivity for a structure of interest can be extracted from a very large amount of publications and abstracts. We believe this tool will be useful in helping the neuroscience community to facilitate connectivity studies of particular brain regions. The text mining tools used for the study are part of the HBP Neuroinformatics Platform, publicly available at http://connectivity-brainer.rhcloud.com/. PMID:26074781
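
    The recall and precision figures quoted in this abstract compare a set of text-mined targets against a manually curated reference set. As a minimal sketch of how such figures are computed, the following uses invented placeholder names (`region_a` and so on) and a hypothetical helper `precision_recall`; none of the data comes from the study.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) of a predicted set against a reference set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical toy data, NOT the study's results.
curated_targets = {"region_a", "region_b", "region_c", "region_d"}
mined_targets = {
    "region_a", "region_b", "region_c",                          # found by both
    "region_e", "region_f", "region_g", "region_h", "region_i",  # mined only
}

precision, recall = precision_recall(mined_targets, curated_targets)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

    High recall with low precision, as reported above, means that few curated targets are missed while many suggested targets still need manual checking.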

  5. 78 FR 64397 - Mississippi Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-29

    ... text of the program amendment available at www.regulations.gov. A. Mississippi Surface Coal Mining... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 924...; S2D2SSS08011000SX066A00033F13XS501520] Mississippi Regulatory Program AGENCY: Office of Surface Mining Reclamation and Enforcement...

  6. Biocuration workflows and text mining: overview of the BioCreative 2012 Workshop Track II.

    PubMed

    Lu, Zhiyong; Hirschman, Lynette

    2012-01-01

    Manual curation of data from the biomedical literature is a rate-limiting factor for many expert curated databases. Despite the continuing advances in biomedical text mining and the pressing needs of biocurators for better tools, few existing text-mining tools have been successfully integrated into production literature curation systems such as those used by the expert curated databases. To close this gap and better understand all aspects of literature curation, we invited submissions of written descriptions of curation workflows from expert curated databases for the BioCreative 2012 Workshop Track II. We received seven qualified contributions, primarily from model organism databases. Based on these descriptions, we identified commonalities and differences across the workflows, the common ontologies and controlled vocabularies used and the current and desired uses of text mining for biocuration. Compared to a survey done in 2009, our 2012 results show that many more databases are now using text mining in parts of their curation workflows. In addition, the workshop participants identified text-mining aids for finding gene names and symbols (gene indexing), prioritization of documents for curation (document triage) and ontology concept assignment as those most desired by the biocurators. DATABASE URL: http://www.biocreative.org/tasks/bc-workshop-2012/workflow/.

  7. PathText: a text mining integrator for biological pathway visualizations

    PubMed Central

    Kemper, Brian; Matsuzaki, Takuya; Matsuoka, Yukiko; Tsuruoka, Yoshimasa; Kitano, Hiroaki; Ananiadou, Sophia; Tsujii, Jun'ichi

    2010-01-01

    Motivation: Metabolic and signaling pathways are an increasingly important part of organizing knowledge in systems biology. They serve to integrate collective interpretations of facts scattered throughout literature. Biologists construct a pathway by reading a large number of articles and interpreting them as a consistent network, but most of the models constructed currently lack direct links to those articles. Biologists who want to check the original articles have to spend substantial amounts of time to collect relevant articles and identify the sections relevant to the pathway. Furthermore, with the scientific literature expanding by several thousand papers per week, keeping a model relevant requires a continuous curation effort. In this article, we present a system designed to integrate a pathway visualizer, text mining systems and annotation tools into a seamless environment. This will enable biologists to freely move between parts of a pathway and relevant sections of articles, as well as identify relevant papers from large text bases. The system, PathText, is developed by Systems Biology Institute, Okinawa Institute of Science and Technology, National Centre for Text Mining (University of Manchester) and the University of Tokyo, and is being used by groups of biologists from these locations. Contact: brian@monrovian.com. PMID:20529930

  8. Assessing semantic similarity of texts - Methods and algorithms

    NASA Astrophysics Data System (ADS)

    Rozeva, Anna; Zerkova, Silvia

    2017-12-01

    Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis that implements text-mining techniques. Text mining involves several pre-processing steps, which yield a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.
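
    The vector-based document model and the similarity calculation mentioned in this abstract can be illustrated with a small sketch. It builds plain term-frequency vectors and compares them with cosine similarity; it deliberately omits the LSA dimensionality-reduction step, and the helper names and sample sentences are assumptions made for illustration, not material from the paper.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text, tokenize on letters, and count term frequencies."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical three-document corpus.
docs = [
    "text mining extracts features from documents",
    "mining features from text documents",
    "stars and galaxies in astronomy",
]
vectors = [term_vector(d) for d in docs]
sim_close = cosine_similarity(vectors[0], vectors[1])  # many shared terms
sim_far = cosine_similarity(vectors[0], vectors[2])    # no shared terms
print(f"close={sim_close:.3f} far={sim_far:.3f}")
```

    In an LSA pipeline the same cosine measure would typically be applied after projecting the vectors into the reduced latent space rather than to the raw counts.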

  9. 30 CFR 900.12 - State regulatory programs.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....12 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE INTRODUCTION § 900.12 State... to be codified under the applicable part number assigned to the State. The full text will not appear...

  10. 30 CFR 745.11 - Application and agreement.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....11 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR... approval under part 731 of this chapter, and has or may have within the State surface coal mining and... the full text of the terms of the proposed cooperative agreement as submitted or as subsequently...

  11. 29 CFR 570.119 - Fourteen-year minimum.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... occupations other than manufacturing and mining, the Secretary is authorized to issue regulations or orders... Subpart C of this part. 29-30 [Reserved] (a) Manufacturing, mining, or processing occupations; (b... of the user, the revised text is set forth as follows: § 570.119 Fourteen-year minimum. With respect...

  12. Advances in Machine Learning and Data Mining for Astronomy

    NASA Astrophysics Data System (ADS)

    Way, Michael J.; Scargle, Jeffrey D.; Ali, Kamal M.; Srivastava, Ashok N.

    2012-03-01

    Advances in Machine Learning and Data Mining for Astronomy documents numerous successful collaborations among computer scientists, statisticians, and astronomers who illustrate the application of state-of-the-art machine learning and data mining techniques in astronomy. Due to the massive amount and complexity of data in most scientific disciplines, the material discussed in this text transcends traditional boundaries between various areas in the sciences and computer science. The book's introductory part provides context to issues in the astronomical sciences that are also important to health, social, and physical sciences, particularly probabilistic and statistical aspects of classification and cluster analysis. The next part describes a number of astrophysics case studies that leverage a range of machine learning and data mining technologies. In the last part, developers of algorithms and practitioners of machine learning and data mining show how these tools and techniques are used in astronomical applications. With contributions from leading astronomers and computer scientists, this book is a practical guide to many of the most important developments in machine learning, data mining, and statistics. It explores how these advances can solve current and future problems in astronomy and looks at how they could lead to the creation of entirely new algorithms within the data mining community.

  13. Text Classification for Organizational Researchers

    PubMed Central

    Kobayashi, Vladimer B.; Mol, Stefan T.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.

    2017-01-01

    Organizations are increasingly interested in classifying texts or parts thereof into categories, as this enables more effective use of their information. Manual procedures for text classification work well for up to a few hundred documents. However, when the number of documents is larger, manual procedures become laborious, time-consuming, and potentially unreliable. Techniques from text mining facilitate the automatic assignment of text strings to categories, making classification expedient, fast, and reliable, which creates potential for its application in organizational research. The purpose of this article is to familiarize organizational researchers with text mining techniques from machine learning and statistics. We describe the text classification process in several roughly sequential steps, namely training data preparation, preprocessing, transformation, application of classification techniques, and validation, and provide concrete recommendations at each step. To help researchers develop their own text classifiers, the R code associated with each step is presented in a tutorial. The tutorial draws from our own work on job vacancy mining. We end the article by discussing how researchers can validate a text classification model and the associated output. PMID:29881249
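
    The roughly sequential steps named in this abstract (training data preparation, preprocessing, transformation, classification, validation) can be sketched end to end. The article's own tutorial is in R; what follows is an independent Python sketch using a multinomial naive Bayes classifier chosen here for brevity, with invented job-vacancy sentences and hypothetical labels. It is not the authors' code or data.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Preprocessing: lowercase and split into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Transformation: each class becomes a bag-of-words count table.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = preprocess(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = preprocess(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Training data preparation: hypothetical labeled sentences.
train = [
    ("experience with python and statistics required", "requirement"),
    ("must have a degree in computer science", "requirement"),
    ("knowledge of databases is required", "requirement"),
    ("we offer a competitive salary and pension", "benefit"),
    ("flexible hours and free lunch offered", "benefit"),
    ("generous holiday allowance and salary", "benefit"),
]
held_out = [
    ("degree and experience required", "requirement"),
    ("salary and free lunch", "benefit"),
]

clf = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])

# Validation: accuracy on sentences not seen during training.
accuracy = sum(clf.predict(t) == l for t, l in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

    A held-out set this small says little about real performance; it only shows where the validation step sits in the workflow.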

  14. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy.

    PubMed

    Bekhuis, Tanja

    2006-04-03

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. Report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.

  15. Online discourse on fibromyalgia: text-mining to identify clinical distinction and patient concerns.

    PubMed

    Park, Jungsik; Ryu, Young Uk

    2014-10-07

    The purpose of this study was to evaluate the possibility of using text-mining to identify clinical distinctions and patient concerns in online memoirs posted by patients with fibromyalgia (FM). A total of 399 memoirs were collected from an FM group website. The unstructured data of memoirs associated with FM were collected through a crawling process and converted into structured data with a concordance, part-of-speech tagging, and word frequency. We also conducted a lexical analysis and phrase pattern identification. After examining the data, a set of FM-related keywords was obtained and phrase net relationships were set through a web-based visualization tool. The clinical distinction of FM was verified. Pain was the biggest issue for the FM patients. The pain affected body parts including 'muscles,' 'leg,' 'neck,' 'back,' 'joints,' and 'shoulders' with accompanying symptoms such as 'spasms,' 'stiffness,' and 'aching,' and was described as 'severe,' 'chronic,' and 'constant.' This study also demonstrated that it was possible to understand the interests and concerns of FM patients through text-mining. FM patients wanted to escape from the pain and symptoms, so they were interested in medical treatment and help. They also seemed to have an interest in their work and occupation, and hoped to continue to live life through their relationships with the people around them. This research shows the potential for extracting keywords to confirm the clinical distinction of a certain disease, and text-mining can help objectively understand the concerns of patients by generalizing their large number of subjective illness experiences. However, it is believed that there are limitations to the processes and methods for organizing and classifying large amounts of text, so these limits have to be considered when analyzing the results. The development of research methodology to overcome these limitations is greatly needed.
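
    Two of the structuring steps named in this abstract, word frequency and a concordance, are simple enough to sketch. The example below counts tokens and builds keyword-in-context lines; the function names and the two sample sentences are invented for illustration and are not taken from the study's corpus.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(texts):
    """Count how often each token occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, width=2):
    """Keyword-in-context lines: `width` tokens on either side of each hit."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample posts.
posts = [
    "The pain in my back is constant",
    "Chronic pain keeps me from work",
]
freq = word_frequencies(posts)
kwic = concordance(posts, "pain")
print(freq.most_common(3))
print(kwic)
```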

  16. BioC implementations in Go, Perl, Python and Ruby

    PubMed Central

    Liu, Wanli; Islamaj Doğan, Rezarta; Kwon, Dongseop; Marques, Hernani; Rinaldi, Fabio; Wilbur, W. John; Comeau, Donald C.

    2014-01-01

    As part of a communitywide effort for evaluating text mining and information extraction systems applied to the biomedical domain, BioC is focused on the goal of interoperability, currently a major barrier to wide-scale adoption of text mining tools. BioC is a simple XML format, specified by DTD, for exchanging data for biomedical natural language processing. With initial implementations in C++ and Java, BioC provides libraries of code for reading and writing BioC text documents and annotations. We extend BioC to Perl, Python, Go and Ruby. We used SWIG to extend the C++ implementation for Perl and one Python implementation. A second Python implementation and the Ruby implementation use native data structures and libraries. BioC is also implemented in the Google language Go. BioC modules are functional in all of these languages, which can facilitate text mining tasks. BioC implementations are freely available through the BioC site: http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net/ PMID:24961236
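
    Because BioC is described here as a simple XML format, a reader written with only a standard XML parser is easy to sketch. The fragment below is hand-written and follows one reading of the BioC element names (`collection`, `document`, `passage`, `annotation`, `location`); treat that layout as an assumption and check it against the DTD on the BioC site. The function `read_annotations` is hypothetical and is not part of any BioC library.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the element layout is an assumption, see the BioC DTD.
SAMPLE = """<collection>
  <source>hypothetical example</source>
  <date>20140101</date>
  <key>example.key</key>
  <document>
    <id>doc1</id>
    <passage>
      <offset>0</offset>
      <text>BRCA1 is a human gene.</text>
      <annotation id="a1">
        <infon key="type">gene</infon>
        <location offset="0" length="5"/>
        <text>BRCA1</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def read_annotations(xml_string):
    """Return (doc id, annotation id, text sliced by location, annotation text)."""
    root = ET.fromstring(xml_string)
    results = []
    for document in root.iter("document"):
        doc_id = document.findtext("id")
        for passage in document.iter("passage"):
            passage_offset = int(passage.findtext("offset", default="0"))
            passage_text = passage.findtext("text", default="")
            for ann in passage.iter("annotation"):
                loc = ann.find("location")
                # Location offsets are taken as document-level, so shift
                # them by the passage offset before slicing.
                start = int(loc.get("offset")) - passage_offset
                length = int(loc.get("length"))
                covered = passage_text[start:start + length]
                results.append((doc_id, ann.get("id"), covered, ann.findtext("text")))
    return results


annotations = read_annotations(SAMPLE)
print(annotations)
```

    Comparing the sliced text with the annotation's own text is a cheap consistency check when exchanging files between tools.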

  17. 49 CFR 1155.2 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... otherwise provided in the text of these regulations, the following definitions apply in this part: (1... Waste Disposal Act (42 U.S.C. 6921 et seq.), mining or oil and gas waste. (5) Institutional waste means...

  18. BioC implementations in Go, Perl, Python and Ruby.

    PubMed

    Liu, Wanli; Islamaj Doğan, Rezarta; Kwon, Dongseop; Marques, Hernani; Rinaldi, Fabio; Wilbur, W John; Comeau, Donald C

    2014-01-01

    As part of a communitywide effort for evaluating text mining and information extraction systems applied to the biomedical domain, BioC is focused on the goal of interoperability, currently a major barrier to wide-scale adoption of text mining tools. BioC is a simple XML format, specified by DTD, for exchanging data for biomedical natural language processing. With initial implementations in C++ and Java, BioC provides libraries of code for reading and writing BioC text documents and annotations. We extend BioC to Perl, Python, Go and Ruby. We used SWIG to extend the C++ implementation for Perl and one Python implementation. A second Python implementation and the Ruby implementation use native data structures and libraries. BioC is also implemented in the Google language Go. BioC modules are functional in all of these languages, which can facilitate text mining tasks. BioC implementations are freely available through the BioC site: http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net/ Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  19. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    PubMed

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  20. Text Mining in Biomedical Domain with Emphasis on Document Clustering

    PubMed Central

    2017-01-01

    Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048

  1. Text Mining of UU-ITE Implementation in Indonesia

    NASA Astrophysics Data System (ADS)

    Hakim, Lukmanul; Kusumasari, Tien F.; Lubis, Muharman

    2018-04-01

    At present, social media and networks act as one of the main platforms for sharing information, ideas, thoughts and opinions. Many people share their knowledge and express their views on the specific topics or current hot issues that interest them. Social media texts contain rich information about complaints, comments, recommendations and suggestions, posted as an automatic reaction or response to government initiatives or policies intended to overcome certain issues. This study examines the sentiment of netizens, as the vocal part of the citizenry, about the implementation of UU ITE, the first cyberlaw in Indonesia, as a means to identify the current tendency of citizen perception. To perform text mining, this study used the Twitter REST API, while R programming was used for classification analysis based on hierarchical clustering.

  2. Interactive text mining with Pipeline Pilot: a bibliographic web-based tool for PubMed.

    PubMed

    Vellay, S G P; Latimer, N E Miller; Paillard, G

    2009-06-01

    Text mining has become an integral part of all research in the medical field. Many text analysis software platforms support particular use cases and only those. We show an example of a bibliographic tool that can be used to support virtually any use case in an agile manner. Here we focus on a Pipeline Pilot web-based application that interactively analyzes and reports on PubMed search results. This will be of interest to any scientist to help identify the most relevant papers in a topical area more quickly and to evaluate the results of query refinement. Links with Entrez databases help both the biologist and the chemist alike. We illustrate this application with Leishmaniasis, a neglected tropical disease, as a case study.

  3. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    PubMed Central

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. PMID:28025348

  4. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE PAGES

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; ...

    2016-12-26

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  5. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system ‘accuracy’ remains a challenge and identify several additional common difficulties and potential research directions including (i) the ‘scalability’ issue due to the increasing need of mining information from millions of full-text articles, (ii) the ‘interoperability’ issue of integrating various text-mining systems into existing curation workflows and (iii) the ‘reusability’ issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. In conclusion, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators.

  6. A semantic model for multimodal data mining in healthcare information systems.

    PubMed

    Iakovidis, Dimitris; Smailis, Christos

    2012-01-01

    Electronic health records (EHRs) are representative examples of multimodal/multisource data collections, including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest level of a data mining process, where information is represented by multiple features, i.e. measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap by enabling direct links between low-level features and higher-level concepts, e.g. describing body parts, anatomies and pathological findings. The proposed model has been developed in web ontology language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. Its utility is demonstrated for automatic annotation of medical data.

  7. Pressing needs of biomedical text mining in biocuration and beyond: opportunities and challenges.

    PubMed

    Singhal, Ayush; Leaman, Robert; Catlett, Natalie; Lemberger, Thomas; McEntyre, Johanna; Polson, Shawn; Xenarios, Ioannis; Arighi, Cecilia; Lu, Zhiyong

    2016-01-01

    Text mining in the biomedical sciences is rapidly transitioning from small-scale evaluation to large-scale application. In this article, we argue that text-mining technologies have become essential tools in real-world biomedical research. We describe four large scale applications of text mining, as showcased during a recent panel discussion at the BioCreative V Challenge Workshop. We draw on these applications as case studies to characterize common requirements for successfully applying text-mining techniques to practical biocuration needs. We note that system 'accuracy' remains a challenge and identify several additional common difficulties and potential research directions including (i) the 'scalability' issue due to the increasing need of mining information from millions of full-text articles, (ii) the 'interoperability' issue of integrating various text-mining systems into existing curation workflows and (iii) the 'reusability' issue on the difficulty of applying trained systems to text genres that are not seen previously during development. We then describe related efforts within the text-mining community, with a special focus on the BioCreative series of challenge workshops. We believe that focusing on the near-term challenges identified in this work will amplify the opportunities afforded by the continued adoption of text-mining tools. Finally, in order to sustain the curation ecosystem and have text-mining systems adopted for practical benefits, we call for increased collaboration between text-mining researchers and various stakeholders, including researchers, publishers and biocurators. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  8. Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art

    PubMed Central

    Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H.

    2014-01-01

    Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. Text mining is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources—such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs—that are amenable to text-mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance. PMID:25151493

  9. Text Mining.

    ERIC Educational Resources Information Center

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  10. Biomedical text mining and its applications in cancer research.

    PubMed

    Zhu, Fei; Patumcharoenpol, Preecha; Zhang, Cheng; Yang, Yang; Chan, Jonathan; Meechai, Asawin; Vongsangnak, Wanwipa; Shen, Bairong

    2013-04-01

    Cancer is a malignant disease that has caused millions of human deaths. Its study has a long history of well over 100 years. There have been an enormous number of publications on cancer research. This integrated but unstructured biomedical text is of great value for cancer diagnostics, treatment, and prevention. The immense body and rapid growth of biomedical text on cancer has led to the appearance of a large number of text mining techniques aimed at extracting novel knowledge from scientific text. Biomedical text mining on cancer research is computationally automatic and high-throughput in nature. However, it is error-prone due to the complexity of natural language processing. In this review, we introduce the basic concepts underlying text mining and examine some frequently used algorithms, tools, and data sets, as well as assessing how much these algorithms have been utilized. We then discuss the current state-of-the-art text mining applications in cancer research and we also provide some resources for cancer text mining. With the development of systems biology, researchers tend to understand complex biomedical systems from a systems biology viewpoint. Thus, the full utilization of text mining to facilitate cancer systems biology research is fast becoming a major concern. To address this issue, we describe the general workflow of text mining in cancer systems biology and each phase of the workflow. We hope that this review can (i) provide a useful overview of the current work of this field; (ii) help researchers to choose text mining tools and datasets; and (iii) highlight how to apply text mining to assist cancer systems biology research. Copyright © 2012 Elsevier Inc. All rights reserved.

  11. Compatibility between Text Mining and Qualitative Research in the Perspectives of Grounded Theory, Content Analysis, and Reliability

    ERIC Educational Resources Information Center

    Yu, Chong Ho; Jannasch-Pennell, Angel; DiGangi, Samuel

    2011-01-01

    The objective of this article is to illustrate that text mining and qualitative research are epistemologically compatible. First, like many qualitative research approaches, such as grounded theory, text mining encourages open-mindedness and discourages preconceptions. Contrary to the popular belief that text mining is a linear and fully automated…

  12. Text mining meets workflow: linking U-Compare with Taverna

    PubMed Central

    Kano, Yoshinobu; Dobson, Paul; Nakanishi, Mio; Tsujii, Jun'ichi; Ananiadou, Sophia

    2010-01-01

    Summary: Text mining from the biomedical literature is of increasing importance, yet it is not easy for the bioinformatics community to create and run text mining workflows due to the lack of accessibility and interoperability of the text mining resources. The U-Compare system provides a wide range of bio text mining resources in a highly interoperable workflow environment where workflows can very easily be created, executed, evaluated and visualized without coding. We have linked U-Compare to Taverna, a generic workflow system, to expose text mining functionality to the bioinformatics community. Availability: http://u-compare.org/taverna.html, http://u-compare.org Contact: kano@is.s.u-tokyo.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20709690

  13. Naive Bayes as opinion classifier to evaluate students satisfaction based on student sentiment in Twitter Social Media

    NASA Astrophysics Data System (ADS)

    Candra Permana, Fahmi; Rosmansyah, Yusep; Setiawan Abdullah, Atje

    2017-10-01

    Students' activity on social media can provide implicit knowledge and new perspectives for an educational system. Sentiment analysis is a part of text mining that can help to analyze and classify opinion data. This research uses text mining and the naive Bayes method as an opinion classifier, to be used as an alternative method in the process of evaluating students' satisfaction for an educational institution. Based on test results, this system can determine the opinion classification in Bahasa Indonesia using naive Bayes as the opinion classifier with an accuracy level of 84%; in the comparison between the existing system and the proposed system for evaluating students' satisfaction in the learning process, there is only a difference of 16.49%.
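    As a rough illustration of the naive Bayes opinion classifier this abstract describes, the following is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing. It is not the authors' system: the function names, the whitespace tokenization, and the four English training sentences are all invented for illustration (the study itself worked with tweets in Bahasa Indonesia).

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """Count documents per class and words per class from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(tokens, class_counts, word_counts, vocab):
    """Return the label with the highest log posterior (add-one smoothing)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Smoothed log likelihood of the word given the class.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, invented for this sketch.
training = [
    ("the lecture was clear and helpful".split(), "positive"),
    ("great campus and helpful staff".split(), "positive"),
    ("the class was boring and confusing".split(), "negative"),
    ("slow registration and confusing schedule".split(), "negative"),
]
model = train_naive_bayes(training)

print(classify("helpful lecture".split(), *model))
print(classify("confusing schedule".split(), *model))
```

    Working in log space avoids numeric underflow when many word probabilities are multiplied, and the add-one smoothing keeps a word that is unseen in one class from forcing that class's probability to zero.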

  14. An open-source framework for large-scale, flexible evaluation of biomedical text mining systems.

    PubMed

    Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-29

    Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net.

  15. An open-source framework for large-scale, flexible evaluation of biomedical text mining systems

    PubMed Central

    Baumgartner, William A; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background Improved evaluation methodologies have been identified as a necessary prerequisite to the improvement of text mining theory and practice. This paper presents a publicly available framework that facilitates thorough, structured, and large-scale evaluations of text mining technologies. The extensibility of this framework and its ability to uncover system-wide characteristics by analyzing component parts as well as its usefulness for facilitating third-party application integration are demonstrated through examples in the biomedical domain. Results Our evaluation framework was assembled using the Unstructured Information Management Architecture. It was used to analyze a set of gene mention identification systems involving 225 combinations of system, evaluation corpus, and correctness measure. Interactions between all three were found to affect the relative rankings of the systems. A second experiment evaluated gene normalization system performance using as input 4,097 combinations of gene mention systems and gene mention system-combining strategies. Gene mention system recall is shown to affect gene normalization system performance much more than does gene mention system precision, and high gene normalization performance is shown to be achievable with remarkably low levels of gene mention system precision. Conclusion The software presented in this paper demonstrates the potential for novel discovery resulting from the structured evaluation of biomedical language processing systems, as well as the usefulness of such an evaluation framework for promoting collaboration between developers of biomedical language processing technologies. The code base is available as part of the BioNLP UIMA Component Repository on SourceForge.net. PMID:18230184

  16. Survey of Natural Language Processing Techniques in Bioinformatics.

    PubMed

    Zeng, Zhiqiang; Shi, Hua; Wu, Yun; Hong, Zhiling

    2015-01-01

    Informatics methods, such as text mining and natural language processing, are always involved in bioinformatics research. In this study, we discuss text mining and natural language processing methods in bioinformatics from two perspectives. First, we aim to search for knowledge on biology, retrieve references using text mining methods, and reconstruct databases. For example, protein-protein interactions and gene-disease relationship can be mined from PubMed. Then, we analyze the applications of text mining and natural language processing techniques in bioinformatics, including predicting protein structure and function, detecting noncoding RNA. Finally, numerous methods and applications, as well as their contributions to bioinformatics, are discussed for future use by text mining and natural language processing researchers.

  17. Text Mining in Organizational Research

    PubMed Central

    Kobayashi, Vladimer B.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.

    2017-01-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies. PMID:29881248
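    One of the techniques listed above, distance and similarity computing, can be sketched with cosine similarity over bag-of-words counts. This is a generic illustration, not the procedure from the article: the function name, the lowercase whitespace tokenization, and the three job-vacancy snippets are assumptions made up for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    # Only words shared by both texts contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical job-vacancy snippets, invented for this sketch.
vacancy_1 = "data analyst with experience in statistics and reporting"
vacancy_2 = "analyst role requiring statistics and reporting experience"
vacancy_3 = "truck driver for regional deliveries"

print(cosine_similarity(vacancy_1, vacancy_2))
print(cosine_similarity(vacancy_1, vacancy_3))
```

    The first pair shares most of its vocabulary and scores well above zero, while the second pair shares no words and scores exactly zero. Pairwise scores like these are a common input to the clustering step named in the same list.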

  18. Text Mining in Organizational Research.

    PubMed

    Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N

    2018-07-01

    Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.

  19. Text mining for adverse drug events: the promise, challenges, and state of the art.

    PubMed

    Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H

    2014-10-01

    Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources-such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs-that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.

  20. Health Terrain: Visualizing Large Scale Health Data

    DTIC Science & Technology

    2015-12-01

    Text mining; Data mining. ... text mining algorithms to construct a concept space. A browser-based user interface is developed to... Public health data, Notifiable condition detector, Text mining, Data mining

  1. Introducing Text Analytics as a Graduate Business School Course

    ERIC Educational Resources Information Center

    Edgington, Theresa M.

    2011-01-01

    Text analytics refers to the process of analyzing unstructured data from documented sources, including open-ended surveys, blogs, and other types of web dialog. Text analytics has enveloped the concept of text mining, an analysis approach influenced heavily from data mining. While text mining has been covered extensively in various computer…

  2. The potential of text mining in data integration and network biology for plant research: a case study on Arabidopsis.

    PubMed

    Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna J; Inzé, Dirk; Van de Peer, Yves

    2013-03-01

    Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies.

  3. Integration of Text- and Data-Mining Technologies for Use in Banking Applications

    NASA Astrophysics Data System (ADS)

    Maslankowski, Jacek

    Unstructured data, most of it in the form of text files, typically accounts for 85% of an organization's knowledge stores, but it is not always easy to find, access, analyze or use (Robb 2004). That is why it is important to use solutions based on both text and data mining, an approach known as duo mining, which can improve management based on the knowledge held in an organization. The results are interesting. Data mining deals with structured data, usually sourced from data warehouses. Text mining, sometimes called web mining, looks for patterns in unstructured data: memos, documents and web pages. Integrating text-based information with structured data enriches predictive modeling capabilities and provides new stores of insightful and valuable information for driving business and research initiatives forward.

  4. SparkText: Biomedical Text Mining on Big Data Framework.

    PubMed

    Ye, Zhan; Tafti, Ahmad P; He, Karen Y; Wang, Kai; He, Max M

    Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research.

  5. SparkText: Biomedical Text Mining on Big Data Framework

    PubMed Central

    He, Karen Y.; Wang, Kai

    2016-01-01

    Background Many new biomedical research articles are published every day, accumulating rich information, such as genetic variants, genes, diseases, and treatments. Rapid yet accurate text mining on large-scale scientific literature can discover novel knowledge to better understand human diseases and to improve the quality of disease diagnosis, prevention, and treatment. Results In this study, we designed and developed an efficient text mining framework called SparkText on a Big Data infrastructure, which is composed of Apache Spark data streaming and machine learning methods, combined with a Cassandra NoSQL database. To demonstrate its performance for classifying cancer types, we extracted information (e.g., breast, prostate, and lung cancers) from tens of thousands of articles downloaded from PubMed, and then employed Naïve Bayes, Support Vector Machine (SVM), and Logistic Regression to build prediction models to mine the articles. The accuracy of predicting a cancer type by SVM using the 29,437 full-text articles was 93.81%. While competing text-mining tools took more than 11 hours, SparkText mined the dataset in approximately 6 minutes. Conclusions This study demonstrates the potential for mining large-scale scientific articles on a Big Data infrastructure, with real-time update from new articles published daily. SparkText can be extended to other areas of biomedical research. PMID:27685652

  6. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    ERIC Educational Resources Information Center

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  7. A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts.

    PubMed

    Westergaard, David; Stærfeldt, Hans-Henrik; Tønsberg, Christian; Jensen, Lars Juhl; Brunak, Søren

    2018-02-01

    Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823-2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein-protein, disease-gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.

  8. A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts

    PubMed Central

    Westergaard, David; Stærfeldt, Hans-Henrik

    2018-01-01

    Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823–2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein–protein, disease–gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only. PMID:29447159

  9. PubRunner: A light-weight framework for updating text mining results.

    PubMed

    Anekalla, Kishore R; Courneya, J P; Fiorini, Nicolas; Lever, Jake; Muchow, Michael; Busby, Ben

    2017-01-01

    Biomedical text mining promises to assist biologists in quickly navigating the combined knowledge in their domain. This would allow improved understanding of the complex interactions within biological systems and faster hypothesis generation. New biomedical research articles are published daily and text mining tools are only as good as the corpus from which they work. Many text mining tools are underused because their results are static and do not reflect the constantly expanding knowledge in the field. In order for biomedical text mining to become an indispensable tool used by researchers, this problem must be addressed. To this end, we present PubRunner, a framework for regularly running text mining tools on the latest publications. PubRunner is lightweight, simple to use, and can be integrated with an existing text mining tool. The workflow involves downloading the latest abstracts from PubMed, executing a user-defined tool, pushing the resulting data to a public FTP or Zenodo dataset, and publicizing the location of these results on the public PubRunner website. We illustrate the use of this tool by re-running the commonly used word2vec tool on the latest PubMed abstracts to generate up-to-date word vector representations for the biomedical domain. This shows a proof of concept that we hope will encourage text mining developers to build tools that truly will aid biologists in exploring the latest publications.

  10. Text mining for the biocuration workflow

    PubMed Central

    Hirschman, Lynette; Burns, Gully A. P. C; Krallinger, Martin; Arighi, Cecilia; Cohen, K. Bretonnel; Valencia, Alfonso; Wu, Cathy H.; Chatr-Aryamontri, Andrew; Dowell, Karen G.; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on ‘Text Mining for the BioCuration Workflow’ at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community. PMID:22513129

  11. Text mining for the biocuration workflow.

    PubMed

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  12. The Potential of Text Mining in Data Integration and Network Biology for Plant Research: A Case Study on Arabidopsis

    PubMed Central

    Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna J.; Inzé, Dirk; Van de Peer, Yves

    2013-01-01

    Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for usage in plant network analyses. Furthermore, we combine text mining information with both protein–protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, hereby stimulating the application of text mining data in future plant biology studies. PMID:23532071

  13. Frontiers of biomedical text mining: current progress

    PubMed Central

    Zweigenbaum, Pierre; Demner-Fushman, Dina; Yu, Hong; Cohen, Kevin B.

    2008-01-01

    It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or ‘BioNLP’ in general, focusing primarily on papers published within the past year. PMID:17977867

  14. Automated detection of follow-up appointments using text mining of discharge records.

    PubMed

    Ruud, Kari L; Johnson, Matthew G; Liesinger, Juliette T; Grafft, Carrie A; Naessens, James M

    2010-06-01

    To determine whether text mining can accurately detect specific follow-up appointment criteria in free-text hospital discharge records. Cross-sectional study. Mayo Clinic Rochester hospitals. Inpatients discharged from general medicine services in 2006 (n = 6481). Textual hospital dismissal summaries were manually reviewed to determine whether the records contained specific follow-up appointment arrangement elements: date, time and either physician or location for an appointment. The data set was evaluated for the same criteria using SAS Text Miner software. The two assessments were compared to determine the accuracy of text mining for detecting records containing follow-up appointment arrangements. Agreement of text-mined appointment findings with gold standard (manual abstraction) including sensitivity, specificity, positive predictive and negative predictive values (PPV and NPV). About 55.2% (3576) of discharge records contained all criteria for follow-up appointment arrangements according to the manual review, 3.2% (113) of which were missed through text mining. Text mining incorrectly identified 3.7% (107) follow-up appointments that were not considered valid through manual review. Therefore, the text mining analysis concurred with the manual review in 96.6% of the appointment findings. Overall sensitivity and specificity were 96.8 and 96.3%, respectively; and PPV and NPV were 97.0 and 96.1%, respectively. Analysis of individual appointment criteria resulted in accuracy rates of 93.5% for date, 97.4% for time, 97.5% for physician and 82.9% for location. Text mining of unstructured hospital dismissal summaries can accurately detect documentation of follow-up appointment arrangement elements, thus saving considerable resources for performance assessment and quality-related research.
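
    The reported percentages can be checked from the counts given in the abstract. The short sketch below rebuilds the 2x2 confusion matrix from those counts (6481 records, 3576 positive on manual review, 113 missed, 107 incorrectly flagged) and recomputes each metric; the variable names are illustrative and not taken from the study.

```python
# Counts as stated in the abstract; manual review is the gold standard.
total = 6481       # discharge records reviewed
positives = 3576   # records containing all follow-up criteria
false_neg = 113    # positive records missed by text mining
false_pos = 107    # records flagged by text mining but invalid on manual review

# Derive the remaining cells of the 2x2 confusion matrix.
true_pos = positives - false_neg
negatives = total - positives
true_neg = negatives - false_pos


def pct(fraction):
    """Express a fraction as a percentage rounded to one decimal place."""
    return round(100 * fraction, 1)


metrics = {
    "sensitivity": pct(true_pos / positives),
    "specificity": pct(true_neg / negatives),
    "ppv": pct(true_pos / (true_pos + false_pos)),
    "npv": pct(true_neg / (true_neg + false_neg)),
    "agreement": pct((true_pos + true_neg) / total),
}

for name, value in metrics.items():
    print(f"{name}: {value}%")
```

    Under these counts the recomputed values agree with the figures reported in the abstract (96.8, 96.3, 97.0, 96.1 and 96.6%).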

  15. 75 FR 17511 - Coal Mine Dust Sampling Devices

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-06

    ... Part III Department of Labor Mine Safety and Health Administration 30 CFR Parts 18, 74, and 75 Coal Mine Dust Sampling Devices; High-Voltage Continuous Mining Machine Standard for Underground Coal Mines...-AB61 Coal Mine Dust Sampling Devices AGENCY: Mine Safety and Health Administration, Labor. ACTION...

  16. Sampling and monitoring for the mine life cycle

    USGS Publications Warehouse

    McLemore, Virginia T.; Smith, Kathleen S.; Russell, Carol C.

    2014-01-01

    Sampling and Monitoring for the Mine Life Cycle provides an overview of sampling for environmental purposes and monitoring of environmentally relevant variables at mining sites. It focuses on environmental sampling and monitoring of surface water, and also considers groundwater, process water streams, rock, soil, and other media including air and biological organisms. The handbook includes an appendix of technical summaries written by subject-matter experts that describe field measurements, collection methods, and analytical techniques and procedures relevant to environmental sampling and monitoring. The sixth of a series of handbooks on technologies for management of metal mine and metallurgical process drainage, this handbook supplements and enhances current literature and provides an awareness of the critical components and complexities involved in environmental sampling and monitoring at the mine site. It differs from most information sources by providing an approach to address all types of mining influenced water and other sampling media throughout the mine life cycle. Sampling and Monitoring for the Mine Life Cycle is organized into a main text and six appendices that are an integral part of the handbook. Sidebars and illustrations are included to provide additional detail about important concepts, to present examples and brief case studies, and to suggest resources for further information. Extensive references are included.

  17. 77 FR 42760 - Proposed Information Collection; Request for Comments

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-20

    .... Data OMB Control Number: 1024-0064. Title: 36 CFR Part 9, Subpart A--Mining and Mining Claims, 36 CFR....gov . Please reference ``1024-0064, 36 CFR Part 9, Subpart A--Mining and Mining Claims, 36 CFR Part 9... regulates mineral development activities inside park boundaries pursuant to rights associated with mining...

  18. Adaptive semantic tag mining from heterogeneous clinical research texts.

    PubMed

    Hao, T; Weng, C

    2015-01-01

    To develop an adaptive approach to mine frequent semantic tags (FSTs) from heterogeneous clinical research texts. We develop a "plug-n-play" framework that integrates replaceable unsupervised kernel algorithms with formatting, functional, and utility wrappers for FST mining. Temporal information identification and semantic equivalence detection were two example functional wrappers. We first compared this approach's recall and efficiency for mining FSTs from ClinicalTrials.gov to that of a recently published tag-mining algorithm. Then we assessed this approach's adaptability to two other types of clinical research texts: clinical data requests and clinical trial protocols, by comparing the prevalence trends of FSTs across three texts. Our approach increased the average recall and speed by 12.8% and 47.02% respectively upon the baseline when mining FSTs from ClinicalTrials.gov, and maintained an overlap in relevant FSTs with the baseline ranging between 76.9% and 100% for varying FST frequency thresholds. The FSTs saturated when the data size reached 200 documents. Consistent trends in the prevalence of FST were observed across the three texts as the data size or frequency threshold changed. This paper contributes an adaptive tag-mining framework that is scalable and adaptable without sacrificing its recall. This component-based architectural design can be potentially generalizable to improve the adaptability of other clinical text mining methods.

  19. Deploying and sharing U-Compare workflows as web services.

    PubMed

    Kontonatsios, Georgios; Korkontzelos, Ioannis; Kolluru, Balakrishna; Thompson, Paul; Ananiadou, Sophia

    2013-02-18

    U-Compare is a text mining platform that allows the construction, evaluation and comparison of text mining workflows. U-Compare contains a large library of components that are tuned to the biomedical domain. Users can rapidly develop biomedical text mining workflows by mixing and matching U-Compare's components. Workflows developed using U-Compare can be exported and sent to other users who, in turn, can import and re-use them. However, the resulting workflows are standalone applications, i.e., software tools that run and are accessible only via a local machine, and that can only be run with the U-Compare platform. We address the above issues by extending U-Compare to convert standalone workflows into web services automatically, via a two-click process. The resulting web services can be registered on a central server and made publicly available. Alternatively, users can make web services available on their own servers, after installing the web application framework, which is part of the extension to U-Compare. We have performed a user-oriented evaluation of the proposed extension, by asking users who have tested the enhanced functionality of U-Compare to complete questionnaires that assess its functionality, reliability, usability, efficiency and maintainability. The results obtained reveal that the new functionality is well received by users. The web services produced by U-Compare are built on top of open standards, i.e., REST and SOAP protocols, and therefore, they are decoupled from the underlying platform. Exported workflows can be integrated with any application that supports these open standards. We demonstrate how the newly extended U-Compare enhances the cross-platform interoperability of workflows, by seamlessly importing a number of text mining workflow web services exported from U-Compare into Taverna, i.e., a generic scientific workflow construction platform.

  20. Deploying and sharing U-Compare workflows as web services

    PubMed Central

    2013-01-01

    Background U-Compare is a text mining platform that allows the construction, evaluation and comparison of text mining workflows. U-Compare contains a large library of components that are tuned to the biomedical domain. Users can rapidly develop biomedical text mining workflows by mixing and matching U-Compare’s components. Workflows developed using U-Compare can be exported and sent to other users who, in turn, can import and re-use them. However, the resulting workflows are standalone applications, i.e., software tools that run and are accessible only via a local machine, and that can only be run with the U-Compare platform. Results We address the above issues by extending U-Compare to convert standalone workflows into web services automatically, via a two-click process. The resulting web services can be registered on a central server and made publicly available. Alternatively, users can make web services available on their own servers, after installing the web application framework, which is part of the extension to U-Compare. We have performed a user-oriented evaluation of the proposed extension, by asking users who have tested the enhanced functionality of U-Compare to complete questionnaires that assess its functionality, reliability, usability, efficiency and maintainability. The results obtained reveal that the new functionality is well received by users. Conclusions The web services produced by U-Compare are built on top of open standards, i.e., REST and SOAP protocols, and therefore, they are decoupled from the underlying platform. Exported workflows can be integrated with any application that supports these open standards. We demonstrate how the newly extended U-Compare enhances the cross-platform interoperability of workflows, by seamlessly importing a number of text mining workflow web services exported from U-Compare into Taverna, i.e., a generic scientific workflow construction platform. PMID:23419017

  1. Biomedical text mining for research rigor and integrity: tasks, challenges, directions.

    PubMed

    Kilicoglu, Halil

    2017-06-13

    An estimated quarter of a trillion US dollars is invested in the biomedical research enterprise annually. There is growing alarm that a significant portion of this investment is wasted because of problems in reproducibility of research findings and in the rigor and integrity of research conduct and reporting. Recent years have seen a flurry of activities focusing on standardization and guideline development to enhance the reproducibility and rigor of biomedical research. Research activity is primarily communicated via textual artifacts, ranging from grant applications to journal publications. These artifacts can be both the source and the manifestation of practices leading to research waste. For example, an article may describe a poorly designed experiment, or the authors may reach conclusions not supported by the evidence presented. In this article, we pose the question of whether biomedical text mining techniques can assist the stakeholders in the biomedical research enterprise in doing their part toward enhancing research integrity and rigor. In particular, we identify four key areas in which text mining techniques can make a significant contribution: plagiarism/fraud detection, ensuring adherence to reporting guidelines, managing information overload and accurate citation/enhanced bibliometrics. We review the existing methods and tools for specific tasks, if they exist, or discuss relevant research that can provide guidance for future work. With the exponential increase in biomedical research output and the ability of text mining approaches to perform automatic tasks at large scale, we propose that such approaches can support tools that promote responsible research practices, providing significant benefits for the biomedical research enterprise. Published by Oxford University Press 2017. This work is written by a US Government employee and is in the public domain in the US.

  2. Text mining resources for the life sciences.

    PubMed

    Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable-those that have the crucial ability to share information, enabling smooth integration and reusability. © The Author(s) 2016. Published by Oxford University Press.

  3. Chapter 16: text mining for translational bioinformatics.

    PubMed

    Cohen, K Bretonnel; Hunter, Lawrence E

    2013-04-01

    Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall both into the category of T1 translational research-translating basic science results into new interventions-and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects, and pharmacogenomic research. A variety of methods for evaluating text mining applications exist, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

  4. Text mining resources for the life sciences

    PubMed Central

    Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231

  5. Redundancy in electronic health record corpora: analysis, impact on text mining performance and mitigation strategies.

    PubMed

    Cohen, Raphael; Elhadad, Michael; Elhadad, Noémie

    2013-01-16

    The increasing availability of Electronic Health Record (EHR) data and specifically free-text patient notes presents opportunities for phenotype extraction. Text-mining methods in particular can help disease modeling by mapping named-entities mentions to terminologies and clustering semantically related terms. EHR corpora, however, exhibit specific statistical and linguistic characteristics when compared with corpora in the biomedical literature domain. We focus on copy-and-paste redundancy: clinicians typically copy and paste information from previous notes when documenting a current patient encounter. Thus, within a longitudinal patient record, one expects to observe heavy redundancy. In this paper, we ask three research questions: (i) How can redundancy be quantified in large-scale text corpora? (ii) Conventional wisdom is that larger corpora yield better results in text mining. But how does the observed EHR redundancy affect text mining? Does such redundancy introduce a bias that distorts learned models? Or does the redundancy introduce benefits by highlighting stable and important subsets of the corpus? (iii) How can one mitigate the impact of redundancy on text mining? We analyze a large-scale EHR corpus and quantify redundancy both in terms of word and semantic concept repetition. We observe redundancy levels of about 30% and non-standard distribution of both words and concepts. We measure the impact of redundancy on two standard text-mining applications: collocation identification and topic modeling. We compare the results of these methods on synthetic data with controlled levels of redundancy and observe significant performance variation. Finally, we compare two mitigation strategies to avoid redundancy-induced bias: (i) a baseline strategy, keeping only the last note for each patient in the corpus; (ii) removing redundant notes with an efficient fingerprinting-based algorithm. 
    For text mining, preprocessing the EHR corpus with fingerprinting yields significantly better results. Before applying text-mining techniques, one must pay careful attention to the structure of the analyzed corpora. While the importance of data cleaning has been known for low-level text characteristics (e.g., encoding and spelling), high-level and difficult-to-quantify corpus characteristics, such as naturally occurring redundancy, can also hurt text mining. Fingerprinting enables text-mining techniques to leverage available data in the EHR corpus, while avoiding the bias introduced by redundancy.
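
    The abstract does not describe the fingerprinting algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a note's fingerprint is the set of hashed word n-grams it contains, and that a note is dropped when too large a share of its fingerprint has already been seen in notes kept earlier. The function names, the n-gram length and the overlap threshold are all hypothetical choices.

```python
import hashlib


def fingerprints(text, n=5):
    """Return the set of hashed word n-grams (shingles) for one note."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < n:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return {hashlib.sha1(s.encode("utf-8")).hexdigest()[:16] for s in shingles}


def drop_redundant(notes, max_overlap=0.3, n=5):
    """Keep a note only if at most max_overlap of its shingles were already seen.

    Notes are processed in the given order, so earlier notes take priority.
    """
    seen = set()
    kept = []
    for note in notes:
        fp = fingerprints(note, n)
        if not fp:
            continue  # skip empty notes
        overlap = len(fp & seen) / len(fp)
        if overlap <= max_overlap:
            kept.append(note)
            seen |= fp
    return kept


# Toy example: the second note copies the first and appends one sentence.
notes = [
    "patient reports chest pain since monday morning no fever",
    "patient reports chest pain since monday morning no fever follow up in two weeks",
    "new admission for elective knee replacement surgery tomorrow afternoon",
]
kept = drop_redundant(notes, max_overlap=0.3, n=3)
print(len(kept), "of", len(notes), "notes kept")
```

    In this toy run the copied-forward note is dropped because more than half of its 3-word shingles already occur in the first note, while the unrelated third note is kept.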

  6. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    PubMed

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
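
    The abstract does not specify the load balancing strategy used by paraBTM. As a purely illustrative sketch, one common low-cost approach is greedy assignment: sort documents by size, largest first, and repeatedly give the next document to the worker with the smallest load so far. The function name, the use of document size as a cost estimate, and the example data below are assumptions made for illustration.

```python
import heapq


def balance(doc_sizes, n_workers):
    """Greedily assign documents to workers, largest document first.

    doc_sizes maps a document id to an estimated processing cost
    (for example, its length). Returns {worker index: [document ids]}.
    """
    # Min-heap of (current load, worker index); the lightest worker pops first.
    heap = [(0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    for doc_id, size in sorted(doc_sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(doc_id)
        heapq.heappush(heap, (load + size, worker))
    return assignment


# Hypothetical document costs.
doc_sizes = {"a": 8, "b": 7, "c": 6, "d": 5, "e": 4}
assignment = balance(doc_sizes, n_workers=2)
loads = {w: sum(doc_sizes[d] for d in docs) for w, docs in assignment.items()}
print(assignment, loads)
```

    The design choice here is to trade optimality for simplicity: the greedy pass needs only one sort and a heap, which keeps the scheduling overhead small relative to the text mining work itself.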

  7. Uranium mining wastes, garden exhibition and health risks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schmidt, Gerhard; Schmidt, Peter; Hinz, Wilko

    2007-07-01

    Available in abstract form only. Full text of publication follows: For more than 40 years the Soviet-German stockholding company SDAG WISMUT mined and milled uranium in the east of Germany and, up to 1990, became the world's third largest uranium producer. After the reunification of Germany, the newly founded state-owned company Wismut GmbH was faced with the task of decommissioning and rehabilitating the mining and milling sites. One of the largest mining areas in the world that had to be cleaned up was located close to the municipality of Ronneburg near the City of Gera in Thuringia. After closing the operations of the Ronneburg underground mine and of the 160 m deep open pit mine with a free volume of 84 million m³, the open pit and 7 large piles of mine waste, together 112 million m³ of material, had to be cleaned up. As a result of an optimisation procedure it was chosen to relocate the waste rock piles back into the open pit. After this decision was taken and the plan approved, the disposal operation was started. Even though the transport task was done by large trucks, this took 16 years. The work will be finished in 2007, a cover consisting of 40 cm of uncontaminated material will be placed on top of the material, and the re-vegetation of the former open pit area will be established. When in 2002 the City of Gera applied to host the largest garden exhibition in Germany, Bundesgartenschau (BUGA), in 2007, Wismut GmbH supported this plan by offering parts of the territory of the former mining site as an exhibition ground. Finally, it was decided by the BUGA organizers to arrange its 2007 exhibition on grounds in Gera and in the valley adjacent to the former open pit mine, with parts of the remediated area within the fence of the exhibition. (authors)

  8. Text-mining and information-retrieval services for molecular biology

    PubMed Central

    Krallinger, Martin; Valencia, Alfonso

    2005-01-01

    Text-mining in molecular biology - defined as the automatic extraction of information about genes, proteins and their functional relationships from text documents - has emerged as a hybrid discipline on the edges of the fields of information science, bioinformatics and computational linguistics. A range of text-mining applications have been developed recently that will improve access to knowledge for biologists and database annotators. PMID:15998455

  9. Text mining for traditional Chinese medical knowledge discovery: a survey.

    PubMed

    Zhou, Xuezhong; Peng, Yonghong; Liu, Baoyan

    2010-08-01

    Extracting meaningful information and knowledge from free text is the subject of considerable research interest in the machine learning and data mining fields. Text data mining (or text mining) has become one of the most active research sub-fields in data mining. Significant developments in the area of biomedical text mining during the past years have demonstrated its great promise for supporting scientists in developing novel hypotheses and new knowledge from the biomedical literature. Traditional Chinese medicine (TCM) provides a distinct methodology with which to view human life. It is one of the most complete and distinguished traditional medicines with a history of several thousand years of studying and practicing the diagnosis and treatment of human disease. It has been shown that the TCM knowledge obtained from clinical practice has become a significant complementary source of information for modern biomedical sciences. TCM literature obtained from the historical period and from modern clinical studies has recently been transformed into digital data in the form of relational databases or text documents, which provide an effective platform for information sharing and retrieval. This motivates and facilitates research and development into knowledge discovery approaches and to modernize TCM. In order to contribute to this still growing field, this paper presents (1) a comparative introduction to TCM and modern biomedicine, (2) a survey of the related information sources of TCM, (3) a review and discussion of the state of the art and the development of text mining techniques with applications to TCM, (4) a discussion of the research issues around TCM text mining and its future directions. Copyright 2010 Elsevier Inc. All rights reserved.

  10. Managing biological networks by using text mining and computer-aided curation

    NASA Astrophysics Data System (ADS)

    Yu, Seok Jong; Cho, Yongseong; Lee, Min-Ho; Lim, Jongtae; Yoo, Jaesoo

    2015-11-01

    In order to understand a biological mechanism in a cell, a researcher should collect a huge number of protein interactions with experimental data from experiments and the literature. Text mining systems that extract biological interactions from papers have been used to construct biological networks for a few decades. Even though the text mining of literature is necessary to construct a biological network, few systems with a text mining tool are available for biologists who want to construct their own biological networks. We have developed a biological network construction system called BioKnowledge Viewer that can generate a biological interaction network by using a text mining tool and biological taggers. It also includes Boolean simulation software to provide a biological modeling system to simulate the model that is made with the text mining tool. A user can download PubMed articles and construct a biological network by using the Multi-level Knowledge Emergence Model (KMEM), MetaMap, and A Biomedical Named Entity Recognizer (ABNER) as a text mining tool. To evaluate the system, we constructed an aging-related biological network that consists of 9,415 nodes (genes) by using manual curation. With network analysis, we found that several genes, including JNK, AP-1, and BCL-2, were highly related in the aging biological network. We provide a semi-automatic curation environment so that users can obtain a graph database for managing text mining results that are generated in the server system and can navigate the network with BioKnowledge Viewer, which is freely available at http://bioknowledgeviewer.kisti.re.kr.

  11. An overview of the biocreative 2012 workshop track III: Interactive text mining task

    USDA-ARS's Scientific Manuscript database

    An important question is how to make use of text mining to enhance the biocuration workflow. A number of groups have developed tools for text mining from a computer science/linguistics perspective and there are many initiatives to curate some aspect of biology from the literature. In some cases the ...

  12. Mines, Quarries and Landscape. Visuality and Transformation

    NASA Astrophysics Data System (ADS)

    Jimeno, Carlos López; Torrijos, Ignacio Díez; González, Carmen Mataix

    2016-06-01

    This paper reviews two basic concepts, scenery and landscape integration, and proposes a new concept, "visuality", as an alternative to the classical "visibility" used in landscape studies related to mining activity; visuality explores the qualitative aspects that define the visual relationships between observer and environment. In relation to landscape integration studies, some reflections are made on substantive issues that induce certain prejudices when addressing the landscape integration of mining operations, and some guidance and integration strategies are formulated. The second part of the text raises a new approach to the landscape integration of mines and quarries, closely linked to the concept of visuality and based on one basic goal: the re-qualification of the place. It gives innovative answers for re-qualifying the place and shows how to seize the opportunity offered by the deep transformation that mining activities generate. As a conclusion, the last section presents a case study: the landscape integration study conducted on the Coto Pinos marble exploitations (Alicante, Spain), considered the largest ornamental rock quarry in Europe.

  13. Text Mining in Cancer Gene and Pathway Prioritization

    PubMed Central

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes. PMID:25392685

  14. Text mining in cancer gene and pathway prioritization.

    PubMed

    Luo, Yuan; Riedlinger, Gregory; Szolovits, Peter

    2014-01-01

    Prioritization of cancer implicated genes has received growing attention as an effective way to reduce wet lab cost by computational analysis that ranks candidate genes according to the likelihood that experimental verifications will succeed. A multitude of gene prioritization tools have been developed, each integrating different data sources covering gene sequences, differential expressions, function annotations, gene regulations, protein domains, protein interactions, and pathways. This review places existing gene prioritization tools against the backdrop of an integrative Omic hierarchy view toward cancer and focuses on the analysis of their text mining components. We explain the relatively slow progress of text mining in gene prioritization, identify several challenges to current text mining methods, and highlight a few directions where more effective text mining algorithms may improve the overall prioritization task and where prioritizing the pathways may be more desirable than prioritizing only genes.

  15. Citation Mining: Integrating Text Mining and Bibliometrics for Research User Profiling.

    ERIC Educational Resources Information Center

    Kostoff, Ronald N.; del Rio, J. Antonio; Humenik, James A.; Garcia, Esther Ofilia; Ramirez, Ana Maria

    2001-01-01

    Discusses the importance of identifying the users and impact of research, and describes an approach for identifying the pathways through which research can impact other research, technology development, and applications. Describes a study that used citation mining, an integration of citation bibliometrics and text mining, on articles from the…

  16. Using text-mining techniques in electronic patient records to identify ADRs from medicine use.

    PubMed

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-05-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. © 2011 The Authors. British Journal of Clinical Pharmacology © 2011 The British Pharmacological Society.

  17. Using text-mining techniques in electronic patient records to identify ADRs from medicine use

    PubMed Central

    Warrer, Pernille; Hansen, Ebba Holme; Juhl-Jensen, Lars; Aagaard, Lise

    2012-01-01

    This literature review included studies that use text-mining techniques in narrative documents stored in electronic patient records (EPRs) to investigate ADRs. We searched PubMed, Embase, Web of Science and International Pharmaceutical Abstracts without restrictions from origin until July 2011. We included empirically based studies on text mining of electronic patient records (EPRs) that focused on detecting ADRs, excluding those that investigated adverse events not related to medicine use. We extracted information on study populations, EPR data sources, frequencies and types of the identified ADRs, medicines associated with ADRs, text-mining algorithms used and their performance. Seven studies, all from the United States, were eligible for inclusion in the review. Studies were published from 2001, the majority between 2009 and 2010. Text-mining techniques varied over time from simple free text searching of outpatient visit notes and inpatient discharge summaries to more advanced techniques involving natural language processing (NLP) of inpatient discharge summaries. Performance appeared to increase with the use of NLP, although many ADRs were still missed. Due to differences in study design and populations, various types of ADRs were identified and thus we could not make comparisons across studies. The review underscores the feasibility and potential of text mining to investigate narrative documents in EPRs for ADRs. However, more empirical studies are needed to evaluate whether text mining of EPRs can be used systematically to collect new information about ADRs. PMID:22122057

  18. What Online Communities Can Tell Us About Electronic Cigarettes and Hookah Use: A Study Using Text Mining and Visualization Techniques.

    PubMed

    Chen, Annie T; Zhu, Shu-Hong; Conway, Mike

    2015-09-29

    The rise in popularity of electronic cigarettes (e-cigarettes) and hookah over recent years has been accompanied by some confusion and uncertainty regarding the development of an appropriate regulatory response towards these emerging products. Mining online discussion content can lead to insights into people's experiences, which can in turn further our knowledge of how to address potential health implications. In this work, we take a novel approach to understanding the use and appeal of these emerging products by applying text mining techniques to compare consumer experiences across discussion forums. This study examined content from the websites Vapor Talk, Hookah Forum, and Reddit to understand people's experiences with different tobacco products. Our investigation involves three parts. First, we identified contextual factors that inform our understanding of tobacco use behaviors, such as setting, time, social relationships, and sensory experience, and compared the forums to identify the ones where content on these factors is most common. Second, we compared how the tobacco use experience differs with combustible cigarettes and e-cigarettes. Third, we investigated differences between e-cigarette and hookah use. In the first part of our study, we employed a lexicon-based extraction approach to estimate prevalence of contextual factors, and then we generated a heat map based on these estimates to compare the forums. In the second and third parts of the study, we employed a text mining technique called topic modeling to identify important topics and then developed a visualization, Topic Bars, to compare topic coverage across forums. In the first part of the study, we identified two forums, Vapor Talk Health & Safety and the Stopsmoking subreddit, where discussion concerning contextual factors was particularly common. 
The second part showed that the discussion in Vapor Talk Health & Safety focused on symptoms and comparisons of combustible cigarettes and e-cigarettes, and the Stopsmoking subreddit focused on psychological aspects of quitting. Last, we examined the discussion content on Vapor Talk and Hookah Forum. Prominent topics included equipment, technique, experiential elements of use, and the buying and selling of equipment. This study has three main contributions. Discussion forums differ in the extent to which their content may help us understand behaviors with potential health implications. Identifying dimensions of interest and using a heat map visualization to compare across forums can be helpful for identifying forums with the greatest density of health information. Additionally, our work has shown that the quitting experience can potentially be very different depending on whether or not e-cigarettes are used. Finally, e-cigarette and hookah forums are similar in that members represent a "hobbyist culture" that actively engages in information exchange. These differences have important implications for both tobacco regulation and smoking cessation intervention design.

  19. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. Conclusions In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  20. 78 FR 39531 - Mine Rescue Teams

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-01

    ... Administration 30 CFR Part 49 Mine Rescue Teams; CFR Correction Federal Register / Vol. 78, No. 126... Health Administration 30 CFR Part 49 Mine Rescue Teams CFR Correction In Title 30 of the Code of Federal... Teams Type of mine rescue team Requirement Mine-site Composite Contract State-sponsored...

  1. Scholars Are Wary of Deal on Google's Book Search

    ERIC Educational Resources Information Center

    Howard, Jennifer

    2009-01-01

    Google's Book Search program mines the holdings of research libraries for texts to digitize. Some of that material is out of copyright; a lot of it isn't. A lawsuit came about when some authors and publishers decided that Google's project exceeded the bounds of fair use. As part of a settlement, the parties have proposed creating a Book Rights…

  2. Contextual Text Mining

    ERIC Educational Resources Information Center

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  3. Text mining applied to electronic cardiovascular procedure reports to identify patients with trileaflet aortic stenosis and coronary artery disease.

    PubMed

    Small, Aeron M; Kiss, Daniel H; Zlatsin, Yevgeny; Birtwell, David L; Williams, Heather; Guerraty, Marie A; Han, Yuchi; Anwaruddin, Saif; Holmes, John H; Chirinos, Julio A; Wilensky, Robert L; Giri, Jay; Rader, Daniel J

    2017-08-01

    Interrogation of the electronic health record (EHR) using billing codes as a surrogate for diagnoses of interest has been widely used for clinical research. However, the accuracy of this methodology is variable, as it reflects billing codes rather than severity of disease, and depends on the disease and the accuracy of the coding practitioner. Systematic application of text mining to the EHR has had variable success for the detection of cardiovascular phenotypes. We hypothesize that the application of text mining algorithms to cardiovascular procedure reports may be a superior method to identify patients with cardiovascular conditions of interest. We adapted the Oracle product Endeca, which utilizes text mining to identify terms of interest from a NoSQL-like database, for purposes of searching cardiovascular procedure reports and termed the tool "PennSeek". We imported 282,569 echocardiography reports representing 81,164 individuals and 27,205 cardiac catheterization reports representing 14,567 individuals from non-searchable databases into PennSeek. We then applied clinical criteria to these reports in PennSeek to identify patients with trileaflet aortic stenosis (TAS) and coronary artery disease (CAD). Accuracy of patient identification by text mining through PennSeek was compared with ICD-9 billing codes. Text mining identified 7115 patients with TAS and 9247 patients with CAD. ICD-9 codes identified 8272 patients with TAS and 6913 patients with CAD. 4346 patients with AS and 6024 patients with CAD were identified by both approaches. A randomly selected sample of 200-250 patients uniquely identified by text mining was compared with 200-250 patients uniquely identified by billing codes for both diseases. We demonstrate that text mining was superior, with a positive predictive value (PPV) of 0.95 compared to 0.53 by ICD-9 for TAS, and a PPV of 0.97 compared to 0.86 for CAD. 
These results highlight the superiority of text mining algorithms applied to electronic cardiovascular procedure reports in the identification of phenotypes of interest for cardiovascular research. Copyright © 2017. Published by Elsevier Inc.
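
    The counts reported in this abstract can be cross-checked with simple set arithmetic, and the PPV figures follow the usual definition TP / (TP + FP). The sketch below is illustrative only: the function names are hypothetical, and the review counts passed to `ppv` are made-up values chosen to land on 0.95, since the abstract reports only sample sizes of 200-250 per group.

```python
def unique_counts(n_text_mining, n_billing, n_both):
    """Patients found by only one method, given each method's total and the overlap."""
    return n_text_mining - n_both, n_billing - n_both


def ppv(true_positives, false_positives):
    """Positive predictive value: confirmed cases / all cases flagged."""
    return true_positives / (true_positives + false_positives)


# Totals and overlaps as stated in the abstract.
tas_only_tm, tas_only_icd = unique_counts(7115, 8272, 4346)
cad_only_tm, cad_only_icd = unique_counts(9247, 6913, 6024)
print(tas_only_tm, tas_only_icd)  # 2769 3926
print(cad_only_tm, cad_only_icd)  # 3223 889

# Hypothetical chart-review outcome (NOT from the abstract): 190 of 200 confirmed.
print(round(ppv(190, 10), 2))  # 0.95
```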

  4. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    NASA Astrophysics Data System (ADS)

    Znikina, Ludmila; Rozhneva, Elena

    2017-11-01

    The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  5. An Evaluation of Text Mining Tools as Applied to Selected Scientific and Engineering Literature.

    ERIC Educational Resources Information Center

    Trybula, Walter J.; Wyllys, Ronald E.

    2000-01-01

    Addresses an approach to the discovery of scientific knowledge through an examination of data mining and text mining techniques. Presents the results of experiments that investigated knowledge acquisition from a selected set of technical documents by domain experts. (Contains 15 references.) (Author/LRW)

  6. Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

    ERIC Educational Resources Information Center

    Chen, Hsinchun

    2003-01-01

    Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)

  7. Application of text mining in the biomedical domain.

    PubMed

    Fleuren, Wilco W M; Alkema, Wynand

    2015-03-01

    In recent years the amount of experimental data that is produced in biomedical research and the number of papers that are being published in this field have grown rapidly. In order to keep up to date with developments in their field of interest and to interpret the outcome of experiments in light of all available literature, researchers turn more and more to the use of automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and nowadays can be used to address a variety of research questions ranging from de novo drug target discovery to enhanced biological interpretation of the results from high throughput experiments. In this paper we introduce the most important techniques that are used for text mining and give an overview of the text mining tools that are currently being used and the types of problems to which they are typically applied. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Mining adverse drug reactions from online healthcare forums using hidden Markov model.

    PubMed

    Sampathkumar, Hariprasad; Chen, Xue-wen; Luo, Bo

    2014-10-23

    Adverse Drug Reactions are one of the leading causes of injury or death among patients undergoing medical treatments. Not all Adverse Drug Reactions are identified before a drug is made available in the market. Current post-marketing drug surveillance methods, which are based purely on voluntary spontaneous reports, are unable to provide the early indications necessary to prevent the occurrence of such injuries or fatalities. The objective of this research is to extract reports of adverse drug side-effects from messages in online healthcare forums and use them as early indicators to assist in post-marketing drug surveillance. We treat the task of extracting adverse side-effects of drugs from healthcare forum messages as a sequence labeling problem and present a Hidden Markov Model (HMM) based Text Mining system that can be used to classify a message as containing drug side-effect information and then extract the adverse side-effect mentions from it. A manually annotated dataset from http://www.medications.com is used in the training and validation of the HMM based Text Mining system. A 10-fold cross-validation on the manually annotated dataset yielded on average an F-Score of 0.76 from the HMM Classifier, in comparison to 0.575 from the Baseline classifier. Without the Plain Text Filter component as a part of the Text Processing module, the F-Score of the HMM Classifier was reduced to 0.378 on average, while absence of the HTML Filter component was found to have no impact. Reducing the Drug names dictionary size by half reduced the F-Score of the HMM Classifier to 0.359 on average, while a similar reduction to the side-effects dictionary yielded an F-Score of 0.651 on average. Adverse side-effects mined from http://www.medications.com and http://www.steadyhealth.com were found to match the Adverse Drug Reactions on the Drug Package Labels of several drugs. 
In addition, some novel adverse side-effects, which can be potential Adverse Drug Reactions, were also identified. The results from the HMM based Text Miner are encouraging to pursue further enhancements to this approach. The mined novel side-effects can act as early indicators for health authorities to help focus their efforts in post-marketing drug surveillance.
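
    Treating extraction as sequence labeling with an HMM means assigning each token a hidden state and decoding the most likely state sequence. The sketch below shows generic Viterbi decoding under stated assumptions: the three states, every probability, and the `unk` fallback for unseen words are toy values invented for illustration, not the model or parameters described in the abstract.

```python
import math


def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely state sequence for tokens (log-space Viterbi)."""
    # Each row maps state -> (best log-probability so far, previous state).
    rows = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s].get(tokens[0], unk)), None)
        for s in states
    }]
    for tok in tokens[1:]:
        row = {}
        for s in states:
            prev, score = max(
                ((p, rows[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            row[s] = (score + math.log(emit_p[s].get(tok, unk)), prev)
        rows.append(row)
    # Backtrack from the best final state.
    state = max(states, key=lambda s: rows[-1][s][0])
    path = [state]
    for row in reversed(rows[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]


# Toy model (all numbers are hypothetical).
states = ["O", "DRUG", "EFFECT"]
start_p = {"O": 0.8, "DRUG": 0.1, "EFFECT": 0.1}
trans_p = {
    "O": {"O": 0.7, "DRUG": 0.15, "EFFECT": 0.15},
    "DRUG": {"O": 0.8, "DRUG": 0.1, "EFFECT": 0.1},
    "EFFECT": {"O": 0.8, "DRUG": 0.1, "EFFECT": 0.1},
}
emit_p = {
    "O": {"gave": 0.3, "me": 0.3, "and": 0.3},
    "DRUG": {"aspirin": 0.9},
    "EFFECT": {"nausea": 0.5, "headache": 0.5},
}

labels = viterbi("aspirin gave me nausea".split(), states, start_p, trans_p, emit_p)
print(labels)  # ['DRUG', 'O', 'O', 'EFFECT']
```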

  9. A preliminary report on the geology of the volcanic-rock-hosted Barite Hill gold deposit, Carolina slate belt

    USGS Publications Warehouse

    Clark, Sandra H.; Greig, David D.; Bryan, Norman L.

    1992-01-01

    Text and copies of slides for a paper presented at the meeting of the Southeastern Section of the Geological Society of America in Winston-Salem, North Carolina, March 19, 1992. The Barite Hill mine is located in the southern part of the Carolina slate belt of South Carolina just north of the Georgia border.

  10. The Islamic State Battle Plan: Press Release Natural Language Processing

    DTIC Science & Technology

    2016-06-01

    Processing, text mining, corpus, generalized linear model, cascade, R Shiny, leaflet, data visualization 15. NUMBER OF PAGES 83 16. PRICE CODE...Terrorism and Responses to Terrorism TDM Term Document Matrix TF Term Frequency TF-IDF Term Frequency-Inverse Document Frequency tm text mining (R...package=leaflet. Feinerer I, Hornik K (2015) Text Mining Package “tm,” Version 0.6-2. (Jul 3) https://cran.r-project.org/web/packages/tm/tm.pdf

  11. OntoGene web services for biomedical text mining.

    PubMed

    Rinaldi, Fabio; Clematide, Simon; Marques, Hernani; Ellendorff, Tilia; Romacker, Martin; Rodriguez-Esteban, Raul

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them.

  12. Text mining patents for biomedical knowledge.

    PubMed

    Rodriguez-Esteban, Raul; Bundschus, Markus

    2016-06-01

    Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. OntoGene web services for biomedical text mining

    PubMed Central

    2014-01-01

    Text mining services are rapidly becoming a crucial component of various knowledge management pipelines, for example in the process of database curation, or for exploration and enrichment of biomedical data within the pharmaceutical industry. Traditional architectures, based on monolithic applications, do not offer sufficient flexibility for a wide range of use case scenarios, and therefore open architectures, as provided by web services, are attracting increased interest. We present an approach towards providing advanced text mining capabilities through web services, using a recently proposed standard for textual data interchange (BioC). The web services leverage a state-of-the-art platform for text mining (OntoGene) which has been tested in several community-organized evaluation challenges, with top ranked results in several of them. PMID:25472638

  14. BioC: a minimalist approach to interoperability for biomedical text processing

    PubMed Central

    Comeau, Donald C.; Islamaj Doğan, Rezarta; Ciccarese, Paolo; Cohen, Kevin Bretonnel; Krallinger, Martin; Leitner, Florian; Lu, Zhiyong; Peng, Yifan; Rinaldi, Fabio; Torii, Manabu; Valencia, Alfonso; Verspoor, Karin; Wiegers, Thomas C.; Wu, Cathy H.; Wilbur, W. John

    2013-01-01

    A vast amount of scientific information is encoded in natural language text, and the quantity of such text has become so great that it is no longer economically feasible to have a human as the first step in the search process. Natural language processing and text mining tools have become essential to facilitate the search for and extraction of information from text. This has led to vigorous research efforts to create useful tools and to create humanly labeled text corpora, which can be used to improve such tools. To encourage combining these efforts into larger, more powerful and more capable systems, a common interchange format to represent, store and exchange the data in a simple manner between different language processing systems and text mining tools is highly desirable. Here we propose a simple extensible mark-up language format to share text documents and annotations. The proposed annotation approach allows a large number of different annotations to be represented including sentences, tokens, parts of speech, named entities such as genes or diseases and relationships between named entities. In addition, we provide simple code to hold this data, read it from and write it back to extensible mark-up language files and perform some sample processing. We also describe completed as well as ongoing work to apply the approach in several directions. Code and data are available at http://bioc.sourceforge.net/. Database URL: http://bioc.sourceforge.net/ PMID:24048470

  15. What Online Communities Can Tell Us About Electronic Cigarettes and Hookah Use: A Study Using Text Mining and Visualization Techniques

    PubMed Central

    Zhu, Shu-Hong; Conway, Mike

    2015-01-01

    Background The rise in popularity of electronic cigarettes (e-cigarettes) and hookah over recent years has been accompanied by some confusion and uncertainty regarding the development of an appropriate regulatory response towards these emerging products. Mining online discussion content can lead to insights into people’s experiences, which can in turn further our knowledge of how to address potential health implications. In this work, we take a novel approach to understanding the use and appeal of these emerging products by applying text mining techniques to compare consumer experiences across discussion forums. Objective This study examined content from the websites Vapor Talk, Hookah Forum, and Reddit to understand people’s experiences with different tobacco products. Our investigation involves three parts. First, we identified contextual factors that inform our understanding of tobacco use behaviors, such as setting, time, social relationships, and sensory experience, and compared the forums to identify the ones where content on these factors is most common. Second, we compared how the tobacco use experience differs with combustible cigarettes and e-cigarettes. Third, we investigated differences between e-cigarette and hookah use. Methods In the first part of our study, we employed a lexicon-based extraction approach to estimate prevalence of contextual factors, and then we generated a heat map based on these estimates to compare the forums. In the second and third parts of the study, we employed a text mining technique called topic modeling to identify important topics and then developed a visualization, Topic Bars, to compare topic coverage across forums. Results In the first part of the study, we identified two forums, Vapor Talk Health & Safety and the Stopsmoking subreddit, where discussion concerning contextual factors was particularly common. 
The second part showed that the discussion in Vapor Talk Health & Safety focused on symptoms and comparisons of combustible cigarettes and e-cigarettes, and the Stopsmoking subreddit focused on psychological aspects of quitting. Last, we examined the discussion content on Vapor Talk and Hookah Forum. Prominent topics included equipment, technique, experiential elements of use, and the buying and selling of equipment. Conclusions This study has three main contributions. Discussion forums differ in the extent to which their content may help us understand behaviors with potential health implications. Identifying dimensions of interest and using a heat map visualization to compare across forums can be helpful for identifying forums with the greatest density of health information. Additionally, our work has shown that the quitting experience can potentially be very different depending on whether or not e-cigarettes are used. Finally, e-cigarette and hookah forums are similar in that members represent a “hobbyist culture” that actively engages in information exchange. These differences have important implications for both tobacco regulation and smoking cessation intervention design. PMID:26420469

  16. 30 CFR 33.38 - Electrical parts.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL MINES Test Requirements... Part 18 of Subchapter D of this chapter (Bureau of Mines Schedule 2, revised, the current revision of..., the current revision of which is Schedule 2F). (c) Units with electrical parts and designed for...

  17. 30 CFR 33.38 - Electrical parts.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL MINES Test Requirements... Part 18 of Subchapter D of this chapter (Bureau of Mines Schedule 2, revised, the current revision of..., the current revision of which is Schedule 2F). (c) Units with electrical parts and designed for...

  18. Text mining approach to predict hospital admissions using early medical records from the emergency department.

    PubMed

    Lucini, Filipe R; S Fogliatto, Flavio; C da Silveira, Giovani J; L Neyeloff, Jeruza; Anzanello, Michel J; de S Kuchenbecker, Ricardo; D Schaan, Beatriz

    2017-04-01

    Emergency department (ED) overcrowding is a serious issue for hospitals. Early information on short-term inward bed demand from patients receiving care at the ED may reduce the overcrowding problem, and optimize the use of hospital resources. In this study, we use text mining methods to process data from early ED patient records using the SOAP framework, and predict future hospitalizations and discharges. We try different approaches for pre-processing of text records and to predict hospitalization. Sets-of-words are obtained via binary representation, term frequency, and term frequency-inverse document frequency. Unigrams, bigrams and trigrams are tested for feature formation. Feature selection is based on χ² and F-score metrics. In the prediction module, eight text mining methods are tested: Decision Tree, Random Forest, Extremely Randomized Tree, AdaBoost, Logistic Regression, Multinomial Naïve Bayes, Support Vector Machine (Kernel linear) and Nu-Support Vector Machine (Kernel linear). Prediction performance is evaluated by F1-scores. Precision and Recall values are also reported for all text mining methods tested. Nu-Support Vector Machine was the text mining method with the best overall performance. Its average F1-score in predicting hospitalization was 77.70%, with a standard deviation (SD) of 0.66%. The method could be used to manage daily routines in EDs such as capacity planning and resource allocation. Text mining could provide valuable information and facilitate decision-making by inward bed management teams. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
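
    The three term representations named in this abstract (binary, term frequency, and TF-IDF) over n-gram features can be illustrated with a few lines of standard-library Python. This is a minimal sketch under assumptions: whitespace tokenization, the plain idf = log(N / df) weighting, and the function name `featurize` are choices made here for illustration; the abstract does not state which tokenizer or TF-IDF variant the authors used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Contiguous n-grams of a token list, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def featurize(docs, max_n=1):
    """Return (binary, tf, tfidf): one dict per document for each scheme."""
    term_lists = []
    for doc in docs:
        tokens = doc.lower().split()
        terms = []
        for n in range(1, max_n + 1):
            terms.extend(ngrams(tokens, n))
        term_lists.append(terms)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    binary, tf, tfidf = [], [], []
    for terms in term_lists:
        counts = Counter(terms)
        binary.append({t: 1 for t in counts})
        tf.append(dict(counts))
        tfidf.append({t: c * math.log(n_docs / df[t]) for t, c in counts.items()})
    return binary, tf, tfidf


# Made-up example notes, unigrams plus bigrams.
binary, tf, tfidf = featurize(["chest pain chest", "abdominal pain"], max_n=2)
print(tf[0]["chest"], binary[0]["chest"])  # 2 1
print(tfidf[0]["pain"])                    # 0.0 (term occurs in every document)
```

    A term that appears in every document gets zero TF-IDF weight under this variant, which is why such weighting tends to favor terms that discriminate between records.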

  19. Text and Structural Data Mining of Influenza Mentions in Web and Social Media

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Corley, Courtney D.; Cook, Diane; Mikler, Armin R.

    Text and structural data mining of Web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC) to assure wide dissemination of pertinent information. WSM that mention influenza are harvested over a 24-week period, 5-October-2008 to 21-March-2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate to real-world influenza-like-illness patient report data. We also bring to bear a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user-tags.

  20. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    PubMed

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (e.g., symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this equaled the sensitivity of the text classifier (83.1%) and exceeded that of the online tool (40.7%). Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
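
    The metrics mentioned in this abstract follow from simple counts of true and false positives and negatives. The sketch below shows the standard definitions only; the function names are hypothetical and the counts in the usage note are invented, not taken from the VaeTM evaluation.

```python
def extraction_metrics(tp, fp, fn):
    """Precision, recall and balanced F-measure for an extraction task."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def classification_metrics(tp, fp, tn, fn):
    """Sensitivity and specificity for a binary case classifier."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # same formula as recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity
```

    With invented counts, `classification_metrics(tp=8, fp=5, tn=15, fn=2)` gives a sensitivity of 0.8 and a specificity of 0.75. Sensitivity and recall are the same quantity; specificity additionally requires the true negatives, which extraction tasks usually cannot count.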

  1. DISEASES: text mining and data integration of disease-gene associations.

    PubMed

    Pletscher-Frankild, Sune; Pallejà, Albert; Tsafou, Kalliopi; Binder, Janos X; Jensen, Lars Juhl

    2015-03-01

    Text mining is a flexible technology that can be applied to numerous different tasks in biology and medicine. We present a system for extracting disease-gene associations from biomedical abstracts. The system consists of a highly efficient dictionary-based tagger for named entity recognition of human genes and diseases, which we combine with a scoring scheme that takes into account co-occurrences both within and between sentences. We show that this approach is able to extract half of all manually curated associations with a false positive rate of only 0.16%. Nonetheless, text mining should not stand alone, but be combined with other types of evidence. For this reason, we have developed the DISEASES resource, which integrates the results from text mining with manually curated disease-gene associations, cancer mutation data, and genome-wide association studies from existing databases. The DISEASES resource is accessible through a web interface at http://diseases.jensenlab.org/, where the text-mining software and all associations are also freely available for download. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
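
    The combination of dictionary-based tagging with co-occurrence scoring within and between sentences can be sketched as follows. This is a toy illustration, not the DISEASES scoring scheme: the dictionaries, the two weights, and the function names are all assumptions, and a real tagger would handle synonyms, orthographic variants and normalization.

```python
import re
from collections import defaultdict
from itertools import product

# Hypothetical toy dictionaries and weights (not from the DISEASES resource).
GENES = {"brca1", "tp53"}
DISEASES = {"breast cancer", "glioma"}
W_SENTENCE = 1.0   # gene and disease appear in the same sentence
W_DOCUMENT = 0.25  # same abstract, but only in different sentences


def tag(text, dictionary):
    """Return the dictionary names that occur as whole words in text."""
    lowered = text.lower()
    return {name for name in dictionary
            if re.search(r"\b" + re.escape(name) + r"\b", lowered)}


def score_cooccurrences(abstracts):
    """Sum a weighted co-occurrence score per (gene, disease) pair."""
    scores = defaultdict(float)
    for abstract in abstracts:
        # Naive sentence split on terminal punctuation followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", abstract) if s]
        tagged = [(tag(s, GENES), tag(s, DISEASES)) for s in sentences]
        doc_genes = set().union(*(g for g, _ in tagged))
        doc_diseases = set().union(*(d for _, d in tagged))
        for gene, disease in product(doc_genes, doc_diseases):
            same_sentence = any(gene in g and disease in d for g, d in tagged)
            scores[(gene, disease)] += W_SENTENCE if same_sentence else W_DOCUMENT
    return dict(scores)
```

    Giving same-sentence pairs a higher weight than pairs that merely share an abstract reflects the idea in the abstract that both kinds of co-occurrence carry evidence, with the closer one counting for more; the specific values 1.0 and 0.25 are arbitrary.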

  2. Mechanism of Rock Burst Occurrence in Specially Thick Coal Seam with Rock Parting

    NASA Astrophysics Data System (ADS)

    Wang, Jian-chao; Jiang, Fu-xing; Meng, Xiang-jun; Wang, Xu-you; Zhu, Si-tao; Feng, Yu

    2016-05-01

    A specially thick coal seam with complex structure, such as rock parting and alternating soft and hard coal, is called a specially thick coal seam with rock parting (STCSRP); such seams easily lead to rock burst during mining. Based on the stress distribution of the rock parting zone, this study investigated the mechanism, engineering discriminant conditions, prevention methods, and risk evaluation method of rock burst occurrence in STCSRP by setting up a mechanical model. The main conclusions of this study are as follows. (1) When the mining face moves closer to the rock parting zone, the original non-uniform stress of the rock parting zone and the advancing stress of the mining face combine to gradually intensify the shearing action of coal near the mining face. When the shearing action reaches a certain degree, rock burst easily occurs near the mining face. (2) Rock burst occurrence in STCSRP is positively associated with mining depth, advancing stress concentration factor of the mining face, thickness of rock parting, bursting liability of coal, thickness ratio of rock parting to coal seam, and difference of elastic modulus between rock parting and coal, and negatively associated with shear strength. (3) Technologies of large-diameter drilling, coal seam water injection, and deep hole blasting can reduce advancing stress concentration factor, thickness of rock parting, and difference of elastic modulus between rock parting and coal to lower the risk of rock burst in STCSRP. (4) The research result was applied to evaluate and control the risk of rock burst occurrence in STCSRP.

  3. Text Mining for Precision Medicine: Bringing structure to EHRs and biomedical literature to understand genes and health

    PubMed Central

    Simmons, Michael; Singhal, Ayush; Lu, Zhiyong

    2018-01-01

    The key question of precision medicine is whether it is possible to find clinically actionable granularity in diagnosing disease and classifying patient risk. The advent of next generation sequencing and the widespread adoption of electronic health records (EHRs) have provided clinicians and researchers a wealth of data and made possible the precise characterization of individual patient genotypes and phenotypes. Unstructured text — found in biomedical publications and clinical notes — is an important component of genotype and phenotype knowledge. Publications in the biomedical literature provide essential information for interpreting genetic data. Likewise, clinical notes contain the richest source of phenotype information in EHRs. Text mining can render these texts computationally accessible and support information extraction and hypothesis generation. This chapter reviews the mechanics of text mining in precision medicine and discusses several specific use cases, including database curation for personalized cancer medicine, patient outcome prediction from EHR-derived cohorts, and pharmacogenomic research. Taken as a whole, these use cases demonstrate how text mining enables effective utilization of existing knowledge sources and thus promotes increased value for patients and healthcare systems. Text mining is an indispensable tool for translating genotype-phenotype data into effective clinical care that will undoubtedly play an important role in the eventual realization of precision medicine. PMID:27807747

  4. Text Mining for Precision Medicine: Bringing Structure to EHRs and Biomedical Literature to Understand Genes and Health.

    PubMed

    Simmons, Michael; Singhal, Ayush; Lu, Zhiyong

    2016-01-01

    The key question of precision medicine is whether it is possible to find clinically actionable granularity in diagnosing disease and classifying patient risk. The advent of next-generation sequencing and the widespread adoption of electronic health records (EHRs) have provided clinicians and researchers a wealth of data and made possible the precise characterization of individual patient genotypes and phenotypes. Unstructured text, found in biomedical publications and clinical notes, is an important component of genotype and phenotype knowledge. Publications in the biomedical literature provide essential information for interpreting genetic data. Likewise, clinical notes contain the richest source of phenotype information in EHRs. Text mining can render these texts computationally accessible and support information extraction and hypothesis generation. This chapter reviews the mechanics of text mining in precision medicine and discusses several specific use cases, including database curation for personalized cancer medicine, patient outcome prediction from EHR-derived cohorts, and pharmacogenomic research. Taken as a whole, these use cases demonstrate how text mining enables effective utilization of existing knowledge sources and thus promotes increased value for patients and healthcare systems. Text mining is an indispensable tool for translating genotype-phenotype data into effective clinical care that will undoubtedly play an important role in the eventual realization of precision medicine.

  5. SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data.

    PubMed

    Venkatesan, Aravind; Kim, Jee-Hyub; Talo, Francesco; Ide-Smith, Michele; Gobeill, Julien; Carter, Jacob; Batista-Navarro, Riza; Ananiadou, Sophia; Ruch, Patrick; McEntyre, Johanna

    2016-01-01

    The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. Particularly, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data.

  6. Detection of interaction articles and experimental methods in biomedical literature.

    PubMed

    Schneider, Gerold; Clematide, Simon; Rinaldi, Fabio

    2011-10-03

    This article describes the approaches taken by the OntoGene group at the University of Zurich in dealing with two tasks of the BioCreative III competition: classification of articles which contain curatable protein-protein interactions (PPI-ACT) and extraction of experimental methods (PPI-IMT). Two main achievements are described in this paper: (a) a system for document classification which crucially relies on the results of an advanced pipeline of natural language processing tools; (b) a system which is capable of detecting all experimental methods mentioned in scientific literature, and listing them with a competitive ranking (AUC iP/R > 0.5). The results of the BioCreative III shared evaluation clearly demonstrate that significant progress has been achieved in the domain of biomedical text mining in the past few years. Our own contribution, together with the results of other participants, provides evidence that natural language processing techniques have become by now an integral part of advanced text mining approaches.

  7. SciLite: a platform for displaying text-mined annotations as a means to link research articles with biological data

    PubMed Central

    Talo, Francesco; Ide-Smith, Michele; Gobeill, Julien; Carter, Jacob; Batista-Navarro, Riza; Ananiadou, Sophia; Ruch, Patrick; McEntyre, Johanna

    2017-01-01

    The tremendous growth in biological data has resulted in an increase in the number of research papers being published. This presents a great challenge for scientists in searching and assimilating facts described in those papers. Particularly, biological databases depend on curators to add highly precise and useful information that is usually extracted by reading research articles. Therefore, there is an urgent need to find ways to improve linking literature to the underlying data, thereby minimising the effort in browsing content and identifying key biological concepts. As part of the development of Europe PMC, we have developed a new platform, SciLite, which integrates text-mined annotations from different sources and overlays those outputs on research articles. The aim is to aid researchers and curators using Europe PMC in finding key concepts more easily and provide links to related resources or tools, bridging the gap between literature and biological data. PMID:28948232

  8. Using Text Mining to Uncover Students' Technology-Related Problems in Live Video Streaming

    ERIC Educational Resources Information Center

    Abdous, M'hammed; He, Wu

    2011-01-01

    Because of their capacity to sift through large amounts of data, text mining and data mining are enabling higher education institutions to reveal valuable patterns in students' learning behaviours without having to resort to traditional survey methods. In an effort to uncover live video streaming (LVS) students' technology-related problems and to…

  9. The Feasibility of Using Large-Scale Text Mining to Detect Adverse Childhood Experiences in a VA-Treated Population.

    PubMed

    Hammond, Kenric W; Ben-Ari, Alon Y; Laundry, Ryan J; Boyko, Edward J; Samore, Matthew H

    2015-12-01

    Free text in electronic health records resists large-scale analysis. Text records facts of interest not found in encoded data, and text mining enables their retrieval and quantification. The U.S. Department of Veterans Affairs (VA) clinical data repository affords an opportunity to apply text-mining methodology to study clinical questions in large populations. To assess the feasibility of text mining, investigation of the relationship between exposure to adverse childhood experiences (ACEs) and recorded diagnoses was conducted among all VA-treated Gulf War veterans, utilizing all progress notes recorded from 2000 to 2011. Text processing extracted ACE exposures recorded among 44.7 million clinical notes belonging to 243,973 veterans. The relationship of ACE exposure to adult illnesses was analyzed using logistic regression. Bias considerations were assessed. ACE score was strongly associated with suicide attempts and serious mental disorders (ORs = 1.84 to 1.97), and less so with behaviorally mediated and somatic conditions (ORs = 1.02 to 1.36) per unit. Bias adjustments did not remove persistent associations between ACE score and most illnesses. Text mining to detect ACE exposure in a large population was feasible. Analysis of the relationship between ACE score and adult health conditions yielded patterns of association consistent with prior research. Copyright © 2015 International Society for Traumatic Stress Studies.

  10. Alaska Resource Data File: Chignik quadrangle, Alaska

    USGS Publications Warehouse

    Pilcher, Steven H.

    2000-01-01

    Descriptions of the mineral occurrences can be found in the report. See U.S. Geological Survey (1996) for a description of the information content of each field in the records. The data presented here are maintained as part of a statewide database on mines, prospects and mineral occurrences throughout Alaska. There is a website from which you can obtain the data for this report in text and FileMaker Pro formats.

  11. Data mining of text as a tool in authorship attribution

    NASA Astrophysics Data System (ADS)

    Visa, Ari J. E.; Toivonen, Jarmo; Autio, Sami; Maekinen, Jarno; Back, Barbro; Vanharanta, Hannu

    2001-03-01

    Text documents are commonly characterized and classified by keywords assigned by their authors. Visa et al. have developed a new methodology based on prototype matching. The prototype is an interesting document or a part of an extracted, interesting text. This prototype is matched with the document database of the monitored document flow. The new methodology is capable of extracting the meaning of the document to a certain degree. Our claim is that the new methodology is also capable of authenticating the authorship. To verify this claim two tests were designed. The test hypothesis was that the words and the word order in the sentences could authenticate the author. In the first test three authors were selected: William Shakespeare, Edgar Allan Poe, and George Bernard Shaw. Three texts from each author were examined. Each text was used in turn as a prototype, and the two nearest matches with the prototype were noted. The second test uses the Reuters-21578 financial news database. A group of 25 short financial news reports from five different authors is examined. Our new methodology and the interesting results from the two tests are reported in this paper. In the first test, for Shakespeare and for Poe all cases were successful. For Shaw one text was confused with Poe. In the second test the Reuters-21578 financial news reports were identified by author relatively well. The conclusion is that our text mining methodology seems to be capable of authorship attribution.

  12. Text mining and its potential applications in systems biology.

    PubMed

    Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi

    2006-12-01

    With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.

  13. VisualUrText: A Text Analytics Tool for Unstructured Textual Data

    NASA Astrophysics Data System (ADS)

    Zainol, Zuraini; Jaymes, Mohd T. H.; Nohuddin, Puteri N. E.

    2018-05-01

    The amount of unstructured text on the Internet is growing tremendously. Text repositories come from Web 2.0, business intelligence and social networking applications. It is also believed that 80-90% of future data growth will be in the form of unstructured text databases that may potentially contain interesting patterns and trends. Text mining is a well-known technique for discovering interesting patterns and trends, which are non-trivial knowledge, from massive unstructured text data. Text mining covers multidisciplinary fields involving information retrieval (IR), text analysis, natural language processing (NLP), data mining, machine learning, statistics and computational linguistics. This paper discusses the development of a text analytics tool that is proficient in extracting, processing and analyzing unstructured text data and in visualizing cleaned text data in multiple forms such as Document Term Matrix (DTM), Frequency Graph, Network Analysis Graph, Word Cloud and Dendrogram. This tool, VisualUrText, is developed to assist students and researchers in extracting interesting patterns and trends in document analyses.

  14. 30 CFR 90.101 - Respirable dust standard when quartz is present.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... quartz is present. When the respirable dust in the mine atmosphere of the active workings to which a Part... average concentration of respirable dust in the mine atmosphere during each shift to which a Part 90 miner...%. Therefore, the average concentration of respirable dust in the mine atmosphere associated with that Part 90...

  15. 30 CFR 90.101 - Respirable dust standard when quartz is present.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... quartz is present. When the respirable dust in the mine atmosphere of the active workings to which a Part... average concentration of respirable dust in the mine atmosphere during each shift to which a Part 90 miner...%. Therefore, the average concentration of respirable dust in the mine atmosphere associated with that Part 90...

  16. 30 CFR 90.101 - Respirable dust standard when quartz is present.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... quartz is present. When the respirable dust in the mine atmosphere of the active workings to which a Part... average concentration of respirable dust in the mine atmosphere during each shift to which a Part 90 miner...%. Therefore, the average concentration of respirable dust in the mine atmosphere associated with that Part 90...

  17. 30 CFR 90.101 - Respirable dust standard when quartz is present.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... quartz is present. When the respirable dust in the mine atmosphere of the active workings to which a Part... average concentration of respirable dust in the mine atmosphere during each shift to which a Part 90 miner...%. Therefore, the average concentration of respirable dust in the mine atmosphere associated with that Part 90...

  18. 30 CFR 90.101 - Respirable dust standard when quartz is present.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... quartz is present. When the respirable dust in the mine atmosphere of the active workings to which a Part... average concentration of respirable dust in the mine atmosphere during each shift to which a Part 90 miner...%. Therefore, the average concentration of respirable dust in the mine atmosphere associated with that Part 90...

  19. BioCreative Workshops for DOE Genome Sciences: Text Mining for Metagenomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Cathy H.; Hirschman, Lynette

    The objective of this project was to host BioCreative workshops to define and develop text mining tasks to meet the needs of the Genome Sciences community, focusing on metadata information extraction in metagenomics. Following the successful introduction of metagenomics at the BioCreative IV workshop, members of the metagenomics community and BioCreative communities continued discussion to identify candidate topics for a BioCreative metagenomics track for BioCreative V. Of particular interest was the capture of environmental and isolation source information from text. The outcome was to form a “community of interest” around work on the interactive EXTRACT system, which supported interactive tagging of environmental and species data. This experiment is included in the BioCreative V virtual issue of Database. In addition, there was broad participation by members of the metagenomics community in the panels held at BioCreative V, leading to valuable exchanges between the text mining developers and members of the metagenomics research community. These exchanges are reflected in a number of the overview and perspective pieces also being captured in the BioCreative V virtual issue. Overall, this conversation has exposed the metagenomics researchers to the possibilities of text mining, and educated the text mining developers to the specific needs of the metagenomics community.

  20. Benchmarking infrastructure for mutation text mining

    PubMed Central

    2014-01-01

    Background: Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results: We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion: We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  1. Benchmarking infrastructure for mutation text mining.

    PubMed

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.

  2. DrugQuest - a text mining workflow for drug association discovery.

    PubMed

    Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Vizirianakis, Ioannis S; Iliopoulos, Ioannis

    2016-06-06

    Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of bio-medical literature and information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases. Herein, we apply a text mining approach to the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques to these fields to identify chemicals, proteins, genes, pathways and diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a plethora of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible scenarios why these records are clustered together. Different views, such as clustered chemicals based on their textual information and tag clouds consisting of Significant Terms along with the terms that were used for clustering, are delivered to the user through a user-friendly web interface. DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest .
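
    Grouping records by their common terms can be illustrated with a small sketch. The abstract does not specify which similarity measures or clustering algorithms DrugQuest uses, so this example swaps in Jaccard similarity with single-link grouping (records joined whenever a pair meets a threshold); the record names, term sets, function names and threshold are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two term collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_records(records, threshold=0.5):
    """Single-link grouping: link two records when Jaccard >= threshold.

    `records` maps a record name to its set of extracted terms.
    """
    names = list(records)
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(records[a], records[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

    Single-link grouping is only one possible choice; a partitional method such as k-means over term vectors would need a fixed number of clusters up front, whereas the threshold here lets the number of groups emerge from the data.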

  3. Evaluating a Bilingual Text-Mining System with a Taxonomy of Key Words and Hierarchical Visualization for Understanding Learner-Generated Text

    ERIC Educational Resources Information Center

    Kong, Siu Cheung; Li, Ping; Song, Yanjie

    2018-01-01

    This study evaluated a bilingual text-mining system, which incorporated a bilingual taxonomy of key words and provided hierarchical visualization, for understanding learner-generated text in the learning management systems through automatic identification and counting of matching key words. A class of 27 in-service teachers studied a course…

  4. Beyond accuracy: creating interoperable and scalable text-mining web services.

    PubMed

    Wei, Chih-Hsuan; Leaman, Robert; Lu, Zhiyong

    2016-06-15

    The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization. Unlike most text-mining software tools, our web services integrate several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem and tmVar) and offer a batch-processing mode able to process arbitrary text input (e.g. scholarly publications, patents and medical records) in multiple formats (e.g. BioC). We support multiple standards to make our service interoperable and allow simpler integration with other text-processing pipelines. To maximize scalability, we have preprocessed all PubMed articles, and use a computer cluster for processing large requests of arbitrary text. Our text-mining web service is freely available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#curl. Contact: Zhiyong.Lu@nih.gov. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.

  5. Geologic map of the Kechumstuk fault zone in the Mount Veta area, Fortymile mining district, east-central Alaska

    USGS Publications Warehouse

    Day, Warren C.; O’Neill, J. Michael; Dusel-Bacon, Cynthia; Aleinikoff, John N.; Siron, Christopher R.

    2014-01-01

    This map was developed by the U.S. Geological Survey Mineral Resources Program to depict the fundamental geologic features for the western part of the Fortymile mining district of east-central Alaska, and to delineate the location of known bedrock mineral prospects and their relationship to rock types and structural features. This geospatial map database presents a 1:63,360-scale geologic map for the Kechumstuk fault zone and surrounding area, which lies 55 km northwest of Chicken, Alaska. The Kechumstuk fault zone is a northeast-trending zone of faults that transects the crystalline basement rocks of the Yukon-Tanana Upland of the western part of the Fortymile mining district. The crystalline basement rocks include Paleozoic metasedimentary and metaigneous rocks as well as granitoid intrusions of Triassic, Jurassic, and Cretaceous age. The geologic units represented by polygons in this dataset are based on new geologic mapping and geochronological data coupled with an interpretation of regional and new geophysical data collected by the Alaska Department of Natural Resources, Division of Geological and Geophysical Surveys. The geochronological data are reported in the accompanying geologic map text and represent new U-Pb dates on zircons collected from the igneous and metaigneous units within the map area.

  6. 75 FR 51291 - National Science Board: Sunshine Act Meetings; Notice

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-19

    ...-Gathering Activities. COV Report Text-Mining. Design of Research Questions for External Input. SBE/CISE Text-Mining Projects. Using a Blog for Informal Input. Committee on Education and...

  7. Imitating manual curation of text-mined facts in biomedicine.

    PubMed

    Rodriguez-Esteban, Raul; Iossifov, Ivan; Rzhetsky, Andrey

    2006-09-08

    Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts--to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
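
    The "ROC score" reported above is read here as the area under the ROC curve, which is an assumption about the authors' metric. Under that reading, the score is the probability that a randomly chosen correct fact receives a higher classifier confidence than a randomly chosen incorrect one. The sketch below computes it by pairwise comparison; the scores and labels are invented for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier confidences and curator judgments
# (1 = fact judged correctly extracted, 0 = judged incorrect).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

    The pairwise form is quadratic in the number of examples; it is chosen here for readability rather than for use on a set of 100,000 evaluations.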

  8. Assimilating Text-Mining & Bio-Informatics Tools to Analyze Cellulase structures

    NASA Astrophysics Data System (ADS)

    Satyasree, K. P. N. V., Dr; Lalitha Kumari, B., Dr; Jyotsna Devi, K. S. N. V.; Choudri, S. M. Roy; Pratap Joshi, K.

    2017-08-01

    Text mining is one of the most promising ways of automatically extracting information from the huge biological literature. To exploit its potential, the knowledge encoded in the text should be converted to some semantic representation such as entities and relations, which can be analyzed by machines. Large-scale practical systems for this purpose are rare, but text mining could be helpful for generating or validating predictions. Cellulases have abundant applications in various industries. Cellulases are cellulose-degrading enzymes, and the cellulase-producing bacterium Bacillus subtilis and fungus Pseudomonas putida were isolated from topsoil of Guntur Dt., A.P., India. Absolute cultures were conserved on potato dextrose agar medium for molecular studies. In this paper, we present how well text mining concepts can be used to analyze cellulase-producing bacteria and fungi. Their comparative structures are also studied with the aid of well-established, high-quality standard bioinformatic tools such as Bioedit, Swissport, Protparam, and EMBOSSwin, with which complete data on cellulases, such as the structure and constituents of the enzyme, have been obtained.

  9. Automatic detection of adverse events to predict drug label changes using text and data mining techniques.

    PubMed

    Gurulingappa, Harsha; Toldo, Luca; Rajput, Abdul Mateen; Kors, Jan A; Taweel, Adel; Tayrouz, Yorki

    2013-11-01

    The aim of this study was to assess the impact of automatically detected adverse event signals from text and open-source data on the prediction of drug label changes. Open-source adverse effect data were collected from FAERS, Yellow Cards and SIDER databases. A shallow linguistic relation extraction system (JSRE) was applied for extraction of adverse effects from MEDLINE case reports. A statistical approach was applied to the extracted datasets for signal detection and subsequent prediction of label changes issued for 29 drugs by the UK Regulatory Authority in 2009. 76% of drug label changes were automatically predicted. Out of these, 6% of drug label changes were detected only by text mining. JSRE enabled precise identification of four adverse drug events from MEDLINE that were undetectable otherwise. Changes in drug labels can be predicted automatically using data and text mining techniques. Text mining technology is mature and well-placed to support the pharmacovigilance tasks. Copyright © 2013 John Wiley & Sons, Ltd.

  10. Mining Adverse Drug Reactions in Social Media with Named Entity Recognition and Semantic Methods.

    PubMed

    Chen, Xiaoyi; Deldossi, Myrtille; Aboukhamis, Rim; Faviez, Carole; Dahamna, Badisse; Karapetiantz, Pierre; Guenegou-Arnoux, Armelle; Girardeau, Yannick; Guillemin-Lanne, Sylvie; Lillo-Le-Louët, Agnès; Texier, Nathalie; Burgun, Anita; Katsahian, Sandrine

    2017-01-01

    Suspected adverse drug reactions (ADR) reported by patients through social media can be a complementary source to current pharmacovigilance systems. However, the performance of text mining tools applied to social media text data to discover ADRs needs to be evaluated. In this paper, we introduce the approach developed to mine ADR from French social media. A protocol of evaluation is highlighted, which includes a detailed sample size determination and evaluation corpus constitution. Our text mining approach provided very encouraging preliminary results with F-measures of 0.94 and 0.81 for recognition of drugs and symptoms respectively, and with F-measure of 0.70 for ADR detection. Therefore, this approach is promising for downstream pharmacovigilance analysis.

  11. Detection and Evaluation of Cheating on College Exams Using Supervised Classification

    ERIC Educational Resources Information Center

    Cavalcanti, Elmano Ramalho; Pires, Carlos Eduardo; Cavalcanti, Elmano Pontes; Pires, Vládia Freire

    2012-01-01

    Text mining has been used for various purposes, such as document classification and extraction of domain-specific information from text. In this paper we present a study in which text mining methodology and algorithms were properly employed for academic dishonesty (cheating) detection and evaluation on open-ended college exams, based on document…

  12. What the papers say: Text mining for genomics and systems biology

    PubMed Central

    2010-01-01

    Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining -- the automated extraction of information from (electronically) published sources -- could potentially fulfil an important role -- but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward. PMID:21106487

  13. Knowledge based word-concept model estimation and refinement for biomedical text mining.

    PubMed

    Jimeno Yepes, Antonio; Berlanga, Rafael

    2015-02-01

    Text mining of scientific literature has been essential for setting up large public biomedical databases, which are being widely used by the research community. In the biomedical domain, the existence of a large number of terminological resources and knowledge bases (KB) has enabled a myriad of machine learning methods for different text mining related tasks. Unfortunately, KBs have not been devised for text mining tasks but for human interpretation, thus performance of KB-based methods is usually lower when compared to supervised machine learning methods. The disadvantage of supervised methods, though, is that they require labeled training data and are therefore not useful for large-scale biomedical text mining systems. KB-based methods do not have this limitation. In this paper, we describe a novel method to generate word-concept probabilities from a KB, which can serve as a basis for several text mining tasks. This method not only takes into account the underlying patterns within the descriptions contained in the KB but also those in texts available from large unlabeled corpora such as MEDLINE. The parameters of the model have been estimated without training data. Patterns from MEDLINE have been built using MetaMap for entity recognition and related using co-occurrences. The word-concept probabilities were evaluated on the task of word sense disambiguation (WSD). The results showed that our method obtained a higher degree of accuracy than other state-of-the-art approaches when evaluated on the MSH WSD data set. We also evaluated our method on the task of document ranking using MEDLINE citations. These results also showed an increase in performance over existing baseline retrieval approaches. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. 75 FR 60271 - Technical Amendments 2010

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-29

    ... Part VI Department of the Interior Office of Surface Mining Reclamation and Enforcement 30 CFR... INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Parts 740, 761, 773, 795, 816, 817...: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule. SUMMARY: We, the...

  15. 78 FR 35974 - Proposed Information Collection; Comment Request; Coal Mine Rescue Teams; Arrangements for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-14

    ... Request; Coal Mine Rescue Teams; Arrangements for Emergency Medical Assistance and Transportation for... Part 49, Mine Rescue Teams, Subpart B--Mine Rescue Teams for Underground Coal Mines, sets standards related to the availability of mine rescue teams; alternate mine rescue capability for small and remote...

  16. Evaluation of the mining techniques in constructing a traditional Chinese-language nursing recording system.

    PubMed

    Liao, Pei-Hung; Chu, William; Chu, Woei-Chyn

    2014-05-01

    In 2009, the Department of Health, part of Taiwan's Executive Yuan, announced the advent of electronic medical records to reduce medical expenses and facilitate the international exchange of medical record information. An information technology platform for nursing records in medical institutions was then quickly established, which improved nursing information systems and electronic databases. The purpose of the present study was to explore the usability of the data mining techniques to enhance completeness and ensure consistency of nursing records in the database system. First, the study used a Chinese word-segmenting system on common and special terms often used by the nursing staff. We also used text-mining techniques to collect keywords and create a keyword lexicon. We then used an association rule and artificial neural network to measure the correlation and forecasting capability for keywords. Finally, nursing staff members were provided with an on-screen pop-up menu to use when establishing nursing records. Our study found that by using mining techniques we were able to create a powerful keyword lexicon and establish a forecasting model for nursing diagnoses, ensuring the consistency of nursing terminology and improving the nursing staff's work efficiency and productivity.
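
    The abstract does not say which association-rule measures were used. A minimal sketch, assuming the standard support and confidence measures for a rule "antecedent keywords imply consequent keywords", is given below. The keyword sets are invented English placeholders, not terms from the study's Chinese-language lexicon, and the function name is hypothetical.

```python
def rule_metrics(records, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    records: list of keyword sets, one per nursing record.
    support    = share of records containing antecedent and consequent.
    confidence = share of antecedent-containing records that also
                 contain the consequent.
    """
    antecedent = set(antecedent)
    both = antecedent | set(consequent)
    n_total = len(records)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if both <= r)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical keyword sets extracted from four nursing records.
records = [
    {"fever", "infection", "antibiotics"},
    {"fever", "infection"},
    {"fever", "dehydration"},
    {"pain", "analgesic"},
]
support, confidence = rule_metrics(records, {"fever"}, {"infection"})
```

    In this toy data the rule holds in two of four records (support 0.5) and in two of the three records that mention the antecedent (confidence about 0.67). A rule with high confidence is the kind that could plausibly drive a suggestion menu like the one described.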

  17. Data Mining.

    ERIC Educational Resources Information Center

    Benoit, Gerald

    2002-01-01

    Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…

  18. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    ERIC Educational Resources Information Center

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  19. New directions in biomedical text annotation: definitions, guidelines and corpus construction

    PubMed Central

    Wilbur, W John; Rzhetsky, Andrey; Shatkay, Hagit

    2006-01-01

    Background While biomedical text mining is emerging as an important research area, practical results have proven difficult to achieve. We believe that an important first step towards more accurate text-mining lies in the ability to identify and characterize text that satisfies various types of information needs. We report here the results of our inquiry into properties of scientific text that have sufficient generality to transcend the confines of a narrow subject area, while supporting practical mining of text for factual information. Our ultimate goal is to annotate a significant corpus of biomedical text and train machine learning methods to automatically categorize such text along certain dimensions that we have defined. Results We have identified five qualitative dimensions that we believe characterize a broad range of scientific sentences, and are therefore useful for supporting a general approach to text-mining: focus, polarity, certainty, evidence, and directionality. We define these dimensions and describe the guidelines we have developed for annotating text with regard to them. To examine the effectiveness of the guidelines, twelve annotators independently annotated the same set of 101 sentences that were randomly selected from current biomedical periodicals. Analysis of these annotations shows 70–80% inter-annotator agreement, suggesting that our guidelines indeed present a well-defined, executable and reproducible task. Conclusion We present our guidelines defining a text annotation task, along with annotation results from multiple independently produced annotations, demonstrating the feasibility of the task. The annotation of a very large corpus of documents along these guidelines is currently ongoing. These annotations form the basis for the categorization of text along multiple dimensions, to support viable text mining for experimental results, methodology statements, and other forms of information. 
    We are currently developing machine learning methods, to be trained and tested on the annotated corpus, that would allow for the automatic categorization of biomedical text along the general dimensions that we have presented. The guidelines in full detail, along with annotated examples, are publicly available. PMID:16867190

  20. Pathway enrichment based on text mining and its validation on carotenoid and vitamin A metabolism.

    PubMed

    Waagmeester, Andra; Pezik, Piotr; Coort, Susan; Tourniaire, Franck; Evelo, Chris; Rebholz-Schuhmann, Dietrich

    2009-10-01

    Carotenoid metabolism is relevant to the prevention of various diseases. Although the main actors in this metabolic pathway are known, our understanding of the pathway is still incomplete. The information on the carotenoids is scattered in the large and growing body of scientific literature. We designed a text-mining work flow to enrich existing pathways. It has been validated on the vitamin A pathway, which is a well-studied part of the carotenoid metabolism. In this study we used the vitamin A metabolism pathway as it has been described by an expert team on carotenoid metabolism from the European network of excellence in Nutrigenomics (NuGO). This work flow uses an initial set of publications cited in a review paper (1,191 publications), enlarges this corpus with Medline abstracts (13,579 documents), and then extracts the key terminology from all relevant publications. Domain experts validated the intermediate and final results of our text-mining work flow. With our approach we were able to enrich the pathway representing vitamin A metabolism. We found 37 new and relevant terms from a total of 89,086 terms, which have been qualified for inclusion in the analyzed pathway. These 37 terms have been assessed manually and as a result 13 new terms were then added as entities to the pathway. Another 14 entities belonged to other pathways, which could form the link of these pathways with the vitamin A pathway. The remaining 10 terms were classified as biomarkers or nutrients. Automatic literature analysis improves the enrichment of pathways with entities already described in the scientific literature.

  1. 30 CFR 33.38 - Electrical parts.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Electrical parts. 33.38 Section 33.38 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL MINES Test Requirements...

  2. 30 CFR 33.38 - Electrical parts.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Electrical parts. 33.38 Section 33.38 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL MINES Test Requirements...

  3. 30 CFR 33.38 - Electrical parts.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Electrical parts. 33.38 Section 33.38 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL MINES Test Requirements...

  4. An Enhanced Text-Mining Framework for Extracting Disaster Relevant Data through Social Media and Remote Sensing Data Fusion

    NASA Astrophysics Data System (ADS)

    Scheele, C. J.; Huang, Q.

    2016-12-01

    In the past decade, the rise in social media has led to the development of a vast number of social media services and applications. Disaster management represents one of such applications leveraging massive data generated for event detection, response, and recovery. In order to find disaster relevant social media data, current approaches utilize natural language processing (NLP) methods based on keywords, or machine learning algorithms relying on text only. However, these approaches cannot be perfectly accurate due to the variability and uncertainty in language used on social media. To improve current methods, the enhanced text-mining framework is proposed to incorporate location information from social media and authoritative remote sensing datasets for detecting disaster relevant social media posts, which are determined by assessing the textual content using common text mining methods and how the post relates spatiotemporally to the disaster event. To assess the framework, geo-tagged Tweets were collected for three different spatial and temporal disaster events: hurricane, flood, and tornado. Remote sensing data and products for each event were then collected using RealEarth™. Both Naive Bayes and Logistic Regression classifiers were used to compare the accuracy within the enhanced text-mining framework. Finally, the accuracies from the enhanced text-mining framework were compared to the current text-only methods for each of the case study disaster events. The results from this study address the need for more authoritative data when using social media in disaster management applications.

  5. Text Mining for Neuroscience

    NASA Astrophysics Data System (ADS)

    Tirupattur, Naveen; Lapish, Christopher C.; Mukhopadhyay, Snehasis

    2011-06-01

    Text mining, sometimes alternately referred to as text analytics, refers to the process of extracting high-quality knowledge from the analysis of textual data. Text mining has a wide variety of applications in areas such as biomedical science, news analysis, and homeland security. In this paper, we describe an approach and some relatively small-scale experiments which apply text mining to neuroscience research literature to find novel associations among a diverse set of entities. Neuroscience is a discipline which encompasses an exceptionally wide range of experimental approaches and rapidly growing interest. This combination results in an overwhelmingly large and often diffuse literature which makes a comprehensive synthesis difficult. Understanding the relations or associations among the entities appearing in the literature not only improves the researcher's current understanding of recent advances in their field, but also provides an important computational tool to formulate novel hypotheses and thereby assist in scientific discoveries. We describe a methodology to automatically mine the literature and form novel associations through direct analysis of published texts. The method first retrieves a set of documents from databases such as PubMed using a set of relevant domain terms. In the current study these terms yielded document sets ranging from 160,909 to 367,214 documents. Each document is then represented in a numerical vector form from which an Association Graph is computed which represents relationships between all pairs of domain terms, based on co-occurrence. Association graphs can then be subjected to various graph theoretic algorithms such as transitive closure and cycle (circuit) detection to derive additional information, and can also be visually presented to a human researcher for understanding. 
    In this paper, we present three relatively small-scale problem-specific case studies to demonstrate that such an approach is very successful in replicating a neuroscience expert's mental model of object-object associations entirely by means of text mining. These preliminary results provide the confidence that this type of text mining based research approach provides an extremely powerful tool to better understand the literature and drive novel discovery for the neuroscience community.
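
    The pipeline described above (co-occurrence-based association graph, then transitive closure) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the authors' implementation: it skips the numerical vector representation, treats terms as single lowercase words, links two terms when they share at least `min_count` documents, and uses invented example sentences. The function names and the `min_count` parameter are hypothetical.

```python
from itertools import combinations


def association_graph(documents, terms, min_count=1):
    """Undirected graph linking terms that co-occur in >= min_count documents."""
    counts = {}
    for text in documents:
        words = set(text.lower().split())
        present = sorted(t for t in terms if t in words)
        for a, b in combinations(present, 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    graph = {t: set() for t in terms}
    for (a, b), count in counts.items():
        if count >= min_count:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def transitive_closure(graph):
    """Map each term to every other term reachable over one or more edges."""
    closure = {}
    for start in graph:
        seen = set()
        stack = list(graph[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(graph[node])
        seen.discard(start)
        closure[start] = seen
    return closure


# Invented toy corpus and domain terms.
docs = [
    "dopamine release in the striatum",
    "striatum lesions impair reward learning",
    "serotonin and sleep",
]
terms = ["dopamine", "striatum", "reward", "serotonin"]
graph = association_graph(docs, terms)
closure = transitive_closure(graph)
# Terms linked to "dopamine" only through intermediate terms.
indirect = closure["dopamine"] - graph["dopamine"]
```

    Here "dopamine" and "reward" never appear in the same document, yet the closure connects them through "striatum". Indirect links of this kind are the candidate novel associations the abstract refers to.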

  6. Using Open Web APIs in Teaching Web Mining

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Li, Xin; Chau, M.; Ho, Yi-Jen; Tseng, Chunju

    2009-01-01

    With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems…

  7. Text mining of rheumatoid arthritis and diabetes mellitus to understand the mechanisms of Chinese medicine in different diseases with same treatment.

    PubMed

    Zhao, Ning; Zheng, Guang; Li, Jian; Zhao, Hong-Yan; Lu, Cheng; Jiang, Miao; Zhang, Chi; Guo, Hong-Tao; Lu, Ai-Ping

    2018-01-09

    To identify the commonalities between rheumatoid arthritis (RA) and diabetes mellitus (DM) to understand the mechanisms of Chinese medicine (CM) in different diseases with the same treatment. A text mining approach was adopted to analyze the commonalities between RA and DM according to CM and biological elements. The major commonalities were subsequently verified in RA and DM rat models, in which herbal formula for the treatment of both RA and DM identified via text mining was used as the intervention. Similarities were identified between RA and DM regarding the CM approach used for diagnosis and treatment, as well as the networks of biological activities affected by each disease, including the involvement of adhesion molecules, oxidative stress, cytokines, T-lymphocytes, apoptosis, and inflammation. The Ramulus Cinnamomi-Radix Paeoniae Alba-Rhizoma Anemarrhenae is an herbal combination used to treat RA and DM. This formula demonstrated similar effects on oxidative stress and inflammation in rats with collagen-induced arthritis, which supports the text mining results regarding the commonalities between RA and DM. Commonalities between the biological activities involved in RA and DM were identified through text mining, and both RA and DM might be responsive to the same intervention at a specific stage.

  8. Mine Safety and Health Administration's Part 50 program does not fully capture chronic disease and injury in the Illinois mining industry.

    PubMed

    Almberg, Kirsten S; Friedman, Lee S; Swedler, David; Cohen, Robert A

    2018-05-01

    The Mine Safety and Health Administration (MSHA) requires reporting of injuries and illnesses to their Part 50 program. A 2011 study indicated that the Part 50 program did not capture many cases of injury in Kentucky, causing concern about underreporting in other states. MSHA Part 50 reports from Illinois for 2001-2013 were linked to Illinois Workers' Compensation Commission (IWCC) data. IWCC cases not found in the Part 50 data were considered unreported. Overall, the Part 50 Program did not capture 66% of IWCC cases from 2001 to 2013. Chronic injuries or illnesses were more likely to be unreported to MSHA. The majority of occupational injuries and illnesses found in the IWCC from this time period, were not captured by Part 50. Inaccurate reporting of injuries and illnesses to the Part 50 program hinders MSHA's ability to enforce safety and health standards in the mining industry. © 2018 Wiley Periodicals, Inc.

  9. Application of remote-sensing techniques to hydrologic studies in selected coal-mine areas of southeastern Kansas

    USGS Publications Warehouse

    Kenny, J.F.; McCauley, J.R.

    1983-01-01

    Disturbances resulting from intensive coal mining in the Cherry Creek basin of southeastern Kansas were investigated using color and color-infrared aerial photography in conjunction with water-quality data from simultaneously acquired samples. Imagery was used to identify the type and extent of vegetative cover on strip-mined lands and the extent and success of reclamation practices. Drainage patterns, point sources of acid mine drainage, and recharge areas for underground mines were located for onsite inspection. Comparison of these interpretations with water-quality data illustrated differences between the eastern and western parts of the Cherry Creek basin. Contamination in the eastern part is due largely to circulation of water from unreclaimed strip mines and collapse features through the network of underground mines and subsequent discharge of acidic drainage through seeps. Contamination in the western part is primarily caused by runoff and seepage from strip-mined lands in which surfaces have frequently been graded and limed but are generally devoid of mature stands of soil-anchoring vegetation. The successful use of aerial photography in the study of Cherry Creek basin indicates the potential of using remote-sensing techniques in studies of other coal-mined regions. (USGS)

  10. 30 CFR 56.14107 - Moving machine parts.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Moving machine parts. 56.14107 Section 56.14107 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE... Safety Devices and Maintenance Requirements § 56.14107 Moving machine parts. (a) Moving machine parts...

  11. Text Mining of Journal Articles for Sleep Disorder Terminologies.

    PubMed

    Lam, Calvin; Lai, Fu-Chih; Wang, Chia-Hui; Lai, Mei-Hsin; Hsu, Nanly; Chung, Min-Huey

    2016-01-01

    Research on publication trends in journal articles on sleep disorders (SDs) and the associated methodologies by using text mining has been limited. The present study involved text mining for terms to determine the publication trends in sleep-related journal articles published during 2000-2013 and to identify associations between SD and methodology terms as well as conducting statistical analyses of the text mining findings. SD and methodology terms were extracted from 3,720 sleep-related journal articles in the PubMed database by using MetaMap. The extracted data set was analyzed using hierarchical cluster analyses and adjusted logistic regression models to investigate publication trends and associations between SD and methodology terms. MetaMap had a text mining precision, recall, and false positive rate of 0.70, 0.77, and 11.51%, respectively. The most common SD term was breathing-related sleep disorder, whereas narcolepsy was the least common. Cluster analyses showed similar methodology clusters for each SD term, except narcolepsy. The logistic regression models showed an increasing prevalence of insomnia, parasomnia, and other sleep disorders but a decreasing prevalence of breathing-related sleep disorder during 2000-2013. Different SD terms were positively associated with different methodology terms regarding research design terms, measure terms, and analysis terms. Insomnia-, parasomnia-, and other sleep disorder-related articles showed an increasing publication trend, whereas those related to breathing-related sleep disorder showed a decreasing trend. Furthermore, experimental studies more commonly focused on hypersomnia and other SDs and less commonly on insomnia, breathing-related sleep disorder, narcolepsy, and parasomnia. Thus, text mining may facilitate the exploration of the publication trends in SDs and the associated methodologies.
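
    For readers unfamiliar with the three evaluation figures quoted above, the sketch below shows how precision, recall, and false positive rate are conventionally derived from a confusion matrix. The counts are invented for illustration; the abstract does not report the underlying counts, and the function name is hypothetical.

```python
def extraction_metrics(tp, fp, fn, tn):
    """Precision, recall, and false positive rate from confusion counts.

    tp: terms extracted and correct      fp: terms extracted but wrong
    fn: correct terms that were missed   tn: non-terms correctly ignored
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, false_positive_rate


# Hypothetical counts from a manual check of 100 candidate terms.
precision, recall, fpr = extraction_metrics(tp=8, fp=2, fn=2, tn=88)
```

    Precision and recall share the true-positive count but differ in the denominator, which is why a tool can score differently on the two, as in the 0.70 and 0.77 reported here.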

  12. Text Mining to Support Gene Ontology Curation and Vice Versa.

    PubMed

    Ruch, Patrick

    2017-01-01

    In this chapter, we explain how text mining can support the curation of molecular biology databases dealing with protein functions. We also show how curated data can play a disruptive role in the development of text mining methods. We review a decade of efforts to improve the automatic assignment of Gene Ontology (GO) descriptors, the reference ontology for the characterization of genes and gene products. To illustrate the high potential of this approach, we compare the performances of an automatic text categorizer and show a large improvement of +225% in both precision and recall on benchmarked data. We argue that automatic text categorization functions can ultimately be embedded into a Question-Answering (QA) system to answer questions related to protein functions. Because GO descriptors can be relatively long and specific, traditional QA systems cannot answer such questions. A new type of QA system, so-called Deep QA which uses machine learning methods trained with curated contents, is thus emerging. Finally, future advances of text mining instruments are directly dependent on the availability of high-quality annotated contents at every curation step. Database workflows must start recording explicitly all the data they curate and ideally also some of the data they do not curate.

  13. [Text mining, a method for computer-assisted analysis of scientific texts, demonstrated by an analysis of author networks].

    PubMed

    Hahn, P; Dullweber, F; Unglaub, F; Spies, C K

    2014-06-01

    Searching for relevant publications is becoming more difficult with the increasing number of scientific articles. Text mining as a specific form of computer-based data analysis may be helpful in this context. Highlighting relations between authors and finding relevant publications concerning a specific subject using text analysis programs are illustrated graphically with two worked examples. © Georg Thieme Verlag KG Stuttgart · New York.

  14. 30 CFR Appendix I to Subpart C of... - National Consensus Standards

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Subpart C of Part 57 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention and Control Pt. 57, Subpt. C., App. I Appendix I to Subpart C of Part 57—National...

  15. 40 CFR Appendix A to Part 434 - Alternate Storm Limitations for Acid or Ferruginous Mine Drainage

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 30 2014-07-01 2014-07-01 false Alternate Storm Limitations for Acid or Ferruginous Mine Drainage A Appendix A to Part 434 Protection of Environment ENVIRONMENTAL...—Alternate Storm Limitations for Acid or Ferruginous Mine Drainage EC01MY92.113 ...

  16. 40 CFR Appendix A to Part 434 - Alternate Storm Limitations for Acid or Ferruginous Mine Drainage

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 31 2013-07-01 2013-07-01 false Alternate Storm Limitations for Acid or Ferruginous Mine Drainage A Appendix A to Part 434 Protection of Environment ENVIRONMENTAL...—Alternate Storm Limitations for Acid or Ferruginous Mine Drainage EC01MY92.113 ...

  17. 40 CFR Appendix A to Part 434 - Alternate Storm Limitations for Acid or Ferruginous Mine Drainage

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 30 2011-07-01 2011-07-01 false Alternate Storm Limitations for Acid or Ferruginous Mine Drainage A Appendix A to Part 434 Protection of Environment ENVIRONMENTAL... Storm Limitations for Acid or Ferruginous Mine Drainage EC01MY92.113 ...

  18. 40 CFR Appendix A to Part 434 - Alternate Storm Limitations for Acid or Ferruginous Mine Drainage

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 29 2010-07-01 2010-07-01 false Alternate Storm Limitations for Acid or Ferruginous Mine Drainage A Appendix A to Part 434 Protection of Environment ENVIRONMENTAL... Storm Limitations for Acid or Ferruginous Mine Drainage EC01MY92.113 ...

  19. 40 CFR Appendix A to Part 434 - Alternate Storm Limitations for Acid or Ferruginous Mine Drainage

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 31 2012-07-01 2012-07-01 false Alternate Storm Limitations for Acid or Ferruginous Mine Drainage A Appendix A to Part 434 Protection of Environment ENVIRONMENTAL...—Alternate Storm Limitations for Acid or Ferruginous Mine Drainage EC01MY92.113 ...

  20. Mining of Business-Oriented Conversations at a Call Center

    NASA Astrophysics Data System (ADS)

    Takeuchi, Hironori; Nasukawa, Tetsuya; Watanabe, Hideo

    Recently it has become feasible to transcribe textual records from telephone conversations at call centers by using automatic speech recognition. In this research, we extended a text mining system for call summary records and constructed a conversation mining system for the business-oriented conversations at the call center. To acquire useful business insights from the conversational data through the text mining system, it is critical to identify appropriate textual segments and expressions as the viewpoints to focus on. In the analysis of call summary data using a text mining system, some experts defined the viewpoints for the analysis by looking at some sample records and by preparing the dictionaries based on frequent keywords in the sample dataset. However with conversations it is difficult to identify such viewpoints manually and in advance because the target data consists of complete transcripts that are often lengthy and redundant. In this research, we defined a model of the business-oriented conversations and proposed a mining method to identify segments that have impacts on the outcomes of the conversations and can then extract useful expressions in each of these identified segments. In the experiment, we processed the real datasets from a car rental service center and constructed a mining system. With this system, we show the effectiveness of the method based on the defined conversation model.

  1. Recent progress in automatically extracting information from the pharmacogenomic literature

    PubMed Central

    Garten, Yael; Coulet, Adrien; Altman, Russ B

    2011-01-01

    The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206

  2. Mining protein function from text using term-based support vector machines

    PubMed Central

    Rice, Simon B; Nenadic, Goran; Stapley, Benjamin J

    2005-01-01

    Background: Text mining has spurred huge interest in the domain of biology. The goal of the BioCreAtIvE exercise was to evaluate the performance of current text mining systems. We participated in Task 2, which addressed assigning Gene Ontology terms to human proteins and selecting relevant evidence from full-text documents. We approached it as a modified form of the document classification task. We used a supervised machine-learning approach (based on support vector machines) to assign protein function and select passages that support the assignments. As classification features, we used a protein's co-occurring terms that were automatically extracted from documents. Results: The results evaluated by curators were modest, and quite variable for different problems: in many cases we have relatively good assignment of GO terms to proteins, but the selected supporting text was typically non-relevant (precision spanning from 3% to 50%). The method appears to work best when a substantial set of relevant documents is obtained, while it works poorly on single documents and/or short passages. The initial results suggest that our approach can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent. Conclusion: A machine learning approach to mining protein function predictions from text can yield good performance only if sufficient training data is available and a significant amount of supporting data is used for prediction. The most promising results are for combined document retrieval and GO term assignment, which calls for the integration of methods developed in BioCreAtIvE Task 1 and Task 2. PMID:15960835

  3. Using text mining for study identification in systematic reviews: a systematic review of current approaches.

    PubMed

    O'Mara-Eves, Alison; Thomas, James; McNaught, John; Miwa, Makoto; Ananiadou, Sophia

    2015-01-14

    The large and growing number of published studies, and their increasing rate of publication, makes the task of identifying relevant studies in an unbiased way for inclusion in systematic reviews both complex and time consuming. Text mining has been offered as a potential solution: through automating some of the screening process, reviewer time can be saved. The evidence base around the use of text mining for screening has not yet been pulled together systematically; this systematic review fills that research gap. Focusing mainly on non-technical issues, the review aims to increase awareness of the potential of these technologies and promote further collaborative research between the computer science and systematic review communities. Five research questions led our review: what is the state of the evidence base; how has workload reduction been evaluated; what are the purposes of semi-automation and how effective are they; how have key contextual problems of applying text mining to the systematic review field been addressed; and what challenges to implementation have emerged? We answered these questions using standard systematic review methods: systematic and exhaustive searching, quality-assured data extraction and a narrative synthesis to synthesise findings. The evidence base is active and diverse; there is almost no replication between studies or collaboration between research teams and, whilst it is difficult to establish any overall conclusions about best approaches, it is clear that efficiencies and reductions in workload are potentially achievable. On the whole, most studies suggested that a saving in workload of between 30% and 70% might be possible, though sometimes the saving in workload is accompanied by the loss of 5% of relevant studies (i.e. a 95% recall). Using text mining to prioritise the order in which items are screened should be considered safe and ready for use in 'live' reviews. Text mining may also be used cautiously as a 'second screener'. The use of text mining to eliminate studies automatically should be considered promising, but not yet fully proven. In highly technical/clinical areas, it may be used with a high degree of confidence; but more developmental and evaluative work is needed in other disciplines.
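
    The trade-off described above, workload saved against relevant studies lost, can be sketched for a prioritised screening list. This is an illustration only: the function name, the stopping rule (read only the top-ranked items) and the ranking below are assumptions and do not come from any of the reviewed studies.

```python
def screening_tradeoff(ranked_labels: list[bool], n_screened: int) -> tuple[float, float]:
    """Recall and fraction of workload saved when only the top n_screened items are read.

    ranked_labels holds the true relevance of each citation, ordered by a
    (hypothetical) text-mining relevance score, most promising first.
    """
    if not ranked_labels:
        raise ValueError("ranked_labels must not be empty")
    total_relevant = sum(ranked_labels)
    found = sum(ranked_labels[:n_screened])
    recall = found / total_relevant if total_relevant else 1.0
    workload_saved = 1 - n_screened / len(ranked_labels)
    return recall, workload_saved


# Hypothetical ranking of 20 citations; 5 are relevant, one is ranked low (index 15).
ranked = [True, True, True, False, True] + [False] * 10 + [True] + [False] * 4
recall, saved = screening_tradeoff(ranked, n_screened=10)
print(f"recall={recall:.2f}, workload saved={saved:.0%}")
```

    Stopping earlier saves more reading but risks missing relevant studies ranked further down the list.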

  4. 77 FR 5740 - Tennessee Abandoned Mine Land Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 942... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; public comment period and... amendment to the Tennessee Abandoned Mine Land (AML) Reclamation Plan under the Surface Mining Control and...

  5. 30 CFR 870.5 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... situ coal mining means activities conducted on the surface or underground in connection with in-place... not limited to, in situ gasification, in situ leaching, slurry mining, solution mining, bore hole mining, and fluid recovery mining. At this time, part 870 considers only in situ gasification. Inherent...

  6. 30 CFR 870.5 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... situ coal mining means activities conducted on the surface or underground in connection with in-place... not limited to, in situ gasification, in situ leaching, slurry mining, solution mining, bore hole mining, and fluid recovery mining. At this time, part 870 considers only in situ gasification. Inherent...

  7. 30 CFR 870.5 - Definitions.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... situ coal mining means activities conducted on the surface or underground in connection with in-place... not limited to, in situ gasification, in situ leaching, slurry mining, solution mining, bore hole mining, and fluid recovery mining. At this time, part 870 considers only in situ gasification. Inherent...

  8. 30 CFR 870.5 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... situ coal mining means activities conducted on the surface or underground in connection with in-place... not limited to, in situ gasification, in situ leaching, slurry mining, solution mining, bore hole mining, and fluid recovery mining. At this time, part 870 considers only in situ gasification. Inherent...

  9. 30 CFR 870.5 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... situ coal mining means activities conducted on the surface or underground in connection with in-place... not limited to, in situ gasification, in situ leaching, slurry mining, solution mining, bore hole mining, and fluid recovery mining. At this time, part 870 considers only in situ gasification. Inherent...

  10. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery.

    PubMed

    Gonzalez, Graciela H; Tahsin, Tasnia; Goodale, Britton C; Greene, Anna C; Greene, Casey S

    2016-01-01

    Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine. © The Author 2015. Published by Oxford University Press.

  11. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery

    PubMed Central

    Gonzalez, Graciela H.; Tahsin, Tasnia; Goodale, Britton C.; Greene, Anna C.

    2016-01-01

    Precision medicine will revolutionize the way we treat and prevent disease. A major barrier to the implementation of precision medicine that clinicians and translational scientists face is understanding the underlying mechanisms of disease. We are starting to address this challenge through automatic approaches for information extraction, representation and analysis. Recent advances in text and data mining have been applied to a broad spectrum of key biomedical questions in genomics, pharmacogenomics and other fields. We present an overview of the fundamental methods for text and data mining, as well as recent advances and emerging applications toward precision medicine. PMID:26420781

  12. Application of text mining for customer evaluations in commercial banking

    NASA Astrophysics Data System (ADS)

    Tan, Jing; Du, Xiaojiang; Hao, Pengpeng; Wang, Yanbo J.

    2015-07-01

    Customer attrition is an increasingly serious problem for commercial banks. To address it thoroughly, mining customer evaluation texts is as important as mining structured customer data. To extract hidden information from customer evaluations, textual feature selection, classification and association rule mining are necessary techniques. This paper applies all three using Chinese word segmentation, C5.0 and Apriori, and reports a set of experiments run on a collection of real textual data comprising 823 customer evaluations taken from a Chinese commercial bank. Results, consequent solutions and some advice for the commercial bank are given in this paper.
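
    The abstract names Apriori for association rule mining but gives no implementation details. A minimal sketch of the frequent-itemset stage of Apriori, assuming each evaluation has already been segmented into a set of keywords; the function name, the keyword sets and the support threshold are hypothetical:

```python
from itertools import combinations


def frequent_itemsets(transactions: list[set[str]], min_support: float) -> dict[frozenset, float]:
    """Return every itemset whose support is at least min_support (Apriori, level by level)."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    result: dict[frozenset, float] = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        support = {c: sum(1 for t in transactions if c <= t) / n for c in current}
        level = {c: s for c, s in support.items() if s >= min_support}
        result.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


# Hypothetical keyword sets standing in for segmented customer evaluations.
evaluations = [
    {"fee", "slow", "queue"},
    {"fee", "slow"},
    {"fee", "friendly"},
    {"slow", "queue"},
]
itemsets = frequent_itemsets(evaluations, min_support=0.5)
# Confidence of the rule slow -> queue is support({slow, queue}) / support({slow}).
confidence = itemsets[frozenset({"slow", "queue"})] / itemsets[frozenset({"slow"})]
print(len(itemsets), round(confidence, 3))
```

    Rules would then be read off the frequent itemsets by comparing supports, as the confidence line shows for one rule.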

  13. Extracting semantically enriched events from biomedical literature

    PubMed Central

    2012-01-01

    Background: Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Results: Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP’09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP’09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. Conclusions: We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare. PMID:22621266

  14. Extracting semantically enriched events from biomedical literature.

    PubMed

    Miwa, Makoto; Thompson, Paul; McNaught, John; Kell, Douglas B; Ananiadou, Sophia

    2012-05-23

    Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.
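
    The abstract reports macro-averaged F-scores without spelling out the calculation. A minimal sketch, assuming the balanced F1 score averaged with equal weight per class; the function names, class names and counts are hypothetical and are not results from EventMine-MK:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (harmonic mean of precision and recall) from raw counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def macro_f1(per_class_counts: dict[str, tuple[int, int, int]]) -> float:
    """Unweighted mean of the per-class F1 scores, so rare classes count as much as common ones."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts per meta-knowledge value, for illustration only.
counts = {
    "negated": (8, 2, 2),
    "speculated": (3, 1, 5),
}
macro = macro_f1(counts)
print(f"macro-averaged F1 = {macro:.2f}")
```

    Macro-averaging is sensitive to performance on infrequent categories, since each class contributes equally regardless of how many instances it has.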

  15. Individual Profiling Using Text Analysis

    DTIC Science & Technology

    2016-04-15

    Mining a Text for Errors. . . . on Knowledge discovery in data mining, pages 624–628, 2005. [12] Michal Kosinski, David Stillwell, and Thore Graepel... AFRL-AFOSR-UK-TR-2016-0011 Individual Profiling using Text Analysis 140333 Mark Stevenson UNIVERSITY OF SHEFFIELD, DEPARTMENT OF PSYCHOLOGY Final... REPORT TYPE Final 3. DATES COVERED (From - To) 15 Sep 2014 to 14 Sep 2015 4. TITLE AND SUBTITLE Individual Profiling using Text Analysis

  16. Mining the pharmacogenomics literature—a survey of the state of the art

    PubMed Central

    Cohen, K. Bretonnel; Garten, Yael; Shah, Nigam H.

    2012-01-01

    This article surveys efforts on text mining of the pharmacogenomics literature, mainly from the period 2008 to 2011. Pharmacogenomics (or pharmacogenetics) is the field that studies how human genetic variation impacts drug response. Therefore, publications span the intersection of research in genotypes, phenotypes and pharmacology, a topic that has increasingly become a focus of active research in recent years. This survey covers efforts dealing with the automatic recognition of relevant named entities (e.g. genes, gene variants and proteins, diseases and other pathological phenomena, drugs and other chemicals relevant for medical treatment), as well as various forms of relations between them. A wide range of text genres is considered, such as scientific publications (abstracts, as well as full texts), patent texts and clinical narratives. We also discuss infrastructure and resources needed for advanced text analytics, e.g. document corpora annotated with corresponding semantic metadata (gold standards and training data), biomedical terminologies and ontologies providing domain-specific background knowledge at different levels of formality and specificity, software architectures for building complex and scalable text analytics pipelines and Web services grounded to them, as well as comprehensive ways to disseminate and interact with the typically huge amounts of semiformal knowledge structures extracted by text mining tools. Finally, we consider some of the novel applications that have already been developed in the field of pharmacogenomic text mining and point out perspectives for future research. PMID:22833496

  17. Mining the pharmacogenomics literature--a survey of the state of the art.

    PubMed

    Hahn, Udo; Cohen, K Bretonnel; Garten, Yael; Shah, Nigam H

    2012-07-01

    This article surveys efforts on text mining of the pharmacogenomics literature, mainly from the period 2008 to 2011. Pharmacogenomics (or pharmacogenetics) is the field that studies how human genetic variation impacts drug response. Therefore, publications span the intersection of research in genotypes, phenotypes and pharmacology, a topic that has increasingly become a focus of active research in recent years. This survey covers efforts dealing with the automatic recognition of relevant named entities (e.g. genes, gene variants and proteins, diseases and other pathological phenomena, drugs and other chemicals relevant for medical treatment), as well as various forms of relations between them. A wide range of text genres is considered, such as scientific publications (abstracts, as well as full texts), patent texts and clinical narratives. We also discuss infrastructure and resources needed for advanced text analytics, e.g. document corpora annotated with corresponding semantic metadata (gold standards and training data), biomedical terminologies and ontologies providing domain-specific background knowledge at different levels of formality and specificity, software architectures for building complex and scalable text analytics pipelines and Web services grounded to them, as well as comprehensive ways to disseminate and interact with the typically huge amounts of semiformal knowledge structures extracted by text mining tools. Finally, we consider some of the novel applications that have already been developed in the field of pharmacogenomic text mining and point out perspectives for future research.

  18. Using Text Mining to Characterize Online Discussion Facilitation

    ERIC Educational Resources Information Center

    Ming, Norma; Baumer, Eric

    2011-01-01

    Facilitating class discussions effectively is a critical yet challenging component of instruction, particularly in online environments where student and faculty interaction is limited. Our goals in this research were to identify facilitation strategies that encourage productive discussion, and to explore text mining techniques that can help…

  19. pubmed.mineR: an R package with text-mining algorithms to analyse PubMed abstracts.

    PubMed

    Rani, Jyoti; Shah, A B Rauf; Ramachandran, Srinivasan

    2015-10-01

    The PubMed literature database is a valuable source of information for scientific research. It is rich in biomedical literature with more than 24 million citations. Data-mining of voluminous literature is a challenging task. Although several text-mining algorithms have been developed in recent years with a focus on data visualization, they have limitations: they are slow, rigid, and not available as open source. We have developed an R package, pubmed.mineR, wherein we have combined the advantages of existing algorithms, overcome their limitations, and offer user flexibility and links with other packages in Bioconductor and the Comprehensive R Archive Network (CRAN) in order to expand the user capabilities for executing multifaceted approaches. Three case studies are presented, namely, 'Evolving role of diabetes educators', 'Cancer risk assessment' and 'Dynamic concepts on disease and comorbidity' to illustrate the use of pubmed.mineR. The package generally runs fast with small elapsed times in regular workstations even on large corpus sizes and with compute intensive functions. The pubmed.mineR is available at http://cran.r-project.org/web/packages/pubmed.mineR.

  20. PubstractHelper: A Web-based Text-Mining Tool for Marking Sentences in Abstracts from PubMed Using Multiple User-Defined Keywords.

    PubMed

    Chen, Chou-Cheng; Ho, Chung-Liang

    2014-01-01

    While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, using pre-defined search keywords while lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper can accept a maximum of ten groups of keywords, with each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMID and the abstracts with color markings to be reviewed later. The PubstractHelper website can help users to identify relevant publications based on the presence of related keywords, which should be a handy tool for their research. http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx.
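
    The principle stated above, that keywords appearing in the same sentence are likely to be related, can be sketched as follows. This is not PubstractHelper's implementation, which the abstract does not describe; the function name, the regex-based sentence splitting, the case-insensitive substring matching and the sample text are all assumptions.

```python
import re


def cooccurring_sentences(abstract: str, keyword_groups: list[list[str]]) -> list[tuple[str, list[int]]]:
    """Return sentences that mention keywords from at least two different groups.

    Each hit pairs the sentence with the indices of the keyword groups it matched.
    """
    # Naive sentence split on terminal punctuation followed by whitespace (an assumption).
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        matched = [
            index for index, group in enumerate(keyword_groups)
            if any(keyword.lower() in lowered for keyword in group)
        ]
        if len(matched) >= 2:
            hits.append((sentence, matched))
    return hits


# Hypothetical abstract text and keyword groups, for illustration only.
text = "Aspirin reduces fever. Aspirin may cause bleeding in the stomach. Rest is advised."
groups = [["aspirin"], ["bleeding", "ulcer"]]
hits = cooccurring_sentences(text, groups)
for sentence, matched_groups in hits:
    print(matched_groups, sentence)
```

    The matched group indices could drive per-group highlight colors, in the spirit of the color marking the abstract describes.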

  1. Graphics-based intelligent search and abstracting using Data Modeling

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger M.; Handley, James W.; Case, Carl T.; Songy, Claude G.

    2002-11-01

    This paper presents an autonomous text and context-mining algorithm that converts text documents into point clouds for visual search cues. This algorithm is applied to the task of data-mining a scriptural database comprised of the Old and New Testaments from the Bible and the Book of Mormon, Doctrine and Covenants, and the Pearl of Great Price. Results are generated which graphically show the scripture that represents the average concept of the database and the mining of the documents down to the verse level.

  2. 30 CFR 921.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... surface coal mining operations. 921.764 Section 921.764 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS... mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface...

  3. 30 CFR 933.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... surface coal mining operations. 933.764 Section 933.764 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS... mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface...

  4. 30 CFR 912.785 - Requirements for permits for special categories of mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... of mining. 912.785 Section 912.785 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE IDAHO § 912.785 Requirements for permits for special categories of mining. Part 785 of this...

  5. 30 CFR 903.785 - Requirements for permits for special categories of mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... of mining. 903.785 Section 903.785 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE ARIZONA § 903.785 Requirements for permits for special categories of mining. Part 785 of this...

  6. 30 CFR 905.785 - Requirements for permits for special categories of mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... of mining. 905.785 Section 905.785 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE CALIFORNIA § 905.785 Requirements for permits for special categories of mining. Part 785 of this...

  7. 30 CFR 910.785 - Requirements for permits for special categories of mining.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... of mining. 910.785 Section 910.785 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE GEORGIA § 910.785 Requirements for permits for special categories of mining. Part 785 of this...

  8. The Labour Welfare Fund Laws (Amendment) Act, 1987 (No. 15 of 1987), 22 May 1987.

    PubMed

    1987-01-01

    This Act authorizes funds constituted under the Mica Mines Labour Welfare Fund Act, 1946, the Limestone and Dolomite Mines Labour Welfare Fund Act, 1972, the Iron Ore Mines, Manganese Ore Mines and Chrome Mines Labour Welfare Fund Act, 1976, and the Beedi Workers Welfare Fund Act, 1976, to be applied for the provision of family welfare, including family planning education and services.

  9. Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track

    DTIC Science & Technology

    2015-11-20

    Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track Paul N. Bennett Microsoft Research Redmond, USA pauben...anchor text graph has proven useful in the general realm of query reformulation [2], we sought to quantify the value of extracting key phrases from...anchor text in the broader setting of the task understanding track. Given a query, our approach considers a simple method for identifying a relevant

  10. Automated Text Data Mining Analysis of Five Decades of Educational Leadership Research Literature: Probabilistic Topic Modeling of "EAQ" Articles From 1965 to 2014

    ERIC Educational Resources Information Center

    Wang, Yinying; Bowers, Alex J.; Fikis, David J.

    2017-01-01

    Purpose: The purpose of this study is to describe the underlying topics and the topic evolution in the 50-year history of educational leadership research literature. Method: We used automated text data mining with probabilistic latent topic models to examine the full text of the entire publication history of all 1,539 articles published in…

  11. 30 CFR Appendix I to Subpart T of... - Standard Applicability by Category or Subcategory

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Subcategory I Appendix I to Subpart T of Part 57 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... AND NONMETAL MINES Safety Standards for Methane in Metal and Nonmetal Mines Pt. 57, Subpt. T, App. I Appendix I to Subpart T of Part 57—Standard Applicability by Category or Subcategory Subcategory I-A 57...

  12. 30 CFR Appendix I to Subpart T of... - Standard Applicability by Category or Subcategory

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Subcategory I Appendix I to Subpart T of Part 57 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... AND NONMETAL MINES Safety Standards for Methane in Metal and Nonmetal Mines Pt. 57, Subpt. T, App. I Appendix I to Subpart T of Part 57—Standard Applicability by Category or Subcategory Subcategory I-A 57...

  13. Effects of potential surface coal mining on dissolved solids in Otter Creek and in the Otter Creek alluvial aquifer, southeastern Montana

    USGS Publications Warehouse

    Cannon, M.R.

    1985-01-01

    Otter Creek drains an area of 709 square miles in the coal-rich Powder River structural basin of southeastern Montana. The Knobloch coal beds in the Tongue River Member of the Paleocene Fort Union Formation are a shallow aquifer and a target for future surface mining in the downstream part of the Otter Creek basin. A mass-balance model was used to estimate the effects of potential mining on the dissolved solids concentration in Otter Creek and in the alluvial aquifer in the Otter Creek valley. With extensive mining of the Knobloch coal beds, the annual load of dissolved solids to Otter Creek at Ashland at median streamflow could increase by 2,873 tons, or a 32-percent increase compared to the annual pre-mining load. Increased monthly loads of Otter Creek, at the median streamflow, could range from 15 percent in February to 208 percent in August. The post-mining dissolved solids load to the subirrigated part of the alluvial valley could increase by 71 percent. The median dissolved solids concentration in the subirrigated part of the valley could be 4,430 milligrams per liter, compared to the pre-mining median concentration of 2,590 milligrams per liter. Post-mining loads from the potentially mined landscape were calculated using saturated-paste-extract data from 506 overburden samples collected from 26 wells and test holes. Post-mining loads to the Otter Creek valley likely would continue at increased rates for hundreds of years after mining. If the actual area of Knobloch coal disturbed by mining were less than that used in the model, post-mining loads to the Otter Creek valley would be proportionally smaller. (USGS)

  14. Long term contracts, expansion, innovation and stability: North Dakota's lignite mines thrive

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchsbaum, L.

    2009-08-15

    North Dakota's lignite coal industry is mainly located in three counties in the central part of the state. Its large surface lignite mines are tied through long-term (20-40 years) contracts to power plants. The article talks about operations at three of the most productive mines - the Freedom mine, Falkirk mine and Center Mine. 4 figs.

  15. 30 CFR Appendix to Subpart B of... - Optional Form for Certifying Mine Rescue Teams

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Optional Form for Certifying Mine Rescue Teams... LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines Pt. 49, Subpt. B, App. Appendix to Subpart B of Part 49—Optional Form for Certifying Mine Rescue Teams ER08FE08...

  16. 30 CFR Appendix to Subpart B of... - Optional Form for Certifying Mine Rescue Teams

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Optional Form for Certifying Mine Rescue Teams... LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Coal Mines Pt. 49, Subpt. B, App. Appendix to Subpart B of Part 49—Optional Form for Certifying Mine Rescue Teams ER08FE08...

  17. A text-based data mining and toxicity prediction modeling system for a clinical decision support in radiation oncology: A preliminary study

    NASA Astrophysics Data System (ADS)

    Kim, Kwang Hyeon; Lee, Suk; Shim, Jang Bo; Chang, Kyung Hwan; Yang, Dae Sik; Yoon, Won Sup; Park, Young Je; Kim, Chul Yong; Cao, Yuan Jie

    2017-08-01

    The aim of this study is an integrated research for text-based data mining and toxicity prediction modeling system for clinical decision support system based on big data in radiation oncology as a preliminary research. The structured and unstructured data were prepared by treatment plans and the unstructured data were extracted by dose-volume data image pattern recognition of prostate cancer for research articles crawling through the internet. We modeled an artificial neural network to build a predictor model system for toxicity prediction of organs at risk. We used a text-based data mining approach to build the artificial neural network model for bladder and rectum complication predictions. The pattern recognition method was used to mine the unstructured toxicity data for dose-volume at the detection accuracy of 97.9%. The confusion matrix and training model of the neural network were achieved with 50 modeled plans (n = 50) for validation. The toxicity level was analyzed and the risk factors for 25% bladder, 50% bladder, 20% rectum, and 50% rectum were calculated by the artificial neural network algorithm. As a result, 32 plans could cause complication but 18 plans were designed as non-complication among 50 modeled plans. We integrated data mining and a toxicity modeling method for toxicity prediction using prostate cancer cases. It is shown that a preprocessing analysis using text-based data mining and prediction modeling can be expanded to personalized patient treatment decision support based on big data.

  18. 30 CFR 57.14107 - Moving machine parts.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Moving machine parts. 57.14107 Section 57.14107 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE... Equipment Safety Devices and Maintenance Requirements § 57.14107 Moving machine parts. (a) Moving machine...

  19. A sentence sliding window approach to extract protein annotations from biomedical articles

    PubMed Central

    Krallinger, Martin; Padron, Maria; Valencia, Alfonso

    2005-01-01

    Background: Within the emerging field of text mining and statistical natural language processing (NLP) applied to biomedical articles, a broad variety of techniques have been developed during the past years. Nevertheless, there is still a great need of comparative assessment of the performance of the proposed methods and the development of common evaluation criteria. This issue was addressed by the Critical Assessment of Text Mining Methods in Molecular Biology (BioCreative) contest. The aim of this contest was to assess the performance of text mining systems applied to biomedical texts including tools which recognize named entities such as genes and proteins, and tools which automatically extract protein annotations. Results: The "sentence sliding window" approach proposed here was found to efficiently extract text fragments from full text articles containing annotations on proteins, providing the highest number of correctly predicted annotations. Moreover, the number of correct extractions of individual entities (i.e. proteins and GO terms) involved in the relationships used for the annotations was significantly higher than the correct extractions of the complete annotations (protein-function relations). Conclusion: We explored the use of averaging sentence sliding windows for information extraction, especially in a context where conventional training data is unavailable. The combination of our approach with more refined statistical estimators and machine learning techniques might be a way to improve annotation extraction for future biomedical text mining applications. PMID:15960831
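
    As a rough illustration of the sliding-window idea in this abstract, the sketch below slides a fixed-size window over consecutive sentences and keeps the window that mentions both a protein term and a function term most often. The function name `best_window`, the term sets, the window size, and the hit-count scoring are all assumptions made for illustration; the abstract does not give the authors' actual scoring or averaging scheme.

```python
import re

def best_window(text, protein_terms, function_terms, size=3):
    """Return (score, sentences) for the highest-scoring sentence window."""
    # Naive sentence split on end punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    best_score, best = -1, []
    for start in range(max(1, len(sentences) - size + 1)):
        window = sentences[start:start + size]
        words = set(re.findall(r"[a-z0-9-]+", " ".join(window).lower()))
        p_hits = len(words & protein_terms)
        f_hits = len(words & function_terms)
        # Assumed scoring: a window only counts if both kinds of term occur.
        score = (p_hits + f_hits) if p_hits and f_hits else 0
        if score > best_score:
            best_score, best = score, window
    return best_score, best

# Hypothetical example text and term lists.
text = ("We cloned several genes. The kinase abc1 was purified. "
        "It shows ATP binding activity. Cells were grown overnight.")
score, window = best_window(text, {"abc1"}, {"atp", "binding"}, size=2)
```

    Here the protein name and its function sit in adjacent sentences, so only a window spanning both sentences scores above zero, which is the situation a single-sentence matcher would miss.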

  20. Text Mining Improves Prediction of Protein Functional Sites

    PubMed Central

    Cohn, Judith D.; Ravikumar, Komandur E.

    2012-01-01

    We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388

  1. Gene Prioritization of Resistant Rice Gene against Xanthomas oryzae pv. oryzae by Using Text Mining Technologies

    PubMed Central

    Xia, Jingbo; Zhang, Xing; Yuan, Daojun; Chen, Lingling; Webster, Jonathan; Fang, Alex Chengyu

    2013-01-01

    To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization. PMID:24371834
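
    For readers unfamiliar with the term-weighting step mentioned above, the following is a minimal sketch of textbook TF-IDF (term frequency times the log of inverse document frequency). The toy corpus and the function name `tf_idf` are hypothetical, and the exact weighting variant used in the paper is not stated in the abstract.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical toy corpus, already tokenized.
docs = [
    "rice gene confers resistance to blight".split(),
    "rice gene regulates flowering time".split(),
    "rice yield under drought".split(),
]
weights = tf_idf(docs)
```

    A term that occurs in every document ("rice") receives weight zero, while a term confined to one document ("resistance") is weighted highest, which is why TF-IDF is a common way to surface distinguishing terms.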

  2. Uncovering text mining: A survey of current work on web-based epidemic intelligence

    PubMed Central

    Collier, Nigel

    2012-01-01

    Real world pandemics such as SARS 2002 as well as popular fiction like the movie Contagion graphically depict the health threat of a global pandemic and the key role of epidemic intelligence (EI). While EI relies heavily on established indicator sources a new class of methods based on event alerting from unstructured digital Internet media is rapidly becoming acknowledged within the public health community. At the heart of automated information gathering systems is a technology called text mining. My contribution here is to provide an overview of the role that text mining technology plays in detecting epidemics and to synthesise my existing research on the BioCaster project. PMID:22783909

  3. Gene prioritization of resistant rice gene against Xanthomas oryzae pv. oryzae by using text mining technologies.

    PubMed

    Xia, Jingbo; Zhang, Xing; Yuan, Daojun; Chen, Lingling; Webster, Jonathan; Fang, Alex Chengyu

    2013-01-01

    To effectively assess the possibility of the unknown rice protein resistant to Xanthomonas oryzae pv. oryzae, a hybrid strategy is proposed to enhance gene prioritization by combining text mining technologies with a sequence-based approach. The text mining technique of term frequency inverse document frequency is used to measure the importance of distinguished terms which reflect biomedical activity in rice before candidate genes are screened and vital terms are produced. Afterwards, a built-in classifier under the chaos games representation algorithm is used to sieve the best possible candidate gene. Our experiment results show that the combination of these two methods achieves enhanced gene prioritization.

  4. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining.

    PubMed

    Kreula, Sanna M; Kaewphan, Suwisa; Ginter, Filip; Jones, Patrik R

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.

  5. 78 FR 48591 - Refuge Alternatives for Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-08

    ... Administration 30 CFR Parts 7 and 75 Refuge Alternatives for Underground Coal Mines; Proposed Rules Federal... Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION: Limited reopening of the... for miners to deploy and use refuge alternatives in underground coal mines. The U.S. Court of Appeals...

  6. 75 FR 20918 - High-Voltage Continuous Mining Machine Standard for Underground Coal Mines

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-22

    ... DEPARTMENT OF LABOR Mine Safety and Health Administration 30 CFR Parts 18 and 75 RIN 1219-AB34 High-Voltage Continuous Mining Machine Standard for Underground Coal Mines Correction In rule document 2010-7309 beginning on page 17529 in the issue of Tuesday, April 6, 2010, make the following correction...

  7. 75 FR 82074 - Fee Adjustment for Testing, Evaluation, and Approval of Mining Products

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-29

    ..., and Approval of Mining Products AGENCY: Mine Safety and Health Administration (MSHA), Labor. ACTION..., evaluating, and approving mining products as provided by 30 CFR part 5. MSHA charges applicants a fee to... materials manufactured for use in the mining industry. The new fee schedule, effective January 1, 2011, is...

  8. 30 CFR 900.2 - Objectives.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... texts of State and Federal cooperative agreements for regulation of mining on Federal lands. The... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE INTRODUCTION § 900.2 Objectives. The objective of...

  9. Complementing the Numbers: A Text Mining Analysis of College Course Withdrawals

    ERIC Educational Resources Information Center

    Michalski, Greg V.

    2011-01-01

    Excessive college course withdrawals are costly to the student and the institution in terms of time to degree completion, available classroom space, and other resources. Although generally well quantified, detailed analysis of the reasons given by students for course withdrawal is less common. To address this, a text mining analysis was performed…

  10. Can abstract screening workload be reduced using text mining? User experiences of the tool Rayyan.

    PubMed

    Olofsson, Hanna; Brolund, Agneta; Hellberg, Christel; Silverstein, Rebecca; Stenström, Karin; Österberg, Marie; Dagerhamn, Jessica

    2017-09-01

    One time-consuming aspect of conducting systematic reviews is the task of sifting through abstracts to identify relevant studies. One promising approach for reducing this burden uses text mining technology to identify those abstracts that are potentially most relevant for a project, allowing those abstracts to be screened first. To examine the effectiveness of the text mining functionality of the abstract screening tool Rayyan. User experiences were collected. Rayyan was used to screen abstracts for 6 reviews in 2015. After screening 25%, 50%, and 75% of the abstracts, the screeners logged the relevant references identified. A survey was sent to users. After screening half of the search result with Rayyan, 86% to 99% of the references deemed relevant to the study were identified. Of those studies included in the final reports, 96% to 100% were already identified in the first half of the screening process. Users rated Rayyan 4.5 out of 5. The text mining function in Rayyan successfully helped reviewers identify relevant studies early in the screening process. Copyright © 2017 John Wiley & Sons, Ltd.

  11. Adverse Event extraction from Structured Product Labels using the Event-based Text-mining of Health Electronic Records (ETHER) system.

    PubMed

    Pandey, Abhishek; Kreimeyer, Kory; Foster, Matthew; Botsis, Taxiarchis; Dang, Oanh; Ly, Thomas; Wang, Wei; Forshee, Richard

    2018-01-01

    Structured Product Labels follow an XML-based document markup standard approved by the Health Level Seven organization and adopted by the US Food and Drug Administration as a mechanism for exchanging medical products information. Their current organization makes their secondary use rather challenging. We used the Side Effect Resource database and DailyMed to generate a comparison dataset of 1159 Structured Product Labels. We processed the Adverse Reaction section of these Structured Product Labels with the Event-based Text-mining of Health Electronic Records system and evaluated its ability to extract and encode Adverse Event terms to Medical Dictionary for Regulatory Activities Preferred Terms. A small sample of 100 labels was then selected for further analysis. Of the 100 labels, Event-based Text-mining of Health Electronic Records achieved a precision and recall of 81 percent and 92 percent, respectively. This study demonstrated Event-based Text-mining of Health Electronic Record's ability to extract and encode Adverse Event terms from Structured Product Labels which may potentially support multiple pharmacoepidemiological tasks.
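
    The precision and recall figures reported above follow the usual set-based definitions, sketched below. The term sets are hypothetical and are not data from the study; the function name `precision_recall` is likewise only illustrative.

```python
def precision_recall(extracted, gold):
    """Compute precision and recall of extracted terms against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted terms that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold terms that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical adverse-event terms for one label.
gold = {"nausea", "headache", "rash", "dizziness"}
extracted = {"nausea", "headache", "rash", "dizziness", "fatigue"}
p, r = precision_recall(extracted, gold)
```

    In this toy case every gold term is found but one extracted term is spurious, so recall is perfect while precision is lower, the same direction of trade-off as the 92 percent recall and 81 percent precision quoted in the abstract.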

  12. A Framework for Text Mining in Scientometric Study: A Case Study in Biomedicine Publications

    NASA Astrophysics Data System (ADS)

    Silalahi, V. M. M.; Hardiyati, R.; Nadhiroh, I. M.; Handayani, T.; Rahmaida, R.; Amelia, M.

    2018-04-01

    The data of Indonesian research publications in the domain of biomedicine has been collected to be text mined for the purpose of a scientometric study. The goal is to build a predictive model that provides a classification of research publications on the potency for downstreaming. The model is based on the drug development processes adapted from the literature. An effort is described to build the conceptual model and the development of a corpus on the research publications in the domain of Indonesian biomedicine. Then an investigation is conducted relating to the problems associated with building a corpus and validating the model. Based on our experience, a framework is proposed to manage the scientometric study based on text mining. Our method shows the effectiveness of conducting a scientometric study based on text mining in order to get a valid classification model. This valid model is mainly supported by the iterative and close interactions with the domain experts starting from identifying the issues, building a conceptual model, to the labelling, validation and results interpretation.

  13. Data Processing and Text Mining Technologies on Electronic Medical Records: A Review

    PubMed Central

    Sun, Wencheng; Li, Yangyang; Liu, Fang; Fang, Shengqun; Wang, Guoyan

    2018-01-01

    Currently, medical institutes generally use EMR to record a patient's condition, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR has the characteristics of diversity, incompleteness, redundancy, and privacy, which make it difficult to carry out data mining and analysis directly. Therefore, it is necessary to preprocess the source data in order to improve data quality and improve the data mining results. Different types of data require different processing technologies. Most structured data commonly needs classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. Semistructured or unstructured data, such as medical text, contains more health information and requires more complex and challenging processing methods. The task of information extraction for medical texts mainly includes NER (named-entity recognition) and RE (relation extraction). This paper focuses on the process of EMR processing and emphatically analyzes the key techniques. In addition, we make an in-depth study on the applications developed based on text mining together with the open challenges and research issues for future work. PMID:29849998

  14. 30 CFR 876.1 - Scope.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ACID MINE DRAINAGE TREATMENT AND ABATEMENT PROGRAM § 876.1 Scope. This part establishes the requirements and procedures for the preparation, submission and approval of State or Indian tribe Acid Mine...

  15. 30 CFR 57.4201 - Inspection.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  16. 30 CFR 57.4201 - Inspection.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  17. 30 CFR 56.4201 - Inspection.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  18. 30 CFR 56.4201 - Inspection.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  19. 30 CFR 876.1 - Scope.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ACID MINE DRAINAGE TREATMENT AND ABATEMENT PROGRAM § 876.1 Scope. This part establishes the requirements and procedures for the preparation, submission and approval of State or Indian tribe Acid Mine...

  20. 30 CFR 876.1 - Scope.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ACID MINE DRAINAGE TREATMENT AND ABATEMENT PROGRAM § 876.1 Scope. This part establishes the requirements and procedures for the preparation, submission and approval of State or Indian tribe Acid Mine...

  1. 30 CFR 876.1 - Scope.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ACID MINE DRAINAGE TREATMENT AND ABATEMENT PROGRAM § 876.1 Scope. This part establishes the requirements and procedures for the preparation, submission and approval of State or Indian tribe Acid Mine...

  2. 30 CFR 57.4201 - Inspection.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  3. 30 CFR 876.1 - Scope.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR ABANDONED MINE LAND RECLAMATION ACID MINE DRAINAGE TREATMENT AND ABATEMENT PROGRAM § 876.1 Scope. This part establishes the requirements and procedures for the preparation, submission and approval of State or Indian tribe Acid Mine...

  4. 30 CFR 57.4201 - Inspection.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  5. 30 CFR 56.4201 - Inspection.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  6. 30 CFR 56.4201 - Inspection.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Fire Prevention and Control...) Water pipes, valves, outlets, hydrants, and hoses that are part of the mine's firefighting system shall...

  7. Corpus annotation for mining biomedical events from literature

    PubMed Central

    Kim, Jin-Dong; Ohta, Tomoko; Tsujii, Jun'ichi

    2008-01-01

    Background: Advanced Text Mining (TM) such as semantic enrichment of papers, event or relation extraction, and intelligent Question Answering have increasingly attracted attention in the bio-medical domain. For such attempts to succeed, text annotation from the biological point of view is indispensable. However, due to the complexity of the task, semantic annotation has never been tried on a large scale, apart from relatively simple term annotation. Results: We have completed a new type of semantic annotation, event annotation, which is an addition to the existing annotations in the GENIA corpus. The corpus has already been annotated with POS (Parts of Speech), syntactic trees, terms, etc. The new annotation was made on half of the GENIA corpus, consisting of 1,000 Medline abstracts. It contains 9,372 sentences in which 36,114 events are identified. The major challenges during event annotation were (1) to design a scheme of annotation which meets specific requirements of text annotation, (2) to achieve biology-oriented annotation which reflects biologists' interpretation of text, and (3) to ensure the homogeneity of annotation quality across annotators. To meet these challenges, we introduced new concepts such as Single-facet Annotation and Semantic Typing, which have collectively contributed to successful completion of a large scale annotation. Conclusion: The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain. PMID:18182099

  8. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task

    PubMed Central

    Arighi, Cecilia N.; Carterette, Ben; Cohen, K. Bretonnel; Krallinger, Martin; Wilbur, W. John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E.; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L.; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P.; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O.; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy

    2013-01-01

    In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. 
    The analysis of this survey highlights how important task completion is to the biocurators’ overall experience of a system, regardless of the system’s high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV. PMID:23327936

  9. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task.

    PubMed

    Arighi, Cecilia N; Carterette, Ben; Cohen, K Bretonnel; Krallinger, Martin; Wilbur, W John; Fey, Petra; Dodson, Robert; Cooper, Laurel; Van Slyke, Ceri E; Dahdul, Wasila; Mabee, Paula; Li, Donghui; Harris, Bethany; Gillespie, Marc; Jimenez, Silvia; Roberts, Phoebe; Matthews, Lisa; Becker, Kevin; Drabkin, Harold; Bello, Susan; Licata, Luana; Chatr-aryamontri, Andrew; Schaeffer, Mary L; Park, Julie; Haendel, Melissa; Van Auken, Kimberly; Li, Yuling; Chan, Juancarlos; Muller, Hans-Michael; Cui, Hong; Balhoff, James P; Chi-Yang Wu, Johnny; Lu, Zhiyong; Wei, Chih-Hsuan; Tudor, Catalina O; Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar; Cejuela, Juan Miguel; Dubey, Pratibha; Wu, Cathy

    2013-01-01

    In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (∼1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems. 
    The analysis of this survey highlights how important task completion is to the biocurators' overall experience of a system, regardless of the system's high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV.

  10. Improvement of operation stability of crucial parts and constructions when repairing dredges and other mining machines exploited in conditions of North

    NASA Astrophysics Data System (ADS)

    Broido, V. L.; Krasnoshtanov, S. U.

    2018-03-01

    The problems of choosing rational technology and materials for restoring crucial parts and large-sized welded constructions of dredges and other mining machines by welding and surfacing methods are considered. Welding and surfacing occupy a significant share in the overall labor intensity of performing repair work at mining enterprises. Both manual arc welding and surfacing as well as mechanized methods are used, which ensure a 24-fold increase in productivity. The work shows examples of using the technology of restoring parts and structures at gold mining enterprises in Irkutsk region. Some grades of welding and surfacing materials are shown, the production of which has been mastered by Irkutsk Heavy Engineering Plant (IZTM).

  11. Research on Classification of Chinese Text Data Based on SVM

    NASA Astrophysics Data System (ADS)

    Lin, Yuan; Yu, Hongzhi; Wan, Fucheng; Xu, Tao

    2017-09-01

    Data mining has important application value in today’s industry and academia. Text classification is a very important technology in data mining. At present, there are many mature algorithms for text classification. KNN, NB, AB, SVM, decision tree and other classification methods all show good classification performance. The Support Vector Machine (SVM) classification method is a good classifier in machine learning research. This paper studies the classification performance of the SVM method on Chinese text data, uses the support vector machine to classify Chinese text, and aims to combine academic research with practical application.

  12. StemTextSearch: Stem cell gene database with evidence from abstracts.

    PubMed

    Chen, Chou-Cheng; Ho, Chung-Liang

    2017-05-01

    Previous studies have used many methods to find biomarkers in stem cells, including text mining, experimental data and image storage. However, no text-mining methods have yet been developed which can identify whether a gene plays a positive or negative role in stem cells. StemTextSearch identifies the role of a gene in stem cells by using a text-mining method to find combinations of gene regulation, stem-cell regulation and cell processes in the same sentences of biomedical abstracts. The dataset includes 5797 genes, with 1534 genes having positive roles in stem cells, 1335 genes having negative roles, 1654 genes with both positive and negative roles, and 1274 with an uncertain role. The precision of gene role in StemTextSearch is 0.66, and the recall is 0.78. StemTextSearch is a web-based engine with queries that specify (i) gene, (ii) category of stem cell, (iii) gene role, (iv) gene regulation, (v) cell process, (vi) stem-cell regulation, and (vii) species. StemTextSearch is available through http://bio.yungyun.com.tw/StemTextSearch.aspx. Copyright © 2017. Published by Elsevier Inc.
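
    As a worked illustration of the figures quoted above (precision 0.66, recall 0.78), the two can be combined into a single F1 score, their harmonic mean. The abstract does not report an F1 value; the helper name `f1_score` is a hypothetical choice for this sketch.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the StemTextSearch abstract.
precision, recall = 0.66, 0.78
print(round(f1_score(precision, recall), 3))
```

    The harmonic mean penalizes imbalance, so the combined score sits closer to the lower of the two values than a plain average would.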

  13. The structural and content aspects of abstracts versus bodies of full text journal articles are different

    PubMed Central

    2010-01-01

    Background An increase in work on the full text of journal articles and the growth of PubMedCentral have the opportunity to create a major paradigm shift in how biomedical text mining is done. However, until now there has been no comprehensive characterization of how the bodies of full text journal articles differ from the abstracts that until now have been the subject of most biomedical text mining research. Results We examined the structural and linguistic aspects of abstracts and bodies of full text articles, the performance of text mining tools on both, and the distribution of a variety of semantic classes of named entities between them. We found marked structural differences, with longer sentences in the article bodies and much heavier use of parenthesized material in the bodies than in the abstracts. We found content differences with respect to linguistic features. Three out of four of the linguistic features that we examined were statistically significantly differently distributed between the two genres. We also found content differences with respect to the distribution of semantic features. There were significantly different densities per thousand words for three out of four semantic classes, and clear differences in the extent to which they appeared in the two genres. With respect to the performance of text mining tools, we found that a mutation finder performed equally well in both genres, but that a wide variety of gene mention systems performed much worse on article bodies than they did on abstracts. POS tagging was also more accurate in abstracts than in article bodies. Conclusions Aspects of structure and content differ markedly between article abstracts and article bodies. A number of these differences may pose problems as the text mining field moves more into the area of processing full-text articles. 
    However, these differences also present a number of opportunities for the extraction of data types, particularly that found in parenthesized text, that is present in article bodies but not in article abstracts. PMID:20920264

  14. Application of the Gmc-1000 and Gmc-2000 Mine Cooling Units for Central Air-Conditioning in Underground Mines / Zastosowanie górniczego urządzenia chłodniczego gmc-1000 i gmc-2000 w centralnej klimatyzacji kopalń podziemnych

    NASA Astrophysics Data System (ADS)

    Wojciechowski, Jerzy

    2013-03-01

    The paper describes the design and results of operating measurements of the GMC-1000 and GMC-2000 Mine Cooling Units. The first part describes the design of the cooling unit and its key components: the chiller, evaporator, condenser, oil cooler, evaporative water cooler and gallery air cooler. The possibilities of use in central air conditioning systems of underground mines are described. The second part discusses the results of the workstation and operating measurements and determines the coefficients for evaluating the performance of the mine cooling unit.

  15. 30 CFR 57.4261 - Shaft-station waterlines.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....4261 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-UNDERGROUND METAL AND NONMETAL MINES Fire Prevention... located at underground shaft stations and are part of the mine's fire protection system shall have at...

  16. 30 CFR 20.12 - How approvals are granted.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.12 How approvals are... part only when the testing engineers judge that the lamp has met the requirements of this part and...

  17. 30 CFR 20.12 - How approvals are granted.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.12 How approvals are... part only when the testing engineers judge that the lamp has met the requirements of this part and...

  18. 30 CFR 20.12 - How approvals are granted.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.12 How approvals are... part only when the testing engineers judge that the lamp has met the requirements of this part and...

  19. 30 CFR 20.12 - How approvals are granted.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.12 How approvals are... part only when the testing engineers judge that the lamp has met the requirements of this part and...

  20. 30 CFR 20.12 - How approvals are granted.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.12 How approvals are... part only when the testing engineers judge that the lamp has met the requirements of this part and...

  1. WEMINUCHE WILDERNESS, COLORADO.

    USGS Publications Warehouse

    Steven, Thomas A.; Williams, F.E.

    1984-01-01

    A mineral survey of the Weminuche Wilderness, Colorado was conducted. Although little mineral production has been recorded in the area, it borders several highly productive mining districts and mineral deposits probably exist within parts of the wilderness. Within and near the wilderness, evidence of substantiated mineral-resource potential was found in the following four areas: (1) the Needle Mountains mining district, in the southwestern part of the wilderness, (2) Whitehead Gulch, in the northwestern part of the wilderness, (3) the Beartown mining district, along the north margin of the wilderness, and (4) the Trout Creek-Middle Fork Piedra River area, in and adjacent to the northeastern part of the wilderness. Of the four areas, the Needle Mountains mining district has the most promise for significant mineral resources, particularly of molybdenum and uranium. A probable oil and gas resource potential exists in the eastern half of the area in traps in sedimentary rocks under volcanic cover.

  2. Implementation of a Flexible Tool for Automated Literature-Mining and Knowledgebase Development (DevToxMine)

    EPA Science Inventory

    Deriving novel relationships from the scientific literature is an important adjunct to datamining activities for complex datasets in genomics and high-throughput screening activities. Automated text-mining algorithms can be used to extract relevant content from the literature and...

  3. 20 CFR 726.3 - Relationship of this part to other parts in this subchapter.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ..., DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS... this subchapter. (a) This part 726 implements and effectuates responsibilities for the payment of black... govern the responsibilities and obligations of coal mine operators to secure the payment of black lung...

  4. 20 CFR 726.3 - Relationship of this part to other parts in this subchapter.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ..., DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS... this subchapter. (a) This part 726 implements and effectuates responsibilities for the payment of black... govern the responsibilities and obligations of coal mine operators to secure the payment of black lung...

  5. 20 CFR 726.3 - Relationship of this part to other parts in this subchapter.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ..., DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS... this subchapter. (a) This part 726 implements and effectuates responsibilities for the payment of black... govern the responsibilities and obligations of coal mine operators to secure the payment of black lung...

  6. 20 CFR 726.3 - Relationship of this part to other parts in this subchapter.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ..., DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS... this subchapter. (a) This part 726 implements and effectuates responsibilities for the payment of black... govern the responsibilities and obligations of coal mine operators to secure the payment of black lung...

  7. A Feature Mining Based Approach for the Classification of Text Documents into Disjoint Classes.

    ERIC Educational Resources Information Center

    Nieto Sanchez, Salvador; Triantaphyllou, Evangelos; Kraft, Donald

    2002-01-01

    Proposes a new approach for classifying text documents into two disjoint classes. Highlights include a brief overview of document clustering; a data mining approach called the One Clause at a Time (OCAT) algorithm which is based on mathematical logic; vector space model (VSM); and comparing the OCAT to the VSM. (Author/LRW)
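
    The vector space model (VSM) mentioned above represents each document as a vector of term weights and compares documents by the angle between their vectors. The following is a minimal sketch of that idea only, not of the OCAT algorithm; the raw term-frequency weighting, the whitespace tokenizer, and the sample strings are assumptions made for illustration and are not taken from the abstract.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Bag-of-words term-frequency vector (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
doc1 = term_vector("text mining of mining regulations")
doc2 = term_vector("text mining of biomedical abstracts")
doc3 = term_vector("underground coal mine safety")

# doc1 shares terms with doc2 but none with doc3.
print(cosine_similarity(doc1, doc2) > cosine_similarity(doc1, doc3))
```

    Because "mine" and "mining" are distinct tokens here, doc1 and doc3 share no terms; a real system would usually add stemming and a weighting scheme such as TF-IDF.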

  8. Examining Mobile Learning Trends 2003-2008: A Categorical Meta-Trend Analysis Using Text Mining Techniques

    ERIC Educational Resources Information Center

    Hung, Jui-Long; Zhang, Ke

    2012-01-01

    This study investigated the longitudinal trends of academic articles in Mobile Learning (ML) using text mining techniques. One hundred and nineteen (119) refereed journal articles and proceedings papers from the SCI/SSCI database were retrieved and analyzed. The taxonomies of ML publications were grouped into twelve clusters (topics) and four…

  9. Trends of E-Learning Research from 2000 to 2008: Use of Text Mining and Bibliometrics

    ERIC Educational Resources Information Center

    Hung, Jui-long

    2012-01-01

    This study investigated the longitudinal trends of e-learning research using text mining techniques. Six hundred and eighty-nine (689) refereed journal articles and proceedings were retrieved from the Science Citation Index/Social Science Citation Index database in the period from 2000 to 2008. All e-learning publications were grouped into two…

  10. Hydrology of area 2, Eastern Coal Province, Pennsylvania and New York

    USGS Publications Warehouse

    Herb, W.J.; Brown, D.E.; Shaw, L.C.; Stoner, J.E.; Felbinger, J.K.

    1983-01-01

    Provisions of the Surface Mining Control and Reclamation Act of 1977 recognized a nationwide need for hydrologic information in mined and potentially mined areas. This report is designed to be useful to mine owners, operators, regulatory authorities, citizens groups, and others by presenting information on existing hydrologic conditions and by identifying additional sources of hydrologic information. General hydrologic information is presented in a brief text accompanied by a map, chart, graph, or other illustration for each of a series of water-resources-related topics. The summation of the topical discussions provides a description of the hydrology of the area. The Eastern Coal Province has been divided into 24 hydrologic study areas which are shown on the cover of this report. The divisions are based on hydrologic factors, location, and size. Hydrologic units (surface drainage basins) or parts of units are combined to form each study area. Study Area 2 covers northwestern Pennsylvania and a small part of southwestern New York. Most exposed bedrock is of Pennsylvanian, Mississippian, or Devonian ages. Glacial drift covers most of the bedrock in the northwestern part of the area. During 1979, more than 7 million tons of bituminous coal was produced from about 230 mines in Area 2 counties. Over 99 percent of the area's coal production is from surface mining. Streamflow data are available for 18 continuous-record stations; 1 crest-stage, partial-record station; 1 low-flow, partial-record station; and 65 miscellaneous sites. Water-quality data are available for 78 locations. Streams having the highest median specific conductance, highest median dissolved-solids concentrations, lowest median pH, highest median total-iron concentration, highest median total-manganese concentration, and highest dissolved-sulfate concentrations were found in Clarion County, the leading coal-producing county in the area. 
    Statistics on low flow, mean flow, peak flow, and flow duration for gaging stations can be computed from recorded mean daily flows. Similar statistics can be estimated for ungaged streams by regression and graphical techniques. Five ground-water observation wells are being operated in Area 2. Ground-water levels fluctuate seasonally. Depth to water increases with well depth in upland areas and decreases with well depth in valleys. Well yields in the area range from less than 1 to more than 2,000 gallons per minute. Wells in unconsolidated materials usually have higher yields. Ground-water quality is adequate for most domestic purposes, except locally. Additional water-data information is available through: (1) The National Water Data Exchange, (2) The National Water Data Storage and Retrieva

  11. 76 FR 12852 - Louisiana Regulatory Program/Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 918... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are...

  12. 75 FR 60373 - Louisiana Regulatory Program/Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 918... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule... of Surface Mining Reclamation and Enforcement (OSM), are announcing receipt of a proposed amendment...

  13. 30 CFR 75.1 - Scope.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES General § 75.1 Scope. This part 75 sets forth safety standards compliance with which is mandatory in each underground coal mine subject to the Federal Mine Safety and Health Act...

  14. 30 CFR 75.1 - Scope.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES General § 75.1 Scope. This part 75 sets forth safety standards compliance with which is mandatory in each underground coal mine subject to the Federal Mine Safety and Health Act...

  15. 30 CFR 75.1 - Scope.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES General § 75.1 Scope. This part 75 sets forth safety standards compliance with which is mandatory in each underground coal mine subject to the Federal Mine Safety and Health Act...

  16. 30 CFR 75.1 - Scope.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES General § 75.1 Scope. This part 75 sets forth safety standards compliance with which is mandatory in each underground coal mine subject to the Federal Mine Safety and Health Act...

  17. 30 CFR 75.1 - Scope.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES General § 75.1 Scope. This part 75 sets forth safety standards compliance with which is mandatory in each underground coal mine subject to the Federal Mine Safety and Health Act...

  18. Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining

    PubMed Central

    Kreula, Sanna M.; Kaewphan, Suwisa; Ginter, Filip

    2018-01-01

    The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from ‘reading the literature’. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already ‘known’, and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource. PMID:29844966
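
    The two filter rules described above, (1) ignore relationships already established in the literature and (2) require multiple independent evidences, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: the function name `novel_candidates`, the gene identifiers, the source labels, and the threshold of two sources are all hypothetical.

```python
from collections import defaultdict


def novel_candidates(evidence, known_pairs, min_sources=2):
    """Return gene pairs absent from known_pairs that are supported by
    at least min_sources distinct evidence sources."""
    sources = defaultdict(set)
    for gene_a, gene_b, source in evidence:
        pair = frozenset((gene_a, gene_b))  # treat associations as undirected
        sources[pair].add(source)
    known = {frozenset(p) for p in known_pairs}
    return sorted(
        tuple(sorted(pair))
        for pair, srcs in sources.items()
        if pair not in known and len(srcs) >= min_sources
    )


# Hypothetical evidence triples: (gene, gene, evidence source).
evidence = [
    ("geneA", "geneB", "text-mining"),
    ("geneB", "geneA", "co-expression"),
    ("geneA", "geneC", "text-mining"),
    ("geneC", "geneD", "text-mining"),
    ("geneC", "geneD", "co-expression"),
]
known_pairs = [("geneC", "geneD")]  # already established, so filtered out

print(novel_candidates(evidence, known_pairs))
```

    Here only the geneA/geneB pair survives: geneA/geneC has a single source, and geneC/geneD is already known.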

  19. Discriminative and informative features for biomolecular text mining with ensemble feature selection.

    PubMed

    Van Landeghem, Sofie; Abeel, Thomas; Saeys, Yvan; Van de Peer, Yves

    2010-09-15

    In the field of biomolecular text mining, black box behavior of machine learning systems currently limits understanding of the true nature of the predictions. However, feature selection (FS) is capable of identifying the most relevant features in any supervised learning setting, providing insight into the specific properties of the classification algorithm. This allows us to build more accurate classifiers while at the same time bridging the gap between the black box behavior and the end-user who has to interpret the results. We show that our FS methodology successfully discards a large fraction of machine-generated features, improving classification performance of state-of-the-art text mining algorithms. Furthermore, we illustrate how FS can be applied to gain understanding in the predictions of a framework for biomolecular event extraction from text. We include numerous examples of highly discriminative features that model either biological reality or common linguistic constructs. Finally, we discuss a number of insights from our FS analyses that will provide the opportunity to considerably improve upon current text mining tools. The FS algorithms and classifiers are available in Java-ML (http://java-ml.sf.net). The datasets are publicly available from the BioNLP'09 Shared Task web site (http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA/SharedTask/).

  20. Generative Topic Modeling in Image Data Mining and Bioinformatics Studies

    ERIC Educational Resources Information Center

    Chen, Xin

    2012-01-01

    Probabilistic topic models have been developed for applications in various domains such as text mining, information retrieval, computer vision and bioinformatics. In this thesis, we focus on developing novel probabilistic topic models for image mining and bioinformatics studies. Specifically, a probabilistic topic-connection (PTC) model…

  1. 78 FR 40496 - Notice of availability of the Final Environmental Impact Statement for the Proposed Hollister...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-05

    ... silver mining operation. Most of the infrastructure to support a mining operation was authorized and.... The Proposed Action consists of underground mining, constructing a new production shaft, improving.... Public comments resulted in the addition of clarifying text, but did not significantly change the...

  2. DEMONSTRATION OF AQUAFIX AND SAPS PASSIVE MINE WATER TREATMENT TECHNOLOGIES AT SUMMITVILLE MINE SITE, INNOVATIVE TECHNOLOGY EVALUATION REPORT

    EPA Science Inventory

    As part of the Superfund Innovative Technology Evaluation (SITE) Program, the U.S. Environmental Protection Agency evaluated two passive water treatment (PWT) technologies for metals removal from acid mine drainage (AMD) at the Summitville Mine Superfund Site in southern Colorado...

  3. 76 FR 76104 - Arkansas Regulatory Program and Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 904... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation and...

  4. 77 FR 55430 - Arkansas Regulatory Program and Abandoned Mine Land Reclamation Plan

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-10

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 904... Reclamation Plan AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation and...

  5. 30 CFR Appendix I to Subpart C of... - National Consensus Standards

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Subpart C of Part 56 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR METAL AND NONMETAL MINE SAFETY AND HEALTH SAFETY AND HEALTH STANDARDS-SURFACE METAL AND NONMETAL MINES Fire... Standards Mine operators seeking further information in the area of fire prevention and control may consult...

  6. 30 CFR 90.3 - Part 90 option; notice of eligibility; exercise of option.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... as measured by the Mining Research Establishment (MRE) instrument. When the approved sampling device... concentrations. Mechanized mining unit (MMU). A unit of mining equipment including hand loading equipment used for the production of material; or a specialized unit which uses mining equipment other than specified...

  7. 30 CFR 49.1 - Purpose and scope.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Purpose and scope. 49.1 Section 49.1 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and Nonmetal Mines § 49.1 Purpose and scope. This part...

  8. 30 CFR 49.1 - Purpose and scope.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Purpose and scope. 49.1 Section 49.1 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and Nonmetal Mines § 49.1 Purpose and scope. This part...

  9. 30 CFR 49.1 - Purpose and scope.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Purpose and scope. 49.1 Section 49.1 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING MINE RESCUE TEAMS Mine Rescue Teams for Underground Metal and Nonmetal Mines § 49.1 Purpose and scope. This part...

  10. New perspectives on a 140-year legacy of mining and abandoned mine cleanup in the San Juan Mountains, Colorado

    USGS Publications Warehouse

    Yager, Douglas B.; Fey, David L.; Chapin, Thomas; Johnson, Raymond H.

    2016-01-01

    The Gold King mine water release that occurred on 5 August 2015 near the historical mining community of Silverton, Colorado, highlights the environmental legacy that abandoned mines have on the environment. During reclamation efforts, a breach of collapsed workings at the Gold King mine sent 3 million gallons of acidic and metal-rich mine water into the upper Animas River, a tributary to the Colorado River basin. The Gold King mine is located in the scenic, western San Juan Mountains, a region renowned for its volcano-tectonic and gold-silver-base metal mineralization history. Prior to mining, acidic drainage from hydrothermally altered areas was a major source of metals and acidity to streams, and it continues to be so. In addition to abandoned hard rock metal mines, uranium mine waste poses a long-term storage and immobilization challenge in this area. Uranium resources are mined in the Colorado Plateau, which borders the San Juan Mountains on the west. Uranium processing and repository sites along the Animas River near Durango, Colorado, are a prime example of how the legacy of mining must be managed for the health and well-being of future generations. The San Juan Mountains are part of a geoenvironmental nexus where geology, mining, agriculture, recreation, and community issues converge. This trip will explore the geology, mining, and mine cleanup history in which a community-driven, watershed-based stakeholder process is an integral part. Research tools and historical data useful for understanding complex watersheds impacted by natural sources of metals and acidity overprinted by mining will also be discussed.

  11. MetaRanker 2.0: a web server for prioritization of genetic variation data

    PubMed Central

    Pers, Tune H.; Dworzyński, Piotr; Thomas, Cecilia Engel; Lage, Kasper; Brunak, Søren

    2013-01-01

    MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein–protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0. PMID:23703204

  12. MetaRanker 2.0: a web server for prioritization of genetic variation data.

    PubMed

    Pers, Tune H; Dworzyński, Piotr; Thomas, Cecilia Engel; Lage, Kasper; Brunak, Søren

    2013-07-01

    MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein-protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0.

  13. Text mining to decipher free-response consumer complaints: insights from the NHTSA vehicle owner's complaint database.

    PubMed

    Ghazizadeh, Mahtab; McDonald, Anthony D; Lee, John D

    2014-09-01

    This study applies text mining to extract clusters of vehicle problems and associated trends from free-response data in the National Highway Traffic Safety Administration's vehicle owner's complaint database. As the automotive industry adopts new technologies, it is important to systematically assess the effect of these changes on traffic safety. Driving simulators, naturalistic driving data, and crash databases all contribute to a better understanding of how drivers respond to changing vehicle technology, but other approaches, such as automated analysis of incident reports, are needed. Free-response data from incidents representing two severity levels (fatal incidents and incidents involving injury) were analyzed using a text mining approach: latent semantic analysis (LSA). LSA and hierarchical clustering identified clusters of complaints for each severity level, which were compared and analyzed across time. Cluster analysis identified eight clusters of fatal incidents and six clusters of incidents involving injury. Comparisons showed that although the airbag clusters across the two severity levels have the same most frequent terms, the circumstances around the incidents differ. The time trends show clear increases in complaints surrounding the Ford/Firestone tire recall and the Toyota unintended acceleration recall. Increases in complaints may be partially driven by these recall announcements and the associated media attention. Text mining can reveal useful information from free-response databases that would otherwise be prohibitively time-consuming and difficult to summarize manually. Text mining can extend human analysis capabilities for large free-response databases to support earlier detection of problems and more timely safety interventions.

  14. Uranium Mines and Mills Location Database

    EPA Pesticide Factsheets

    EPA has compiled mine location information from federal, state, and Tribal agencies into a single database as part of its investigation into the potential environmental hazards of wastes from abandoned uranium mines in the western United States.

  15. Analysis of Nature of Science Included in Recent Popular Writing Using Text Mining Techniques

    ERIC Educational Resources Information Center

    Jiang, Feng; McComas, William F.

    2014-01-01

    This study examined the inclusion of nature of science (NOS) in popular science writing to determine whether it could serve as a supplementary resource for teaching NOS and to evaluate the accuracy of text mining and classification as a viable research tool in science education research. Four groups of documents published from 2001 to 2010 were…

  16. The Determination of Children's Knowledge of Global Lunar Patterns from Online Essays Using Text Mining Analysis

    ERIC Educational Resources Information Center

    Cheon, Jongpil; Lee, Sangno; Smith, Walter; Song, Jaeki; Kim, Yongjin

    2013-01-01

    The purpose of this study was to use text mining analysis of early adolescents' online essays to determine their knowledge of global lunar patterns. Australian and American students in grades five to seven wrote about global lunar patterns they had discovered by sharing observations with each other via the Internet. These essays were analyzed for…

  17. Impact of Text-Mining and Imitating Strategies on Lexical Richness, Lexical Diversity and General Success in Second Language Writing

    ERIC Educational Resources Information Center

    Çepni, Sevcan Bayraktar; Demirel, Elif Tokdemir

    2016-01-01

    This study aimed to find out the impact of "text mining and imitating" strategies on lexical richness, lexical diversity and general success of students in their compositions in second language writing. The participants were 98 students studying their first year in Karadeniz Technical University in English Language and Literature…

  18. Science and Technology Text Mining: Text Mining of the Journal Cortex

    DTIC Science & Technology

    2004-01-01

    Amnesia Retrograde Amnesia GENERAL Semantic Memory Episodic Memory Working Memory TEST Serial Position Curve...in Cortex can be reasonably divided into four categories (papers in each category in parenthesis): Semantic Memory (151); Handedness (145); Amnesia ... Semantic Memory (151) is divided into Verbal/ Numerical (76) and Visual/ Spatial (75). Amnesia (119) is divided into Amnesia Symptoms (50) and

  19. Experiences with Text Mining Large Collections of Unstructured Systems Development Artifacts at JPL

    NASA Technical Reports Server (NTRS)

    Port, Dan; Nikora, Allen; Hihn, Jairus; Huang, LiGuo

    2011-01-01

    Often repositories of systems engineering artifacts at NASA's Jet Propulsion Laboratory (JPL) are so large and poorly structured that they have outgrown our capability to effectively manually process their contents to extract useful information. Sophisticated text mining methods and tools seem a quick, low-effort approach to automating our limited manual efforts. Our experiences of exploring such methods mainly in three areas including historical risk analysis, defect identification based on requirements analysis, and over-time analysis of system anomalies at JPL, have shown that obtaining useful results requires substantial unanticipated efforts - from preprocessing the data to transforming the output for practical applications. We have not observed any quick 'wins' or realized benefit from short-term effort avoidance through automation in this area. Surprisingly we have realized a number of unexpected long-term benefits from the process of applying text mining to our repositories. This paper elaborates some of these benefits and our important lessons learned from the process of preparing and applying text mining to large unstructured system artifacts at JPL aiming to benefit future TM applications in similar problem domains and also in hope for being extended to broader areas of applications.

  20. The Study of Geotechnical Properties of Sediment in C-C Zone in the Northeastern Pacific for Deep-sea Mining

    NASA Astrophysics Data System (ADS)

    Chi, S.; Kim, K.; Lee, H.; Ju, S.; Yoo, C.

    2007-12-01

    Recently, the market prices of valuable metals have increased rapidly due to high demand and limited resources. Therefore, manganese (Mn) nodules (polymetallic nodules) in the Clarion-Clipperton fracture zone have stimulated economic interest. Nickel, copper, cobalt and manganese are the economically most interesting metals in Mn-nodules. In order to mine Mn-nodules from the sea floor, understanding the geotechnical properties of the surface sediment is very important for two major reasons. First, geotechnical data are required to design and build stable and environmentally acceptable mining vehicles. Second, deep-sea mining activity could significantly affect the surface layer of the deep sea floor. For example, surface sediments will be redistributed through resuspension and redeposition. Reliable sedimentological and soil mechanical baseline data of the undisturbed benthic environment are essential to assess and evaluate these environmental impacts of mining activity using physical and numerical modeling. The 225 deployments of the multiple corer guaranteed undisturbed sediment samples in which geotechnical parameters were measured, including sediment grain size, density, water content and shear strength. The sea floor sediments in this study area are generally characterized into three different types, as follows. The seabed of the middle part (8-12° N) of the study area is mainly covered with biogenic siliceous sediment, compared with pelagic red clays in the northern part (16-17° N). However, the southern part (5-6° N) is dominated by calcareous sediments because its water depth is shallower than the carbonate compensation depth (CCD). This result suggests that the middle area, covered with siliceous sediment, is more feasible for commercial mining than the northern area, covered with pelagic red clay, considering nodule miner maneuverability and environmental impact. In particular, the middle part, which has the highest nodule abundance and valuable metal contents, is mainly (more than 90% of the area) covered with consolidated sediments, which are expected to be appropriate for effective miner movement. Furthermore, the middle part, with its coarse siliceous sediments, could be less environmentally disturbed by mining activity. This makes the middle part a more plausible site for commercial mining than other sites in this study area.

  1. Biomedical hypothesis generation by text mining and gene prioritization.

    PubMed

    Petric, Ingrid; Ligeti, Balazs; Gyorffy, Balazs; Pongor, Sandor

    2014-01-01

    Text mining methods can facilitate the generation of biomedical hypotheses by suggesting novel associations between diseases and genes. Previously, we developed a rare-term model called RaJoLink (Petric et al, J. Biomed. Inform. 42(2): 219-227, 2009) in which hypotheses are formulated on the basis of terms rarely associated with a target domain. Since many current medical hypotheses are formulated in terms of molecular entities and molecular mechanisms, here we extend the methodology to proteins and genes, using a standardized vocabulary as well as a gene/protein network model. The proposed enhanced RaJoLink rare-term model combines text mining and gene prioritization approaches. Its utility is illustrated by finding known as well as potential gene-disease associations in ovarian cancer using MEDLINE abstracts and the STRING database.

  2. The Functional Genomics Network in the evolution of biological text mining over the past decade.

    PubMed

    Blaschke, Christian; Valencia, Alfonso

    2013-03-25

    Different programs of the European Science Foundation (ESF) have contributed significantly to connecting researchers in Europe and beyond through several initiatives. This support was particularly relevant for the development of the areas related to extracting information from papers (text mining) because it supported the field in its early phases, long before it was recognized by the community. We review the historical development of text mining research and how it was introduced in bioinformatics. Specific applications in (functional) genomics are described, such as its integration in genome annotation pipelines and its support for the analysis of high-throughput genomics experimental data, and we highlight the activities of evaluation of methods and benchmarking for which the ESF programme support was instrumental. Copyright © 2013 Elsevier B.V. All rights reserved.

  3. Agile Text Mining for the 2014 i2b2/UTHealth Cardiac Risk Factors Challenge

    PubMed Central

    Cormack, James; Nath, Chinmoy; Milward, David; Raja, Kalpana; Jonnalagadda, Siddhartha R

    2016-01-01

    This paper describes the use of an agile text mining platform (Linguamatics’ Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 Challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system. PMID:26209007

  4. Supporting the annotation of chronic obstructive pulmonary disease (COPD) phenotypes with text mining workflows.

    PubMed

    Fu, Xiao; Batista-Navarro, Riza; Rak, Rafal; Ananiadou, Sophia

    2015-01-01

    Chronic obstructive pulmonary disease (COPD) is a life-threatening lung disorder whose recent prevalence has led to an increasing burden on public healthcare. Phenotypic information in electronic clinical records is essential in providing suitable personalised treatment to patients with COPD. However, as phenotypes are often "hidden" within free text in clinical records, clinicians could benefit from text mining systems that facilitate their prompt recognition. This paper reports on a semi-automatic methodology for producing a corpus that can ultimately support the development of text mining tools that, in turn, will expedite the process of identifying groups of COPD patients. A corpus of 30 full-text papers was formed based on selection criteria informed by the expertise of COPD specialists. We developed an annotation scheme that is aimed at producing fine-grained, expressive and computable COPD annotations without burdening our curators with a highly complicated task. This was implemented in the Argo platform by means of a semi-automatic annotation workflow that integrates several text mining tools, including a graphical user interface for marking up documents. When evaluated using gold standard (i.e., manually validated) annotations, the semi-automatic workflow was shown to obtain a micro-averaged F-score of 45.70% (with relaxed matching). Utilising the gold standard data to train new concept recognisers, we demonstrated that our corpus, although still a work in progress, can foster the development of significantly better performing COPD phenotype extractors. We describe in this work the means by which we aim to eventually support the process of COPD phenotype curation, i.e., by the application of various text mining tools integrated into an annotation workflow. Although the corpus being described is still under development, our results thus far are encouraging and show great potential in stimulating the development of further automatic COPD phenotype extractors.
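
    The micro-averaged F-score reported in the record above can be illustrated with a minimal sketch. It assumes the usual definition, in which true-positive, false-positive and false-negative counts are pooled over all annotation types before a single balanced F-score is computed. The type names and counts below are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def micro_f_score(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Pool (tp, fp, fn) over all annotation types, then compute one F-score."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f_score(tp, fp, fn)


# Hypothetical per-type counts as (tp, fp, fn); illustrative only.
example_counts = {
    "symptom": (40, 10, 30),
    "treatment": (10, 10, 10),
}
micro = micro_f_score(example_counts)
```

    Because the counts are pooled, frequent annotation types dominate a micro-average; a macro-average (the mean of per-type F-scores) would weight every type equally.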

  5. 30 CFR 905.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... surface coal mining operations. 905.764 Section 905.764 Mineral Resources OFFICE OF SURFACE MINING... WITHIN EACH STATE CALIFORNIA § 905.764 Process for designating areas unsuitable for surface coal mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface Coal...

  6. 30 CFR 910.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... surface coal mining operations. 910.764 Section 910.764 Mineral Resources OFFICE OF SURFACE MINING... WITHIN EACH STATE GEORGIA § 910.764 Process for designating areas unsuitable for surface coal mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface Coal...

  7. 30 CFR 912.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... surface coal mining operations. 912.764 Section 912.764 Mineral Resources OFFICE OF SURFACE MINING... WITHIN EACH STATE IDAHO § 912.764 Process for designating areas unsuitable for surface coal mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface Coal...

  8. 30 CFR 903.764 - Process for designating areas unsuitable for surface coal mining operations.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... surface coal mining operations. 903.764 Section 903.764 Mineral Resources OFFICE OF SURFACE MINING... WITHIN EACH STATE ARIZONA § 903.764 Process for designating areas unsuitable for surface coal mining operations. Part 764 of this chapter, State Processes for Designating Areas Unsuitable for Surface Coal...

  9. 78 FR 53775 - Renewal of Approved Information Collection

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-30

    ... regulations. These regulations pertain to the location, recording, and maintenance of mining claims and sites... Location Notices and Mining Claims; Payment of Fees (43 CFR parts 3832-3838). OMB Control Number: 1004-0114..., recording, and maintenance of mining claims and sites, in accordance with the General Mining Law (30 U.S.C...

  10. 30 CFR 784.1 - Scope.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... RECLAMATION OPERATIONS PERMITS AND COAL EXPLORATION SYSTEMS UNDER REGULATORY PROGRAMS UNDERGROUND MINING PERMIT APPLICATIONS-MINIMUM REQUIREMENTS FOR RECLAMATION AND OPERATION PLAN § 784.1 Scope. This part... mining operations and reclamation plans portions of applications for permits for underground mining...

  11. Redundancy and Novelty Mining in the Business Blogosphere

    ERIC Educational Resources Information Center

    Tsai, Flora S.; Chan, Kap Luk

    2010-01-01

    Purpose: The paper aims to explore the performance of redundancy and novelty mining in the business blogosphere, which has not been studied before. Design/methodology/approach: Novelty mining techniques are implemented to single out novel information out of a massive set of text documents. This paper adopted the mixed metric approach which…

  12. Heavy metal pollution of coal mine-affected agricultural soils in the northern part of Bangladesh.

    PubMed

    Bhuiyan, Mohammad A H; Parvez, Lutfar; Islam, M A; Dampare, Samuel B; Suzuki, Shigeyuki

    2010-01-15

    Total concentrations of heavy metals in the soils of mine drainage and surrounding agricultural fields in the northern part of Bangladesh were determined to evaluate the level of contamination. The average concentrations of Ti, Mn, Zn, Pb, As, Fe, Rb, Sr, Nb and Zr exceeded the world normal averages and, in some cases, Mn, Zn, As and Pb exceeded the toxic limit of the respective metals. Soil pollution assessment was carried out using enrichment factor (EF), geoaccumulation index (I(geo)) and pollution load index (PLI). The soils show significant enrichment with Ti, Mn, Zn, Pb, As, Fe, Sr and Nb, indicating inputs from mining activities. The I(geo) values have revealed that Mn (1.24+/-0.38), Zn (1.49+/-0.58) and Pb (1.63+/-0.38) are significantly accumulated in the study area. The PLIs derived from contamination factors indicate that the distal part of the coal mine-affected area is the most polluted (PLI of 4.02). Multivariate statistical analyses, principal component and cluster analyses, suggest that Mn, Zn, Pb and Ti are derived from anthropogenic sources, particularly coal mining activities, and the extreme proximal and distal parts are heavily contaminated with maximum heavy metals.
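
    The indices named in the record above can be sketched as follows. The sketch assumes the commonly used definitions: the geoaccumulation index I(geo) = log2(C / (1.5 × B)), where C is the measured concentration and B the geochemical background, and the pollution load index as the geometric mean of the contamination factors C / B. Whether the authors used exactly these forms is an assumption, and the numbers in the example are hypothetical.

```python
import math
from typing import Sequence


def geoaccumulation_index(concentration: float, background: float) -> float:
    """I(geo) = log2(C / (1.5 * B)); the factor 1.5 allows for natural background variation."""
    return math.log2(concentration / (1.5 * background))


def contamination_factor(concentration: float, background: float) -> float:
    """Ratio of the measured concentration to the background concentration."""
    return concentration / background


def pollution_load_index(contamination_factors: Sequence[float]) -> float:
    """Geometric mean of the contamination factors of all metals considered."""
    n = len(contamination_factors)
    return math.prod(contamination_factors) ** (1.0 / n)


# Hypothetical values, in the same units for sample and background.
igeo = geoaccumulation_index(concentration=3.0, background=1.0)
pli = pollution_load_index([contamination_factor(2.0, 1.0),
                            contamination_factor(8.0, 1.0)])
```

    A PLI above 1 is conventionally read as deterioration relative to background, which is consistent with the value of 4.02 being described as the most polluted zone.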

  13. Tectonic analysis of mine tremor mechanisms from the Upper Silesian Coal Basin

    NASA Astrophysics Data System (ADS)

    Sagan, Grzegorz; Teper, Lesław; Zuberek, Waclaw M.

    1996-07-01

    The fault network of the Upper Silesian Coal Basin (USCB) is built of sets of strike-slip, oblique-slip and dip-slip faults. It is a typical product of a force couple which acts evenly with the parallel of latitude, causing horizontal and anti-clockwise movement of the rock mass. Earlier research on focal mechanisms of mine tremors, using a standard fault plane solution, has shown that some events are related to tectonic directions in the main structural units of the USCB. An attempt was undertaken to analyze the records of mine tremors from the period 1992-1994 in the selected coal fields. The digital records of about 200 mine tremors with energy larger than 1×10^4 J (M_L > 1.23) were analyzed with SMT software for seismic moment tensor inversion. The seismic moment tensor of the mine tremors was decomposed into an isotropic (I) part, a compensated linear vector dipole (CLVD) part and a double-couple (DC) part. The DC part is prevalent (up to 70%) in the majority of quakes from the central region of the USCB. A group of mine tremors with a large I element (up to 50%) can also be observed. The spatial orientations of the fault and auxiliary planes were obtained from the computations for the DC part of the seismic moment. Study of the DC part of the seismic moment tensor made it possible for us to separate the group of events which might be acknowledged to have their origin in unstable energy release on surfaces of faults forming a regional structural pattern. The possible influence of the Cainozoic tectonic history of the USCB on the recent shape of the stress field is discussed.

  14. Hydrology of area 18, Eastern Coal Province, Tennessee

    USGS Publications Warehouse

    May, V.J.

    1981-01-01

    The Eastern Coal Province is divided into 24 hydrologic reporting areas. This report describes the hydrology of area 18, which is located in the Cumberland River basin in central Tennessee near the southern end of the Province. Hydrologic information and sources are presented as text, tables, maps, and other illustrations designed to be useful to mine owners, operators, and consulting engineers in implementing permit applications that comply with the environmental requirements of the 'Surface Mining Control and Reclamation Act of 1977.' Area 18 encompasses parts of three physiographic regions; from east to west, the Cumberland Plateau, Highland Rim, and Central Basin. The Plateau is underlain by sandstones and shales, with thin interbedded coal beds, of Pennsylvanian age. The Highland Rim and Central Basin are underlain by limestone and dolomite of Mississippian age. Field and laboratory analyses of chemical and physical water-quality parameters of streamflow samples show no widespread water quality problems. Some streams, however, in the heavily mined areas have concentrations of sulfate, iron, manganese, and sediment above natural levels, and pH values below natural levels. Mine seepage and direct mine drainage were not sampled. Ground water occurs in and moves through fractures in the sandstones and shales and solution openings in the limestones and dolomites. Depth to water is variable, ranging from about 5 to 70 feet below land surface in the limestones and dolomites, and 15 to 40 feet in the coal-bearing rocks. The quality of ground water is generally good. Locally, in coal-bearing rocks, acidic water and high concentrations of manganese, chloride, and iron have been detected. (USGS)

  15. 30 CFR Appendix I to Subpart J of... - Appendix I to Subpart J of Part 7

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Appendix I to Subpart J of Part 7 I Appendix I to Subpart J of Part 7 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS TESTING BY APPLICANT OR THIRD PARTY Electric Motor...

  16. Comparisons and Selections of Features and Classifiers for Short Text Classification

    NASA Astrophysics Data System (ADS)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.
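
    Of the feature representations compared in the record above, tf-idf weighting is the easiest to show in a few lines. The sketch below uses one common variant (relative term frequency multiplied by log(N / df)); the abstract does not say which variant the authors used, so this choice is an assumption, and the tokenized documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical short texts, already split into tokens.
docs = [
    ["dividend", "announcement", "board"],
    ["board", "meeting", "notice"],
]
weights = tf_idf(docs)
```

    A term that occurs in every document, such as "board" here, receives weight 0 under this variant, which is why tf-idf tends to suppress uninformative vocabulary in very short texts.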

  17. 30 CFR 870.11 - Applicability.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.11 Applicability. The regulations in this part apply to all surface and underground coal mining operations except... him; (b) The extraction of coal as an incidental part of Federal, State, or local government-financed...

  18. 30 CFR 870.11 - Applicability.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.11 Applicability. The regulations in this part apply to all surface and underground coal mining operations except... him; (b) The extraction of coal as an incidental part of Federal, State, or local government-financed...

  19. 30 CFR 870.11 - Applicability.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.11 Applicability. The regulations in this part apply to all surface and underground coal mining operations except... him; (b) The extraction of coal as an incidental part of Federal, State, or local government-financed...

  20. 30 CFR 870.11 - Applicability.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.11 Applicability. The regulations in this part apply to all surface and underground coal mining operations except... him; (b) The extraction of coal as an incidental part of Federal, State, or local government-financed...

  1. 30 CFR 870.11 - Applicability.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... LAND RECLAMATION ABANDONED MINE RECLAMATION FUND-FEE COLLECTION AND COAL PRODUCTION REPORTING § 870.11 Applicability. The regulations in this part apply to all surface and underground coal mining operations except... him; (b) The extraction of coal as an incidental part of Federal, State, or local government-financed...

  2. 30 CFR 90.103 - Compensation.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND HEALTH MANDATORY HEALTH STANDARDS-COAL MINERS WHO HAVE EVIDENCE OF THE DEVELOPMENT OF PNEUMOCONIOSIS Dust Standards... under § 90.3 (Part 90 option; notice of eligibility; exercise of option). (b) Whenever a Part 90 miner...

  3. Applying Suffix Rules to Organization Name Recognition

    NASA Astrophysics Data System (ADS)

    Inui, Takashi; Murakami, Koji; Hashimoto, Taiichi; Utsumi, Kazuo; Ishikawa, Masamichi

    This paper presents a method for boosting the performance of organization name recognition, which is a part of named entity recognition (NER). Although gazetteers (lists of NEs) are known to be effective features for supervised machine learning approaches to the NER task, previous methods applied them very simply: the gazetteers were used only to search for exact matches between the input text and the NEs they contain. The proposed method generates regular expression rules from gazetteers and, with these rules, realizes high-coverage searches based on looser matches between input text and NEs. To generate these rules, we focus on two well-known characteristics of NE expressions: 1) most NE expressions can be divided into two parts, a class-reference part and an instance-reference part; 2) for most NE expressions, the class-reference part is located at the suffix position. A pattern mining algorithm runs on the set of NEs in the gazetteers and finds frequent word sequences from which NEs are constructed. We then employ as suffix rules only those word sequences that have the class-reference part at the suffix position. Experimental results showed that the proposed method improved the performance of organization name recognition, achieving an F-value of 84.58 on the evaluation data.
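
    The suffix-rule idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes whitespace-tokenized English names, treats only the final word of each gazetteer entry as the class-reference part, and uses a simple frequency threshold in place of the pattern mining step. The gazetteer entries, the function names, and the `min_count` parameter are all hypothetical.

```python
import re
from collections import Counter


def mine_suffix_rules(gazetteer, min_count=2):
    """Return final words that end at least `min_count` gazetteer names."""
    counts = Counter(name.split()[-1] for name in gazetteer if name.split())
    return sorted(word for word, count in counts.items() if count >= min_count)


def build_pattern(suffixes):
    """Compile a regex: one to three capitalized words, then a mined suffix."""
    alternatives = "|".join(re.escape(s) for s in suffixes)
    return re.compile(r"\b(?:[A-Z][\w&-]*\s+){1,3}(?:" + alternatives + r")\b")


# Hypothetical gazetteer of organization names (illustration only).
gazetteer = [
    "Tokyo University",
    "Kyoto University",
    "Acme Corporation",
    "Globex Corporation",
    "Initech",
]

suffixes = mine_suffix_rules(gazetteer)
pattern = build_pattern(suffixes)

# Neither organization below appears in the gazetteer, so an exact-match
# lookup would miss both; the looser suffix rule still finds them.
text = "She left Osaka University to join Umbrella Corporation last year."
matches = pattern.findall(text)
print(suffixes)
print(matches)
```

    The point of the sketch is the contrast with exact matching: the rule generalizes from the class-reference suffix rather than from the full name, which is where the higher coverage described in the abstract would come from.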

  4. 76 FR 64047 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  5. 76 FR 36040 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  6. 78 FR 16204 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  7. 75 FR 34962 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-154-FOR; OSM 2010-0002] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... the Pennsylvania regulatory program (the ``Pennsylvania program'') under the Surface Mining Control...

  8. 76 FR 80310 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-23

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Wyoming regulatory program (hereinafter, the ``Wyoming program'') under the Surface Mining...

  9. 76 FR 67635 - Alaska Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 902... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Alaska regulatory program (hereinafter, the ``Alaska program'') under the Surface Mining...

  10. 76 FR 64045 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  11. 76 FR 76111 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... amendment to the Montana regulatory program (hereinafter, the ``Montana program'') under the Surface Mining...

  12. 77 FR 25874 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Final rule; removal of required amendment... regulatory program (the ``Pennsylvania program'') regulations under the Surface Mining Control and...

  13. 77 FR 1430 - Maryland Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-10

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 920... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; extension of the comment... the Maryland regulatory program (the ``Maryland program'') under the Surface Mining Control and...

  14. State summaries: Kentucky

    USGS Publications Warehouse

    Greb, S.F.; Anderson, W.H.

    2006-01-01

    Kentucky mines coal, limestone, clay, sand and gravel. Coal mining operations are carried out mainly in the Western Kentucky Coal Field and the Eastern Kentucky Coal Field. As to nonfuel minerals, Mississippian limestones are mined in the Mississippian Plateaus Region and along Pine Mountain in southeastern Kentucky. Ordovician and Silurian limestones are mined in the central part of the state. Clay minerals mined in the state include common clay, ceramic and ball clays, refractory clay and shale. As in 2004, mining activities in the state remain significant.

  15. Effect of Name Change of Schizophrenia on Mass Media Between 1985 and 2013 in Japan: A Text Data Mining Analysis

    PubMed Central

    Koike, Shinsuke; Yamaguchi, Sosei; Ojio, Yasutaka; Ohta, Kazusa; Ando, Shuntaro

    2016-01-01

    Background: Mass media such as newspapers and TV news affect mental health-related stigma. In Japan, the name of schizophrenia was changed in 2002 for the purpose of stigma reduction; however, little is known about the effect of this name change on mass media. Method: Articles including the old and new names of schizophrenia, depressive disorder, and diabetes mellitus (DM) in headlines and/or text were extracted from 23169092 articles in 4 major Japanese newspapers and 1 TV news program (1985–2013). The trajectory of the number of articles including each term was determined across years. Then, all text in news headlines was segmented at the part-of-speech level using text data mining. Segmented words were classified into 6 categories, and the extracted words in each category were also tested by target term and period. Results: In total, 51789 newspaper articles and 1106 TV news segments including the target terms were obtained. The number of articles including the target terms increased across years. A relative increase was observed in the articles published on schizophrenia since 2003 compared with those on DM, and between 2000 and 2005 compared with those on depressive disorder. Word tendency in headlines was equivalent before and after 2002 for the articles including each target term. Articles on schizophrenia contained more negative words than those on depressive disorder and DM (31.5%, 16.0%, and 8.2%, respectively). Conclusions: The name change of schizophrenia had a limited effect on the number of articles published and little effect on their contents. PMID:26614786

  16. 78 FR 49079 - Lease Modifications, Lease and Logical Mining Unit Diligence, Advance Royalty, Royalty Rates, and...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-12

    ... Management 43 CFR Parts 3000, 3400, 3430, et al. Lease Modifications, Lease and Logical Mining Unit Diligence... Lease Modifications, Lease and Logical Mining Unit Diligence, Advance Royalty, Royalty Rates, and Bonds... leases and logical mining units (LMUs). The proposed rule would implement Title IV, Subtitle D of the...

  17. 20 CFR 725.202 - Miner defined; condition of entitlement, miner.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED CLAIMS FOR BENEFITS UNDER PART C..., preparation, or transportation of coal, and any person who works or has worked in coal mine construction or... transportation of coal while working at the mine site, or in maintenance or construction of the mine site; or (2...

  18. 20 CFR 725.202 - Miner defined; condition of entitlement, miner.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED CLAIMS FOR BENEFITS UNDER PART C..., preparation, or transportation of coal, and any person who works or has worked in coal mine construction or... transportation of coal while working at the mine site, or in maintenance or construction of the mine site; or (2...

  19. 20 CFR 725.202 - Miner defined; condition of entitlement, miner.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED CLAIMS FOR BENEFITS UNDER PART C OF..., preparation, or transportation of coal, and any person who works or has worked in coal mine construction or... transportation of coal while working at the mine site, or in maintenance or construction of the mine site; or (2...

  20. 20 CFR 725.202 - Miner defined; condition of entitlement, miner.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED CLAIMS FOR BENEFITS UNDER PART C..., preparation, or transportation of coal, and any person who works or has worked in coal mine construction or... transportation of coal while working at the mine site, or in maintenance or construction of the mine site; or (2...

  1. 20 CFR 725.202 - Miner defined; condition of entitlement, miner.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED CLAIMS FOR BENEFITS UNDER PART C..., preparation, or transportation of coal, and any person who works or has worked in coal mine construction or... transportation of coal while working at the mine site, or in maintenance or construction of the mine site; or (2...

  2. Application of Ferulic Acid for Alzheimer’s Disease: Combination of Text Mining and Experimental Validation

    PubMed Central

    Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Liu, Xueyuan

    2018-01-01

    Alzheimer’s disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most likely mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies. PMID:29896095

  3. Application of Ferulic Acid for Alzheimer's Disease: Combination of Text Mining and Experimental Validation.

    PubMed

    Meng, Guilin; Meng, Xiulin; Ma, Xiaoye; Zhang, Gengping; Hu, Xiaolin; Jin, Aiping; Zhao, Yanxin; Liu, Xueyuan

    2018-01-01

    Alzheimer's disease (AD) is an increasing concern in human health. Despite significant research, highly effective drugs to treat AD are lacking. The present study describes the text mining process to identify drug candidates from a traditional Chinese medicine (TCM) database, along with associated protein target mechanisms. We carried out text mining to identify literature that referenced both AD and TCM and focused on identifying compounds and protein targets of interest. After targeting one potential TCM candidate, corresponding protein-protein interaction (PPI) networks were assembled in STRING to decipher the most likely mechanism of action. This was followed by validation using Western blot and co-immunoprecipitation in an AD cell model. The text mining strategy using a vast amount of AD-related literature and the TCM database identified curcumin, whose major component was ferulic acid (FA). This was used as a key candidate compound for further study. Using the top calculated interaction score in STRING, BACE1 and MMP2 were implicated in the activity of FA in AD. Exposure of SHSY5Y-APP cells to FA resulted in a decrease in expression levels of BACE-1 and APP, while the expression of MMP-2 and MMP-9 increased in a dose-dependent manner. This suggests that FA-induced BACE1 and MMP2 pathways may be novel potential mechanisms involved in AD. The text mining of literature and the TCM database related to AD suggested FA as a promising TCM ingredient for the treatment of AD. Potential mechanisms interconnected and integrated with Aβ aggregation inhibition and extracellular matrix remodeling underlying the activity of FA were identified using in vitro studies.

  4. Text Mining Effectively Scores and Ranks the Literature for Improving Chemical-Gene-Disease Curation at the Comparative Toxicogenomics Database

    PubMed Central

    Johnson, Robin J.; Lay, Jean M.; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; Murphy, Cynthia Grondin; Mattingly, Carolyn J.

    2013-01-01

    The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from the 14,094 articles and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency. PMID:23613709

  5. A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

    ERIC Educational Resources Information Center

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2011-01-01

    Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…

  6. Data Mining: A Hybrid Methodology for Complex and Dynamic Research

    ERIC Educational Resources Information Center

    Lang, Susan; Baehr, Craig

    2012-01-01

    This article provides an overview of the ways in which data and text mining have potential as research methodologies in composition studies. It introduces data mining in the context of the field of composition studies and discusses ways in which this methodology can complement and extend our existing research practices by blending the best of what…

  7. 77 FR 58056 - Mississippi Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 924... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM...

  8. 76 FR 36039 - Colorado Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 906... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Colorado proposes both additions...

  9. 77 FR 34890 - Oklahoma Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 936... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  10. 76 FR 50708 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-16

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation...

  11. 75 FR 60371 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  12. 77 FR 41680 - Indiana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-16

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 914... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving amendments to the Indiana...

  13. 77 FR 25949 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-02

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  14. 76 FR 76109 - Colorado Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 906... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public...'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Colorado...

  15. 77 FR 66574 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-06

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  16. 77 FR 18149 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public... receipt of Montana's response to the Office of Surface Mining Reclamation and Enforcement's (OSM) November...

  17. 76 FR 16714 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-160-FOR; OSM 2010-0019] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... Pennsylvania regulatory program (the ``Pennsylvania program'') under the Surface Mining Control and Reclamation...

  18. 77 FR 24661 - North Dakota Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 934... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). North Dakota proposes...

  19. 76 FR 23522 - Oklahoma Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 936... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM...

  20. 75 FR 21534 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  1. 75 FR 46877 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-08-04

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-156-FOR; OSM 2010-0004] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... Pennsylvania program (the ``Pennsylvania program'') under the Surface Mining Control and Reclamation Act of...

  2. 77 FR 34892 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  3. 77 FR 18738 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-03-28

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  4. 76 FR 9700 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-22

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and opportunity for public hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation...

  5. 77 FR 40796 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-11

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Final rule. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are removing a disapproval codified in OSM regulations...

  6. 77 FR 34894 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; withdrawal. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are announcing the withdrawal of a proposed rule...

  7. 76 FR 12857 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment... the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Montana proposed...

  8. 78 FR 11617 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Surface Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; reopening of comment... regulatory program (the ``Pennsylvania program'') under the Surface Mining Control and Reclamation Act of...

  9. Text mining and medicine: usefulness in respiratory diseases.

    PubMed

    Piedra, David; Ferrer, Antoni; Gea, Joaquim

    2014-03-01

    It is increasingly common to have medical information in electronic format. This includes scientific articles as well as clinical management reviews, and even records from health institutions with patient data. However, traditional instruments, both individual and institutional, are of little use for selecting the most appropriate information in each case, either in the clinical or research field. So-called text or data «mining» enables this huge amount of information to be managed, extracting it from various sources using processing systems (filtration and curation), integrating it and permitting the generation of new knowledge. This review aims to provide an overview of text and data mining, and of the potential usefulness of this bioinformatic technique in the exercise of care in respiratory medicine and in research in the same field. Copyright © 2013 SEPAR. Published by Elsevier Espana. All rights reserved.

  10. Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge.

    PubMed

    Cormack, James; Nath, Chinmoy; Milward, David; Raja, Kalpana; Jonnalagadda, Siddhartha R

    2015-12-01

    This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Overview of the gene ontology task at BioCreative IV.

    PubMed

    Mao, Yuqing; Van Auken, Kimberly; Li, Donghui; Arighi, Cecilia N; McQuilton, Peter; Hayman, G Thomas; Tweedie, Susan; Schaeffer, Mary L; Laulederkind, Stanley J F; Wang, Shur-Jen; Gobeill, Julien; Ruch, Patrick; Luu, Anh Tuan; Kim, Jung-Jae; Chiang, Jung-Hsien; Chen, Yu-De; Yang, Chia-Jung; Liu, Hongfang; Zhu, Dongqing; Li, Yanpeng; Yu, Hong; Emadzadeh, Ehsan; Gonzalez, Graciela; Chen, Jian-Ming; Dai, Hong-Jie; Lu, Zhiyong

    2014-01-01

    Gene ontology (GO) annotation is a common task among model organism databases (MODs) for capturing gene function data from journal articles. It is a time-consuming and labor-intensive task, and is thus often considered as one of the bottlenecks in literature curation. There is a growing need for semiautomated or fully automated GO curation techniques that will help database curators to rapidly and accurately identify gene function information in full-length articles. Despite multiple attempts in the past, few studies have proven to be useful with regard to assisting real-world GO curation. The shortage of sentence-level training data and opportunities for interaction between text-mining developers and GO curators has limited the advances in algorithm development and corresponding use in practical circumstances. To this end, we organized a text-mining challenge task for literature-based GO annotation in BioCreative IV. More specifically, we developed two subtasks: (i) to automatically locate text passages that contain GO-relevant information (a text retrieval task) and (ii) to automatically identify relevant GO terms for the genes in a given article (a concept-recognition task). With the support from five MODs, we provided teams with >4000 unique text passages that served as the basis for each GO annotation in our task data. Such evidence text information has long been recognized as critical for text-mining algorithm development but was never made available because of the high cost of curation. In total, seven teams participated in the challenge task. From the team results, we conclude that the state of the art in automatically mining GO terms from literature has improved over the past decade while much progress is still needed for computer-assisted GO curation. Future work should focus on addressing remaining technical challenges for improved performance of automatic GO concept recognition and incorporating practical benefits of text-mining tools into real-world GO annotation. http://www.biocreative.org/tasks/biocreative-iv/track-4-GO/. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.

  12. PPInterFinder--a mining tool for extracting causal relations on human proteins from literature.

    PubMed

    Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar

    2013-01-01

    One of the most common and challenging problems in biomedical text mining is mining protein-protein interactions (PPIs) from MEDLINE abstracts and full-text research articles, because PPIs play a major role in understanding the various biological processes and the impact of proteins in diseases. We implemented PPInterFinder, a web-based text mining tool to extract human PPIs from biomedical literature. PPInterFinder uses relation keyword co-occurrences with protein names to extract information on PPIs from MEDLINE abstracts and consists of three phases. First, it identifies the relation keyword using a parser with Tregex and a relation keyword dictionary. Next, it automatically identifies the candidate PPI pairs with a set of rules related to PPI recognition. Finally, it extracts the relations by matching the sentence with a set of 11 specific patterns based on the syntactic nature of the PPI pair. We find that PPInterFinder is capable of predicting PPIs with an accuracy of 66.05% on the AIMED corpus and outperforms most of the existing systems. DATABASE URL: http://www.biomining-bu.in/ppinterfinder/
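
    The keyword co-occurrence idea described in this abstract can be illustrated with a minimal sketch. It is not the PPInterFinder implementation: it does no Tregex parsing and applies none of the 11 syntactic patterns. The dictionaries PROTEINS and RELATION_KEYWORDS and the function candidate_ppis are hypothetical names introduced only for illustration.

```python
import re
from itertools import combinations

# Hypothetical toy dictionaries; the tool's real lexicons are not given in the abstract.
PROTEINS = {"BRCA1", "BARD1", "TP53", "MDM2"}
RELATION_KEYWORDS = {"binds", "interacts", "phosphorylates", "inhibits"}


def candidate_ppis(sentence):
    """Return (protein, keyword, protein) triples when a relation keyword
    co-occurs with at least two distinct protein names in one sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    proteins = [t for t in tokens if t in PROTEINS]
    keywords = [t.lower() for t in tokens if t.lower() in RELATION_KEYWORDS]
    if not keywords:
        return []
    # dict.fromkeys removes repeated mentions while keeping first-seen order.
    unique_proteins = list(dict.fromkeys(proteins))
    # Simplification: every pair is paired with the first keyword found.
    return [(a, keywords[0], b) for a, b in combinations(unique_proteins, 2)]
```

    A co-occurrence filter of this kind only proposes candidates; the rule-based and pattern-matching phases named in the abstract would be needed to decide which candidates are real interactions.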

  13. PPInterFinder—a mining tool for extracting causal relations on human proteins from literature

    PubMed Central

    Raja, Kalpana; Subramani, Suresh; Natarajan, Jeyakumar

    2013-01-01

    One of the most common and challenging problems in biomedical text mining is mining protein–protein interactions (PPIs) from MEDLINE abstracts and full-text research articles, because PPIs play a major role in understanding the various biological processes and the impact of proteins in diseases. We implemented PPInterFinder, a web-based text mining tool to extract human PPIs from biomedical literature. PPInterFinder uses relation keyword co-occurrences with protein names to extract information on PPIs from MEDLINE abstracts and consists of three phases. First, it identifies the relation keyword using a parser with Tregex and a relation keyword dictionary. Next, it automatically identifies the candidate PPI pairs with a set of rules related to PPI recognition. Finally, it extracts the relations by matching the sentence with a set of 11 specific patterns based on the syntactic nature of the PPI pair. We find that PPInterFinder is capable of predicting PPIs with an accuracy of 66.05% on the AIMED corpus and outperforms most of the existing systems. Database URL: http://www.biomining-bu.in/ppinterfinder/ PMID:23325628

  14. Mining Clinicians' Electronic Documentation to Identify Heart Failure Patients with Ineffective Self-Management: A Pilot Text-Mining Study.

    PubMed

    Topaz, Maxim; Radhakrishnan, Kavita; Lei, Victor; Zhou, Li

    2016-01-01

    Effective self-management can decrease up to 50% of heart failure hospitalizations. Unfortunately, self-management by patients with heart failure remains poor. This pilot study aimed to explore the use of text-mining to identify heart failure patients with ineffective self-management. We first built a comprehensive self-management vocabulary based on a review of the literature and clinical notes. We then randomly selected 545 heart failure patients treated within Partners Healthcare hospitals (Boston, MA, USA) and conducted a regular expression search with the compiled vocabulary within 43,107 interdisciplinary clinical notes of these patients. We found that 38.2% (n = 208) of patients had documentation of ineffective heart failure self-management in the domains of poor diet adherence (28.4%), missed medical encounters (26.4%), poor medication adherence (20.2%) and non-specified self-management issues (e.g., "compliance issues", 34.6%). We showed the feasibility of using text-mining to identify patients with ineffective self-management. More natural language processing algorithms are needed to help busy clinicians identify these patients.
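
    A regular expression search over clinical notes with a compiled vocabulary, as described above, might look roughly like the following sketch. The vocabulary entries, domain names and function names are assumptions made for illustration; the study's actual vocabulary is not reproduced in the abstract.

```python
import re

# Hypothetical mini-vocabulary grouped by self-management domain.
VOCABULARY = {
    "diet": [r"non-?adheren\w* (?:to|with) (?:the )?diet", r"high[- ]salt"],
    "missed_encounters": [r"no[- ]show", r"missed (?:the )?appointment"],
    "medication": [r"(?:not|stopped) taking (?:his|her|their) med\w*"],
    "non_specified": [r"compliance issues?", r"non-?complian\w*"],
}

# One case-insensitive pattern per domain, built by joining the alternatives.
PATTERNS = {
    domain: re.compile("|".join(terms), re.IGNORECASE)
    for domain, terms in VOCABULARY.items()
}


def flag_domains(note):
    """Return the set of domains whose vocabulary matches a single note."""
    return {domain for domain, pattern in PATTERNS.items() if pattern.search(note)}


def patients_flagged(notes_by_patient):
    """Map each patient with at least one matching note to the matched domains."""
    flagged = {}
    for patient_id, notes in notes_by_patient.items():
        domains = set()
        for note in notes:
            domains |= flag_domains(note)
        if domains:
            flagged[patient_id] = domains
    return flagged
```

    Because a patient can match several domains, per-domain percentages computed this way need not sum to 100%.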

  15. Integration of Artificial Market Simulation and Text Mining for Market Analysis

    NASA Astrophysics Data System (ADS)

    Izumi, Kiyoshi; Matsui, Hiroki; Matsuo, Yutaka

    We constructed a system for evaluating self-impact in a financial market using an artificial market and text-mining technology. Economic trends were first extracted from text data circulating in the real world. The trends were then fed into the market simulation. Our simulation revealed that an operation by intervention could reduce over 70% of rate fluctuation in 1995. Using the simulation results, the system was able to help its user find an exchange policy that can stabilize the yen-dollar rate.

  16. Survey of nine surface mines in North America. [Nine different mines in USA and Canada]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hayes, L.G.; Brackett, R.D.; Floyd, F.D.

    This report presents the information gathered by three mining engineers in a 1980 survey of nine surface mines in the United States and Canada. The mines visited included seven coal mines, one copper mine, and one tar sands mine selected as representative of the present state of the art in open pit, strip, and terrace pit mining. The purpose of the survey was to investigate mining methods, equipment requirements, operating costs, reclamation procedures and costs, and other aspects of current surface mining practices in order to acquire basic data for a study comparing conventional and terrace pit mining methods, particularly in deeper overburdens. The survey was conducted as part of a project under DOE Contract No. DE-AC01-79ET10023 titled The Development of Optimal Terrace Pit Coal Mining Systems.

  17. Protective and control relays as coal-mine power-supply ACS subsystem

    NASA Astrophysics Data System (ADS)

    Kostin, V. N.; Minakova, T. E.

    2017-10-01

    The paper presents instantaneous selective short-circuit protection for the cabling of the underground part of a coal mine and central control algorithms as a Coal-Mine Power-Supply ACS Subsystem. In order to improve the reliability of electricity supply and reduce the mining equipment down-time, a dual channel relay protection and central control system is proposed as a subsystem of the coal-mine power-supply automated control system (PS ACS).

  18. 20 CFR 410.101 - Introduction.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... Employees' Benefits SOCIAL SECURITY ADMINISTRATION FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, TITLE IV.... The regulations in this part 410 (Regulation No. 10 of the Social Security Administration) relate to the provisions of part B (Black Lung Benefits) of title IV of the Federal Coal Mine Health and Safety...

  19. 30 CFR 33.23 - Mechanical positioning of parts.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 30 Mineral Resources 1 2011-07-01 2011-07-01 false Mechanical positioning of parts. 33.23 Section 33.23 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL...

  20. 30 CFR 33.23 - Mechanical positioning of parts.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 30 Mineral Resources 1 2013-07-01 2013-07-01 false Mechanical positioning of parts. 33.23 Section 33.23 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL...

  1. 30 CFR 33.23 - Mechanical positioning of parts.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 30 Mineral Resources 1 2014-07-01 2014-07-01 false Mechanical positioning of parts. 33.23 Section 33.23 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL...

  2. 30 CFR 33.23 - Mechanical positioning of parts.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 30 Mineral Resources 1 2012-07-01 2012-07-01 false Mechanical positioning of parts. 33.23 Section 33.23 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR TESTING, EVALUATION, AND APPROVAL OF MINING PRODUCTS DUST COLLECTORS FOR USE IN CONNECTION WITH ROCK DRILLING IN COAL...

  3. 20 CFR 726.2 - Purpose and scope of this part.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 20 Employees' Benefits 3 2010-04-01 2010-04-01 false Purpose and scope of this part. 726.2 Section 726.2 Employees' Benefits EMPLOYMENT STANDARDS ADMINISTRATION, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR'S...

  4. 77 FR 55420 - Minerals Management: Adjustment of Cost Recovery Fees

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-10

    ... mining, oil and gas extraction, and the mining and quarrying of nonmetallic minerals) as an individual...\\ New value \\4\\ New fee \\5\\ Oil & Gas (parts 3100, 3110, 3120, 3130, 3150) Noncompetitive lease... Leasing of Solid Minerals Other Than Coal and Oil Shale (parts 3500, 3580) Applications other than those...

  5. @Note: a workbench for biomedical text mining.

    PubMed

    Lourenço, Anália; Carreira, Rafael; Carneiro, Sónia; Maia, Paulo; Glez-Peña, Daniel; Fdez-Riverola, Florentino; Ferreira, Eugénio C; Rocha, Isabel; Rocha, Miguel

    2009-08-01

    Biomedical Text Mining (BioTM) is providing valuable approaches to the automated curation of scientific literature. However, most efforts have addressed the benchmarking of new algorithms rather than user operational needs. Bridging the gap between BioTM researchers and biologists' needs is crucial to solve real-world problems and promote further research. We present @Note, a platform for BioTM that aims at the effective translation of the advances between three distinct classes of users: biologists, text miners and software developers. Its main functional contributions are the ability to process abstracts and full-texts; an information retrieval module enabling PubMed search and journal crawling; a pre-processing module with PDF-to-text conversion, tokenisation and stopword removal; a semantic annotation schema; a lexicon-based annotator; a user-friendly annotation view that allows users to correct annotations; and a Text Mining Module supporting dataset preparation and algorithm evaluation. @Note improves interoperability, modularity and flexibility when integrating in-house and open-source third-party components. Its component-based architecture allows the rapid development of new applications, emphasizing the principles of transparency and simplicity of use. Although development is still ongoing, it has already allowed the development of applications that are currently being used.
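
    The tokenisation and stopword-removal steps mentioned for the pre-processing module can be sketched as follows. This is a generic illustration, not @Note's code: the stopword list and the function name preprocess are assumptions, and PDF-to-text conversion is out of scope here.

```python
import re

# Illustrative stopword list; real systems use much longer lists.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def preprocess(text):
    """Lowercase the text, split it into tokens and drop stopwords.

    Hyphenated terms such as "well-studied" are kept as single tokens.
    """
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return [token for token in tokens if token not in STOPWORDS]
```

    Lowercasing is a simplification; biomedical pipelines often preserve case because gene and protein symbols can be case-sensitive.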

  6. Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

    PubMed

    Sahadevan, S; Hofmann-Apitius, M; Schellander, K; Tesfaye, D; Fluck, J; Friedrich, C M

    2012-10-01

    In biological research, establishing the prior art by searching and collecting information already present in the domain is as important as the experiments themselves. To obtain a complete overview of the relevant knowledge, researchers mainly rely on 2 major information sources: i) various biological databases and ii) scientific publications in the field. The major difference between the 2 information sources is that information from databases is available, typically well structured and condensed. The information content in scientific literature is vastly unstructured; that is, dispersed among the many different sections of scientific text. The traditional method of information extraction from scientific literature occurs by generating a list of relevant publications in the field of interest and manually scanning these texts for relevant information, which is very time consuming. It is more than likely that in using this "classical" approach the researcher misses some relevant information mentioned in the literature or has to go through biological databases to extract further information. Text mining and named entity recognition methods have already been used in human genomics and related fields as a solution to this problem. These methods can process and extract information from large volumes of scientific text. Text mining is defined as the automatic extraction of previously unknown and potentially useful information from text. Named entity recognition (NER) is defined as the method of identifying named entities (names of real world objects; for example, gene/protein names, drugs, enzymes) in text. In animal sciences, text mining and related methods have been briefly used in murine genomics and associated fields, leaving behind other fields of animal sciences, such as livestock genomics. 
The aim of this work was to develop an information retrieval platform in the livestock domain focusing on livestock publications and the recognition of relevant data from cattle and pigs. For this purpose, the rather noncomprehensive resources of pig and cattle gene and protein terminologies were enriched with orthologue synonyms and integrated into the NER platform ProMiner, which is successfully used in the human genomics domain. Based on the performance tests done, the present system achieved a fair performance with precision 0.64, recall 0.74, and F(1) measure of 0.69 in a test scenario based on cattle literature.
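
    The reported F(1) measure is the harmonic mean of precision and recall, and the figures above can be checked with a few lines. The function name f1_score is introduced here only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract: precision 0.64, recall 0.74.
print(round(f1_score(0.64, 0.74), 2))  # prints 0.69
```

    The unrounded value is about 0.686, which rounds to the 0.69 stated in the abstract.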

  7. miRTex: A Text Mining System for miRNA-Gene Relation Extraction

    PubMed Central

    Li, Gang; Ross, Karen E.; Arighi, Cecilia N.; Peng, Yifan; Wu, Cathy H.; Vijay-Shanker, K.

    2015-01-01

    MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA gene targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good precision and recall when evaluated on a literature corpus of 150 abstracts with F-scores close to 0.90 on the three different types of relations. We conducted full-scale text mining using miRTex to process all the Medline abstracts and all the full-length articles in the PubMed Central Open Access Subset. The results for all the Medline abstracts are stored in a database for interactive query and file download via the website at http://proteininformationresource.org/mirtex. Using miRTex, we identified genes potentially regulated by miRNAs in Triple Negative Breast Cancer, as well as miRNA-gene relations that, in conjunction with kinase-substrate relations, regulate the response to abiotic stress in Arabidopsis thaliana. These two use cases demonstrate the usefulness of miRTex text mining in the analysis of miRNA-regulated biological processes. PMID:26407127

  8. LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes

    PubMed Central

    Cañada, Andres; Rabal, Obdulia; Oyarzabal, Julen; Valencia, Alfonso

    2017-01-01

    Considerable effort has been devoted to systematically retrieving information on genes and proteins as well as the relationships between them. Despite the importance of chemical compounds and drugs as central bio-entities in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologies are currently accessible. Here we present LimTox (Literature Mining for Toxicology), a web-based online biomedical search tool with special focus on adverse hepatobiliary reactions. It integrates a range of text mining, named entity recognition and information extraction components. LimTox relies on machine-learning, rule-based, pattern-based and term lookup strategies. This system processes scientific abstracts, a set of full text articles and medical agency assessment reports. Although the main focus of LimTox is on adverse liver events, it also enables basic searches for other organ level toxicity associations (nephrotoxicity, cardiotoxicity, thyrotoxicity and phospholipidosis). This tool supports specialized search queries for chemical compounds/drugs, genes (with additional emphasis on key enzymes in drug metabolism, namely P450 cytochromes, or CYPs) and biochemical liver markers. The LimTox website is free and open to all users and there is no login requirement. LimTox can be accessed at: http://limtox.bioinfo.cnio.es PMID:28531339

  9. Text mining a self-report back-translation.

    PubMed

    Blanch, Angel; Aluja, Anton

    2016-06-01

    There are several recommendations about the routine to follow when back-translating self-report instruments in cross-cultural research. However, text mining methods have been generally ignored within this field. This work describes an innovative text mining application useful for adapting a personality questionnaire to 12 different languages. The method is divided into 3 stages: a descriptive analysis of the available back-translated instrument versions, a dissimilarity assessment between the source language instrument and the 12 back-translations, and an assessment of item meaning equivalence. The suggested method contributes to improving the back-translation process of self-report instruments for cross-cultural research in 2 significant intertwined ways. First, it defines a systematic approach to the back-translation issue, allowing for a more orderly and informed evaluation concerning the equivalence of different versions of the same instrument in different languages. Second, it provides more accurate instrument back-translations, which has direct implications for the reliability and validity of the instrument's test scores when used in different cultures/languages. In addition, this procedure can be extended to the back-translation of self-reports measuring psychological constructs in clinical assessment. Future research could refine the suggested methodology and use additional available text mining tools. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
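
    A dissimilarity assessment between a source item and its back-translation can be illustrated with a simple word-overlap measure. The abstract does not say which dissimilarity measure the authors used, so the Jaccard distance below is an assumption chosen for brevity, and the function names are hypothetical.

```python
import re


def words(text):
    """Lowercased set of word tokens in a questionnaire item."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard_dissimilarity(source_item, back_translation):
    """1 minus (shared words / all words); 0.0 means identical vocabularies."""
    a, b = words(source_item), words(back_translation)
    if not a and not b:
        return 0.0
    return 1 - len(a & b) / len(a | b)
```

    Under this measure, "I enjoy meeting new people" and "I like meeting new people" share 4 of 6 distinct words, giving a dissimilarity of 1/3. Word overlap ignores synonyms, so a low score suggests equivalence but a high score only flags an item for human review.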

  10. 78 FR 6062 - North Dakota Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-29

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 934... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). North Dakota intends to...

  11. 77 FR 34888 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 917 [KY-255-FOR; OSM-2012-0004] Kentucky Regulatory Program AGENCY: Office of Surface Mining Reclamation... Program (hereinafter, the ``Kentucky program'') under the Surface Mining Control and Reclamation Act of...

  12. 77 FR 31486 - Virginia Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-29

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 946 [VA-126-FOR; OSM-2008-0012] Virginia Regulatory Program AGENCY: Office of Surface Mining Reclamation... an amendment to the Virginia regulatory program under the Surface Mining Control and Reclamation Act...

  13. 76 FR 4266 - New Mexico Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 931... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; public comment period and... Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). New Mexico proposes revisions to...

  14. 76 FR 9642 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-22

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Alabama...

  15. 78 FR 13002 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Mining Reclamation and Enforcement (``OSM''), Interior. ACTION: Proposed rule; public comment period and... regulatory program under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or the ``Act...

  16. 78 FR 11579 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  17. 75 FR 34960 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-21

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-155-FOR; OSM 2010-0003] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... ``Pennsylvania program'') under the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act...

  18. 78 FR 10512 - Wyoming Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 950... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment with certain... ``Wyoming program'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act...

  19. 76 FR 50436 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-15

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 917 [KY-254-FOR; OSM-2011-0005] Kentucky Regulatory Program AGENCY: Office of Surface Mining Reclamation... Program (hereinafter, the ``Kentucky program'') under the Surface Mining Control and Reclamation Act of...

  20. 77 FR 8144 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... AGENCY: Office of Surface Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving three...

  1. 78 FR 9807 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We are approving an amendment to the Utah regulatory program (the ``Utah program'') under the Surface Mining...

  2. 76 FR 30008 - Alabama Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-24

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 901... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Alabama...

  3. 75 FR 43476 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-26

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926... Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; reopening and extension of public...'') under the Surface Mining Control and Reclamation Act of 1977 (``SMCRA'' or ``the Act''). Montana revised...

  4. 30 CFR 46.4 - Training plan implementation.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING..., SURFACE CLAY, COLLOIDAL PHOSPHATE, OR SURFACE LIMESTONE MINES. § 46.4 Training plan implementation. (a....9 of this part. (d) Training methods may consist of classroom instruction, instruction at the mine...

  5. 30 CFR 46.4 - Training plan implementation.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING..., SURFACE CLAY, COLLOIDAL PHOSPHATE, OR SURFACE LIMESTONE MINES. § 46.4 Training plan implementation. (a....9 of this part. (d) Training methods may consist of classroom instruction, instruction at the mine...

  6. 30 CFR 46.4 - Training plan implementation.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING..., SURFACE CLAY, COLLOIDAL PHOSPHATE, OR SURFACE LIMESTONE MINES. § 46.4 Training plan implementation. (a....9 of this part. (d) Training methods may consist of classroom instruction, instruction at the mine...

  7. 30 CFR 46.4 - Training plan implementation.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING..., SURFACE CLAY, COLLOIDAL PHOSPHATE, OR SURFACE LIMESTONE MINES. § 46.4 Training plan implementation. (a....9 of this part. (d) Training methods may consist of classroom instruction, instruction at the mine...

  8. 30 CFR 46.4 - Training plan implementation.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR EDUCATION AND TRAINING..., SURFACE CLAY, COLLOIDAL PHOSPHATE, OR SURFACE LIMESTONE MINES. § 46.4 Training plan implementation. (a....9 of this part. (d) Training methods may consist of classroom instruction, instruction at the mine...

  9. 75 FR 81122 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-27

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  10. 77 FR 58025 - Texas Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 943... Mining Reclamation and Enforcement, Interior. ACTION: Final rule; approval of amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are approving an amendment to the Texas...

  11. Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach.

    PubMed

    Schneider, Nadine; Fechner, Nikolas; Landrum, Gregory A; Stiefl, Nikolaus

    2017-08-28

    Big data is one of the key transformative factors which increasingly influences all aspects of modern life. Although this transformation brings vast opportunities it also generates novel challenges, not the least of which is organizing and searching this data deluge. The field of medicinal chemistry is not different: more and more data are being generated, for instance, by technologies such as DNA encoded libraries, peptide libraries, text mining of large literature corpora, and new in silico enumeration methods. Handling those huge sets of molecules effectively is quite challenging and requires compromises that often come at the expense of the interpretability of the results. In order to find an intuitive and meaningful approach to organizing large molecular data sets, we adopted a probabilistic framework called "topic modeling" from the text-mining field. Here we present the first chemistry-related implementation of this method, which allows large molecule sets to be assigned to "chemical topics" and the relationships between those topics to be investigated. In this first study, we thoroughly evaluate this novel method in different experiments and discuss both its disadvantages and advantages. We show very promising results in reproducing human-assigned concepts using the approach to identify and retrieve chemical series from sets of molecules. We have also created an intuitive visualization of the chemical topics output by the algorithm. This is a huge benefit compared to other unsupervised machine-learning methods, like clustering, which are commonly used to group sets of molecules. Finally, we applied the new method to the 1.6 million molecules of the ChEMBL22 data set to test its robustness and efficiency. In about 1 h we built a 100-topic model of this large data set in which we could identify interesting topics like "proteins", "DNA", or "steroids". 
Along with this publication we provide our data sets and an open-source implementation of the new method (CheTo) which will be part of an upcoming version of the open-source cheminformatics toolkit RDKit.

  12. Ion Channel ElectroPhysiology Ontology (ICEPO) - a case study of text mining assisted ontology development.

    PubMed

    Elayavilli, Ravikumar Komandur; Liu, Hongfang

    2016-01-01

    Computational modeling of biological cascades is of great interest to quantitative biologists. Biomedical text has been a rich source of quantitative information. Gathering quantitative parameters and values from biomedical text is one significant challenge in the early steps of computational modeling, as it involves huge manual effort. While automatically extracting such quantitative information from biomedical text may offer some relief, the lack of an ontological representation for a subdomain impedes normalizing textual extractions to a standard representation. This may render textual extractions less meaningful to domain experts. In this work, we propose a rule-based approach to automatically extract relations involving quantitative data from biomedical text describing ion channel electrophysiology. We further translated the quantitative assertions extracted through text mining to a formal representation that may help in constructing an ontology for ion channel events using a rule-based approach. We have developed the Ion Channel ElectroPhysiology Ontology (ICEPO) by integrating the information represented in closely related ontologies, such as the Cell Physiology Ontology (CPO) and the Cardiac Electro Physiology Ontology (CPEO), with the knowledge provided by domain experts. The rule-based system achieved an overall F-measure of 68.93% in extracting quantitative data assertions on an independently annotated blind data set. We further made an initial attempt at formalizing the quantitative data assertions extracted from the biomedical text into a formal representation that offers potential to facilitate the integration of text mining into ontological workflow, a novel aspect of this study. This work is a case study where we created a platform that provides formal interaction between ontology development and text mining. We have achieved partial success in extracting quantitative assertions from the biomedical text and formalizing them in an ontological framework. The ICEPO ontology is available for download at http://openbionlp.org/mutd/supplementarydata/ICEPO/ICEPO.owl.
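
The record above describes rule-based extraction of quantitative assertions, scored with an F-measure. As a rough illustration only, the sketch below pairs one hypothetical extraction rule with a balanced F-measure computed over sets. The regular expression, the unit list and the function names are assumptions made for this example and are not the rules used by the ICEPO authors.

```python
import re

# Hypothetical rule: a number followed by an electrophysiology unit.
# The unit list is an illustrative assumption, not taken from the paper.
QUANTITY_RULE = re.compile(r"(-?\d+(?:\.\d+)?)\s*(mV|ms|pA|nS)\b")


def extract_quantities(sentence):
    """Return (value, unit) pairs matched by the rule, in order of appearance."""
    return [(float(value), unit) for value, unit in QUANTITY_RULE.findall(sentence)]


def f_measure(predicted, gold):
    """Balanced F-measure: harmonic mean of precision and recall over two sets."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, the assertions extracted from a held-out annotated set would be passed as `predicted` and the annotators' assertions as `gold`.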

  13. 75 FR 16179 - Notice of Affirmative Decisions on Petitions for Modification Granted in Whole or in Part

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-31

    ...-013-C FR Notice: 74 FR 27185 (June 8, 2009). Petitioner: Wolf Run Mining Company, Rt. 3, Box 146... FR 23745 (May 20, 2009). Petitioner: Excel Mining, LLC, Box 4126, State Highway 194 West, Pikeville... Heights, P.O. Box 1944, Superior, Arizona 85273. Mine: Resolution Copper Mine, MSHA I.D. No. 02-00152...

  14. 40 CFR 440.32 - Effluent limitations representing the degree of effluent reduction attainable by the application...

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ..., radium and vanadium including mill-mine facilities and mines using in-situ leach methods shall not exceed...). Except as provided in subpart L of this part and 40 CFR 125.30 through 125.32, any existing point source... available (BPT): (a) The concentration of pollutants discharged in mine drainage from mines, either open-pit...

  15. 40 CFR 440.32 - Effluent limitations representing the degree of effluent reduction attainable by the application...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ..., radium and vanadium including mill-mine facilities and mines using in-situ leach methods shall not exceed...). Except as provided in subpart L of this part and 40 CFR 125.30 through 125.32, any existing point source... available (BPT): (a) The concentration of pollutants discharged in mine drainage from mines, either open-pit...

  16. A Corporate Responsibility? The Constitution of Fly-in, Fly-out Mining Companies as Governance Partners in Remote, Mine-Affected Localities

    ERIC Educational Resources Information Center

    Cheshire, Lynda

    2010-01-01

    In some remote parts of Australia, mining companies have positioned themselves as central actors in governing nearby affected communities by espousing notions of "voluntary partnerships for sustainability" between business, government and community. It is argued in this paper that the nature and extent of mining company interventions in…

  17. Characteristics of the Roof Behaviors and Mine Pressure Manifestations During the Mining of Steep Coal Seam

    NASA Astrophysics Data System (ADS)

    Hong-Sheng, Tu; Shi-Hao, Tu; Cun, Zhang; Lei, Zhang; Xiao-Gang, Zhang

    2017-12-01

    A steep seam similar simulation system was developed based on the geological conditions of a steep coal seam in the Xintie Coal Mine. Based on similar simulation, together with theoretical analysis and field measurement, an in-depth study was conducted to characterize the fracture and stability of the roof of the steep working face and calculate the width of the region backfilled with gangue in the goaf. The results showed that, as mining progressed, the immediate roof of the steep face fell upon the goaf and backfilled its lower part due to gravity. As a result, the roof in the lower part had higher stability than the roof in the upper part of the working face. The deformation and fracture of the main roof mainly occurred in the upper part of the working face; the fractured main roof then formed a "voussoir beam" structure in the strata's dip direction, which was subject to slip- and deformation-induced instability. The stability analysis indicated that, when the dip angle increased, the rock masses had greater capacity to withstand slip-induced instability but smaller capacity to withstand deformation-induced instability. Finally, the field measurement of the forces exerted on the hydraulic supports confirmed the characteristics of the roof's behavior during the mining of a steep seam.

  18. Flooded Underground Coal Mines: A Significant Source of Inexpensive Geothermal Energy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Watzlaf, G.R.; Ackman, T.E.

    2007-04-01

    Many mining regions in the United States contain extensive areas of flooded underground mines. The water within these mines represents a significant and widespread opportunity for extracting low-grade geothermal energy. Based on current energy prices, geothermal heat pump systems using mine water could reduce the annual costs for heating by over 70 percent compared to conventional heating methods (natural gas or heating oil). These same systems could reduce annual cooling costs by up to 50 percent over standard air conditioning in many areas of the country. (Formatted full-text version is released by permission of publisher)

  19. 10. DIAMOND MINE YARD FROM THE NORTH SHOWING A COMPRESSED ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    10. DIAMOND MINE YARD FROM THE NORTH SHOWING A COMPRESSED AIR PIPE AND TRESTLE IN THE LOWER LEFT, AND THE LORRY HOUSE. A PART OF A RETAINING WALL IS VISIBLE ABOVE THE RAILROAD CUT - Butte Mineyards, Diamond Mine, Butte, Silver Bow County, MT

  20. 30 CFR 715.11 - General obligations.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR INITIAL... surface coal mining and reclamation operations conducted on lands where any element of the operations is... are established by part 716 of this chapter for— (1) Surface coal mining operations on steep slopes...

  1. 76 FR 64048 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938... Surface Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; reopening and extension... Mining Control and Reclamation Act of 1977 (SMCRA or the Act) published on February 7, 2011. In response...

  2. 75 FR 60375 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944 [SATS No. UT-047-FOR; Docket ID OSM-2010-0012] Utah Regulatory Program AGENCY: Office of Surface Mining... amendment to the Utah regulatory program (hereinafter, the ``Utah program'') under the Surface Mining...

  3. 30 CFR 921.700 - Massachusetts Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 921.700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE MASSACHUSETTS § 921.700 Massachusetts Federal program. (a) This part contains all rules that are applicable to surface coal mining...

  4. 76 FR 6587 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-07

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-159-FOR; OSM 2010-0017] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act). In response to a required...

  5. 77 FR 58053 - Kentucky Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-09-19

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 917... Mining Reclamation and Enforcement (OSM), Interior. ACTION: Proposed rule; Removal of Required Amendments... program'') under the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act). As a result of...

  6. 77 FR 46346 - Ohio Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-08-03

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 935 [OH-254-FOR; Docket ID OSM-2012-0012] Ohio Regulatory Program AGENCY: Office of Surface Mining... under the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act). Ohio's proposed...

  7. 76 FR 12920 - Pennsylvania Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 938 [PA-157-FOR; OSM 2010-0011] Pennsylvania Regulatory Program AGENCY: Office of Surface Mining... the Surface Mining Control and Reclamation Act of 1977 (SMCRA or the Act). In response to a required...

  8. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Federal program. (c) The rules in this part apply to all surface coal mining operations in Oregon... more stringent environmental control and regulation of surface coal mining operations than do the... extent they provide for regulation of surface coal mining and reclamation operations which are exempt...

  9. 30 CFR 740.1 - Scope and purpose.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR FEDERAL LANDS PROGRAM GENERAL REQUIREMENTS FOR SURFACE COAL MINING AND RECLAMATION OPERATIONS ON FEDERAL LANDS § 740.1 Scope and purpose. This part provides for the regulation of surface coal mining and reclamation...

  10. 30 CFR 740.1 - Scope and purpose.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR FEDERAL LANDS PROGRAM GENERAL REQUIREMENTS FOR SURFACE COAL MINING AND RECLAMATION OPERATIONS ON FEDERAL LANDS § 740.1 Scope and purpose. This part provides for the regulation of surface coal mining and reclamation...

  11. 30 CFR 740.1 - Scope and purpose.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR FEDERAL LANDS PROGRAM GENERAL REQUIREMENTS FOR SURFACE COAL MINING AND RECLAMATION OPERATIONS ON FEDERAL LANDS § 740.1 Scope and purpose. This part provides for the regulation of surface coal mining and reclamation...

  12. 30 CFR 740.1 - Scope and purpose.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR FEDERAL LANDS PROGRAM GENERAL REQUIREMENTS FOR SURFACE COAL MINING AND RECLAMATION OPERATIONS ON FEDERAL LANDS § 740.1 Scope and purpose. This part provides for the regulation of surface coal mining and reclamation...

  13. 30 CFR 740.1 - Scope and purpose.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR FEDERAL LANDS PROGRAM GENERAL REQUIREMENTS FOR SURFACE COAL MINING AND RECLAMATION OPERATIONS ON FEDERAL LANDS § 740.1 Scope and purpose. This part provides for the regulation of surface coal mining and reclamation...

  14. 36 CFR 9.2 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... MANAGEMENT Mining and Mining Claims § 9.2 Definitions. The terms used in this part shall have the following... in connection with mining on claims, including: prospecting, exploration, surveying, development and... thereto, including construction or use of roads or other means of access on National Park System lands...

  15. 36 CFR 9.2 - Definitions.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... MANAGEMENT Mining and Mining Claims § 9.2 Definitions. The terms used in this part shall have the following... in connection with mining on claims, including: prospecting, exploration, surveying, development and... thereto, including construction or use of roads or other means of access on National Park System lands...

  16. 36 CFR 9.2 - Definitions.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... MANAGEMENT Mining and Mining Claims § 9.2 Definitions. The terms used in this part shall have the following... in connection with mining on claims, including: prospecting, exploration, surveying, development and... thereto, including construction or use of roads or other means of access on National Park System lands...

  17. 36 CFR 9.2 - Definitions.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... MANAGEMENT Mining and Mining Claims § 9.2 Definitions. The terms used in this part shall have the following... in connection with mining on claims, including: prospecting, exploration, surveying, development and... thereto, including construction or use of roads or other means of access on National Park System lands...

  18. 76 FR 6589 - West Virginia Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-02-07

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 948 [WV-116-FOR; OSM-2009-0008] West Virginia Regulatory Program AGENCY: Office of Surface Mining... Mining Reclamation and Enforcement (OSM) characterized the change as non-substantive, and did not note...

  19. 78 FR 33858 - Renewal of Approved Information Collection

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-05

    ..., recording, and maintenance of mining claims and sites. The Office of Management and Budget (OMB) has... is provided for the information collection: Title: Recordation of Location Notices and Mining Claims... Parts 3832 through 3838. These regulations pertain to the location, recording, and maintenance of mining...

  20. 75 FR 22723 - Stream Protection Rule; Environmental Impact Statement

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-30

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Parts 780... of Surface Mining Reclamation and Enforcement, Interior. ACTION: Proposed rule; notice of intent to prepare an environmental impact statement. SUMMARY: We, the Office of Surface Mining Reclamation and...

  1. Mining Quality Phrases from Massive Text Corpora

    PubMed Central

    Liu, Jialu; Shang, Jingbo; Wang, Chi; Ren, Xiang; Han, Jiawei

    2015-01-01

    Text data are ubiquitous and play an essential role in big data applications. However, text data are mostly unstructured. Transforming unstructured text into structured units (e.g., semantically meaningful phrases) will substantially reduce semantic ambiguity and enhance the power and efficiency at manipulating such data using database technology. Thus mining quality phrases is a critical research problem in the field of databases. In this paper, we propose a new framework that extracts quality phrases from text corpora integrated with phrasal segmentation. The framework requires only limited training but the quality of phrases so generated is close to human judgment. Moreover, the method is scalable: both computation time and required space grow linearly as corpus size increases. Our experiments on large text corpora demonstrate the quality and efficiency of the new method. PMID:26705375

  2. A UIMA wrapper for the NCBO annotator.

    PubMed

    Roeder, Christophe; Jonquet, Clement; Shah, Nigam H; Baumgartner, William A; Verspoor, Karin; Hunter, Lawrence

    2010-07-15

    The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator, an ontology-based annotation service, to make it available as a component in UIMA workflows. This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows.

  3. 20 CFR 726.103 - Application for authority to self-insure; effect of regulations contained in this part.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ...; effect of regulations contained in this part. 726.103 Section 726.103 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR'S INSURANCE Authorization of Self...

  4. 20 CFR 726.103 - Application for authority to self-insure; effect of regulations contained in this part.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ...; effect of regulations contained in this part. 726.103 Section 726.103 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR'S INSURANCE Authorization of Self...

  5. 20 CFR 726.103 - Application for authority to self-insure; effect of regulations contained in this part.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ...; effect of regulations contained in this part. 726.103 Section 726.103 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR'S INSURANCE Authorization of Self...

  6. 25 CFR 211.53 - Assignments, overriding royalties, and operating agreements.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... agreements. 211.53 Section 211.53 Indians BUREAU OF INDIAN AFFAIRS, DEPARTMENT OF THE INTERIOR ENERGY AND..., geothermal, and mining regulations (25 CFR part 216; 43 CFR parts 3160, 3260, 3480, and 3590; and those... approval before abandonment of any oil and gas or geothermal well or mining operation. All such obligations...

  7. 25 CFR 211.53 - Assignments, overriding royalties, and operating agreements.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... agreements. 211.53 Section 211.53 Indians BUREAU OF INDIAN AFFAIRS, DEPARTMENT OF THE INTERIOR ENERGY AND..., geothermal, and mining regulations (25 CFR part 216; 43 CFR parts 3160, 3260, 3480, and 3590; and those... approval before abandonment of any oil and gas or geothermal well or mining operation. All such obligations...

  8. A Review of Recent Advancement in Integrating Omics Data with Literature Mining towards Biomedical Discoveries

    PubMed Central

    Raja, Kalpana; Patrick, Matthew; Gao, Yilin; Madu, Desmond; Yang, Yuyang

    2017-01-01

    In the past decade, the volume of “omics” data generated by the different high-throughput technologies has expanded exponentially. Managing, storing, and analyzing this big data has been a great challenge for researchers, especially when moving towards the goal of generating testable data-driven hypotheses, which has been the promise of the high-throughput experimental techniques. Different bioinformatics approaches have been developed to streamline the downstream analyses by providing independent information to interpret and provide biological inference. Text mining (also known as literature mining) is one of the commonly used approaches for automated generation of biological knowledge from the huge number of published articles. In this review paper, we discuss the recent advancement in approaches that integrate results from omics data and information generated from text mining approaches to uncover novel biomedical information. PMID:28331849

  9. Systematic drug repositioning through mining adverse event data in ClinicalTrials.gov.

    PubMed

    Su, Eric Wen; Sanger, Todd M

    2017-01-01

    Drug repositioning (i.e., drug repurposing) is the process of discovering new uses for marketed drugs. Historically, such discoveries were serendipitous. However, the rapid growth in electronic clinical data and text mining tools makes it feasible to systematically identify drugs with the potential to be repurposed. Described here is a novel method of drug repositioning by mining ClinicalTrials.gov. The text mining tools I2E (Linguamatics) and PolyAnalyst (Megaputer) were utilized. An I2E query extracts "Serious Adverse Events" (SAE) data from randomized trials in ClinicalTrials.gov. Through a statistical algorithm, a PolyAnalyst workflow ranks the drugs where the treatment arm has fewer predefined SAEs than the control arm, indicating that potentially the drug is reducing the level of SAE. Hypotheses could then be generated for the new use of these drugs based on the predefined SAE that is indicative of disease (for example, cancer).
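
The ranking step in the record above (drugs whose treatment arm shows fewer predefined SAEs than the control arm) can be sketched as follows. The abstract does not specify the statistical algorithm used in the PolyAnalyst workflow, so this example substitutes a plain difference in event rates. The function name, the input layout and the drug labels in the usage note are hypothetical.

```python
def rank_by_sae_reduction(trials):
    """Rank drugs by how much lower the SAE rate is in the treatment arm.

    `trials` maps a drug name to a tuple
    (treated_events, treated_n, control_events, control_n).
    Only drugs whose treatment-arm rate is lower than the control-arm rate are kept.
    """
    scored = []
    for drug, (t_events, t_n, c_events, c_n) in trials.items():
        # Assumed score: control-arm SAE rate minus treatment-arm SAE rate.
        reduction = c_events / c_n - t_events / t_n
        if reduction > 0:
            scored.append((drug, reduction))
    # Largest reduction first.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Called with made-up counts such as `{"drug_A": (2, 100, 10, 100), "drug_B": (5, 100, 6, 100)}`, the function lists `drug_A` before `drug_B`. A real analysis would also need a significance test and a correction for multiple comparisons, which this sketch omits.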

  10. Exploratory analysis of textual data from the Mother and Child Handbook using the text-mining method: Relationships with maternal traits and post-partum depression.

    PubMed

    Matsuda, Yoshio; Manaka, Tomoko; Kobayashi, Makiko; Sato, Shuhei; Ohwada, Michitaka

    2016-06-01

    The aim of the present study was to examine the possibility of screening apprehensive pregnant women and mothers at risk for post-partum depression from an analysis of the textual data in the Mother and Child Handbook by using the text-mining method. Uncomplicated pregnant women (n = 58) were divided into two groups according to State-Trait Anxiety Inventory grade (high trait [group I, n = 21] and low trait [group II, n = 37]) or Edinburgh Postnatal Depression Scale score (high score [group III, n = 15] and low score [group IV, n = 43]). An exploratory analysis of the textual data from the Mother and Child Handbook was conducted using the text-mining method with the Word Miner software program. A comparison of the 'structure elements' was made between the two groups. The number of structure elements extracted by separating words from the text data was 20 004, and the number of structure elements with a threshold of 2 or more as an initial value was 1168. Fifteen key words related to maternal anxiety and six key words related to post-partum depression were extracted. The text-mining method is useful for the exploratory analysis of textual data obtained from pregnant women, and this screening method has been suggested to be useful for apprehensive pregnant women and mothers at risk for post-partum depression. © 2016 Japan Society of Obstetrics and Gynecology.

  11. An Integrated Suite of Text and Data Mining Tools - Phase II

    DTIC Science & Technology

    2005-08-30

    Riverside, CA, USA Mazda Motor Corp, Jpn Univ of Darmstadt, Darmstadt, Ger Navy Center for Applied Research in Artificial Intelligence Univ of...with Georgia Tech Research Corporation developed a desktop text-mining software tool named TechOASIS (known commercially as VantagePoint). By the...of this dataset and groups the Corporate Source items that co-occur with the found items. He decides he is only interested in the institutions

  12. Advances in Knowledge Discovery and Data Mining 21st Pacific Asia Conference, PAKDD 2017 Held in Jeju, South Korea, May 23-26, 2017. Proceedings Part I, Part II.

    DTIC Science & Technology

    2017-06-27

    From - To) 05-27-2017 Final 17-03-2017 - 15-03-2018 4. TITLE AND SUBTITLE 5a. CONTRACT NUMBER FA2386-17-1-0102 Advances in Knowledge Discovery and...Springer; Switzerland. 14. ABSTRACT The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference...in the areas of knowledge discovery and data mining (KDD). We had three keynote speeches, delivered by Sang Cha from Seoul National University

  13. Automated Assessment of Patients' Self-Narratives for Posttraumatic Stress Disorder Screening Using Natural Language Processing and Text Mining.

    PubMed

    He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo

    2017-03-01

    Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms (decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model) were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
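
The n-gram representation mentioned in the record above can be illustrated with a short sketch that builds unigram and multigram counts from a narrative. The whitespace tokenizer and the function names are simplifying assumptions for this example. The product score model itself is not reproduced here, since the abstract does not define it.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count every n-gram for n = 1..max_n, using a naive lowercase whitespace split."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features
```

Feature counts of this kind could then be fed to any of the classifiers named in the abstract. With `max_n=1` only unigrams are produced, which corresponds to the unigram setting the authors compare against.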

  14. Using ontology network structure in text mining.

    PubMed

    Berndt, Donald J; McCart, James A; Luther, Stephen L

    2010-11-13

    Statistical text mining treats documents as bags of words, with a focus on term frequencies within documents and across document collections. Unlike natural language processing (NLP) techniques that rely on an engineered vocabulary or a full-featured ontology, statistical approaches do not make use of domain-specific knowledge. The freedom from biases can be an advantage, but at the cost of ignoring potentially valuable knowledge. The approach proposed here investigates a hybrid strategy based on computing graph measures of term importance over an entire ontology and injecting the measures into the statistical text mining process. As a starting point, we adapt existing search engine algorithms such as PageRank and HITS to determine term importance within an ontology graph. The graph-theoretic approach is evaluated using a smoking data set from the i2b2 National Center for Biomedical Computing, cast as a simple binary classification task for categorizing smoking-related documents, demonstrating consistent improvements in accuracy.
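
The hybrid strategy in the record above, graph measures of term importance injected into a bag-of-words process, can be sketched as below. This is a minimal power-iteration PageRank over a toy graph. The graph layout, the damping value and the way importance scales the counts are all assumptions for illustration, since the abstract does not say how the measures are injected.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; `graph` maps each node to the nodes it links to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    size = len(nodes)
    rank = {node: 1.0 / size for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not graph.get(n))
        new_rank = {n: (1 - damping) / size + damping * dangling / size for n in nodes}
        for source, targets in graph.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


def weight_term_counts(term_counts, importance):
    """Scale bag-of-words counts by importance; terms absent from the graph get weight 0."""
    return {term: count * importance.get(term, 0.0) for term, count in term_counts.items()}
```

In a toy graph where `cigarette` and `cigar` both point to `tobacco`, `tobacco` receives the highest score, so its count is boosted relative to the more specific terms. Zeroing out terms that fall outside the ontology is one possible choice; keeping their raw counts would be an equally defensible variant.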

  15. 30 CFR 912.700 - Idaho Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE IDAHO § 912.700 Idaho Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Idaho...

  16. 30 CFR 903.700 - Arizona Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE ARIZONA § 903.700 Arizona Federal program. (a) This part establishes a Federal program under the Surface Mining Control and Reclamation Act...

  17. 30 CFR 905.700 - California Federal Program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE CALIFORNIA § 905.700 California Federal Program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  18. 76 FR 64043 - Iowa Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-17

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 915 [Docket No. IA-016-FOR; Docket ID: OSM-2011-0014] Iowa Regulatory Program AGENCY: Office of Surface Mining.... SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM), are announcing receipt of a...

  19. 30 CFR 947.700 - Washington Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE WASHINGTON § 947.700 Washington Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  20. 30 CFR 922.700 - Michigan Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE MICHIGAN § 922.700 Michigan Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  1. 30 CFR 910.700 - Georgia Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE GEORGIA § 910.700 Georgia Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Georgia...

  2. 30 CFR 937.700 - Oregon Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE OREGON § 937.700 Oregon Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in Oregon...

  3. 77 FR 8185 - Ohio Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-14

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 935 [SATS No. OH-252-FOR; Docket ID OSM 2011-0003] Ohio Regulatory Program AGENCY: Office of Surface Mining... amendment to the Ohio regulatory program (the ``Ohio program'') under the Surface Mining Control and...

  4. 78 FR 63909 - Missouri Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 925... 08011000 SX066A00033 F13XS501520] Missouri Regulatory Program AGENCY: Office of Surface Mining Reclamation... hearing on proposed amendment. SUMMARY: We, the Office of Surface Mining Reclamation and Enforcement (OSM...

  5. 30 CFR 942.700 - Tennessee Federal program.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ....700 Mineral Resources OFFICE OF SURFACE MINING RECLAMATION AND ENFORCEMENT, DEPARTMENT OF THE INTERIOR PROGRAMS FOR THE CONDUCT OF SURFACE MINING OPERATIONS WITHIN EACH STATE TENNESSEE § 942.700 Tennessee Federal program. (a) This part contains all rules that are applicable to surface coal mining operations in...

  6. 20 CFR 726.2 - Purpose and scope of this part.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 726.2 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR... and controlling the circumstances under which a coal mine operator shall fulfill his insurance...

  7. 20 CFR 726.2 - Purpose and scope of this part.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 726.2 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR... and controlling the circumstances under which a coal mine operator shall fulfill his insurance...

  8. 20 CFR 726.2 - Purpose and scope of this part.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 726.2 Employees' Benefits OFFICE OF WORKERS' COMPENSATION PROGRAMS, DEPARTMENT OF LABOR FEDERAL COAL MINE HEALTH AND SAFETY ACT OF 1969, AS AMENDED BLACK LUNG BENEFITS; REQUIREMENTS FOR COAL MINE OPERATOR... and controlling the circumstances under which a coal mine operator shall fulfill his insurance...

  9. Data mining-based coefficient of influence factors optimization of test paper reliability

    NASA Astrophysics Data System (ADS)

    Xu, Peiyao; Jiang, Huiping; Wei, Jieyao

    2018-05-01

    Test is a significant part of the teaching process. It demonstrates the final outcome of school teaching through teachers' teaching level and students' scores. The analysis of test paper is a complex operation that has the characteristics of non-linear relation in the length of the paper, time duration and the degree of difficulty. It is therefore difficult to optimize the coefficient of influence factors under different conditions in order to get test papers with clearly higher reliability with general methods [1]. With data mining techniques like Support Vector Regression (SVR) and Genetic Algorithm (GA), we can model the test paper analysis and optimize the coefficient of impact factors for higher reliability. It's easy to find that the combination of SVR and GA can get an effective advance in reliability from the test results. The optimal coefficient of influence factors optimization has a practicability in actual application, and the whole optimizing operation can offer model basis for test paper analysis.

  10. Hydrology of area 50, Northern Great Plains and Rocky Mountain coal provinces, Wyoming and Montana

    USGS Publications Warehouse

    Lowry, Marlin E.; Wilson, James F.

    1983-01-01

    This report is one of a series designed to characterize the hydrology of drainage basins within coal provinces, nationwide. Area 50 includes all of the Powder River Basin, Wyoming and Montana and the upstream parts of the Cheyenne and Belle Fourche River Basins - a total of 20,676 sq mi. The area has abundant coal (81.2 million tons mined in 1982), but scarce water. The information in the report is intended to describe the hydrology of the 'general area' of any proposed mine. The report represents a summary of results of the water resources investigations of the U.S. Geological Survey, carried out in cooperation with State and other Federal agencies. Each of more than 50 topics is discussed in a brief text that is accompanied by maps, graphs, and other illustrations. Primary topics in the report are: physiography, economic development, surface-water data networks, surface water quantity and quality, and groundwater. The report also contains an extensive description of sources of additional information. (USGS)

  11. Data Mining for Financial Applications

    NASA Astrophysics Data System (ADS)

    Kovalerchuk, Boris; Vityaev, Evgenii

    This chapter describes Data Mining in finance by discussing financial tasks, specifics of methodologies and techniques in this Data Mining area. It includes time dependence, data selection, forecast horizon, measures of success, quality of patterns, hypothesis evaluation, problem ID, method profile, attribute-based and relational methodologies. The second part of the chapter discusses Data Mining models and practice in finance. It covers use of neural networks in portfolio management, design of interpretable trading rules and discovering money laundering schemes using decision rules and relational Data Mining methodology.

  12. Exploring patterns of epigenetic information with data mining techniques.

    PubMed

    Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián

    2013-01-01

    Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could be then used to extract the previous patterns. This work reviews some of the most important applications of data mining to epigenetics.

  13. Upper Cretaceous bituminous coal deposits of the Olmos Formation, Maverick County, Texas

    USGS Publications Warehouse

    Hook, Robert W.; Warwick, Peter D.; SanFilipo, John R.; Warwick, Peter D.; Karlsen, Alexander K.; Merrill, Matthew D.; Valentine, Brett J.

    2011-01-01

    This report describes the bituminous coal deposits of the Olmos Formation (Navarro Group, Upper Cretaceous; Figures 1, 2) of Maverick County in south Texas. Although these were not evaluated quantitatively as part of the current Gulf Coastal Plain coal-resource assessment, a detailed review is presented in this chapter. Prior to the late 1920s, these coal beds were mined underground on a large scale in the vicinity of Eagle Pass, Texas (Figure 1). Since the 1970s, Olmos Formation coals have been mined extensively in both underground and surface mines in nearby Coahuila, Mexico, to supply mine-mouth fuel for power generation at a plant nearby. A tract northeast of Eagle Pass was permitted in the late 1990s for surface mining. In east-central Maverick County, a coalbed methane field is being developed in coal beds of the lower part of the Olmos Formation (Barker et al., 2002; Scott, 2003).

  14. The structure and infrastructure of the global nanotechnology literature

    NASA Astrophysics Data System (ADS)

    Kostoff, Ronald N.; Stump, Jesse A.; Johnson, Dustin; Murday, James S.; Lau, Clifford G. Y.; Tolles, William M.

    2006-08-01

    Text mining is the extraction of useful information from large volumes of text. A text mining analysis of the global open nanotechnology literature was performed. Records from the Science Citation Index (SCI)/Social SCI were analyzed to provide the infrastructure of the global nanotechnology literature (prolific authors/journals/institutions/countries, most cited authors/papers/journals) and the thematic structure (taxonomy) of the global nanotechnology literature, from a science perspective. Records from the Engineering Compendex (EC) were analyzed to provide a taxonomy from a technology perspective. The Far Eastern countries have expanded nanotechnology publication output dramatically in the past decade.

  15. PubMed-EX: a web browser extension to enhance PubMed search with text mining features.

    PubMed

    Tsai, Richard Tzong-Han; Dai, Hong-Jie; Lai, Po-Ting; Huang, Chi-Hsin

    2009-11-15

    PubMed-EX is a browser extension that marks up PubMed search results with additional text-mining information. PubMed-EX's page mark-up, which includes section categorization and gene/disease and relation mark-up, can help researchers to quickly focus on key terms and provide additional information on them. All text processing is performed server-side, freeing up user resources. PubMed-EX is freely available at http://bws.iis.sinica.edu.tw/PubMed-EX and http://iisr.cse.yzu.edu.tw:8000/PubMed-EX/.

  16. Learning in the context of distribution drift

    DTIC Science & Technology

    2017-05-09

    published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative...learner Low-bias learner Aggregated classifier Figure 7: Architecture for learning from streaming data in the context of variable or unknown...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD

  17. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data

    DTIC Science & Technology

    2016-07-20

    Intelligent Automation Incorporated Enhancements for a Dynamic Data Warehousing and Mining ...Page | 2 Intelligent Automation Incorporated Monthly Report No. 4 Enhancements for a Dynamic Data Warehousing and Mining System Large-Scale HSCB...including Top Videos, Top Users, Top Words, and Top Languages, and also applied NER to the text associated with YouTube posts. We have also developed UI for

  18. Enhancements for a Dynamic Data Warehousing and Mining System for Large-Scale HSCB Data

    DTIC Science & Technology

    2016-07-20

    Intelligent Automation Incorporated Enhancements for a Dynamic Data Warehousing and Mining ...Page | 2 Intelligent Automation Incorporated Monthly Report No. 4 Enhancements for a Dynamic Data Warehousing and Mining System Large-Scale HSCB...including Top Videos, Top Users, Top Words, and Top Languages, and also applied NER to the text associated with YouTube posts. We have also developed UI for

  19. 77 FR 73966 - Utah Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-12

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 944 [SATS No. UT-049-FOR; Docket ID OSM-2012-0015] Utah Regulatory Program AGENCY: Office of Surface Mining... Mining Control and Reclamation Act of 1977 (SMCRA or the Act). Utah proposes to revise references to...

  20. 78 FR 63911 - Montana Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-25

    ... DEPARTMENT OF THE INTERIOR Office of Surface Mining Reclamation and Enforcement 30 CFR Part 926...; S2D2SSS08011000 SX066A00033 F13XS501520] Montana Regulatory Program AGENCY: Office of Surface Mining Reclamation... regulatory program (hereinafter, the ``Montana program'') under the Surface Mining Control and Reclamation...

  1. 30 CFR 19.1 - Purpose.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... PRODUCTS ELECTRIC CAP LAMPS § 19.1 Purpose. (a) The purpose of investigations made under this part is to promote the development of electric cap lamps that may be used in mines, especially in mines that may... interested in safe equipment for mines may have information in regard to available permissible electric cap...

  2. 30 CFR 75.1108 - Approved conveyor belts.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 30 Mineral Resources 1 2010-07-01 2010-07-01 false Approved conveyor belts. 75.1108 Section 75.1108 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT OF LABOR COAL MINE SAFETY AND... December 31, 2018 all conveyor belts used in underground coal mines shall be approved under Part 14. [73 FR...

  3. 76 FR 69764 - Petitions for Modification of Application of Existing Mandatory Safety Standards

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-09

    ... DEPARTMENT OF LABOR Mine Safety and Health Administration Petitions for Modification of Application of Existing Mandatory Safety Standards AGENCY: Mine Safety and Health Administration, Labor. ACTION: Notice. SUMMARY: Section 101(c) of the Federal Mine Safety and Health Act of 1977 and 30 CFR part...

  4. Searching for 'Unknown Unknowns'

    NASA Technical Reports Server (NTRS)

    Parsons, Vickie S.

    2005-01-01

    The NASA Engineering and Safety Center (NESC) was established to improve safety through engineering excellence within NASA programs and projects. As part of this goal, methods are being investigated to enable the NESC to become proactive in identifying areas that may be precursors to future problems. The goal is to find unknown indicators of future problems, not to duplicate the program-specific trending efforts. The data that is critical for detecting these indicators exist in a plethora of dissimilar non-conformance and other databases (without a common format or taxonomy). In fact, much of the data is unstructured text. However, one common database is not required if the right standards and electronic tools are employed. Electronic data mining is a particularly promising tool for this effort into unsupervised learning of common factors. This work in progress began with a systematic evaluation of available data mining software packages, based on documented decision techniques using weighted criteria. The four packages, which were perceived to have the most promise for NASA applications, are being benchmarked and evaluated by independent contractors. Preliminary recommendations for "best practices" in data mining and trending are provided. Final results and recommendations should be available in the Fall 2005. This critical first step in identifying "unknown unknowns" before they become problems is applicable to any set of engineering or programmatic data.

  5. Empirical advances with text mining of electronic health records.

    PubMed

    Delespierre, T; Denormandie, P; Bar-Hen, A; Josseran, L

    2017-08-22

    Korian is a private group specializing in medical accommodations for elderly and dependent people. A professional data warehouse (DWH) established in 2010 hosts all of the residents' data. Inside this information system (IS), clinical narratives (CNs) were used only by medical staff as a residents' care linking tool. The objective of this study was to show that, through qualitative and quantitative textual analysis of a relatively small physiotherapy and well-defined CN sample, it was possible to build a physiotherapy corpus and, through this process, generate a new body of knowledge by adding relevant information to describe the residents' care and lives. Meaningful words were extracted through Structured Query Language (SQL) with the LIKE function and wildcards to perform pattern matching, followed by text mining and a word cloud using R® packages. Another step involved principal components and multiple correspondence analyses, plus clustering on the same residents' sample as well as on other health data using a health model measuring the residents' care level needs. By combining these techniques, physiotherapy treatments could be characterized by a list of constructed keywords, and the residents' health characteristics were built. Feeding defects or health outlier groups could be detected, physiotherapy residents' data and their health data were matched, and differences in health situations showed qualitative and quantitative differences in physiotherapy narratives. This textual experiment using a textual process in two stages showed that text mining and data mining techniques provide convenient tools to improve residents' health and quality of care by adding new, simple, useable data to the electronic health record (EHR). When used with a normalized physiotherapy problem list, text mining through information extraction (IE), named entity recognition (NER) and data mining (DM) can provide a real advantage to describe health care, adding new medical material and helping to integrate the EHR system into the health staff work environment.
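
    The first stage described above (SQL LIKE pattern matching followed by word counting) can be illustrated with a minimal sketch. It assumes an in-memory SQLite table; the table name `narratives`, the sample notes, and the helper names are hypothetical and are not taken from the study, which used its own data warehouse and R packages for the text mining step.

```python
import sqlite3
from collections import Counter

# Hypothetical clinical narratives for illustration; not data from the study.
NOTES = [
    (1, "Physiotherapy: walking exercises with frame, balance training"),
    (2, "Resident refused meal; dietician informed"),
    (3, "Physio session: balance training and transfer exercises"),
]


def select_notes(pattern):
    """Return note texts matching a SQL LIKE pattern ('%' matches any run of characters)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE narratives (id INTEGER PRIMARY KEY, note TEXT)")
    conn.executemany("INSERT INTO narratives VALUES (?, ?)", NOTES)
    rows = conn.execute(
        "SELECT note FROM narratives WHERE note LIKE ? ORDER BY id", (pattern,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def word_counts(texts):
    """Count lower-cased words longer than three characters, ignoring edge punctuation."""
    words = []
    for text in texts:
        words.extend(w.strip(".,;:").lower() for w in text.split())
    return Counter(w for w in words if len(w) > 3)


physio_notes = select_notes("%physio%")
counts = word_counts(physio_notes)
```

    SQLite's LIKE is case-insensitive for ASCII letters by default, so `%physio%` selects both "Physiotherapy" and "Physio" notes; other database engines may behave differently. The resulting counts are the kind of input a word cloud would be drawn from.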

  6. Database citation in full text biomedical articles.

    PubMed

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.

  7. Database Citation in Full Text Biomedical Articles

    PubMed Central

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R.

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services. PMID:23734176

  8. HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

    PubMed

    Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

    2015-04-01

    Knowledge of protein-protein interactions (PPI) and their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable to understand the mechanism of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Effect of Name Change of Schizophrenia on Mass Media Between 1985 and 2013 in Japan: A Text Data Mining Analysis.

    PubMed

    Koike, Shinsuke; Yamaguchi, Sosei; Ojio, Yasutaka; Ohta, Kazusa; Ando, Shuntaro

    2016-05-01

    Mass media such as newspapers and TV news affect mental health-related stigma. In Japan, the name of schizophrenia was changed in 2002 for the purposes of stigma reduction; however, little has been known about the effect of name change of schizophrenia on mass media. Articles including old and new names of schizophrenia, depressive disorder, and diabetes mellitus (DM) in headlines and/or text were extracted from 23169092 articles in 4 major Japanese newspapers and 1 TV news program (1985-2013). The trajectory of the number of articles including each term was determined across years. Then, all text in news headlines was segmented as per part-of-speech level using text data mining. Segmented words were classified into 6 categories and in each category of extracted words by target term and period were also tested. Total 51789 and 1106 articles including target terms in newspaper articles and TV news segments were obtained, respectively. The number of articles including the target terms increased across years. Relative increase was observed in the articles published on schizophrenia since 2003 compared with those on DM and between 2000 and 2005 compared with those on depressive disorder. Word tendency used in headlines was equivalent before and after 2002 for the articles including each target term. Articles for schizophrenia contained more negative words than depressive disorder and DM (31.5%, 16.0%, and 8.2%, respectively). Name change of schizophrenia had a limited effect on the articles published and little effect on its contents. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  10. 30 CFR 20.11 - Material required for MSHA records.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ..., EVALUATION, AND APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.11... lamp as approved, are retained. These drawings are used to identify the lamp and its parts in the...) If MSHA so desires, parts of the lamps which are used in the tests will be retained as a permanent...

  11. 30 CFR 20.11 - Material required for MSHA records.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ..., EVALUATION, AND APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.11... lamp as approved, are retained. These drawings are used to identify the lamp and its parts in the...) If MSHA so desires, parts of the lamps which are used in the tests will be retained as a permanent...

  12. 30 CFR 20.11 - Material required for MSHA records.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ..., EVALUATION, AND APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.11... lamp as approved, are retained. These drawings are used to identify the lamp and its parts in the...) If MSHA so desires, parts of the lamps which are used in the tests will be retained as a permanent...

  13. 30 CFR 20.11 - Material required for MSHA records.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ..., EVALUATION, AND APPROVAL OF MINING PRODUCTS ELECTRIC MINE LAMPS OTHER THAN STANDARD CAP LAMPS § 20.11... lamp as approved, are retained. These drawings are used to identify the lamp and its parts in the...) If MSHA so desires, parts of the lamps which are used in the tests will be retained as a permanent...

  14. 76 FR 59058 - Minerals Management: Adjustment of Cost Recovery Fees

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-09-23

    ... mining, oil and gas extraction, and the mining and quarrying of nonmetallic minerals) as an individual... increase New value New fee \\5\\ fee \\1\\ value \\2\\ \\3\\ \\4\\ Oil & Gas (parts 3100, 3110, 3120, 3130, 3150... interest transfer 60 61.94 0.84 62.78 65 Leasing of Solid Minerals Other Than Coal and Oil Shale (parts...

  15. Facilitating Decision Making, Re-Use and Collaboration: A Knowledge Management Approach to Acquisition Program Self-Awareness

    DTIC Science & Technology

    2009-06-01

    capabilities: web-based, relational/multi-dimensional, client/server, and metadata (data about data) inclusion (pp. 39-40). Text mining, on the other...and Organizational Systems (CASOS) (Carley, 2005). Although AutoMap can be used to conduct text-mining, it was utilized only for its visualization...provides insight into how the GMCOI is using the terms, and where there might be redundant terms and need for de-confliction and standardization

  16. Terminologies for text-mining; an experiment in the lipoprotein metabolism domain

    PubMed Central

    Alexopoulou, Dimitra; Wächter, Thomas; Pickersgill, Laura; Eyre, Cecilia; Schroeder, Michael

    2008-01-01

    Background: The engineering of ontologies, especially with a view to a text-mining use, is still a new research field. There does not yet exist a well-defined theory and technology for ontology construction. Many of the ontology design steps remain manual and are based on personal experience and intuition. However, there exist a few efforts on automatic construction of ontologies in the form of extracted lists of terms and relations between them. Results: We share experience acquired during the manual development of a lipoprotein metabolism ontology (LMO) to be used for text-mining. We compare the manually created ontology terms with the automatically derived terminology from four different automatic term recognition (ATR) methods. The top 50 predicted terms contain up to 89% relevant terms. For the top 1000 terms the best method still generates 51% relevant terms. In a corpus of 3066 documents 53% of LMO terms are contained and 38% can be generated with one of the methods. Conclusions: Given high precision, automatic methods can help decrease development time and provide significant support for the identification of domain-specific vocabulary. The coverage of the domain vocabulary depends strongly on the underlying documents. Ontology development for text mining should be performed in a semi-automatic way; taking ATR results as input and following the guidelines we described. Availability: The TFIDF term recognition is available as Web Service, described at PMID:18460175
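
    The abstract names TFIDF as one of the term recognition methods without giving its formula. The following is a generic TF-IDF sketch, not the authors' implementation: the three documents are invented, tokenization is a plain whitespace split, and the score of a term is assumed to be the sum over documents of (relative term frequency) x log(N / document frequency).

```python
import math
from collections import Counter

# Invented mini-corpus for illustration only.
DOCS = [
    "lipoprotein lipase hydrolyses triglycerides in the plasma",
    "hepatic lipase acts on lipoprotein particles in the liver",
    "patients were enrolled in the study",
]


def tfidf_scores(docs):
    """Sum per-document TF-IDF weights for every term in the corpus."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in tokenized:
        term_freq = Counter(tokens)
        for term, count in term_freq.items():
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += (count / len(tokens)) * idf
    return scores


scores = tfidf_scores(DOCS)
candidate_terms = [term for term, score in scores.most_common() if score > 0]
```

    Words that occur in every document ("in", "the") get an inverse document frequency of log(1) = 0 and drop out of the candidate list, which is the property that makes TF-IDF a cheap first filter for domain vocabulary. A real pipeline would add multi-word term detection and a reference corpus.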

  17. Stopping Antidepressants and Anxiolytics as Major Concerns Reported in Online Health Communities: A Text Mining Approach.

    PubMed

    Abbe, Adeline; Falissard, Bruno

    2017-10-23

    Internet is a particularly dynamic way to quickly capture the perceptions of a population in real time. Complementary to traditional face-to-face communication, online social networks help patients to improve self-esteem and self-help. The aim of this study was to use text mining on material from an online forum exploring patients' concerns about treatment (antidepressants and anxiolytics). Concerns about treatment were collected from discussion titles in patients' online community related to antidepressants and anxiolytics. To examine the content of these titles automatically, we used text mining methods, such as word frequency in a document-term matrix and co-occurrence of words using a network analysis. It was thus possible to identify topics discussed on the forum. The forum included 2415 discussions on antidepressants and anxiolytics over a period of 3 years. After a preprocessing step, the text mining algorithm identified the 99 most frequently occurring words in titles, among which were escitalopram, withdrawal, antidepressant, venlafaxine, paroxetine, and effect. Patients' concerns were related to antidepressant withdrawal, the need to share experience about symptoms, effects, and questions on weight gain with some drugs. Patients' expression on the Internet is a potential additional resource in addressing patients' concerns about treatment. Patient profiles are close to that of patients treated in psychiatry. ©Adeline Abbe, Bruno Falissard. Originally published in JMIR Mental Health (http://mental.jmir.org), 23.10.2017.
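
    The two techniques named above, word frequency in a document-term matrix and word co-occurrence, can be sketched as follows. The forum titles and the stop-word list are invented for illustration; the study's own preprocessing is not described in enough detail to reproduce.

```python
from collections import Counter
from itertools import combinations

# Invented discussion titles; not taken from the forum studied.
TITLES = [
    "withdrawal symptoms after stopping escitalopram",
    "escitalopram and weight gain",
    "venlafaxine withdrawal symptoms",
]
STOPWORDS = {"after", "and"}  # assumed minimal stop-word list


def tokenize(title):
    """Lower-case a title and drop stop words."""
    return [w for w in title.lower().split() if w not in STOPWORDS]


def document_term_matrix(titles):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in title i."""
    tokenized = [tokenize(t) for t in titles]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    matrix = [[tokens.count(term) for term in vocab] for tokens in tokenized]
    return vocab, matrix


def cooccurrence(titles):
    """Count how many titles contain each unordered pair of terms."""
    pairs = Counter()
    for title in titles:
        terms = sorted(set(tokenize(title)))
        pairs.update(combinations(terms, 2))
    return pairs


vocab, matrix = document_term_matrix(TITLES)
term_totals = {term: sum(row[j] for row in matrix) for j, term in enumerate(vocab)}
pairs = cooccurrence(TITLES)
```

    Column sums of the matrix give overall word frequencies, and the pair counts are the edge weights of a co-occurrence network: here "symptoms" and "withdrawal" would be linked by an edge of weight 2.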

  18. Coronary artery disease risk assessment from unstructured electronic health records using text mining.

    PubMed

    Jonnagaddala, Jitendra; Liaw, Siaw-Teng; Ray, Pradeep; Kumar, Manish; Chang, Nai-Wen; Dai, Hong-Jie

    2015-12-01

    Coronary artery disease (CAD) often leads to myocardial infarction, which may be fatal. Risk factors can be used to predict CAD, which may subsequently lead to prevention or early intervention. Patient data such as co-morbidities, medication history, social history and family history are required to determine the risk factors for a disease. However, risk factor data are usually embedded in unstructured clinical narratives if the data is not collected specifically for risk assessment purposes. Clinical text mining can be used to extract data related to risk factors from unstructured clinical notes. This study presents methods to extract Framingham risk factors from unstructured electronic health records using clinical text mining and to calculate 10-year coronary artery disease risk scores in a cohort of diabetic patients. We developed a rule-based system to extract risk factors: age, gender, total cholesterol, HDL-C, blood pressure, diabetes history and smoking history. The results showed that the output from the text mining system was reliable, but there was a significant amount of missing data to calculate the Framingham risk score. A systematic approach for understanding missing data was followed by implementation of imputation strategies. An analysis of the 10-year Framingham risk scores for coronary artery disease in this cohort has shown that the majority of the diabetic patients are at moderate risk of CAD. Copyright © 2015 Elsevier Inc. All rights reserved.
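
    A rule-based extractor of the kind described can be approximated with regular expressions. This is only a sketch under stated assumptions: the patterns, field names, and sample note are hypothetical, the study's actual rules are not given in the abstract, and no Framingham score is computed here. Fields that are not found are returned as None, which makes the missing-data problem mentioned above visible.

```python
import re

# Hypothetical patterns; real clinical notes need far more robust rules.
PATTERNS = {
    "age": re.compile(r"(\d{1,3})[- ]year[- ]old", re.I),
    "total_cholesterol": re.compile(r"total cholesterol\D{0,15}(\d{2,3})", re.I),
    "hdl": re.compile(r"\bHDL(?:-C)?\D{0,15}(\d{2,3})", re.I),
    "systolic_bp": re.compile(
        r"\b(?:BP|blood pressure)\D{0,15}(\d{2,3})\s*/\s*\d{2,3}", re.I
    ),
}
NON_SMOKER = re.compile(r"\b(?:non-?smoker|never smoked|denies smoking)\b", re.I)
SMOKER = re.compile(r"\bsmok", re.I)


def extract_risk_factors(note):
    """Return a dict of risk factors; None marks a value missing from the note."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(note)
        found[name] = int(match.group(1)) if match else None

    # Check the negated forms first so "non-smoker" is not read as "smoker".
    if NON_SMOKER.search(note):
        found["smoker"] = False
    elif SMOKER.search(note):
        found["smoker"] = True
    else:
        found["smoker"] = None
    return found


note = "A 62-year-old man with diabetes. BP 142/88. Total cholesterol 210 mg/dL. Current smoker."
factors = extract_risk_factors(note)
missing = [name for name, value in factors.items() if value is None]
```

    For the sample note, HDL is absent and ends up in `missing`; a downstream step would have to impute it or exclude the patient before any risk score could be calculated.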

  19. LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes.

    PubMed

    Cañada, Andres; Capella-Gutierrez, Salvador; Rabal, Obdulia; Oyarzabal, Julen; Valencia, Alfonso; Krallinger, Martin

    2017-07-03

    A considerable effort has been devoted to retrieve systematically information for genes and proteins as well as relationships between them. Despite the importance of chemical compounds and drugs as a central bio-entity in pharmacological and biological research, only a limited number of freely available chemical text-mining/search engine technologies are currently accessible. Here we present LimTox (Literature Mining for Toxicology), a web-based online biomedical search tool with special focus on adverse hepatobiliary reactions. It integrates a range of text mining, named entity recognition and information extraction components. LimTox relies on machine-learning, rule-based, pattern-based and term lookup strategies. This system processes scientific abstracts, a set of full text articles and medical agency assessment reports. Although the main focus of LimTox is on adverse liver events, it enables also basic searches for other organ level toxicity associations (nephrotoxicity, cardiotoxicity, thyrotoxicity and phospholipidosis). This tool supports specialized search queries for: chemical compounds/drugs, genes (with additional emphasis on key enzymes in drug metabolism, namely P450 cytochromes-CYPs) and biochemical liver markers. The LimTox website is free and open to all users and there is no login requirement. LimTox can be accessed at: http://limtox.bioinfo.cnio.es. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Text Mining for Drugs and Chemical Compounds: Methods, Tools and Applications.

    PubMed

    Vazquez, Miguel; Krallinger, Martin; Leitner, Florian; Valencia, Alfonso

    2011-06-01

    Providing prior knowledge about biological properties of chemicals, such as kinetic values, protein targets, or toxic effects, can facilitate many aspects of drug development. Chemical information is rapidly accumulating in all sorts of free text documents like patents, industry reports, or scientific articles, which has motivated the development of specifically tailored text mining applications. Despite the potential gains, chemical text mining still faces significant challenges. One of the most salient is the recognition of chemical entities mentioned in text. To help practitioners contribute to this area, a good portion of this review is devoted to this issue, and presents the basic concepts and principles underlying the main strategies. The technical details are introduced and accompanied by relevant bibliographic references. Other tasks discussed are retrieving relevant articles, identifying relationships between chemicals and other entities, or determining the chemical structures of chemicals mentioned in text. This review also introduces a number of published applications that can be used to build pipelines in topics like drug side effects, toxicity, and protein-disease-compound network analysis. We conclude the review with an outlook on how we expect the field to evolve, discussing its possibilities and its current limitations. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Mining free-text medical records for companion animal enteric syndrome surveillance.

    PubMed

    Anholt, R M; Berezowski, J; Jamal, I; Ribble, C; Stephen, C

    2014-03-01

    Large amounts of animal health care data are present in veterinary electronic medical records (EMR) and they present an opportunity for companion animal disease surveillance. Veterinary patient records are largely in free-text without clinical coding or fixed vocabulary. Text-mining, a computer and information technology application, is needed to identify cases of interest and to add structure to the otherwise unstructured data. In this study EMRs were extracted from veterinary management programs of 12 participating veterinary practices and stored in a data warehouse. Using commercially available text-mining software (WordStat™), we developed a categorization dictionary that could be used to automatically classify and extract enteric syndrome cases from the warehoused electronic medical records. The diagnostic accuracy of the text-miner for retrieving cases of enteric syndrome was measured against human reviewers who independently categorized a random sample of 2500 cases as enteric syndrome positive or negative. Compared to the reviewers, the text-miner retrieved cases with enteric signs with a sensitivity of 87.6% (95%CI, 80.4-92.9%) and a specificity of 99.3% (95%CI, 98.9-99.6%). Automatic and accurate detection of enteric syndrome cases provides an opportunity for community surveillance of enteric pathogens in companion animals. Copyright © 2014 Elsevier B.V. All rights reserved.
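
    As a reading aid, the sensitivity and specificity reported above are ratios over a confusion matrix in which the human reviewers are treated as the reference standard. The sketch below only illustrates those two definitions. The function name and the counts are hypothetical and are not taken from the study, whose per-cell counts are not given in this record.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-positive cases the classifier found / missed.
    tn/fp: reference-negative cases the classifier rejected / wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of true cases that were retrieved
    specificity = tn / (tn + fp)  # share of non-cases correctly left out
    return sensitivity, specificity


# Hypothetical counts for illustration only (NOT the study's data).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=880, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

    With a rare syndrome, specificity is computed over far more cases than sensitivity, which is consistent with the narrower confidence interval reported for specificity.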

  2. Tertiary coals in South Texas: Anomalous cannel-like coals of Webb County (Claiborne Group, Eocene) and lignites of Atascosa County (Jackson Group, Eocene) - Geologic setting, character, source-rock and coal-bed methane potential

    USGS Publications Warehouse

    Warwick, Peter D.; Aubourg, Claire E.; Willett, Jason C.

    1999-01-01

    The coal-bearing Gulf of Mexico Coastal Plain of North America contains a variety of depositional settings and coal types. The coal-bearing region extends westward from Alabama and Mississippi, across Louisiana to the northern part of the Mississippi Embayment, and then southward to eastern Arkansas, Texas and northern Mexico (fig. 1). Most of the coal currently mined in Texas is lignite from the upper part of the Wilcox Group (Paleocene-Eocene) and, in Louisiana, lignite is mined from the lower part of the Wilcox (fig. 2). Gulf Coast coal is used primarily as fuel for mine-mouth electric plants. On this field trip we will visit the only two non-Wilcox coal mining intervals in the Texas-Louisiana Coastal Plain; these include the San Pedro - Santo Tomas bituminous cannel-like coal zone of the Eocene Claiborne Group, and the San Miguel lignite coal zone of the Eocene Jackson Group (fig. 2). Other coal-mining areas in northern Mexico are currently producing bituminous coal from the Cretaceous Olmos Formation of the Navaro Group (fig. 2).

  3. Closedure - Mine Closure Technologies Resource

    NASA Astrophysics Data System (ADS)

    Kauppila, Päivi; Kauppila, Tommi; Pasanen, Antti; Backnäs, Soile; Liisa Räisänen, Marja; Turunen, Kaisa; Karlsson, Teemu; Solismaa, Lauri; Hentinen, Kimmo

    2015-04-01

    Closure of mining operations is an essential part of the development of eco-efficient mining and the Green Mining concept in Finland to reduce the environmental footprint of mining. Closedure is a 2-year joint research project between Geological Survey of Finland and Technical Research Centre of Finland that aims at developing accessible tools and resources for planning, executing and monitoring mine closure. The main outcome of the Closedure project is an updatable wiki technology-based internet platform (http://mineclosure.gtk.fi) in which comprehensive guidance on the mine closure is provided and main methods and technologies related to mine closure are evaluated. Closedure also provides new data on the key issues of mine closure, such as performance of passive water treatment in Finland, applicability of test methods for evaluating cover structures for mining wastes, prediction of water effluents from mine wastes, and isotopic and geophysical methods to recognize contaminant transport paths in crystalline bedrock.

  4. Selected Metals in Sediments and Streams in the Oklahoma Part of the Tri-State Mining District, 2000-2006

    USGS Publications Warehouse

    Andrews, William J.; Becker, Mark F.; Mashburn, Shana L.; Smith, S. Jerrod

    2009-01-01

    The abandoned Tri-State mining district includes 1,188 square miles in northeastern Oklahoma, southeastern Kansas, and southwestern Missouri. The most productive part of the Tri-State mining district was the 40-square mile part in Oklahoma, commonly referred to as 'the Picher mining district' in north-central Ottawa County, Oklahoma. The Oklahoma part of the Tri-State mining district was a primary producing area of lead and zinc in the United States during the first half of the 20th century. Sulfide minerals of cadmium, iron, lead, and zinc that remained in flooded underground mine workings and in mine tailings on the land surface oxidized and dissolved with time, forming a variety of oxide, hydroxide, and hydroxycarbonate metallic minerals on the land surface and in streams that drain the district. Metals in water and sediments in streams draining the mining district can potentially impair the habitat and health of many forms of aquatic and terrestrial life. Lakebed, streambed and floodplain sediments and/or stream water were sampled at 30 sites in the Oklahoma part of the Tri-State mining district by the U.S. Geological Survey and the Oklahoma Department of Environmental Quality from 2000 to 2006 in cooperation with the U.S. Environmental Protection Agency, and the Quapaw and Seneca-Cayuga Tribes of Oklahoma. Aluminum and iron concentrations of several thousand milligrams per kilogram were measured in sediments collected from the upstream end of Grand Lake O' the Cherokees. Manganese and zinc concentrations in those sediments were several hundred milligrams per kilogram. Lead and cadmium concentrations in those sediments were about 10 percent and 0.1 percent of zinc concentrations, respectively. Sediment cores collected in a transect across the floodplain of Tar Creek near Miami, Oklahoma, in 2004 had similar or greater concentrations of those metals than sediment cores collected at the upstream end of Grand Lake O' the Cherokees. 
The greatest concentrations of cadmium, iron, lead, and zinc were detected in sediments beneath an intermittent tributary to Tar Creek, a slough which drains mined areas near Commerce, Oklahoma. In surface water, aluminum and iron concentrations were greatest in the Neosho River, perhaps a result of runoff from areas underlain by shales. The greatest aqueous concentrations of cadmium, lead, manganese, and zinc were measured in water from Tar Creek, the primary small stream draining the Picher mining district with the largest proportion of mined area. Water from the Spring River had greater zinc concentrations than water from the Neosho River, perhaps as a result of a greater proportion of mined area in the Spring River Basin. Dissolved metals concentrations were generally much less than total metals concentrations, except for manganese and zinc at sites on Tar Creek, where seepage of ground water from the mine workings, saturated mine tailings, and/or metalliferous streambed sediments may be sources of these dissolved metals. Iron and lead concentrations generally decreased with increasing streamflow in upstream reaches of Tar Creek, indicating dilution of metals-rich ground water by runoff. Farther downstream in Tar Creek, and in the Neosho and Spring Rivers, metals concentrations tended to increase with increasing streamflow, indicating that most metals in these parts of these streams were associated with runoff and re-suspension of metals precipitated as oxide, hydroxide, and hydroxycarbonate minerals on land surface and streambeds. Estimated total aluminum, cadmium, iron, manganese, and zinc loads generally were greatest in water from the Neosho and Spring Rivers, primarily because of comparatively large streamflows in those rivers. Slight increases in metal loads in the downstream directions on those rivers indicated contributions of metals from inflows of small tributaries such as Tar Creek and from runoff.

  5. Data mining in radiology

    PubMed Central

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-01-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining. PMID:25024513

  6. Analyzing asset management data using data and text mining.

    DOT National Transportation Integrated Search

    2014-07-01

    Predictive models using text from a sample of competitively bid California highway projects have been used to predict a construction project's likely level of cost overrun. A text description of the project and the text of the five largest project line...

  7. Textpresso site-specific recombinases: A text-mining server for the recombinase literature including Cre mice and conditional alleles.

    PubMed

    Urbanski, William M; Condie, Brian G

    2009-12-01

    Textpresso Site Specific Recombinases (http://ssrc.genetics.uga.edu/) is a text-mining web server for searching a database of more than 9,000 full-text publications. The papers and abstracts in this database represent a wide range of topics related to site-specific recombinase (SSR) research tools. Included in the database are most of the papers that report the characterization or use of mouse strains that express Cre recombinase as well as papers that describe or analyze mouse lines that carry conditional (floxed) alleles or SSR-activated transgenes/knockins. The database also includes reports describing SSR-based cloning methods such as the Gateway or the Creator systems, papers reporting the development or use of SSR-based tools in systems such as Drosophila, bacteria, parasites, stem cells, yeast, plants, zebrafish, and Xenopus as well as publications that describe the biochemistry, genetics, or molecular structure of the SSRs themselves. Textpresso Site Specific Recombinases is the only comprehensive text-mining resource available for the literature describing the biology and technical applications of SSRs. (c) 2009 Wiley-Liss, Inc.

  8. Text mining for metabolic pathways, signaling cascades, and protein networks.

    PubMed

    Hoffmann, Robert; Krallinger, Martin; Andres, Eduardo; Tamames, Javier; Blaschke, Christian; Valencia, Alfonso

    2005-05-10

    The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.

  9. TOY SAFETY SURVEILLANCE FROM ONLINE REVIEWS

    PubMed Central

    Winkler, Matt; Abrahams, Alan S.; Gruss, Richard; Ehsani, Johnathan P.

    2016-01-01

    Toy-related injuries account for a significant number of childhood injuries and the prevention of these injuries remains a goal for regulatory agencies and manufacturers. Text-mining is an increasingly prevalent method for uncovering the significance of words using big data. This research sets out to determine the effectiveness of text-mining in uncovering potentially dangerous children’s toys. We develop a danger word list, also known as a ‘smoke word’ list, from injury and recall text narratives. We then use the smoke word lists to score over one million Amazon reviews, with the top scores denoting potential safety concerns. We compare the smoke word list to conventional sentiment analysis techniques, in terms of both word overlap and effectiveness. We find that smoke word lists are highly distinct from conventional sentiment dictionaries and provide a statistically significant method for identifying safety concerns in children’s toy reviews. Our findings indicate that text-mining is, in fact, an effective method for the surveillance of safety concerns in children’s toys and could be a gateway to effective prevention of toy-product-related injuries. PMID:27942092
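
    The scoring step described above can be pictured as a weighted dictionary lookup over each review, followed by a sort. The following is a minimal sketch of that idea, not the authors' implementation: the words, the weights, the tokenizer and the function names are all hypothetical, and the record does not say how the published smoke word list was weighted.

```python
import re
from collections import Counter

# Hypothetical smoke words with illustrative weights (not the published list).
SMOKE_WORDS = {"choke": 3.0, "swallowed": 3.0, "sharp": 2.0, "broke": 1.0}


def smoke_score(review, weights=SMOKE_WORDS):
    """Sum the weights of smoke words in a review, counting repeats."""
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter(tokens)
    return sum(weight * counts[word] for word, weight in weights.items())


def rank_reviews(reviews, weights=SMOKE_WORDS):
    """Return reviews ordered from highest to lowest smoke score."""
    return sorted(reviews, key=lambda r: smoke_score(r, weights), reverse=True)


reviews = [
    "My son loves this train set.",
    "The wheel broke off and left a sharp edge.",
    "Small part came loose; baby almost swallowed it and began to choke.",
]
ranked = rank_reviews(reviews)
for review in ranked:
    print(smoke_score(review), review)
```

    A conventional sentiment dictionary would likely rate the second and third reviews as merely negative. A list built from injury and recall narratives aims to separate safety-relevant complaints from ordinary dissatisfaction.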

  10. Mining geology of the Pond Creek seam, Pikeville Formation, Middle Pennsylvanian, in part of the Eastern Kentucky Coal Field, USA

    USGS Publications Warehouse

    Greb, S.F.; Popp, J.T.

    1999-01-01

    The Pond Creek seam is one of the leading producers of coal in the Eastern Kentucky Coal Field. The geologic factors that affect mining were investigated in several underground mines and categorized in terms of coal thickness, coal quality, and roof control. The limits of mining and thick coal are defined by splitting along the margin of the coal body. Within the coal body, local thickness variation occurs because of (1) leader coal benches filling narrow, elongated depressions, (2) rider coal benches coming near to or merging with the main bench, (3) overthrust coal benches being included along paleochannel margins, (4) cutouts occurring beneath paleochannels, and (5) very hard and unusual rock partings occurring along narrow, elongated trends. In the study area, the coal is mostly mined as a compliance product: sulfur contents are less than 1% and ash yields are less than 10%. Local increases in sulfur occur beneath sandstones, and are inferred to represent post-depositional migration of fluids through porous sands into the coal. Run-of-mine quality is also affected by several mine-roof conditions and trends of densely concentrated rock partings, which lead to increased in- and out-of-seam dilution and overall ash content of the mined coal. Roof control is largely a function of a heterolithic facies mosaic of coastal-estuarine origin, regional fracture trends, and unloading stress related to varying mine depth beneath the surface. Lateral variability of roof facies is the rule in most mines. The largest falls occur beneath modern valleys and parallel fractures, along paleochannel margins, within tidally affected 'stackrock,' and beneath rider coals. Shale spalling, kettlebottoms, and falls within other more isolated facies also occur. Many of the lithofacies, and falls related to bedding weaknesses within or between lithofacies, occur along northeast-southwest trends, which can be projected in advance of mining. Fracture-related falls occur independently of lithofacies trends along northwest-southeast trends, especially beneath modern valleys where overburden thickness decreases sharply. Differentiating roof falls related to these trends can aid in predicting roof quality in advance of mining. The Pond Creek-Lower Elkhorn seam has been an important exploration target because it typically has very low sulfur contents and ash yields. Geologic research in several large Pond Creek mines suggested variability in roof quality and coal thickness. Due to mine access, geologic problems encountered during mining are documented and described.

  11. Miners' Misconceptions of Flow Distribution Within Circuits as a Factor Influencing Underground Mining Accidents.

    NASA Astrophysics Data System (ADS)

    Passaro, Perry David

    Misconceptions can be thought of as naive approaches to problem solving that are perceptually appealing but incorrect and inconsistent with scientific evidence (Piaget, 1929). One type of misconception involves flow distributions within circuits. This concept is important because miners' conceptual errors about flow distribution changes within complex circuits may be in part responsible for fatal mine disasters. Based on the theory that misconceptions of flow distribution changes within circuits were responsible for underground mine disasters involving mine ventilation circuits, a series of studies was undertaken with mining engineering students, professional mining engineers, as well as mine foremen, mine supervisors, mine rescue members, mine maintenance personnel, mining researchers and working miners to identify these conceptual errors and errors in mine ventilation procedures. Results indicate that misconceptions of flow distribution changes within circuits exist in over 70 percent of the subjects sampled. It is assumed that these misconceptions of flow distribution changes within circuits result in errors of judgment when miners are faced with inferring and changing ventilation arrangements when two or more mine sections are connected. Furthermore, it is assumed that these misconceptions are pervasive in the mining industry and may be responsible for at least two mine ventilation disasters. The findings of this study are consistent with Piaget's (1929) model of figurative and operative knowledge. This model states that misconceptions are in part due to a lack of knowledge of dynamic transformations and how to apply content information. Recommendations for future research include the development of an interactive expert system for training miners with ventilation arrangements. 
Such a system would meet the educational recommendations made by Piaget (1973b) by involving a hands-on approach that allows discovery, interaction, the opportunity to make mistakes and to review the cognitive concepts on which the subject relied during his manipulation of the ventilation system.

  12. Effects of surface coal mining and reclamation on ground water in small watersheds in the Allegheny Plateau, Ohio

    USGS Publications Warehouse

    Eberle, Michael; Razem, A.C.

    1985-01-01

    The hydrologic effects of surface coal mining in unlimited areas are difficult to predict, partly because of a lack of adequate data collected before and after mining and reclamation. In order to help provide data to assess the effects of surface mining on the hydrology of small basins in the coal fields of the eastern United States, the U.S. Bureau of Mines sponsored a comprehensive hydrologic study at three sites in the Ohio part of the Eastern Coal Province. These sites are within the unglaciated part of the Allegheny Plateau, and are representative of similar coal-producing areas in Kentucky, West Virginia, and Pennsylvania. The U.S. Geological Survey was responsible for the ground-water phase of the study. The aquifer system at each watershed consisted of two localized perched aquifers (top and middle) above a deeper, more regional aquifer. The premining top aquifer was destroyed by mining in each case, and was replaced by spoils during reclamation. The spoils formed new top aquifers that were slowly becoming resaturated at the end of the study period. Water levels in the aquifers were about the same after reclamation as before mining, although levels rose in a few places. It appears that the underclay at the base of the new top aquifers at all three sites prevents significant downward leakage from the top aquifers to lower aquifers except in places where the layer may have been damaged during mining. Water in the top aquifers is a calcium sulfate type, whereas calcium bicarbonate type water predominated before mining. The median specific conductance of water in the new top aquifers was about 5 times greater than that of the original top aquifers in two of the watersheds, and 1 1/2 times the level of the original top aquifers in the third. Concentrations of dissolved sulfate, iron, and manganese in the top aquifers before mining generally did not exceed U.S. and Ohio Environmental Protection Agency drinking-water limits, but generally exceeded these limits after reclamation. Water-quality changes in the middle aquifers were minor by comparison. Water levels and water quality in the deeper, regional aquifers were unaffected by mining.

  13. An analysis of injury claims from low-seam coal mines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gallagher, S.; Moore, S.; Dempsey, P.G.

    2009-07-01

    The restricted workspace present in low-seam coal mines forces workers to adopt awkward working postures (kneeling and stooping), which place high physical demands on the knee and lower back. This article provides an analysis of injury claims for eight mining companies operating low-seam coal mines during calendar years 1996-2008. All cost data were normalized using data on the cost of medical care (MPI) as provided by the U.S. Bureau of Labor Statistics. Results of the analysis indicate that the knee was the body part that led in terms of claim cost ($4.2 million), followed by injuries to the lower back ($2.7 million). While the average cost per injury for these body parts was $13,100 and $14,400, respectively (close to the average cost of an injury overall), the high frequency of these injuries resulted in their pre-eminence in terms of cost. Analysis of data from individual mining companies suggests that knee and lower back injuries were a consistent problem across companies, as these injuries were each among the top five most costly body parts for seven out of eight companies studied. Results of this investigation suggest that efforts to reduce the frequency of knee and low back injuries in low-seam mines have the potential to create substantial cost savings.
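
    The abstract attributes the cost ranking to frequency rather than to per-claim severity. A rough back-of-the-envelope check is to divide each reported total by its reported average, as sketched below. The function name is hypothetical. The resulting counts are only implied approximations from rounded figures, assuming each total and average refer to the same set of claims; they are not numbers reported by the study.

```python
def implied_claim_count(total_cost, average_cost):
    """Approximate number of claims implied by a total and a per-claim average."""
    return total_cost / average_cost


# Totals and averages as quoted in the abstract (rounded in the source).
knee = implied_claim_count(4.2e6, 13_100)
lower_back = implied_claim_count(2.7e6, 14_400)

print(f"knee: about {knee:.0f} claims")
print(f"lower back: about {lower_back:.0f} claims")
```

    On these rounded inputs the knee has the lower average cost but the higher implied count, which matches the abstract's point that frequency drives the totals.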

  14. 76 FR 41411 - West Virginia Regulatory Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-14

    ... of Environmental Protection (WVDEP). The interim rule provided an opportunity for public comment and... 30 CFR Part 948 Intergovernmental relations, Surface mining, Underground mining. Dated: July 5, 2011...

  15. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus

    PubMed Central

    Comeau, Donald C.; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W. John

    2014-01-01

    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net PMID:24935050

  16. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials

    PubMed Central

    2012-01-01

    Clinical trials are mandatory protocols describing medical research on humans and among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, that in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols. PMID:22595088

  17. ASCOT: a text mining-based web-service for efficient search and assisted creation of clinical trials.

    PubMed

    Korkontzelos, Ioannis; Mu, Tingting; Ananiadou, Sophia

    2012-04-30

    Clinical trials are mandatory protocols describing medical research on humans and among the most valuable sources of medical practice evidence. Searching for trials relevant to some query is laborious due to the immense number of existing protocols. Apart from search, writing new trials includes composing detailed eligibility criteria, which might be time-consuming, especially for new researchers. In this paper we present ASCOT, an efficient search application customised for clinical trials. ASCOT uses text mining and data mining methods to enrich clinical trials with metadata, that in turn serve as effective tools to narrow down search. In addition, ASCOT integrates a component for recommending eligibility criteria based on a set of selected protocols.

  18. Environmental Impact of the Helen, Research, and Chicago Mercury Mines on Water, Sediment, and Biota in the Upper Dry Creek Watershed, Lake County, California

    USGS Publications Warehouse

    Rytuba, James J.; Hothem, Roger L.; May, Jason T.; Kim, Christopher S.; Lawler, David; Goldstein, Daniel; Brussee, Brianne E.

    2009-01-01

    The Helen, Research, and Chicago mercury (Hg) deposits are among the youngest Hg deposits in the Coast Range Hg mineral belt and are located in the southwestern part of the Clear Lake volcanic field in Lake County, California. The mine workings and tailings are located in the headwaters of Dry Creek. The Helen Hg mine is the largest mine in the watershed, having produced about 7,600 flasks of Hg. The Chicago and Research Hg mines produced only a small amount of Hg, less than 30 flasks. Waste rock and tailings have eroded from the mines, and mine drainage from the Helen and Research mines contributes Hg-enriched mine wastes to the headwaters of Dry Creek and contaminates the creek further downstream. The mines are located on federal land managed by the U.S. Bureau of Land Management (USBLM). The USBLM requested that the U.S. Geological Survey (USGS) measure and characterize Hg and geochemical constituents in tailings, sediment, water, and biota at the Helen, Research, and Chicago mines and in Dry Creek. This report is made in response to the USBLM request to conduct a Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) Removal Site Investigation (RSI). The RSI applies to removal of Hg-contaminated mine waste from the Helen, Research, and Chicago mines as a means of reducing Hg transport to Dry Creek. This report summarizes data obtained from field sampling of mine tailings, waste rock, sediment, and water at the Helen, Research, and Chicago mines on April 19, 2001, during a storm event. Further sampling of water, sediment, and biota at the Helen mine area and the upper part of Dry Creek was completed on July 15, 2003, during low-flow conditions. Our results permit a preliminary assessment of the mining sources of Hg and associated chemical constituents that could elevate levels of monomethyl Hg (MMeHg) in the water, sediment, and biota that are impacted by historic mining.

  19. Studies on medicinal herbs for cognitive enhancement based on the text mining of Dongeuibogam and preliminary evaluation of its effects.

    PubMed

    Pak, Malk Eun; Kim, Yu Ri; Kim, Ha Neui; Ahn, Sung Min; Shin, Hwa Kyoung; Baek, Jin Ung; Choi, Byung Tae

    2016-02-17

    In literature on Korean medicine, Dongeuibogam (Treasured Mirror of Eastern Medicine), published in 1613, represents the overall results of the traditional medicines of North-East Asia based on prior medicinal literature of this region. We utilized this medicinal literature by text mining to establish a list of candidate herbs for cognitive enhancement in the elderly and then performed an evaluation of their effects. Text mining was performed for selection of candidate herbs. Cell viability was determined in HT22 hippocampal cells and immunohistochemistry and behavioral analysis was performed in a kainic acid (KA) mice model in order to observe alterations of hippocampal cells and cognition. Twenty four herbs for cognitive enhancement in the elderly were selected by text mining of Dongeuibogam. In HT22 cells, pretreatment with 3 candidate herbs resulted in significantly reduced glutamate-induced cell death. Panax ginseng was the most neuroprotective herb against glutamate-induced cell death. In the hippocampus of a KA mice model, pretreatment with 11 candidate herbs resulted in suppression of caspase-3 expression. Treatment with 7 candidate herbs resulted in significantly enhanced expression levels of phosphorylated cAMP response element binding protein. Number of proliferated cells indicated by BrdU labeling was increased by treatment with 10 candidate herbs. Schisandra chinensis was the most effective herb against cell death and proliferation of progenitor cells and Rehmannia glutinosa in neuroprotection in the hippocampus of a KA mice model. In a KA mice model, we confirmed improved spatial and short memory by treatment with the 3 most effective candidate herbs and these recovered functions were involved in a higher number of newly formed neurons from progenitor cells in the hippocampus. 
These established herbs and their combinations identified by text-mining technique and evaluation for effectiveness may have value in further experimental and clinical applications for cognitive enhancement in the elderly. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Mining the Text: 34 Text Features that Can Ease or Obstruct Text Comprehension and Use

    ERIC Educational Resources Information Center

    White, Sheida

    2012-01-01

    This article presents 34 characteristics of texts and tasks ("text features") that can make continuous (prose), noncontinuous (document), and quantitative texts easier or more difficult for adolescents and adults to comprehend and use. The text features were identified by examining the assessment tasks and associated texts in the national…
