Sample records for processing nlp system

  1. Open Source Clinical NLP - More than Any Single System.

    PubMed

    Masanz, James; Pakhomov, Serguei V; Xu, Hua; Wu, Stephen T; Chute, Christopher G; Liu, Hongfang

    2014-01-01

    The number of Natural Language Processing (NLP) tools and systems for processing clinical free-text has grown as interest and processing capability have surged. Unfortunately any two systems typically cannot simply interoperate, even when both are built upon a framework designed to facilitate the creation of pluggable components. We present two ongoing activities promoting open source clinical NLP. The Open Health Natural Language Processing (OHNLP) Consortium was originally founded to foster a collaborative community around clinical NLP, releasing UIMA-based open source software. OHNLP's mission currently includes maintaining a catalog of clinical NLP software and providing interfaces to simplify the interaction of NLP systems. Meanwhile, Apache cTAKES aims to integrate best-of-breed annotators, providing a world-class NLP system for accessing clinical information within free-text. These two activities are complementary. OHNLP promotes open source clinical NLP activities in the research community and Apache cTAKES bridges research to the health information technology (HIT) practice.

  2. Open Source Clinical NLP – More than Any Single System

    PubMed Central

    Masanz, James; Pakhomov, Serguei V.; Xu, Hua; Wu, Stephen T.; Chute, Christopher G.; Liu, Hongfang

    2014-01-01

    The number of Natural Language Processing (NLP) tools and systems for processing clinical free-text has grown as interest and processing capability have surged. Unfortunately any two systems typically cannot simply interoperate, even when both are built upon a framework designed to facilitate the creation of pluggable components. We present two ongoing activities promoting open source clinical NLP. The Open Health Natural Language Processing (OHNLP) Consortium was originally founded to foster a collaborative community around clinical NLP, releasing UIMA-based open source software. OHNLP’s mission currently includes maintaining a catalog of clinical NLP software and providing interfaces to simplify the interaction of NLP systems. Meanwhile, Apache cTAKES aims to integrate best-of-breed annotators, providing a world-class NLP system for accessing clinical information within free-text. These two activities are complementary. OHNLP promotes open source clinical NLP activities in the research community and Apache cTAKES bridges research to the health information technology (HIT) practice. PMID:25954581

  3. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background: One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results: We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions: We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  4. A common type system for clinical natural language processing.

    PubMed

    Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G

    2013-01-03

    One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  5. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

    PubMed

    Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis

    2017-09-01

    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. Natural language processing: an introduction.

    PubMed

    Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.

  7. Natural language processing: an introduction

    PubMed Central

    Ohno-Machado, Lucila; Chapman, Wendy W

    2011-01-01

    Objectives: To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. Target audience: This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. Scope: We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field. PMID:21846786

  8. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines.

    PubMed

    Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua

    2017-11-24

    Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  9. Natural Language Processing: Toward Large-Scale, Robust Systems.

    ERIC Educational Resources Information Center

    Haas, Stephanie W.

    1996-01-01

    Natural language processing (NLP) is concerned with getting computers to do useful things with natural language. Major applications include machine translation, text generation, information retrieval, and natural language interfaces. Reviews important developments since 1987 that have led to advances in NLP; current NLP applications; and problems…

  10. Challenges in adapting existing clinical natural language processing systems to multiple, diverse health care settings.

    PubMed

    Carrell, David S; Schoen, Robert E; Leffler, Daniel A; Morris, Michele; Rose, Sherri; Baer, Andrew; Crockett, Seth D; Gourevitch, Rebecca A; Dean, Katie M; Mehrotra, Ateev

    2017-09-01

    Widespread application of clinical natural language processing (NLP) systems requires taking existing NLP systems and adapting them to diverse and heterogeneous settings. We describe the challenges faced and lessons learned in adapting an existing NLP system for measuring colonoscopy quality. The data were colonoscopy and pathology reports from 4 settings during 2013-2015, which varied by geographic location, practice type, compensation structure, and electronic health record. Though successful, adaptation required considerably more time and effort than anticipated. Typical NLP challenges in assembling corpora, diverse report structures, and idiosyncratic linguistic content were greatly magnified. Strategies for addressing adaptation challenges include assessing site-specific diversity, setting realistic timelines, leveraging local electronic health record expertise, and undertaking extensive iterative development. More research is needed on how to make it easier to adapt NLP systems to new clinical settings. A key challenge in widespread application of NLP is adapting existing systems to new clinical settings. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  11. Robo-Sensei's NLP-Based Error Detection and Feedback Generation

    ERIC Educational Resources Information Center

    Nagata, Noriko

    2009-01-01

    This paper presents a new version of Robo-Sensei's NLP (Natural Language Processing) system which updates the version currently available as the software package "ROBO-SENSEI: Personal Japanese Tutor" (Nagata, 2004). Robo-Sensei's NLP system includes a lexicon, a morphological generator, a word segmentor, a morphological parser, a syntactic…

  12. HTP-NLP: A New NLP System for High Throughput Phenotyping.

    PubMed

    Schlegel, Daniel R; Crowner, Chris; Lehoullier, Frank; Elkin, Peter L

    2017-01-01

    Secondary use of clinical data for research requires a method to quickly process the data so that researchers can quickly extract cohorts. We present two advances in the High Throughput Phenotyping NLP system which support the aim of truly high throughput processing of clinical data, inspired by a characterization of the linguistic properties of such data. Semantic indexing to store and generalize partially-processed results and the use of compositional expressions for ungrammatical text are discussed, along with a set of initial timing results for the system.

  13. Natural Language Processing in Game Studies Research: An Overview

    ERIC Educational Resources Information Center

    Zagal, Jose P.; Tomuro, Noriko; Shepitsen, Andriy

    2012-01-01

    Natural language processing (NLP) is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language as input and/or output. The authors propose that NLP can also be used for game studies research. In this article, the authors provide an overview of NLP and describe some research possibilities…

  14. Using Natural Language Processing to Improve Efficiency of Manual Chart Abstraction in Research: The Case of Breast Cancer Recurrence

    PubMed Central

    Carrell, David S.; Halgrim, Scott; Tran, Diem-Thy; Buist, Diana S. M.; Chubak, Jessica; Chapman, Wendy W.; Savova, Guergana

    2014-01-01

    The increasing availability of electronic health records (EHRs) creates opportunities for automated extraction of information from clinical text. We hypothesized that natural language processing (NLP) could substantially reduce the burden of manual abstraction in studies examining outcomes, like cancer recurrence, that are documented in unstructured clinical text, such as progress notes, radiology reports, and pathology reports. We developed an NLP-based system using open-source software to process electronic clinical notes from 1995 to 2012 for women with early-stage incident breast cancers to identify whether and when recurrences were diagnosed. We developed and evaluated the system using clinical notes from 1,472 patients receiving EHR-documented care in an integrated health care system in the Pacific Northwest. A separate study provided the patient-level reference standard for recurrence status and date. The NLP-based system correctly identified 92% of recurrences and estimated diagnosis dates within 30 days for 88% of these. Specificity was 96%. The NLP-based system overlooked 5 of 65 recurrences, 4 because electronic documents were unavailable. The NLP-based system identified 5 other recurrences incorrectly classified as nonrecurrent in the reference standard. If used in similar cohorts, NLP could reduce by 90% the number of EHR charts abstracted to identify confirmed breast cancer recurrence cases at a rate comparable to traditional abstraction. PMID:24488511
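
    As a quick arithmetic check of the recall figure in the abstract above, the reported counts (5 of 65 recurrences overlooked) can be plugged into the standard recall formula. This is only an illustrative sketch, not the authors' code; the function name `recall` is our own.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Recall (sensitivity) = TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts taken from the abstract: 65 recurrences in the reference
# standard, of which the NLP-based system overlooked 5.
total_recurrences = 65
missed = 5
detected = total_recurrences - missed

print(f"recall = {recall(detected, missed):.1%}")  # recall = 92.3%
```

    The result, about 92.3%, is consistent with the "92% of recurrences" stated in the abstract.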

  15. Facilitating cancer research using natural language processing of pathology reports.

    PubMed

    Xu, Hua; Anderson, Kristin; Grann, Victor R; Friedman, Carol

    2004-01-01

    Many ongoing clinical research projects, such as projects involving studies associated with cancer, involve manual capture of information in surgical pathology reports so that the information can be used to determine the eligibility of recruited patients for the study and to provide other information, such as cancer prognosis. Natural language processing (NLP) systems offer an alternative to automated coding, but pathology reports have certain features that are difficult for NLP systems. This paper describes how a preprocessor was integrated with an existing NLP system (MedLEE) in order to reduce modification to the NLP system and to improve performance. The work was done in conjunction with an ongoing clinical research project that assesses disparities and risks of developing breast cancer for minority women. An evaluation of the system was performed using manually coded data from the research project's database as a gold standard. The evaluation outcome showed that the extended NLP system had a sensitivity of 90.6% and a precision of 91.6%. Results indicated that this system performed satisfactorily for capturing information for the cancer research project.

  16. The effects of natural language processing on cross-institutional portability of influenza case detection for disease surveillance.

    PubMed

    Ferraro, Jeffrey P; Ye, Ye; Gesteland, Per H; Haug, Peter J; Tsui, Fuchiang Rich; Cooper, Gregory F; Van Bree, Rudy; Ginter, Thomas; Nowalk, Andrew J; Wagner, Michael

    2017-05-31

    This study evaluates the accuracy and portability of a natural language processing (NLP) tool for extracting clinical findings of influenza from clinical notes across two large healthcare systems. Effectiveness is evaluated on how well NLP supports downstream influenza case-detection for disease surveillance. We independently developed two NLP parsers, one at Intermountain Healthcare (IH) in Utah and the other at University of Pittsburgh Medical Center (UPMC) using local clinical notes from emergency department (ED) encounters of influenza. We measured NLP parser performance for the presence and absence of 70 clinical findings indicative of influenza. We then developed Bayesian network models from NLP processed reports and tested their ability to discriminate among cases of (1) influenza, (2) non-influenza influenza-like illness (NI-ILI), and (3) 'other' diagnosis. On Intermountain Healthcare reports, recall and precision of the IH NLP parser were 0.71 and 0.75, respectively, and UPMC NLP parser, 0.67 and 0.79. On University of Pittsburgh Medical Center reports, recall and precision of the UPMC NLP parser were 0.73 and 0.80, respectively, and IH NLP parser, 0.53 and 0.80. Bayesian case-detection performance measured by AUROC for influenza versus non-influenza on Intermountain Healthcare cases was 0.93 (using IH NLP parser) and 0.93 (using UPMC NLP parser). Case-detection on University of Pittsburgh Medical Center cases was 0.95 (using UPMC NLP parser) and 0.83 (using IH NLP parser). For influenza versus NI-ILI on Intermountain Healthcare cases performance was 0.70 (using IH NLP parser) and 0.76 (using UPMC NLP parser). On University of Pittsburgh Medical Center cases, 0.76 (using UPMC NLP parser) and 0.65 (using IH NLP parser). In all but one instance (influenza versus NI-ILI using IH cases), local parsers were more effective at supporting case-detection although performances of non-local parsers were reasonable.

  17. Clinical Natural Language Processing in languages other than English: opportunities and challenges.

    PubMed

    Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra; Savova, Guergana; Zweigenbaum, Pierre

    2018-03-30

    Natural language processing applied to clinical text or aimed at a clinical outcome has been thriving in recent years. This paper offers the first broad overview of clinical Natural Language Processing (NLP) for languages other than English. Recent studies are summarized to offer insights and outline opportunities in this area. We envision three groups of intended readers: (1) NLP researchers leveraging experience gained in other languages, (2) NLP researchers faced with establishing clinical text processing in a language other than English, and (3) clinical informatics researchers and practitioners looking for resources in their languages in order to apply NLP techniques and tools to clinical practice and/or investigation. We review work in clinical NLP in languages other than English. We classify these studies into three groups: (i) studies describing the development of new NLP systems or components de novo, (ii) studies describing the adaptation of NLP architectures developed for English to another language, and (iii) studies focusing on a particular clinical application. We show the advantages and drawbacks of each method, and highlight the appropriate application context. Finally, we identify major challenges and opportunities that will affect the impact of NLP on clinical practice and public health studies in a context that encompasses English as well as other languages.

  18. Clinical documentation variations and NLP system portability: a case study in asthma birth cohorts across institutions.

    PubMed

    Sohn, Sunghwan; Wang, Yanshan; Wi, Chung-Il; Krusemark, Elizabeth A; Ryu, Euijung; Ali, Mir H; Juhn, Young J; Liu, Hongfang

    2017-11-30

    To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability. Birth cohorts from Mayo Clinic and Sanford Children's Hospital (SCH) were used in this study (n = 298 for each). Documentation variations regarding asthma between the 2 cohorts were examined in various aspects: (1) overall corpus at the word level (ie, lexical variation), (2) topics and asthma-related concepts (ie, semantic variation), and (3) clinical note types (ie, process variation). We compared those statistics and explored NLP system portability for asthma ascertainment in 2 stages: prototype and refinement. There exist notable lexical variations (word-level similarity = 0.669) and process variations (differences in major note types containing asthma-related concepts). However, semantic-level corpora were relatively homogeneous (topic similarity = 0.944, asthma-related concept similarity = 0.971). The NLP system for asthma ascertainment had an F-score of 0.937 at Mayo, and produced 0.813 (prototype) and 0.908 (refinement) when applied at SCH. The criteria for asthma ascertainment are largely dependent on asthma-related concepts. Therefore, we believe that semantic similarity is important to estimate NLP system portability. As the Mayo Clinic and SCH corpora were relatively homogeneous at a semantic level, the NLP system, developed at Mayo Clinic, was imported to SCH successfully with proper adjustments to deal with the intrinsic corpus heterogeneity. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  19. Development and Validation of a Natural Language Processing Tool to Identify Patients Treated for Pneumonia across VA Emergency Departments.

    PubMed

    Jones, B E; South, B R; Shao, Y; Lu, C C; Leng, J; Sauer, B C; Gundlapalli, A V; Samore, M H; Zeng, Q

    2018-01-01

    Identifying pneumonia using diagnosis codes alone may be insufficient for research on clinical decision making. Natural language processing (NLP) may enable the inclusion of cases missed by diagnosis codes. This article (1) develops an NLP tool that identifies the clinical assertion of pneumonia from physician emergency department (ED) notes, and (2) compares classification methods using diagnosis codes versus NLP against a gold standard of manual chart review to identify patients initially treated for pneumonia. Among a national population of ED visits occurring between 2006 and 2012 across the Veterans Affairs health system, we extracted 811 physician documents containing search terms for pneumonia for training, and 100 random documents for validation. Two reviewers annotated span- and document-level classifications of the clinical assertion of pneumonia. An NLP tool using a support vector machine was trained on the enriched documents. We extracted diagnosis codes assigned in the ED and upon hospital discharge and calculated performance characteristics for diagnosis codes, NLP, and NLP plus diagnosis codes against manual review in training and validation sets. Among the training documents, 51% contained clinical assertions of pneumonia; in the validation set, 9% were classified with pneumonia, of which 100% contained pneumonia search terms. After enriching with search terms, the NLP system alone demonstrated a recall/sensitivity of 0.72 (training) and 0.55 (validation), and a precision/positive predictive value (PPV) of 0.89 (training) and 0.71 (validation). ED-assigned diagnostic codes demonstrated lower recall/sensitivity (0.48 and 0.44) but higher precision/PPV (0.95 in training, 1.0 in validation); the NLP system identified more "possible-treated" cases than diagnostic coding. An approach combining NLP and ED-assigned diagnostic coding classification achieved the best performance (sensitivity 0.89 and PPV 0.80). System-wide application of NLP to clinical text can increase capture of initial diagnostic hypotheses, an important inclusion when studying diagnosis and clinical decision-making under uncertainty. Schattauer GmbH Stuttgart.

  20. What can Natural Language Processing do for Clinical Decision Support?

    PubMed Central

    Demner-Fushman, Dina; Chapman, Wendy W.; McDonald, Clement J.

    2009-01-01

    Computerized Clinical Decision Support (CDS) aims to aid decision making of health care providers and the public by providing easily accessible health-related information at the point and time it is needed. Natural Language Processing (NLP) is instrumental in using free-text information to drive CDS, representing clinical knowledge and CDS interventions in standardized formats, and leveraging clinical narrative. The early innovative NLP research of clinical narrative was followed by a period of stable research conducted at the major clinical centers and a shift of mainstream interest to biomedical NLP. This review primarily focuses on the recently renewed interest in development of fundamental NLP methods and advances in the NLP systems for CDS. The current solutions to challenges posed by distinct sublanguages, intended user groups, and support goals are discussed. PMID:19683066

  1. A Cloud-based Approach to Medical NLP

    PubMed Central

    Chard, Kyle; Russell, Michael; Lussier, Yves A.; Mendonça, Eneida A; Silverstein, Jonathan C.

    2011-01-01

    Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN. PMID:22195072

  2. A cloud-based approach to medical NLP.

    PubMed

    Chard, Kyle; Russell, Michael; Lussier, Yves A; Mendonça, Eneida A; Silverstein, Jonathan C

    2011-01-01

    Natural Language Processing (NLP) enables access to deep content embedded in medical texts. To date, NLP has not fulfilled its promise of enabling robust clinical encoding, clinical use, quality improvement, and research. We submit that this is in part due to poor accessibility, scalability, and flexibility of NLP systems. We describe here an approach and system which leverages cloud-based approaches such as virtual machines and Representational State Transfer (REST) to extract, process, synthesize, mine, compare/contrast, explore, and manage medical text data in a flexibly secure and scalable architecture. Available architectures in which our Smntx (pronounced as semantics) system can be deployed include: virtual machines in a HIPAA-protected hospital environment, brought up to run analysis over bulk data and destroyed in a local cloud; a commercial cloud for a large complex multi-institutional trial; and within other architectures such as caGrid, i2b2, or NHIN.

  3. From Sour Grapes to Low-Hanging Fruit: A Case Study Demonstrating a Practical Strategy for Natural Language Processing Portability.

    PubMed

    Johnson, Stephen B; Adekkanattu, Prakash; Campion, Thomas R; Flory, James; Pathak, Jyotishman; Patterson, Olga V; DuVall, Scott L; Major, Vincent; Aphinyanaphongs, Yindalon

    2018-01-01

    Natural Language Processing (NLP) holds potential for patient care and clinical research, but a gap exists between promise and reality. While some studies have demonstrated portability of NLP systems across multiple sites, challenges remain. Strategies to mitigate these challenges can strive for complex NLP problems using advanced methods (hard-to-reach fruit), or focus on simple NLP problems using practical methods (low-hanging fruit). This paper investigates a practical strategy for NLP portability using extraction of left ventricular ejection fraction (LVEF) as a use case. We used a tool developed at the Department of Veterans Affairs (VA) to extract the LVEF values from free-text echocardiograms in the MIMIC-III database. The approach showed an accuracy of 98.4%, sensitivity of 99.4%, a positive predictive value of 98.7%, and F-score of 99.0%. This experience, in which a simple NLP solution proved highly portable with excellent performance, illustrates the point that simple NLP applications may be easier to disseminate and adapt, and in the short term may prove more useful, than complex applications.
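
    The F-score reported in the abstract above can be related to the other two figures if one assumes it is the balanced F1 measure, i.e. the harmonic mean of positive predictive value (precision) and sensitivity (recall). The abstract does not state which F-score variant was used, so the sketch below is an assumption, and the function name `f_score` is our own.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed variant)."""
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract.
ppv = 0.987          # positive predictive value (precision)
sensitivity = 0.994  # sensitivity (recall)

print(f"F-score = {f_score(ppv, sensitivity):.1%}")  # F-score = 99.0%
```

    Under that assumption the computed value matches the reported 99.0%.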

  4. Building a Natural Language Processing Tool to Identify Patients With High Clinical Suspicion for Kawasaki Disease from Emergency Department Notes.

    PubMed

    Doan, Son; Maehara, Cleo K; Chaparro, Juan D; Lu, Sisi; Liu, Ruiling; Graham, Amanda; Berry, Erika; Hsu, Chun-Nan; Kanegaye, John T; Lloyd, David D; Ohno-Machado, Lucila; Burns, Jane C; Tremoulet, Adriana H

    2016-05-01

    Delayed diagnosis of Kawasaki disease (KD) may lead to serious cardiac complications. We sought to create and test the performance of a natural language processing (NLP) tool, the KD-NLP, in the identification of emergency department (ED) patients for whom the diagnosis of KD should be considered. We developed an NLP tool that recognizes the KD diagnostic criteria based on standard clinical terms and medical word usage using 22 pediatric ED notes augmented by Unified Medical Language System vocabulary. With high suspicion for KD defined as fever and three or more KD clinical signs, KD-NLP was applied to 253 ED notes from children ultimately diagnosed with either KD or another febrile illness. We evaluated KD-NLP performance against ED notes manually reviewed by clinicians and compared the results to a simple keyword search. KD-NLP identified high-suspicion patients with a sensitivity of 93.6% and specificity of 77.5% compared to notes manually reviewed by clinicians. The tool outperformed a simple keyword search (sensitivity = 41.0%; specificity = 76.3%). KD-NLP showed comparable performance to clinician manual chart review for identification of pediatric ED patients with a high suspicion for KD. This tool could be incorporated into the ED electronic health record system to alert providers to consider the diagnosis of KD. KD-NLP could serve as a model for decision support for other conditions in the ED. © 2016 by the Society for Academic Emergency Medicine.

  5. The use of natural language processing on pediatric diagnostic radiology reports in the electronic health record to identify deep venous thrombosis in children.

    PubMed

    Gálvez, Jorge A; Pappas, Janine M; Ahumada, Luis; Martin, John N; Simpao, Allan F; Rehman, Mohamed A; Witmer, Char

    2017-10-01

    Venous thromboembolism (VTE) is a potentially life-threatening condition that includes both deep vein thrombosis (DVT) and pulmonary embolism. We sought to improve detection and reporting of children with a new diagnosis of VTE by applying natural language processing (NLP) tools to radiologists' reports. We validated an NLP tool, Reveal NLP (Health Fidelity Inc, San Mateo, CA) and inference rules engine's performance in identifying reports with deep venous thrombosis using a curated set of ultrasound reports. We then configured the NLP tool to scan all available radiology reports on a daily basis for studies that met criteria for VTE between July 1, 2015, and March 31, 2016. The NLP tool and inference rules engine correctly identified 140 out of 144 reports with positive DVT findings and 98 out of 106 negative reports in the validation set. The tool's sensitivity was 97.2% (95% CI 93-99.2%), specificity was 92.5% (95% CI 85.7-96.7%). Subsequently, the NLP tool and inference rules engine processed 6373 radiology reports from 3371 hospital encounters. The NLP tool and inference rules engine identified 178 positive reports and 3193 negative reports with a sensitivity of 82.9% (95% CI 74.8-89.2) and specificity of 97.5% (95% CI 96.9-98). The system functions well as a safety net to screen patients for HA-VTE on a daily basis and offers value as an automated, redundant system. To our knowledge, this is the first pediatric study to apply NLP technology in a prospective manner for HA-VTE identification.
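
    The validation-set sensitivity and specificity follow directly from the counts in the abstract (140 of 144 positive reports and 98 of 106 negative reports correctly identified). A minimal sketch of that arithmetic, with function names chosen here for illustration:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly positive reports that the tool flagged as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly negative reports that the tool flagged as negative."""
    return true_neg / (true_neg + false_pos)


# Counts taken from the validation set described in the abstract.
sens = sensitivity(true_pos=140, false_neg=144 - 140)
spec = specificity(true_neg=98, false_pos=106 - 98)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

    The confidence intervals quoted in the abstract are not reproduced here, because the abstract does not state which interval method was used.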

  6. Role of PROLOG (Programming and Logic) in natural-language processing. Report for September-December 1987

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McHale, M.L.

    The field of artificial intelligence strives to produce computer programs that exhibit intelligent behavior. One of the areas of interest is the processing of natural language. This report discusses the role of the computer language PROLOG in Natural Language Processing (NLP) both from theoretic and pragmatic viewpoints. The reasons for using PROLOG for NLP are numerous. First, linguists can write natural-language grammars almost directly as PROLOG programs; this allows fast-prototyping of NLP systems and facilitates analysis of NLP theories. Second, semantic representations of natural-language texts that use logic formalisms are readily produced in PROLOG because of PROLOG's logical foundations. Third, PROLOG's built-in inferencing mechanisms are often sufficient for inferences on the logical forms produced by NLPs. Fourth, the logical, declarative nature of PROLOG may make it the language of choice for parallel computing systems. Finally, the fact that PROLOG has a de facto standard (Edinburgh) makes the porting of code from one computer system to another virtually trouble free. Perhaps the strongest tie one could make between NLP and PROLOG was stated by John Stuart Mill in his inaugural Address at St. Andrews: The structure of every sentence is a lesson in logic.

  7. Natural language processing of clinical notes for identification of critical limb ischemia.

    PubMed

    Afzal, Naveed; Mallipeddi, Vishnu Priya; Sohn, Sunghwan; Liu, Hongfang; Chaudhry, Rajeev; Scott, Christopher G; Kullo, Iftikhar J; Arruda-Olson, Adelaide M

    2018-03-01

    Critical limb ischemia (CLI) is a complication of advanced peripheral artery disease (PAD) with diagnosis based on the presence of clinical signs and symptoms. However, automated identification of cases from electronic health records (EHRs) is challenging due to absence of a single definitive International Classification of Diseases (ICD-9 or ICD-10) code for CLI. In this study, we extend a previously validated natural language processing (NLP) algorithm for PAD identification to develop and validate a subphenotyping NLP algorithm (CLI-NLP) for identification of CLI cases from clinical notes. We compared performance of the CLI-NLP algorithm with CLI-related ICD-9 billing codes. The gold standard for validation was human abstraction of clinical notes from EHRs. Compared to billing codes the CLI-NLP algorithm had higher positive predictive value (PPV) (CLI-NLP 96%, billing codes 67%, p < 0.001), specificity (CLI-NLP 98%, billing codes 74%, p < 0.001) and F1-score (CLI-NLP 90%, billing codes 76%, p < 0.001). The sensitivity of these two methods was similar (CLI-NLP 84%; billing codes 88%; p < 0.12). The CLI-NLP algorithm for identification of CLI from narrative clinical notes in an EHR had excellent PPV and has potential for translation to patient care as it will enable automated identification of CLI cases for quality projects, clinical decision support tools and support a learning healthcare system. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  8. A natural language processing program effectively extracts key pathologic findings from radical prostatectomy reports.

    PubMed

    Kim, Brian J; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil A; Contreras, Richard; Jacobsen, Steven J; Chien, Gary W

    2014-12-01

    Natural language processing (NLP) software programs have been widely developed to transform complex free text into simplified organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included the TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a gold standard compiled by two blinded manual reviewers for 100 random pathology reports. NLP demonstrated 100% accuracy for identifying the Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in <1 second, whereas the manual reviewers averaged 3.2 minutes per report. This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. 
It may also facilitate future urologic research through the rapid and automated creation of large databases.

  9. Usability Evaluation of an Unstructured Clinical Document Query Tool for Researchers.

    PubMed

    Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B

    2018-01-01

    Natural Language Processing - Patient Information Extraction for Researchers (NLP-PIER) was developed for clinical researchers for self-service Natural Language Processing (NLP) queries with clinical notes. This study conducted a user-centered analysis with clinical researchers to gain insight into NLP-PIER's usability and to understand the needs of clinical researchers when using an application for searching clinical notes. Clinical researcher participants (n=11) completed tasks using the system's two existing search interfaces and completed a set of surveys and an exit interview. Quantitative data including time on task, task completion rate, and survey responses were collected. Interviews were analyzed qualitatively. Survey scores, time on task and task completion proportions varied widely. Qualitative analysis indicated that participants found the system to be useful and usable in specific projects. This study identified several usability challenges and our findings will guide the improvement of NLP-PIER's interfaces.

  10. Using automatically extracted information from mammography reports for decision-support

    PubMed Central

    Bozkurt, Selen; Gimenez, Francisco; Burnside, Elizabeth S.; Gulkesen, Kemal H.; Rubin, Daniel L.

    2016-01-01

    Objective To evaluate a system we developed that connects natural language processing (NLP) for information extraction from narrative text mammography reports with a Bayesian network for decision-support about breast cancer diagnosis. The ultimate goal of this system is to provide decision support as part of the workflow of producing the radiology report. Materials and methods We built a system that uses an NLP information extraction system (which extracts BI-RADS descriptors and clinical information from mammography reports) to provide the necessary inputs to a Bayesian network (BN) decision support system (DSS) that estimates lesion malignancy from BI-RADS descriptors. We used this integrated system to predict diagnosis of breast cancer from radiology text reports and evaluated it with a reference standard of 300 mammography reports. We collected two different outputs from the DSS: (1) the probability of malignancy and (2) the BI-RADS final assessment category. Since NLP may produce imperfect inputs to the DSS, we compared the difference between using perfect (“reference standard”) structured inputs to the DSS (“RS-DSS”) vs NLP-derived inputs (“NLP-DSS”) on the output of the DSS using the concordance correlation coefficient. We measured the classification accuracy of the BI-RADS final assessment category when using NLP-DSS, compared with the ground truth category established by the radiologist. Results The NLP-DSS and RS-DSS had closely matched probabilities, with a mean paired difference of 0.004 ± 0.025. The concordance correlation of these paired measures was 0.95. The accuracy of the NLP-DSS to predict the correct BI-RADS final assessment category was 97.58%. Conclusion The accuracy of the information extracted from mammography reports using the NLP system was sufficient to provide accurate DSS results. 
We believe our system could ultimately reduce the variation in practice in mammography related to assessment of malignant lesions and improve management decisions. PMID:27388877
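
    The agreement measure named in the abstract, the concordance correlation coefficient, can be sketched as follows. This assumes Lin's standard definition with population (1/n) moments; the abstract does not give the exact estimator, and the paired probabilities below are made-up illustrative values, not study data.

```python
from statistics import fmean


def concordance_correlation(x: list[float], y: list[float]) -> float:
    """Lin's concordance correlation coefficient for paired measurements."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) variance and covariance.
    var_x = fmean([(a - mean_x) ** 2 for a in x])
    var_y = fmean([(b - mean_y) ** 2 for b in y])
    cov_xy = fmean([(a - mean_x) * (b - mean_y) for a, b in zip(x, y)])
    # Penalizes both scatter and any systematic shift between the two series.
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical malignancy probabilities from reference-standard vs NLP-derived inputs.
rs_dss = [0.10, 0.50, 0.90]
nlp_dss = [0.20, 0.60, 1.00]
print(round(concordance_correlation(rs_dss, nlp_dss), 3))
```

    Unlike the Pearson correlation, this coefficient drops below 1 when one series is consistently offset from the other, which is why it suits a comparison of two routes to the same probability.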

  11. Natural Language Processing Accurately Calculates Adenoma and Sessile Serrated Polyp Detection Rates.

    PubMed

    Nayor, Jennifer; Borges, Lawrence F; Goryachev, Sergey; Gainer, Vivian S; Saltzman, John R

    2018-07-01

    Adenoma detection rate (ADR) is a widely used colonoscopy quality indicator. Calculation of ADR is labor-intensive and cumbersome using current electronic medical databases. Natural language processing (NLP) is a method used to extract meaning from unstructured or free text data. We aimed to develop and validate an accurate automated process for calculation of ADR and serrated polyp detection rate (SDR) on data stored in widely used electronic health record systems, specifically Epic electronic health record system, Provation® endoscopy reporting system, and Sunquest PowerPath pathology reporting system. Screening colonoscopies performed between June 2010 and August 2015 were identified using the Provation® reporting tool. An NLP pipeline was developed to identify adenomas and sessile serrated polyps (SSPs) on pathology reports corresponding to these colonoscopy reports. The pipeline was validated using a manual search. Precision, recall, and effectiveness of the natural language processing pipeline were calculated. ADR and SDR were then calculated. We identified 8032 screening colonoscopies that were linked to 3821 pathology reports (47.6%). The NLP pipeline had an accuracy of 100% for adenomas and 100% for SSPs. Mean total ADR was 29.3% (range 14.7-53.3%); mean male ADR was 35.7% (range 19.7-62.9%); and mean female ADR was 24.9% (range 9.1-51.0%). Mean total SDR was 4.0% (0-9.6%). We developed and validated an NLP pipeline that accurately and automatically calculates ADRs and SDRs using data stored in Epic, Provation® and Sunquest PowerPath. This NLP pipeline can be used to evaluate colonoscopy quality parameters at both individual and practice levels.
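
    Once a pipeline has labeled each screening colonoscopy, the detection-rate calculation itself is a simple proportion. The sketch below assumes the usual definition (share of screening colonoscopies with at least one adenoma found) computed per endoscopist; the record layout and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def detection_rates(procedures: list[tuple[str, bool]]) -> dict[str, float]:
    """Per-endoscopist detection rate from (endoscopist, lesion_detected) pairs."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for endoscopist, detected in procedures:
        totals[endoscopist] += 1
        hits[endoscopist] += int(detected)
    return {name: hits[name] / totals[name] for name in totals}


# Toy data: one tuple per screening colonoscopy (not study data).
procedures = [
    ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]
rates = detection_rates(procedures)
print(rates)
```

    The same function would serve for SDR by passing a flag for sessile serrated polyps instead of adenomas.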

  12. Mining Peripheral Arterial Disease Cases from Narrative Clinical Notes Using Natural Language Processing

    PubMed Central

    Afzal, Naveed; Sohn, Sunghwan; Abram, Sara; Scott, Christopher G.; Chaudhry, Rajeev; Liu, Hongfang; Kullo, Iftikhar J.; Arruda-Olson, Adelaide M.

    2016-01-01

    Objective Lower extremity peripheral arterial disease (PAD) is highly prevalent and affects millions of individuals worldwide. We developed a natural language processing (NLP) system for automated ascertainment of PAD cases from clinical narrative notes and compared the performance of the NLP algorithm to billing code algorithms, using ankle-brachial index (ABI) test results as the gold standard. Methods We compared the performance of the NLP algorithm to 1) results of gold standard ABI; 2) previously validated algorithms based on relevant ICD-9 diagnostic codes (simple model) and 3) a combination of ICD-9 codes with procedural codes (full model). A dataset of 1,569 PAD patients and controls was randomly divided into training (n= 935) and testing (n= 634) subsets. Results We iteratively refined the NLP algorithm in the training set including narrative note sections, note types and service types, to maximize its accuracy. In the testing dataset, when compared with both simple and full models, the NLP algorithm had better accuracy (NLP: 91.8%, full model: 81.8%, simple model: 83%, P<.001), PPV (NLP: 92.9%, full model: 74.3%, simple model: 79.9%, P<.001), and specificity (NLP: 92.5%, full model: 64.2%, simple model: 75.9%, P<.001). Conclusions A knowledge-driven NLP algorithm for automatic ascertainment of PAD cases from clinical notes had greater accuracy than billing code algorithms. Our findings highlight the potential of NLP tools for rapid and efficient ascertainment of PAD cases from electronic health records to facilitate clinical investigation and eventually improve care by clinical decision support. PMID:28189359

  13. Natural language processing, pragmatics, and verbal behavior

    PubMed Central

    Cherpas, Chris

    1992-01-01

    Natural Language Processing (NLP) is that part of Artificial Intelligence (AI) concerned with endowing computers with verbal and listener repertoires, so that people can interact with them more easily. Most attention has been given to accurately parsing and generating syntactic structures, although NLP researchers are finding ways of handling the semantic content of language as well. It is increasingly apparent that understanding the pragmatic (contextual and consequential) dimension of natural language is critical for producing effective NLP systems. While there are some techniques for applying pragmatics in computer systems, they are piecemeal, crude, and lack an integrated theoretical foundation. Unfortunately, there is little awareness that Skinner's (1957) Verbal Behavior provides an extensive, principled pragmatic analysis of language. The implications of Skinner's functional analysis for NLP and for verbal aspects of epistemology lead to a proposal for a “user expert”—a computer system whose area of expertise is the long-term computer user. The evolutionary nature of behavior suggests an AI technology known as genetic algorithms/programming for implementing such a system. PMID:22477052

  14. Analyzing Discourse Processing Using a Simple Natural Language Processing Tool

    ERIC Educational Resources Information Center

    Crossley, Scott A.; Allen, Laura K.; Kyle, Kristopher; McNamara, Danielle S.

    2014-01-01

    Natural language processing (NLP) provides a powerful approach for discourse processing researchers. However, there remains a notable degree of hesitation by some researchers to consider using NLP, at least on their own. The purpose of this article is to introduce and make available a "simple" NLP (SiNLP) tool. The overarching goal of…

  15. Integrating UIMA annotators in a web-based text processing framework.

    PubMed

    Chen, Xiang; Arnold, Corey W

    2013-01-01

    The Unstructured Information Management Architecture (UIMA) [1] framework is a growing platform for natural language processing (NLP) applications. However, such applications may be difficult for non-technical users to deploy. This project presents a web-based framework that wraps UIMA-based annotator systems into a graphical user interface for researchers and clinicians, and a web service for developers. An annotator that extracts data elements from lung cancer radiology reports is presented to illustrate the use of the system. Annotation results from the web system can be exported to multiple formats for users to utilize in other aspects of their research and workflow. This project demonstrates the benefits of a lay-user interface for complex NLP applications. Efforts such as this can lead to increased interest and support for NLP work in the clinical domain.

  16. Using rule-based natural language processing to improve disease normalization in biomedical text.

    PubMed

    Kang, Ning; Singh, Bharat; Afzal, Zubair; van Mulligen, Erik M; Kors, Jan A

    2013-01-01

    In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization. We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching. Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, performances further increased to 85.5% and 85.4%, and to 73.6% and 73.3% for concept identifier matching. We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated.

  17. Hierarchical semantic structures for medical NLP.

    PubMed

    Taira, Ricky K; Arnold, Corey W

    2013-01-01

    We present a framework for building a medical natural language processing (NLP) system capable of deep understanding of clinical text reports. The framework helps developers understand how various NLP-related efforts and knowledge sources can be integrated. The aspects considered include: 1) computational issues dealing with defining layers of intermediate semantic structures to reduce the dimensionality of the NLP problem; 2) algorithmic issues in which we survey the NLP literature and discuss state-of-the-art procedures used to map between various levels of the hierarchy; and 3) implementation issues to software developers with available resources. The objective of this poster is to educate readers to the various levels of semantic representation (e.g., word level concepts, ontological concepts, logical relations, logical frames, discourse structures, etc.). The poster presents an architecture for which diverse efforts and resources in medical NLP can be integrated in a principled way.

  18. Mining peripheral arterial disease cases from narrative clinical notes using natural language processing.

    PubMed

    Afzal, Naveed; Sohn, Sunghwan; Abram, Sara; Scott, Christopher G; Chaudhry, Rajeev; Liu, Hongfang; Kullo, Iftikhar J; Arruda-Olson, Adelaide M

    2017-06-01

    Lower extremity peripheral arterial disease (PAD) is highly prevalent and affects millions of individuals worldwide. We developed a natural language processing (NLP) system for automated ascertainment of PAD cases from clinical narrative notes and compared the performance of the NLP algorithm with billing code algorithms, using ankle-brachial index test results as the gold standard. We compared the performance of the NLP algorithm to (1) results of gold standard ankle-brachial index; (2) previously validated algorithms based on relevant International Classification of Diseases, Ninth Revision diagnostic codes (simple model); and (3) a combination of International Classification of Diseases, Ninth Revision codes with procedural codes (full model). A dataset of 1569 patients with PAD and controls was randomly divided into training (n = 935) and testing (n = 634) subsets. We iteratively refined the NLP algorithm in the training set including narrative note sections, note types, and service types, to maximize its accuracy. In the testing dataset, when compared with both simple and full models, the NLP algorithm had better accuracy (NLP, 91.8%; full model, 81.8%; simple model, 83%; P < .001), positive predictive value (NLP, 92.9%; full model, 74.3%; simple model, 79.9%; P < .001), and specificity (NLP, 92.5%; full model, 64.2%; simple model, 75.9%; P < .001). A knowledge-driven NLP algorithm for automatic ascertainment of PAD cases from clinical notes had greater accuracy than billing code algorithms. Our findings highlight the potential of NLP tools for rapid and efficient ascertainment of PAD cases from electronic health records to facilitate clinical investigation and eventually improve care by clinical decision support. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  19. Does Evidence-Based PTS Treatment Reduce PTS Symptoms and Suicide in Iraq and Afghanistan Veterans Seeking VA Care

    DTIC Science & Technology

    We succeeded in developing a Natural Language Processing (NLP) System with excellent performance characteristics for determining the type of...people (quadruple-annotated) and 7,226 of which were double annotated. We also developed an NLP system to extract PT Checklist (PCL) scores from clinical notes with excellent accuracy (98 positive predictive value).

  20. Usability Evaluation of NLP-PIER: A Clinical Document Search Engine for Researchers.

    PubMed

    Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B

    2017-01-01

    NLP-PIER (Natural Language Processing - Patient Information Extraction for Research) is a self-service platform with a search engine for clinical researchers to perform natural language processing (NLP) queries using clinical notes. We conducted user-centered testing of NLP-PIER's usability to inform future design decisions. Quantitative and qualitative data were analyzed. Our findings will be used to improve the usability of NLP-PIER.

  1. NOBLE - Flexible concept recognition for large-scale biomedical natural language processing.

    PubMed

    Tseytlin, Eugene; Mitchell, Kevin; Legowski, Elizabeth; Corrigan, Julia; Chavan, Girish; Jacobson, Rebecca S

    2016-01-14

    Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and clinical corpus. NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator. We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems. NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.

  2. Comparing ICD9-encoded diagnoses and NLP-processed discharge summaries for clinical trials pre-screening: a case study.

    PubMed

    Li, Li; Chase, Herbert S; Patel, Chintan O; Friedman, Carol; Weng, Chunhua

    2008-11-06

    The prevalence of electronic medical record (EMR) systems has made mass-screening for clinical trials viable through secondary uses of clinical data, which often exist in both structured and free text formats. The tradeoffs of using information in either data format for clinical trials screening are understudied. This paper compares the results of clinical trial eligibility queries over ICD9-encoded diagnoses and NLP-processed textual discharge summaries. The strengths and weaknesses of both data sources are summarized along the following dimensions: information completeness, expressiveness, code granularity, and accuracy of temporal information. We conclude that NLP-processed patient reports supplement important information for eligibility screening and should be used in combination with structured data.

  3. Natural Language Processing-Enabled and Conventional Data Capture Methods for Input to Electronic Health Records: A Comparative Usability Study.

    PubMed

    Kaufman, David R; Sheehan, Barbara; Stetson, Peter; Bhatt, Ashish R; Field, Adele I; Patel, Chirag; Maisel, James Mark

    2016-10-28

    The process of documentation in electronic health records (EHRs) is known to be time consuming, inefficient, and cumbersome. The use of dictation coupled with manual transcription has become an increasingly common practice. In recent years, natural language processing (NLP)-enabled data capture has become a viable alternative for data entry. It enables the clinician to maintain control of the process and potentially reduce the documentation burden. The question remains how this NLP-enabled workflow will impact EHR usability and whether it can meet the structured data and other EHR requirements while enhancing the user's experience. The objective of this study is to evaluate the comparative effectiveness of an NLP-enabled data capture method using dictation and data extraction from transcribed documents (NLP Entry) in terms of documentation time, documentation quality, and usability versus standard EHR keyboard-and-mouse data entry. This formative study investigated the results of using 4 combinations of NLP Entry and Standard Entry methods ("protocols") of EHR data capture. We compared a novel dictation-based protocol using MediSapien NLP (NLP-NLP) for structured data capture against a standard structured data capture protocol (Standard-Standard) as well as 2 novel hybrid protocols (NLP-Standard and Standard-NLP). The 31 participants included neurologists, cardiologists, and nephrologists. Participants generated 4 consultation or admission notes using 4 documentation protocols. We recorded the time on task, documentation quality (using the Physician Documentation Quality Instrument, PDQI-9), and usability of the documentation processes. A total of 118 notes were documented across the 3 subject areas. 
The NLP-NLP protocol required a median of 5.2 minutes per cardiology note, 7.3 minutes per nephrology note, and 8.5 minutes per neurology note compared with 16.9, 20.7, and 21.2 minutes, respectively, using the Standard-Standard protocol and 13.8, 21.3, and 18.7 minutes using the Standard-NLP protocol (1 of 2 hybrid methods). Using 8 out of 9 characteristics measured by the PDQI-9 instrument, the NLP-NLP protocol received a median quality score sum of 24.5; the Standard-Standard protocol received a median sum of 29; and the Standard-NLP protocol received a median sum of 29.5. The mean total score of the usability measure was 36.7 when the participants used the NLP-NLP protocol compared with 30.3 when they used the Standard-Standard protocol. In this study, the feasibility of an approach to EHR data capture involving the application of NLP to transcribed dictation was demonstrated. This novel dictation-based approach has the potential to reduce the time required for documentation and improve usability while maintaining documentation quality. Future research will evaluate the NLP-based EHR data capture approach in a clinical setting. It is reasonable to assert that EHRs will increasingly use NLP-enabled data entry tools such as MediSapien NLP because they hold promise for enhancing the documentation process and end-user experience. ©David R. Kaufman, Barbara Sheehan, Peter Stetson, Ashish R. Bhatt, Adele I. Field, Chirag Patel, James Mark Maisel. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 28.10.2016.

  4. Scaling-up NLP Pipelines to Process Large Corpora of Clinical Notes.

    PubMed

    Divita, G; Carter, M; Redd, A; Zeng, Q; Gupta, K; Trautner, B; Samore, M; Gundlapalli, A

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". This paper describes the scale-up efforts at the VA Salt Lake City Health Care System to address processing large corpora of clinical notes through a natural language processing (NLP) pipeline. The use case described is a current project focused on detecting the presence of an indwelling urinary catheter in hospitalized patients and subsequent catheter-associated urinary tract infections. An NLP algorithm using v3NLP was developed to detect the presence of an indwelling urinary catheter in hospitalized patients. The algorithm was tested on a small corpus of notes on patients for whom the presence or absence of a catheter was already known (reference standard). In planning for a scale-up, we estimated that the original algorithm would have taken 2.4 days to run on a larger corpus of notes for this project (550,000 notes), and 27 days for a corpus of 6 million records representative of a national sample of notes. We approached scaling-up NLP pipelines through three techniques: pipeline replication via multi-threading, intra-annotator threading for tasks that can be further decomposed, and remote annotator services which enable annotator scale-out. The scale-up resulted in reducing the average time to process a record from 206 milliseconds to 17 milliseconds, or a 12-fold increase in performance, when applied to a corpus of 550,000 notes. Purposely simplistic in nature, these scale-up efforts are the straightforward evolution from small scale NLP processing to larger scale extraction without incurring associated complexities that are inherited by the use of the underlying UIMA framework. These efforts represent generalizable and widely applicable techniques that will aid other computationally complex NLP pipelines that need to be scaled out for processing and analyzing big data.
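
    The first of the three techniques named above, pipeline replication via multi-threading, can be illustrated with a minimal Python sketch. This is not the v3NLP implementation: the `annotate` function, the regular expression, and the sample notes are hypothetical stand-ins, and in CPython thread-level gains for CPU-bound work are limited by the global interpreter lock, so the sketch shows the structure of the approach rather than the reported speedup.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in annotator (an assumption, not v3NLP):
# flags notes that mention an indwelling urinary catheter.
CATHETER_PATTERN = re.compile(
    r"\b(foley|indwelling (urinary )?catheter)\b", re.IGNORECASE
)


def annotate(note: str) -> bool:
    """Return True when the note text mentions a catheter."""
    return CATHETER_PATTERN.search(note) is not None


def run_replicated(notes, workers=4):
    """Apply the same annotator to many notes using a pool of worker threads.

    pool.map preserves input order, so results line up with the notes.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(annotate, notes))


# Invented example notes, for illustration only.
notes = [
    "Foley catheter placed on admission.",
    "Patient ambulating, no complaints.",
    "Indwelling urinary catheter removed today.",
]
flags = run_replicated(notes)
```

    Each worker thread runs the whole pipeline on its own notes, which is what makes replication the simplest of the three techniques: no annotator has to be rewritten, only the driver.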

  5. Evaluation of natural language processing systems: Issues and approaches

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guida, G.; Mauri, G.

    This paper encompasses two main topics: a broad and general analysis of the issue of performance evaluation of NLP systems and a report on a specific approach developed by the authors and experimented on a sample test case. More precisely, it first presents a brief survey of the major works in the area of NLP systems evaluation. Then, after introducing the notion of the life cycle of an NLP system, it focuses on the concept of performance evaluation and analyzes the scope and the major problems of the investigation. The tools generally used within computer science to assess the quality of a software system are briefly reviewed, and their applicability to the task of evaluation of NLP systems is discussed. Particular attention is devoted to the concepts of efficiency, correctness, reliability, and adequacy, and how all of them basically fail in capturing the peculiar features of performance evaluation of an NLP system is discussed. Two main approaches to performance evaluation are later introduced; namely, black-box- and model-based, and their most important characteristics are presented. Finally, a specific model for performance evaluation proposed by the authors is illustrated, and the results of an experiment with a sample application are reported. The paper concludes with a discussion on research perspective, open problems, and importance of performance evaluation to industrial applications.

  6. Natural Language Processing–Enabled and Conventional Data Capture Methods for Input to Electronic Health Records: A Comparative Usability Study

    PubMed Central

    Sheehan, Barbara; Stetson, Peter; Bhatt, Ashish R; Field, Adele I; Patel, Chirag; Maisel, James Mark

    2016-01-01

    Background The process of documentation in electronic health records (EHRs) is known to be time consuming, inefficient, and cumbersome. The use of dictation coupled with manual transcription has become an increasingly common practice. In recent years, natural language processing (NLP)–enabled data capture has become a viable alternative for data entry. It enables the clinician to maintain control of the process and potentially reduce the documentation burden. The question remains how this NLP-enabled workflow will impact EHR usability and whether it can meet the structured data and other EHR requirements while enhancing the user’s experience. Objective The objective of this study is to evaluate the comparative effectiveness of an NLP-enabled data capture method using dictation and data extraction from transcribed documents (NLP Entry) in terms of documentation time, documentation quality, and usability versus standard EHR keyboard-and-mouse data entry. Methods This formative study investigated the results of using 4 combinations of NLP Entry and Standard Entry methods (“protocols”) of EHR data capture. We compared a novel dictation-based protocol using MediSapien NLP (NLP-NLP) for structured data capture against a standard structured data capture protocol (Standard-Standard) as well as 2 novel hybrid protocols (NLP-Standard and Standard-NLP). The 31 participants included neurologists, cardiologists, and nephrologists. Participants generated 4 consultation or admission notes using 4 documentation protocols. We recorded the time on task, documentation quality (using the Physician Documentation Quality Instrument, PDQI-9), and usability of the documentation processes. Results A total of 118 notes were documented across the 3 subject areas.
The NLP-NLP protocol required a median of 5.2 minutes per cardiology note, 7.3 minutes per nephrology note, and 8.5 minutes per neurology note compared with 16.9, 20.7, and 21.2 minutes, respectively, using the Standard-Standard protocol and 13.8, 21.3, and 18.7 minutes using the Standard-NLP protocol (1 of 2 hybrid methods). Using 8 out of 9 characteristics measured by the PDQI-9 instrument, the NLP-NLP protocol received a median quality score sum of 24.5; the Standard-Standard protocol received a median sum of 29; and the Standard-NLP protocol received a median sum of 29.5. The mean total score of the usability measure was 36.7 when the participants used the NLP-NLP protocol compared with 30.3 when they used the Standard-Standard protocol. Conclusions In this study, the feasibility of an approach to EHR data capture involving the application of NLP to transcribed dictation was demonstrated. This novel dictation-based approach has the potential to reduce the time required for documentation and improve usability while maintaining documentation quality. Future research will evaluate the NLP-based EHR data capture approach in a clinical setting. It is reasonable to assert that EHRs will increasingly use NLP-enabled data entry tools such as MediSapien NLP because they hold promise for enhancing the documentation process and end-user experience. PMID:27793791

  7. Filling the gaps between tools and users: a tool comparator, using protein-protein interaction as an example.

    PubMed

    Kano, Yoshinobu; Nguyen, Ngan; Saetre, Rune; Yoshida, Kazuhiro; Miyao, Yusuke; Tsuruoka, Yoshimasa; Matsubayashi, Yuichiro; Ananiadou, Sophia; Tsujii, Jun'ichi

    2008-01-01

    Recently, several text mining programs have reached a near-practical level of performance. Some systems are already being used by biologists and database curators. However, it has also been recognized that current Natural Language Processing (NLP) and Text Mining (TM) technology is not easy to deploy, since research groups tend to develop systems that cater specifically to their own requirements. One of the major reasons for the difficulty of deployment of NLP/TM technology is that re-usability and interoperability of software tools are typically not considered during development. While some effort has been invested in making interoperable NLP/TM toolkits, the developers of end-to-end systems still often struggle to reuse NLP/TM tools, and often opt to develop similar programs from scratch instead. This is particularly the case in BioNLP, since the requirements of biologists are so diverse that NLP tools have to be adapted and re-organized in a much more extensive manner than was originally expected. Although generic frameworks like UIMA (Unstructured Information Management Architecture) provide promising ways to solve this problem, the solution that they provide is only partial. In order for truly interoperable toolkits to become a reality, we also need sharable type systems and a developer-friendly environment for software integration that includes functionality for systematic comparisons of available tools, a simple I/O interface, and visualization tools. In this paper, we describe such an environment that was developed based on UIMA, and we show its feasibility through our experience in developing a protein-protein interaction (PPI) extraction system.

  8. Mastering Overdetection and Underdetection in Learner-Answer Processing: Simple Techniques for Analysis and Diagnosis

    ERIC Educational Resources Information Center

    Blanchard, Alexia; Kraif, Olivier; Ponton, Claude

    2009-01-01

    This paper presents a "didactic triangulation" strategy to cope with the problem of reliability of NLP applications for computer-assisted language learning (CALL) systems. It is based on the implementation of basic but well mastered NLP techniques and puts the emphasis on an adapted gearing between computable linguistic clues and didactic features…

  9. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper).

    PubMed

    Valdez, Joshua; Rueschman, Michael; Kim, Matthew; Redline, Susan; Sahoo, Satya S

    2016-10-01

    Extraction of structured information from biomedical literature is a complex and challenging problem due to the complexity of biomedical domain and lack of appropriate natural language processing (NLP) techniques. High quality domain ontologies model both data and metadata information at a fine level of granularity, which can be effectively used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task to support scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement. This is highlighted by the recent initiative by the US National Institutes of Health called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extract provenance metadata from published biomedical research literature using an ontology-enabled NLP platform as part of the Provenance for Clinical and Healthcare Research (ProvCaRe). The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation demonstrate that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata as compared to existing NLP pipelines such as MetaMap.

  10. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines.

    PubMed

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne

    2014-01-01

    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice.

  11. UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text.

    PubMed

    Demner-Fushman, Dina; Mork, James G; Shooshan, Sonya E; Aronson, Alan R

    2010-08-01

    Identification of medical terms in free text is a first step in such Natural Language Processing (NLP) tasks as automatic indexing of biomedical literature and extraction of patients' problem lists from the text of clinical notes. Many tools developed to perform these tasks use biomedical knowledge encoded in the Unified Medical Language System (UMLS) Metathesaurus. We continue our exploration of automatic approaches to creation of subsets (UMLS content views) which can support NLP processing of either the biomedical literature or clinical text. We found that suppression of highly ambiguous terms in the conservative AutoFilter content view can partially replace manual filtering for literature applications, and suppression of two character mappings in the same content view achieves 89.5% precision at 78.6% recall for clinical applications. Published by Elsevier Inc.

  12. Toward a Learning Health-care System – Knowledge Delivery at the Point of Care Empowered by Big Data and NLP

    PubMed Central

    Kaggal, Vinod C.; Elayavilli, Ravikumar Komandur; Mehrabi, Saeed; Pankratz, Joshua J.; Sohn, Sunghwan; Wang, Yanshan; Li, Dingcheng; Rastegar, Majid Mojarad; Murphy, Sean P.; Ross, Jason L.; Chaudhry, Rajeev; Buntrock, James D.; Liu, Hongfang

    2016-01-01

    The concept of optimizing health care by understanding and generating knowledge from previous evidence, ie, the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future. PMID:27385912

  13. Toward a Learning Health-care System - Knowledge Delivery at the Point of Care Empowered by Big Data and NLP.

    PubMed

    Kaggal, Vinod C; Elayavilli, Ravikumar Komandur; Mehrabi, Saeed; Pankratz, Joshua J; Sohn, Sunghwan; Wang, Yanshan; Li, Dingcheng; Rastegar, Majid Mojarad; Murphy, Sean P; Ross, Jason L; Chaudhry, Rajeev; Buntrock, James D; Liu, Hongfang

    2016-01-01

    The concept of optimizing health care by understanding and generating knowledge from previous evidence, ie, the Learning Health-care System (LHS), has gained momentum and now has national prominence. Meanwhile, the rapid adoption of electronic health records (EHRs) enables the data collection required to form the basis for facilitating LHS. A prerequisite for using EHR data within the LHS is an infrastructure that enables access to EHR data longitudinally for health-care analytics and real time for knowledge delivery. Additionally, significant clinical information is embedded in the free text, making natural language processing (NLP) an essential component in implementing an LHS. Herein, we share our institutional implementation of a big data-empowered clinical NLP infrastructure, which not only enables health-care analytics but also has real-time NLP processing capability. The infrastructure has been utilized for multiple institutional projects including the MayoExpertAdvisor, an individualized care recommendation solution for clinical care. We compared the advantages of big data over two other environments. Big data infrastructure significantly outperformed other infrastructure in terms of computing speed, demonstrating its value in making the LHS a possibility in the near future.

  14. Integrating natural language processing expertise with patient safety event review committees to improve the analysis of medication events.

    PubMed

    Fong, Allan; Harriott, Nicole; Walters, Donna M; Foley, Hanan; Morrissey, Richard; Ratwani, Raj R

    2017-08-01

    Many healthcare providers have implemented patient safety event reporting systems to better understand and improve patient safety. Reviewing and analyzing these reports is often time consuming and resource intensive because of both the quantity of reports and length of free-text descriptions in the reports. Natural language processing (NLP) experts collaborated with clinical experts on a patient safety committee to assist in the identification and analysis of medication related patient safety events. Different NLP algorithmic approaches were developed to identify four types of medication related patient safety events and the models were compared. Well performing NLP models were generated to categorize medication related events into pharmacy delivery delays, dispensing errors, Pyxis discrepancies, and prescriber errors with receiver operating characteristic areas under the curve of 0.96, 0.87, 0.96, and 0.81 respectively. We also found that modeling the brief without the resolution text generally improved model performance. These models were integrated into a dashboard visualization to support the patient safety committee review process. We demonstrate the capabilities of various NLP models and the use of two text inclusion strategies at categorizing medication related patient safety events. The NLP models and visualization could be used to improve the efficiency of patient safety event data review and analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Semi-Automated Methods for Refining a Domain-Specific Terminology Base

    DTIC Science & Technology

    2011-02-01

    only as a resource for written and oral translation, but also for Natural Language Processing (NLP) applications, text retrieval, document indexing...Natural Language Processing (NLP) applications, text retrieval, document indexing, and other knowledge management tasks. The objective of this...also for Natural Language Processing (NLP) applications, text retrieval (1), document indexing, and other knowledge management tasks. The National

  16. Comparison of Caenorhabditis elegans NLP peptides with arthropod neuropeptides.

    PubMed

    Husson, Steven J; Lindemans, Marleen; Janssen, Tom; Schoofs, Liliane

    2009-04-01

    Neuropeptides are small messenger molecules that can be found in all metazoans, where they govern a diverse array of physiological processes. Because neuropeptides seem to be conserved among pest species, selected peptides can be considered as attractive targets for drug discovery. Much can be learned from the model system Caenorhabditis elegans because of the availability of a sequenced genome and state-of-the-art postgenomic technologies that enable characterization of endogenous peptides derived from neuropeptide-like protein (NLP) precursors. Here, we provide an overview of the NLP peptide family in C. elegans and discuss their resemblance with arthropod neuropeptides and their relevance for anthelmintic discovery.

  17. Clinical Natural Language Processing in 2015: Leveraging the Variety of Texts of Clinical Interest.

    PubMed

    Névéol, A; Zweigenbaum, P

    2016-11-10

    To summarize recent research and present a selection of the best papers published in 2015 in the field of clinical Natural Language Processing (NLP). A systematic review of the literature was performed by the two section editors of the IMIA Yearbook NLP section by searching bibliographic databases with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. Section editors first selected a shortlist of candidate best papers that were then peer-reviewed by independent external reviewers. The clinical NLP best paper selection shows that clinical NLP is making use of a variety of texts of clinical interest to contribute to the analysis of clinical information and the building of a body of clinical knowledge. The full review process highlighted five papers analyzing patient-authored texts or seeking to connect and aggregate multiple sources of information. They provide a contribution to the development of methods, resources, applications, and sometimes a combination of these aspects. The field of clinical NLP continues to thrive through the contributions of both NLP researchers and healthcare professionals interested in applying NLP techniques to impact clinical practice. Foundational progress in the field makes it possible to leverage a larger variety of texts of clinical interest for healthcare purposes.

  18. Augmenting Qualitative Text Analysis with Natural Language Processing: Methodological Study.

    PubMed

    Guetterman, Timothy C; Chang, Tammy; DeJonckheere, Melissa; Basu, Tanmay; Scruggs, Elizabeth; Vydiswaran, V G Vinod

    2018-06-29

    Qualitative research methods are increasingly being used across disciplines because of their ability to help investigators understand the perspectives of participants in their own words. However, qualitative analysis is a laborious and resource-intensive process. To achieve depth, researchers are limited to smaller sample sizes when analyzing text data. One potential method to address this concern is natural language processing (NLP). Qualitative text analysis involves researchers reading data, assigning code labels, and iteratively developing findings; NLP has the potential to automate part of this process. Unfortunately, little methodological research has been done to compare automatic coding using NLP techniques and qualitative coding, which is critical to establish the viability of NLP as a useful, rigorous analysis procedure. The purpose of this study was to compare the utility of a traditional qualitative text analysis, an NLP analysis, and an augmented approach that combines qualitative and NLP methods. We conducted a 2-arm cross-over experiment to compare qualitative and NLP approaches to analyze data generated through 2 text (short message service) message survey questions, one about prescription drugs and the other about police interactions, sent to youth aged 14-24 years. We randomly assigned a question to each of the 2 experienced qualitative analysis teams for independent coding and analysis before receiving NLP results. A third team separately conducted NLP analysis of the same 2 questions. We examined the results of our analyses to compare (1) the similarity of findings derived, (2) the quality of inferences generated, and (3) the time spent in analysis. The qualitative-only analysis for the drug question (n=58) yielded 4 major findings, whereas the NLP analysis yielded 3 findings that missed contextual elements. The qualitative and NLP-augmented analysis was the most comprehensive. 
For the police question (n=68), the qualitative-only analysis yielded 4 primary findings and the NLP-only analysis yielded 4 slightly different findings. Again, the augmented qualitative and NLP analysis was the most comprehensive and produced the highest quality inferences, increasing our depth of understanding (ie, details and frequencies). In terms of time, the NLP-only approach was quicker than the qualitative-only approach for the drug (120 vs 270 minutes) and police (40 vs 270 minutes) questions. An approach beginning with qualitative analysis followed by qualitative- or NLP-augmented analysis took longer time than that beginning with NLP for both drug (450 vs 240 minutes) and police (390 vs 220 minutes) questions. NLP provides both a foundation to code qualitatively more quickly and a method to validate qualitative findings. NLP methods were able to identify major themes found with traditional qualitative analysis but were not useful in identifying nuances. Traditional qualitative text analysis added important details and context. ©Timothy C Guetterman, Tammy Chang, Melissa DeJonckheere, Tanmay Basu, Elizabeth Scruggs, VG Vinod Vydiswaran. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 29.06.2018.

  19. Validation of Eye Movements Model of NLP through Stressed Recalls.

    ERIC Educational Resources Information Center

    Sandhu, Daya S.

    Neurolinguistic Programming (NLP) has emerged as a new approach to counseling and psychotherapy. Though not to be confused with computer programming, NLP does claim to program, deprogram, and reprogram clients' behaviors with the precision and expedition akin to computer processes. It is as a tool for therapeutic communication that NLP has rapidly…

  20. NLP-PIER: A Scalable Natural Language Processing, Indexing, and Searching Architecture for Clinical Notes.

    PubMed

    McEwan, Reed; Melton, Genevieve B; Knoll, Benjamin C; Wang, Yan; Hultman, Gretchen; Dale, Justin L; Meyer, Tim; Pakhomov, Serguei V

    2016-01-01

    Many design considerations must be addressed in order to provide researchers with full text and semantic search of unstructured healthcare data such as clinical notes and reports. Institutions looking at providing this functionality must also address the big data aspects of their unstructured corpora. Because these systems are complex and demand a non-trivial investment, there is an incentive to make the system capable of servicing future needs as well, further complicating the design. We present architectural best practices as lessons learned in the design and implementation of NLP-PIER (Patient Information Extraction for Research), a scalable, extensible, and secure system for processing, indexing, and searching clinical notes at the University of Minnesota.

  1. Natural Language Processing As an Alternative to Manual Reporting of Colonoscopy Quality Metrics

    PubMed Central

    RAJU, GOTTUMUKKALA S.; LUM, PHILLIP J.; SLACK, REBECCA; THIRUMURTHI, SELVI; LYNCH, PATRICK M.; MILLER, ETHAN; WESTON, BRIAN R.; DAVILA, MARTA L.; BHUTANI, MANOOP S.; SHAFI, MEHNAZ A.; BRESALIER, ROBERT S.; DEKOVICH, ALEXANDER A.; LEE, JEFFREY H.; GUHA, SUSHOVAN; PANDE, MALA; BLECHACZ, BORIS; RASHID, ASIF; ROUTBORT, MARK; SHUTTLESWORTH, GLADIS; MISHRA, LOPA; STROEHLEIN, JOHN R.; ROSS, WILLIAM A.

    2015-01-01

    BACKGROUND & AIMS The adenoma detection rate (ADR) is a quality metric tied to interval colon cancer occurrence. However, manual extraction of data to calculate and track the ADR in clinical practice is labor-intensive. To overcome this difficulty, we developed a natural language processing (NLP) method to identify patients, who underwent their first screening colonoscopy, identify adenomas and sessile serrated adenomas (SSA). We compared the NLP generated results with that of manual data extraction to test the accuracy of NLP, and report on colonoscopy quality metrics using NLP. METHODS Identification of screening colonoscopies using NLP was compared with that using the manual method for 12,748 patients who underwent colonoscopies from July 2010 to February 2013. Also, identification of adenomas and SSAs using NLP was compared with that using the manual method with 2259 matched patient records. Colonoscopy ADRs using these methods were generated for each physician. RESULTS NLP correctly identified 91.3% of the screening examinations, whereas the manual method identified 87.8% of them. Both the manual method and NLP correctly identified examinations of patients with adenomas and SSAs in the matched records almost perfectly. Both NLP and manual method produce comparable values for ADR for each endoscopist as well as the group as a whole. CONCLUSIONS NLP can correctly identify screening colonoscopies, accurately identify adenomas and SSAs in a pathology database, and provide real-time quality metrics for colonoscopy. PMID:25910665
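
    The per-physician ADR calculation mentioned above can be sketched in a few lines of Python. This is an illustration, not the authors' method: it assumes the usual definition of ADR as the share of screening colonoscopies in which at least one adenoma was found, and the record layout, physician labels, and exam data are all hypothetical.

```python
from collections import defaultdict


def adenoma_detection_rates(exams):
    """Compute ADR per physician.

    exams: iterable of (physician, is_screening, adenoma_found) tuples.
    Only screening examinations count toward the denominator.
    """
    screened = defaultdict(int)
    detected = defaultdict(int)
    for physician, is_screening, adenoma_found in exams:
        if not is_screening:
            continue  # non-screening exams are excluded from the ADR
        screened[physician] += 1
        if adenoma_found:
            detected[physician] += 1
    return {p: detected[p] / screened[p] for p in screened}


# Invented records for two hypothetical physicians, "A" and "B".
exams = [
    ("A", True, True), ("A", True, False), ("A", False, True),
    ("B", True, True), ("B", True, False),
    ("B", True, False), ("B", True, False),
]
rates = adenoma_detection_rates(exams)
```

    In a setup like the one described, the `is_screening` and `adenoma_found` flags would come from the NLP step (or from manual extraction), so the same function could be used to compare the two sources.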

  2. NLP based congestive heart failure case finding: A prospective analysis on statewide electronic medical records.

    PubMed

    Wang, Yue; Luo, Jin; Hao, Shiying; Xu, Haihua; Shin, Andrew Young; Jin, Bo; Liu, Rui; Deng, Xiaohong; Wang, Lijuan; Zheng, Le; Zhao, Yifan; Zhu, Chunqing; Hu, Zhongkai; Fu, Changlin; Hao, Yanpeng; Zhao, Yingzhen; Jiang, Yunliang; Dai, Dorothy; Culver, Devore S; Alfreds, Shaun T; Todd, Rogow; Stearns, Frank; Sylvester, Karl G; Widen, Eric; Ling, Xuefeng B

    2015-12-01

    In order to proactively manage congestive heart failure (CHF) patients, an effective CHF case finding algorithm is required to process both structured and unstructured electronic medical records (EMR) to allow complementary and cost-efficient identification of CHF patients. We set out to identify CHF cases from both EMR codified and natural language processing (NLP) found cases. Using narrative clinical notes from all Maine Health Information Exchange (HIE) patients, the NLP case finding algorithm was retrospectively (July 1, 2012-June 30, 2013) developed with a random subset of HIE associated facilities, and blind-tested with the remaining facilities. The NLP based method was integrated into a live HIE population exploration system and validated prospectively (July 1, 2013-June 30, 2014). A total of 18,295 codified CHF patients were included in Maine HIE. Among the 253,803 subjects without CHF codings, our case finding algorithm prospectively identified 2411 uncodified CHF cases. The positive predictive value (PPV) is 0.914, and 70.1% of these 2411 cases were found to be with CHF histories in the clinical notes. A CHF case finding algorithm was developed, tested and prospectively validated. The successful integration of the CHF case findings algorithm into the Maine HIE live system is expected to improve the Maine CHF care. Copyright © 2015. Published by Elsevier Ireland Ltd.

  3. Identification of Patients with Family History of Pancreatic Cancer--Investigation of an NLP System Portability.

    PubMed

    Mehrabi, Saeed; Krishnan, Anand; Roch, Alexandra M; Schmidt, Heidi; Li, DingCheng; Kesterson, Joe; Beesley, Chris; Dexter, Paul; Schmidt, Max; Palakal, Mathew; Liu, Hongfang

    2015-01-01

    In this study we have developed a rule-based natural language processing (NLP) system to identify patients with family history of pancreatic cancer. The algorithm was developed in an Unstructured Information Management Architecture (UIMA) framework and consisted of section segmentation, relation discovery, and negation detection. The system was evaluated on data from two institutions. The family history identification precision was consistent across the institutions, shifting from 88.9% on the Indiana University (IU) dataset to 87.8% on the Mayo Clinic dataset. Customizing the algorithm on the Mayo Clinic data increased its precision to 88.1%. The family member relation discovery achieved precision, recall, and F-measure of 75.3%, 91.6% and 82.6% respectively. Negation detection resulted in precision of 99.1%. The results show that rule-based NLP approaches for specific information extraction tasks are portable across institutions; however customization of the algorithm on the new dataset improves its performance.
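
    The relationship between the precision, recall, and F-measure reported above for family member relation discovery can be checked with a short Python sketch. It assumes the balanced F-measure (beta = 1), which the abstract does not state explicitly; the function name and its `beta` parameter are illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F1 score (an assumption here; the
    abstract only says "F-measure").
    """
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported for family member relation discovery.
f1 = f_measure(0.753, 0.916)
```

    With these inputs the result is roughly 0.83, close to the reported 82.6%; a difference in the last digit would be expected if the published precision and recall were themselves rounded.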

  4. Using natural language processing to identify problem usage of prescription opioids.

    PubMed

    Carrell, David S; Cronkite, David; Palmer, Roy E; Saunders, Kathleen; Gross, David E; Masters, Elizabeth T; Hylan, Timothy R; Von Korff, Michael

    2015-12-01

    Accurate and scalable surveillance methods are critical to understand widespread problems associated with misuse and abuse of prescription opioids and for implementing effective prevention and control measures. Traditional diagnostic coding incompletely documents problem use. Relevant information for each patient is often obscured in vast amounts of clinical text. We developed and evaluated a method that combines natural language processing (NLP) and computer-assisted manual review of clinical notes to identify evidence of problem opioid use in electronic health records (EHRs). We used the EHR data and text of 22,142 patients receiving chronic opioid therapy (≥70 days' supply of opioids per calendar quarter) during 2006-2012 to develop and evaluate an NLP-based surveillance method and compare it to traditional methods based on International Classification of Disease, Ninth Edition (ICD-9) codes. We developed a 1288-term dictionary for clinician mentions of opioid addiction, abuse, misuse or overuse, and an NLP system to identify these mentions in unstructured text. The system distinguished affirmative mentions from those that were negated or otherwise qualified. We applied this system to 7,336,445 electronic chart notes of the 22,142 patients. Trained abstractors using a custom computer-assisted software interface manually reviewed 7751 chart notes (from 3156 patients) selected by the NLP system and classified each note as to whether or not it contained textual evidence of problem opioid use. Traditional diagnostic codes for problem opioid use were found for 2240 (10.1%) patients. NLP-assisted manual review identified an additional 728 (3.1%) patients with evidence of clinically diagnosed problem opioid use in clinical notes. Inter-rater reliability among pairs of abstractors reviewing notes was high, with kappa=0.86 and 97% agreement for one pair, and kappa=0.71 and 88% agreement for another pair. Scalable, semi-automated NLP methods can efficiently and accurately identify evidence of problem opioid use in vast amounts of EHR text. Incorporating such methods into surveillance efforts may increase prevalence estimates by as much as one-third relative to traditional methods. Copyright © 2015. Published by Elsevier Ireland Ltd.
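
    The dictionary lookup with negation handling described in this abstract can be illustrated in miniature. The sketch below is not the authors' system: the term list, the negation cues, the five-token look-back window, and the function name `find_mentions` are all hypothetical, chosen only to show the general idea of separating affirmed from negated mentions.

```python
import re

# Hypothetical entries; the study's 1288-term dictionary is not reproduced here.
TERMS = ("opioid abuse", "opioid misuse", "opioid addiction")
# Hypothetical negation cues; a real system would use a much richer list.
NEGATION_CUES = {"no", "not", "denies", "without"}


def find_mentions(text, terms=TERMS, cues=NEGATION_CUES, window=5):
    """Return (offset, term, status) tuples sorted by offset.

    A mention is marked "negated" when a cue word occurs among the
    `window` tokens that precede it within the same sentence.
    """
    lowered = text.lower()
    mentions = []
    for term in terms:
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            # Start of the current sentence: just after the last terminator
            # (rfind returns -1 when there is none, giving offset 0).
            sentence_start = max(
                lowered.rfind(ch, 0, match.start()) for ch in ".;!?"
            ) + 1
            preceding = re.findall(
                r"[a-z']+", lowered[sentence_start:match.start()]
            )
            negated = any(token in cues for token in preceding[-window:])
            status = "negated" if negated else "affirmed"
            mentions.append((match.start(), term, status))
    return sorted(mentions)


note = "Patient denies opioid misuse. History of opioid abuse noted."
for offset, term, status in find_mentions(note):
    print(offset, term, status)
```

    Limiting the look-back to the current sentence keeps a cue in one sentence from negating a mention in the next, which is one simple way to reduce false negations.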

  5. Natural language processing and visualization in the molecular imaging domain.

    PubMed

    Tulipano, P Karina; Tao, Ying; Millar, William S; Zanzonico, Pat; Kolbert, Katherine; Xu, Hua; Yu, Hong; Chen, Lifeng; Lussier, Yves A; Friedman, Carol

    2007-06-01

    Molecular imaging is at the crossroads of genomic sciences and medical imaging. Information within the molecular imaging literature could be used to link to genomic and imaging information resources and to organize and index images in a way that is potentially useful to researchers. A number of natural language processing (NLP) systems are available to automatically extract information from genomic literature. One existing NLP system, known as BioMedLEE, automatically extracts biological information consisting of biomolecular substances and phenotypic data. This paper focuses on the adaptation, evaluation, and application of BioMedLEE to the molecular imaging domain. In order to adapt BioMedLEE for this domain, we extend an existing molecular imaging terminology and incorporate it into BioMedLEE. BioMedLEE's performance is assessed with a formal evaluation study. The system's performance, measured as recall and precision, is 0.74 (95% CI: [.70-.76]) and 0.70 (95% CI [.63-.76]), respectively. We adapt a JAVA viewer known as PGviewer for the simultaneous visualization of images with NLP extracted information.

  6. Video to Text (V2T) in Wide Area Motion Imagery

    DTIC Science & Technology

    2015-09-01

    microtext) or a document (e.g., using Sphinx or Apache NLP) as an automated approach [102]. Previous work in natural language full-text searching...language processing (NLP) based module. The heart of the structured text processing module includes the following seven key word banks...Features Tracker MHT Multiple Hypothesis Tracking MIL Multiple Instance Learning NLP Natural Language Processing OAB Online AdaBoost OF Optic Flow

  7. Phylogenetic, expression and functional characterizations of the maize NLP transcription factor family reveal a role in nitrate assimilation and signaling.

    PubMed

    Wang, Zhangkui; Zhang, Lei; Sun, Ci; Gu, Riliang; Mi, Guohua; Yuan, Lixing

    2018-01-24

    Although nitrate represents an important nitrogen (N) source for maize, a major crop of dryland areas, the molecular mechanisms of nitrate uptake and assimilation remain poorly understood. Here, we identified nine maize NIN-like protein (ZmNLP) genes and analyzed the function of one member, ZmNLP3.1, in nitrate nutrition and signaling. The NLP family genes were clustered into three clades in a phylogenic tree. Comparative genomic analysis showed that most ZmNLP genes had collinear relationships to the corresponding NLPs in rice, and that the expansion of the ZmNLP family resulted from segmental duplications in the maize genome. Quantitative PCR analysis revealed the expression of ZmNLP2.1, ZmNLP2.2, ZmNLP3.1, ZmNLP3.2, ZmNLP3.3, and ZmNLP3.4 was induced by nitrate in maize roots. The function of ZmNLP3.1 was investigated by overexpressing it in the Arabidopsis nlp7-1 mutant, which is defective in the AtNLP7 gene for nitrate signaling and assimilation. Ectopic expression of ZmNLP3.1 restored the N-deficient phenotypes of nlp7-1 under nitrate-replete conditions in terms of shoot biomass, root morphology and nitrate assimilation. Furthermore, the nitrate induction of NRT2.1, NIA1, and NiR1 gene expression was recovered in the 35S::ZmNLP3.1/nlp7-1 transgenic lines, indicating that ZmNLP3.1 plays essential roles in nitrate signaling. Taken together, these results suggest that ZmNLP3.1 plays an essential role in regulating nitrate signaling and assimilation processes, and represents a valuable candidate for developing transgenic maize cultivars with high N-use efficiency. This article is protected by copyright. All rights reserved.

  8. Development and evaluation of task-specific NLP framework in China.

    PubMed

    Ge, Caixia; Zhang, Yinsheng; Huang, Zhenzhen; Jia, Zheng; Ju, Meizhi; Duan, Huilong; Li, Haomin

    2015-01-01

    Natural language processing (NLP) has been designed to convert narrative text into structured data. Although some general NLP architectures have been developed, a task-specific NLP framework to facilitate the effective use of data is still a challenge in lexical resource limited regions, such as China. The purpose of this study is to design and develop a task-specific NLP framework to extract targeted information from particular documents by adopting dedicated algorithms on current limited lexical resources. In this framework, a shared and evolving ontology mechanism was designed. The result has shown that such a free text driven platform will accelerate the NLP technology acceptance in China.

  9. Exploring Social Meaning in Online Bilingual Text through Social Network Analysis

    DTIC Science & Technology

    2015-09-01

    p. 1). 30 GATE development began in 1995. As techniques for natural language processing (NLP) are investigated by the research community and...become part of the NLP repertoire, developers incorporate them with wrappers, which allow the output from GATE processes to be recognized as input by...University NEE Named Entity Extraction NLP natural language processing OSD Office of the Secretary of Defense POS parts of speech SBIR Small Business

  10. Automating curation using a natural language processing pipeline

    PubMed Central

    Alex, Beatrice; Grover, Claire; Haddow, Barry; Kabadjov, Mijail; Klein, Ewan; Matthews, Michael; Tobin, Richard; Wang, Xinglong

    2008-01-01

    Background: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general. Results: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average. Conclusion: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems. PMID:18834488

  11. NLP-PIER: A Scalable Natural Language Processing, Indexing, and Searching Architecture for Clinical Notes

    PubMed Central

    McEwan, Reed; Melton, Genevieve B.; Knoll, Benjamin C.; Wang, Yan; Hultman, Gretchen; Dale, Justin L.; Meyer, Tim; Pakhomov, Serguei V.

    2016-01-01

    Many design considerations must be addressed in order to provide researchers with full text and semantic search of unstructured healthcare data such as clinical notes and reports. Institutions looking at providing this functionality must also address the big data aspects of their unstructured corpora. Because these systems are complex and demand a non-trivial investment, there is an incentive to make the system capable of servicing future needs as well, further complicating the design. We present architectural best practices as lessons learned in the design and implementation of NLP-PIER (Patient Information Extraction for Research), a scalable, extensible, and secure system for processing, indexing, and searching clinical notes at the University of Minnesota. PMID:27570663

  12. An Evolving Ecosystem for Natural Language Processing in Department of Veterans Affairs.

    PubMed

    Garvin, Jennifer H; Kalsy, Megha; Brandt, Cynthia; Luther, Stephen L; Divita, Guy; Coronado, Gregory; Redd, Doug; Christensen, Carrie; Hill, Brent; Kelly, Natalie; Treitler, Qing Zeng

    2017-02-01

    In an ideal clinical Natural Language Processing (NLP) ecosystem, researchers and developers would be able to collaborate with others, undertake validation of NLP systems, components, and related resources, and disseminate them. We captured requirements and formative evaluation data from the Veterans Affairs (VA) Clinical NLP Ecosystem stakeholders using semi-structured interviews and meeting discussions. We developed a coding rubric to code interviews. We assessed inter-coder reliability using percent agreement and the kappa statistic. We undertook 15 interviews and held two workshop discussions. The main areas of requirements related to design and functionality, resources, and information. Stakeholders also confirmed the vision of the second generation of the Ecosystem, and recommendations included adding mechanisms to better understand terms, measuring collaboration to demonstrate value, and datasets/tools to navigate spelling errors with consumer language, among others. Stakeholders also recommended the capability to communicate with developers working on the next version of the VA electronic health record (VistA Evolution), to provide a mechanism to automatically monitor downloads of tools, and to automatically provide a summary of the downloads to Ecosystem contributors and funders. After three rounds of coding and discussion, we determined the percent agreement of two coders to be 97.2% and the kappa to be 0.7851. The vision of the VA Clinical NLP Ecosystem met stakeholder needs. Interviews and discussion provided key requirements that inform the design of the VA Clinical NLP Ecosystem.
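
    Percent agreement and kappa, the two inter-coder reliability measures named above, can be computed as in the following sketch. It assumes the kappa in question is Cohen's kappa for two coders, which the abstract does not specify; the function names and the toy labels are illustrative only and are not taken from the study.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Fraction of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("coders must label the same non-empty set of items")
    return sum(x == y for x, y in zip(coder_a, coder_b)) / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement from each coder's marginal label frequencies.
    p_expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data: codes assigned to four interview excerpts by two coders.
a = ["design", "design", "resources", "resources"]
b = ["design", "resources", "resources", "resources"]
print(percent_agreement(a, b), cohen_kappa(a, b))
```

    Kappa is typically lower than raw percent agreement because it discounts the agreement two coders would reach by chance, which is consistent with the pairing of 97.2% agreement and a kappa of 0.7851 reported here.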

  13. Designing visual displays and system models for safe reactor operations based on the user's perspective of the system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown-VanHoozer, S.A.

    Most designers are not schooled in the area of human-interaction psychology and therefore tend to rely on the traditional ergonomic aspects of human factors when designing complex human-interactive workstations related to reactor operations. They do not take into account the differences in user information processing behavior and how these behaviors may affect individual and team performance when accessing visual displays or utilizing system models in process and control room areas. Unfortunately, by ignoring the importance of the integration of the user interface at the information process level, the result can be sub-optimization and inherently error- and failure-prone systems. Therefore, to minimize or eliminate failures in human-interactive systems, it is essential that the designers understand how each user's processing characteristics affect how the user gathers information, and how the user communicates the information to the designer and other users. A different type of approach in achieving this understanding is Neuro Linguistic Programming (NLP). The material presented in this paper is based on two studies involving the design of visual displays, NLP, and the user's perspective model of a reactor system. The studies involve the methodology known as NLP, and its use in expanding design choices from the user's "model of the world," in the areas of virtual reality, workstation design, team structure, decision and learning style patterns, safety operations, pattern recognition, and much, much more.

  14. Automated Extraction of Substance Use Information from Clinical Texts.

    PubMed

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.

  15. Characterization of necrosis-inducing NLP proteins in Phytophthora capsici

    PubMed Central

    2014-01-01

    Background: Effector proteins function not only as toxins to induce plant cell death, but also enable pathogens to suppress or evade plant defense responses. NLP-like proteins are considered to be effector proteins, and they have been isolated from bacteria, fungi, and oomycete plant pathogens. There is increasing evidence that NLPs have the ability to induce cell death and ethylene accumulation in plants. Results: We evaluated the expression patterns of 11 targeted PcNLP genes by qRT-PCR at different time points after infection by P. capsici. Several PcNLP genes were strongly expressed at the early stages in the infection process, but the expression of other PcNLP genes gradually increased to a maximum at late stages of infection. The genes PcNLP2, PcNLP6 and PcNLP14 showed the highest expression levels during infection by P. capsici. The necrosis-inducing activity of all targeted PcNLP genes was evaluated using heterologous expression by PVX agroinfection of Capsicum annuum and Nicotiana benthamiana and by Western blot analysis. The members of the PcNLP family can induce chlorosis or necrosis during infection of pepper and tobacco leaves, but the chlorotic or necrotic response caused by PcNLP genes was stronger in pepper leaves than in tobacco leaves. Moreover, PcNLP2, PcNLP6, and PcNLP14 caused the largest chlorotic or necrotic areas in both host plants, indicating that these three genes contribute to strong virulence during infection by P. capsici. This was confirmed through functional evaluation of their silenced transformants. In addition, we further verified that four conserved residues are putatively active sites in PcNLP1 by site-directed mutagenesis. Conclusions: Each targeted PcNLP gene affects cells or tissues differently depending upon the stage of infection. Most PcNLP genes could trigger necrotic or chlorotic responses when expressed in the host C. annuum and the non-host N. benthamiana. Individual PcNLP genes have different phytotoxic effects, and PcNLP2, PcNLP6, and PcNLP14 may play important roles in symptom development and may be crucial for virulence, necrosis-inducing activity, or cell death during infection by P. capsici. PMID:24886309

  16. Characterization of necrosis-inducing NLP proteins in Phytophthora capsici.

    PubMed

    Feng, Bao-Zhen; Zhu, Xiao-Ping; Fu, Li; Lv, Rong-Fei; Storey, Dylan; Tooley, Paul; Zhang, Xiu-Guo

    2014-05-08

    Effector proteins function not only as toxins to induce plant cell death, but also enable pathogens to suppress or evade plant defense responses. NLP-like proteins are considered to be effector proteins, and they have been isolated from bacteria, fungi, and oomycete plant pathogens. There is increasing evidence that NLPs have the ability to induce cell death and ethylene accumulation in plants. We evaluated the expression patterns of 11 targeted PcNLP genes by qRT-PCR at different time points after infection by P. capsici. Several PcNLP genes were strongly expressed at the early stages in the infection process, but the expression of other PcNLP genes gradually increased to a maximum at late stages of infection. The genes PcNLP2, PcNLP6 and PcNLP14 showed the highest expression levels during infection by P. capsici. The necrosis-inducing activity of all targeted PcNLP genes was evaluated using heterologous expression by PVX agroinfection of Capsicum annuum and Nicotiana benthamiana and by Western blot analysis. The members of the PcNLP family can induce chlorosis or necrosis during infection of pepper and tobacco leaves, but the chlorotic or necrotic response caused by PcNLP genes was stronger in pepper leaves than in tobacco leaves. Moreover, PcNLP2, PcNLP6, and PcNLP14 caused the largest chlorotic or necrotic areas in both host plants, indicating that these three genes contribute to strong virulence during infection by P. capsici. This was confirmed through functional evaluation of their silenced transformants. In addition, we further verified that four conserved residues are putatively active sites in PcNLP1 by site-directed mutagenesis. Each targeted PcNLP gene affects cells or tissues differently depending upon the stage of infection. Most PcNLP genes could trigger necrotic or chlorotic responses when expressed in the host C. annuum and the non-host N. benthamiana. 
Individual PcNLP genes have different phytotoxic effects, and PcNLP2, PcNLP6, and PcNLP14 may play important roles in symptom development and may be crucial for virulence, necrosis-inducing activity, or cell death during infection by P. capsici.

  17. Ground Truth Creation for Complex Clinical NLP Tasks - an Iterative Vetting Approach and Lessons Learned.

    PubMed

    Liang, Jennifer J; Tsou, Ching-Huei; Devarakonda, Murthy V

    2017-01-01

    Natural language processing (NLP) holds the promise of effectively analyzing patient record data to reduce cognitive load on physicians and clinicians in patient care, clinical research, and hospital operations management. A critical need in developing such methods is the "ground truth" dataset needed for training and testing the algorithms. Beyond localizable, relatively simple tasks, ground truth creation is a significant challenge because medical experts, just as physicians in patient care, have to assimilate vast amounts of data in EHR systems. To mitigate potential inaccuracies of the cognitive challenges, we present an iterative vetting approach for creating the ground truth for complex NLP tasks. In this paper, we present the methodology, and report on its use for an automated problem list generation task, its effect on the ground truth quality and system accuracy, and lessons learned from the effort.

  18. Ensembles of NLP Tools for Data Element Extraction from Clinical Notes

    PubMed Central

    Kuo, Tsung-Ting; Rao, Pallavi; Maehara, Cleo; Doan, Son; Chaparro, Juan D.; Day, Michele E.; Farcas, Claudiu; Ohno-Machado, Lucila; Hsu, Chun-Nan

    2016-01-01

    Natural Language Processing (NLP) is essential for concept extraction from narrative text in electronic health records (EHR). To extract numerous and diverse concepts, such as data elements (i.e., important concepts related to a certain medical condition), a plausible solution is to combine various NLP tools into an ensemble to improve extraction performance. However, it is unclear to what extent ensembles of popular NLP tools improve the extraction of numerous and diverse concepts. Therefore, we built an NLP ensemble pipeline to synergize the strength of popular NLP tools using seven ensemble methods, and to quantify the improvement in performance achieved by ensembles in the extraction of data elements for three very different cohorts. Evaluation results show that the pipeline can improve the performance of NLP tools, but there is high variability depending on the cohort. PMID:28269947

  19. Ensembles of NLP Tools for Data Element Extraction from Clinical Notes.

    PubMed

    Kuo, Tsung-Ting; Rao, Pallavi; Maehara, Cleo; Doan, Son; Chaparro, Juan D; Day, Michele E; Farcas, Claudiu; Ohno-Machado, Lucila; Hsu, Chun-Nan

    2016-01-01

    Natural Language Processing (NLP) is essential for concept extraction from narrative text in electronic health records (EHR). To extract numerous and diverse concepts, such as data elements (i.e., important concepts related to a certain medical condition), a plausible solution is to combine various NLP tools into an ensemble to improve extraction performance. However, it is unclear to what extent ensembles of popular NLP tools improve the extraction of numerous and diverse concepts. Therefore, we built an NLP ensemble pipeline to synergize the strength of popular NLP tools using seven ensemble methods, and to quantify the improvement in performance achieved by ensembles in the extraction of data elements for three very different cohorts. Evaluation results show that the pipeline can improve the performance of NLP tools, but there is high variability depending on the cohort.

  20. The NLP toxin family in Phytophthora sojae includes rapidly evolving groups that lack necrosis-inducing activity.

    PubMed

    Dong, Suomeng; Kong, Guanghui; Qutob, Dinah; Yu, Xiaoli; Tang, Junli; Kang, Jixiong; Dai, Tingting; Wang, Hai; Gijzen, Mark; Wang, Yuanchao

    2012-07-01

    Necrosis- and ethylene-inducing-like proteins (NLP) are widely distributed in eukaryotic and prokaryotic plant pathogens and are considered to be important virulence factors. We identified, in total, 70 potential Phytophthora sojae NLP genes but 37 were designated as pseudogenes. Sequence alignment of the remaining 33 NLP delineated six groups. Three of these groups include proteins with an intact heptapeptide (Gly-His-Arg-His-Asp-Trp-Glu) motif, which is important for necrosis-inducing activity, whereas the motif is not conserved in the other groups. In total, 19 representative NLP genes were assessed for necrosis-inducing activity by heterologous expression in Nicotiana benthamiana. Surprisingly, only eight genes triggered cell death. The expression of the NLP genes in P. sojae was examined, distinguishing 20 expressed and 13 nonexpressed NLP genes. Real-time reverse-transcriptase polymerase chain reaction results indicate that most NLP are highly expressed during cyst germination and infection stages. Amino acid substitution ratios (Ka/Ks) of 33 NLP sequences from four different P. sojae strains resulted in identification of positive selection sites in a distinct NLP group. Overall, our study indicates that expansion and pseudogenization of the P. sojae NLP family results from an ongoing birth-and-death process, and that varying patterns of expression, necrosis-inducing activity, and positive selection suggest that NLP have diversified in function.

  1. Recognition of medication information from discharge summaries using ensembles of classifiers.

    PubMed

    Doan, Son; Collier, Nigel; Xu, Hua; Pham, Hoang Duy; Tu, Minh Phuong

    2012-05-07

    Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different classifiers to further improve the performance of clinical entity recognition systems has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges and opportunities to improve performance in such NLP tasks. We investigated ensemble classifiers that used different voting strategies to combine outputs from three individual classifiers: a rule-based system, a support vector machine (SVM) based system, and a conditional random field (CRF) based system. Three voting methods were proposed and evaluated using the annotated data sets from the 2009 i2b2 NLP challenge: simple majority, local SVM-based voting, and local CRF-based voting. Evaluation on 268 manually annotated discharge summaries from the i2b2 challenge showed that the local CRF-based voting method achieved the best F-score of 90.84% (94.11% Precision, 87.81% Recall) for 10-fold cross-validation. We then compared our systems with the first-ranked system in the challenge by using the same training and test sets. Our system based on majority voting achieved a better F-score of 89.65% (93.91% Precision, 85.76% Recall) than the previously reported F-score of 89.19% (93.78% Precision, 85.03% Recall) by the first-ranked system in the challenge. Our experimental results using the 2009 i2b2 challenge datasets showed that ensemble classifiers that combine individual classifiers into a voting system could achieve better performance than a single classifier in recognizing medication information from clinical text. 
It suggests that simple strategies that can be easily implemented such as majority voting could have the potential to significantly improve clinical entity recognition.
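
    The simple majority strategy mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the label names, the three example prediction sequences, the function name `majority_vote`, and the tie-breaking rule (falling back to one designated classifier when no label wins a strict majority) are all assumptions made for the example.

```python
from collections import Counter


def majority_vote(predictions, fallback=0):
    """Combine per-token labels from several classifiers by simple majority.

    predictions: list of label sequences, one per classifier, equal lengths.
    fallback:    index of the classifier whose label is used when no label
                 is chosen by more than half of the classifiers (assumed
                 tie-breaking rule, not taken from the paper).
    """
    if len({len(seq) for seq in predictions}) != 1:
        raise ValueError("all classifiers must label the same tokens")
    voted = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        if count > len(labels) / 2:
            voted.append(label)
        else:
            voted.append(labels[fallback])
    return voted


# Hypothetical token-level outputs from the three kinds of systems.
rule_based = ["O", "B-DRUG", "O", "B-DOSE"]
svm_based = ["O", "B-DRUG", "B-DOSE", "B-DOSE"]
crf_based = ["O", "O", "I-DRUG", "B-DOSE"]

print(majority_vote([rule_based, svm_based, crf_based], fallback=2))
```

    With three classifiers a strict majority needs two matching votes, so the fallback only matters when all three disagree, as on the third token here.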

  2. Identification of Patients with Family History of Pancreatic Cancer - Investigation of an NLP System Portability

    PubMed Central

    Mehrabi, Saeed; Krishnan, Anand; Roch, Alexandra M; Schmidt, Heidi; Li, DingCheng; Kesterson, Joe; Beesley, Chris; Dexter, Paul; Schmidt, Max; Palakal, Mathew; Liu, Hongfang

    2018-01-01

    In this study we have developed a rule-based natural language processing (NLP) system to identify patients with family history of pancreatic cancer. The algorithm was developed in an Unstructured Information Management Architecture (UIMA) framework and consisted of section segmentation, relation discovery, and negation detection. The system was evaluated on data from two institutions. The family history identification precision was consistent across the institutions, shifting from 88.9% on the Indiana University (IU) dataset to 87.8% on the Mayo Clinic dataset. Customizing the algorithm on the Mayo Clinic data increased its precision to 88.1%. The family member relation discovery achieved precision, recall, and F-measure of 75.3%, 91.6% and 82.6% respectively. Negation detection resulted in precision of 99.1%. The results show that rule-based NLP approaches for specific information extraction tasks are portable across institutions; however, customization of the algorithm on the new dataset improves its performance. PMID:26262122

  3. An early illness recognition framework using a temporal Smith Waterman algorithm and NLP.

    PubMed

    Hajihashemi, Zahra; Popescu, Mihail

    2013-01-01

    In this paper we propose a framework for detecting health patterns based on non-wearable sensor sequence similarity and natural language processing (NLP). In TigerPlace, an aging-in-place facility in Columbia, MO, we deployed 47 sensor networks together with a nursing electronic health record (EHR) system to provide early illness recognition. The proposed framework utilizes sensor sequence similarity and NLP on EHR nursing comments to automatically notify the physician when health problems are detected. The reported methodology is inspired by genomic sequence annotation using similarity algorithms such as Smith Waterman (SW). Similarly, for each sensor sequence, we associate health concepts extracted from the nursing notes using MetaMap, an NLP tool provided by the Unified Medical Language System (UMLS). Since sensor sequences, unlike genomic ones, have an associated time dimension, we propose a temporal variant of SW (TSW) to account for time. The main challenges presented by our framework are finding the most suitable time sequence similarity and aggregation of the retrieved UMLS concepts. On a pilot dataset from three TigerPlace residents, with a total of 1685 sensor days and 626 nursing records, we obtained an average precision of 0.64 and a recall of 0.37.
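
    For readers unfamiliar with the Smith Waterman algorithm referenced above, the sketch below computes the classic local alignment score between two symbol sequences. It shows only the standard algorithm, not the temporal variant (TSW) proposed in the paper, whose details are not given in the abstract; the scoring parameters and the example sequences of sensor event codes are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Classic dynamic programme with a linear gap penalty. Cell h[i][j]
    holds the best score of any alignment ending at a[i-1] and b[j-1];
    scores are floored at zero so an alignment can start anywhere.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                    # start a new local alignment
                h[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                h[i - 1][j] + gap,    # gap in b
                h[i][j - 1] + gap,    # gap in a
            )
            best = max(best, h[i][j])
    return best


# Hypothetical sensor event codes (e.g. B = bed, K = kitchen, T = toilet).
day_one = "XBKTY"
day_two = "ZBKTW"
print(smith_waterman(day_one, day_two))
```

    The zero floor is what makes the alignment local: a shared run of events scores highly even when the surrounding events differ, which suits the search for recurring sub-patterns within longer sensor sequences.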

  4. Evaluation of Natural Language Processing (NLP) Systems to Annotate Drug Product Labeling with MedDRA Terminology.

    PubMed

    Ly, Thomas; Pamer, Carol; Dang, Oanh; Brajovic, Sonja; Haider, Shahrukh; Botsis, Taxiarchis; Milward, David; Winter, Andrew; Lu, Susan; Ball, Robert

    2018-05-31

    The FDA Adverse Event Reporting System (FAERS) is a primary data source for identifying unlabeled adverse events (AEs) in a drug or biologic drug product's postmarketing phase. Many AE reports must be reviewed by drug safety experts to identify unlabeled AEs, even if the reported AEs are previously identified, labeled AEs. Integrating the labeling status of drug product AEs into FAERS could increase report triage and review efficiency. Medical Dictionary for Regulatory Activities (MedDRA) is the standard for coding AE terms in FAERS cases. However, drug manufacturers are not required to use MedDRA to describe AEs in product labels. We hypothesized that natural language processing (NLP) tools could assist in automating the extraction and MedDRA mapping of AE terms in drug product labels. We evaluated the performance of three NLP systems (ETHER, I2E, MetaMap) for their ability to extract AE terms from drug labels and translate the terms to MedDRA Preferred Terms (PTs). Pharmacovigilance-based annotation guidelines for extracting AE terms from drug labels were developed for this study. We compared each system's output to MedDRA PT AE lists, manually mapped by FDA pharmacovigilance experts using the guidelines, for ten drug product labels known as the "gold standard AE list" (GSL) dataset. Strict time and configuration conditions were imposed in order to test each system's capabilities under conditions of no human intervention and minimal system configuration. Each NLP system's output was evaluated for precision, recall and F measure in comparison to the GSL. A qualitative error analysis (QEA) was conducted to categorize a random sample of each NLP system's false positive and false negative errors. A total of 417, 278, and 250 false positive errors occurred in the ETHER, I2E, and MetaMap outputs, respectively. A total of 100, 80, and 187 false negative errors occurred in ETHER, I2E, and MetaMap outputs, respectively.
Precision ranged from 64% to 77%, recall from 64% to 83% and F measure from 67% to 79%. I2E had the highest precision (77%), recall (83%) and F measure (79%). ETHER had the lowest precision (64%). MetaMap had the lowest recall (64%). The QEA found that the most prevalent false positive errors were context errors such as "Context error/General term", "Context error/Instructions or monitoring parameters", "Context error/Medical history preexisting condition underlying condition risk factor or contraindication", and "Context error/AE manifestations or secondary complication". The most prevalent false negative errors were in the "Incomplete or missed extraction" error category. Missing AE terms were typically due to long terms, or terms containing non-contiguous words which do not correspond exactly to MedDRA synonyms. MedDRA mapping errors were a minority of errors for ETHER and I2E but were the most prevalent false positive errors for MetaMap. The results demonstrate that it may be feasible to use NLP tools to extract and map AE terms to MedDRA PTs. However, the NLP tools we tested would need to be modified or reconfigured to lower the error rates to support their use in a regulatory setting. Tools specific for extracting AE terms from drug labels and mapping the terms to MedDRA PTs may need to be developed to support pharmacovigilance. Conducting research using additional NLP systems on a larger, diverse GSL would also be informative. Copyright © 2018. Published by Elsevier Inc.
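
    The evaluation described above compares each system's extracted terms with a gold standard list. A minimal sketch of that kind of set-based scoring follows. It assumes exact matching of term strings and a balanced F measure (F1), neither of which the abstract confirms, and the term lists are hypothetical, not data from the study.

```python
def precision_recall_f1(extracted, gold):
    """Score a set of extracted terms against a gold standard set.

    Assumes exact string matching and a balanced F measure (F1);
    the study's own matching rules may differ.
    """
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)   # terms the system found that are in the gold list
    fp = len(extracted - gold)   # terms the system found that are not in the gold list
    fn = len(gold - extracted)   # gold terms the system missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical term lists for illustration only.
gold_terms = {"Nausea", "Headache", "Rash", "Dizziness"}
system_terms = {"Nausea", "Headache", "Vomiting"}

p, r, f = precision_recall_f1(system_terms, gold_terms)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

    Scoring on sets means a term counts once per label however often it is mentioned, which is one plausible reading of a per-label gold standard list.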

  5. Predicate Matching in NLP: A Review of Research on the Preferred Representational System.

    ERIC Educational Resources Information Center

    Sharpley, Christopher F.

    1984-01-01

    Reviews 15 studies that have investigated the use of the Preferred Representational System (PRS) in Neurolinguistic Programming (NLP). Aspects of design, methodology, population and dependent measures are evaluated, with comments on the outcomes obtained. Results suggested that there is little supportive evidence for the use of PRS in the NLP.…

  6. Using Natural Language Processing to Extract Abnormal Results From Cancer Screening Reports.

    PubMed

    Moore, Carlton R; Farrag, Ashraf; Ashkin, Evan

    2017-09-01

    Numerous studies show that follow-up of abnormal cancer screening results, such as mammography and Papanicolaou (Pap) smears, is frequently not performed in a timely manner. A contributing factor is that abnormal results may go unrecognized because they are buried in free-text documents in electronic medical records (EMRs), and, as a result, patients are lost to follow-up. By identifying abnormal results from free-text reports in EMRs and generating alerts to clinicians, natural language processing (NLP) technology has the potential for improving patient care. The goal of the current study was to evaluate the performance of NLP software for extracting abnormal results from free-text mammography and Pap smear reports stored in an EMR. A sample of 421 and 500 free-text mammography and Pap reports, respectively, were manually reviewed by a physician, and the results were categorized for each report. We tested the performance of NLP to extract results from the reports. The 2 assessments (criterion standard versus NLP) were compared to determine the precision, recall, and accuracy of NLP. When NLP was compared with manual review for mammography reports, the results were as follows: precision, 98% (96%-99%); recall, 100% (98%-100%); and accuracy, 98% (96%-99%). For Pap smear reports, the precision, recall, and accuracy of NLP were all 100%. Our study developed NLP models that accurately extract abnormal results from mammography and Pap smear reports. Plans include using NLP technology to generate real-time alerts and reminders for providers to facilitate timely follow-up of abnormal results.

  7. Neurolinguistic Programming in the Context of Group Counseling.

    ERIC Educational Resources Information Center

    Childers, John H. Jr.; Saltmarsh, Robert E.

    1986-01-01

    Describes neurolinguistic programming (NLP) in the context of group counseling. NLP is a model of communication that focuses on verbal and nonverbal patterns of behaviors as well as on the structures and processes of human subjectivity. Five stages of group development are described, and specific NLP techniques appropriate to the various stages…

  8. Network Analysis with Stochastic Grammars

    DTIC Science & Technology

    2015-09-17

    Language Processing (NLP) domain SCFG...sentence into starting symbol. Figure 2 is an NLP part-of-speech example modified from [38] of an SCFG production rule set that reads a limited set of...English sentences for the purpose of determining grammatical validity and meaning through part-of-speech assignment. In the NLP domain, each word is in

  9. Soliton formation from a noise-like pulse during extreme events in a fibre ring laser

    NASA Astrophysics Data System (ADS)

    Pottiez, O.; Ibarra-Villalon, H. E.; Bracamontes-Rodriguez, Y.; Minguela-Gallardo, J. A.; Garcia-Sanchez, E.; Lauterio-Cruz, J. P.; Hernandez-Garcia, J. C.; Bello-Jimenez, M.; Kuzin, E. A.

    2017-10-01

    We study experimentally the interactions between soliton and noise-like pulse (NLP) components in a mode-locked fibre ring laser operating in a hybrid soliton-NLP regime. For proper polarization adjustments, one NLP and multiple packets of solitons coexist in the cavity, at 1530 nm and 1558 nm, respectively. By examining time-domain sequences measured using a 16 GHz real-time oscilloscope, we unveil the process of soliton genesis: they are produced during extreme-intensity episodes affecting the NLP. These extreme events can emerge sporadically, appear in small groups or even form quasi-periodic sequences. Once formed, the wavelength-shifted soliton packet drifts away from the NLP in the dispersive cavity, and eventually vanishes after a variable lifetime. Evidence of the inverse process, through which NLP formation is occasionally seeded by an extreme-intensity event affecting a bunch of solitons, is also provided. The quasi-stationary dynamics described here constitutes an impressive illustration of the connections and interactions between NLPs, extreme events and solitons in passively mode-locked fibre lasers.

  10. Techniques for Automatically Generating Biographical Summaries from News Articles

    DTIC Science & Technology

    2007-09-01

    non-trivial because of the many NLP areas that must be used to efficiently extract the relevant facts. Yet, no study has been done to determine how...also non-trivial because of the many NLP areas that must be used to efficiently extract the relevant facts. Yet, no study has been done to determine...AI) research is called Natural Language Processing (NLP). NLP seeks to find ways for computers to read and write documents in as human a way as

  11. Natural language processing to ascertain two key variables from operative reports in ophthalmology.

    PubMed

    Liu, Liyan; Shorstein, Neal H; Amsden, Laura B; Herrinton, Lisa J

    2017-04-01

    Antibiotic prophylaxis is critical to ophthalmology and other surgical specialties. We performed natural language processing (NLP) of 743 838 operative notes recorded for 315 246 surgeries to ascertain two variables needed to study the comparative effectiveness of antibiotic prophylaxis in cataract surgery. The first key variable was an exposure variable, intracameral antibiotic injection. The second was an intraoperative complication, posterior capsular rupture (PCR), which functioned as a potential confounder. To help other researchers use NLP in their settings, we describe our NLP protocol and lessons learned. For each of the two variables, we used SAS Text Miner and other SAS text-processing modules with a training set of 10 000 (1.3%) operative notes to develop a lexicon. The lexica identified misspellings, abbreviations, and negations, and linked words into concepts (e.g. "antibiotic" linked with "injection"). We confirmed the NLP tools by iteratively obtaining random samples of 2000 (0.3%) notes, with replacement. The NLP tools identified approximately 60 000 intracameral antibiotic injections and 3500 cases of PCR. The positive and negative predictive values for intracameral antibiotic injection exceeded 99%. For the intraoperative complication, they exceeded 94%. NLP was a valid and feasible method for obtaining critical variables needed for a research study of surgical safety. These NLP tools were intended for use in the study sample. Use with external datasets or future datasets in our own setting would require further testing. Copyright © 2017 John Wiley & Sons, Ltd.
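
    The lexicon approach described above (handling misspellings, abbreviations, and negations, and linking words into concepts) can be illustrated with a small sketch. The study used SAS Text Miner; the Python below is a swapped-in illustration, not the authors' method. The lexicon entries, the negation cues, and the three-token negation window are all hypothetical assumptions.

```python
import re

# Hypothetical lexicon: surface variants mapped to a normalized concept.
LEXICON = {
    "intracameral": "intracameral",
    "intracamral": "intracameral",   # assumed misspelling
    "antibiotic": "antibiotic",
    "abx": "antibiotic",             # assumed abbreviation
    "injection": "injection",
    "injected": "injection",
}

# Hypothetical negation cues; a real lexicon would be built from training notes.
NEGATION_CUES = {"no", "not", "without"}


def find_concepts(sentence):
    """Return the set of non-negated concepts mentioned in one sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    found = set()
    for i, token in enumerate(tokens):
        concept = LEXICON.get(token)
        if concept is None:
            continue
        # Assumption: a cue within the three preceding tokens negates the concept.
        window = tokens[max(0, i - 3):i]
        if NEGATION_CUES.isdisjoint(window):
            found.add(concept)
    return found


def has_antibiotic_injection(sentence):
    """Link the 'antibiotic' and 'injection' concepts within one sentence."""
    return {"antibiotic", "injection"} <= find_concepts(sentence)


print(has_antibiotic_injection("Intracamral abx injected at end of case."))
print(has_antibiotic_injection("No antibiotic injection was given."))
```

    A fixed token window is a deliberately crude stand-in for negation handling; it is chosen here only to keep the sketch short.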

  12. Natural Language Processing to Ascertain Two Key Variables from Operative Reports in Ophthalmology

    PubMed Central

    Liu, Liyan; Shorstein, Neal H.; Amsden, Laura B; Herrinton, Lisa J.

    2016-01-01

    Purpose Antibiotic prophylaxis is critical to ophthalmology and other surgical specialties. We performed natural language processing (NLP) of 743,838 operative notes recorded for 315,246 surgeries to ascertain two variables needed to study the comparative effectiveness of antibiotic prophylaxis in cataract surgery. The first key variable was an exposure variable, intracameral antibiotic injection. The second was an intraoperative complication, posterior capsular rupture (PCR), that functioned as a potential confounder. To help other researchers use NLP in their settings, we describe our NLP protocol and lessons learned. Methods For each of the two variables, we used SAS Text Miner and other SAS text-processing modules with a training set of 10,000 (1.3%) operative notes to develop a lexicon. The lexica identified misspellings, abbreviations, and negations, and linked words into concepts (e.g., “antibiotic” linked with “injection”). We confirmed the NLP tools by iteratively obtaining random samples of 2,000 (0.3%) notes, with replacement. Results The NLP tools identified approximately 60,000 intracameral antibiotic injections and 3,500 cases of PCR. The positive and negative predictive values for intracameral antibiotic injection exceeded 99%. For the intraoperative complication, they exceeded 94%. Conclusion NLP was a valid and feasible method for obtaining critical variables needed for a research study of surgical safety. These NLP tools were intended for use in the study sample. Use with external datasets or future datasets in our own setting would require further testing. PMID:28052483

  13. A neuropeptide-mediated stretch response links muscle contraction to changes in neurotransmitter release

    PubMed Central

    Hu, Zhitao; Pym, Edward C.G.; Babu, Kavita; Vashlishan Murray, Amy B.; Kaplan, Joshua M.

    2011-01-01

    Although C. elegans has been utilized extensively to study synapse formation and function, relatively little is known about synaptic plasticity in C. elegans. We show that a brief treatment with the cholinesterase inhibitor aldicarb induces a form of presynaptic potentiation whereby ACh release at neuromuscular junctions (NMJs) is doubled. Aldicarb-induced potentiation was eliminated by mutations that block processing of pro-neuropeptides, by mutations inactivating a single pro-neuropeptide (NLP-12), and by those inactivating an NLP-12 receptor (CKR-2). NLP-12 expression is limited to a single stretch-activated neuron, DVA. Analysis of a YFP-tagged NLP-12 suggests that aldicarb stimulates DVA secretion of NLP-12. Mutations disrupting the DVA mechanoreceptor (TRP-4) decreased aldicarb-induced NLP-12 secretion and blocked aldicarb-induced synaptic potentiation. Mutants lacking NLP-12 or CKR-2 have decreased locomotion rates. Collectively, these results suggest that NLP-12 mediates a mechanosensory feedback loop that couples muscle contraction to changes in presynaptic release, thereby providing a mechanism for proprioceptive control of locomotion. PMID:21745640

  14. An Evaluation of a Natural Language Processing Tool for Identifying and Encoding Allergy Information in Emergency Department Clinical Notes

    PubMed Central

    Goss, Foster R.; Plasek, Joseph M.; Lau, Jason J.; Seger, Diane L.; Chang, Frank Y.; Zhou, Li

    2014-01-01

    Emergency department (ED) visits due to allergic reactions are common. Allergy information is often recorded in free-text provider notes; however, this domain has not yet been widely studied by the natural language processing (NLP) community. We developed an allergy module built on the MTERMS NLP system to identify and encode food, drug, and environmental allergies and allergic reactions. The module included updates to our lexicon using standard terminologies, and novel disambiguation algorithms. We developed an annotation schema and annotated 400 ED notes that served as a gold standard for comparison to MTERMS output. MTERMS achieved an F-measure of 87.6% for the detection of allergen names and no known allergies, 90% for identifying true reactions in each allergy statement where true allergens were also identified, and 69% for linking reactions to their allergen. These preliminary results demonstrate the feasibility of using NLP to extract and encode allergy information from clinical notes. PMID:25954363

  15. Natural Language Processing as a Discipline at LLNL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Firpo, M A

    The field of Natural Language Processing (NLP) is described as it applies to the needs of LLNL in handling free-text. The state of the practice is outlined with the emphasis placed on two specific aspects of NLP: Information Extraction and Discourse Integration. A brief description is included of the NLP applications currently being used at LLNL. A gap analysis provides a look at where the technology needs work in order to meet the needs of LLNL. Finally, recommendations are made to meet these needs.

  16. Towards comprehensive syntactic and semantic annotations of the clinical narrative

    PubMed Central

    Albright, Daniel; Lanfranchi, Arrick; Fredriksen, Anwen; Styler, William F; Warner, Colin; Hwang, Jena D; Choi, Jinho D; Dligach, Dmitriy; Nielsen, Rodney D; Martin, James; Ward, Wayne; Palmer, Martha; Savova, Guergana K

    2013-01-01

    Objective To create annotated clinical narratives with layers of syntactic and semantic labels to facilitate advances in clinical natural language processing (NLP). To develop NLP algorithms and open source components. Methods Manual annotation of a clinical narrative corpus of 127 606 tokens following the Treebank schema for syntactic information, PropBank schema for predicate-argument structures, and the Unified Medical Language System (UMLS) schema for semantic information. NLP components were developed. Results The final corpus consists of 13 091 sentences containing 1772 distinct predicate lemmas. Of the 766 newly created PropBank frames, 74 are verbs. There are 28 539 named entity (NE) annotations spread over 15 UMLS semantic groups, one UMLS semantic type, and the Person semantic category. The most frequent annotations belong to the UMLS semantic groups of Procedures (15.71%), Disorders (14.74%), Concepts and Ideas (15.10%), Anatomy (12.80%), Chemicals and Drugs (7.49%), and the UMLS semantic type of Sign or Symptom (12.46%). Inter-annotator agreement results: Treebank (0.926), PropBank (0.891–0.931), NE (0.697–0.750). The part-of-speech tagger, constituency parser, dependency parser, and semantic role labeler are built from the corpus and released open source. A significant limitation uncovered by this project is the need for the NLP community to develop a widely agreed-upon schema for the annotation of clinical concepts and their relations. Conclusions This project takes a foundational step towards bringing the field of clinical NLP up to par with NLP in the general domain. The corpus creation and NLP components provide a resource for research and application development that would have been previously impossible. PMID:23355458

  17. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

    PubMed

    Velupillai, S; Mowery, D; South, B R; Kvist, M; Dalianis, H

    2015-08-13

    We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices.

  18. v3NLP Framework: Tools to Build Applications for Extracting Concepts from Clinical Text

    PubMed Central

    Divita, Guy; Carter, Marjorie E.; Tran, Le-Thuy; Redd, Doug; Zeng, Qing T; Duvall, Scott; Samore, Matthew H.; Gundlapalli, Adi V.

    2016-01-01

    Introduction: Substantial amounts of clinically significant information are contained only within the narrative of the clinical notes in electronic medical records. The v3NLP Framework is a set of “best-of-breed” functionalities developed to transform this information into structured data for use in quality improvement, research, population health surveillance, and decision support. Background: MetaMap, cTAKES and similar well-known natural language processing (NLP) tools do not have sufficient scalability out of the box. The v3NLP Framework evolved out of the necessity to scale these tools up and provide a framework to customize and tune techniques that fit a variety of tasks, including document classification, tuned concept extraction for specific conditions, patient classification, and information retrieval. Innovation: Beyond scalability, several v3NLP Framework-developed projects have been efficacy tested and benchmarked. While v3NLP Framework includes annotators, pipelines and applications, its functionalities enable developers to create novel annotators and to place annotators into pipelines and scaled applications. Discussion: The v3NLP Framework has been successfully utilized in many projects including general concept extraction, risk factors for homelessness among veterans, and identification of mentions of the presence of an indwelling urinary catheter. Projects as diverse as predicting colonization with methicillin-resistant Staphylococcus aureus and extracting references to military sexual trauma are being built using v3NLP Framework components. Conclusion: The v3NLP Framework is a set of functionalities and components that provide Java developers with the ability to create novel annotators and to place those annotators into pipelines and applications to extract concepts from clinical text. There are scale-up and scale-out functionalities to process large numbers of records. PMID:27683667

  19. Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis

    PubMed Central

    Velupillai, S.; Mowery, D.; South, B. R.; Kvist, M.; Dalianis, H.

    2015-01-01

    Summary Objectives We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis. Methods We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers. Results Significant articles published within this time-span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we provide a reflection upon most recent developments and potential areas of future NLP development and applications. Conclusions There has been an increase of advances within key NLP subtasks that support semantic analysis. Performance of NLP semantic analysis is, in many cases, close to that of agreement between humans. The creation and release of corpora annotated with complex semantic information models has greatly supported the development of new tools and approaches. Research on non-English languages is continuously growing. NLP methods have sometimes been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their utilization in clinical settings. A plethora of new clinical use cases are emerging due to established health care initiatives and additional patient-generated sources through the extensive use of social media and other devices. PMID:26293867

  20. Natural Language Processing for Asthma Ascertainment in Different Practice Settings.

    PubMed

    Wi, Chung-Il; Sohn, Sunghwan; Ali, Mir; Krusemark, Elizabeth; Ryu, Euijung; Liu, Hongfang; Juhn, Young J

    We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic. This study aimed to adapt NLP-PAC to a different health care setting, Sanford Children Hospital, by assessing its external validity. The study was designed as a retrospective cohort study that used a random sample of 2011-2012 Sanford Birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. Association of known asthma-related risk factors with the Sanford-NLP algorithm-driven asthma ascertainment was tested. Among the eligible test cohort (n = 297), 160 (53%) were males, 268 (90%) white, and the median age was 2.3 years (range, 1.5-3.1 years). NLP-PAC, after adaptation, and the human abstractor identified 74 (25%) and 72 (24%) subjects, respectively, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92%, 96%, 89%, and 97%, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review. Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
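
    The four reported performance figures can be rederived from the counts given in the abstract. The sketch below assumes that the human abstractor is the reference standard and that all 297 test subjects were classified by both approaches; the variable names are illustrative.

```python
# Counts taken from the abstract.
n_total = 297      # eligible test cohort
nlp_pos = 74       # subjects identified by NLP-PAC
human_pos = 72     # subjects identified by the human abstractor
both_pos = 66      # subjects identified by both approaches

# Confusion matrix, assuming the human abstractor is the reference standard.
tp = both_pos
fp = nlp_pos - both_pos            # NLP positive, abstractor negative
fn = human_pos - both_pos          # abstractor positive, NLP negative
tn = n_total - tp - fp - fn        # negative by both

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {value:.1%}")
```

    Under these assumptions the rounded values match the 92%, 96%, 89%, and 97% reported above.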

  1. Screening pregnant women for suicidal behavior in electronic medical records: diagnostic codes vs. clinical notes processed by natural language processing.

    PubMed

    Zhong, Qiu-Yue; Karlson, Elizabeth W; Gelaye, Bizu; Finan, Sean; Avillach, Paul; Smoller, Jordan W; Cai, Tianxi; Williams, Michelle A

    2018-05-29

    We examined the comparative performance of structured, diagnostic codes vs. natural language processing (NLP) of unstructured text for screening suicidal behavior among pregnant women in electronic medical records (EMRs). Women aged 10-64 years with at least one diagnostic code related to pregnancy or delivery (N = 275,843) from Partners HealthCare were included as our "datamart." Diagnostic codes related to suicidal behavior were applied to the datamart to screen women for suicidal behavior. Among women without any diagnostic codes related to suicidal behavior (n = 273,410), 5880 women were randomly sampled, of whom 1120 had at least one mention of terms related to suicidal behavior in clinical notes. NLP was then used to process clinical notes for the 1120 women. Chart reviews were performed for subsamples of women. Using diagnostic codes, 196 pregnant women were screened positive for suicidal behavior, among whom 149 (76%) had confirmed suicidal behavior by chart review. Using NLP among those without diagnostic codes, 486 pregnant women were screened positive for suicidal behavior, among whom 146 (30%) had confirmed suicidal behavior by chart review. The use of NLP substantially improves the sensitivity of screening suicidal behavior in EMRs. However, the prevalence of confirmed suicidal behavior was lower among women who did not have diagnostic codes for suicidal behavior but screened positive by NLP. NLP should be used together with diagnostic codes for future EMR-based phenotyping studies for suicidal behavior.

  2. RNAi-mediated disruption of neuropeptide genes, nlp-3 and nlp-12, cause multiple behavioral defects in Meloidogyne incognita.

    PubMed

    Dash, Manoranjan; Dutta, Tushar K; Phani, Victor; Papolu, Pradeep K; Shivakumara, Tagginahalli N; Rao, Uma

    2017-08-26

    Owing to the current deficiencies in chemical control options and unavailability of novel management strategies, root-knot nematode (M. incognita) infections remain widespread with significant socio-economic impacts. Helminth nervous systems are peptide-rich and appear to be putative drug targets that could be exploited by antihelmintic chemotherapy. Herein, to characterize the novel peptidergic neurotransmitters, in silico mining of M. incognita genomic and transcriptomic datasets revealed the presence of 16 neuropeptide-like protein (nlp) genes with structural hallmarks of neuropeptide preproproteins; among which 13 nlps were PCR-amplified and sequenced. Two key nlp genes (Mi-nlp-3 and Mi-nlp-12) were localized to the basal bulb and tail region of nematode body via in situ hybridization assay. Mi-nlp-3 and Mi-nlp-12 were greatly expressed (in qRT-PCR assay) in the pre-parasitic juveniles and adult females, suggesting the association of these genes in host recognition, development and reproduction of M. incognita. In vitro knockdown of Mi-nlp-3 and Mi-nlp-12 via RNAi demonstrated the significant reduction in attraction and penetration of M. incognita in tomato root in Pluronic gel medium. A pronounced perturbation in development and reproduction of NLP-silenced worms was also documented in adzuki beans in CYG growth pouches. The deleterious phenotypes obtained due to NLP knockdown suggest that transgenic plants engineered to express RNA constructs targeting nlp genes may emerge as an environmentally viable option to manage nematode problems in crop plants. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. Computer Assisted Reading in German as a Foreign Language, Developing and Testing an NLP-Based Application

    ERIC Educational Resources Information Center

    Wood, Peter

    2011-01-01

    "QuickAssist," the program presented in this paper, uses natural language processing (NLP) technologies. It places a range of NLP tools at the disposal of learners, intended to enable them to independently read and comprehend a German text of their choice while they extend their vocabulary, learn about different uses of particular words,…

  4. Natural Language Processing in Radiology: A Systematic Review.

    PubMed

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A

    2016-05-01

    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. © RSNA, 2016. Online supplemental material is available for this article.

  5. Automatically Detecting Failures in Natural Language Processing Tools for Online Community Text.

    PubMed

    Park, Albert; Hartzler, Andrea L; Huh, Jina; McDonald, David W; Pratt, Wanda

    2015-08-31

    The prevalence and value of patient-generated health text are increasing, but processing such text remains problematic. Although existing biomedical natural language processing (NLP) tools are appealing, most were developed to process clinician- or researcher-generated text, such as clinical notes or journal articles. In addition to being constructed for different types of text, other challenges of using existing NLP include constantly changing technologies, source vocabularies, and characteristics of text. These continuously evolving challenges warrant the need for applying low-cost systematic assessment. However, the primarily accepted evaluation method in NLP, manual annotation, requires tremendous effort and time. The primary objective of this study is to explore an alternative approach: using low-cost, automated methods to detect failures (eg, incorrect boundaries, missed terms, mismapped concepts) when processing patient-generated text with existing biomedical NLP tools. We first characterize common failures that NLP tools can make in processing online community text. We then demonstrate the feasibility of our automated approach in detecting these common failures using one of the most popular biomedical NLP tools, MetaMap. Using 9657 posts from an online cancer community, we explored our automated failure detection approach in two steps: (1) to characterize the failure types, we first manually reviewed MetaMap's commonly occurring failures, grouped the inaccurate mappings into failure types, and then identified causes of the failures through iterative rounds of manual review using open coding, and (2) to automatically detect these failure types, we then explored combinations of existing NLP techniques and dictionary-based matching for each failure cause. Finally, we manually evaluated the automatically detected failures.
From our manual review, we characterized three types of failure: (1) boundary failures, (2) missed term failures, and (3) word ambiguity failures. Within these three failure types, we discovered 12 causes of inaccurate mappings of concepts. We used automated methods to detect almost half of 383,572 MetaMap's mappings as problematic. Word sense ambiguity failure was the most widely occurring, comprising 82.22% of failures. Boundary failure was the second most frequent, amounting to 15.90% of failures, while missed term failures were the least common, making up 1.88% of failures. The automated failure detection achieved precision, recall, accuracy, and F1 score of 83.00%, 92.57%, 88.17%, and 87.52%, respectively. We illustrate the challenges of processing patient-generated online health community text and characterize failures of NLP tools on this patient-generated health text, demonstrating the feasibility of our low-cost approach to automatically detect those failures. Our approach shows the potential for scalable and effective solutions to automatically assess the constantly evolving NLP tools and source vocabularies to process patient-generated text.
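
    The F1 score quoted in this abstract can be cross-checked against the quoted precision and recall, because F1 is their harmonic mean. The sketch below is only an arithmetic check on the published figures; the function and variable names are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract, converted from percentages to fractions.
precision, recall = 0.8300, 0.9257
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")

# The three failure-type shares quoted in the abstract should cover all failures.
shares = {"word ambiguity": 82.22, "boundary": 15.90, "missed term": 1.88}
total_share = sum(shares.values())
print(f"total share = {total_share:.2f}%")
```

    The recomputed F1 agrees with the reported 87.52% to two decimal places, and the three shares sum to 100%.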

  6. Assessing Question Quality Using NLP

    ERIC Educational Resources Information Center

    Kopp, Kristopher J.; Johnson, Amy M.; Crossley, Scott A.; McNamara, Danielle S.

    2017-01-01

    An NLP algorithm was developed to assess question quality to inform feedback on questions generated by students within iSTART (an intelligent tutoring system that teaches reading strategies). A corpus of 4575 questions was coded using a four-level taxonomy. NLP indices were calculated for each question and machine learning was used to predict…

  7. The Promise of NLP and Speech Processing Technologies in Language Assessment

    ERIC Educational Resources Information Center

    Chapelle, Carol A.; Chung, Yoo-Ree

    2010-01-01

    Advances in natural language processing (NLP) and automatic speech recognition and processing technologies offer new opportunities for language testing. Despite their potential uses on a range of language test item types, relatively little work has been done in this area, and it is therefore not well understood by test developers, researchers or…

  8. Automated encoding of clinical documents based on natural language processing.

    PubMed

    Friedman, Carol; Shagina, Lyudmila; Lussier, Yves; Hripcsak, George

    2004-01-01

    The aim of this study was to develop a method based on natural language processing (NLP) that automatically maps an entire clinical document to codes with modifiers and to quantitatively evaluate the method. An existing NLP system, MedLEE, was adapted to automatically generate codes. The method involves matching of structured output generated by MedLEE consisting of findings and modifiers to obtain the most specific code. Recall and precision applied to Unified Medical Language System (UMLS) coding were evaluated in two separate studies. Recall was measured using a test set of 150 randomly selected sentences, which were processed using MedLEE. Results were compared with a reference standard determined manually by seven experts. Precision was measured using a second test set of 150 randomly selected sentences from which UMLS codes were automatically generated by the method and then validated by experts. Recall of the system for UMLS coding of all terms was .77 (95% CI .72-.81), and for coding terms that had corresponding UMLS codes recall was .83 (.79-.87). Recall of the system for extracting all terms was .84 (.81-.88). Recall of the experts ranged from .69 to .91 for extracting terms. The precision of the system was .89 (.87-.91), and precision of the experts ranged from .61 to .91. Extraction of relevant clinical information and UMLS coding were accomplished using a method based on NLP. The method appeared to be comparable to or better than six experts. The advantage of the method is that it maps text to codes along with other related information, rendering the coded output suitable for effective retrieval.
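
    The idea of matching a finding and its modifiers to "the most specific code" can be illustrated with a small sketch. This is not MedLEE's algorithm, which the abstract does not detail; it assumes a hypothetical code table in which each entry lists the modifiers it requires, and it picks the entry that requires the most modifiers among those fully satisfied. All code identifiers below are placeholders, not real UMLS codes.

```python
from typing import Optional

# Hypothetical code table: (finding, required modifiers, code).
# The identifiers are placeholders for illustration, not real UMLS codes.
CODE_TABLE = [
    ("pain", frozenset(), "CODE-PAIN"),
    ("pain", frozenset({"chest"}), "CODE-CHEST-PAIN"),
    ("pain", frozenset({"chest", "severe"}), "CODE-SEVERE-CHEST-PAIN"),
]


def most_specific_code(finding: str, modifiers: set[str]) -> Optional[str]:
    """Return the code whose required modifiers are all present and most numerous."""
    candidates = [
        (len(required), code)
        for table_finding, required, code in CODE_TABLE
        if table_finding == finding and required <= modifiers
    ]
    if not candidates:
        return None
    # The candidate requiring the most modifiers is treated as the most specific.
    return max(candidates)[1]


print(most_specific_code("pain", {"chest", "severe", "left"}))
print(most_specific_code("pain", {"chest"}))
```

    Modifiers that no code requires (here, "left") are simply ignored by the match, so they can still travel alongside the code as related information.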

  9. Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language

    PubMed Central

    Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang

    1999-01-01

    Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230
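
    The document model described here, a structured component whose elements link back to portions of the original text, can be sketched with Python's standard library. The element and attribute names below are hypothetical, since the paper's DTD is not reproduced in the abstract, and the standard-library parser checks only well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

report_text = "Chest film shows no evidence of pneumonia."

# Hypothetical element and attribute names; the paper's DTD is not reproduced here.
doc = ET.Element("report")
text_el = ET.SubElement(doc, "text")
text_el.text = report_text  # the original contents are retained verbatim

structured = ET.SubElement(doc, "structured")
phrase = "pneumonia"
start = report_text.index(phrase)
ET.SubElement(
    structured,
    "finding",
    name=phrase,
    certainty="no",
    start=str(start),               # character offsets link the element
    end=str(start + len(phrase)),   # back to the original text
)

xml_string = ET.tostring(doc, encoding="unicode")

# Round trip: parse the output and use the offsets to recover the source phrase.
parsed = ET.fromstring(xml_string)
recovered = []
for finding in parsed.iter("finding"):
    s, e = int(finding.get("start")), int(finding.get("end"))
    recovered.append(parsed.findtext("text")[s:e])
print(recovered)
```

    Querying the structured component (for example, all findings with a given certainty) and then following the offsets is what allows the salient text to be highlighted for manual review.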

  10. Automatic Lung-RADS™ classification with a natural language processing system.

    PubMed

    Beyer, Sebastian E; McKee, Brady J; Regis, Shawn M; McKee, Andrea B; Flacke, Sebastian; El Saadawi, Gilan; Wald, Christoph

    2017-09-01

    Our aim was to train a natural language processing (NLP) algorithm to capture imaging characteristics of lung nodules reported in a structured CT report and suggest the applicable Lung-RADS™ (LR) category. Our study included structured, clinical reports of consecutive CT lung screening (CTLS) exams performed from 08/2014 to 08/2015 at an ACR accredited Lung Cancer Screening Center. All patients screened were at high-risk for lung cancer according to the NCCN Guidelines®. All exams were interpreted by one of three radiologists credentialed to read CTLS exams using LR using a standard reporting template. Training and test sets consisted of consecutive exams. Lung screening exams were divided into two groups: three training sets (500, 120, and 383 reports each) and one final evaluation set (498 reports). NLP algorithm results were compared with the gold standard of LR category assigned by the radiologist. The sensitivity/specificity of the NLP algorithm to correctly assign LR categories for suspicious nodules (LR 4) and positive nodules (LR 3/4) were 74.1%/98.6% and 75.0%/98.8% respectively. The majority of mismatches occurred in cases where pulmonary findings were present not currently addressed by LR. Misclassifications also resulted from the failure to identify exams as follow-up and the failure to completely characterize part-solid nodules. In a sub-group analysis among structured reports with standardized language, the sensitivity and specificity to detect LR 4 nodules were 87.0% and 99.5%, respectively. An NLP system can accurately suggest the appropriate LR category from CTLS exam findings when standardized reporting is used.

  11. Automatic Lung-RADS™ classification with a natural language processing system

    PubMed Central

    Beyer, Sebastian E.; Regis, Shawn M.; McKee, Andrea B.; Flacke, Sebastian; El Saadawi, Gilan; Wald, Christoph

    2017-01-01

    Background Our aim was to train a natural language processing (NLP) algorithm to capture imaging characteristics of lung nodules reported in a structured CT report and suggest the applicable Lung-RADS™ (LR) category. Methods Our study included structured, clinical reports of consecutive CT lung screening (CTLS) exams performed from 08/2014 to 08/2015 at an ACR accredited Lung Cancer Screening Center. All patients screened were at high-risk for lung cancer according to the NCCN Guidelines®. All exams were interpreted by one of three radiologists credentialed to read CTLS exams using LR using a standard reporting template. Training and test sets consisted of consecutive exams. Lung screening exams were divided into two groups: three training sets (500, 120, and 383 reports each) and one final evaluation set (498 reports). NLP algorithm results were compared with the gold standard of LR category assigned by the radiologist. Results The sensitivity/specificity of the NLP algorithm to correctly assign LR categories for suspicious nodules (LR 4) and positive nodules (LR 3/4) were 74.1%/98.6% and 75.0%/98.8% respectively. The majority of mismatches occurred in cases where pulmonary findings were present not currently addressed by LR. Misclassifications also resulted from the failure to identify exams as follow-up and the failure to completely characterize part-solid nodules. In a sub-group analysis among structured reports with standardized language, the sensitivity and specificity to detect LR 4 nodules were 87.0% and 99.5%, respectively. Conclusions An NLP system can accurately suggest the appropriate LR category from CTLS exam findings when standardized reporting is used. PMID:29221286

  12. Validating a strategy for psychosocial phenotyping using a large corpus of clinical text.

    PubMed

    Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H

    2013-12-01

    To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles were chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6-0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype.
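
    The overall hit rate follows directly from the two counts given in the abstract. The short check below uses only those published numbers; the variable names are illustrative.

```python
concepts_found = 58_707        # psychosocial concepts identified
documents_processed = 316_355  # documents in which they were sought

hit_rate = concepts_found / documents_processed
print(f"hit rate = {hit_rate:.2f} concepts per document")

# Upper bound implied by the sampling design:
# 1500 documents from each of 218 note titles.
max_sample = 1500 * 218
print(f"upper bound on sample size = {max_sample} documents")
```

    Rounded to one decimal place the hit rate matches the reported 0.2. The 316 355 documents reported are somewhat fewer than the upper bound implied by the sampling design; the abstract does not say why.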

  13. Validating a strategy for psychosocial phenotyping using a large corpus of clinical text

    PubMed Central

    Gundlapalli, Adi V; Redd, Andrew; Carter, Marjorie; Divita, Guy; Shen, Shuying; Palmer, Miland; Samore, Matthew H

    2013-01-01

    Objective To develop algorithms to improve efficiency of patient phenotyping using natural language processing (NLP) on text data. Of a large number of note titles available in our database, we sought to determine those with highest yield and precision for psychosocial concepts. Materials and methods From a database of over 1 billion documents from US Department of Veterans Affairs medical facilities, a random sample of 1500 documents from each of 218 enterprise note titles were chosen. Psychosocial concepts were extracted using a UIMA-AS-based NLP pipeline (v3NLP), using a lexicon of relevant concepts with negation and template format annotators. Human reviewers evaluated a subset of documents for false positives and sensitivity. High-yield documents were identified by hit rate and precision. Reasons for false positivity were characterized. Results A total of 58 707 psychosocial concepts were identified from 316 355 documents for an overall hit rate of 0.2 concepts per document (median 0.1, range 1.6–0). Of 6031 concepts reviewed from a high-yield set of note titles, the overall precision for all concept categories was 80%, with variability among note titles and concept categories. Reasons for false positivity included templating, negation, context, and alternate meaning of words. The sensitivity of the NLP system was noted to be 49% (95% CI 43% to 55%). Conclusions Phenotyping using NLP need not involve the entire document corpus. Our methods offer a generalizable strategy for scaling NLP pipelines to large free text corpora with complex linguistic annotations in attempts to identify patients of a certain phenotype. PMID:24169276

  14. Speech Processing and Recognition (SPaRe)

    DTIC Science & Technology

    2011-01-01

    results in the areas of automatic speech recognition (ASR), speech processing, machine translation (MT), natural language processing (NLP), and...Processing (NLP), Information Retrieval (IR) 16. SECURITY CLASSIFICATION OF: UNCLASSIFED 17. LIMITATION OF ABSTRACT 18. NUMBER OF PAGES 19a. NAME...Figure 9, the IOC was only expected to provide document submission and search; automatic speech recognition (ASR) for English, Spanish, Arabic, and

  15. A bibliometric analysis of natural language processing in medical research.

    PubMed

    Chen, Xieling; Xie, Haoran; Wang, Fu Lee; Liu, Ziqing; Xu, Juan; Hao, Tianyong

    2018-03-22

    Natural language processing (NLP) plays an increasingly significant role in advancing medicine. Rich research achievements of NLP methods and applications for medical information processing are available. It is of great significance to conduct a deep analysis to understand the recent development of the NLP-empowered medical research field. However, few studies examining the research status of this field could be found. Therefore, this study aims to quantitatively assess the academic output of NLP in the medical research field. We conducted a bibliometric analysis on NLP-empowered medical research publications retrieved from PubMed in the period 2007-2016. The analysis focused on three aspects. Firstly, the literature distribution characteristics were obtained with a statistical analysis method. Secondly, a network analysis method was used to reveal scientific collaboration relations. Finally, thematic discovery and evolution were reflected using an affinity propagation clustering method. There were 1405 NLP-empowered medical research publications published during the 10 years, with an average annual growth rate of 18.39%. The 10 most productive publication sources together contributed more than 50% of the total publications. The USA had the highest number of publications. A moderately significant correlation between a country's publications and GDP per capita was revealed. Denny, Joshua C was the most productive author. Mayo Clinic was the most productive affiliation. The annual co-affiliation and co-country rates reached 64.04% and 15.79% in 2016, respectively. Ten main thematic areas were identified, including Computational biology, Terminology mining, Information extraction, Text classification, Social medium as data source, Information retrieval, etc. Conclusions: A bibliometric analysis of NLP-empowered medical research publications for uncovering the recent research status is presented. The results can assist relevant researchers, especially newcomers, in understanding the research development systematically, seeking scientific cooperation partners, optimizing research topic choices and monitoring new scientific or technological activities.

  16. Aurora B Interaction of Centrosomal Nlp Regulates Cytokinesis*

    PubMed Central

    Yan, Jie; Jin, Shunqian; Li, Jia; Zhan, Qimin

    2010-01-01

    Cytokinesis is a fundamental cellular process, which ensures equal abscission and fosters diploid progenies. Aberrant cytokinesis may result in genomic instability and cell transformation. However, the underlying regulatory machinery of cytokinesis is largely undefined. Here, we demonstrate that Nlp (Ninein-like protein), a recently identified BRCA1-associated centrosomal protein that is required for centrosomes maturation at interphase and spindle formation in mitosis, also contributes to the accomplishment of cytokinesis. Through immunofluorescent analysis, Nlp is found to localize at midbody during cytokinesis. Depletion of endogenous Nlp triggers aborted division and subsequently leads to multinucleated phenotypes. Nlp can be recruited by Aurora B to the midbody apparatus via their physical association at the late stage of mitosis. Disruption of their interaction induces aborted cytokinesis. Importantly, Nlp is characterized as a novel substrate of Aurora B and can be phosphorylated by Aurora B. The specific phosphorylation sites are mapped at Ser-185, Ser-448, and Ser-585. The phosphorylation at Ser-448 and Ser-585 is likely required for Nlp association with Aurora B and localization at midbody. Meanwhile, the phosphorylation at Ser-185 is vital to Nlp protein stability. Disruptions of these phosphorylation sites abolish cytokinesis and lead to chromosomal instability. Collectively, these observations demonstrate that regulation of Nlp by Aurora B is critical for the completion of cytokinesis, providing novel insights into understanding the machinery of cell cycle progression. PMID:20864540

  17. Aurora B interaction of centrosomal Nlp regulates cytokinesis.

    PubMed

    Yan, Jie; Jin, Shunqian; Li, Jia; Zhan, Qimin

    2010-12-17

    Cytokinesis is a fundamental cellular process, which ensures equal abscission and fosters diploid progenies. Aberrant cytokinesis may result in genomic instability and cell transformation. However, the underlying regulatory machinery of cytokinesis is largely undefined. Here, we demonstrate that Nlp (Ninein-like protein), a recently identified BRCA1-associated centrosomal protein that is required for centrosomes maturation at interphase and spindle formation in mitosis, also contributes to the accomplishment of cytokinesis. Through immunofluorescent analysis, Nlp is found to localize at midbody during cytokinesis. Depletion of endogenous Nlp triggers aborted division and subsequently leads to multinucleated phenotypes. Nlp can be recruited by Aurora B to the midbody apparatus via their physical association at the late stage of mitosis. Disruption of their interaction induces aborted cytokinesis. Importantly, Nlp is characterized as a novel substrate of Aurora B and can be phosphorylated by Aurora B. The specific phosphorylation sites are mapped at Ser-185, Ser-448, and Ser-585. The phosphorylation at Ser-448 and Ser-585 is likely required for Nlp association with Aurora B and localization at midbody. Meanwhile, the phosphorylation at Ser-185 is vital to Nlp protein stability. Disruptions of these phosphorylation sites abolish cytokinesis and lead to chromosomal instability. Collectively, these observations demonstrate that regulation of Nlp by Aurora B is critical for the completion of cytokinesis, providing novel insights into understanding the machinery of cell cycle progression.

  18. Drosophila TAP/p32 is a core histone chaperone that cooperates with NAP-1, NLP, and nucleophosmin in sperm chromatin remodeling during fertilization

    PubMed Central

    Emelyanov, Alexander V.; Rabbani, Joshua; Mehta, Monika; Vershilova, Elena; Keogh, Michael C.

    2014-01-01

    Nuclear DNA in the male gamete of sexually reproducing animals is organized as sperm chromatin compacted primarily by sperm-specific protamines. Fertilization leads to sperm chromatin remodeling, during which protamines are expelled and replaced by histones. Despite our increased understanding of the factors that mediate nucleosome assembly in the nascent male pronucleus, the machinery for protamine removal remains largely unknown. Here we identify four Drosophila protamine chaperones that mediate the dissociation of protamine–DNA complexes: NAP-1, NLP, and nucleophosmin are previously characterized histone chaperones, and TAP/p32 has no known function in chromatin metabolism. We show that TAP/p32 is required for the removal of Drosophila protamine B in vitro, whereas NAP-1, NLP, and Nph share roles in the removal of protamine A. Embryos from P32-null females show defective formation of the male pronucleus in vivo. TAP/p32, similar to NAP-1, NLP, and Nph, facilitates nucleosome assembly in vitro and is therefore a histone chaperone. Furthermore, mutants of P32, Nlp, and Nph exhibit synthetic-lethal genetic interactions. In summary, we identified factors mediating protamine removal from DNA and reconstituted in a defined system the process of sperm chromatin remodeling that exchanges protamines for histones to form the nucleosome-based chromatin characteristic of somatic cells. PMID:25228646

  19. Natural Language Processing Technologies in Radiology Research and Clinical Applications.

    PubMed

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016.
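
    The goal stated here, turning free-form report text into a fixed collection of elements that can be queried for the presence or absence of a finding, can be shown with a deliberately simple sketch. It is not one of the techniques reviewed by the authors: the finding vocabulary and negation cues are hypothetical, and the rule (a cue appearing earlier in the same sentence negates the finding) is a crude stand-in for real negation detection.

```python
import re

# Hypothetical vocabulary and negation cues, chosen only for illustration.
FINDINGS = ["pneumothorax", "pleural effusion", "nodule"]
NEGATION_CUES = re.compile(r"\b(no|without|negative for)\b", re.IGNORECASE)


def structure_report(report: str) -> dict[str, str]:
    """Map each finding to 'present', 'absent', or 'not mentioned'."""
    result = {finding: "not mentioned" for finding in FINDINGS}
    # Split into sentences at whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        lowered = sentence.lower()
        for finding in FINDINGS:
            position = lowered.find(finding)
            if position == -1:
                continue
            # Treat the finding as negated if a cue appears earlier in the sentence.
            negated = NEGATION_CUES.search(lowered[:position]) is not None
            result[finding] = "absent" if negated else "present"
    return result


record = structure_report(
    "There is a 6 mm nodule in the right upper lobe. No pleural effusion."
)
print(record)
```

    Once reports are in this form, a query such as "all reports where nodule is present" becomes a simple filter over the structured records.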

  20. Natural Language Processing Technologies in Radiology Research and Clinical Applications

    PubMed Central

    Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.

    2016-01-01

    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536

  1. An Overview of Computer-Based Natural Language Processing.

    ERIC Educational Resources Information Center

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  2. NLPIR: A Theoretical Framework for Applying Natural Language Processing to Information Retrieval.

    ERIC Educational Resources Information Center

    Zhou, Lina; Zhang, Dongsong

    2003-01-01

    Proposes a theoretical framework called NLPIR that integrates natural language processing (NLP) into information retrieval (IR) based on the assumption that there exists representation distance between queries and documents. Discusses problems in traditional keyword-based IR, including relevance, and describes some existing NLP techniques.…

  3. Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data.

    PubMed

    Sauer, Brian C; Jones, Barbara E; Globe, Gary; Leng, Jianwei; Lu, Chao-Chin; He, Tao; Teng, Chia-Chen; Sullivan, Patrick; Zeng, Qing

    2016-01-01

    Pulmonary function tests (PFTs) are objective estimates of lung function, but are not reliably stored within the Veteran Health Affairs data systems as structured data. The aim of this study was to validate the natural language processing (NLP) tool we developed, which extracts spirometric values and responses to bronchodilator administration, against expert review, and to estimate the number of additional spirometric tests identified beyond the structured data. All patients at seven Veteran Affairs Medical Centers with a diagnostic code for asthma Jan 1, 2006-Dec 31, 2012 were included. Evidence of spirometry with a bronchodilator challenge (BDC) was extracted from structured data as well as clinical documents. NLP's performance was compared against a human reference standard using a random sample of 1,001 documents. In the validation set NLP demonstrated a precision of 98.9 percent (95 percent confidence interval (CI): 93.9 percent, 99.7 percent), recall of 97.8 percent (95 percent CI: 92.2 percent, 99.7 percent), and an F-measure of 98.3 percent for the forced vital capacity pre- and post pairs and precision of 100 percent (95 percent CI: 96.6 percent, 100 percent), recall of 100 percent (95 percent CI: 96.6 percent, 100 percent), and an F-measure of 100 percent for the forced expiratory volume in one second pre- and post pairs for bronchodilator administration. Application of the NLP increased the proportion identified with complete bronchodilator challenge by 25 percent. This technology can improve identification of PFTs for epidemiologic research. Caution must be taken in assuming that a single domain of clinical data can completely capture the scope of a disease, treatment, or clinical test.
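
    Extracting pre- and post-bronchodilator pairs from semistructured text can be sketched with a regular expression. The note layout below is hypothetical and far simpler than real PFT reports, and the pattern is not the authors' tool; it only illustrates the kind of pairing the abstract evaluates for forced vital capacity (FVC) and forced expiratory volume in one second (FEV1).

```python
import re

# Hypothetical note layout; real PFT reports vary widely and would need more patterns.
PAIR_PATTERN = re.compile(
    r"\b(?P<measure>FVC|FEV1)\b\D*?"
    r"pre\D*?(?P<pre>\d+(?:\.\d+)?)\D*?"
    r"post\D*?(?P<post>\d+(?:\.\d+)?)",
    re.IGNORECASE,
)


def extract_pairs(note: str) -> dict[str, tuple[float, float]]:
    """Return {measure: (pre, post)} for each pre/post pair found in the note."""
    return {
        m.group("measure").upper(): (float(m.group("pre")), float(m.group("post")))
        for m in PAIR_PATTERN.finditer(note)
    }


note = "Spirometry: FVC pre 3.10 L post 3.45 L; FEV1 pre 2.05 L post 2.40 L."
pairs = extract_pairs(note)
print(pairs)
```

    A measure reported without both values yields no pair under this pattern, which mirrors the abstract's focus on complete pre- and post pairs.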

  4. A pharmacological study of NLP-12 neuropeptide signaling in free-living and parasitic nematodes.

    PubMed

    Peeters, Lise; Janssen, Tom; De Haes, Wouter; Beets, Isabel; Meelkop, Ellen; Grant, Warwick; Schoofs, Liliane

    2012-03-01

    NLP-12a and b have been identified as cholecystokinin/sulfakinin-like neuropeptides in the free-living nematode Caenorhabditis elegans. They are suggested to play an important role in the regulation of digestive enzyme secretion and fat storage. This study reports on the identification and characterization of an NLP-12-like peptide precursor gene in the rat parasitic nematode Strongyloides ratti. The S. ratti NLP-12 peptides are able to activate both C. elegans CKR-2 receptor isoforms in a dose-dependent way with affinities in the same nanomolar range as the native C. elegans NLP-12 peptides. The C-terminal RPLQFamide sequence motif of the NLP-12 peptides is perfectly conserved between free-living and parasitic nematodes. Based on systemic amino acid replacements the Arg-, Leu- and Phe- residues appear to be critical for high-affinity receptor binding. Finally, a SAR analysis revealed the essential pharmacophore in C. elegans NLP-12b to be the pentapeptide RPLQFamide. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. Automated chart review utilizing natural language processing algorithm for asthma predictive index.

    PubMed

    Kaur, Harsheen; Sohn, Sunghwan; Wi, Chung-Il; Ryu, Euijung; Park, Miguel A; Bachman, Kay; Kita, Hirohito; Croghan, Ivana; Castro-Rodriguez, Jose A; Voge, Gretchen A; Liu, Hongfang; Juhn, Young J

    2018-02-13

    Thus far, no algorithms have been developed to automatically extract patients who meet Asthma Predictive Index (API) criteria from electronic health records (EHR). Our objective is to develop and validate a natural language processing (NLP) algorithm to identify patients that meet API criteria. This is a cross-sectional study nested in a birth cohort study in Olmsted County, MN. Asthma status ascertained by manual chart review based on API criteria served as gold standard. NLP-API was developed on a training cohort (n = 87) and validated on a test cohort (n = 427). Criterion validity was measured by sensitivity, specificity, positive predictive value and negative predictive value of the NLP algorithm against manual chart review for asthma status. Construct validity was determined by associations of asthma status defined by NLP-API with known risk factors for asthma. Among the eligible 427 subjects of the test cohort, 48% were males and 74% were White. Median age was 5.3 years (interquartile range 3.6-6.8). 35 (8%) had a history of asthma by NLP-API vs. 36 (8%) by abstractor with 31 by both approaches. NLP-API predicted asthma status with sensitivity 86%, specificity 98%, positive predictive value 88%, negative predictive value 98%. Asthma status by both NLP and manual chart review were significantly associated with the known asthma risk factors, such as history of allergic rhinitis, eczema, family history of asthma, and maternal history of smoking during pregnancy (p value < 0.05). Maternal smoking [odds ratio: 4.4, 95% confidence interval 1.8-10.7] was associated with asthma status determined by NLP-API and abstractor, and the effect sizes were similar between the reviews with 4.4 vs 4.2 respectively. NLP-API was able to ascertain asthma status in children by mining the EHR and has the potential to enhance asthma care and research through population management and large-scale studies when identifying children who meet API criteria.
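
    The counts in this abstract are enough to rebuild the 2x2 table behind the four validity measures. The sketch below assumes the 35, 36, and 31 figures all refer to the 427-subject test cohort, with manual chart review as the gold standard; variable names are illustrative.

```python
n_total = 427        # test cohort size
nlp_positive = 35    # asthma by NLP-API
chart_positive = 36  # asthma by manual chart review (gold standard)
both_positive = 31   # asthma by both approaches

tp = both_positive
fp = nlp_positive - tp       # flagged by NLP-API only
fn = chart_positive - tp     # flagged by chart review only
tn = n_total - tp - fp - fn  # flagged by neither

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

for name, value in [
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("PPV", ppv),
    ("NPV", npv),
]:
    print(f"{name}: {value:.1%}")
```

    The recomputed values fall within about one percentage point of the reported 86%, 98%, 88%, and 98%; the small differences may reflect rounding conventions or details not given in the abstract.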

  6. A CRF-based system for recognizing chemical entity mentions (CEMs) in biomedical literature

    PubMed Central

    2015-01-01

    Background In order to improve information access on chemical compounds and drugs (chemical entities) described in text repositories, it is crucial to be able to identify chemical entity mentions (CEMs) automatically within text. The CHEMDNER challenge in BioCreative IV was specially designed to promote the implementation of corresponding systems that are able to detect mentions of chemical compounds and drugs, and has two subtasks: CDI (Chemical Document Indexing) and CEM. Results Our system processing pipeline consists of three major components: pre-processing (sentence detection, tokenization), recognition (CRF-based approach), and post-processing (rule-based approach and format conversion). In our post-challenge system, the cost parameter in the CRF model was optimized by 10-fold cross validation with grid search, and a word representation feature induced by the Brown clustering method was introduced. For the CEM subtask, our official runs ranked in the top position, obtaining a maximum of 88.79% precision, 69.08% recall and 77.70% balanced F-measure, which were improved further to 88.43% precision, 76.48% recall and 82.02% balanced F-measure in our post-challenge system. Conclusions In our system, instead of extracting a CEM as a whole, we regarded it as a sequence labeling problem. Though our current system has much room for improvement, it is valuable in showing that the performance in terms of balanced F-measure can be improved largely by utilizing large amounts of relatively inexpensive un-annotated PubMed abstracts and optimizing the cost parameter in the CRF model. From our practice and lessons, if one directly utilizes some open-source natural language processing (NLP) toolkits, such as OpenNLP or Stanford CoreNLP, the false positive (FP) rate may be very high. It is better to develop some additional rules to minimize the FP rate if one does not want to re-train the related models. Our CEM recognition system is available at: http://www.SciTeMiner.org/XuShuo/Demo/CEM. PMID:25810768
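
    The balanced F-measure quoted in the abstract above is the harmonic mean of precision and recall. As a minimal check, assuming the standard F1 definition (the abstract does not spell out the formula) and a hypothetical helper name, the reported precision/recall pairs reproduce the reported F-measures:

```python
def balanced_f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1), in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported in the abstract.
official = balanced_f_measure(88.79, 69.08)
post_challenge = balanced_f_measure(88.43, 76.48)

print(f"official runs:  {official:.2f}")        # prints 77.70
print(f"post-challenge: {post_challenge:.2f}")  # prints 82.02
```

    Precision barely changes between the two systems; the gain in F-measure comes almost entirely from the higher recall.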

  7. Community challenges in biomedical text mining over 10 years: success, failure and the future

    PubMed Central

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    One effective way to improve the state of the art is through competitions. Following the success of the Critical Assessment of protein Structure Prediction (CASP) in bioinformatics research, a number of challenge evaluations have been organized by the text-mining research community to assess and advance natural language processing (NLP) research for biomedicine. In this article, we review the different community challenge evaluations held from 2002 to 2014 and their respective tasks. Furthermore, we examine these challenge tasks through their targeted problems in NLP research and biomedical applications, respectively. Next, we describe the general workflow of organizing a Biomedical NLP (BioNLP) challenge and involved stakeholders (task organizers, task data producers, task participants and end users). Finally, we summarize the impact and contributions by taking into account different BioNLP challenges as a whole, followed by a discussion of their limitations and difficulties. We conclude with future trends in BioNLP challenge evaluations. PMID:25935162

  8. An annotated corpus with nanomedicine and pharmacokinetic parameters

    PubMed Central

    Lewinski, Nastassja A; Jimenez, Ivan; McInnes, Bridget T

    2017-01-01

    A vast amount of data on nanomedicines is being generated and published, and natural language processing (NLP) approaches can automate the extraction of unstructured text-based data. Annotated corpora are a key resource for NLP and information extraction methods which employ machine learning. Although corpora are available for pharmaceuticals, resources for nanomedicines and nanotechnology are still limited. To foster nanotechnology text mining (NanoNLP) efforts, we have constructed a corpus of annotated drug product inserts taken from the US Food and Drug Administration’s Drugs@FDA online database. In this work, we present the development of the Engineered Nanomedicine Database corpus to support the evaluation of nanomedicine entity extraction. The data were manually annotated for 21 entity mentions consisting of nanomedicine physicochemical characterization, exposure, and biologic response information of 41 Food and Drug Administration-approved nanomedicines. We evaluate the reliability of the manual annotations and demonstrate the use of the corpus by evaluating two state-of-the-art named entity extraction systems, OpenNLP and Stanford NER. The annotated corpus is available open source and, based on these results, guidelines and suggestions for future development of additional nanomedicine corpora are provided. PMID:29066897

  9. Automated pancreatic cyst screening using natural language processing: a new tool in the early detection of pancreatic cancer

    PubMed Central

    Roch, Alexandra M; Mehrabi, Saeed; Krishnan, Anand; Schmidt, Heidi E; Kesterson, Joseph; Beesley, Chris; Dexter, Paul R; Palakal, Mathew; Schmidt, C Max

    2015-01-01

    Introduction As many as 3% of computed tomography (CT) scans detect pancreatic cysts. Because pancreatic cysts are incidental, ubiquitous and poorly understood, follow-up is often not performed. Pancreatic cysts may have a significant malignant potential and their identification represents a ‘window of opportunity’ for the early detection of pancreatic cancer. The purpose of this study was to implement an automated Natural Language Processing (NLP)-based pancreatic cyst identification system. Method A multidisciplinary team was assembled. NLP-based identification algorithms were developed based on key words commonly used by physicians to describe pancreatic cysts and programmed for automated search of electronic medical records. A pilot study was conducted prospectively in a single institution. Results From March to September 2013, 566 233 reports belonging to 50 669 patients were analysed. The mean number of patients reported with a pancreatic cyst was 88/month (range 78–98). The mean sensitivity and specificity were 99.9% and 98.8%, respectively. Conclusion NLP is an effective tool to automatically identify patients with pancreatic cysts based on electronic medical records (EMR). This highly accurate system can help capture patients ‘at-risk’ of pancreatic cancer in a registry. PMID:25537257
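
    The keyword-based identification described in the abstract above can be sketched as a simple pattern search over report text. This is only an illustration: the abstract does not list the key words used, so the patterns, the function name and the sample reports below are all hypothetical.

```python
import re

# Hypothetical keyword patterns; the study's actual key words are not
# given in the abstract.
CYST_PATTERNS = [
    r"pancreatic\s+cyst",                    # "pancreatic cyst(s)", "pancreatic cystic lesion"
    r"cyst(s|ic)?\b[^.]*\bpancrea(s|tic)",   # "cystic lesion in the pancreatic tail"
]
CYST_REGEX = re.compile("|".join(CYST_PATTERNS), re.IGNORECASE)


def flag_patients(reports):
    """Return the set of patient IDs with at least one matching report."""
    flagged = set()
    for patient_id, text in reports:
        if CYST_REGEX.search(text):
            flagged.add(patient_id)
    return flagged


# Made-up sample reports for demonstration only.
reports = [
    ("p1", "CT abdomen: 12 mm cystic lesion in the pancreatic tail."),
    ("p2", "No acute abnormality. Pancreas is unremarkable."),
    ("p3", "Pancreatic cyst, stable since prior exam."),
]
print(sorted(flag_patients(reports)))  # prints ['p1', 'p3']
```

    A bare keyword match like this would also flag negated mentions such as "no pancreatic cyst", so a usable system would need additional handling for negation and context.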

  10. Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing.

    PubMed

    Demner-Fushman, D; Elhadad, N

    2016-11-10

    This paper reviews work over the past two years in Natural Language Processing (NLP) applied to clinical and consumer-generated texts. We included any application or methodological publication that leverages text to facilitate healthcare and address the health-related needs of consumers and populations. Many important developments in clinical text processing, both foundational and task-oriented, were addressed in community-wide evaluations and discussed in corresponding special issues that are referenced in this review. These focused issues and in-depth reviews of several other active research areas, such as pharmacovigilance and summarization, allowed us to discuss in greater depth disease modeling and predictive analytics using clinical texts, and text analysis in social media for healthcare quality assessment, trends towards online interventions based on rapid analysis of health-related posts, and consumer health question answering, among other issues. Our analysis shows that although clinical NLP continues to advance towards practical applications and more NLP methods are used in large-scale live health information applications, more needs to be done to make NLP use in clinical applications a routine widespread reality. Progress in clinical NLP is mirrored by developments in social media text analysis: the research is moving from capturing trends to addressing individual health-related posts, thus showing potential to become a tool for precision medicine and a valuable addition to the standard healthcare quality evaluation tools.

  11. A Hybrid Approach to Clinical Question Answering

    DTIC Science & Technology

    2014-11-01

    participation in TREC, we submitted a single run using a hybrid Natural Language Processing (NLP)-driven approach to accomplish the given task. Evaluation re...for the CDS track uses a variety of NLP-based techniques to address the clinical questions provided. We present a description of our approach, and...discuss our experimental setup, results and evaluation in the subsequent sections. 2 Description of Our Approach Our hybrid NLP-driven method presents a

  12. Semantic characteristics of NLP-extracted concepts in clinical notes vs. biomedical literature.

    PubMed

    Wu, Stephen; Liu, Hongfang

    2011-01-01

    Natural language processing (NLP) has become crucial in unlocking information stored in free text, from both clinical notes and biomedical literature. Clinical notes convey clinical information related to individual patient health care, while biomedical literature communicates scientific findings. This work focuses on semantic characterization of texts at an enterprise scale, comparing and contrasting the two domains and their NLP approaches. We analyzed the empirical distributional characteristics of NLP-discovered named entities in Mayo Clinic clinical notes from 2001-2010, and in the 2011 MetaMapped Medline Baseline. We give qualitative and quantitative measures of domain similarity and point to the feasibility of transferring resources and techniques. An important by-product of this study is the development of a weighted ontology for each domain, which gives distributional semantic information that may be used to improve NLP applications.

  13. Differentiation of ileostomy from colostomy procedures: assessing the accuracy of current procedural terminology codes and the utility of natural language processing.

    PubMed

    Vo, Elaine; Davila, Jessica A; Hou, Jason; Hodge, Krystle; Li, Linda T; Suliburk, James W; Kao, Lillian S; Berger, David H; Liang, Mike K

    2013-08-01

    Large databases provide a wealth of information for researchers, but identifying patient cohorts often relies on the use of current procedural terminology (CPT) codes. In particular, studies of stoma surgery have been limited by the accuracy of CPT codes in identifying and differentiating ileostomy procedures from colostomy procedures. It is important to make this distinction because the prevalence of complications associated with stoma formation and reversal differs dramatically between types of stoma. Natural language processing (NLP) is a process that allows text-based searching. The Automated Retrieval Console is NLP-based software that allows investigators to design and perform NLP-assisted document classification. In this study, we evaluated the role of CPT codes and NLP in differentiating ileostomy from colostomy procedures. Using CPT codes, we conducted a retrospective study that identified all patients undergoing a stoma-related procedure at a single institution between January 2005 and December 2011. All operative reports during this time were reviewed manually to abstract the following variables: formation or reversal and ileostomy or colostomy. Sensitivity and specificity for validation of the CPT codes against the master surgery schedule were calculated. Operative reports were evaluated by use of NLP to differentiate ileostomy- from colostomy-related procedures. Sensitivity and specificity for identifying patients with ileostomy or colostomy procedures were calculated for CPT codes and NLP for the entire cohort. CPT codes performed well in identifying stoma procedures (sensitivity 87.4%, specificity 97.5%). A total of 664 stoma procedures were identified by CPT codes between 2005 and 2011. The CPT codes were adequate in identifying stoma formation (sensitivity 97.7%, specificity 72.4%) and stoma reversal (sensitivity 74.1%, specificity 98.7%), but they were inadequate in identifying ileostomy (sensitivity 35.0%, specificity 88.1%) and colostomy (75.2% and 80.9%). NLP performed with greater sensitivity, specificity, and accuracy than CPT codes in identifying stoma procedures and stoma types. Major differences where NLP outperformed CPT included identifying ileostomy (specificity 95.8%, sensitivity 88.3%, and accuracy 91.5%) and colostomy (97.6%, 90.5%, and 92.8%, respectively). CPT codes can effectively identify patients who have had stoma procedures and are adequate in distinguishing between formation and reversal; however, CPT codes cannot differentiate ileostomy from colostomy. NLP can be used to differentiate between ileostomy- and colostomy-related procedures. The role of NLP in conjunction with electronic medical records in data retrieval warrants further investigation. Published by Mosby, Inc.

  14. Natural language processing in pathology: a scoping review.

    PubMed

    Burger, Gerard; Abu-Hanna, Ameen; de Keizer, Nicolette; Cornet, Ronald

    2016-07-22

    Encoded pathology data are key for medical registries and analyses, but pathology information is often expressed as free text. We reviewed and assessed the use of NLP (natural language processing) for encoding pathology documents. Papers addressing NLP in pathology were retrieved from PubMed, Association for Computing Machinery (ACM) Digital Library and Association for Computational Linguistics (ACL) Anthology. We reviewed and summarised the study objectives; NLP methods used and their validation; software implementations; the performance on the dataset used and any reported use in practice. The main objectives of the 38 included papers were encoding and extraction of clinically relevant information from pathology reports. Common approaches were word/phrase matching, probabilistic machine learning and rule-based systems. Five papers (13%) compared different methods on the same dataset. Four papers did not specify the method(s) used. 18 of the 26 studies that reported F-measure, recall or precision reported values of over 0.9. Proprietary software was the most frequently mentioned category (14 studies); General Architecture for Text Engineering (GATE) was the most applied architecture overall. Practical system use was reported in four papers. Most papers used expert annotation validation. Different methods are used in NLP research in pathology, and good performances, that is, high precision and recall, high retrieval/removal rates, are reported for all of these. Lack of validation and of shared datasets precludes performance comparison. More comparative analysis and validation are needed to provide better insight into the performance and merits of these methods. Published by the BMJ Publishing Group Limited.

  15. Development of a Natural Language Processing Engine to Generate Bladder Cancer Pathology Data for Health Services Research.

    PubMed

    Schroeck, Florian R; Patterson, Olga V; Alba, Patrick R; Pattison, Erik A; Seigne, John D; DuVall, Scott L; Robertson, Douglas J; Sirovich, Brenda; Goodney, Philip P

    2017-12-01

    To take the first step toward assembling population-based cohorts of patients with bladder cancer with longitudinal pathology data, we developed and validated a natural language processing (NLP) engine that abstracts pathology data from full-text pathology reports. Using 600 bladder pathology reports randomly selected from the Department of Veterans Affairs, we developed and validated an NLP engine to abstract data on histology, invasion (presence vs absence and depth), grade, the presence of muscularis propria, and the presence of carcinoma in situ. Our gold standard was based on an independent review of reports by 2 urologists, followed by adjudication. We assessed the NLP performance by calculating the accuracy, the positive predictive value, and the sensitivity. We subsequently applied the NLP engine to pathology reports from 10,725 patients with bladder cancer. When comparing the NLP output to the gold standard, NLP achieved the highest accuracy (0.98) for the presence vs the absence of carcinoma in situ. Accuracy for histology, invasion (presence vs absence), grade, and the presence of muscularis propria ranged from 0.83 to 0.96. The most challenging variable was depth of invasion (accuracy 0.68), with an acceptable positive predictive value for lamina propria (0.82) and for muscularis propria (0.87) invasion. The validated engine was capable of abstracting pathologic characteristics for 99% of the patients with bladder cancer. NLP had high accuracy for 5 of 6 variables and abstracted data for the vast majority of the patients. This now allows for the assembly of population-based cohorts with longitudinal pathology data. Published by Elsevier Inc.

  16. Creation of Lung-Targeted Dexamethasone Immunoliposome and Its Therapeutic Effect on Bleomycin-Induced Lung Injury in Rats

    PubMed Central

    Li, Nan; Hu, Yang; Zhang, Yuan; Xu, Jin-Fu; Li, Xia; Ren, Jie; Su, Bo; Yuan, Wei-Zhong; Teng, Xin-Rong; Zhang, Rong-Xuan; Jiang, Dian-hua; Mulet, Xavier; Li, Hui-Ping

    2013-01-01

    Objective Acute lung injury (ALI) is a major cause of morbidity and mortality, and is routinely treated with the administration of systemic glucocorticoids. The current study investigated the distribution and therapeutic effect of a dexamethasone (DXM)-loaded immunoliposome (NLP) functionalized with pulmonary surfactant protein A (SP-A) antibody (SPA-DXM-NLP) in an animal model. Methods DXM-NLP was prepared using film dispersion combined with extrusion techniques. SP-A antibody was used as the lung targeting agent. Tissue distribution of SPA-DXM-NLP was investigated in liver, spleen, kidney and lung tissue. The efficacy of SPA-DXM-NLP against lung injury was assessed in a rat model of bleomycin-induced acute lung injury. Results The SPA-DXM-NLP complex was successfully synthesized and the particles were stable at 4°C. Pulmonary dexamethasone levels were 40 times higher with SPA-DXM-NLP than conventional dexamethasone injection. Administration of SPA-DXM-NLP significantly attenuated lung injury and inflammation, decreased incidence of infection, and increased survival in animal models. Conclusions The administration of SPA-DXM-NLP to animal models resulted in increased levels of DXM in the lungs, indicating active targeting. The efficacy against ALI of the immunoliposomes was shown to be superior to conventional dexamethasone administration. These results demonstrate the potential of actively targeted glucocorticoid therapy in the treatment of lung disease in clinical practice. PMID:23516459

  17. Drosophila TAP/p32 is a core histone chaperone that cooperates with NAP-1, NLP, and nucleophosmin in sperm chromatin remodeling during fertilization.

    PubMed

    Emelyanov, Alexander V; Rabbani, Joshua; Mehta, Monika; Vershilova, Elena; Keogh, Michael C; Fyodorov, Dmitry V

    2014-09-15

    Nuclear DNA in the male gamete of sexually reproducing animals is organized as sperm chromatin compacted primarily by sperm-specific protamines. Fertilization leads to sperm chromatin remodeling, during which protamines are expelled and replaced by histones. Despite our increased understanding of the factors that mediate nucleosome assembly in the nascent male pronucleus, the machinery for protamine removal remains largely unknown. Here we identify four Drosophila protamine chaperones that mediate the dissociation of protamine-DNA complexes: NAP-1, NLP, and nucleophosmin are previously characterized histone chaperones, and TAP/p32 has no known function in chromatin metabolism. We show that TAP/p32 is required for the removal of Drosophila protamine B in vitro, whereas NAP-1, NLP, and Nph share roles in the removal of protamine A. Embryos from P32-null females show defective formation of the male pronucleus in vivo. TAP/p32, similar to NAP-1, NLP, and Nph, facilitates nucleosome assembly in vitro and is therefore a histone chaperone. Furthermore, mutants of P32, Nlp, and Nph exhibit synthetic-lethal genetic interactions. In summary, we identified factors mediating protamine removal from DNA and reconstituted in a defined system the process of sperm chromatin remodeling that exchanges protamines for histones to form the nucleosome-based chromatin characteristic of somatic cells. © 2014 Emelyanov et al.; Published by Cold Spring Harbor Laboratory Press.

  18. Recurrent Artificial Neural Networks and Finite State Natural Language Processing.

    ERIC Educational Resources Information Center

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  19. The Application of Natural Language Processing to Augmentative and Alternative Communication

    ERIC Educational Resources Information Center

    Higginbotham, D. Jeffery; Lesher, Gregory W.; Moulton, Bryan J.; Roark, Brian

    2012-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next…

  20. Natural Language Processing and Its Implications for the Future of Medication Safety: A Narrative Review of Recent Advances and Challenges.

    PubMed

    Wong, Adrian; Plasek, Joseph M; Montecalvo, Steven P; Zhou, Li

    2018-06-09

    The safety of medication use has been a priority in the United States since the late 1930s. Recently, it has gained prominence due to the increasing amount of data suggesting that a large amount of patient harm is preventable and can be mitigated with effective risk strategies that have not been sufficiently adopted. Adverse events from medications are part of clinical practice, but the ability to identify a patient's risk and to minimize that risk must be a priority. The ability to identify adverse events has been a challenge due to limitations of available data sources, which are often free text. The use of natural language processing (NLP) may help to address these limitations. NLP is the artificial intelligence domain of computer science that uses computers to manipulate unstructured data (i.e., narrative text or speech data) in the context of a specific task. In this narrative review, we illustrate the fundamentals of NLP and discuss NLP's application to medication safety in four data sources: electronic health records, Internet-based data, published literature, and reporting systems. Given the magnitude of available data from these sources, a growing area is the use of computer algorithms to help automatically detect associations between medications and adverse effects. The main benefit of NLP is in the time savings associated with automation of various medication safety tasks such as the medication reconciliation process facilitated by computers, as well as the potential for near-real time identification of adverse events for postmarketing surveillance such as those posted on social media that would otherwise go unanalyzed. NLP is limited by a lack of data sharing between health care organizations due to insufficient interoperability capabilities, inhibiting large-scale adverse event monitoring across populations. We anticipate that future work in this area will focus on the integration of data sources from different domains to improve the ability to identify potential adverse events more quickly and to improve clinical decision support with regard to a patient's estimated risk for specific adverse events at the time of medication prescription or review. This article is protected by copyright. All rights reserved.

  1. An Intelligent Computer Assisted Language Learning System for Arabic Learners

    ERIC Educational Resources Information Center

    Shaalan, Khaled F.

    2005-01-01

    This paper describes the development of an intelligent computer-assisted language learning (ICALL) system for learning Arabic. This system could be used for learning Arabic by students at primary schools or by learners of Arabic as a second or foreign language. It explores the use of Natural Language Processing (NLP) techniques for learning…

  2. Overexpression of Arabidopsis NLP7 improves plant growth under both nitrogen-limiting and -sufficient conditions by enhancing nitrogen and carbon assimilation.

    PubMed

    Yu, Lin-Hui; Wu, Jie; Tang, Hui; Yuan, Yang; Wang, Shi-Mei; Wang, Yu-Ping; Zhu, Qi-Sheng; Li, Shi-Gui; Xiang, Cheng-Bin

    2016-06-13

    Nitrogen is essential for plant survival and growth. Excessive application of nitrogenous fertilizer has generated serious environmental pollution and increased production costs in agriculture. To deal with this problem, tremendous efforts have been invested worldwide to increase the nitrogen use ability of crops. However, only limited success has been achieved to date. Here we report that NLP7 (NIN-LIKE PROTEIN 7) is a potential candidate to improve plant nitrogen use ability. When overexpressed in Arabidopsis, NLP7 increases plant biomass under both nitrogen-poor and -rich conditions, with a better-developed root system and a reduced shoot/root ratio. NLP7-overexpressing plants show a significant increase in key nitrogen metabolites, nitrogen uptake, total nitrogen content, and expression levels of genes involved in nitrogen assimilation and signalling. More importantly, overexpression of NLP7 also enhances photosynthesis rate and carbon assimilation, whereas knockout of NLP7 impaired both nitrogen and carbon assimilation. In addition, NLP7 improves plant growth and nitrogen use in transgenic tobacco (Nicotiana tabacum). Our results demonstrate that NLP7 significantly improves plant growth under both nitrogen-poor and -rich conditions by coordinately enhancing nitrogen and carbon assimilation, and they shed light on crop improvement.

  3. Community challenges in biomedical text mining over 10 years: success, failure and the future.

    PubMed

    Huang, Chung-Chi; Lu, Zhiyong

    2016-01-01

    One effective way to improve the state of the art is through competitions. Following the success of the Critical Assessment of protein Structure Prediction (CASP) in bioinformatics research, a number of challenge evaluations have been organized by the text-mining research community to assess and advance natural language processing (NLP) research for biomedicine. In this article, we review the different community challenge evaluations held from 2002 to 2014 and their respective tasks. Furthermore, we examine these challenge tasks through their targeted problems in NLP research and biomedical applications, respectively. Next, we describe the general workflow of organizing a Biomedical NLP (BioNLP) challenge and involved stakeholders (task organizers, task data producers, task participants and end users). Finally, we summarize the impact and contributions by taking into account different BioNLP challenges as a whole, followed by a discussion of their limitations and difficulties. We conclude with future trends in BioNLP challenge evaluations. Published by Oxford University Press 2015. This work is written by US Government employees and is in the public domain in the US.

  4. Aspiring to Unintended Consequences of Natural Language Processing: A Review of Recent Developments in Clinical and Consumer-Generated Text Processing

    PubMed Central

    Demner-Fushman, D.; Elhadad, N.

    2016-01-01

    Summary Objectives This paper reviews work over the past two years in Natural Language Processing (NLP) applied to clinical and consumer-generated texts. Methods We included any application or methodological publication that leverages text to facilitate healthcare and address the health-related needs of consumers and populations. Results Many important developments in clinical text processing, both foundational and task-oriented, were addressed in community-wide evaluations and discussed in corresponding special issues that are referenced in this review. These focused issues and in-depth reviews of several other active research areas, such as pharmacovigilance and summarization, allowed us to discuss in greater depth disease modeling and predictive analytics using clinical texts, and text analysis in social media for healthcare quality assessment, trends towards online interventions based on rapid analysis of health-related posts, and consumer health question answering, among other issues. Conclusions Our analysis shows that although clinical NLP continues to advance towards practical applications and more NLP methods are used in large-scale live health information applications, more needs to be done to make NLP use in clinical applications a routine widespread reality. Progress in clinical NLP is mirrored by developments in social media text analysis: the research is moving from capturing trends to addressing individual health-related posts, thus showing potential to become a tool for precision medicine and a valuable addition to the standard healthcare quality evaluation tools. PMID:27830255

  5. Natural language processing and advanced information management

    NASA Technical Reports Server (NTRS)

    Hoard, James E.

    1989-01-01

    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  6. Comparison of Three Information Sources for Smoking Information in Electronic Health Records

    PubMed Central

    Wang, Liwei; Ruan, Xiaoyang; Yang, Ping; Liu, Hongfang

    2016-01-01

    OBJECTIVE The primary aim was to compare independent and joint performance of retrieving smoking status through different sources, including narrative text processed by natural language processing (NLP), patient-provided information (PPI), and diagnosis codes (ie, International Classification of Diseases, Ninth Revision [ICD-9]). We also compared the performance of retrieving smoking strength information (ie, heavy/light smoker) from narrative text and PPI. MATERIALS AND METHODS Our study leveraged an existing lung cancer cohort for smoking status, amount, and strength information, which was manually chart-reviewed. On the NLP side, smoking-related electronic medical record (EMR) data were retrieved first. A pattern-based smoking information extraction module was then implemented to extract smoking-related information. After that, heuristic rules were used to obtain smoking status-related information. Smoking information was also obtained from structured data sources based on diagnosis codes and PPI. Sensitivity, specificity, and accuracy were measured using patients with coverage (ie, the proportion of patients whose smoking status/strength can be effectively determined). RESULTS NLP alone has the best overall performance for smoking status extraction (patient coverage: 0.88; sensitivity: 0.97; specificity: 0.70; accuracy: 0.88); combining PPI with NLP further improved patient coverage to 0.96. ICD-9 does not provide additional improvement to NLP and its combination with PPI. For smoking strength, combining NLP with PPI has slight improvement over NLP alone. CONCLUSION These findings suggest that narrative text could serve as a more reliable and comprehensive source for obtaining smoking-related information than structured data sources. PPI, the readily available structured data, could be used as a complementary source for more comprehensive patient coverage. PMID:27980387
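
    The abstract above describes pattern-based extraction followed by heuristic rules, and reports that adding patient-provided information (PPI) to NLP raises patient coverage. A minimal sketch of that idea follows. It is not the authors' module: the patterns, status labels, function names and sample notes are hypothetical, and the fallback order (NLP first, then PPI) is an assumption made for illustration.

```python
import re

# Hypothetical patterns, checked in order; the first match decides the status.
STATUS_RULES = [
    ("never", re.compile(r"\b(never smoke[dr]|non-?smoker|denies smoking)\b", re.I)),
    ("former", re.compile(r"\b(former smoker|ex-smoker|quit smoking|stopped smoking)\b", re.I)),
    ("current", re.compile(r"\b(current(ly)? smok(er|es|ing)|smokes \d+)\b", re.I)),
]


def status_from_text(note):
    """Return a smoking status found in narrative text, or None if no rule fires."""
    for status, pattern in STATUS_RULES:
        if pattern.search(note):
            return status
    return None


def combined_status(note, ppi):
    """Prefer the text-derived status; fall back to PPI when the text is silent."""
    return status_from_text(note) or ppi


def coverage(statuses):
    """Proportion of patients whose status could be determined."""
    return sum(s is not None for s in statuses) / len(statuses)


# Made-up (note, PPI) pairs for demonstration only.
patients = [
    ("Patient is a former smoker, quit smoking in 2005.", None),
    ("Currently smokes 1 pack per day.", "current"),
    ("Follow-up for lung nodule.", "never"),
    ("Follow-up visit, no social history recorded.", None),
]
nlp_only = [status_from_text(note) for note, _ in patients]
nlp_plus_ppi = [combined_status(note, ppi) for note, ppi in patients]
print(coverage(nlp_only), coverage(nlp_plus_ppi))  # prints 0.5 0.75
```

    In this toy data, PPI supplies a status for one patient whose notes say nothing about smoking, which is the sense in which a structured source can complement narrative text.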

  7. The application of natural language processing to augmentative and alternative communication.

    PubMed

    Higginbotham, D Jeffery; Lesher, Gregory W; Moulton, Bryan J; Roark, Brian

    2011-01-01

    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of the next generation of AAC technology.

  8. Gene expression and pharmacology of nematode NLP-12 neuropeptides.

    PubMed

    McVeigh, Paul; Leech, Suzie; Marks, Nikki J; Geary, Timothy G; Maule, Aaron G

    2006-05-31

    This study examines the biology of NLP-12 neuropeptides in Caenorhabditis elegans, and in the parasitic nematodes Ascaris suum and Trichostrongylus colubriformis. DYRPLQFamide (1 nM-10 µM; n ≥ 6) produced contraction of innervated dorsal and ventral Ascaris body wall muscle preparations (10 µM, 6.8 ± 1.9 g; 1 µM, 4.6 ± 1.8 g; 0.1 µM, 4.1 ± 2.0 g; 10 nM, 3.8 ± 2.0 g; n ≥ 6), and also caused a qualitatively similar, but quantitatively lower contractile response (10 µM, 4.0 ± 1.5 g, n=6) on denervated muscle strips. Ovijector muscle displayed no measurable response (10 µM, n=5). nlp-12 cDNAs were characterised from A. suum (As-nlp-12) and T. colubriformis (Tc-nlp-12), both of which show sequence similarity to C. elegans nlp-12, in that they encode multiple copies of -LQFamide peptides. In C. elegans, reverse transcriptase (RT)-PCR analysis showed that nlp-12 was transcribed throughout the life cycle, suggesting that DYRPLQFamide plays a constitutive role in the nervous system of this nematode. Transcription was also identified in both L3 and adult stages of T. colubriformis, in which Tc-nlp-12 is expressed in a single tail neurone. Conversely, As-nlp-12 is expressed in both head and tail tissue of adult female A. suum, suggesting species-specific differences in the transcription pattern of this gene.

  9. Data-Driven Approaches for Paraphrasing across Language Variations

    ERIC Educational Resources Information Center

    Xu, Wei

    2014-01-01

    Our language changes very rapidly, accompanying political, social and cultural trends, as well as the evolution of science and technology. The Internet, especially the social media, has accelerated this process of change. This poses a severe challenge for both human beings and natural language processing (NLP) systems, which usually only model a…

  10. A hybrid nonlinear programming method for design optimization

    NASA Technical Reports Server (NTRS)

    Rajan, S. D.

    1986-01-01

    Solutions to engineering design problems formulated as nonlinear programming (NLP) problems usually require the use of more than one optimization technique. Moreover, the interaction between the user (analysis/synthesis) program and the NLP system can lead to interface, scaling, or convergence problems. An NLP solution system is presented that seeks to solve these problems by providing a programming system to ease the user-system interface. A simple set of rules is used to select an optimization technique or to switch from one technique to another in an attempt to detect, diagnose, and solve some potential problems. Numerical examples involving finite element based optimal design of space trusses and rotor bearing systems are used to illustrate the applicability of the proposed methodology.

  11. Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports.

    PubMed

    Chen, Po-Hao; Zafar, Hanna; Galperin-Aizenberg, Maya; Cook, Tessa

    2018-04-01

    A significant volume of medical data remains unstructured. Natural language processing (NLP) and machine learning (ML) techniques have been shown to successfully extract insights from radiology reports. However, the codependent effects of NLP and ML in this context have not been well-studied. Between April 1, 2015 and November 1, 2016, 9418 cross-sectional abdomen/pelvis CT and MR examinations containing our internal structured reporting element for cancer were separated into four categories: Progression, Stable Disease, Improvement, or No Cancer. We combined each of three NLP techniques with five ML algorithms to predict the assigned label using the unstructured report text and compared the performance of each combination. The three NLP algorithms included term frequency-inverse document frequency (TF-IDF), term frequency weighting (TF), and 16-bit feature hashing. The ML algorithms included logistic regression (LR), random decision forest (RDF), one-vs-all support vector machine (SVM), one-vs-all Bayes point machine (BPM), and fully connected neural network (NN). The best-performing NLP model consisted of tokenized unigrams and bigrams with TF-IDF. Increasing N-gram length yielded little to no added benefit for most ML algorithms. With all parameters optimized, SVM had the best performance on the test dataset, with 90.6 average accuracy and F score of 0.813. The interplay between ML and NLP algorithms and their effect on interpretation accuracy is complex. The best accuracy is achieved when both algorithms are optimized concurrently.
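
    The unigram-plus-bigram TF-IDF representation mentioned above can be sketched in a few lines. This is a minimal illustration only, not the study's pipeline: the tokenizer, the unsmoothed log(N/df) weighting, the function names, and the example report snippets are all assumptions made here.

```python
import math
from collections import Counter

def ngrams(text, max_n=2):
    """Lowercase whitespace tokens plus all n-grams up to max_n words."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

def tfidf(docs, max_n=2):
    """Return one {term: weight} dict per document.

    Weight = raw term frequency * log(N / document frequency).
    Smoothing and length normalization, common in real toolkits,
    are deliberately left out of this sketch.
    """
    bags = [Counter(ngrams(d, max_n)) for d in docs]
    df = Counter(term for bag in bags for term in bag)
    n_docs = len(docs)
    return [{t: tf * math.log(n_docs / df[t]) for t, tf in bag.items()}
            for bag in bags]

# Hypothetical report snippets, invented for illustration.
reports = [
    "no evidence of disease progression",
    "stable disease no new lesions",
    "interval progression of hepatic lesions",
]
weights = tfidf(reports)
```

    Terms that occur in fewer reports receive larger weights, so a bigram such as "no evidence" outweighs the more widespread unigram "no". The resulting sparse vectors would then be passed to a classifier of choice.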

  12. Novel Use of Natural Language Processing (NLP) to Predict Suicidal Ideation and Psychiatric Symptoms in a Text-Based Mental Health Intervention in Madrid.

    PubMed

    Cook, Benjamin L; Progovac, Ana M; Chen, Pei; Mullin, Brian; Hou, Sherry; Baca-Garcia, Enrique

    2016-01-01

    Natural language processing (NLP) and machine learning were used to predict suicidal ideation and heightened psychiatric symptoms among adults recently discharged from psychiatric inpatient or emergency room settings in Madrid, Spain. Participants responded to structured mental and physical health instruments at multiple follow-up points. Outcome variables of interest were suicidal ideation and psychiatric symptoms (GHQ-12). Predictor variables included structured items (e.g., relating to sleep and well-being) and responses to one unstructured question, "how do you feel today?" We compared NLP-based models using the unstructured question with logistic regression prediction models using structured data. The PPV, sensitivity, and specificity for NLP-based models of suicidal ideation were 0.61, 0.56, and 0.57, respectively, compared to 0.73, 0.76, and 0.62 of structured data-based models. The PPV, sensitivity, and specificity for NLP-based models of heightened psychiatric symptoms (GHQ-12 ≥ 4) were 0.56, 0.59, and 0.60, respectively, compared to 0.79, 0.79, and 0.85 in structured models. NLP-based models were able to generate relatively high predictive values based solely on responses to a simple general mood question. These models have promise for rapidly identifying persons at risk of suicide or psychological distress and could provide a low-cost screening alternative in settings where lengthy structured item surveys are not feasible.

  13. Identifying Falls Risk Screenings Not Documented with Administrative Codes Using Natural Language Processing

    PubMed Central

    Zhu, Vivienne J; Walker, Tina D; Warren, Robert W; Jenny, Peggy B; Meystre, Stephane; Lenert, Leslie A

    2017-01-01

    Quality reporting that relies on coded administrative data alone may not completely and accurately depict providers’ performance. To assess this concern with a test case, we developed and evaluated a natural language processing (NLP) approach to identify falls risk screenings documented in clinical notes of patients without coded falls risk screening data. Extracting information from 1,558 clinical notes (mainly progress notes) from 144 eligible patients, we generated a lexicon of 38 keywords relevant to falls risk screening, 26 terms for pre-negation, and 35 terms for post-negation. The NLP algorithm identified 62 (out of the 144) patients whose falls risk screening was documented only in clinical notes and not coded. Manual review confirmed 59 patients as true positives and 77 patients as true negatives. Our NLP approach scored 0.92 for precision, 0.95 for recall, and 0.93 for F-measure. These results support the concept of utilizing NLP to enhance healthcare quality reporting. PMID:29854264
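
    A lexicon-based matcher with pre- and post-negation terms, as described above, might look roughly like the following. The abstract does not list the actual keywords or negation terms, so every lexicon entry, the four-word window, and the function name below are hypothetical placeholders.

```python
import re

# Hypothetical lexicon entries; the study's 38 keywords and its
# 26 pre-negation and 35 post-negation terms are not given in the abstract.
KEYWORDS = ["falls risk", "fall risk assessment", "morse fall scale"]
PRE_NEGATION = ["no", "denies", "not assessed for"]
POST_NEGATION = ["not done", "deferred", "declined"]

WINDOW = 4  # words inspected on each side of a keyword (assumed value)

def _contains(terms, text):
    """True if any term occurs in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(t) + r"\b", text) for t in terms)

def screening_documented(note):
    """True if any keyword occurs without a nearby negation term."""
    # Normalize: lowercase, drop punctuation, collapse whitespace.
    joined = " ".join(re.sub(r"[^a-z\s]", " ", note.lower()).split())
    for kw in KEYWORDS:
        for m in re.finditer(r"\b" + re.escape(kw) + r"\b", joined):
            before = " ".join(joined[:m.start()].split()[-WINDOW:])
            after = " ".join(joined[m.end():].split()[:WINDOW])
            if not (_contains(PRE_NEGATION, before)
                    or _contains(POST_NEGATION, after)):
                return True
    return False
```

    For example, `screening_documented("Falls risk screening completed today.")` returns `True`, while `"No falls risk screening this visit."` and `"Morse Fall Scale deferred per patient request"` both return `False`. A fixed word window ignores sentence boundaries, which is one reason a production system would need a more careful scope rule.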

  14. Instructor-Aided Asynchronous Question Answering System for Online Education and Distance Learning

    ERIC Educational Resources Information Center

    Wen, Dunwei; Cuzzola, John; Brown, Lorna; Kinshuk

    2012-01-01

    Question answering systems have frequently been explored for educational use. However, their value was somewhat limited due to the quality of the answers returned to the student. Recent question answering (QA) research has started to incorporate deep natural language processing (NLP) in order to improve these answers. However, current NLP…

  15. Performance of a Machine Learning Classifier of Knee MRI Reports in Two Large Academic Radiology Practices: A Tool to Estimate Diagnostic Yield.

    PubMed

    Hassanpour, Saeed; Langlotz, Curtis P; Amrhein, Timothy J; Befera, Nicholas T; Lungren, Matthew P

    2017-04-01

    The purpose of this study is to evaluate the performance of a natural language processing (NLP) system in classifying a database of free-text knee MRI reports at two separate academic radiology practices. An NLP system that uses terms and patterns in manually classified narrative knee MRI reports was constructed. The NLP system was trained and tested on expert-classified knee MRI reports from two major health care organizations. Radiology reports were modeled in the training set as vectors, and a support vector machine framework was used to train the classifier. A separate test set from each organization was used to evaluate the performance of the system. We evaluated the performance of the system both within and across organizations. Standard evaluation metrics, such as accuracy, precision, recall, and F1 score (i.e., the weighted average of the precision and recall), and their respective 95% CIs were used to measure the efficacy of our classification system. The accuracy for radiology reports that belonged to the model's clinically significant concept classes after training data from the same institution was good, yielding an F1 score greater than 90% (95% CI, 84.6-97.3%). Performance of the classifier on cross-institutional application without institution-specific training data yielded F1 scores of 77.6% (95% CI, 69.5-85.7%) and 90.2% (95% CI, 84.5-95.9%) at the two organizations studied. The results show excellent accuracy by the NLP machine learning classifier in classifying free-text knee MRI reports, supporting the institution-independent reproducibility of knee MRI report classification. Furthermore, the machine learning classifier performed well on free-text knee MRI reports from another institution. These data support the feasibility of multiinstitutional classification of radiologic imaging text reports with a single machine learning classifier without requiring institution-specific training data.

  16. Neuro Linguistic Programming for Counselors.

    ERIC Educational Resources Information Center

    Harman, Robert L.; O'Neill, Charles

    1981-01-01

    Describes contributions of Neuro Linguistic Programming (NLP) to counseling practice. The Meta-Model, representational systems, anchoring, and reframing are described. Counselors interested in learning NLP can integrate many valuable new ways of communicating with clients and changing client behaviors. (Author)

  17. Natural language processing and inference rules as strategies for updating problem list in an electronic health record.

    PubMed

    Plazzotta, Fernando; Otero, Carlos; Luna, Daniel; de Quiros, Fernan Gonzalez Bernaldo

    2013-01-01

    Physicians do not always keep the problem list accurate, complete and updated. To analyze natural language processing (NLP) techniques and inference rules as strategies to maintain completeness and accuracy of the problem list in EHRs. Non-systematic literature review in PubMed, in the last 10 years. Strategies to maintain the EHRs problem list were analyzed in two ways: inputting and removing problems from the problem list. NLP and inference rules have acceptable performance for inputting problems into the problem list. No studies using these techniques for removing problems were published. Conclusion: Both tools, NLP and inference rules, have had acceptable results as tools for maintaining the completeness and accuracy of the problem list.

  18. Cognition-Based Approaches for High-Precision Text Mining

    ERIC Educational Resources Information Center

    Shannon, George John

    2017-01-01

    This research improves the precision of information extraction from free-form text via the use of cognitive-based approaches to natural language processing (NLP). Cognitive-based approaches are an important, and relatively new, area of research in NLP and search, as well as linguistics. Cognitive approaches enable significant improvements in both…

  19. Common Ground: An Interactive Visual Exploration and Discovery for Complex Health Data

    DTIC Science & Technology

    2015-04-01

    working with Intermountain Healthcare on a new rich dataset extracted directly from medical notes using natural language processing (NLP) algorithms...probabilities based on a state-of-the-art NLP classifiers. At that stage the data did not include geographic information or temporal information but we

  20. Interdisciplinary Research at the Intersection of CALL, NLP, and SLA: Methodological Implications from an Input Enhancement Project

    ERIC Educational Resources Information Center

    Ziegler, Nicole; Meurers, Detmar; Rebuschat, Patrick; Ruiz, Simón; Moreno-Vega, José L.; Chinkina, Maria; Li, Wenjing; Grey, Sarah

    2017-01-01

    Despite the promise of research conducted at the intersection of computer-assisted language learning (CALL), natural language processing, and second language acquisition, few studies have explored the potential benefits of using intelligent CALL systems to deepen our understanding of the process and products of second language (L2) learning. The…

  1. A hybrid model for automatic identification of risk factors for heart disease.

    PubMed

    Yang, Hui; Garibaldi, Jonathan M

    2015-12-01

    Coronary artery disease (CAD) is the leading cause of death in both the UK and worldwide. The detection of related risk factors and tracking their progress over time is of great importance for early prevention and treatment of CAD. This paper describes an information extraction system that was developed to automatically identify risk factors for heart disease in medical records while the authors participated in the 2014 i2b2/UTHealth NLP Challenge. Our approaches rely on several natural language processing (NLP) techniques such as machine learning, rule-based methods, and dictionary-based keyword spotting to cope with complicated clinical contexts inherent in a wide variety of risk factors. Our system achieved encouraging performance on the challenge test data with an overall micro-averaged F-measure of 0.915, which was competitive to the best system (F-measure of 0.927) of this challenge task.

  2. Epidemiology of angina pectoris: role of natural language processing of the medical record

    PubMed Central

    Pakhomov, Serguei; Hemingway, Harry; Weston, Susan A.; Jacobsen, Steven J.; Rodeheffer, Richard; Roger, Véronique L.

    2007-01-01

    Background The diagnosis of angina is challenging as it relies on symptom descriptions. Natural language processing (NLP) of the electronic medical record (EMR) can provide access to such information contained in free text that may not be fully captured by conventional diagnostic coding. Objective To test the hypothesis that NLP of the EMR improves angina pectoris (AP) ascertainment over diagnostic codes. Methods Billing records of in- and out-patients were searched for ICD-9 codes for AP, chronic ischemic heart disease and chest pain. EMR clinical reports were searched electronically for 50 specific non-negated natural language synonyms to these ICD-9 codes. The two methods were compared to a standardized assessment of angina by Rose questionnaire for three diagnostic levels: unspecified chest pain, exertional chest pain, and Rose angina. Results Compared to the Rose questionnaire, the true positive rate of EMR-NLP for unspecified chest pain was 62% (95%CI:55–67) vs. 51% (95%CI:44–58) for diagnostic codes (p<0.001). For exertional chest pain, the EMR-NLP true positive rate was 71% (95%CI:61–80) vs. 62% (95%CI:52–73) for diagnostic codes (p=0.10). Both approaches had 88% (95%CI:65–100) true positive rate for Rose angina. The EMR-NLP method consistently identified more patients with exertional chest pain over 28-month follow-up. Conclusion EMR-NLP method improves the detection of unspecified and exertional chest pain cases compared to diagnostic codes. These findings have implications for epidemiological and clinical studies of angina pectoris. PMID:17383310

  3. Adaptive Reading and Writing Instruction in iSTART and W-Pal

    ERIC Educational Resources Information Center

    Johnson, Amy M.; McCarthy, Kathryn S.; Kopp, Kristopher J.; Perret, Cecile A.; McNamara, Danielle S.

    2017-01-01

    Intelligent tutoring systems for ill-defined domains, such as reading and writing, are critically needed, yet uncommon. Two such systems, the Interactive Strategy Training for Active Reading and Thinking (iSTART) and Writing Pal (W-Pal) use natural language processing (NLP) to assess learners' written (i.e., typed) responses and provide immediate,…

  4. A Morphological Analyzer for Vocalized or Not Vocalized Arabic Language

    NASA Astrophysics Data System (ADS)

    El Amine Abderrahim, Med; Breksi Reguig, Fethi

    This research presents the implementation of a morphological analyzer for the Arabic language (vocalized or not vocalized). The analyzer is based upon our object model for Arabic Natural Language Processing (NLP) and can be exploited by NLP applications such as machine translation, orthographic correction, and information retrieval.

  5. Functional evaluation of out-of-the-box text-mining tools for data-mining tasks

    PubMed Central

    Jung, Kenneth; LePendu, Paea; Iyer, Srinivasan; Bauer-Mehren, Anna; Percha, Bethany; Shah, Nigam H

    2015-01-01

    Objective The trade-off between the speed and simplicity of dictionary-based term recognition and the richer linguistic information provided by more advanced natural language processing (NLP) is an area of active discussion in clinical informatics. In this paper, we quantify this trade-off among text processing systems that make different trade-offs between speed and linguistic understanding. We tested both types of systems in three clinical research tasks: phase IV safety profiling of a drug, learning adverse drug–drug interactions, and learning used-to-treat relationships between drugs and indications. Materials We first benchmarked the accuracy of the NCBO Annotator and REVEAL in a manually annotated, publicly available dataset from the 2008 i2b2 Obesity Challenge. We then applied the NCBO Annotator and REVEAL to 9 million clinical notes from the Stanford Translational Research Integrated Database Environment (STRIDE) and used the resulting data for three research tasks. Results There is no significant difference between using the NCBO Annotator and REVEAL in the results of the three research tasks when using large datasets. In one subtask, REVEAL achieved higher sensitivity with smaller datasets. Conclusions For a variety of tasks, employing simple term recognition methods instead of advanced NLP methods results in little or no impact on accuracy when using large datasets. Simpler dictionary-based methods have the advantage of scaling well to very large datasets. Promoting the use of simple, dictionary-based methods for population level analyses can advance adoption of NLP in practice. PMID:25336595

  6. Semantic biomedical resource discovery: a Natural Language Processing framework.

    PubMed

    Sfakianaki, Pepi; Koumakis, Lefteris; Sfakianakis, Stelios; Iatraki, Galatia; Zacharioudakis, Giorgos; Graf, Norbert; Marias, Kostas; Tsiknakis, Manolis

    2015-09-30

    A plethora of publicly available biomedical resources do currently exist and are constantly increasing at a fast rate. In parallel, specialized repositories are being developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at (http://calchas.ics.forth.gr/). For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has a high precision and low recall, implying that the system returns essentially more relevant results than irrelevant. There are adequate biomedical ontologies already available, sufficiency of existing NLP tools and quality of biomedical annotation systems for the implementation of a biomedical resources discovery framework, based on the semantic annotation of resources and the use of NLP techniques. The results of the present study demonstrate the clinical utility of the application of the proposed framework which aims to bridge the gap between clinical question in natural language and efficient dynamic biomedical resources discovery.

  7. Opiates Modulate Noxious Chemical Nociception through a Complex Monoaminergic/Peptidergic Cascade

    PubMed Central

    Mills, Holly; Ortega, Amanda; Law, Wenjing; Hapiak, Vera; Summers, Philip; Clark, Tobias

    2016-01-01

    The ability to detect noxious stimuli, process the nociceptive signal, and elicit an appropriate behavioral response is essential for survival. In Caenorhabditis elegans, opioid receptor agonists, such as morphine, mimic serotonin, and suppress the overall withdrawal from noxious stimuli through a pathway requiring the opioid-like receptor, NPR-17. This serotonin- or morphine-dependent modulation can be rescued in npr-17-null animals by the expression of npr-17 or a human κ opioid receptor in the two ASI sensory neurons, with ASI opioid signaling selectively inhibiting ASI neuropeptide release. Serotonergic modulation requires peptides encoded by both nlp-3 and nlp-24, and either nlp-3 or nlp-24 overexpression mimics morphine and suppresses withdrawal. Peptides encoded by nlp-3 act differentially, with only NLP-3.3 mimicking morphine, whereas other nlp-3 peptides antagonize NLP-3.3 modulation. Together, these results demonstrate that opiates modulate nociception in Caenorhabditis elegans through a complex monoaminergic/peptidergic cascade, and suggest that this model may be useful for dissecting opiate signaling in mammals. SIGNIFICANCE STATEMENT Opiates are used extensively to treat chronic pain. In Caenorhabditis elegans, opioid receptor agonists suppress the overall withdrawal from noxious chemical stimuli through a pathway requiring an opioid-like receptor and two distinct neuropeptide-encoding genes, with individual peptides from the same gene functioning antagonistically to modulate nociception. Endogenous opioid signaling functions as part of a complex, monoaminergic/peptidergic signaling cascade and appears to selectively inhibit neuropeptide release, mediated by an α-adrenergic-like receptor, from two sensory neurons. Importantly, receptor null animals can be rescued by the expression of the human κ opioid receptor, and injection of human opioid receptor ligands mimics exogenous opiates, highlighting the utility of this model for dissecting opiate signaling in mammals. PMID:27194330

  8. Automated processing of electronic medical records is a reliable method of determining aspirin use in populations at risk for cardiovascular events.

    PubMed

    Pakhomov, Serguei Vs; Shah, Nilay D; Hanson, Penny; Balasubramaniam, Saranya C; Smith, Steven A

    2010-01-01

    Low-dose aspirin reduces cardiovascular risk; however, monitoring over-the-counter medication use relies on the time-consuming and costly manual review of medical records. Our objective is to validate natural language processing (NLP) of the electronic medical record (EMR) for extracting medication exposure and contraindication information. The text of EMRs for 499 patients with type 2 diabetes was searched using NLP for evidence of aspirin use and its contraindications. The results were compared to a standardised manual records review. Of the 499 patients, 351 (70%) were using aspirin and 148 (30%) were not, according to manual review. NLP correctly identified 346 of the 351 aspirin-positive and 134 of the 148 aspirin-negative patients, indicating a sensitivity of 99% (95% CI 97-100) and specificity of 91% (95% CI 88-97). Of the 148 aspirin-negative patients, 66 (45%) had contraindications and 82 (55%) did not, according to manual review. NLP search for contraindications correctly identified 61 of the 66 patients with contraindications and 58 of the 82 patients without, yielding a sensitivity of 92% (95% CI 84-97) and a specificity of 71% (95% CI 60-80). NLP of the EMR is accurate in ascertaining documented aspirin use and could potentially be used for epidemiological research as a source of cardiovascular risk factor information.
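
    The sensitivity and specificity figures above follow directly from the reported counts. As a quick check (the function and variable names are illustrative choices, not taken from the study):

```python
def sensitivity(true_pos, condition_pos):
    """Share of truly positive patients that the method flagged."""
    return true_pos / condition_pos

def specificity(true_neg, condition_neg):
    """Share of truly negative patients that the method cleared."""
    return true_neg / condition_neg

# Counts as reported in the abstract (manual review is the reference).
aspirin = (sensitivity(346, 351), specificity(134, 148))
contra = (sensitivity(61, 66), specificity(58, 82))

for label, (sens, spec) in [("aspirin use", aspirin),
                            ("contraindications", contra)]:
    print(f"{label}: sensitivity {sens:.0%}, specificity {spec:.0%}")
# aspirin use: sensitivity 99%, specificity 91%
# contraindications: sensitivity 92%, specificity 71%
```

    The confidence intervals quoted in the abstract are not reproduced here, since the interval method used is not stated.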

  9. Building an Evaluation Scale using Item Response Theory.

    PubMed

    Lalor, John P; Wu, Hao; Yu, Hong

    2016-11-01

    Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern.
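
    To make "difficulty" and "discriminating power" concrete, the sketch below uses the two-parameter logistic (2PL) item response function, a standard IRT form. The abstract does not say which IRT model the authors fitted, so the 2PL choice, the parameter values, and the names here are assumptions for illustration.

```python
import math

def p_correct(theta, difficulty, discrimination=1.0):
    """2PL item response function.

    theta          -- latent ability of the respondent (human or system)
    difficulty     -- ability level at which P(correct) is 0.5
    discrimination -- slope; higher values separate abilities more sharply
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# Two hypothetical test items answered by a respondent of ability 1.0.
easy_item = p_correct(theta=1.0, difficulty=-1.0)
hard_item = p_correct(theta=1.0, difficulty=2.0)
```

    Under this form, two systems with the same raw accuracy can receive different ability estimates if one of them succeeds mainly on easy items, which is consistent with the abstract's point that a high accuracy score does not always imply a high IRT score.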

  10. Building an Evaluation Scale using Item Response Theory

    PubMed Central

    Lalor, John P.; Wu, Hao; Yu, Hong

    2016-01-01

    Evaluation of NLP methods requires testing against a previously vetted gold-standard test set and reporting standard metrics (accuracy/precision/recall/F1). The current assumption is that all items in a given test set are equal with regards to difficulty and discriminating power. We propose Item Response Theory (IRT) from psychometrics as an alternative means for gold-standard test-set generation and NLP system evaluation. IRT is able to describe characteristics of individual items - their difficulty and discriminating power - and can account for these characteristics in its estimation of human intelligence or ability for an NLP task. In this paper, we demonstrate IRT by generating a gold-standard test set for Recognizing Textual Entailment. By collecting a large number of human responses and fitting our IRT model, we show that our IRT model compares NLP systems with the performance in a human population and is able to provide more insight into system performance than standard evaluation metrics. We show that a high accuracy score does not always imply a high IRT score, which depends on the item characteristics and the response pattern. PMID:28004039

  11. Expression of Caenorhabditis elegans antimicrobial peptide NLP-31 in Escherichia coli

    NASA Astrophysics Data System (ADS)

    Lim, Mei-Perng; Nathan, Sheila

    2014-09-01

    Burkholderia pseudomallei is the causative agent of melioidosis, a fulminant disease endemic in Southeast Asia and Northern Australia. The standardized form of therapy is antibiotics treatment; however, the bacterium has become increasingly resistant to these antibiotics. This has spurred the need to search for alternative therapeutic agents. Antimicrobial peptides (AMPs) are small proteins that possess broad-spectrum antimicrobial activity. In a previous study, the nematode Caenorhabditis elegans was infected by B. pseudomallei and a whole animal transcriptome analysis identified a number of AMP-encoded genes which were induced significantly in the infected worms. One of the AMPs identified is NLP-31 and to date, there are no reports of anti-B. pseudomallei activity demonstrated by NLP-31. To produce NLP-31 protein for future studies, the gene encoding for NLP-31 was cloned into the pET32b expression vector and transformed into Escherichia coli BL21(DE3). Protein expression was induced with 1 mM IPTG for 20 hours at 20°C and recombinant NLP-31 was detected in the soluble fraction. Taken together, a simple optimized heterologous production of AMPs in an E. coli expression system has been successfully developed.

  12. Correlating mammographic and pathologic findings in clinical decision support using natural language processing and data mining methods.

    PubMed

    Patel, Tejal A; Puppala, Mamta; Ogunti, Richard O; Ensor, Joe E; He, Tiancheng; Shewale, Jitesh B; Ankerst, Donna P; Kaklamani, Virginia G; Rodriguez, Angel A; Wong, Stephen T C; Chang, Jenny C

    2017-01-01

    A key challenge to mining electronic health records for mammography research is the preponderance of unstructured narrative text, which strikingly limits usable output. The imaging characteristics of breast cancer subtypes have been described previously, but without standardization of parameters for data mining. The authors searched the enterprise-wide data warehouse at the Houston Methodist Hospital, the Methodist Environment for Translational Enhancement and Outcomes Research (METEOR), for patients with Breast Imaging Reporting and Data System (BI-RADS) category 5 mammogram readings performed between January 2006 and May 2015 and an available pathology report. The authors developed natural language processing (NLP) software algorithms to automatically extract mammographic and pathologic findings from free text mammogram and pathology reports. The correlation between mammographic imaging features and breast cancer subtype was analyzed using one-way analysis of variance and the Fisher exact test. The NLP algorithm was able to obtain key characteristics for 543 patients who met the inclusion criteria. Patients with estrogen receptor-positive tumors were more likely to have spiculated margins (P = .0008), and those with tumors that overexpressed human epidermal growth factor receptor 2 (HER2) were more likely to have heterogeneous and pleomorphic calcifications (P = .0078 and P = .0002, respectively). Mammographic imaging characteristics, obtained from an automated text search and the extraction of mammogram reports using NLP techniques, correlated with pathologic breast cancer subtype. The results of the current study validate previously reported trends assessed by manual data collection. Furthermore, NLP provides an automated means with which to scale up data extraction and analysis for clinical decision support. Cancer 2017;114-121. © 2016 American Cancer Society.

  13. Performance of a Natural Language Processing (NLP) Tool to Extract Pulmonary Function Test (PFT) Reports from Structured and Semistructured Veteran Affairs (VA) Data

    PubMed Central

    Sauer, Brian C.; Jones, Barbara E.; Globe, Gary; Leng, Jianwei; Lu, Chao-Chin; He, Tao; Teng, Chia-Chen; Sullivan, Patrick; Zeng, Qing

    2016-01-01

    Introduction/Objective: Pulmonary function tests (PFTs) are objective estimates of lung function, but are not reliably stored within the Veteran Health Affairs data systems as structured data. The aim of this study was to validate the natural language processing (NLP) tool we developed—which extracts spirometric values and responses to bronchodilator administration—against expert review, and to estimate the number of additional spirometric tests identified beyond the structured data. Methods: All patients at seven Veteran Affairs Medical Centers with a diagnostic code for asthma Jan 1, 2006–Dec 31, 2012 were included. Evidence of spirometry with a bronchodilator challenge (BDC) was extracted from structured data as well as clinical documents. NLP’s performance was compared against a human reference standard using a random sample of 1,001 documents. Results: In the validation set NLP demonstrated a precision of 98.9 percent (95 percent confidence intervals (CI): 93.9 percent, 99.7 percent), recall of 97.8 percent (95 percent CI: 92.2 percent, 99.7 percent), and an F-measure of 98.3 percent for the forced vital capacity pre- and post pairs and precision of 100 percent (95 percent CI: 96.6 percent, 100 percent), recall of 100 percent (95 percent CI: 96.6 percent, 100 percent), and an F-measure of 100 percent for the forced expiratory volume in one second pre- and post pairs for bronchodilator administration. Application of the NLP increased the proportion identified with complete bronchodilator challenge by 25 percent. Discussion/Conclusion: This technology can improve identification of PFTs for epidemiologic research. Caution must be taken in assuming that a single domain of clinical data can completely capture the scope of a disease, treatment, or clinical test. PMID:27376095
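
    The F-measures reported above follow from the stated precision and recall if the balanced F-measure (harmonic mean) is assumed. The sketch below is illustrative only; the function name f_measure is hypothetical and is not part of the authors' tool.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract (forced vital capacity pre/post pairs).
fvc = f_measure(0.989, 0.978)
# Values taken from the abstract (FEV1 pre/post pairs).
fev1 = f_measure(1.0, 1.0)

print(round(fvc * 100, 1))   # 98.3
print(round(fev1 * 100, 1))  # 100.0
```

    Under this assumption the computed values agree with the 98.3 percent and 100 percent figures quoted in the abstract.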

  14. Integer-ambiguity resolution in astronomy and geodesy

    NASA Astrophysics Data System (ADS)

    Lannes, A.; Prieur, J.-L.

    2014-02-01

    Recent theoretical developments in astronomical aperture synthesis have revealed the existence of integer-ambiguity problems. Those problems, which appear in the self-calibration procedures of radio imaging, have been shown to be similar to the nearest-lattice point (NLP) problems encountered in high-precision geodetic positioning and in global navigation satellite systems. In this paper we analyse the theoretical aspects of the matter and propose new methods for solving those NLP problems. The related optimization aspects concern both the preconditioning stage, and the discrete-search stage in which the integer ambiguities are finally fixed. Our algorithms, which are described in an explicit manner, can easily be implemented. They lead to substantial gains in the processing time of both stages. Their efficiency was shown via intensive numerical tests.
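
    As a rough illustration of what a nearest-lattice-point problem looks like, the toy sketch below handles a two-dimensional lattice: it rounds the real-valued coefficients and then searches a small window of integer candidates around them. This is not the authors' algorithm; the function names, the window radius, and the example bases are assumptions made for illustration. A fixed window is only adequate for well-conditioned bases, which is one reason a preconditioning stage matters in practice.

```python
from itertools import product


def solve_2x2(b1, b2, t):
    """Real coefficients (x, y) with x*b1 + y*b2 = t, by Cramer's rule."""
    det = b1[0] * b2[1] - b2[0] * b1[1]
    x = (t[0] * b2[1] - b2[0] * t[1]) / det
    y = (b1[0] * t[1] - t[0] * b1[1]) / det
    return x, y


def nearest_lattice_point(b1, b2, t, radius=2):
    """Return (integer coefficients, lattice point) closest to t.

    Searches integer pairs within `radius` of the rounded real solution.
    The window size is a hypothetical choice, not a guarantee of optimality.
    """
    x, y = solve_2x2(b1, b2, t)
    x0, y0 = round(x), round(y)
    best = None
    for dx, dy in product(range(-radius, radius + 1), repeat=2):
        i, j = x0 + dx, y0 + dy
        p = (i * b1[0] + j * b2[0], i * b1[1] + j * b2[1])
        d2 = (p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
        if best is None or d2 < best[0]:
            best = (d2, (i, j), p)
    return best[1], best[2]


# Skewed integer basis: lattice points are (2*i + j, j).
coeffs, point = nearest_lattice_point((2, 0), (1, 1), (2.9, 1.2))
print(coeffs, point)  # (1, 1) (3, 1)
```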

  15. Common data model for natural language processing based on two existing standard information models: CDA+GrAF.

    PubMed

    Meystre, Stéphane M; Lee, Sanghoon; Jung, Chai Young; Chevrier, Raphaël D

    2012-08-01

    An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. Copyright © 2011 Elsevier Inc. All rights reserved.

  16. A study of the transferability of influenza case detection systems between two large healthcare systems

    PubMed Central

    Wagner, Michael M.; Cooper, Gregory F.; Ferraro, Jeffrey P.; Su, Howard; Gesteland, Per H.; Haug, Peter J.; Millett, Nicholas E.; Aronis, John M.; Nowalk, Andrew J.; Ruiz, Victor M.; López Pineda, Arturo; Shi, Lingyun; Van Bree, Rudy; Ginter, Thomas; Tsui, Fuchiang

    2017-01-01

    Objectives This study evaluates the accuracy and transferability of Bayesian case detection systems (BCD) that use clinical notes from emergency department (ED) to detect influenza cases. Methods A BCD uses natural language processing (NLP) to infer the presence or absence of clinical findings from ED notes, which are fed into a Bayesian network classifier (BN) to infer patients’ diagnoses. We developed BCDs at the University of Pittsburgh Medical Center (BCDUPMC) and Intermountain Healthcare in Utah (BCDIH). At each site, we manually built a rule-based NLP and trained a Bayesian network classifier from over 40,000 ED encounters between Jan. 2008 and May 2010 using feature selection, machine learning, and an expert debiasing approach. Transferability of a BCD in this study may be impacted by seven factors: development (source) institution, development parser, application (target) institution, application parser, NLP transfer, BN transfer, and classification task. We employed an ANOVA analysis to study their impacts on BCD performance. Results Both BCDs discriminated well between influenza and non-influenza on local test cases (AUCs > 0.92). When tested for transferability using the other institution’s cases, BCDUPMC discriminations declined minimally (AUC decreased from 0.95 to 0.94, p<0.01), and BCDIH discriminations declined more (from 0.93 to 0.87, p<0.0001). We attributed the BCDIH decline to the lower recall of the IH parser on UPMC notes. The ANOVA analysis showed five significant factors: development parser, application institution, application parser, BN transfer, and classification task. Conclusion We demonstrated high influenza case detection performance in two large healthcare systems in two geographically separated regions, providing evidentiary support for the use of automated case detection from routinely collected electronic clinical notes in national influenza surveillance. The transferability could be improved by training the Bayesian network classifier locally and increasing the accuracy of the NLP parser. PMID:28380048

  17. A study of the transferability of influenza case detection systems between two large healthcare systems.

    PubMed

    Ye, Ye; Wagner, Michael M; Cooper, Gregory F; Ferraro, Jeffrey P; Su, Howard; Gesteland, Per H; Haug, Peter J; Millett, Nicholas E; Aronis, John M; Nowalk, Andrew J; Ruiz, Victor M; López Pineda, Arturo; Shi, Lingyun; Van Bree, Rudy; Ginter, Thomas; Tsui, Fuchiang

    2017-01-01

    This study evaluates the accuracy and transferability of Bayesian case detection systems (BCD) that use clinical notes from emergency department (ED) to detect influenza cases. A BCD uses natural language processing (NLP) to infer the presence or absence of clinical findings from ED notes, which are fed into a Bayesian network classifier (BN) to infer patients' diagnoses. We developed BCDs at the University of Pittsburgh Medical Center (BCDUPMC) and Intermountain Healthcare in Utah (BCDIH). At each site, we manually built a rule-based NLP and trained a Bayesian network classifier from over 40,000 ED encounters between Jan. 2008 and May 2010 using feature selection, machine learning, and an expert debiasing approach. Transferability of a BCD in this study may be impacted by seven factors: development (source) institution, development parser, application (target) institution, application parser, NLP transfer, BN transfer, and classification task. We employed an ANOVA analysis to study their impacts on BCD performance. Both BCDs discriminated well between influenza and non-influenza on local test cases (AUCs > 0.92). When tested for transferability using the other institution's cases, BCDUPMC discriminations declined minimally (AUC decreased from 0.95 to 0.94, p<0.01), and BCDIH discriminations declined more (from 0.93 to 0.87, p<0.0001). We attributed the BCDIH decline to the lower recall of the IH parser on UPMC notes. The ANOVA analysis showed five significant factors: development parser, application institution, application parser, BN transfer, and classification task. We demonstrated high influenza case detection performance in two large healthcare systems in two geographically separated regions, providing evidentiary support for the use of automated case detection from routinely collected electronic clinical notes in national influenza surveillance. The transferability could be improved by training the Bayesian network classifier locally and increasing the accuracy of the NLP parser.

  18. Biological event composition

    PubMed Central

    2012-01-01

    Background In recent years, biological event extraction has emerged as a key natural language processing task, aiming to address the information overload problem in accessing the molecular biology literature. The BioNLP shared task competitions have contributed to this recent interest considerably. The first competition (BioNLP'09) focused on extracting biological events from Medline abstracts from a narrow domain, while the theme of the latest competition (BioNLP-ST'11) was generalization and a wider range of text types, event types, and subject domains were considered. We view event extraction as a building block in larger discourse interpretation and propose a two-phase, linguistically-grounded, rule-based methodology. In the first phase, a general, underspecified semantic interpretation is composed from syntactic dependency relations in a bottom-up manner. The notion of embedding underpins this phase and it is informed by a trigger dictionary and argument identification rules. Coreference resolution is also performed at this step, allowing extraction of inter-sentential relations. The second phase is concerned with constraining the resulting semantic interpretation by shared task specifications. We evaluated our general methodology on core biological event extraction and speculation/negation tasks in three main tracks of BioNLP-ST'11 (GENIA, EPI, and ID). Results We achieved competitive results in GENIA and ID tracks, while our results in the EPI track leave room for improvement. One notable feature of our system is that its performance across abstracts and article bodies is stable. Coreference resolution results in minor improvement in system performance. Due to our interest in discourse-level elements, such as speculation/negation and coreference, we provide a more detailed analysis of our system performance in these subtasks. Conclusions The results demonstrate the viability of a robust, linguistically-oriented methodology, which clearly distinguishes general semantic interpretation from shared task specific aspects, for biological event extraction. Our error analysis pinpoints some shortcomings, which we plan to address in future work within our incremental system development methodology. PMID:22759461

  19. DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx

    PubMed Central

    Mehrabi, Saeed; Krishnan, Anand; Sohn, Sunghwan; Roch, Alexandra M; Schmidt, Heidi; Kesterson, Joe; Beesley, Chris; Dexter, Paul; Schmidt, C. Max; Liu, Hongfang; Palakal, Mathew

    2018-01-01

    In Electronic Health Records (EHRs), much valuable information regarding patients’ conditions is embedded in free text format. Natural language processing (NLP) techniques have been developed to extract clinical information from free text. One challenge faced in clinical NLP is that the meaning of clinical entities is heavily affected by modifiers such as negation. A negation detection algorithm, NegEx, applies a simplistic approach that has been shown to be powerful in clinical NLP. However, due to the failure to consider the contextual relationship between words within a sentence, NegEx fails to correctly capture the negation status of concepts in complex sentences. Incorrect negation assignment could cause inaccurate diagnosis of patients’ condition or contaminated study cohorts. We developed a negation algorithm called DEEPEN to decrease NegEx’s false positives by taking into account the dependency relationship between negation words and concepts within a sentence using the Stanford dependency parser. The system was developed and tested using EHR data from Indiana University (IU) and it was further evaluated on a Mayo Clinic dataset to assess its generalizability. The evaluation results demonstrate that DEEPEN, which incorporates dependency parsing into NegEx, can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs. PMID:25791500
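
    The kind of false positive described above can be illustrated with a deliberately simplified, window-based negation rule. This sketch is neither the published NegEx trigger list nor the DEEPEN algorithm; the trigger tuple, the window size, and the example sentences are hypothetical.

```python
NEGATION_TRIGGERS = ("no", "denies", "without")  # hypothetical, tiny trigger list
WINDOW = 5  # hypothetical number of tokens inspected before the concept


def is_negated(sentence: str, concept: str) -> bool:
    """Flag the concept as negated if a trigger appears shortly before it."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    concept_tokens = concept.lower().split()
    n = len(concept_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == concept_tokens:
            preceding = tokens[max(0, i - WINDOW):i]
            if any(tok in NEGATION_TRIGGERS for tok in preceding):
                return True
    return False


# Simple sentence: the rule behaves as intended.
print(is_negated("Patient denies chest pain.", "chest pain"))  # True

# Complex sentence: "no" modifies "change", yet the cyst is flagged as
# negated. A dependency-aware check is meant to catch cases like this.
print(is_negated("There is no change in the pancreatic cyst.", "pancreatic cyst"))  # True
```

    The second call shows why word proximity alone is not enough: the trigger and the concept are close together but not syntactically related.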

  20. A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time.

    PubMed

    Wu, Y; Denny, J C; Rosenbloom, S T; Miller, R A; Giuse, D A; Song, M; Xu, H

    2015-01-01

    To save time, healthcare providers frequently use abbreviations while authoring clinical documents. Nevertheless, abbreviations that authors deem unambiguous often confuse other readers, including clinicians, patients, and natural language processing (NLP) systems. Most current clinical NLP systems "post-process" notes long after clinicians enter them into electronic health record systems (EHRs). Such post-processing cannot guarantee 100% accuracy in abbreviation identification and disambiguation, since multiple alternative interpretations exist. Authors describe a prototype system for real-time Clinical Abbreviation Recognition and Disambiguation (rCARD) - i.e., a system that interacts with authors during note generation to verify correct abbreviation senses. The rCARD system design anticipates future integration with web-based clinical documentation systems to improve quality of healthcare records. When clinicians enter documents, rCARD will automatically recognize each abbreviation. For abbreviations with multiple possible senses, rCARD will show a ranked list of possible meanings with the best predicted sense at the top. The prototype application embodies three word sense disambiguation (WSD) methods to predict the correct senses of abbreviations. We then conducted three experiments to evaluate rCARD, including 1) a performance evaluation of different WSD methods; 2) a time evaluation of real-time WSD methods; and 3) a user study of typing clinical sentences with abbreviations using rCARD. Using 4,721 sentences containing 25 commonly observed, highly ambiguous clinical abbreviations, our evaluation showed that the best profile-based method implemented in rCARD achieved a reasonable WSD accuracy of 88.8% (comparable to SVM - 89.5%) and the time costs of the different WSD methods were also acceptable (ranging from 0.630 to 1.649 milliseconds within the same network). The preliminary user study also showed that the extra time cost of rCARD was about 5% of total document entry time and users did not feel a significant delay when using rCARD for clinical document entry. The study indicates that it is feasible to integrate a real-time, NLP-enabled abbreviation recognition and disambiguation module with clinical documentation systems.
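
    A ranked list of candidate senses, as described above, could be produced by comparing the words around an abbreviation with a word profile for each sense. The sketch below uses cosine similarity over word counts as one plausible reading of a "profile-based" method; it is not the rCARD implementation, and the sense profiles, counts, and function names are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical sense profiles: word counts assumed to come from example contexts.
SENSE_PROFILES = {
    "ra": {
        "rheumatoid arthritis": Counter({"joint": 3, "pain": 2, "methotrexate": 2}),
        "right atrium": Counter({"echo": 3, "ventricle": 2, "dilated": 2}),
    }
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_senses(abbreviation: str, sentence: str) -> list:
    """Return the candidate senses, best match first."""
    context = Counter(w for w in sentence.lower().split() if w != abbreviation)
    scored = [
        (cosine(context, profile), sense)
        for sense, profile in SENSE_PROFILES[abbreviation].items()
    ]
    return [sense for score, sense in sorted(scored, key=lambda pair: -pair[0])]


print(rank_senses("ra", "echo shows dilated ra and ventricle"))
# ['right atrium', 'rheumatoid arthritis']
```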

  1. Finding 'Evidence of Absence' in Medical Notes: Using NLP for Clinical Inferencing.

    PubMed

    Carter, Marjorie E; Divita, Guy; Redd, Andrew; Rubin, Michael A; Samore, Matthew H; Gupta, Kalpana; Trautner, Barbara W; Gundlapalli, Adi V

    2016-01-01

    Extracting evidence of the absence of a target of interest from medical text can be useful in clinical inferencing. The purpose of our study was to develop a natural language processing (NLP) pipeline to identify the presence of indwelling urinary catheters from electronic medical notes to aid in detection of catheter-associated urinary tract infections (CAUTI). Finding clear evidence that a patient does not have an indwelling urinary catheter is useful in making a determination regarding CAUTI. We developed a lexicon of seven core concepts to infer the absence of a urinary catheter. Of the 990,391 concepts extracted by NLP from a large corpus of 744,285 electronic medical notes from 5589 hospitalized patients, 63,516 were labeled as evidence of absence. Human review revealed three primary causes for false negatives. The lexicon and NLP pipeline were refined using this information, resulting in outputs with an acceptable false positive rate of 11%.

  2. Mining protein phosphorylation information from biomedical literature using NLP parsing and Support Vector Machines.

    PubMed

    Raja, Kalpana; Natarajan, Jeyakumar

    2018-07-01

    Extraction of protein phosphorylation information from biomedical literature has gained much attention because of the importance in numerous biological processes. In this study, we propose a text mining methodology which consists of two phases, NLP parsing and SVM classification to extract phosphorylation information from literature. First, using NLP parsing we divide the data into three base-forms depending on the biomedical entities related to phosphorylation and further classify into ten sub-forms based on their distribution with phosphorylation keyword. Next, we extract the phosphorylation entity singles/pairs/triplets and apply SVM to classify the extracted singles/pairs/triplets using a set of features applicable to each sub-form. The performance of our methodology was evaluated on three corpora namely PLC, iProLink and hPP corpus. We obtained promising results of >85% F-score on ten sub-forms of training datasets on cross validation test. Our system achieved overall F-score of 93.0% on iProLink and 96.3% on hPP corpus test datasets. Furthermore, our proposed system achieved best performance on cross corpus evaluation and outperformed the existing system with recall of 90.1%. The performance analysis of our unique system on three corpora reveals that it extracts protein phosphorylation information efficiently in both non-organism specific general datasets such as PLC and iProLink, and human specific dataset such as hPP corpus. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. GATECloud.net: a platform for large-scale, open-source text processing on the cloud.

    PubMed

    Tablan, Valentin; Roberts, Ian; Cunningham, Hamish; Bontcheva, Kalina

    2013-01-28

    Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research: GATECloud.net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.

  4. Increased expression of Nlp, a potential oncogene in ovarian cancer, and its implication in carcinogenesis.

    PubMed

    Qu, Danni; Qu, Hongyan; Fu, Ming; Zhao, Xuelian; Liu, Rong; Sui, Lihua; Zhan, Qimin

    2008-08-01

    Nlp (Ninein-like protein), a novel centrosome protein involved in microtubule nucleation, has been studied extensively in our laboratory, and its overexpression has been found in some human tumors. To understand the role of Nlp in human ovarian cancer development, we studied the correlation of Nlp expression with clinicopathological parameters and survival in epithelial ovarian cancer, and the impact of Nlp overexpression on ovarian cancer cells. Nlp expression in normal, borderline, benign and malignant epithelial ovarian tissues was examined by immunohistochemistry. The correlation between Nlp expression and tumor grade, FIGO stage and histological type was also evaluated. Survival was calculated using Kaplan-Meier estimates. Cell proliferation and apoptosis were assayed after stable transfection of pEGFP-C3-Nlp or empty vector in human ovarian cancer cell line SKOV3. Nlp was positive in 1 of 10 (10%) normal ovarian tissues, 5 of 34 (14.7%) benign tumors, 9 of 26 (34.6%) borderline tumors and 73 of 131 (56.0%) ovarian tumors. Nlp immunoreactivity intensity significantly correlated with tumor grade, but not with FIGO stage or histological type. Kaplan-Meier curves showed that Nlp overexpression was marginally associated with decreased overall survival. Overexpression of Nlp enhanced proliferation and inhibited apoptosis induced by paclitaxel in the SKOV3 cell line. Overexpression of Nlp in ovarian tumors raises the possibility that Nlp may play a role in ovarian carcinogenesis.

  5. Overexpression of centrosomal protein Nlp confers breast carcinoma resistance to paclitaxel.

    PubMed

    Zhao, Weihong; Song, Yongmei; Xu, Binghe; Zhan, Qimin

    2012-02-01

    Nlp (ninein-like protein), an important molecule involved in centrosome maturation and spindle formation, plays an important role in tumorigenesis and its abnormal expression was recently observed in human breast and lung cancers. In this study, the correlation between overexpression of Nlp and paclitaxel chemosensitivity was investigated to explore the mechanisms of resistance to paclitaxel and to understand the effect of Nlp upon apoptosis induced by chemotherapeutic agents. Nlp expression vector was stably transfected into breast cancer MCF-7 cells. With Nlp overexpression, the survival rates, cell cycle distributions and apoptosis were analyzed in transfected MCF-7 cells by MTT test and FCM approach. The immunofluorescent assay was employed to detect the changes of microtubule after paclitaxel treatment. Immunoblotting analysis was used to examine expression of centrosomal proteins and apoptosis associated proteins. Subsequently, Nlp expression was retrospectively examined with 55 breast cancer samples derived from paclitaxel treated patients. Interestingly, the survival rates of MCF-7 cells overexpressing Nlp were higher than those of control cells after paclitaxel treatment. Nlp overexpression promoted G2-M arrest and attenuated apoptosis induced by paclitaxel, which was coupled with elevated Bcl-2 protein. Nlp expression significantly lessened the microtubule polymerization and bundling elicited by paclitaxel, attributable to alteration of the structure or dynamics of β-tubulin but not of its expression. The breast cancer patients with high expression of Nlp were likely resistant to the treatment of paclitaxel, as the response rate in Nlp negative patients was 62.5%, whereas it was 58.3% and 15.8% in Nlp (+) and Nlp (++) patients respectively (p = 0.015). Nlp expression was positively correlated with that of Plk1 and PCNA. These findings provide insights into more rational chemotherapeutic regimens in clinical practice, and more effective approaches might be developed through targeting Nlp to increase chemotherapeutic sensitivity.

  6. SUBTLE: Situation Understanding Bot through Language and Environment

    DTIC Science & Technology

    2016-01-06

    a 4 day “hackathon” by Stuart Young’s small robots group which successfully ported the SUBTLE MURI NLP robot interface to the Packbot platform they...null element restoration, a step typically ignored in NLP systems, allows for correct parsing of imperatives and questions, critical structures

  7. Comparison of Natural Language Processing Rules-based and Machine-learning Systems to Identify Lumbar Spine Imaging Findings Related to Low Back Pain.

    PubMed

    Tan, W Katherine; Hassanpour, Saeed; Heagerty, Patrick J; Rundell, Sean D; Suri, Pradeep; Huhdanpaa, Hannu T; James, Kathryn; Carrell, David S; Langlotz, Curtis P; Organ, Nancy L; Meier, Eric N; Sherman, Karen J; Kallmes, David F; Luetmer, Patrick H; Griffith, Brent; Nerenz, David R; Jarvik, Jeffrey G

    2018-03-28

    To evaluate a natural language processing (NLP) system built with open-source tools for identification of lumbar spine imaging findings related to low back pain on magnetic resonance and x-ray radiology reports from four health systems. We used a limited data set (de-identified except for dates) sampled from lumbar spine imaging reports of a prospectively assembled cohort of adults. From N = 178,333 reports, we randomly selected N = 871 to form a reference-standard dataset, consisting of N = 413 x-ray reports and N = 458 MR reports. Using standardized criteria, four spine experts annotated the presence of 26 findings, where 71 reports were annotated by all four experts and 800 were each annotated by two experts. We calculated inter-rater agreement and finding prevalence from annotated data. We randomly split the annotated data into development (80%) and testing (20%) sets. We developed an NLP system from both rule-based and machine-learned models. We validated the system using accuracy metrics such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). The multirater annotated dataset achieved inter-rater agreement of Cohen's kappa > 0.60 (substantial agreement) for 25 of 26 findings, with finding prevalence ranging from 3% to 89%. In the testing sample, rule-based and machine-learned predictions both had comparable average specificity (0.97 and 0.95, respectively). The machine-learned approach had a higher average sensitivity (0.94, compared to 0.83 for rules-based), and a higher overall AUC (0.98, compared to 0.90 for rules-based). Our NLP system performed well in identifying the 26 lumbar spine findings, as benchmarked by reference-standard annotation by medical experts. Machine-learned models provided substantial gains in model sensitivity with slight loss of specificity, and overall higher AUC. Copyright © 2018 The Association of University Radiologists. All rights reserved.
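
    The inter-rater agreement statistic mentioned above, Cohen's kappa, compares observed agreement between two annotators with the agreement expected by chance. The sketch below shows the standard two-rater calculation; the label vectors are made up for illustration and are not data from the study.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations: 1 = finding present, 0 = finding absent.
expert_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
expert_2 = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
print(round(cohens_kappa(expert_1, expert_2), 3))  # 0.583
```

    In this made-up example the raters agree on 8 of 10 items, but after correcting for chance agreement (0.52) the kappa is about 0.58, just under the 0.60 threshold quoted in the abstract.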

  8. Advancing Research in Second Language Writing through Computational Tools and Machine Learning Techniques: A Research Agenda

    ERIC Educational Resources Information Center

    Crossley, Scott A.

    2013-01-01

    This paper provides an agenda for replication studies focusing on second language (L2) writing and the use of natural language processing (NLP) tools and machine learning algorithms. Specifically, it introduces a range of the available NLP tools and machine learning algorithms and demonstrates how these could be used to replicate seminal studies…

  9. Adapting Semantic Natural Language Processing Technology to Address Information Overload in Influenza Epidemic Management

    PubMed Central

    Keselman, Alla; Rosemblat, Graciela; Kilicoglu, Halil; Fiszman, Marcelo; Jin, Honglan; Shin, Dongwook; Rindflesch, Thomas C.

    2013-01-01

    Explosion of disaster health information results in information overload among response professionals. The objective of this project was to determine the feasibility of applying semantic natural language processing (NLP) technology to addressing this overload. The project characterizes concepts and relationships commonly used in disaster health-related documents on influenza pandemics, as the basis for adapting an existing semantic summarizer to the domain. Methods include human review and semantic NLP analysis of a set of relevant documents. This is followed by a pilot-test in which two information specialists use the adapted application for a realistic information seeking task. According to the results, the ontology of influenza epidemics management can be described via a manageable number of semantic relationships that involve concepts from a limited number of semantic types. Test users demonstrate several ways to engage with the application to obtain useful information. This suggests that existing semantic NLP algorithms can be adapted to support information summarization and visualization in influenza epidemics and other disaster health areas. However, additional research is needed in the areas of terminology development (as many relevant relationships and terms are not part of existing standardized vocabularies), NLP, and user interface design. PMID:24311971

  10. Enhancing Grammatical Structures in Web-Based Texts

    ERIC Educational Resources Information Center

    Zilio, Leonardo; Wilkens, Rodrigo; Fairon, Cédrick

    2017-01-01

    Presentation of raw text to language learners is not enough to ensure learning. Thus, we present the Smart and Immersive Language Learning Environment (SMILLE), a system that uses Natural Language Processing (NLP) for enhancing grammatical information in texts chosen by a given user. The enhancements, carried out by means of text highlighting, are…

  11. Automated Assessment of Medical Students' Clinical Exposures according to AAMC Geriatric Competencies.

    PubMed

    Chen, Yukun; Wrenn, Jesse; Xu, Hua; Spickard, Anderson; Habermann, Ralf; Powers, James; Denny, Joshua C

    2014-01-01

    Competence is essential for health care professionals. Current methods to assess competency, however, do not efficiently capture medical students' experience. In this preliminary study, we used machine learning and natural language processing (NLP) to identify geriatric competency exposures from students' clinical notes. The system applied NLP to generate the concepts and related features from notes. We extracted a refined list of concepts associated with corresponding competencies. This system was evaluated through 10-fold cross validation for six geriatric competency domains: "medication management (MedMgmt)", "cognitive and behavioral disorders (CBD)", "falls, balance, gait disorders (Falls)", "self-care capacity (SCC)", "palliative care (PC)", "hospital care for elders (HCE)" - each an Association of American Medical Colleges competency for medical students. The system could accurately assess MedMgmt, SCC, HCE, and Falls competencies with F-measures of 0.94, 0.86, 0.85, and 0.84, respectively, but did not attain good performance for PC and CBD (0.69 and 0.62 in F-measure, respectively).

  12. Automated extraction of family history information from clinical notes.

    PubMed

    Bill, Robert; Pakhomov, Serguei; Chen, Elizabeth S; Winden, Tamara J; Carter, Elizabeth W; Melton, Genevieve B

    2014-01-01

    Despite increased functionality for obtaining family history in a structured format within electronic health record systems, clinical notes often still contain this information. We developed and evaluated an Unstructured Information Management Application (UIMA)-based natural language processing (NLP) module for automated extraction of family history information with functionality for identifying statements, observations (e.g., disease or procedure), relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predication ("indicator phrases"), the latter of which was used to establish relationships between observations and family member. The family history NLP system demonstrated F-scores of 66.9, 92.4, 82.9, 57.3, 97.7, and 61.9 for detection of family history statements, family member identification, observation identification, negation identification, vital status, and overall extraction of the predications between family members and observations, respectively. While the system performed well for detection of family history statements and predication constituents, further work is needed to improve extraction of certainty and temporal modifications.

  13. Automated Extraction of Family History Information from Clinical Notes

    PubMed Central

    Bill, Robert; Pakhomov, Serguei; Chen, Elizabeth S.; Winden, Tamara J.; Carter, Elizabeth W.; Melton, Genevieve B.

    2014-01-01

    Despite increased functionality for obtaining family history in a structured format within electronic health record systems, clinical notes often still contain this information. We developed and evaluated an Unstructured Information Management Application (UIMA)-based natural language processing (NLP) module for automated extraction of family history information with functionality for identifying statements, observations (e.g., disease or procedure), relative or side of family with attributes (i.e., vital status, age of diagnosis, certainty, and negation), and predication (“indicator phrases”), the latter of which was used to establish relationships between observations and family member. The family history NLP system demonstrated F-scores of 66.9, 92.4, 82.9, 57.3, 97.7, and 61.9 for detection of family history statements, family member identification, observation identification, negation identification, vital status, and overall extraction of the predications between family members and observations, respectively. While the system performed well for detection of family history statements and predication constituents, further work is needed to improve extraction of certainty and temporal modifications. PMID:25954443

  14. Dietary Nanosized Lactobacillus plantarum Enhances the Anticancer Effect of Kimchi on Azoxymethane and Dextran Sulfate Sodium-Induced Colon Cancer in C57BL/6J Mice.

    PubMed

    Lee, Hyun Ah; Kim, Hyunung; Lee, Kwang-Won; Park, Kun-Young

    2016-01-01

    This study was undertaken to evaluate enhancement of the chemopreventive properties of kimchi by dietary nanosized Lactobacillus (Lab.) plantarum (nLp) in an azoxymethane (AOM)/dextran sulfate sodium (DSS)-induced colitis-associated colorectal cancer C57BL/6J mouse model. nLp is a dead, shrunken, processed form of Lab. plantarum isolated from kimchi that is 0.5-1.0 µm in size. The results obtained showed that animals fed kimchi with nLp (K-nLp) had longer colons and lower colon weight/length ratios and developed fewer tumors than mice fed kimchi alone (K). In addition, K-nLp administration reduced serum levels of proinflammatory cytokines and mediated the mRNA and protein expressions of inflammatory, apoptotic, and cell-cycle markers to suppress inflammation and induce tumor-cell apoptosis and cell-cycle arrest. Moreover, it elevated natural killer-cell cytotoxicity. The study suggests adding nLp to kimchi could improve the suppressive effect of kimchi on AOM/DSS-induced colorectal cancer. These findings indicate nLp has potential use as a functional chemopreventive ingredient in the food industry.

  15. Functional evaluation of out-of-the-box text-mining tools for data-mining tasks.

    PubMed

    Jung, Kenneth; LePendu, Paea; Iyer, Srinivasan; Bauer-Mehren, Anna; Percha, Bethany; Shah, Nigam H

    2015-01-01

    The trade-off between the speed and simplicity of dictionary-based term recognition and the richer linguistic information provided by more advanced natural language processing (NLP) is an area of active discussion in clinical informatics. In this paper, we quantify this trade-off among text processing systems that make different trade-offs between speed and linguistic understanding. We tested both types of systems in three clinical research tasks: phase IV safety profiling of a drug, learning adverse drug-drug interactions, and learning used-to-treat relationships between drugs and indications. We first benchmarked the accuracy of the NCBO Annotator and REVEAL in a manually annotated, publicly available dataset from the 2008 i2b2 Obesity Challenge. We then applied the NCBO Annotator and REVEAL to 9 million clinical notes from the Stanford Translational Research Integrated Database Environment (STRIDE) and used the resulting data for three research tasks. There is no significant difference between using the NCBO Annotator and REVEAL in the results of the three research tasks when using large datasets. In one subtask, REVEAL achieved higher sensitivity with smaller datasets. For a variety of tasks, employing simple term recognition methods instead of advanced NLP methods results in little or no impact on accuracy when using large datasets. Simpler dictionary-based methods have the advantage of scaling well to very large datasets. Promoting the use of simple, dictionary-based methods for population level analyses can advance adoption of NLP in practice. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association.
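
    To make "dictionary-based term recognition" concrete, here is a minimal sketch of a greedy longest-match lookup against a term list. It is not the NCBO Annotator's or REVEAL's actual algorithm; the tokenizer, the function names (`tokenize`, `build_index`, `recognize`) and the sample terms are all assumptions chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(terms):
    """Map each term's token tuple to the term itself and record the
    longest term length (in tokens)."""
    index = {tuple(tokenize(t)): t for t in terms}
    max_len = max((len(k) for k in index), default=0)
    return index, max_len


def recognize(text, index, max_len):
    """Scan left to right, preferring the longest dictionary match at
    each position; unmatched tokens are skipped."""
    tokens = tokenize(text)
    found, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in index:
                found.append(index[key])
                i += n
                break
        else:
            i += 1  # no term starts here
    return found
```

    Each position costs at most `max_len` hash lookups, which is one reason this style of matching scales to very large note collections; it does no negation or context handling, which is the linguistic information the richer systems add.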

  16. Building a comprehensive syntactic and semantic corpus of Chinese clinical texts.

    PubMed

    He, Bin; Dong, Bin; Guan, Yi; Yang, Jinfeng; Jiang, Zhipeng; Yu, Qiubin; Cheng, Jianyi; Qu, Chunyan

    2017-05-01

    To build a comprehensive corpus covering syntactic and semantic annotations of Chinese clinical texts with corresponding annotation guidelines and methods as well as to develop tools trained on the annotated corpus, which supplies baselines for research on Chinese texts in the clinical domain. An iterative annotation method was proposed to train annotators and to develop annotation guidelines. Then, by using annotation quality assurance measures, a comprehensive corpus was built, containing annotations of part-of-speech (POS) tags, syntactic tags, entities, assertions, and relations. Inter-annotator agreement (IAA) was calculated to evaluate the annotation quality and a Chinese clinical text processing and information extraction system (CCTPIES) was developed based on our annotated corpus. The syntactic corpus consists of 138 Chinese clinical documents with 47,426 tokens and 2612 full parsing trees, while the semantic corpus includes 992 documents that annotated 39,511 entities with their assertions and 7693 relations. IAA evaluation shows that this comprehensive corpus is of good quality, and the system modules are effective. The annotated corpus makes a considerable contribution to natural language processing (NLP) research into Chinese texts in the clinical domain. However, this corpus has a number of limitations. Some additional types of clinical text should be introduced to improve corpus coverage and active learning methods should be utilized to promote annotation efficiency. In this study, several annotation guidelines and an annotation method for Chinese clinical texts were proposed, and a comprehensive corpus with its NLP modules was constructed, providing a foundation for further study of applying NLP techniques to Chinese texts in the clinical domain. Copyright © 2017. Published by Elsevier Inc.
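
    The record does not state which inter-annotator agreement statistic was used. As one common possibility for label-level tasks such as POS tagging, the sketch below computes Cohen's kappa for two annotators; the function name `cohen_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)
```

    For span-based annotations such as entities and relations, agreement is often reported as an F-measure between annotators instead, because there is no fixed set of items to label.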

  17. Advocate: A Distributed Architecture for Speech-to-Speech Translation

    DTIC Science & Technology

    2009-01-01

    tecture, are either wrapped natural-language processing (NLP) components or objects developed from scratch using the architecture’s API. GATE is...framework, we put together a demonstration Arabic-to-English speech translation system using both internally developed (Arabic speech recognition and MT...conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were

  18. Inappropriate Expression of an NLP Effector in Colletotrichum orbiculare Impairs Infection on Cucurbitaceae Cultivars via Plant Recognition of the C-Terminal Region.

    PubMed

    Azmi, Nur Sabrina Ahmad; Singkaravanit-Ogawa, Suthitar; Ikeda, Kyoko; Kitakura, Saeko; Inoue, Yoshihiro; Narusaka, Yoshihiro; Shirasu, Ken; Kaido, Masanori; Mise, Kazuyuki; Takano, Yoshitaka

    2018-01-01

    The hemibiotrophic pathogen Colletotrichum orbiculare preferentially expresses a necrosis and ethylene-inducing peptide 1 (Nep1)-like protein named NLP1 during the switch to necrotrophy. Here, we report that the constitutive expression of NLP1 in C. orbiculare blocks pathogen infection in multiple Cucurbitaceae cultivars via their enhanced defense responses. NLP1 has a cytotoxic activity that induces cell death in Nicotiana benthamiana. However, C. orbiculare transgenic lines constitutively expressing a mutant NLP1 lacking the cytotoxic activity still failed to infect cucumber, indicating no clear relationship between cytotoxic activity and the NLP1-dependent enhanced defense. NLP1 also possesses the microbe-associated molecular pattern (MAMP) sequence called nlp24, recognized by Arabidopsis thaliana at its central region, similar to NLPs of other pathogens. Surprisingly, inappropriate expression of a mutant NLP1 lacking the MAMP signature is also effective for blocking pathogen infection, uncoupling the infection block from the corresponding MAMP. Notably, the deletion analyses of NLP1 suggested that the C-terminal region of NLP1 is critical to enhance defense in cucumber. The expression of mCherry fused with the C-terminal 32 amino acids of NLP1 was enough to trigger the defense of cucurbits, revealing that the C-terminal region of the NLP1 protein is recognized by cucurbits and, then, terminates C. orbiculare infection.

  19. Terminology model discovery using natural language processing and visualization techniques.

    PubMed

    Zhou, Li; Tao, Ying; Cimino, James J; Chen, Elizabeth S; Liu, Hongfang; Lussier, Yves A; Hripcsak, George; Friedman, Carol

    2006-12-01

    Medical terminologies are important for unambiguous encoding and exchange of clinical information. The traditional manual method of developing terminology models is time-consuming and limited in the number of phrases that a human developer can examine. In this paper, we present an automated method for developing medical terminology models based on natural language processing (NLP) and information visualization techniques. Surgical pathology reports were selected as the testing corpus for developing a pathology procedure terminology model. The use of a general NLP processor for the medical domain, MedLEE, provides an automated method for acquiring semantic structures from a free text corpus and sheds light on a new high-throughput method of medical terminology model development. The use of an information visualization technique supports the summarization and visualization of the large quantity of semantic structures generated from medical documents. We believe that a general method based on NLP and information visualization will facilitate the modeling of medical terminologies.

  20. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary

    PubMed Central

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda

    2014-01-01

    In April 2012, the National Institutes of Health organized a two-day workshop entitled ‘Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making’ (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients. PMID:23921193

  1. Collaborative human-machine analysis to disambiguate entities in unstructured text and structured datasets

    NASA Astrophysics Data System (ADS)

    Davenport, Jack H.

    2016-05-01

    Intelligence analysts demand rapid information fusion capabilities to develop and maintain accurate situational awareness and understanding of dynamic enemy threats in asymmetric military operations. The ability to extract relationships between people, groups, and locations from a variety of text datasets is critical to proactive decision making. The derived network of entities must be automatically created and presented to analysts to assist in decision making. DECISIVE ANALYTICS Corporation (DAC) provides capabilities to automatically extract entities, relationships between entities, semantic concepts about entities, and network models of entities from text and multi-source datasets. DAC's Natural Language Processing (NLP) Entity Analytics model entities as complex systems of attributes and interrelationships which are extracted from unstructured text via NLP algorithms. The extracted entities are automatically disambiguated via machine learning algorithms, and resolution recommendations are presented to the analyst for validation; the analyst's expertise is leveraged in this hybrid human/computer collaborative model. Military capability is enhanced by these NLP Entity Analytics because analysts can now create/update an entity profile with intelligence automatically extracted from unstructured text, thereby fusing entity knowledge from structured and unstructured data sources. Operational and sustainment costs are reduced since analysts do not have to manually tag and resolve entities.

  2. Sociolinguistically Informed Natural Language Processing: Automating Irony Detection

    DTIC Science & Technology

    2017-10-23

    ML and NLP technologies fail to detect ironic intent empirically. We specifically proposed to assess quantitatively (using the collected dataset...Aim 2. To analyze when existing ML and NLP technologies fail to detect ironic intent empirically. We specifically proposed to assess quantitatively ...of the embedding reddit thread, and the other comments in this thread) constitute 4 sub-reddit (URL) description number of labeled comments politics

  3. The role of centrosomal Nlp in the control of mitotic progression and tumourigenesis.

    PubMed

    Li, J; Zhan, Q

    2011-05-10

    The human centrosomal ninein-like protein (Nlp) is a new member of the γ-tubulin complexes binding proteins (GTBPs) that is essential for proper execution of various mitotic events. The primary function of Nlp is to promote microtubule nucleation that contributes to centrosome maturation, spindle formation and chromosome segregation. Its subcellular localisation and protein stability are regulated by several crucial mitotic kinases, such as Plk1, Nek2, Cdc2 and Aurora B. Several lines of evidence have linked Nlp to human cancer. Deregulation of Nlp in cell models results in aberrant spindle, chromosomal missegregation and multinuclei, and induces chromosomal instability and renders cells tumourigenic. Overexpression of Nlp induces anchorage-independent growth and immortalised primary cell transformation. In addition, we first demonstrate that the expression of Nlp is elevated primarily due to NLP gene amplification in human breast cancer and lung carcinoma. Consistently, transgenic mice overexpressing Nlp display spontaneous tumours in breast, ovary and testicle, and show rapid onset of radiation-induced lymphoma, indicating that Nlp is involved in tumourigenesis. This review summarises our current knowledge of physiological roles of Nlp, with an emphasis on its potentials in tumourigenesis.

  4. The role of centrosomal Nlp in the control of mitotic progression and tumourigenesis

    PubMed Central

    Li, J; Zhan, Q

    2011-01-01

    The human centrosomal ninein-like protein (Nlp) is a new member of the γ-tubulin complexes binding proteins (GTBPs) that is essential for proper execution of various mitotic events. The primary function of Nlp is to promote microtubule nucleation that contributes to centrosome maturation, spindle formation and chromosome segregation. Its subcellular localisation and protein stability are regulated by several crucial mitotic kinases, such as Plk1, Nek2, Cdc2 and Aurora B. Several lines of evidence have linked Nlp to human cancer. Deregulation of Nlp in cell models results in aberrant spindle, chromosomal missegregation and multinuclei, and induces chromosomal instability and renders cells tumourigenic. Overexpression of Nlp induces anchorage-independent growth and immortalised primary cell transformation. In addition, we first demonstrate that the expression of Nlp is elevated primarily due to NLP gene amplification in human breast cancer and lung carcinoma. Consistently, transgenic mice overexpressing Nlp display spontaneous tumours in breast, ovary and testicle, and show rapid onset of radiation-induced lymphoma, indicating that Nlp is involved in tumourigenesis. This review summarises our current knowledge of physiological roles of Nlp, with an emphasis on its potentials in tumourigenesis. PMID:21505454

  5. Direct transcriptional activation of BT genes by NLP transcription factors is a key component of the nitrate response in Arabidopsis.

    PubMed

    Sato, Takeo; Maekawa, Shugo; Konishi, Mineko; Yoshioka, Nozomi; Sasaki, Yuki; Maeda, Haruna; Ishida, Tetsuya; Kato, Yuki; Yamaguchi, Junji; Yanagisawa, Shuichi

    2017-01-29

    Nitrate modulates growth and development, functioning as a nutrient signal in plants. Although many changes in physiological processes in response to nitrate have been well characterized as nitrate responses, the molecular mechanisms underlying the nitrate response are not yet fully understood. Here, we show that NLP transcription factors, which are key regulators of the nitrate response, directly activate the nitrate-inducible expression of BT1 and BT2 encoding putative scaffold proteins with a plant-specific domain structure in Arabidopsis. Interestingly, the 35S promoter-driven expression of BT2 partially rescued growth inhibition caused by reductions in NLP activity in Arabidopsis. Furthermore, simultaneous disruption of BT1 and BT2 affected nitrate-dependent lateral root development. These results suggest that direct activation of BT1 and BT2 by NLP transcriptional activators is a key component of the molecular mechanism underlying the nitrate response in Arabidopsis. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. C-5M Super Galaxy Utilization with Joint Precision Airdrop System

    DTIC Science & Technology

    2012-03-22

    System Notes FireFly 900-2,200 Steerable Parafoil Screamer 500-2,200 Steerable Parafoil w/additional chutes to slow touchdown Dragonfly...setting. This initial feasible solution provides the Nonlinear Program algorithm a starting point to continue its calculations. The model continues...provides the NLP with a starting point of 1. This provides the NLP algorithm a point within the feasible region to begin its calculations in an attempt

  7. Coordinate regulation of the mother centriole component nlp by nek2 and plk1 protein kinases.

    PubMed

    Rapley, Joseph; Baxter, Joanne E; Blot, Joelle; Wattam, Samantha L; Casenghi, Martina; Meraldi, Patrick; Nigg, Erich A; Fry, Andrew M

    2005-02-01

    Mitotic entry requires a major reorganization of the microtubule cytoskeleton. Nlp, a centrosomal protein that binds gamma-tubulin, is a G(2)/M target of the Plk1 protein kinase. Here, we show that human Nlp and its Xenopus homologue, X-Nlp, are also phosphorylated by the cell cycle-regulated Nek2 kinase. X-Nlp is a 213-kDa mother centriole-specific protein, implicating it in microtubule anchoring. Although constant in abundance throughout the cell cycle, it is displaced from centrosomes upon mitotic entry. Overexpression of active Nek2 or Plk1 causes premature displacement of Nlp from interphase centrosomes. Active Nek2 is also capable of phosphorylating and displacing a mutant form of Nlp that lacks Plk1 phosphorylation sites. Importantly, kinase-inactive Nek2 interferes with Plk1-induced displacement of Nlp from interphase centrosomes and displacement of endogenous Nlp from mitotic spindle poles, while active Nek2 stimulates Plk1 phosphorylation of Nlp in vitro. Unlike Plk1, Nek2 does not prevent association of Nlp with gamma-tubulin. Together, these results provide the first example of a protein involved in microtubule organization that is coordinately regulated at the G(2)/M transition by two centrosomal kinases. We also propose that phosphorylation by Nek2 may prime Nlp for phosphorylation by Plk1.

  8. Using natural language processing for identification of herpes zoster ophthalmicus cases to support population-based study.

    PubMed

    Zheng, Chengyi; Luo, Yi; Mercado, Cheryl; Sy, Lina; Jacobsen, Steven J; Ackerson, Brad; Lewin, Bruno; Tseng, Hung Fu

    2018-06-19

    Diagnosis codes are inadequate for accurately identifying herpes zoster ophthalmicus (HZO). There is a significant lack of population-based studies on HZO due to the high expense of manual review of medical records. To assess whether HZO can be identified from the clinical notes using natural language processing (NLP). To investigate the epidemiology of HZO among the HZ population based on the developed approach. A retrospective cohort analysis. A total of 49,914 southern California residents aged over 18 years, who had a new diagnosis of HZ. An NLP-based algorithm was developed and validated with the manually curated validation dataset (n=461). The algorithm was applied on over 1 million clinical notes associated with the study population. HZO versus non-HZO cases were compared by age, sex, race, and comorbidities. We measured the accuracy of the NLP algorithm. The NLP algorithm achieved 95.6% sensitivity and 99.3% specificity. Compared to the diagnosis codes, NLP identified significantly more HZO cases among the HZ population (13.9% versus 1.7%). Compared to the non-HZO group, the HZO group was older, had more males, had more Whites, and had more outpatient visits. We developed and validated an automatic method to identify HZO cases with high accuracy. As one of the largest studies on HZO, our finding emphasizes the importance of preventing HZ in the elderly population. This method can be a valuable tool to support population-based studies and clinical care of HZO in the era of big data. This article is protected by copyright. All rights reserved.
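
    Sensitivity and specificity, as reported above, come from comparing the algorithm's output with the manually curated labels. The sketch below shows the calculation on paired boolean labels; the function name `sensitivity_specificity` and the example labels are hypothetical and are not the study's data.

```python
def sensitivity_specificity(gold, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    sensitivity = TP / (TP + FN): share of true cases the algorithm finds.
    specificity = TN / (TN + FP): share of non-cases it correctly rejects.
    """
    tp = sum(g and p for g, p in zip(gold, predicted))
    fn = sum(g and not p for g, p in zip(gold, predicted))
    tn = sum(not g and not p for g, p in zip(gold, predicted))
    fp = sum(not g and p for g, p in zip(gold, predicted))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative gold label")
    return tp / (tp + fn), tn / (tn + fp)
```

    Neither measure depends on how common the condition is in the validation set, which is why they are usually reported alongside prevalence-dependent figures such as positive predictive value.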

  9. Rapid Training of Information Extraction with Local and Global Data Views

    DTIC Science & Technology

    2012-05-01

    56 xiii 4.1 An example of words and their bit string representations. Bold ones are transliterated Arabic words...Natural Language Processing (NLP) community faces new tasks and new domains all the time. Without enough labeled data of a new task or a new domain to...conduct supervised learning, semi-supervised learning is particularly attractive to NLP researchers since it only requires a handful of labeled examples

  10. Reliable Electronic Text: The Elusive Prerequisite for a Host of Human Language Technologies

    DTIC Science & Technology

    2010-09-30

    is not always the case—for example, ligatures in Latin-fonts, and glyphs in Arabic fonts (King, 2008; Carrier, 2009). This complexity, and others...such effects can render electronic text useless for natural language processing (NLP). Typically, file converters do not expose the details of the...the many component NLP technologies typically used inside information extraction and text categorization applications, such as tokenization, part-of

  11. Centrosomal Nlp is an oncogenic protein that is gene-amplified in human tumors and causes spontaneous tumorigenesis in transgenic mice.

    PubMed

    Shao, Shujuan; Liu, Rong; Wang, Yang; Song, Yongmei; Zuo, Lihui; Xue, Liyan; Lu, Ning; Hou, Ning; Wang, Mingrong; Yang, Xiao; Zhan, Qimin

    2010-02-01

    Disruption of mitotic events contributes greatly to genomic instability and results in mutator phenotypes. Indeed, abnormalities of mitotic components are closely associated with malignant transformation and tumorigenesis. Here we show that ninein-like protein (Nlp), a recently identified BRCA1-associated centrosomal protein involved in microtubule nucleation and spindle formation, is an oncogenic protein. Nlp was found to be overexpressed in approximately 80% of human breast and lung carcinomas analyzed. In human lung cancers, this deregulated expression was associated with NLP gene amplification. Further analysis revealed that Nlp exhibited strong oncogenic properties; for example, it conferred to NIH3T3 rodent fibroblasts the capacity for anchorage-independent growth in vitro and tumor formation in nude mice. Consistent with these data, transgenic mice overexpressing Nlp displayed spontaneous tumorigenesis in the breast, ovary, and testicle within 60 weeks. In addition, Nlp overexpression induced more rapid onset of radiation-induced lymphoma. Furthermore, mouse embryonic fibroblasts (MEFs) derived from Nlp transgenic mice showed centrosome amplification, suggesting that Nlp overexpression mimics BRCA1 loss. These findings demonstrate that Nlp abnormalities may contribute to genomic instability and tumorigenesis and suggest that Nlp might serve as a potential biomarker for clinical diagnosis and therapeutic target.

  12. Centrosomal Nlp is an oncogenic protein that is gene-amplified in human tumors and causes spontaneous tumorigenesis in transgenic mice

    PubMed Central

    Shao, Shujuan; Liu, Rong; Wang, Yang; Song, Yongmei; Zuo, Lihui; Xue, Liyan; Lu, Ning; Hou, Ning; Wang, Mingrong; Yang, Xiao; Zhan, Qimin

    2010-01-01

    Disruption of mitotic events contributes greatly to genomic instability and results in mutator phenotypes. Indeed, abnormalities of mitotic components are closely associated with malignant transformation and tumorigenesis. Here we show that ninein-like protein (Nlp), a recently identified BRCA1-associated centrosomal protein involved in microtubule nucleation and spindle formation, is an oncogenic protein. Nlp was found to be overexpressed in approximately 80% of human breast and lung carcinomas analyzed. In human lung cancers, this deregulated expression was associated with NLP gene amplification. Further analysis revealed that Nlp exhibited strong oncogenic properties; for example, it conferred to NIH3T3 rodent fibroblasts the capacity for anchorage-independent growth in vitro and tumor formation in nude mice. Consistent with these data, transgenic mice overexpressing Nlp displayed spontaneous tumorigenesis in the breast, ovary, and testicle within 60 weeks. In addition, Nlp overexpression induced more rapid onset of radiation-induced lymphoma. Furthermore, mouse embryonic fibroblasts (MEFs) derived from Nlp transgenic mice showed centrosome amplification, suggesting that Nlp overexpression mimics BRCA1 loss. These findings demonstrate that Nlp abnormalities may contribute to genomic instability and tumorigenesis and suggest that Nlp might serve as a potential biomarker for clinical diagnosis and therapeutic target. PMID:20093778

  13. Determining post-test risk in a national sample of stress nuclear myocardial perfusion imaging reports: Implications for natural language processing tools.

    PubMed

    Levy, Andrew E; Shah, Nishant R; Matheny, Michael E; Reeves, Ruth M; Gobbel, Glenn T; Bradley, Steven M

    2018-04-25

    Reporting standards promote clarity and consistency of stress myocardial perfusion imaging (MPI) reports, but do not require an assessment of post-test risk. Natural Language Processing (NLP) tools could potentially help estimate this risk, yet it is unknown whether reports contain adequate descriptive data to use NLP. Among VA patients who underwent stress MPI and coronary angiography between January 1, 2009 and December 31, 2011, 99 stress test reports were randomly selected for analysis. Two reviewers independently categorized each report for the presence of critical data elements essential to describing post-test ischemic risk. Few stress MPI reports provided a formal assessment of post-test risk within the impression section (3%) or the entire document (4%). In most cases, risk was determinable by combining critical data elements (74% impression, 98% whole). If ischemic risk was not determinable (25% impression, 2% whole), inadequate description of systolic function (9% impression, 1% whole) and inadequate description of ischemia (5% impression, 1% whole) were most commonly implicated. Post-test ischemic risk was determinable but rarely reported in this sample of stress MPI reports. This supports the potential use of NLP to help clarify risk. Further study of NLP in this context is needed.

  14. BRCA1 interaction of centrosomal protein Nlp is required for successful mitotic progression.

    PubMed

    Jin, Shunqian; Gao, Hua; Mazzacurati, Lucia; Wang, Yang; Fan, Wenhong; Chen, Qiang; Yu, Wei; Wang, Mingrong; Zhu, Xueliang; Zhang, Chuanmao; Zhan, Qimin

    2009-08-21

    Breast cancer susceptibility gene BRCA1 is implicated in the control of mitotic progression, although the underlying mechanism(s) remains to be further defined. Deficiency of BRCA1 function leads to disrupted mitotic machinery and genomic instability. Here, we show that BRCA1 physically interacts and colocalizes with Nlp, an important molecule involved in centrosome maturation and spindle formation. Interestingly, Nlp centrosomal localization and its protein stability are regulated by normal cellular BRCA1 function because cells containing BRCA1 mutations or silenced for endogenous BRCA1 exhibit disrupted Nlp colocalization to centrosomes and enhanced Nlp degradation. It is likely that the BRCA1 regulation of Nlp stability involves Plk1 suppression. Inhibition of endogenous Nlp via the small interfering RNA approach results in aberrant spindle formation, aborted chromosomal segregation, and aneuploidy, which mimic the phenotypes of disrupted BRCA1. Thus, BRCA1 interaction of Nlp might be required for the successful mitotic progression, and abnormalities of Nlp lead to genomic instability.

  15. BRCA1 Interaction of Centrosomal Protein Nlp Is Required for Successful Mitotic Progression

    PubMed Central

    Jin, Shunqian; Gao, Hua; Mazzacurati, Lucia; Wang, Yang; Fan, Wenhong; Chen, Qiang; Yu, Wei; Wang, Mingrong; Zhu, Xueliang; Zhang, Chuanmao; Zhan, Qimin

    2009-01-01

    Breast cancer susceptibility gene BRCA1 is implicated in the control of mitotic progression, although the underlying mechanism(s) remains to be further defined. Deficiency of BRCA1 function leads to disrupted mitotic machinery and genomic instability. Here, we show that BRCA1 physically interacts and colocalizes with Nlp, an important molecule involved in centrosome maturation and spindle formation. Interestingly, Nlp centrosomal localization and its protein stability are regulated by normal cellular BRCA1 function because cells containing BRCA1 mutations or silenced for endogenous BRCA1 exhibit disrupted Nlp colocalization to centrosomes and enhanced Nlp degradation. It is likely that the BRCA1 regulation of Nlp stability involves Plk1 suppression. Inhibition of endogenous Nlp via the small interfering RNA approach results in aberrant spindle formation, aborted chromosomal segregation, and aneuploidy, which mimic the phenotypes of disrupted BRCA1. Thus, BRCA1 interaction of Nlp might be required for the successful mitotic progression, and abnormalities of Nlp lead to genomic instability. PMID:19509300

  16. Tailoring vocabularies for NLP in sub-domains: a method to detect unused word sense.

    PubMed

    Figueroa, Rosa L; Zeng-Treitler, Qing; Goryachev, Sergey; Wiechmann, Eduardo P

    2009-11-14

    We developed a method to help tailor a comprehensive vocabulary system (e.g. the UMLS) for a sub-domain (e.g. clinical reports) in support of natural language processing (NLP). The method detects unused sense in a sub-domain by comparing the relational neighborhood of a word/term in the vocabulary with the semantic neighborhood of the word/term in the sub-domain. The semantic neighborhood of the word/term in the sub-domain is determined using latent semantic analysis (LSA). We trained and tested the unused sense detection on two clinical text corpora: one contains discharge summaries and the other outpatient visit notes. We were able to detect unused senses with precision from 79% to 87%, recall from 48% to 74%, and an area under receiver operation curve (AUC) of 72% to 87%.

  17. Using NLP to identify cancer cases in imaging reports drawn from radiology information systems.

    PubMed

    Patrick, Jon; Asgari, Pooyan; Li, Min; Nguyen, Dung

    2013-01-01

    A natural language processing (NLP) classifier has been developed for the Victorian and NSW Cancer Registries with the purpose of automatically identifying cancer reports from imaging services, transmitting them to the Registries and then extracting pertinent cancer information. Large scale trials conducted on over 40,000 reports show the sensitivity for identifying reportable cancer reports is above 98% with a specificity above 96%. Detection of tumour stream, report purpose, and a variety of extracted content is generally above 90% specificity. The differences between report layout and authoring strategies across imaging services appear to require different classifiers to retain this high level of accuracy. Linkage of the imaging data with existing registry records (hospital and pathology reports) to derive stage and recurrence of cancer has commenced and shown very promising results.
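
    The sensitivity and specificity figures quoted in this abstract follow from confusion-matrix counts. The sketch below is only an illustration of those two definitions; the function name and the counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of truly reportable reports that were flagged.
    specificity = TN / (TN + FP): share of non-reportable reports correctly passed over.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical counts chosen for illustration only (not data from the study).
sens, spec = sensitivity_specificity(tp=985, fn=15, tn=970, fp=30)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

    The zero-denominator guards simply return 0.0 when a class is absent from the evaluation set; a real evaluation might prefer to report such a case as undefined.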

  18. Extracting laboratory test information from biomedical text

    PubMed Central

    Kang, Yanna Shen; Kayaalp, Mehmet

    2013-01-01

    Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058

  19. Cell-free production of a functional oligomeric form of a Chlamydia major outer-membrane protein (MOMP) for vaccine development

    DOE PAGES

    He, Wei; Felderman, Martina; Evans, Angela C.; ...

    2017-07-24

    Chlamydia is a prevalent sexually transmitted disease that infects more than 100 million people worldwide. Although most individuals infected with Chlamydia trachomatis are initially asymptomatic, symptoms can arise if left undiagnosed. Long-term infection can result in debilitating conditions such as pelvic inflammatory disease, infertility, and blindness. Chlamydia infection, therefore, constitutes a significant public health threat, underscoring the need for a Chlamydia-specific vaccine. Chlamydia strains express a major outer-membrane protein (MOMP) that has been shown to be an effective vaccine antigen. However, approaches to produce a functional recombinant MOMP protein for vaccine development are limited by poor solubility, low yield, and protein misfolding. For this study, we used an Escherichia coli-based cell-free system to express a MOMP protein from the mouse-specific species Chlamydia muridarum (MoPn-MOMP or mMOMP). The codon-optimized mMOMP gene was co-translated with Δ49apolipoprotein A1 (Δ49ApoA1), a truncated version of mouse ApoA1 in which the N-terminal 49 amino acids were removed. This co-translation process produced mMOMP supported within a telodendrimer nanolipoprotein particle (mMOMP–tNLP). The cell-free expressed mMOMP–tNLPs contain mMOMP multimers similar to the native MOMP protein. This cell-free process produced on average 1.5 mg of purified, water-soluble mMOMP–tNLP complex in a 1-ml cell-free reaction. The mMOMP–tNLP particle also accommodated the co-localization of CpG oligodeoxynucleotide 1826, a single-stranded synthetic DNA adjuvant, eliciting an enhanced humoral immune response in vaccinated mice. Using our mMOMP–tNLP formulation, we demonstrate a unique approach to solubilizing and administering membrane-bound proteins for future vaccine development. This method can be applied to other previously difficult-to-obtain antigens while maintaining full functionality and immunogenicity.

  20. Cell-free production of a functional oligomeric form of a Chlamydia major outer-membrane protein (MOMP) for vaccine development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Wei; Felderman, Martina; Evans, Angela C.

    Chlamydia is a prevalent sexually transmitted disease that infects more than 100 million people worldwide. Although most individuals infected with Chlamydia trachomatis are initially asymptomatic, symptoms can arise if left undiagnosed. Long-term infection can result in debilitating conditions such as pelvic inflammatory disease, infertility, and blindness. Chlamydia infection, therefore, constitutes a significant public health threat, underscoring the need for a Chlamydia-specific vaccine. Chlamydia strains express a major outer-membrane protein (MOMP) that has been shown to be an effective vaccine antigen. However, approaches to produce a functional recombinant MOMP protein for vaccine development are limited by poor solubility, low yield, and protein misfolding. For this study, we used an Escherichia coli-based cell-free system to express a MOMP protein from the mouse-specific species Chlamydia muridarum (MoPn-MOMP or mMOMP). The codon-optimized mMOMP gene was co-translated with Δ49apolipoprotein A1 (Δ49ApoA1), a truncated version of mouse ApoA1 in which the N-terminal 49 amino acids were removed. This co-translation process produced mMOMP supported within a telodendrimer nanolipoprotein particle (mMOMP–tNLP). The cell-free expressed mMOMP–tNLPs contain mMOMP multimers similar to the native MOMP protein. This cell-free process produced on average 1.5 mg of purified, water-soluble mMOMP–tNLP complex in a 1-ml cell-free reaction. The mMOMP–tNLP particle also accommodated the co-localization of CpG oligodeoxynucleotide 1826, a single-stranded synthetic DNA adjuvant, eliciting an enhanced humoral immune response in vaccinated mice. Using our mMOMP–tNLP formulation, we demonstrate a unique approach to solubilizing and administering membrane-bound proteins for future vaccine development. This method can be applied to other previously difficult-to-obtain antigens while maintaining full functionality and immunogenicity.

  1. Cdc2/cyclin B1 regulates centrosomal Nlp proteolysis and subcellular localization.

    PubMed

    Zhao, Xuelian; Jin, Shunqian; Song, Yongmei; Zhan, Qimin

    2010-11-01

    The formation of proper mitotic spindles is required for appropriate chromosome segregation during cell division. Aberrant spindle formation often causes aneuploidy and results in tumorigenesis. However, the underlying mechanism of regulating spindle formation and chromosome separation remains to be further defined. Centrosomal Nlp (ninein-like protein) is a recently characterized BRCA1-regulated centrosomal protein and plays an important role in centrosome maturation and spindle formation. In this study, we show that Nlp can be phosphorylated by cell cycle protein kinase Cdc2/cyclin B1. The phosphorylation sites of Nlp are mapped at Ser185 and Ser589. Interestingly, the Cdc2/cyclin B1 phosphorylation site Ser185 of Nlp is required for its recognition by PLK1, which enables Nlp to depart from centrosomes to allow the establishment of a mitotic scaffold at the onset of mitosis. PLK1 fails to dissociate the Nlp mutant lacking Ser185 from the centrosome, suggesting that Cdc2/cyclin B1 might serve as a primary kinase of PLK1 in regulating Nlp subcellular localization. However, the phosphorylation at the site Ser589 by Cdc2/cyclin B1 plays an important role in Nlp protein stability, probably due to its effect on protein degradation. Furthermore, we show that deregulated expression or subcellular localization of Nlp leads to multinuclei in cells, indicating that scheduled levels of Nlp and proper subcellular localization of Nlp are critical for successful completion of normal cell mitosis. These findings demonstrate that Cdc2/cyclin B1 is a key regulator in maintaining appropriate degradation and subcellular localization of Nlp, providing novel insights into the role of Cdc2/cyclin B1 in mitotic progression.

  2. Natural language processing: state of the art and prospects for significant progress, a workshop sponsored by the National Library of Medicine.

    PubMed

    Friedman, Carol; Rindflesch, Thomas C; Corn, Milton

    2013-10-01

    Natural language processing (NLP) is crucial for advancing healthcare because it is needed to transform relevant information locked in text into structured data that can be used by computer processes aimed at improving patient care and advancing medicine. In light of the importance of NLP to health, the National Library of Medicine (NLM) recently sponsored a workshop to review the state of the art in NLP focusing on text in English, both in biomedicine and in the general language domain. Specific goals of the NLM-sponsored workshop were to identify the current state of the art, grand challenges and specific roadblocks, and to identify effective use and best practices. This paper reports on the main outcomes of the workshop, including an overview of the state of the art, strategies for advancing the field, and obstacles that need to be addressed, resulting in recommendations for a research agenda intended to advance the field. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  3. Adaptable, high recall, event extraction system with minimal configuration.

    PubMed

    Miwa, Makoto; Ananiadou, Sophia

    2015-01-01

    Biomedical event extraction has been a major focus of biomedical natural language processing (BioNLP) research since the first BioNLP shared task was held in 2009. Accordingly, a large number of event extraction systems have been developed. Most such systems, however, have been developed for specific tasks and/or incorporated task specific settings, making their application to new corpora and tasks problematic without modification of the systems themselves. There is thus a need for event extraction systems that can achieve high levels of accuracy when applied to corpora in new domains, without the need for exhaustive tuning or modification, whilst retaining competitive levels of performance. We have enhanced our state-of-the-art event extraction system, EventMine, to alleviate the need for task-specific tuning. Task-specific details are specified in a configuration file, while extensive task-specific parameter tuning is avoided through the integration of a weighting method, a covariate shift method, and their combination. The task-specific configuration and weighting method have been employed within the context of two different sub-tasks of BioNLP shared task 2013, i.e. Cancer Genetics (CG) and Pathway Curation (PC), removing the need to modify the system specifically for each task. With minimal task specific configuration and tuning, EventMine achieved the 1st place in the PC task, and 2nd in the CG, achieving the highest recall for both tasks. The system has been further enhanced following the shared task by incorporating the covariate shift method and entity generalisations based on the task definitions, leading to further performance improvements. We have shown that it is possible to apply a state-of-the-art event extraction system to new tasks with high levels of performance, without having to modify the system internally. Both covariate shift and weighting methods are useful in facilitating the production of high recall systems. These methods and their combination can adapt a model to the target data with no deep tuning and little manual configuration.

  4. Mass Spectrometry of Single GABAergic Somatic Motorneurons Identifies a Novel Inhibitory Peptide, As-NLP-22, in the Nematode Ascaris suum.

    PubMed

    Konop, Christopher J; Knickelbine, Jennifer J; Sygulla, Molly S; Wruck, Colin D; Vestling, Martha M; Stretton, Antony O W

    2015-12-01

    Neuromodulators have become an increasingly important component of functional circuits, dramatically changing the properties of both neurons and synapses to affect behavior. To explore the role of neuropeptides in Ascaris suum behavior, we devised an improved method for cleanly dissecting single motorneuronal cell bodies from the many other cell processes and hypodermal tissue in the ventral nerve cord. We determined their peptide content using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS). The reduced complexity of the peptide mixture greatly aided the detection of peptides; peptide levels were sufficient to permit sequencing by tandem MS from single cells. Inhibitory motorneurons, known to be GABAergic, contain a novel neuropeptide, As-NLP-22 (SLASGRWGLRPamide). From this sequence and information from the A. suum expressed sequence tag (EST) database, we cloned the transcript (As-nlp-22) and synthesized a riboprobe for in situ hybridization, which labeled the inhibitory motorneurons; this validates the integrity of the dissection method, showing that the peptides detected originate from the cells themselves and not from adhering processes from other cells (e.g., synaptic terminals). Synthetic As-NLP-22 has potent inhibitory activity on acetylcholine-induced muscle contraction as well as on basal muscle tone. Both of these effects are dose-dependent: the inhibitory effect on ACh contraction has an IC50 of 8.3 × 10^-9 M. When injected into whole worms, As-NLP-22 produces a dose-dependent inhibition of locomotory movements and, at higher levels, complete paralysis. These experiments demonstrate the utility of MALDI TOF/TOF MS in identifying novel neuromodulators at the single-cell level.

  5. Mass Spectrometry of Single GABAergic Somatic Motorneurons Identifies a Novel Inhibitory Peptide, As-NLP-22, in the Nematode Ascaris suum

    NASA Astrophysics Data System (ADS)

    Konop, Christopher J.; Knickelbine, Jennifer J.; Sygulla, Molly S.; Wruck, Colin D.; Vestling, Martha M.; Stretton, Antony O. W.

    2015-12-01

    Neuromodulators have become an increasingly important component of functional circuits, dramatically changing the properties of both neurons and synapses to affect behavior. To explore the role of neuropeptides in Ascaris suum behavior, we devised an improved method for cleanly dissecting single motorneuronal cell bodies from the many other cell processes and hypodermal tissue in the ventral nerve cord. We determined their peptide content using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (MS). The reduced complexity of the peptide mixture greatly aided the detection of peptides; peptide levels were sufficient to permit sequencing by tandem MS from single cells. Inhibitory motorneurons, known to be GABAergic, contain a novel neuropeptide, As-NLP-22 (SLASGRWGLRPamide). From this sequence and information from the A. suum expressed sequence tag (EST) database, we cloned the transcript (As-nlp-22) and synthesized a riboprobe for in situ hybridization, which labeled the inhibitory motorneurons; this validates the integrity of the dissection method, showing that the peptides detected originate from the cells themselves and not from adhering processes from other cells (e.g., synaptic terminals). Synthetic As-NLP-22 has potent inhibitory activity on acetylcholine-induced muscle contraction as well as on basal muscle tone. Both of these effects are dose-dependent: the inhibitory effect on ACh contraction has an IC50 of 8.3 × 10^-9 M. When injected into whole worms, As-NLP-22 produces a dose-dependent inhibition of locomotory movements and, at higher levels, complete paralysis. These experiments demonstrate the utility of MALDI TOF/TOF MS in identifying novel neuromodulators at the single-cell level.

  6. Automating Assessment of Lifestyle Counseling in Electronic Health Records

    PubMed Central

    Hazlehurst, Brian L.; Lawrence, Jean M.; Donahoo, William T.; Sherwood, Nancy E; Kurtz, Stephen E; Xu, Stan; Steiner, John F

    2015-01-01

    Background: Numerous population-based surveys indicate that overweight and obese patients can benefit from lifestyle counseling during routine clinical care. Purpose: To determine if natural language processing (NLP) could be applied to information in the electronic health record (EHR) to automatically assess delivery of counseling related to weight management in clinical health care encounters. Methods: The MediClass system with NLP capabilities was used to identify weight management counseling in EHR encounter records. Knowledge for the NLP application was derived from the 5As framework for behavior counseling: Ask (evaluate weight and related disease), Advise at-risk patients to lose weight, Assess patients’ readiness to change behavior, Assist through discussion of weight loss methods and programs and Arrange follow-up efforts including referral. Using samples of EHR data from the 1/1/2007-3/31/2011 period from two health systems, the accuracy of the MediClass processor for identifying these counseling elements was evaluated in post-partum visits of 600 women with gestational diabetes mellitus (GDM) compared to manual chart review as gold standard. Data were analyzed in 2013. Results: Mean sensitivity and specificity for each of the 5As compared to the gold standard was at or above 85%, with the exception of sensitivity for Assist which was measured at 40% and 60% respectively for each of the two health systems. The automated method identified many valid cases of Assist not identified in the gold standard. Conclusions: The MediClass processor has performance capability sufficiently similar to human abstractors to permit automated assessment of counseling for weight loss in post-partum encounter records. PMID:24745635

  7. Automating assessment of lifestyle counseling in electronic health records.

    PubMed

    Hazlehurst, Brian L; Lawrence, Jean M; Donahoo, William T; Sherwood, Nancy E; Kurtz, Stephen E; Xu, Stan; Steiner, John F

    2014-05-01

    Numerous population-based surveys indicate that overweight and obese patients can benefit from lifestyle counseling during routine clinical care. To determine if natural language processing (NLP) could be applied to information in the electronic health record (EHR) to automatically assess delivery of weight management-related counseling in clinical healthcare encounters. The MediClass system with NLP capabilities was used to identify weight-management counseling in EHRs. Knowledge for the NLP application was derived from the 5As framework for behavior counseling: Ask (evaluate weight and related disease), Advise at-risk patients to lose weight, Assess patients' readiness to change behavior, Assist through discussion of weight-loss methods and programs, and Arrange follow-up efforts including referral. Using samples of EHR data between January 1, 2007, and March 31, 2011, from two health systems, the accuracy of the MediClass processor for identifying these counseling elements was evaluated in postpartum visits of 600 women with gestational diabetes mellitus (GDM) compared to manual chart review as the gold standard. Data were analyzed in 2013. Mean sensitivity and specificity for each of the 5As compared to the gold standard was at or above 85%, with the exception of sensitivity for Assist, which was 40% and 60% for each of the two health systems. The automated method identified many valid Assist cases not identified in the gold standard. The MediClass processor has performance capability sufficiently similar to human abstractors to permit automated assessment of counseling for weight loss in postpartum encounter records. Copyright © 2014 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.

  8. Universality of next-to-leading power threshold effects for colourless final states in hadronic collisions

    NASA Astrophysics Data System (ADS)

    Del Duca, V.; Laenen, E.; Magnea, L.; Vernazza, L.; White, C. D.

    2017-11-01

    We consider the production of an arbitrary number of colour-singlet particles near partonic threshold, and show that next-to-leading order cross sections for this class of processes have a simple universal form at next-to-leading power (NLP) in the energy of the emitted gluon radiation. Our analysis relies on a recently derived factorisation formula for NLP threshold effects at amplitude level, and therefore applies both if the leading-order process is tree-level and if it is loop-induced. It holds for differential distributions as well. The results can furthermore be seen as applications of recently derived next-to-soft theorems for gauge theory amplitudes. We use our universal expression to re-derive known results for the production of up to three Higgs bosons at NLO in the large top mass limit, and for the hadro-production of a pair of electroweak gauge bosons. Finally, we present new analytic results for Higgs boson pair production at NLO and NLP, with exact top-mass dependence.

  9. Automated Assessment of Medical Students’ Clinical Exposures according to AAMC Geriatric Competencies

    PubMed Central

    Chen, Yukun; Wrenn, Jesse; Xu, Hua; Spickard, Anderson; Habermann, Ralf; Powers, James; Denny, Joshua C.

    2014-01-01

    Competence is essential for health care professionals. Current methods to assess competency, however, do not efficiently capture medical students’ experience. In this preliminary study, we used machine learning and natural language processing (NLP) to identify geriatric competency exposures from students’ clinical notes. The system applied NLP to generate the concepts and related features from notes. We extracted a refined list of concepts associated with corresponding competencies. This system was evaluated through 10-fold cross validation for six geriatric competency domains: “medication management (MedMgmt)”, “cognitive and behavioral disorders (CBD)”, “falls, balance, gait disorders (Falls)”, “self-care capacity (SCC)”, “palliative care (PC)”, “hospital care for elders (HCE)” – each an Association of American Medical Colleges competency for medical students. The system could accurately assess MedMgmt, SCC, HCE, and Falls competencies with F-measures of 0.94, 0.86, 0.85, and 0.84, respectively, but did not attain good performance for PC and CBD (0.69 and 0.62 in F-measure, respectively). PMID:25954341

  10. Molecular characterization and functional analysis of a necrosis- and ethylene-inducing, protein-encoding gene family from Verticillium dahliae.

    PubMed

    Zhou, Bang-Jun; Jia, Pei-Song; Gao, Feng; Guo, Hui-Shan

    2012-07-01

    Verticillium dahliae Kleb. is a hemibiotrophic, phytopathogenic fungus that causes wilt disease in a wide range of crops, including cotton. Successful host colonization by hemibiotrophic pathogens requires the induction of plant cell death to provide the saprophytic nutrition for the transition from the biotrophic to the necrotrophic stage. In this study, we identified a necrosis-inducing Phytophthora protein (NPP1) domain-containing protein family containing nine genes in a virulent, defoliating isolate of V. dahliae (V592), named the VdNLP genes. Functional analysis demonstrated that only two of these VdNLP genes, VdNLP1 and VdNLP2, encoded proteins that were capable of inducing necrotic lesions and triggering defense responses in Nicotiana benthamiana, Arabidopsis, and cotton plants. Both VdNLP1 and VdNLP2 induced the wilting of cotton seedling cotyledons. However, gene-deletion mutants targeted by VdNLP1, VdNLP2, or both did not affect the pathogenicity of V. dahliae V592 in cotton infection. Similar expression and induction patterns were found for seven of the nine VdNLP transcripts. Through a comparison of the conserved amino acid residues of VdNLP with different necrosis-inducing activities, combined with mutagenesis-based analyses, we identified several novel conserved amino acid residues, in addition to the known conserved heptapeptide GHRHDWE motif and the cysteine residues of the NPP domain-containing protein, that are indispensable for the necrosis-inducing activity of the VdNLP2 protein.

  11. The potential of zwitterionic nanoliposomes against neurotoxic alpha-synuclein aggregates in Parkinson's Disease.

    PubMed

    Aliakbari, Farhang; Mohammad-Beigi, Hossein; Rezaei-Ghaleh, Nasrollah; Becker, Stefan; Dehghani Esmatabad, Faezeh; Eslampanah Seyedi, Hadieh Alsadat; Bardania, Hassan; Tayaranian Marvian, Amir; Collingwood, Joanna F; Christiansen, Gunna; Zweckstetter, Markus; Otzen, Daniel E; Morshedi, Dina

    2018-05-17

    The protein α-synuclein (αSN) aggregates to form fibrils in neuronal cells of Parkinson's patients. Here we report on the effect of neutral (zwitterionic) nanoliposomes (NLPs), supplemented with cholesterol (NLP-Chol) and decorated with PEG (NLP-Chol-PEG), on αSN aggregation and neurotoxicity. Both NLPs retard αSN fibrillization in a concentration-independent fashion. They do so largely by increasing lag time (formation of fibrillization nuclei) rather than elongation (extension of existing nuclei). Interactions between neutral NLPs and αSN may locate to the N-terminus of the protein. This interaction can even perturb the interaction of αSN with negatively charged NLPs which induces an α-helical structure in αSN. This interaction was found to occur throughout the fibrillization process. Both NLP-Chol and NLP-Chol-PEG were shown to be biocompatible in vitro, and to reduce αSN neurotoxicity and reactive oxygen species (ROS) levels with no influence on intracellular calcium in neuronal cells, emphasizing a prospective role for NLPs in reducing αSN pathogenicity in vivo as well as utility as a vehicle for drug delivery.

  12. Informatics in radiology: RADTF: a semantic search-enabled, natural language processor-generated radiology teaching file.

    PubMed

    Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L

    2010-11-01

    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. ©RSNA, 2010

  13. DNA-targeted 2-nitroimidazoles: studies of the influence of the phenanthridine-linked nitroimidazoles, 2-NLP-3 and 2-NLP-4, on DNA damage induced by ionizing radiation.

    PubMed

    Buchko, Garry W; Weinfeld, Michael

    2002-09-01

    The nitroimidazole-linked phenanthridines 2-NLP-3 (5-[3-(2-nitro-1-imidazoyl)-propyl]-phenanthridinium bromide) and 2-NLP-4 (5-[3-(2-nitro-1-imidazoyl)-butyl]-phenanthridinium bromide) are composed of the radiosensitizer, 2-nitroimidazole, attached to the DNA intercalator phenanthridine by a 3- and 4-carbon linker, respectively. Previous in vitro assays showed both compounds to be 10-100 times more efficient as hypoxic cell radiosensitizers (based on external drug concentrations) than the untargeted 2-nitroimidazole radiosensitizer, misonidazole (Cowan et al., Radiat. Res. 127, 81-89, 1991). Here we have used a ³²P postlabeling assay and 5'-end-labeled oligonucleotide assay to compare the radiation-induced DNA damage generated in the presence of 2-NLP-3, 2-NLP-4, phenanthridine and misonidazole. After irradiation of the DNA under anoxic conditions, we observed a significantly greater level of 3'-phosphoglycolate DNA damage in the presence of 2-NLP-3 or 2-NLP-4 compared to irradiation of the DNA in the presence of misonidazole. This may account at least in part for the greater cellular radiosensitization shown by the nitroimidazole-linked phenanthridines over misonidazole. Of the two nitroimidazole-linked phenanthridines, the better in vitro radiosensitizer, 2-NLP-4, generated more 3'-phosphoglycolate in DNA than did 2-NLP-3. At all concentrations, phenanthridine had little effect on the levels of DNA damage, suggesting that the enhanced radiosensitization displayed by 2-NLP-3 and 2-NLP-4 is due to the localization of the 2-nitroimidazole to the DNA by the phenanthridine substituent and not to radiosensitization by the phenanthridine moiety itself.

  14. Integrating Structured and Unstructured EHR Data Using an FHIR-based Type System: A Case Study with Medication Data.

    PubMed

    Hong, Na; Wen, Andrew; Shen, Feichen; Sohn, Sunghwan; Liu, Sijia; Liu, Hongfang; Jiang, Guoqian

    2018-01-01

    Standards-based modeling of electronic health records (EHR) data holds great significance for data interoperability and large-scale usage. Integration of unstructured data into a standard data model, however, poses unique challenges partially due to heterogeneous type systems used in existing clinical NLP systems. We introduce a scalable and standards-based framework for integrating structured and unstructured EHR data leveraging the HL7 Fast Healthcare Interoperability Resources (FHIR) specification. We implemented a clinical NLP pipeline enhanced with an FHIR-based type system and performed a case study using medication data from Mayo Clinic's EHR. Two UIMA-based NLP tools known as MedXN and MedTime were integrated in the pipeline to extract FHIR MedicationStatement resources and related attributes from unstructured medication lists. We developed a rule-based approach for assigning the NLP output types to the FHIR elements represented in the type system, whereas we investigated the FHIR elements belonging to the source of the structured EMR data. We used the FHIR resource "MedicationStatement" as an example to illustrate our integration framework and methods. For evaluation, we manually annotated FHIR elements in 166 medication statements from 14 clinical notes generated by Mayo Clinic in the course of patient care, and used standard performance measures (precision, recall and f-measure). The F-scores achieved ranged from 0.73 to 0.99 for the various FHIR element representations. The results demonstrated that our framework based on the FHIR type system is feasible for normalizing and integrating both structured and unstructured EHR data.

  15. Mitotic regulator Nlp interacts with XPA/ERCC1 complexes and regulates nucleotide excision repair (NER) in response to UV radiation.

    PubMed

    Ma, Xiao-Juan; Shang, Li; Zhang, Wei-Min; Wang, Ming-Rong; Zhan, Qi-Min

    2016-04-10

    Cellular response to DNA damage, including ionizing radiation (IR) and UV radiation, is critical for the maintenance of genomic fidelity. Defects of DNA repair often result in genomic instability and malignant cell transformation. Centrosomal protein Nlp (ninein-like protein) has been characterized as an important cell cycle regulator that is required for proper mitotic progression. In this study, we demonstrate that Nlp is able to improve nucleotide excision repair (NER) activity and protects cells against UV radiation. Upon exposure of cells to UVC, Nlp is translocated into the nucleus. The C-terminus (1030-1382) of Nlp is necessary and sufficient for its nuclear import. Upon UVC radiation, Nlp interacts with XPA and ERCC1, and enhances their association. Interestingly, down-regulated expression of Nlp is found to be associated with human skin cancers, indicating that dysregulated Nlp might be related to the development of human skin cancers. Taken together, this study identifies mitotic protein Nlp as a new and important member of NER pathway and thus provides novel insights into understanding of regulatory machinery involved in NER. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. Feasibility and Utility of Lexical Analysis for Occupational Health Text.

    PubMed

    Harber, Philip; Leroy, Gondy

    2017-06-01

    Assess feasibility and potential utility of natural language processing (NLP) for storing and analyzing occupational health data. Basic NLP lexical analysis methods were applied to 89,000 Mine Safety and Health Administration (MSHA) free text records. Steps included tokenization, term and co-occurrence counts, term annotation, and identifying exposure-health effect relationships. Presence of terms in the Unified Medical Language System (UMLS) was assessed. The methods efficiently demonstrated common exposures, health effects, and exposure-injury relationships. Many workplace terms are not present in UMLS or map inaccurately. Use of free text rather than narrowly defined numerically coded fields is feasible, flexible, and efficient. It has potential to encourage workers and clinicians to provide more data and to support automated knowledge creation. The lexical method used is easily generalizable to other areas. The UMLS vocabularies should be enhanced to be relevant to occupational health.
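
    The lexical steps named in this abstract (tokenization, term counts, and co-occurrence counts) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the simple alphabetic tokenizer, and the sample records are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_and_cooccurrence_counts(records):
    """Count terms and within-record term pairs across free-text records."""
    term_counts = Counter()
    pair_counts = Counter()
    for record in records:
        tokens = tokenize(record)
        term_counts.update(tokens)
        # Count each unordered pair of distinct terms once per record.
        for pair in combinations(sorted(set(tokens)), 2):
            pair_counts[pair] += 1
    return term_counts, pair_counts


# Hypothetical records, invented for illustration only (not MSHA data).
records = [
    "Miner slipped on wet floor and fractured wrist",
    "Dust exposure caused cough",
    "Worker slipped on ice and sprained wrist",
]
terms, pairs = term_and_cooccurrence_counts(records)
print(terms["slipped"])
print(pairs[("slipped", "wrist")])
```

    Pairs are stored in alphabetical order, so a lookup such as ("slipped", "wrist") must use the same ordering. A real pipeline would likely add stop-word filtering and term annotation on top of these counts.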

  17. Semantic extraction and processing of medical records for patient-oriented visual index

    NASA Astrophysics Data System (ADS)

    Zheng, Weilin; Dong, Wenjie; Chen, Xiangjiao; Zhang, Jianguo

    2012-02-01

    To gain a comprehensive and complete understanding of a patient's health status, doctors need to search the patient's medical records across different healthcare information systems, such as PACS, RIS, HIS, and USIS, as a reference for diagnosis and treatment decisions. However, these searches are time-consuming and tedious. To address this problem, we developed a patient-oriented visual index system (VIS) that uses visualization technology to show health status and to retrieve the patient's examination information stored in each system through a 3D human model. In this presentation, we present a new approach for extracting semantic and characteristic information from medical record systems such as RIS/USIS to create the 3D visual index. The approach includes the following steps: (1) building a medical characteristic semantic knowledge base; (2) developing a natural language processing (NLP) engine to perform semantic analysis and logical judgment on text-based medical records; (3) applying the knowledge base and NLP engine to medical records to extract medical characteristics (e.g., positive focus information), and then mapping the extracted information to the related organs or parts of the 3D human model to create the visual index. We tested the approach on 559 radiological reports containing 853 focuses and extracted information for 828 of them, a focus extraction success rate of about 97.1%.

  18. Interpreting Hypernymic Propositions in an Online Medical Encyclopedia

    PubMed Central

    Fiszman, Marcelo; Rindflesch, Thomas C.; Kilicoglu, Halil

    2003-01-01

    Interpretation of semantic propositions from biomedical text documents would provide valuable support to natural language processing (NLP) applications. We are developing a methodology to interpret a kind of semantic proposition, the hypernymic proposition, in MEDLINE abstracts. In this paper, we expanded the system to identify these structures in a different discourse domain: the Medical Encyclopedia from the National Library of Medicine’s MEDLINEplus® Website. PMID:14728345

  19. Interpreting hypernymic propositions in an online medical encyclopedia.

    PubMed

    Fiszman, Marcelo; Rindflesch, Thomas C; Kilicoglu, Halil

    2003-01-01

    Interpretation of semantic propositions from biomedical text documents would provide valuable support to natural language processing (NLP) applications. We are developing a methodology to interpret a kind of semantic proposition, the hypernymic proposition, in MEDLINE abstracts. In this paper, we expanded the system to identify these structures in a different discourse domain: the Medical Encyclopedia from the National Library of Medicine's MEDLINEplus Website.

  20. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2009-01-01

    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  1. Eudicot plant-specific sphingolipids determine host selectivity of microbial NLP cytolysins.

    PubMed

    Lenarčič, Tea; Albert, Isabell; Böhm, Hannah; Hodnik, Vesna; Pirc, Katja; Zavec, Apolonija B; Podobnik, Marjetka; Pahovnik, David; Žagar, Ema; Pruitt, Rory; Greimel, Peter; Yamaji-Hasegawa, Akiko; Kobayashi, Toshihide; Zienkiewicz, Agnieszka; Gömann, Jasmin; Mortimer, Jenny C; Fang, Lin; Mamode-Cassim, Adiilah; Deleu, Magali; Lins, Laurence; Oecking, Claudia; Feussner, Ivo; Mongrand, Sébastien; Anderluh, Gregor; Nürnberger, Thorsten

    2017-12-15

    Necrosis and ethylene-inducing peptide 1-like (NLP) proteins constitute a superfamily of proteins produced by plant pathogenic bacteria, fungi, and oomycetes. Many NLPs are cytotoxins that facilitate microbial infection of eudicot, but not of monocot plants. Here, we report glycosylinositol phosphorylceramide (GIPC) sphingolipids as NLP toxin receptors. Plant mutants with altered GIPC composition were more resistant to NLP toxins. Binding studies and x-ray crystallography showed that NLPs form complexes with terminal monomeric hexose moieties of GIPCs that result in conformational changes within the toxin. Insensitivity to NLP cytolysins of monocot plants may be explained by the length of the GIPC head group and the architecture of the NLP sugar-binding site. We unveil early steps in NLP cytolysin action that determine plant clade-specific toxin selectivity. Copyright © 2017 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  2. Polo-like kinase 1 regulates Nlp, a centrosome protein involved in microtubule nucleation.

    PubMed

    Casenghi, Martina; Meraldi, Patrick; Weinhart, Ulrike; Duncan, Peter I; Körner, Roman; Nigg, Erich A

    2003-07-01

    In animal cells, most microtubules are nucleated at centrosomes. At the onset of mitosis, centrosomes undergo a structural reorganization, termed maturation, which leads to increased microtubule nucleation activity. Centrosome maturation is regulated by several kinases, including Polo-like kinase 1 (Plk1). Here, we identify a centrosomal Plk1 substrate, termed Nlp (ninein-like protein), whose properties suggest an important role in microtubule organization. Nlp interacts with two components of the gamma-tubulin ring complex and stimulates microtubule nucleation. Plk1 phosphorylates Nlp and disrupts both its centrosome association and its gamma-tubulin interaction. Overexpression of an Nlp mutant lacking Plk1 phosphorylation sites severely disturbs mitotic spindle formation. We propose that Nlp plays an important role in microtubule organization during interphase, and that the activation of Plk1 at the onset of mitosis triggers the displacement of Nlp from the centrosome, allowing the establishment of a mitotic scaffold with enhanced microtubule nucleation activity.

  3. NLP is a novel transcription regulator involved in VSG expression site control in Trypanosoma brucei.

    PubMed

    Narayanan, Mani Shankar; Kushwaha, Manish; Ersfeld, Klaus; Fullbrook, Alexander; Stanne, Tara M; Rudenko, Gloria

    2011-03-01

    Trypanosoma brucei mono-allelically expresses one of approximately 1500 variant surface glycoprotein (VSG) genes while multiplying in the mammalian bloodstream. The active VSG is transcribed by RNA polymerase I in one of approximately 15 telomeric VSG expression sites (ESs). T. brucei is unusual in controlling gene expression predominantly post-transcriptionally, and how ESs are mono-allelically controlled remains a mystery. Here we identify a novel transcription regulator, which resembles a nucleoplasmin-like protein (NLP) with an AT-hook motif. NLP is key for ES control in bloodstream form T. brucei, as NLP knockdown results in 45- to 65-fold derepression of the silent VSG221 ES. NLP is also involved in repression of transcription in the inactive VSG Basic Copy arrays, minichromosomes and procyclin loci. NLP is shown to be enriched on the 177- and 50-bp simple sequence repeats, the non-transcribed regions around rDNA and procyclin, and both active and silent ESs. Blocking NLP synthesis leads to downregulation of the active ES, indicating that NLP plays a role in regulating appropriate levels of transcription of ESs in both their active and silent state. Discovery of the unusual transcription regulator NLP provides new insight into the factors that are critical for ES control.

  4. An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating.

    PubMed

    Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B

    2015-07-01

    Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly other data including patient chief complaints or laboratory results. Most electronic health records have much more clinical information stored as unstructured data, for example, clinical narrative such as history of present illness, procedure notes, and clinical decision making are stored as unstructured data. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP using emergency medicine physician visit notes for various projects and the challenges of retrieving specific data and finally present practical methods that can run on a standard personal computer as well as high-end state-of-the-art funded processes run by leading NLP informatics researchers.

  5. NlpI contributes to Escherichia coli K1 strain RS218 interaction with human brain microvascular endothelial cells.

    PubMed

    Teng, Ching-Hao; Tseng, Yu-Ting; Maruvada, Ravi; Pearce, Donna; Xie, Yi; Paul-Satyaseela, Maneesh; Kim, Kwang Sik

    2010-07-01

    Escherichia coli K1 is the most common Gram-negative bacillary organism causing neonatal meningitis. E. coli K1 binding to and invasion of human brain microvascular endothelial cells (HBMECs) is a prerequisite for its traversal of the blood-brain barrier (BBB) and penetration into the brain. In the present study, we identified NlpI as a novel bacterial determinant contributing to E. coli K1 interaction with HBMECs. The deletion of nlpI did not affect the expression of the known bacterial determinants involved in E. coli K1-HBMEC interaction, such as type 1 fimbriae, flagella, and OmpA, and the contribution of NlpI to HBMECs binding and invasion was independent of those bacterial determinants. Previous reports have shown that the nlpI mutant of E. coli K-12 exhibits growth defect at 42 degrees C at low osmolarity, and its thermosensitive phenotype can be suppressed by a mutation on the spr gene. The nlpI mutant of strain RS218 exhibited similar thermosensitive phenotype, but additional spr mutation did not restore the ability of the nlpI mutant to interact with HBMECs. These findings suggest the decreased ability of the nlpI mutant to interact with HBMECs is not associated with the thermosensitive phenotype. NlpI was determined as an outer membrane-anchored protein in E. coli, and the nlpI mutant was defective in cytosolic phospholipase A(2)alpha (cPLA(2)alpha) phosphorylation compared to the parent strain. These findings illustrate the first demonstration of NlpI's contribution to E. coli K1 binding to and invasion of HBMECs, and its contribution is likely to involve cPLA(2)alpha.

  6. Using Information from the Electronic Health Record to Improve Measurement of Unemployment in Service Members and Veterans with mTBI and Post-Deployment Stress

    PubMed Central

    Dillahunt-Aspillaga, Christina; Finch, Dezon; Massengale, Jill; Kretzmer, Tracy; Luther, Stephen L.; McCart, James A.

    2014-01-01

    Objective The purpose of this pilot study is 1) to develop an annotation schema and a training set of annotated notes to support the future development of a natural language processing (NLP) system to automatically extract employment information, and 2) to determine if information about employment status, goals and work-related challenges reported by service members and Veterans with mild traumatic brain injury (mTBI) and post-deployment stress can be identified in the Electronic Health Record (EHR). Design Retrospective cohort study using data from selected progress notes stored in the EHR. Setting Post-deployment Rehabilitation and Evaluation Program (PREP), an in-patient rehabilitation program for Veterans with TBI at the James A. Haley Veterans' Hospital in Tampa, Florida. Participants Service members and Veterans with TBI who participated in the PREP program (N = 60). Main Outcome Measures Documentation of employment status, goals, and work-related challenges reported by service members and recorded in the EHR. Results Two hundred notes were examined and unique vocational information was found indicating a variety of self-reported employment challenges. Current employment status and future vocational goals along with information about cognitive, physical, and behavioral symptoms that may affect return-to-work were extracted from the EHR. The annotation schema developed for this study provides an excellent tool upon which NLP studies can be developed. Conclusions Information related to employment status and vocational history is stored in text notes in the EHR system. Information stored in text does not lend itself to easy extraction or summarization for research and rehabilitation planning purposes. Development of NLP systems to automatically extract text-based employment information provides data that may improve the understanding and measurement of employment in this important cohort. PMID:25541956

  7. An optimal modeling of multidimensional wave digital filtering network for free vibration analysis of symmetrically laminated composite FSDT plates

    NASA Astrophysics Data System (ADS)

    Tseng, Chien-Hsun

    2015-02-01

    The technique of multidimensional wave digital filtering (MDWDF), which builds on a traveling-wave formulation of lumped electrical elements, is successfully applied to the study of the dynamic responses of symmetrically laminated composite plates based on first-order shear deformation theory. The approach, applied here for the first time in laminate mechanics, relies on integrating principles from modeling and simulation, circuit theory, and MD digital signal processing to provide a variety of useful features. In particular, the conservation of passivity gives rise to a nonlinear programming problem (NLP) for the numerical stability of an MD discrete system. By adopting the augmented Lagrangian genetic algorithm, an effective optimization technique for rapidly reaching the solution spaces of NLP models, numerical stability of the MDWDF network is maintained at all times by satisfying the Courant-Friedrichs-Lewy stability criterion with the least restriction. The optimum of the NLP leads to an optimal network that effectively and accurately predicts the desired fundamental frequency, and the distribution of system energies gives insight into the robustness of the network. To further explore the application of the optimum network, more numerical examples are used to achieve a qualitative understanding of the behavior of the laminar system. These examples investigate the effects of different stacking sequences, stiffness and span-to-thickness ratios, mode shapes, and boundary conditions. Results are validated by cross-referencing with previously published works, which show that the present method is in excellent agreement with other numerical and analytical methods.

  8. Neurolinguistic programming: a systematic review of the effects on health outcomes.

    PubMed

    Sturt, Jackie; Ali, Saima; Robertson, Wendy; Metcalfe, David; Grove, Amy; Bourne, Claire; Bridle, Chris

    2012-11-01

    Neurolinguistic programming (NLP) in health care has captured the interest of doctors, healthcare professionals, and managers. To evaluate the effects of NLP on health-related outcomes. Systematic review of experimental studies. The following data sources were searched: MEDLINE, PsycINFO, ASSIA, AMED, CINAHL, Web of Knowledge, CENTRAL, NLP specialist databases, reference lists, review articles, and NLP professional associations, training providers, and research groups. Searches revealed 1459 titles from which 10 experimental studies were included. Five studies were randomised controlled trials (RCTs) and five were pre-post studies. Targeted health conditions were anxiety disorders, weight maintenance, morning sickness, substance misuse, and claustrophobia during MRI scanning. NLP interventions were mainly delivered across 4-20 sessions although three were single session. Eighteen outcomes were reported and the RCT sample sizes ranged from 22 to 106. Four RCTs reported no significant between group differences with the fifth finding in favour of the NLP arm (F = 8.114, P<0.001). Three RCTs and five pre-post studies reported within group improvements. Risk of bias across all studies was high or uncertain. There is little evidence that NLP interventions improve health-related outcomes. This conclusion reflects the limited quantity and quality of NLP research, rather than robust evidence of no effect. There is currently insufficient evidence to support the allocation of NHS resources to NLP activities outside of research purposes.

  9. Neurolinguistic programming: a systematic review of the effects on health outcomes

    PubMed Central

    Sturt, Jackie; Ali, Saima; Robertson, Wendy; Metcalfe, David; Grove, Amy; Bourne, Claire; Bridle, Chris

    2012-01-01

    Background Neurolinguistic programming (NLP) in health care has captured the interest of doctors, healthcare professionals, and managers. Aim To evaluate the effects of NLP on health-related outcomes. Design and setting Systematic review of experimental studies. Method The following data sources were searched: MEDLINE®, PsycINFO, ASSIA, AMED, CINAHL®, Web of Knowledge, CENTRAL, NLP specialist databases, reference lists, review articles, and NLP professional associations, training providers, and research groups. Results Searches revealed 1459 titles from which 10 experimental studies were included. Five studies were randomised controlled trials (RCTs) and five were pre-post studies. Targeted health conditions were anxiety disorders, weight maintenance, morning sickness, substance misuse, and claustrophobia during MRI scanning. NLP interventions were mainly delivered across 4–20 sessions although three were single session. Eighteen outcomes were reported and the RCT sample sizes ranged from 22 to 106. Four RCTs reported no significant between group differences with the fifth finding in favour of the NLP arm (F = 8.114, P<0.001). Three RCTs and five pre-post studies reported within group improvements. Risk of bias across all studies was high or uncertain. Conclusion There is little evidence that NLP interventions improve health-related outcomes. This conclusion reflects the limited quantity and quality of NLP research, rather than robust evidence of no effect. There is currently insufficient evidence to support the allocation of NHS resources to NLP activities outside of research purposes. PMID:23211179

  10. Overview of the Cancer Genetics and Pathway Curation tasks of BioNLP Shared Task 2013

    PubMed Central

    2015-01-01

    Background Since their introduction in 2009, the BioNLP Shared Task events have been instrumental in advancing the development of methods and resources for the automatic extraction of information from the biomedical literature. In this paper, we present the Cancer Genetics (CG) and Pathway Curation (PC) tasks, two event extraction tasks introduced in the BioNLP Shared Task 2013. The CG task focuses on cancer, emphasizing the extraction of physiological and pathological processes at various levels of biological organization, and the PC task targets reactions relevant to the development of biomolecular pathway models, defining its extraction targets on the basis of established pathway representations and ontologies. Results Six groups participated in the CG task and two groups in the PC task, together applying a wide range of extraction approaches including both established state-of-the-art systems and newly introduced extraction methods. The best-performing systems achieved F-scores of 55% on the CG task and 53% on the PC task, demonstrating a level of performance comparable to the best results achieved in similar previously proposed tasks. Conclusions The results indicate that existing event extraction technology can generalize to meet the novel challenges represented by the CG and PC task settings, suggesting that extraction methods are capable of supporting the construction of knowledge bases on the molecular mechanisms of cancer and the curation of biomolecular pathway models. The CG and PC tasks continue as open challenges for all interested parties, with data, tools and resources available from the shared task homepage. PMID:26202570

  11. Close association between metal allergy and nail lichen planus: detection of causative metals in nail lesions.

    PubMed

    Nishizawa, A; Satoh, T; Yokozeki, H

    2013-02-01

    Lichen planus (LP) is a common skin disorder of unknown aetiology that affects the skin, mucous membranes and nails. Although metal allergies have been implicated in the development of oral LP (OLP), the contribution of these allergies to nail LP (NLP) has yet to be studied in detail. To elucidate the link between metal allergy and NLP. We retrospectively analysed 115 LP patients with respect to the contribution of metals to either NLP or OLP. We also attempted to detect the specific metals involved in these nail lesions. Of the 79 patients that received a metal patch test (PT), 24 (30%) were positive for at least one of the metal compounds tested. Notably, the prevalence of positive reactions to metals in the NLP patients was significantly higher as compared with the OLP patients (59% vs. 27%, P < 0.05). Among the 10 PT-positive patients with NLP, improvement of the skin lesions was seen in six of the patients after removal of dental materials containing causative metals or systemic disodium cromoglycate therapy. On the other hand, only 3 of 16 PT-positive patients with OLP exhibited improvement after the removal of dental materials. Causative metals in the dental fillings/braces were detected in the involved nail tissues. This study suggests that metal allergies are more closely associated with NLP vs. OLP, and that deposited metals in the nail apparatus contribute to the development of lichenoid tissue reactions in the nail bed and matrix. © 2012 The Authors. Journal of the European Academy of Dermatology and Venereology © 2012 European Academy of Dermatology and Venereology.

  12. An opioid-like system regulating feeding behavior in C. elegans

    PubMed Central

    Cheong, Mi Cheong; Artyukhin, Alexander B; You, Young-Jai; Avery, Leon

    2015-01-01

    Neuropeptides are essential for the regulation of appetite. Here we show that neuropeptides could regulate feeding in mutants that lack neurotransmission from the motor neurons that stimulate feeding muscles. We identified nlp-24 by an RNAi screen of 115 neuropeptide genes, testing whether they affected growth. NLP-24 peptides have a conserved YGGXX sequence, similar to mammalian opioid neuropeptides. In addition, morphine and naloxone respectively stimulated and inhibited feeding in starved worms, but not in worms lacking NPR-17, which encodes a protein with sequence similarity to opioid receptors. Opioid agonists activated heterologously expressed NPR-17, as did at least one NLP-24 peptide. Worms lacking the ASI neurons, which express npr-17, did not respond to naloxone. Thus, we suggest that Caenorhabditis elegans has an endogenous opioid system that acts through NPR-17, and that opioids regulate feeding via ASI neurons. Together, these results suggest C. elegans may be the first genetically tractable invertebrate opioid model. DOI: http://dx.doi.org/10.7554/eLife.06683.001 PMID:25898004

  13. Energy-modeled flight in a wind field

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Feldman, M.A.; Cliff, E.M.

    Optimal shaping of aerospace trajectories has provided the motivation for much modern study of optimization theory and algorithms. Current industrial practice favors approaches where the continuous-time optimal control problem is transcribed to a finite-dimensional nonlinear programming problem (NLP) by a discretization process. Two such formulations are implemented in the POST and the OTIS codes. In the present paper we use a discretization that is specially adapted to the flight problem of interest. Among the unique aspects of the present discretization are: a least-squares formulation for certain kinematic constraints; the use of energy ideas to enforce Newton's laws; and the inclusion of large-magnitude horizontal winds. In the next section we shall provide a description of the flight problem and its NLP representation. Following this we provide some details of the constraint formulation. Finally, we present an overview of the NLP problem.
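
    As a generic illustration of what transcribing a continuous-time optimal control problem to an NLP means, the block below writes out a textbook forward-Euler transcription. It is not the specially adapted discretization described in this abstract; the symbols (state x, control u, dynamics f, running cost L, terminal cost \phi, initial state \bar{x}, step size h, and N intervals) are assumptions introduced only for the example.

```latex
% Continuous-time problem (generic form, symbols assumed for illustration)
\begin{align*}
  \min_{u(\cdot)} \quad & \phi\bigl(x(t_f)\bigr) + \int_{t_0}^{t_f} L\bigl(x(t), u(t), t\bigr)\,dt \\
  \text{s.t.} \quad & \dot{x}(t) = f\bigl(x(t), u(t), t\bigr), \qquad x(t_0) = \bar{x}.
\end{align*}

% Forward-Euler transcription on the grid t_k = t_0 + k h, with h = (t_f - t_0)/N
\begin{align*}
  \min_{x_0,\dots,x_N,\; u_0,\dots,u_{N-1}} \quad & \phi(x_N) + h \sum_{k=0}^{N-1} L(x_k, u_k, t_k) \\
  \text{s.t.} \quad & x_{k+1} - x_k - h\, f(x_k, u_k, t_k) = 0, \qquad k = 0, \dots, N-1, \\
  & x_0 - \bar{x} = 0.
\end{align*}
```

    The decision variables of the resulting NLP are the finitely many grid values x_k and u_k, and the dynamics appear as equality constraints between neighboring grid points.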

  14. Identification of Long Bone Fractures in Radiology Reports Using Natural Language Processing to support Healthcare Quality Improvement.

    PubMed

    Grundmeier, Robert W; Masino, Aaron J; Casper, T Charles; Dean, Jonathan M; Bell, Jamie; Enriquez, Rene; Deakyne, Sara; Chamberlain, James M; Alpern, Elizabeth R

    2016-11-09

    Important information to support healthcare quality improvement is often recorded in free text documents such as radiology reports. Natural language processing (NLP) methods may help extract this information, but these methods have rarely been applied outside the research laboratories where they were developed. To implement and validate NLP tools to identify long bone fractures for pediatric emergency medicine quality improvement. Using freely available statistical software packages, we implemented NLP methods to identify long bone fractures from radiology reports. A sample of 1,000 radiology reports was used to construct three candidate classification models. A test set of 500 reports was used to validate the model performance. Blinded manual review of radiology reports by two independent physicians provided the reference standard. Each radiology report was segmented and word stem and bigram features were constructed. Common English "stop words" and rare features were excluded. We used 10-fold cross-validation to select optimal configuration parameters for each model. Accuracy, recall, precision and the F1 score were calculated. The final model was compared to the use of diagnosis codes for the identification of patients with long bone fractures. There were 329 unique word stems and 344 bigrams in the training documents. A support vector machine classifier with Gaussian kernel performed best on the test set with accuracy=0.958, recall=0.969, precision=0.940, and F1 score=0.954. Optimal parameters for this model were cost=4 and gamma=0.005. The three classification models that we tested all performed better than diagnosis codes in terms of accuracy, precision, and F1 score (diagnosis code accuracy=0.932, recall=0.960, precision=0.896, and F1 score=0.927). NLP methods using a corpus of 1,000 training documents accurately identified acute long bone fractures from radiology reports. 
Strategic use of straightforward NLP methods, implemented with freely available software, offers quality improvement teams new opportunities to extract information from narrative documents.
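
    The F1 scores reported in this abstract can be checked from the stated precision and recall, since F1 is their harmonic mean. The following is a minimal sketch; the function name f1_score is an illustrative choice and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
svm_f1 = f1_score(precision=0.940, recall=0.969)
codes_f1 = f1_score(precision=0.896, recall=0.960)

print(f"SVM F1: {svm_f1:.3f}")             # 0.954
print(f"Diagnosis-code F1: {codes_f1:.3f}")  # 0.927
```

    Both values agree with the figures given in the abstract to three decimal places.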

  15. Cross domains Arabic named entity recognition system

    NASA Astrophysics Data System (ADS)

    Al-Ahmari, S. Saad; Abdullatif Al-Johar, B.

    2016-07-01

    Named Entity Recognition (NER) plays an important role in many Natural Language Processing (NLP) applications, such as Information Extraction (IE), Question Answering (QA), Text Clustering, Text Summarization and Word Sense Disambiguation. This paper presents the development and implementation of a domain-independent system to recognize three types of Arabic named entities. The system is based on a set of domain-independent grammar rules along with an Arabic part-of-speech tagger, in addition to gazetteers and lists of trigger words. The experimental results show that the system performed as well as other systems, with better results in some cases on cross-domain corpora.

  16. Effects of the ninein-like protein centrosomal protein on breast cancer cell invasion and migration

    PubMed Central

    LIU, QI; WANG, XINZHAO; LV, MINLIN; MU, DIANBIN; WANG, LEILEI; ZUO, WENSU; YU, ZHIYONG

    2015-01-01

    To investigate the effects of the centrosomal protein, ninein-like protein (Nlp), on the proliferation, invasion and metastasis of MCF-7 breast cancer cells, the present study established green fluorescent protein (GFP)-containing MCF7 plasmids with steady and overexpression of Nlp (MCF7-GFP-Nlp) and blank plasmids (MCF7-GFP) using lentiviral transfection technology in the MCF7 breast cancer cell line. The expression of Nlp was determined by reverse transcription-quantitative polymerase chain reaction and western blot analysis. Differences in levels of proliferation, invasion and metastasis between the MCF7-GFP-Nlp group and MCF7-GFP group were compared using MTT, plate colony formation and Transwell migration assays. The cell growth was more rapid and the colony forming rate was markedly increased in the MCF7-GFP-Nlp group (P<0.05) compared with the MCF7-GFP group. The numbers of cells in the MCF7-GFP-Nlp and MCF7-GFP groups transferred across membranes were 878±18.22 and 398±8.02, respectively, in the migration assay. The invasive capacity was significantly increased in the MCF7-GFP-Nlp group (P<0.05) compared with the MCF7-GFP group. The western blotting results demonstrated high expression levels of C-X-C chemokine receptor type 4 in the MCF7-GFP-Nlp group. The increased expression of Nlp was associated with an increase in MCF7 cell proliferation, invasion and metastasis, which indicated that Nlp promoted breast tumorigenesis and may be used as a potent biological index to predict breast cancer metastasis and develop therapeutic regimes. PMID:25901761

  17. Adaptable, high recall, event extraction system with minimal configuration

    PubMed Central

    2015-01-01

    Background Biomedical event extraction has been a major focus of biomedical natural language processing (BioNLP) research since the first BioNLP shared task was held in 2009. Accordingly, a large number of event extraction systems have been developed. Most such systems, however, have been developed for specific tasks and/or incorporated task specific settings, making their application to new corpora and tasks problematic without modification of the systems themselves. There is thus a need for event extraction systems that can achieve high levels of accuracy when applied to corpora in new domains, without the need for exhaustive tuning or modification, whilst retaining competitive levels of performance. Results We have enhanced our state-of-the-art event extraction system, EventMine, to alleviate the need for task-specific tuning. Task-specific details are specified in a configuration file, while extensive task-specific parameter tuning is avoided through the integration of a weighting method, a covariate shift method, and their combination. The task-specific configuration and weighting method have been employed within the context of two different sub-tasks of BioNLP shared task 2013, i.e. Cancer Genetics (CG) and Pathway Curation (PC), removing the need to modify the system specifically for each task. With minimal task specific configuration and tuning, EventMine achieved the 1st place in the PC task, and 2nd in the CG, achieving the highest recall for both tasks. The system has been further enhanced following the shared task by incorporating the covariate shift method and entity generalisations based on the task definitions, leading to further performance improvements. Conclusions We have shown that it is possible to apply a state-of-the-art event extraction system to new tasks with high levels of performance, without having to modify the system internally. 
Both covariate shift and weighting methods are useful in facilitating the production of high recall systems. These methods and their combination can adapt a model to the target data with no deep tuning and little manual configuration. PMID:26201408

  18. Bullying in Virtual Learning Communities.

    PubMed

    Nikiforos, Stefanos; Tzanavaris, Spyros; Kermanidis, Katia Lida

    2017-01-01

    Bullying through the internet has been investigated and analyzed mainly in the field of social media. This paper attempts to analyze bullying in Virtual Learning Communities using Natural Language Processing (NLP) techniques, mainly in the context of sociocultural learning theories. To that end, four case studies were conducted. We aim to apply NLP techniques to speech analysis on communication data of online communities. Emphasis is placed on qualitative data, taking into account the subjectivity of the collaborative activity. Finally, this is the first time this type of analysis has been attempted on Greek data.

  19. Informatics can identify systemic sclerosis (SSc) patients at risk for scleroderma renal crisis.

    PubMed

    Redd, Doug; Frech, Tracy M; Murtaugh, Maureen A; Rhiannon, Julia; Zeng, Qing T

    2014-10-01

    Electronic medical records (EMR) provide an ideal opportunity for the detection, diagnosis, and management of systemic sclerosis (SSc) patients within the Veterans Health Administration (VHA). The objective of this project was to use informatics to identify potential SSc patients in the VHA that were on prednisone, in order to inform an outreach project to prevent scleroderma renal crisis (SRC). The electronic medical data for this study came from Veterans Informatics and Computing Infrastructure (VINCI). For natural language processing (NLP) analysis, a set of retrieval criteria was developed for documents expected to have a high correlation to SSc. The two annotators reviewed the ratings to assemble a single adjudicated set of ratings, from which a support vector machine (SVM) based document classifier was trained. Any patient having at least one document positively classified for SSc was considered positive for SSc and the use of prednisone≥10mg in the clinical document was reviewed to determine whether it was an active medication on the prescription list. In the VHA, there were 4272 patients that have a diagnosis of SSc determined by the presence of an ICD-9 code. From these patients, 1118 patients (21%) had the use of prednisone≥10mg. Of these patients, 26 had a concurrent diagnosis of hypertension, thus these patients should not be on prednisone. By the use of natural language processing (NLP) an additional 16,522 patients were identified as possible SSc, highlighting that cases of SSc in the VHA may exist that are unidentified by ICD-9. A 10-fold cross validation of the classifier resulted in a precision (positive predictive value) of 0.814, recall (sensitivity) of 0.973, and f-measure of 0.873. Our study demonstrated that current clinical practice in the VHA includes the potentially dangerous use of prednisone for veterans with SSc. 
This present study also suggests there may be many undetected cases of SSc and NLP can successfully identify these patients. Copyright © 2014 Elsevier Ltd. All rights reserved.

  20. Experimenting with semantic web services to understand the role of NLP technologies in healthcare.

    PubMed

    Jagannathan, V

    2006-01-01

    NLP technologies can play a significant role in healthcare where a predominant segment of the clinical documentation is in text form. In a graduate course focused on understanding semantic web services at West Virginia University, a class project was designed with the purpose of exploring potential use for NLP-based abstraction of clinical documentation. The role of NLP-technology was simulated using human abstractors and various workflows were investigated using public domain workflow and semantic web service technologies. This poster explores the potential use of NLP and the role of workflow and semantic web technologies in developing healthcare IT environments.

  1. Using natural language processing techniques to inform research on nanotechnology.

    PubMed

    Lewinski, Nastassja A; McInnes, Bridget T

    2015-01-01

    Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.

  2. Automated Non-Alphanumeric Symbol Resolution in Clinical Texts

    PubMed Central

    Moon, SungRim; Pakhomov, Serguei; Ryan, James; Melton, Genevieve B.

    2011-01-01

    Although clinical texts contain many symbols, relatively little attention has been given to symbol resolution by medical natural language processing (NLP) researchers. Interpreting the meaning of symbols may be viewed as a special case of Word Sense Disambiguation (WSD). One thousand instances of four common non-alphanumeric symbols (‘+’, ‘–’, ‘/’, and ‘#’) were randomly extracted from a clinical document repository and annotated by experts. The symbols and their surrounding context, in addition to bag-of-Words (BoW), and heuristic rules were evaluated as features for the following classifiers: Naïve Bayes, Support Vector Machine, and Decision Tree, using 10-fold cross-validation. Accuracies for ‘+’, ‘–’, ‘/’, and ‘#’ were 80.11%, 80.22%, 90.44%, and 95.00% respectively, with Naïve Bayes. While symbol context contributed the most, BoW was also helpful for disambiguation of some symbols. Symbol disambiguation with supervised techniques can be implemented with reasonable accuracy as a module for medical NLP systems. PMID:22195157
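
    As an illustration of the symbol-context features described in this abstract, the following minimal sketch builds a sparse feature dictionary from the tokens surrounding a symbol. The function name, the window size of 2, and the feature naming scheme are assumptions made for illustration; the paper's actual feature definitions are not given in the abstract. A dictionary of this kind could be passed to any bag-of-features classifier, such as Naive Bayes.

```python
def symbol_context_features(tokens, index, window=2):
    """Build binary features for the symbol at tokens[index].

    Each neighbouring token within `window` positions becomes one
    position-tagged feature, alongside the symbol itself.
    """
    features = {"symbol=" + tokens[index]: 1}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the symbol itself is already recorded
        pos = index + offset
        if 0 <= pos < len(tokens):
            features[f"ctx[{offset:+d}]={tokens[pos].lower()}"] = 1
    return features


# Hypothetical clinical snippet in which '/' separates two readings.
tokens = "BP 120 / 80 today".split()
feats = symbol_context_features(tokens, index=2)
print(sorted(feats))
```

    Tagging each neighbour with its offset keeps left and right context distinct, which is one plausible way to let a classifier tell "120 / 80" from "and / or".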

  3. Speculation detection for Chinese clinical notes: Impacts of word segmentation and embedding models.

    PubMed

    Zhang, Shaodian; Kang, Tian; Zhang, Xingting; Wen, Dong; Elhadad, Noémie; Lei, Jianbo

    2016-04-01

    Speculations represent uncertainty toward certain facts. In clinical texts, identifying speculations is a critical step of natural language processing (NLP). While it is a nontrivial task in many languages, detecting speculations in Chinese clinical notes can be particularly challenging because word segmentation may be necessary as an upstream operation. The objective of this paper is to construct a state-of-the-art speculation detection system for Chinese clinical notes and to investigate whether embedding features and word segmentations are worth exploiting toward this overall task. We propose a sequence labeling based system for speculation detection, which relies on features from bag of characters, bag of words, character embedding, and word embedding. We experiment on a novel dataset of 36,828 clinical notes with 5103 gold-standard speculation annotations on 2000 notes, and compare the systems in which word embeddings are calculated based on word segmentations given by general and by domain specific segmenters respectively. Our systems are able to reach performance as high as 92.2% measured by F score. We demonstrate that word segmentation is critical to produce high quality word embedding to facilitate downstream information extraction applications, and suggest that a domain dependent word segmenter can be vital to such a clinical NLP task in Chinese language. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. A Natural Language Processing-based Model to Automate MRI Brain Protocol Selection and Prioritization.

    PubMed

    Brown, Andrew D; Marotta, Thomas R

    2017-02-01

    Incorrect imaging protocol selection can contribute to increased healthcare cost and waste. To help healthcare providers improve the quality and safety of medical imaging services, we developed and evaluated three natural language processing (NLP) models to determine whether NLP techniques could be employed to aid in clinical decision support for protocoling and prioritization of magnetic resonance imaging (MRI) brain examinations. To test the feasibility of using an NLP model to support clinical decision making for MRI brain examinations, we designed three different medical imaging prediction tasks, each with a unique outcome: selecting an examination protocol, evaluating the need for contrast administration, and determining priority. We created three models for each prediction task, each using a different classification algorithm (random forest, support vector machine, or k-nearest neighbor) to predict outcomes based on the narrative clinical indications and demographic data associated with 13,982 MRI brain examinations performed from January 1, 2013 to June 30, 2015. Test datasets were used to calculate the accuracy, sensitivity and specificity, predictive values, and the area under the curve. Our optimal results show an accuracy of 82.9%, 83.0%, and 88.2% for the protocol selection, contrast administration, and prioritization tasks, respectively, demonstrating that predictive algorithms can be used to aid in clinical decision support for examination protocoling. NLP models developed from the narrative clinical information provided by referring clinicians and demographic data are feasible methods to predict the protocol and priority of MRI brain examinations. Copyright © 2017 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  5. Application of Sequential Quadratic Programming to Minimize Smart Active Flap Rotor Hub Loads

    NASA Technical Reports Server (NTRS)

    Kottapalli, Sesi; Leyland, Jane

    2014-01-01

    In an analytical study, SMART active flap rotor hub loads have been minimized using nonlinear programming constrained optimization methodology. The recently developed NLPQLP system (Schittkowski, 2010) that employs Sequential Quadratic Programming (SQP) as its core algorithm was embedded into a driver code (NLP10x10) specifically designed to minimize active flap rotor hub loads (Leyland, 2014). Three types of practical constraints on the flap deflections have been considered. To validate the current application, two other optimization methods have been used: i) the standard, linear unconstrained method, and ii) the nonlinear Generalized Reduced Gradient (GRG) method with constraints. The new software code NLP10x10 has been systematically checked out. It has been verified that NLP10x10 is functioning as desired. The following are briefly covered in this paper: relevant optimization theory; implementation of the capability of minimizing a metric of all, or a subset, of the hub loads as well as the capability of using all, or a subset, of the flap harmonics; and finally, solutions for the SMART rotor. The eventual goal is to implement NLP10x10 in a real-time wind tunnel environment.

  6. De-identification of clinical notes via recurrent neural network and conditional random field.

    PubMed

    Liu, Zengjian; Tang, Buzhou; Wang, Xiaolong; Chen, Qingcai

    2017-11-01

    De-identification, identifying information from data, such as protected health information (PHI) present in clinical data, is a critical step to enable data to be shared or published. The 2016 Centers of Excellence in Genomic Science (CEGS) Neuropsychiatric Genome-scale and RDOC Individualized Domains (N-GRID) clinical natural language processing (NLP) challenge contains a de-identification track in de-identifying electronic medical records (EMRs) (i.e., track 1). The challenge organizers provide 1000 annotated mental health records for this track, 600 out of which are used as a training set and 400 as a test set. We develop a hybrid system for the de-identification task on the training set. Firstly, four individual subsystems, that is, a subsystem based on bidirectional LSTM (long-short term memory, a variant of recurrent neural network), a subsystem based on bidirectional LSTM with features, a subsystem based on conditional random field (CRF) and a rule-based subsystem, are used to identify PHI instances. Then, an ensemble learning-based classifier is deployed to combine all PHI instances predicted by the above three machine learning-based subsystems. Finally, the results of the ensemble learning-based classifier and the rule-based subsystem are merged together. Experiments conducted on the official test set show that our system achieves the highest micro F1-scores of 93.07%, 91.43% and 95.23% under the "token", "strict" and "binary token" criteria respectively, ranking first in the 2016 CEGS N-GRID NLP challenge. In addition, on the dataset of 2014 i2b2 NLP challenge, our system achieves the highest micro F1-scores of 96.98%, 95.11% and 98.28% under the "token", "strict" and "binary token" criteria respectively, outperforming other state-of-the-art systems. All these experiments prove the effectiveness of our proposed method. Copyright © 2017. Published by Elsevier Inc.

  7. The pleiotropic transcriptional regulator NlpR contributes to the modulation of nitrogen metabolism, lipogenesis and triacylglycerol accumulation in oleaginous rhodococci.

    PubMed

    Hernández, Martín A; Lara, Julia; Gago, Gabriela; Gramajo, Hugo; Alvarez, Héctor M

    2017-01-01

    The regulatory mechanisms involved in lipogenesis and triacylglycerol (TAG) accumulation are largely unknown in oleaginous rhodococci. In this study a regulatory protein (here called NlpR: Nitrogen lipid Regulator), which contributes to the modulation of nitrogen metabolism, lipogenesis and triacylglycerol accumulation in oleaginous rhodococci was identified. Under nitrogen deprivation conditions, in which TAG accumulation is stimulated, the nlpR gene was significantly upregulated, whereas a significant decrease of its expression and TAG accumulation occurred when cerulenin was added. The nlpR disruption negatively affected the nitrate/nitrite reduction as well as lipid biosynthesis under nitrogen-limiting conditions. In contrast, its overexpression increased TAG production during cultivation of cells in nitrogen-rich media. A putative 'NlpR-binding motif' upstream of several genes related to nitrogen and lipid metabolism was found. The nlpR disruption in RHA1 strain led to a reduced transcription of genes involved in nitrate/nitrite assimilation, as well as in fatty acid and TAG biosynthesis. Purified NlpR was able to bind to narK, nirD, fasI, plsC and atf3 promoter regions. It was suggested that NlpR acts as a pleiotropic transcriptional regulator by activating nitrate/nitrite assimilation genes and other genes involved in fatty acid and TAG biosynthesis, in response to nitrogen deprivation. © 2016 John Wiley & Sons Ltd.

  8. Life-span extension by dietary restriction is mediated by NLP-7 signaling and coelomocyte endocytosis in C. elegans.

    PubMed

    Park, Sang-Kyu; Link, Christopher D; Johnson, Thomas E

    2010-02-01

    Recent studies have shown that the rate of aging can be modulated by diverse interventions. Dietary restriction is the most widely used intervention to promote longevity; however, the mechanisms underlying the effect of dietary restriction remain elusive. In a previous study, we identified two novel genes, nlp-7 and cup-4, required for normal longevity in Caenorhabditis elegans. nlp-7 is one of a set of neuropeptide-like protein genes; cup-4 encodes an ion-channel involved in endocytosis by coelomocytes. Here, we assess whether nlp-7 and cup-4 mediate longevity increases by dietary restriction. RNAi of nlp-7 or cup-4 significantly reduces the life span of the eat-2 mutant, a genetic model of dietary restriction, but has no effect on the life span of long-lived mutants resulting from reduced insulin/IGF-1 signaling or dysfunction of the mitochondrial electron transport chain. The life-span extension observed in wild-type N2 worms by dietary restriction using bacterial dilution is prevented significantly in nlp-7 and cup-4 mutants. RNAi knockdown of genes encoding candidate receptors of NLP-7 and genes involved in endocytosis by coelomocytes also specifically shorten the life span of the eat-2 mutant. We conclude that two novel pathways, NLP-7 signaling and endocytosis by coelomocytes, are required for life extension under dietary restriction in C. elegans.

  9. Synthesis and characterization of Her2-NLP peptide conjugates targeting circulating breast cancer cells: cellular uptake and localization by fluorescent microscopic imaging.

    PubMed

    Cai, Huawei; Singh, Ajay N; Sun, Xiankai; Peng, Fangyu

    2015-01-01

    To synthesize a fluorescent Her2-NLP peptide conjugate consisting of Her2/neu targeting peptide and nuclear localization sequence peptide (NLP) and assess its cellular uptake and intracellular localization for radionuclide cancer therapy targeting Her2/neu-positive circulating breast cancer cells (CBCC). Fluorescent Cy5.5 Her2-NLP peptide conjugate was synthesized by coupling a bivalent peptide sequence, which consisted of a Her2-binding peptide (NH2-GSGKCCYSL) and an NLP peptide (CGYGPKKKRKVGG) linked by a polyethylene glycol (PEG) chain with 6 repeating units, with an activated Cy5.5 ester. The conjugate was separated and purified by HPLC and then characterized by MALDI-MS. The intracellular localization of fluorescent Cy5.5 Her2-NLP peptide conjugate was assessed by fluorescent microscopic imaging using a confocal microscope after incubation of Cy5.5-Her2-NLP with Her2/neu positive breast cancer cells and Her2/neu negative control breast cancer cells, respectively. Fluorescent signals were detected in the cytoplasm of Her2/neu positive breast cancer cells (SKBR-3 and BT474 cell lines), but little or none in the cytoplasm of Her2/neu negative breast cancer cells (MDA-MB-231), after incubation of the breast cancer cells with Cy5.5-Her2-NLP conjugates in vitro. No fluorescent signals were detected within the nuclei of Her2/neu positive SKBR-3 and BT474 breast cancer cells, nor of Her2/neu negative MDA-MB-231 cells, incubated with the Cy5.5-Her2-NLP peptide conjugates, suggesting poor nuclear localization of the Cy5.5-Her2-NLP conjugates localized within the cytoplasm after their cellular uptake and internalization by the Her2/neu positive breast cancer cells. 
Her2-binding peptide (KCCYSL) is a promising agent for radionuclide therapy of Her2/neu positive breast cancer using a β(-) or α emitting radionuclide, but poor nuclear localization of the Her2-NLP peptide conjugates may limit its use for eradication of Her2/neu-positive CBCC using I-125 or other Auger electron emitting radionuclide.

  10. Natural Language Processing in aid of FlyBase curators

    PubMed Central

    Karamanis, Nikiforos; Seal, Ruth; Lewin, Ian; McQuilton, Peter; Vlachos, Andreas; Gasperin, Caroline; Drysdale, Rachel; Briscoe, Ted

    2008-01-01

    Background Despite increasing interest in applying Natural Language Processing (NLP) to biomedical text, whether this technology can facilitate tasks such as database curation remains unclear. Results PaperBrowser is the first NLP-powered interface that was developed under a user-centered approach to improve the way in which FlyBase curators navigate an article. In this paper, we first discuss how observing curators at work informed the design and evaluation of PaperBrowser. Then, we present how we appraise PaperBrowser's navigational functionalities in a user-based study using a text highlighting task and evaluation criteria of Human-Computer Interaction. Our results show that PaperBrowser reduces the number of interactions between two highlighting events and therefore improves navigational efficiency by about 58% compared to the navigational mechanism that was previously available to the curators. Moreover, PaperBrowser is shown to provide curators with enhanced navigational utility by over 74% irrespective of the different ways in which they highlight text in the article. Conclusion We show that state-of-the-art performance in certain NLP tasks such as Named Entity Recognition and Anaphora Resolution can be combined with the navigational functionalities of PaperBrowser to support curation quite successfully. PMID:18410678

  11. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach.

    PubMed

    Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C

    2017-12-01

    The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237), and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The convolutional recurrent neural network with neural word embeddings trained-medical subdomain classifier yielded the best performance measurement on iDASH and MGH datasets with area under receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf)-weighting, outperformed other shallow learning classifiers on iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934 respectively. We trained classifiers on one dataset, applied to the other dataset and yielded the threshold of F1 score of 0.7 in classifiers for half of the medical subdomains we studied. 
Our study shows that a supervised learning-based NLP approach is useful to develop medical subdomain classifiers. The deep learning algorithm with distributed word representation yields better performance, yet shallow learning algorithms with the word and concept representation achieve comparable performance with better clinical interpretability. Portable classifiers may also be used across datasets from different institutions.

  12. Neuro-Linguistic Programming and Family Therapy.

    ERIC Educational Resources Information Center

    Davis, Susan L. R.; Davis, Donald I.

    1983-01-01

    Presents a brief introduction to Neuro-Linguistic Programming (NLP), followed by case examples which illustrate some of the substantive gains which NLP techniques have provided in work with couples and families. NLP's major contributions involve understanding new models of human experience. (WAS)

  13. Synergist: Collaborative Analyst Assistant

    DTIC Science & Technology

    2009-04-01

    NLP Framework ... 4  3.2  Identifying Concepts in Text...48  iii LIST OF FIGURES Figure 1: Lymba’s NLP Pipeline...events, general concepts, relations and context, and build representations that yield well to reasoning on text and providing information access. NLP

  14. Speaker Recognition Through NLP and CWT Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown-VanHoozer, S.A.; Kercel, S.W.; Tucker, R.W.

    The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time duration (<30 seconds), and with better than 90% accuracy. Much previous research in speaker recognition has led to algorithms that produced encouraging preliminary results, but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the "large population" problem by seeking two completely different kinds of characterizing features. These features are extracted using the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal and non-verbal information, and assimilates the information into useful patterns. These patterns are based on specific cues demonstrated by each individual, and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (or verbal predicate cues, e.g., see, sound, feel, etc.) while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space, and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on using vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is that there are a limited number of vowel phonemes, and at least one of them usually appears in even the shortest of speech segments. Using the fast CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms as well as the formant frequency are evident in the CWT output. 
More significantly, the CWT reveals significant detail of the glottal excitation waveform.

  15. Speaker recognition through NLP and CWT modeling.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brown-VanHoozer, A.; Kercel, S. W.; Tucker, R. W.

    The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time duration (<30 seconds), and with better than 90% accuracy. Much previous research in speaker recognition has led to algorithms that produced encouraging preliminary results, but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the "huge population" problem by seeking two completely different kinds of characterizing features. These features are extracted using the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal and non-verbal information, and assimilates the information into useful patterns. These patterns are based on specific cues demonstrated by each individual, and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (or verbal predicate cues, e.g., see, sound, feel, etc.) while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space, and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on using vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is that there are a limited number of vowel phonemes, and at least one of them usually appears in even the shortest of speech segments. Using the fast CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms as well as the formant frequency are evident in the CWT output. 
More significantly, the CWT reveals significant detail of the glottal excitation waveform.

  16. Extracting important information from Chinese Operation Notes with natural language processing methods.

    PubMed

    Wang, Hui; Zhang, Weide; Zeng, Qiang; Li, Zuofeng; Feng, Kaiyan; Liu, Lei

    2014-04-01

    Extracting information from unstructured clinical narratives is valuable for many clinical applications. Although natural language processing (NLP) methods have been extensively studied in electronic medical records (EMR), few studies have explored NLP for extracting information from Chinese clinical narratives. In this study, we report the development and evaluation of a method for extracting tumor-related information from operation notes of hepatic carcinomas written in Chinese. Using 86 operation notes manually annotated by physicians as the training set, we explored both rule-based and supervised machine-learning approaches. Evaluated on 29 unseen operation notes, our best approach yielded 69.6% precision, 58.3% recall and a 63.5% F-score. Copyright © 2014 Elsevier Inc. All rights reserved.
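
    The three figures reported above can be cross-checked against each other. The sketch below assumes the abstract's F-score is the balanced F1 (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract (69.6% and 58.3%).
reported = f_score(0.696, 0.583)
print(f"F1 = {reported:.1%}")
```

    Under that assumption the computed value agrees with the reported 63.5% to one decimal place.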

  17. Building gold standard corpora for medical natural language processing tasks.

    PubMed

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre

    2012-01-01

    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for an information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released, and, to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening), their annotation schemas are aligned with a large-scale, NIH-funded clinical text annotation project.

  18. Insights into substrate specificity of NlpC/P60 cell wall hydrolases containing bacterial SH3 domains

    DOE PAGES

    Xu, Qingping; Mengin-Lecreulx, Dominique; Liu, Xueqian W.; ...

    2015-09-15

    Bacterial SH3 (SH3b) domains are commonly fused with papain-like Nlp/P60 cell wall hydrolase domains. To understand how the modular architecture of SH3b and NlpC/P60 affects the activity of the catalytic domain, three putative NlpC/P60 cell wall hydrolases were biochemically and structurally characterized. In addition, these enzymes all have γ-d-Glu-A2pm (A2pm is diaminopimelic acid) cysteine amidase (or dl-endopeptidase) activities but with different substrate specificities. One enzyme is a cell wall lysin that cleaves peptidoglycan (PG), while the other two are cell wall recycling enzymes that only cleave stem peptides with an N-terminal l-Ala. Their crystal structures revealed a highly conserved structure consisting of two SH3b domains and a C-terminal NlpC/P60 catalytic domain, despite very low sequence identity. Interestingly, loops from the first SH3b domain dock into the ends of the active site groove of the catalytic domain, remodel the substrate binding site, and modulate substrate specificity. Two amino acid differences at the domain interface alter the substrate binding specificity in favor of stem peptides in recycling enzymes, whereas the SH3b domain may extend the peptidoglycan binding surface in the cell wall lysins. Remarkably, the cell wall lysin can be converted into a recycling enzyme with a single mutation. Peptidoglycan is a meshlike polymer that envelops the bacterial plasma membrane and bestows structural integrity. Cell wall lysins and recycling enzymes are part of a set of lytic enzymes that target covalent bonds connecting the amino acid and amino sugar building blocks of the PG network. These hydrolases are involved in processes such as cell growth and division, autolysis, invasion, and PG turnover and recycling. To avoid cleavage of unintended substrates, these enzymes have very selective substrate specificities. Our biochemical and structural analysis of three modular NlpC/P60 hydrolases, one lysin, and two recycling enzymes, shows that they may have evolved from a common molecular architecture, where the substrate preference is modulated by local changes. These results also suggest that new pathways for recycling PG turnover products, such as tracheal cytotoxin, may have evolved in bacteria in the human gut microbiome that involve NlpC/P60 cell wall hydrolases.

  19. Insights into substrate specificity of NlpC/P60 cell wall hydrolases containing bacterial SH3 domains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Qingping; Mengin-Lecreulx, Dominique; Liu, Xueqian W.

    Bacterial SH3 (SH3b) domains are commonly fused with papain-like Nlp/P60 cell wall hydrolase domains. To understand how the modular architecture of SH3b and NlpC/P60 affects the activity of the catalytic domain, three putative NlpC/P60 cell wall hydrolases were biochemically and structurally characterized. In addition, these enzymes all have γ-d-Glu-A2pm (A2pm is diaminopimelic acid) cysteine amidase (or dl-endopeptidase) activities but with different substrate specificities. One enzyme is a cell wall lysin that cleaves peptidoglycan (PG), while the other two are cell wall recycling enzymes that only cleave stem peptides with an N-terminal l-Ala. Their crystal structures revealed a highly conserved structure consisting of two SH3b domains and a C-terminal NlpC/P60 catalytic domain, despite very low sequence identity. Interestingly, loops from the first SH3b domain dock into the ends of the active site groove of the catalytic domain, remodel the substrate binding site, and modulate substrate specificity. Two amino acid differences at the domain interface alter the substrate binding specificity in favor of stem peptides in recycling enzymes, whereas the SH3b domain may extend the peptidoglycan binding surface in the cell wall lysins. Remarkably, the cell wall lysin can be converted into a recycling enzyme with a single mutation. Peptidoglycan is a meshlike polymer that envelops the bacterial plasma membrane and bestows structural integrity. Cell wall lysins and recycling enzymes are part of a set of lytic enzymes that target covalent bonds connecting the amino acid and amino sugar building blocks of the PG network. These hydrolases are involved in processes such as cell growth and division, autolysis, invasion, and PG turnover and recycling. To avoid cleavage of unintended substrates, these enzymes have very selective substrate specificities. Our biochemical and structural analysis of three modular NlpC/P60 hydrolases, one lysin, and two recycling enzymes, shows that they may have evolved from a common molecular architecture, where the substrate preference is modulated by local changes. These results also suggest that new pathways for recycling PG turnover products, such as tracheal cytotoxin, may have evolved in bacteria in the human gut microbiome that involve NlpC/P60 cell wall hydrolases.

  20. Insights into Substrate Specificity of NlpC/P60 Cell Wall Hydrolases Containing Bacterial SH3 Domains

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Qingping; Mengin-Lecreulx, Dominique; Liu, Xueqian W.

    ABSTRACT Bacterial SH3 (SH3b) domains are commonly fused with papain-like Nlp/P60 cell wall hydrolase domains. To understand how the modular architecture of SH3b and NlpC/P60 affects the activity of the catalytic domain, three putative NlpC/P60 cell wall hydrolases were biochemically and structurally characterized. These enzymes all have γ-d-Glu-A2pm (A2pm is diaminopimelic acid) cysteine amidase (or dl-endopeptidase) activities but with different substrate specificities. One enzyme is a cell wall lysin that cleaves peptidoglycan (PG), while the other two are cell wall recycling enzymes that only cleave stem peptides with an N-terminal l-Ala. Their crystal structures revealed a highly conserved structure consisting of two SH3b domains and a C-terminal NlpC/P60 catalytic domain, despite very low sequence identity. Interestingly, loops from the first SH3b domain dock into the ends of the active site groove of the catalytic domain, remodel the substrate binding site, and modulate substrate specificity. Two amino acid differences at the domain interface alter the substrate binding specificity in favor of stem peptides in recycling enzymes, whereas the SH3b domain may extend the peptidoglycan binding surface in the cell wall lysins. Remarkably, the cell wall lysin can be converted into a recycling enzyme with a single mutation. IMPORTANCE Peptidoglycan is a meshlike polymer that envelops the bacterial plasma membrane and bestows structural integrity. Cell wall lysins and recycling enzymes are part of a set of lytic enzymes that target covalent bonds connecting the amino acid and amino sugar building blocks of the PG network. These hydrolases are involved in processes such as cell growth and division, autolysis, invasion, and PG turnover and recycling. To avoid cleavage of unintended substrates, these enzymes have very selective substrate specificities. Our biochemical and structural analysis of three modular NlpC/P60 hydrolases, one lysin, and two recycling enzymes, shows that they may have evolved from a common molecular architecture, where the substrate preference is modulated by local changes. These results also suggest that new pathways for recycling PG turnover products, such as tracheal cytotoxin, may have evolved in bacteria in the human gut microbiome that involve NlpC/P60 cell wall hydrolases.

  1. Using natural language processing techniques to inform research on nanotechnology

    PubMed Central

    Lewinski, Nastassja A

    2015-01-01

    Summary Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics. PMID:26199848

  2. Canary: An NLP Platform for Clinicians and Researchers.

    PubMed

    Malmasi, Shervin; Sandor, Nicolae L; Hosomura, Naoshi; Goldberg, Matt; Skentzos, Stephen; Turchin, Alexander

    2017-05-03

    Information Extraction methods can help discover critical knowledge buried in the vast repositories of unstructured clinical data. However, these methods are underutilized in clinical research, potentially due to the absence of free software geared towards clinicians with little technical expertise. The skills required for developing/using such software constitute a major barrier for medical researchers wishing to employ these methods. To address this, we have developed Canary, a free and open-source solution designed for users without natural language processing (NLP) or software engineering experience. It was designed to be fast and work out of the box via a user-friendly graphical interface.

  3. The Multi-Needle Langmuir Probe System on Board NorSat-1

    NASA Astrophysics Data System (ADS)

    Hoang, H.; Clausen, L. B. N.; Røed, K.; Bekkeng, T. A.; Trondsen, E.; Lybekk, B.; Strøm, H.; Bang-Hauge, D. M.; Pedersen, A.; Spicher, A.; Moen, J. I.

    2018-06-01

    On July 14th, 2017, the first Norwegian scientific satellite NorSat-1 was launched into a high-inclination (98°), low-Earth orbit (600 km altitude) from Baikonur, Kazakhstan. As part of the payload package, NorSat-1 carries the multi-needle Langmuir probe (m-NLP) instrument which is capable of sampling the electron density at a rate up to 1 kHz, thus offering an unprecedented opportunity to continuously resolve ionospheric plasma density structures down to a few meters. Over the coming years, NorSat-1 will cross the equatorial and polar regions twice every 90 minutes, providing a wealth of data that will help to better understand the mechanisms that dissipate energy input from larger spatial scales by creating small-scale plasma density structures within the ionosphere. In this paper we describe the m-NLP system on board NorSat-1 and present some first results from the instrument commissioning phase. We show that the m-NLP instrument performs as expected and highlight its unique capabilities at resolving small-scale ionospheric plasma density structures.

  4. Launch flexibility using NLP guidance and remote wind sensing

    NASA Technical Reports Server (NTRS)

    Cramer, Evin J.; Bradt, Jerre E.; Hardtla, John W.

    1990-01-01

    This paper examines the use of lidar wind measurements in the implementation of a guidance strategy for a nonlinear programming (NLP) launch guidance algorithm. The NLP algorithm uses B-spline command function representation for flexibility in the design of the guidance steering commands. Using this algorithm, the guidance system solves a two-point boundary value problem at each guidance update. The specification of different boundary value problems at each guidance update provides flexibility that can be used in the design of the guidance strategy. The algorithm can use lidar wind measurements for on pad guidance retargeting and for load limiting guidance steering commands. Examples presented in the paper use simulated wind updates to correct wind induced final orbit errors and to adjust the guidance steering commands to limit the product of the dynamic pressure and angle-of-attack for launch vehicle load alleviation.

  5. Web 2.0-Based Crowdsourcing for High-Quality Gold Standard Development in Clinical Natural Language Processing

    PubMed Central

    Deleger, Louise; Li, Qi; Kaiser, Megan; Stoutenborough, Laura

    2013-01-01

    Background A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. The previously reported results on a medical named entity annotation task showed a 0.68 F-measure based agreement between crowdsourced and traditionally-developed corpora. Objective Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. Methods To build the gold standard for evaluating the crowdsourcing workers’ performance, 1042 clinical trial announcements (CTAs) from the ClinicalTrials.gov website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd’s work and tested the statistical significance (P<.001, chi-square test) to detect differences between the crowdsourced and traditionally-developed annotations.
Results The agreement between the crowd’s annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names; 0.73, medication types), (2) correction of previous annotations (0.90, medication names; 0.76, medication types), and excellent for (3) linking medications with their attributes (0.96). Simple voting provided the best judgment aggregation approach. There was no statistically significant difference between the crowd and traditionally-generated corpora. Our results showed a 27.9% improvement over previously reported results on medication named entity annotation task. Conclusions This study offers three contributions. First, we proved that crowdsourcing is a feasible, inexpensive, fast, and practical approach to collect high-quality annotations for clinical text (when protected health information was excluded). We believe that well-designed user interfaces and rigorous quality control strategy for entity annotation and linking were critical to the success of this work. Second, as a further contribution to the Internet-based crowdsourcing field, we will publicly release the JavaScript and CrowdFlower Markup Language infrastructure code that is necessary to utilize CrowdFlower’s quality control and crowdsourcing interfaces for named entity annotations. Finally, to spur future research, we will release the CTA annotations that were generated by traditional and crowdsourced approaches. PMID:23548263
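
    The abstract above reports that simple voting was the best judgment aggregation approach. A minimal sketch of what such aggregation can look like follows; it is not the authors' released code. The tie-handling rule (return None), the label strings, and the item keys are assumptions made for illustration.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_vote(labels: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by the most workers, or None on a tie or empty input."""
    if not labels:
        return None
    top = Counter(labels).most_common(2)
    # A tie between the two most frequent labels is left unresolved (assumed policy).
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]


# Hypothetical worker judgments for three text spans.
judgments = {
    "span_1": ["MEDICATION", "MEDICATION", "OTHER"],
    "span_2": ["OTHER", "OTHER", "OTHER"],
    "span_3": ["MEDICATION", "OTHER"],
}
aggregated = {span: majority_vote(votes) for span, votes in judgments.items()}
print(aggregated)
```

    In practice a tie would need a further rule, such as requesting one more judgment; the abstract does not say how ties were handled.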

  6. DNA-Targeted 2-Nitroimidazoles: Studies of the Influence of the Phenanthridine-Linked Nitroimidazoles, 2-NLP-3 and 2-NLP-4, on DNA Damage Induced by Ionizing Radiation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buchko, Garry W.; Weinfeld, Michael

    The nitroimidazole-linked phenanthridines 2-NLP-3 (5-[3-(2-nitro-1-imidazoyl)-propyl]-phenanthridinium bromide) and 2-NLP-4 (5-[3-(2-nitro-1-imidazoyl)-butyl]-phenanthridinium bromide) are composed of the radiosensitizer, 2-nitroimidazole, attached to the DNA intercalator phenanthridine via a 3- and 4-carbon linker, respectively. Previous in vitro assays show both compounds to be 10 - 100 times more efficient as hypoxic cell radiosensitizers than misonidazole [Cowan et al., Radiat. Res. 127, 81-89, 1991]. Here we have used a 32P postlabeling assay and 5'-end labeled oligonucleotide assay to compare the radiogenic DNA damage generated in the presence of 2-NLP-3 or 2-NLP-4 with that generated by irradiation in the presence of misonidazole. This may account, at least in part, for the greater cellular radiosensitization shown by the nitroimidazole-linked phenanthridines over misonidazole.

  7. A Conserved Dopamine-Cholecystokinin Signaling Pathway Shapes Context–Dependent Caenorhabditis elegans Behavior

    PubMed Central

    Bhattacharya, Raja; Touroutine, Denis; Barbagallo, Belinda; Climer, Jason; Lambert, Christopher M.; Clark, Christopher M.; Alkema, Mark J.; Francis, Michael M.

    2014-01-01

    An organism's ability to thrive in changing environmental conditions requires the capacity for making flexible behavioral responses. Here we show that, in the nematode Caenorhabditis elegans, foraging responses to changes in food availability require nlp-12, a homolog of the mammalian neuropeptide cholecystokinin (CCK). nlp-12 expression is limited to a single interneuron (DVA) that is postsynaptic to dopaminergic neurons involved in food-sensing, and presynaptic to locomotory control neurons. NLP-12 release from DVA is regulated through the D1-like dopamine receptor DOP-1, and both nlp-12 and dop-1 are required for normal local food searching responses. nlp-12/CCK overexpression recapitulates characteristics of local food searching, and DVA ablation or mutations disrupting muscle acetylcholine receptor function attenuate these effects. Conversely, nlp-12 deletion reverses behavioral and functional changes associated with genetically enhanced muscle acetylcholine receptor activity. Thus, our data suggest that dopamine-mediated sensory information about food availability shapes foraging in a context-dependent manner through peptide modulation of locomotory output. PMID:25167143

  8. The neuropeptide NLP-22 regulates a sleep-like state in Caenorhabditis elegans

    PubMed Central

    Nelson, MD; Trojanowski, NF; George-Raizen, JB; Smith, CJ; Yu, C-C; Fang-Yen, C; Raizen, DM

    2013-01-01

    Neuropeptides play central roles in the regulation of homeostatic behaviors such as sleep and feeding. Caenorhabditis elegans displays sleep-like quiescence of locomotion and feeding during a larval transition stage called lethargus and feeds during active larval and adult stages. Here we show that the neuropeptide NLP-22 is a regulator of Caenorhabditis elegans sleep-like quiescence observed during lethargus. nlp-22 shows cyclical mRNA expression in synchrony with lethargus; it is regulated by LIN-42, an orthologue of the core circadian protein PERIOD; and it is expressed solely in the two RIA interneurons. nlp-22 and the RIA interneurons are required for normal lethargus quiescence, and forced expression of nlp-22 during active stages causes anachronistic locomotion and feeding quiescence. Optogenetic stimulation of RIA interneurons has a movement-promoting effect, demonstrating functional complexity in a single neuron type. Our work defines a quiescence-regulating role for NLP-22 and expands our knowledge of the neural circuitry controlling Caenorhabditis elegans behavioral quiescence. PMID:24301180

  9. The neuropeptide NLP-22 regulates a sleep-like state in Caenorhabditis elegans.

    PubMed

    Nelson, M D; Trojanowski, N F; George-Raizen, J B; Smith, C J; Yu, C-C; Fang-Yen, C; Raizen, D M

    2013-01-01

    Neuropeptides have central roles in the regulation of homoeostatic behaviours such as sleep and feeding. Caenorhabditis elegans displays sleep-like quiescence of locomotion and feeding during a larval transition stage called lethargus and feeds during active larval and adult stages. Here we show that the neuropeptide NLP-22 is a regulator of Caenorhabditis elegans sleep-like quiescence observed during lethargus. nlp-22 shows cyclical mRNA expression in synchrony with lethargus; it is regulated by LIN-42, an orthologue of the core circadian protein PERIOD; and it is expressed solely in the two RIA interneurons. nlp-22 and the RIA interneurons are required for normal lethargus quiescence, and forced expression of nlp-22 during active stages causes anachronistic locomotion and feeding quiescence. Optogenetic stimulation of the RIA interneurons has a movement-promoting effect, demonstrating functional complexity in a single-neuron type. Our work defines a quiescence-regulating role for NLP-22 and expands our knowledge of the neural circuitry controlling Caenorhabditis elegans behavioural quiescence.

  10. Biomedical informatics advancing the national health agenda: the AMIA 2015 year-in-review in clinical and consumer informatics.

    PubMed

    Roberts, Kirk; Boland, Mary Regina; Pruinelli, Lisiane; Dcruz, Jina; Berry, Andrew; Georgsson, Mattias; Hazen, Rebecca; Sarmiento, Raymond F; Backonja, Uba; Yu, Kun-Hsing; Jiang, Yun; Brennan, Patricia Flatley

    2017-04-01

    The field of biomedical informatics experienced a productive 2015 in terms of research. In order to highlight the accomplishments of that research, elicit trends, and identify shortcomings at a macro level, a 19-person team conducted an extensive review of the literature in clinical and consumer informatics. The result of this process included a year-in-review presentation at the American Medical Informatics Association Annual Symposium and a written report (see supplemental data). Key findings are detailed in the report and summarized here. This article organizes the clinical and consumer health informatics research from 2015 under 3 themes: the electronic health record (EHR), the learning health system (LHS), and consumer engagement. Key findings include the following: (1) There are significant advances in establishing policies for EHR feature implementation, but increased interoperability is necessary for these to gain traction. (2) Decision support systems improve practice behaviors, but evidence of their impact on clinical outcomes is still lacking. (3) Progress in natural language processing (NLP) suggests that we are approaching but have not yet achieved truly interactive NLP systems. (4) Prediction models are becoming more robust but remain hampered by the lack of interoperable clinical data records. (5) Consumers can and will use mobile applications for improved engagement, yet EHR integration remains elusive. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. The Use of Systemic-Functional Linguistics in Automated Text Mining

    DTIC Science & Technology

    2009-03-01

    what degree two or more documents are similar in terms of their meaning. Simply put, such a cognitive model aims to link the physical manifestation...These features, both in terms of frequency and their chaining across a text, were taken as salient stylistic features that had a direct relationship to...because SFL attempts to model these cognitive processes, this has the potential to improve NLP tasks by making them more ’human-like’. Secondly

  12. Interacting TCP and NLP transcription factors control plant responses to nitrate availability.

    PubMed

    Guan, Peizhu; Ripoll, Juan-José; Wang, Renhou; Vuong, Lam; Bailey-Steinitz, Lindsay J; Ye, Dening; Crawford, Nigel M

    2017-02-28

    Plants have evolved adaptive strategies that involve transcriptional networks to cope with and survive environmental challenges. Key transcriptional regulators that mediate responses to environmental fluctuations in nitrate have been identified; however, little is known about how these regulators interact to orchestrate nitrogen (N) responses and cell-cycle regulation. Here we report that teosinte branched1/cycloidea/proliferating cell factor1-20 (TCP20) and NIN-like protein (NLP) transcription factors NLP6 and NLP7, which act as activators of nitrate assimilatory genes, bind to adjacent sites in the upstream promoter region of the nitrate reductase gene, NIA1, and physically interact under continuous nitrate and N-starvation conditions. Regions of these proteins necessary for these interactions were found to include the type I/II Phox and Bem1p (PB1) domains of NLP6&7, a protein-interaction module conserved in animals for nutrient signaling, and the histidine- and glutamine-rich domain of TCP20, which is conserved across plant species. Under N starvation, TCP20-NLP6&7 heterodimers accumulate in the nucleus, and this coincides with TCP20 and NLP6&7-dependent up-regulation of nitrate assimilation and signaling genes and down-regulation of the G2/M cell-cycle marker gene, CYCB1;1. TCP20 and NLP6&7 also support root meristem growth under N starvation. These findings provide insights into how plants coordinate responses to nitrate availability, linking nitrate assimilation and signaling with cell-cycle progression.

  13. Scholarly Information Extraction Is Going to Make a Quantum Leap with PubMed Central (PMC).

    PubMed

    Matthies, Franz; Hahn, Udo

    2017-01-01

    With the increasing availability of complete full texts (journal articles), rather than their surrogates (titles, abstracts), as resources for text analytics, entirely new opportunities arise for information extraction and text mining from scholarly publications. Yet, we gathered evidence that a range of problems are encountered for full-text processing when biomedical text analytics simply reuse existing NLP pipelines which were developed on the basis of abstracts (rather than full texts). We conducted experiments with four different relation extraction engines, all of which were top performers in previous BioNLP Event Extraction Challenges. We found that abstract-trained engines lose up to 6.6% F-score points when run on full-text data. Hence, the reuse of existing abstract-based NLP software in a full-text scenario is considered harmful because of heavy performance losses. Given the current lack of annotated full-text resources to train on, our study quantifies the price paid for this short cut.

  14. Minimum Fuel Trajectory Design in Multiple Dynamical Environments Utilizing Direct Transcription Methods and Particle Swarm Optimization

    DTIC Science & Technology

    2016-03-01

    3.1.3 NLP Improvement...3.2.1.2 NLP Improvement; 3.2.2 Multiple-burn Planar LEO to GEO Transfer...3.2.2.1 PSO Initial Guess Generation; 3.2.2.2 NLP Improvement

  15. The effect of neuro-linguistic programming on occupational stress in critical care nurses

    PubMed Central

    HemmatiMaslakpak, Masumeh; Farhadi, Masumeh; Fereidoni, Javid

    2016-01-01

    Background: The use of coping strategies in reducing the adverse effects of stress can be helpful. Neuro-linguistic programming (NLP) is one of the modern methods of psychotherapy. This study aimed to determine the effect of NLP on occupational stress in nurses working in critical care units of Urmia. Materials and Methods: This study was carried out quasi-experimentally (before–after) with control and experimental groups. Of all the nurses working in the critical care units of Urmia Imam Khomeini and Motahari educational/therapeutic centers, 60 people participated in this survey. Eighteen sessions of intervention were done, each for 180 min. The experimental group received an NLP program (such as goal setting, time management, assertiveness skills, representational system, and neurological levels, as well as some practical and useful NLP techniques). The Expanded Nursing Stress Scale (ENSS) was used as the data gathering tool. Data were analyzed using SPSS version 16. Descriptive statistics and Chi-square test, Mann–Whitney test, and independent t-test were used to analyze the data. Results: The baseline score average of job stress was 120.88 and 121.36 for the intervention and control groups, respectively (P = 0.65). After intervention, the score average of job stress decreased to 64.53 in the experimental group while that of the control group remained relatively unchanged (120.96). Mann–Whitney test results showed that the difference in stress scores between the two groups was statistically significant (P = 0.0001). Conclusions: The results showed that the use of NLP can increase coping with stressful situations, and it can reduce the adverse effects of occupational stress. PMID:26985221

  16. Identification of Long Bone Fractures in Radiology Reports Using Natural Language Processing to Support Healthcare Quality Improvement

    PubMed Central

    Masino, Aaron J.; Casper, T. Charles; Dean, Jonathan M.; Bell, Jamie; Enriquez, Rene; Deakyne, Sara; Chamberlain, James M.; Alpern, Elizabeth R.

    2016-01-01

    Background: Important information to support healthcare quality improvement is often recorded in free text documents such as radiology reports. Natural language processing (NLP) methods may help extract this information, but these methods have rarely been applied outside the research laboratories where they were developed. Objective: To implement and validate NLP tools to identify long bone fractures for pediatric emergency medicine quality improvement. Methods: Using freely available statistical software packages, we implemented NLP methods to identify long bone fractures from radiology reports. A sample of 1,000 radiology reports was used to construct three candidate classification models. A test set of 500 reports was used to validate the model performance. Blinded manual review of radiology reports by two independent physicians provided the reference standard. Each radiology report was segmented and word stem and bigram features were constructed. Common English “stop words” and rare features were excluded. We used 10-fold cross-validation to select optimal configuration parameters for each model. Accuracy, recall, precision and the F1 score were calculated. The final model was compared to the use of diagnosis codes for the identification of patients with long bone fractures. Results: There were 329 unique word stems and 344 bigrams in the training documents. A support vector machine classifier with Gaussian kernel performed best on the test set with accuracy=0.958, recall=0.969, precision=0.940, and F1 score=0.954. Optimal parameters for this model were cost=4 and gamma=0.005. The three classification models that we tested all performed better than diagnosis codes in terms of accuracy, precision, and F1 score (diagnosis code accuracy=0.932, recall=0.960, precision=0.896, and F1 score=0.927). Conclusions: NLP methods using a corpus of 1,000 training documents accurately identified acute long bone fractures from radiology reports. Strategic use of straightforward NLP methods, implemented with freely available software, offers quality improvement teams new opportunities to extract information from narrative documents. PMID:27826610
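
    As a quick consistency check on the figures reported in this abstract, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only; the function and variable names are assumptions and are not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
svm_f1 = f1_score(precision=0.940, recall=0.969)
dx_code_f1 = f1_score(precision=0.896, recall=0.960)

print(f"SVM classifier F1:  {svm_f1:.3f}")
print(f"Diagnosis code F1:  {dx_code_f1:.3f}")
```

    Rounded to three decimals, both values agree with the F1 scores quoted in the abstract (0.954 and 0.927).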

  17. The Effects of Clinical Hypnosis versus Neurolinguistic Programming (NLP) before External Cephalic Version (ECV): A Prospective Off-Centre Randomised, Double-Blind, Controlled Trial

    PubMed Central

    Reinhard, Joscha; Peiffer, Swati; Sänger, Nicole; Herrmann, Eva; Yuan, Juping; Louwen, Frank

    2012-01-01

    Objective. To examine the effects of clinical hypnosis versus NLP intervention on the success rate of ECV procedures in comparison to a control group. Methods. A prospective off-centre randomised trial of a clinical hypnosis intervention against NLP in women with a singleton breech fetus at or after 37 0/7 weeks (259 days) of gestation and a normal amniotic fluid index. All 80 participants heard a 20-minute recorded intervention via headphones. The main outcome assessed was the success rate of ECV. The intervention groups were compared with a control group with standard medical care alone (n = 122). Results. Of the 42 women who received a hypnosis intervention prior to ECV, 40.5% (n = 17) had a successful ECV, whereas of the 38 women who received NLP, 44.7% (n = 17) had a successful ECV (P > 0.05). The control group had similar patient characteristics compared to the intervention groups (P > 0.05). In the control group (n = 122), 27.3% (n = 33) had a successful ECV, a rate statistically significantly lower than with NLP (P = 0.05) and with hypnosis and NLP (P = 0.03). Conclusions. These findings suggest that prior clinical hypnosis and NLP have similar success rates of ECV procedures and are both superior to standard medical care alone. PMID:22778774

  18. The Effects of Clinical Hypnosis versus Neurolinguistic Programming (NLP) before External Cephalic Version (ECV): A Prospective Off-Centre Randomised, Double-Blind, Controlled Trial.

    PubMed

    Reinhard, Joscha; Peiffer, Swati; Sänger, Nicole; Herrmann, Eva; Yuan, Juping; Louwen, Frank

    2012-01-01

    Objective. To examine the effects of clinical hypnosis versus NLP intervention on the success rate of ECV procedures in comparison to a control group. Methods. A prospective off-centre randomised trial of a clinical hypnosis intervention against NLP in women with a singleton breech fetus at or after 37 0/7 weeks (259 days) of gestation and a normal amniotic fluid index. All 80 participants heard a 20-minute recorded intervention via headphones. The main outcome assessed was the success rate of ECV. The intervention groups were compared with a control group with standard medical care alone (n = 122). Results. Of the 42 women who received a hypnosis intervention prior to ECV, 40.5% (n = 17) had a successful ECV, whereas of the 38 women who received NLP, 44.7% (n = 17) had a successful ECV (P > 0.05). The control group had similar patient characteristics compared to the intervention groups (P > 0.05). In the control group (n = 122), 27.3% (n = 33) had a successful ECV, a rate statistically significantly lower than with NLP (P = 0.05) and with hypnosis and NLP (P = 0.03). Conclusions. These findings suggest that prior clinical hypnosis and NLP have similar success rates of ECV procedures and are both superior to standard medical care alone.

  19. Evidence-based Neuro Linguistic Psychotherapy: a meta-analysis.

    PubMed

    Zaharia, Cătălin; Reiner, Melita; Schütz, Peter

    2015-12-01

    The Neuro Linguistic Programming (NLP) framework has enjoyed enormous popularity in the field of applied psychology. NLP has been used in business, education, law, medicine and psychotherapy to identify people's patterns and alter their responses to stimuli, so they are better able to regulate their environment and themselves. NLP looks at achieving goals, creating stable relationships, eliminating barriers such as fears and phobias, building self-confidence and self-esteem, and achieving peak performance. Neuro Linguistic Psychotherapy (NLPt) encompasses NLP as a framework and set of interventions in the treatment of individuals with different psychological and/or social problems. We aimed to systematically analyse the available data regarding the effectiveness of Neuro Linguistic Psychotherapy (NLPt). The present work is a meta-analysis of studies, observational or randomized controlled trials, evaluating the efficacy of Neuro Linguistic Programming in individuals with different psychological and/or social problems. The following databases were searched to identify studies in English and German: CENTRAL in the Cochrane Library; PubMed; ISI Web of Knowledge (including results from Medline and the Web of Science); PsycINFO (including PsycARTICLES); Psyndex; Deutschsprachige Diplomarbeiten der Psychologie (database of theses in psychology in German); Social SciSearch; National library of health; and two NLP-specific research databases: one from the NLP Community (http://www.nlp.de/cgi-bin/research/nlprdb.cgi?action=res_entries) and one from the NLP Group (http://www.nlpgrup.com/bilimselarastirmalar/bilimsel-arastirmalar-4.html#Zweig154). From a total number of 425 studies, 350 were removed and considered not relevant based on the title and abstract. Included in the final analysis are 12 studies with numbers of participants ranging between 12 and 115 subjects. The vast majority of studies were prospective observational. The present paper represents the first meta-analysis evaluating the effectiveness of NLP therapy for individuals with social/psychological problems. The overall meta-analysis found that NLP therapy may add an overall standardized mean difference of 0.54 with a confidence interval of CI=[0.20; 0.88]. Neuro-Linguistic Psychotherapy, as a psychotherapeutic modality grounded in scientifically developed theoretical frameworks, methodologies and interventions, including models developed by NLP, shows results that can hold their ground in comparison with other psychotherapeutic methods.

  20. Enhancing Risk Assessment in Patients Receiving Chronic Opioid Analgesic Therapy Using Natural Language Processing.

    PubMed

    Haller, Irina V; Renier, Colleen M; Juusola, Mitch; Hitz, Paul; Steffen, William; Asmus, Michael J; Craig, Terri; Mardekian, Jack; Masters, Elizabeth T; Elliott, Thomas E

    2017-10-01

    Clinical guidelines for the use of opioids in chronic noncancer pain recommend assessing risk for aberrant drug-related behaviors prior to initiating opioid therapy. Despite recent dramatic increases in prescription opioid misuse and abuse, screening tools continue to be underutilized by clinicians. This research evaluated natural language processing (NLP) together with other data extraction techniques for risk assessment of patients considered for opioid therapy as a means of predicting opioid abuse. Using a retrospective cohort of 3,668 chronic noncancer pain patients with at least one opioid agreement between January 1, 2007, and December 31, 2012, we examined the availability of electronic health record structured and unstructured data to populate the Opioid Risk Tool (ORT) and other selected outcomes. Clinician-documented opioid agreement violations in the clinical notes were determined using NLP techniques followed by manual review of the notes. Confirmed through manual review, the NLP algorithm had 96.1% sensitivity, 92.8% specificity, and 92.6% positive predictive value in identifying opioid agreement violation. At the time of most recent opioid agreement, automated ORT identified 42.8% of patients as at low risk, 28.2% as at moderate risk, and 29.0% as at high risk for opioid abuse. During a year following the agreement, 22.5% of patients had opioid agreement violations. Patients classified as high risk were three times more likely to violate opioid agreements compared with those with low/moderate risk. Our findings suggest that NLP techniques have potential utility to support clinicians in screening chronic noncancer pain patients considered for long-term opioid therapy. © 2016 American Academy of Pain Medicine. All rights reserved.

  1. Automating Quality Measures for Heart Failure Using Natural Language Processing: A Descriptive Study in the Department of Veterans Affairs

    PubMed Central

    Kim, Youngjun; Gobbel, Glenn Temple; Matheny, Michael E; Redd, Andrew; Bray, Bruce E; Heidenreich, Paul; Bolton, Dan; Heavirland, Julia; Kelly, Natalie; Reeves, Ruth; Kalsy, Megha; Goldstein, Mary Kane; Meystre, Stephane M

    2018-01-01

    Background: We developed an accurate, stakeholder-informed, automated, natural language processing (NLP) system to measure the quality of heart failure (HF) inpatient care, and explored the potential for adoption of this system within an integrated health care system. Objective: To accurately automate a United States Department of Veterans Affairs (VA) quality measure for inpatients with HF. Methods: We automated the HF quality measure Congestive Heart Failure Inpatient Measure 19 (CHI19) that identifies whether a given patient has left ventricular ejection fraction (LVEF) <40%, and if so, whether an angiotensin-converting enzyme inhibitor or angiotensin-receptor blocker was prescribed at discharge if there were no contraindications. We used documents from 1083 unique inpatients from eight VA medical centers to develop a reference standard (RS) to train (n=314) and test (n=769) the Congestive Heart Failure Information Extraction Framework (CHIEF). We also conducted semi-structured interviews (n=15) for stakeholder feedback on implementation of the CHIEF. Results: The CHIEF classified each hospitalization in the test set with a sensitivity (SN) of 98.9% and positive predictive value of 98.7%, compared with an RS and SN of 98.5% for available External Peer Review Program assessments. Of the 1083 patients available for the NLP system, the CHIEF evaluated and classified 100% of cases. Stakeholders identified potential implementation facilitators and clinical uses of the CHIEF. Conclusions: The CHIEF provided complete data for all patients in the cohort and could potentially improve the efficiency, timeliness, and utility of HF quality measurements. PMID:29335238
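
    The decision logic of the CHI19 measure, as summarised in the abstract, can be sketched as a small function applied after information extraction. This is a hedged illustration only: the `Hospitalization` record, its field names, the status labels, and the handling of a missing LVEF value are all assumptions made for the example and are not described in the abstract.

```python
from dataclasses import dataclass
from typing import Optional

LVEF_THRESHOLD_PERCENT = 40.0  # the abstract's criterion: LVEF < 40%


@dataclass
class Hospitalization:
    """Hypothetical extraction output for one inpatient stay."""
    lvef_percent: Optional[float]       # None if no LVEF value was extracted
    acei_or_arb_at_discharge: bool      # ACE inhibitor or ARB prescribed at discharge
    contraindication_documented: bool   # any documented contraindication


def chi19_status(stay: Hospitalization) -> str:
    """Return an illustrative status label for the measure."""
    if stay.lvef_percent is None:
        return "lvef_not_found"        # assumption: treated as its own category
    if stay.lvef_percent >= LVEF_THRESHOLD_PERCENT:
        return "not_applicable"        # measure only concerns LVEF < 40%
    if stay.acei_or_arb_at_discharge:
        return "met"
    if stay.contraindication_documented:
        return "excepted"              # no prescription, but a documented reason
    return "not_met"


print(chi19_status(Hospitalization(35.0, True, False)))
print(chi19_status(Hospitalization(30.0, False, True)))
print(chi19_status(Hospitalization(55.0, False, False)))
```

    Separating the extraction step from the measure logic in this way keeps the rule itself easy to audit, whatever method produces the three extracted fields.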

  2. Mapping Partners Master Drug Dictionary to RxNorm using an NLP-based approach.

    PubMed

    Zhou, Li; Plasek, Joseph M; Mahoney, Lisa M; Chang, Frank Y; DiMaggio, Dana; Rocha, Roberto A

    2012-08-01

    To develop an automated method based on natural language processing (NLP) to facilitate the creation and maintenance of a mapping between RxNorm and a local medication terminology for interoperability and meaningful use purposes. We mapped 5961 terms from Partners Master Drug Dictionary (MDD) and 99 of the top prescribed medications to RxNorm. The mapping was conducted at both term and concept levels using an NLP tool, called MTERMS, followed by a manual review conducted by domain experts who created a gold standard mapping. The gold standard was used to assess the overall mapping between MDD and RxNorm and evaluate the performance of MTERMS. Overall, 74.7% of MDD terms and 82.8% of the top 99 terms had an exact semantic match to RxNorm. Compared to the gold standard, MTERMS achieved a precision of 99.8% and a recall of 73.9% when mapping all MDD terms, and a precision of 100% and a recall of 72.6% when mapping the top prescribed medications. The challenges and gaps in mapping MDD to RxNorm are mainly due to unique user or application requirements for representing drug concepts and the different modeling approaches inherent in the two terminologies. An automated approach based on NLP followed by human expert review is an efficient and feasible way for conducting dynamic mapping. Copyright © 2011 Elsevier Inc. All rights reserved.

  3. A System for Identifying Named Entities in Biomedical Text: how Results From two Evaluations Reflect on Both the System and the Evaluations

    PubMed Central

    Dingare, Shipra; Nissim, Malvina; Finkel, Jenny; Grover, Claire

    2005-01-01

    We present a maximum entropy-based system for identifying named entities (NEs) in biomedical abstracts and present its performance in the only two biomedical named entity recognition (NER) comparative evaluations that have been held to date, namely BioCreative and Coling BioNLP. Our system obtained an exact match F-score of 83.2% in the BioCreative evaluation and 70.1% in the BioNLP evaluation. We discuss our system in detail, including its rich use of local features, attention to correct boundary identification, innovative use of external knowledge resources, including parsing and web searches, and rapid adaptation to new NE sets. We also discuss in depth problems with data annotation in the evaluations which caused the final performance to be lower than optimal. PMID:18629295

  4. NLP as a communication strategy tool in libraries

    NASA Astrophysics Data System (ADS)

    Koulouris, Alexandros; Sakas, Damianos P.; Giannakopoulos, Georgios

    2015-02-01

    Communication is a catalyst for the proper functioning of an organization. This paper focuses on libraries, where communication is crucial for success. In our opinion, libraries in Greece are suffering from the lack of a communication and marketing strategy. Communication has many forms and manifestations. A key aspect of communication is body language, for which neuro-linguistic programming (NLP) is a dominant communication tool. Body language is a system that expresses and transfers messages, thoughts and emotions. More and more organizations in the public sector and companies in the private sector base their success on the communication skills of their personnel. NLP suggests several methods to obtain excellent relations in the workplace and to develop ideal communication. NLP theory is mainly based on the development of standards (a communication model) that guarantee the expected results. This research was conducted and analyzed in two parts, the qualitative and the quantitative. The findings mainly confirm the need for proper communication within libraries. In the qualitative research, the interviewees were aware of communication issues, although some gaps in that knowledge were observed. Even this slight lack of knowledge highlights the need for constant information through educational programs. This is particularly necessary for senior executives of libraries, who should attend relevant seminars and refresh their knowledge on communication-related issues.

  5. Head injury assessment of non-lethal projectile impacts: A combined experimental/computational method.

    PubMed

    Sahoo, Debasis; Robbe, Cyril; Deck, Caroline; Meyer, Frank; Papy, Alexandre; Willinger, Remy

    2016-11-01

    The main objective of this study is to develop a methodology to assess this risk based on experimental tests versus numerical predictive head injury simulations. A total of 16 non-lethal projectile (NLP) impacts were conducted with a rigid force plate at three different ranges of impact velocity (120, 72 and 55 m/s), and the force/deformation-time data were used for the validation of the finite element (FE) NLP. Good agreement between experimental and simulation data was obtained during validation of the FE NLP, with a high correlation value (>0.98) and a peak force discrepancy of less than 3%. A state-of-the-art finite element head model with enhanced brain and skull material laws and specific head injury criteria was used for numerical computation of NLP impacts. Frontal and lateral FE NLP impacts to the head model at different velocities were performed under LS-DYNA. This is the first time that the lethality of NLP has been assessed by axonal strain computation to predict diffuse axonal injury (DAI) in NLP impacts to the head. In the case of temporo-parietal impact, the min-max risk of DAI is 0-86%. With a velocity above 99.2 m/s there is a greater than 50% risk of DAI for temporo-parietal impacts. All the medium- and high-velocity impacts are susceptible to skull fracture, with a percentage risk higher than 90%. This study provides a tool for realistic injury (DAI and skull fracture) assessment during NLP impacts to the human head. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Nesfatin-1-like peptide is a novel metabolic factor that suppresses feeding, and regulates whole-body energy homeostasis in male Wistar rats

    PubMed Central

    Gawli, Kavishankar; Ramesh, Naresh

    2017-01-01

    Nucleobindin-1 has high sequence similarity to nucleobindin-2, which encodes the anorectic and metabolic peptide, nesfatin-1. We previously reported a nesfatin-1-like peptide (NLP), anorectic in fish and insulinotropic in mice islet beta-like cells. The main objective of this research was to determine whether NLP is a metabolic regulator in male Wistar rats. A single intraperitoneal (IP) injection of NLP (100 μg/kg BW) decreased food intake and increased ambulatory movement, without causing any change in total activity or energy expenditure when compared to saline-treated rats. Continuous subcutaneous infusion of NLP (100 μg/kg BW) using osmotic mini-pumps for 7 days caused a reduction in food intake on days 3 and 4. Similarly, water intake was also reduced for two days (days 3 and 4) with the effect being observed during the dark phase. This was accompanied by an increased RER and energy expenditure. However, decreased whole-body fat oxidation, and total activity were observed during the long-term treatment (7 days). Body weight gain was not significantly different between control and NLP infused rats. The expression of mRNAs encoding adiponectin, resistin, ghrelin, cholecystokinin and uncoupling protein 1 (UCP1) were significantly upregulated, while leptin and peptide YY mRNA expression was downregulated in NLP-treated rats. These findings indicate that administration of NLP at 100 μg/kg BW reduces food intake and modulates whole body energy balance. In summary, NLP is a novel metabolic peptide in rats. PMID:28542568

  7. Identifying QT prolongation from ECG impressions using a general-purpose Natural Language Processor

    PubMed Central

    Denny, Joshua C.; Miller, Randolph A.; Waitman, Lemuel Russell; Arrieta, Mark; Peterson, Joshua F.

    2009-01-01

    Objective: Typically detected via electrocardiograms (ECGs), QT interval prolongation is a known risk factor for sudden cardiac death. Since medications can promote or exacerbate the condition, detection of QT interval prolongation is important for clinical decision support. We investigated the accuracy of natural language processing (NLP) for identifying QT prolongation from cardiologist-generated, free-text ECG impressions compared to corrected QT (QTc) thresholds reported by ECG machines. Methods: After integrating negation detection to a locally-developed natural language processor, the KnowledgeMap concept identifier, we evaluated NLP-based detection of QT prolongation compared to the calculated QTc on a set of 44,318 ECGs obtained from hospitalized patients. We also created a string query using regular expressions to identify QT prolongation. We calculated sensitivity and specificity of the methods using manual physician review of the cardiologist-generated reports as the gold standard. To investigate causes of “false positive” calculated QTc, we manually reviewed randomly selected ECGs with a long calculated QTc but no mention of QT prolongation. Separately, we validated the performance of the negation detection algorithm on 5,000 manually-categorized ECG phrases for any medical concept (not limited to QT prolongation) prior to developing the NLP query for QT prolongation. Results: The NLP query for QT prolongation correctly identified 2,364 of 2,373 ECGs with QT prolongation with a sensitivity of 0.996 and a positive predictive value of 1.000. There were no false positives. The regular expression query had a sensitivity of 0.999 and positive predictive value of 0.982. In contrast, the positive predictive value of common QTc thresholds derived from ECG machines was 0.07–0.25 with corresponding sensitivities of 0.994–0.046. The negation detection algorithm had a recall of 0.973 and precision of 0.982 for 10,490 concepts found within ECG impressions. Conclusions: NLP and regular expression queries of cardiologists’ ECG interpretations can more effectively identify QT prolongation than the automated QTc intervals reported by ECG machines. Future clinical decision support could employ NLP queries to detect QTc prolongation and other reported ECG abnormalities. PMID:18938105
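
    The abstract mentions a regular-expression string query and a negation detection step but does not give the expressions or the algorithm. The following is a minimal sketch of the general idea, assuming entirely hypothetical patterns and negation cues; it is not the authors' query or the KnowledgeMap negation algorithm. It also recomputes the reported sensitivity from the counts given in the abstract.

```python
import re

# Hypothetical mention patterns (assumption; the paper's expressions are not given).
QT_MENTION = re.compile(
    r"""
    \b(?:
        prolonged\s+qtc?(?:\s+interval)?                          # "prolonged QT interval"
      | long\s+qtc?(?:\s+interval)?                               # "long QTc"
      | qtc?(?:\s+interval)?\s+(?:is\s+)?prolong(?:ed|ation)      # "QT prolongation"
    )\b
    """,
    re.IGNORECASE | re.VERBOSE,
)

# Hypothetical negation cues, checked only within the clause preceding a mention.
NEGATION_CUE = re.compile(r"\b(?:no|not|without|negative\s+for)\b", re.IGNORECASE)


def mentions_qt_prolongation(impression: str) -> bool:
    """True if the text contains a QT prolongation mention that is not negated."""
    for match in QT_MENTION.finditer(impression):
        # Text of the current clause up to the start of the mention.
        clause = re.split(r"[.;]", impression[: match.start()])[-1]
        if not NEGATION_CUE.search(clause):
            return True
    return False


print(mentions_qt_prolongation("Sinus rhythm. Prolonged QT interval."))
print(mentions_qt_prolongation("No evidence of QT prolongation."))

# Sensitivity from the counts reported in the abstract.
sensitivity = 2364 / 2373
print(f"sensitivity = {sensitivity:.3f}")
```

    A clause-level cue check like this is far cruder than a dedicated negation algorithm, but it shows why negation handling matters: without it, the second impression above would count as a positive mention.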

  8. Enhancing Comparative Effectiveness Research With Automated Pediatric Pneumonia Detection in a Multi-Institutional Clinical Repository: A PHIS+ Pilot Study.

    PubMed

    Meystre, Stephane; Gouripeddi, Ramkiran; Tieder, Joel; Simmons, Jeffrey; Srivastava, Rajendu; Shah, Samir

    2017-05-15

    Community-acquired pneumonia is a leading cause of pediatric morbidity. Administrative data are often used to conduct comparative effectiveness research (CER) with sufficient sample sizes to enhance detection of important outcomes. However, such studies are prone to misclassification errors because of the variable accuracy of discharge diagnosis codes. The aim of this study was to develop an automated, scalable, and accurate method to determine the presence or absence of pneumonia in children using chest imaging reports. The multi-institutional PHIS+ clinical repository was developed to support pediatric CER by expanding an administrative database of children's hospitals with detailed clinical data. To develop a scalable approach to find patients with bacterial pneumonia more accurately, we developed a Natural Language Processing (NLP) application to extract relevant information from chest diagnostic imaging reports. Domain experts established a reference standard by manually annotating 282 reports to train and then test the NLP application. Findings of pleural effusion, pulmonary infiltrate, and pneumonia were automatically extracted from the reports and then used to automatically classify whether a report was consistent with bacterial pneumonia. Compared with the annotated diagnostic imaging reports reference standard, the most accurate implementation of machine learning algorithms in our NLP application allowed extracting relevant findings with a sensitivity of .939 and a positive predictive value of .925. It allowed classifying reports with a sensitivity of .71, a positive predictive value of .86, and a specificity of .962. When compared with each of the domain experts manually annotating these reports, the NLP application allowed for significantly higher sensitivity (.71 vs .527) and similar positive predictive value and specificity. NLP-based pneumonia information extraction of pediatric diagnostic imaging reports performed better than domain experts in this pilot study. NLP is an efficient method to extract information from a large collection of imaging reports to facilitate CER. ©Stephane Meystre, Ramkiran Gouripeddi, Joel Tieder, Jeffrey Simmons, Rajendu Srivastava, Samir Shah. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.05.2017.

  9. Neuro-Linguistic Programming in Couple Therapy.

    ERIC Educational Resources Information Center

    Forman, Bruce D.

    Neuro-Linguistic Programming (NLP) is a method of understanding the organization of subjective human experience. The NLP model provides a theoretical framework for directing or guiding therapeutic change. According to NLP, people experience the so-called real world indirectly and operate on the real world as if it were like the model of it they…

  10. The Role of NLP in Teachers' Classroom Discourse

    ERIC Educational Resources Information Center

    Millrood, Radislav

    2004-01-01

    Neuro-linguistic programming (NLP) is an approach to language teaching which is claimed to help achieve excellence in learner performance. Yet there is little evidence of the impact that NLP techniques in teachers' discourse can have on learners. The article draws on workshops with teachers where classroom simulations were used to raise teachers'…

  11. A generalizable NLP framework for fast development of pattern-based biomedical relation extraction systems.

    PubMed

    Peng, Yifan; Torii, Manabu; Wu, Cathy H; Vijay-Shanker, K

    2014-08-23

    Text mining is increasingly used in the biomedical domain because of its ability to automatically gather information from large numbers of scientific articles. One important task in biomedical text mining is relation extraction, which aims to identify designated relations among biological entities reported in the literature. A relation extraction system achieving high performance is expensive to develop because of the substantial time and effort required for its design and implementation. Here, we report a novel framework to facilitate the development of a pattern-based biomedical relation extraction system. It has several unique design features: (1) leveraging syntactic variations possible in a language and automatically generating extraction patterns in a systematic manner, (2) applying sentence simplification to improve the coverage of extraction patterns, and (3) identifying referential relations between a syntactic argument of a predicate and the actual target expected in the relation extraction task. A relation extraction system derived using the proposed framework achieved overall F-scores of 72.66% for the Simple events and 55.57% for the Binding events on the BioNLP-ST 2011 GE test set, comparing favorably with the top performing systems that participated in the BioNLP-ST 2011 GE task. We obtained similar results on the BioNLP-ST 2013 GE test set (80.07% and 60.58%, respectively). We conducted additional experiments on the training and development sets to provide a more detailed analysis of the system and its individual modules. This analysis indicates that without increasing the number of patterns, simplification and referential relation linking play a key role in the effective extraction of biomedical relations. In this paper, we present a novel framework for fast development of relation extraction systems. The framework requires only a list of triggers as input, and does not need information from an annotated corpus. Thus, we reduce the involvement of domain experts, who would otherwise have to provide manual annotations and help with the design of hand-crafted patterns. We demonstrate how our framework is used to develop a system which achieves state-of-the-art performance on a public benchmark corpus.

  12. Ease of adoption of clinical natural language processing software: An evaluation of five systems.

    PubMed

    Zheng, Kai; Vydiswaran, V G Vinod; Liu, Yang; Wang, Yue; Stubbs, Amber; Uzuner, Özlem; Gururaj, Anupama E; Bayer, Samuel; Aberdeen, John; Rumshisky, Anna; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua

    2015-12-01

    In recognition of potential barriers that may inhibit the widespread adoption of biomedical software, the 2014 i2b2 Challenge introduced a special track, Track 3 - Software Usability Assessment, in order to develop a better understanding of the adoption issues that might be associated with state-of-the-art clinical NLP systems. This paper reports the ease of adoption assessment methods we developed for this track, and the results of evaluating five clinical NLP system submissions. A team of human evaluators performed a series of scripted adoptability test tasks with each of the participating systems. The evaluation team consisted of four "expert evaluators" with training in computer science, and eight "end user evaluators" with mixed backgrounds in medicine, nursing, pharmacy, and health informatics. We assessed how easy it is to adopt the submitted systems along the following three dimensions: communication effectiveness (i.e., how effective a system is in communicating its designed objectives to its intended audience), effort required to install, and effort required to use. We used a formal software usability testing tool, TURF, to record the evaluators' interactions with the systems and 'think-aloud' data revealing their thought processes when installing and using the systems and when resolving unexpected issues. Overall, the ease of adoption ratings that the five systems received are unsatisfactory. Installation of some of the systems proved to be rather difficult, and some systems failed to adequately communicate their designed objectives to intended adopters. Further, the average ratings provided by the end user evaluators on ease of use and ease of interpreting output are -0.35 and -0.53, respectively, indicating that this group of users generally deemed the systems extremely difficult to work with. While the ratings provided by the expert evaluators are higher, 0.6 and 0.45, respectively, these ratings are still low, indicating that they also experienced considerable struggles. The results of the Track 3 evaluation show that the adoptability of the five participating clinical NLP systems has a great margin for improvement. Remedy strategies suggested by the evaluators included (1) more detailed and operating-system-specific use instructions; (2) provision of more pertinent onscreen feedback for easier diagnosis of problems; (3) including screen walk-throughs in use instructions so users know what to expect and what might have gone wrong; (4) avoiding jargon and acronyms in materials intended for end users; and (5) packaging required prerequisites within software distributions so that prospective adopters of the software do not have to obtain each of the third-party components on their own. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Non-abelian factorisation for next-to-leading-power threshold logarithms

    NASA Astrophysics Data System (ADS)

    Bonocore, D.; Laenen, E.; Magnea, L.; Vernazza, L.; White, C. D.

    2016-12-01

    Soft and collinear radiation is responsible for large corrections to many hadronic cross sections, near thresholds for the production of heavy final states. There is much interest in extending our understanding of this radiation to next-to-leading power (NLP) in the threshold expansion. In this paper, we generalise a previously proposed all-order NLP factorisation formula to include non-abelian corrections. We define a nonabelian radiative jet function, organising collinear enhancements at NLP, and compute it for quark jets at one loop. We discuss in detail the issue of double counting between soft and collinear regions. Finally, we verify our prescription by reproducing all NLP logarithms in Drell-Yan production up to NNLO, including those associated with double real emission. Our results constitute an important step in the development of a fully general resummation formalism for NLP threshold effects.

  14. Assessment of commercial NLP engines for medication information extraction from dictated clinical notes.

    PubMed

    Jagannathan, V; Mullett, Charles J; Arbogast, James G; Halbritter, Kevin A; Yellapragada, Deepthi; Regulapati, Sushmitha; Bandaru, Pavani

    2009-04-01

    We assessed the current state of commercial natural language processing (NLP) engines for their ability to extract medication information from textual clinical documents. Two thousand de-identified discharge summaries and family practice notes were submitted to four commercial NLP engines with the request to extract all medication information. The four sets of returned results were combined to create a comparison standard which was validated against a manual, physician-derived gold standard created from a subset of 100 reports. Once validated, the individual vendor results for medication names, strengths, route, and frequency were compared against this automated standard with precision, recall, and F measures calculated. Compared with the manual, physician-derived gold standard, the automated standard was successful at accurately capturing medication names (F measure=93.2%), but performed less well with strength (85.3%) and route (80.3%), and relatively poorly with dosing frequency (48.3%). Moderate variability was seen in the strengths of the four vendors. The vendors performed better with the structured discharge summaries than with the clinic notes in an analysis comparing the two document types. Although automated extraction may serve as the foundation for a manual review process, it is not ready to automate medication lists without human intervention.

  15. Automatic Figure Ranking and User Interfacing for Intelligent Figure Search

    PubMed Central

    Yu, Hong; Liu, Feifan; Ramesh, Balaji Polepalli

    2010-01-01

    Background Figures are important experimental results that are typically reported in full-text bioscience articles. Bioscience researchers need to access figures to validate research facts and to formulate or to test novel research hypotheses. On the other hand, the sheer volume of bioscience literature has made it difficult to access figures. Therefore, we are developing an intelligent figure search engine (http://figuresearch.askhermes.org). Existing research in figure search treats each figure equally, but we introduce a novel concept of “figure ranking”: figures appearing in a full-text biomedical article can be ranked by their contribution to the knowledge discovery. Methodology/Findings We empirically validated the hypothesis of figure ranking with over 100 bioscience researchers, and then developed unsupervised natural language processing (NLP) approaches to automatically rank figures. Evaluating on a collection of 202 full-text articles in which authors have ranked the figures based on importance, our best system achieved a weighted error rate of 0.2, which is significantly better than several other baseline systems we explored. We further explored a user interfacing application in which we built novel user interfaces (UIs) incorporating figure ranking, allowing bioscience researchers to efficiently access important figures. Our evaluation results show that 92% of the bioscience researchers prefer as the top two choices the user interfaces in which the most important figures are enlarged. With our automatic figure ranking NLP system, bioscience researchers preferred the UIs in which the most important figures were predicted by our NLP system than the UIs in which the most important figures were randomly assigned. In addition, our results show that there was no statistical difference in bioscience researchers' preference in the UIs generated by automatic figure ranking and UIs by human ranking annotation. 
    Conclusion/Significance The evaluation results conclude that automatic figure ranking and user interfacing as we reported in this study can be fully implemented in online publishing. The novel user interface integrated with the automatic figure ranking system provides a more efficient and robust way to access scientific information in the biomedical domain, which will further enhance our existing figure search engine to better facilitate accessing figures of interest for bioscientists. PMID:20949102

  16. A Qualitative Investigation into the Experience of Neuro-Linguistic Programming Certification Training among Japanese Career Consultants

    ERIC Educational Resources Information Center

    Kotera, Yasuhiro

    2018-01-01

    Although the application of neuro-linguistic programming (NLP) has been reported worldwide, its scientific investigation is limited. Career consulting is one of the fields where NLP has been increasingly applied in Japan. This study explored why career consultants undertake NLP training, and what they find most useful to their practice. Thematic…

  17. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments

    ERIC Educational Resources Information Center

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  18. Retrieval of radiology reports citing critical findings with disease-specific customization.

    PubMed

    Lacson, Ronilda; Sugarbaker, Nathanael; Prevedello, Luciano M; Ivan, Ip; Mar, Wendy; Andriole, Katherine P; Khorasani, Ramin

    2012-01-01

    Communication of critical results from diagnostic procedures between caregivers is a Joint Commission national patient safety goal. Evaluating critical result communication often requires manual analysis of voluminous data, especially when reviewing unstructured textual results of radiologic findings. Information retrieval (IR) tools can facilitate this process by enabling automated retrieval of radiology reports that cite critical imaging findings. However, IR tools that have been developed for one disease or imaging modality often need substantial reconfiguration before they can be utilized for another disease entity. This paper 1) describes the process of customizing two Natural Language Processing (NLP) and Information Retrieval/Extraction applications - an open-source toolkit, A Nearly New Information Extraction system (ANNIE); and an application developed in-house, Information for Searching Content with an Ontology-Utilizing Toolkit (iSCOUT) - to illustrate the varying levels of customization required for different disease entities; and 2) evaluates each application's performance in identifying and retrieving radiology reports citing critical imaging findings for three distinct diseases: pulmonary nodule, pneumothorax, and pulmonary embolus. Both applications can be utilized for retrieval. iSCOUT and ANNIE had precision values between 0.90 and 0.98 and recall values between 0.79 and 0.94. ANNIE had consistently higher precision but required more customization. Understanding the customizations involved in utilizing NLP applications for various diseases will enable users to select the most suitable tool for specific tasks.

  19. Retrieval of Radiology Reports Citing Critical Findings with Disease-Specific Customization

    PubMed Central

    Lacson, Ronilda; Sugarbaker, Nathanael; Prevedello, Luciano M; Ivan, IP; Mar, Wendy; Andriole, Katherine P; Khorasani, Ramin

    2012-01-01

    Background: Communication of critical results from diagnostic procedures between caregivers is a Joint Commission national patient safety goal. Evaluating critical result communication often requires manual analysis of voluminous data, especially when reviewing unstructured textual results of radiologic findings. Information retrieval (IR) tools can facilitate this process by enabling automated retrieval of radiology reports that cite critical imaging findings. However, IR tools that have been developed for one disease or imaging modality often need substantial reconfiguration before they can be utilized for another disease entity. Purpose: This paper: 1) describes the process of customizing two Natural Language Processing (NLP) and Information Retrieval/Extraction applications – an open-source toolkit, A Nearly New Information Extraction system (ANNIE); and an application developed in-house, Information for Searching Content with an Ontology-Utilizing Toolkit (iSCOUT) – to illustrate the varying levels of customization required for different disease entities and; 2) evaluates each application’s performance in identifying and retrieving radiology reports citing critical imaging findings for three distinct diseases, pulmonary nodule, pneumothorax, and pulmonary embolus. Results: Both applications can be utilized for retrieval. iSCOUT and ANNIE had precision values between 0.90-0.98 and recall values between 0.79 and 0.94. ANNIE had consistently higher precision but required more customization. Conclusion: Understanding the customizations involved in utilizing NLP applications for various diseases will enable users to select the most suitable tool for specific tasks. PMID:22934127

  20. Concordium 2015: Strategic Uses of Evidence to Transform Delivery Systems

    PubMed Central

    Holve, Erin; Weiss, Samantha

    2016-01-01

    In September 2015 the EDM Forum hosted AcademyHealth’s newest national conference, Concordium. The 11 papers featured in the eGEMs “Concordium 2015” special issue successfully reflect the major themes and issues discussed at the meeting. Many of the papers address informatics or methodological approaches to natural language processing (NLP) or text analysis, which is indicative of the importance of analyzing text data to gain insights into care coordination and patient-centered outcomes. Perspectives on the tools and infrastructure requirements that are needed to build learning health systems were also recurrent themes. PMID:27683671

  1. A review of approaches to identifying patient phenotype cohorts using electronic health records

    PubMed Central

    Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric; Embi, Peter J; Elhadad, Noemie; Johnson, Stephen B; Lai, Albert M

    2014-01-01

    Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full text article published in (1) Journal of American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, while 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining cohort eligibility of patients. Discussion We observe that there is a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over the years in comparison with rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at respective institutions. However, no system makes comprehensive use of electronic medical records addressing all of their known weaknesses. PMID:24201027

  2. Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes

    PubMed Central

    Pantazatos, Spiro P.; Li, Jianrong; Pavlidis, Paul; Lussier, Yves A.

    2009-01-01

    An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets. PMID:20495688

  3. OntoMate: a text-mining tool aiding curation at the Rat Genome Database

    PubMed Central

    Liu, Weisong; Laulederkind, Stanley J. F.; Hayman, G. Thomas; Wang, Shur-Jen; Nigam, Rajni; Smith, Jennifer R.; De Pons, Jeff; Dwinell, Melinda R.; Shimoyama, Mary

    2015-01-01

    The Rat Genome Database (RGD) is the premier repository of rat genomic, genetic and physiologic data. Converting data from free text in the scientific literature to a structured format is one of the main tasks of all model organism databases. RGD spends considerable effort manually curating gene, Quantitative Trait Locus (QTL) and strain information. The rapidly growing volume of biomedical literature and the active research in the biological natural language processing (bioNLP) community have given RGD the impetus to adopt text-mining tools to improve curation efficiency. Recently, RGD has initiated a project to use OntoMate, an ontology-driven, concept-based literature search engine developed at RGD, as a replacement for the PubMed (http://www.ncbi.nlm.nih.gov/pubmed) search engine in the gene curation workflow. OntoMate tags abstracts with gene names, gene mutations, organism name and most of the 16 ontologies/vocabularies used at RGD. All terms/ entities tagged to an abstract are listed with the abstract in the search results. All listed terms are linked both to data entry boxes and a term browser in the curation tool. OntoMate also provides user-activated filters for species, date and other parameters relevant to the literature search. Using the system for literature search and import has streamlined the process compared to using PubMed. The system was built with a scalable and open architecture, including features specifically designed to accelerate the RGD gene curation process. With the use of bioNLP tools, RGD has added more automation to its curation workflow. Database URL: http://rgd.mcw.edu PMID:25619558

  4. Heavy quarkonium production at collider energies: Factorization and evolution

    NASA Astrophysics Data System (ADS)

    Kang, Zhong-Bo; Ma, Yan-Qing; Qiu, Jian-Wei; Sterman, George

    2014-08-01

    We present a perturbative QCD factorization formalism for inclusive production of heavy quarkonia of large transverse momentum, pT at collider energies, including both leading power (LP) and next-to-leading power (NLP) behavior in pT. We demonstrate that both LP and NLP contributions can be factorized in terms of perturbatively calculable short-distance partonic coefficient functions and universal nonperturbative fragmentation functions, and derive the evolution equations that are implied by the factorization. We identify projection operators for all channels of the factorized LP and NLP infrared safe short-distance partonic hard parts, and corresponding operator definitions of fragmentation functions. For the NLP, we focus on the contributions involving the production of a heavy quark pair, a necessary condition for producing a heavy quarkonium. We evaluate the first nontrivial order of evolution kernels for all relevant fragmentation functions, and discuss the role of NLP contributions.

  5. Informatics can identify systemic sclerosis (SSc) patients at risk for scleroderma renal crisis

    PubMed Central

    Redd, Doug; Frech, Tracy M.; Murtaugh, Maureen A.; Rhiannon, Julia; Zeng, Qing T.

    2016-01-01

    Background Electronic medical records (EMR) provide an ideal opportunity for the detection, diagnosis, and management of systemic sclerosis (SSc) patients within the Veterans Health Administration (VHA). The objective of this project was to use informatics to identify potential SSc patients in the VHA that were on prednisone, in order to inform an outreach project to prevent scleroderma renal crisis (SRC). Methods The electronic medical data for this study came from Veterans Informatics and Computing Infrastructure (VINCI). For natural language processing (NLP) analysis, a set of retrieval criteria was developed for documents expected to have a high correlation to SSc. The two annotators reviewed the ratings to assemble a single adjudicated set of ratings, from which a support vector machine (SVM) based document classifier was trained. Any patient having at least one document positively classified for SSc was considered positive for SSc and the use of prednisone ≥ 10 mg in the clinical document was reviewed to determine whether it was an active medication on the prescription list. Results In the VHA, there were 4,272 patients that have a diagnosis of SSc determined by the presence of an ICD-9 code. From these patients, 1,118 patients (21%) had the use of prednisone ≥ 10 mg. Of these patients, 26 had a concurrent diagnosis of hypertension, thus these patients should not be on prednisone. By the use of natural language processing (NLP) an additional 16,522 patients were identified as possible SSc, highlighting that cases of SSc in the VHA may exist that are unidentified by ICD-9. A 10-fold cross validation of the classifier resulted in a precision (positive predictive value) of 0.814, recall (sensitivity) of 0.973, and f-measure of 0.873. Conclusions Our study demonstrated that current clinical practice in the VHA includes the potentially dangerous use of prednisone for veterans with SSc. The present study also suggests there may be many undetected cases of SSc and that NLP can successfully identify these patients. PMID:25168254

  6. Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals.

    PubMed

    Connolly, Brian; Matykiewicz, Pawel; Bretonnel Cohen, K; Standridge, Shannon M; Glauser, Tracy A; Dlugos, Dennis J; Koh, Susan; Tham, Eric; Pestian, John

    2014-01-01

    The constant progress in computational linguistic methods provides amazing opportunities for discovering information in clinical text and enables the clinical scientist to explore novel approaches to care. However, these new approaches need evaluation. We describe an automated system to compare descriptions of epilepsy patients at three different organizations: Cincinnati Children's Hospital, the Children's Hospital Colorado, and the Children's Hospital of Philadelphia. To our knowledge, there have been no similar previous studies. In this work, a support vector machine (SVM)-based natural language processing (NLP) algorithm is trained to classify epilepsy progress notes as belonging to a patient with a specific type of epilepsy from a particular hospital. The same SVM is then used to classify notes from another hospital. Our null hypothesis is that an NLP algorithm cannot be trained using epilepsy-specific notes from one hospital and subsequently used to classify notes from another hospital better than a random baseline classifier. The hypothesis is tested using epilepsy progress notes from the three hospitals. We are able to reject the null hypothesis at the 95% level. It is also found that classification was improved by including notes from a second hospital in the SVM training sample. With a reasonably uniform epilepsy vocabulary and an NLP-based algorithm able to use this uniformity to classify epilepsy progress notes across different hospitals, we can pursue automated comparisons of patient conditions, treatments, and diagnoses across different healthcare settings.

  7. A combined NLP-differential evolution algorithm approach for the optimization of looped water distribution systems

    NASA Astrophysics Data System (ADS)

    Zheng, Feifei; Simpson, Angus R.; Zecchin, Aaron C.

    2011-08-01

    This paper proposes a novel optimization approach for the least cost design of looped water distribution systems (WDSs). Three distinct steps are involved in the proposed optimization approach. In the first step, the shortest-distance tree within the looped network is identified using the Dijkstra graph theory algorithm, for which an extension is proposed to find the shortest-distance tree for multisource WDSs. In the second step, a nonlinear programming (NLP) solver is employed to optimize the pipe diameters for the shortest-distance tree (chords of the shortest-distance tree are allocated the minimum allowable pipe sizes). Finally, in the third step, the original looped water network is optimized using a differential evolution (DE) algorithm seeded with diameters in the proximity of the continuous pipe sizes obtained in step two. As such, the proposed optimization approach combines the traditional deterministic optimization technique of NLP with the emerging evolutionary algorithm DE via the proposed network decomposition. The proposed methodology has been tested on four looped WDSs with the number of decision variables ranging from 21 to 454. Results obtained show the proposed approach is able to find optimal solutions with significantly less computational effort than other optimization techniques.
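
    The first step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the multisource extension amounts to seeding Dijkstra's algorithm with every source at distance zero (the paper's own extension may differ), and the names `shortest_distance_tree`, `chords`, and the toy `network` are hypothetical. Edges left out of the resulting tree are the chords that the abstract says receive the minimum allowable pipe size.

```python
import heapq

def shortest_distance_tree(graph, sources):
    """Multi-source Dijkstra over an undirected graph.

    graph: {node: {neighbor: pipe_length}}
    Returns (dist, parent); parent[v] is the upstream node of v in the tree.
    Assumption: all sources start at distance zero.
    """
    dist = {s: 0.0 for s in sources}
    parent = {s: None for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale queue entry
        done.add(u)
        for v, length in graph[u].items():
            nd = d + length
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def chords(graph, parent):
    """Edges of the looped network that are not part of the tree."""
    tree = {frozenset((v, p)) for v, p in parent.items() if p is not None}
    all_edges = {frozenset((u, v)) for u in graph for v in graph[u]}
    return all_edges - tree

# Hypothetical four-node looped network with one reservoir "R".
network = {
    "R": {"A": 100, "B": 300},
    "A": {"R": 100, "B": 100, "C": 200},
    "B": {"R": 300, "A": 100, "C": 50},
    "C": {"A": 200, "B": 50},
}
dist, parent = shortest_distance_tree(network, ["R"])
loop_chords = chords(network, parent)
```

    In this toy network the tree is R-A-B-C, and the two remaining pipes (R-B and A-C) are the chords. The NLP and differential evolution steps are not sketched here because the abstract gives no detail on their formulation.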

  8. Towards a semantic lexicon for biological language processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verspoor, K.

    It is well understood that natural language processing (NLP) applications require sophisticated lexical resources to support their processing goals. In the biomedical domain, we are privileged to have access to extensive terminological resources in the form of controlled vocabularies and ontologies, which have been integrated into the framework of the National Library of Medicine's Unified Medical Language System's (UMLS) Metathesaurus. However, the existence of such terminological resources does not guarantee their utility for NLP. In particular, we have two core requirements for lexical resources for NLP in addition to the basic enumeration of important domain terms: representation of morphosyntactic information about those terms, specifically part of speech information and inflectional patterns to support parsing and lemma assignment, and representation of semantic information indicating general categorical information about terms, and significant relations between terms to support text understanding and inference (Hahn et al., 1999). Biomedical vocabularies by and large commonly leave out morphosyntactic information, and where they address semantic considerations, they often do so in an unprincipled manner, for instance by indicating a relation between two concepts without indicating the type of that relation. But all is not lost. The UMLS knowledge sources include two additional resources which are relevant - the SPECIALIST lexicon, a lexicon addressing our morphosyntactic requirements, and the Semantic Network, a representation of core conceptual categories in the biomedical domain. The coverage of these two knowledge sources with respect to the full coverage of the Metathesaurus is, however, not entirely clear. Furthermore, when our goals are specifically to process biological text - and often more specifically, text in the molecular biology domain - it is difficult to say whether the coverage of these resources is meaningful. The utility of the UMLS knowledge sources for medical language processing (MLP) has been explored (Johnson, 1999; Friedman et al., 2001); the time has now come to repeat these experiments with respect to biological language processing (BLP). To that end, this paper presents an analysis of the UMLS resources, specifically with an eye towards constructing lexical resources suitable for BLP. We follow the paradigm presented in Johnson (1999) for medical language, exploring overlap between the UMLS Metathesaurus and SPECIALIST lexicon to construct a morphosyntactic and semantically-specified lexicon, and then further explore the overlap with a relevant domain corpus for molecular biology.

  9. Applying natural language processing techniques to develop a task-specific EMR interface for timely stroke thrombolysis: A feasibility study.

    PubMed

    Sung, Sheng-Feng; Chen, Kuanchin; Wu, Darren Philbert; Hung, Ling-Chien; Su, Yu-Hsiang; Hu, Ya-Han

    2018-04-01

    To reduce errors in determining eligibility for intravenous thrombolytic therapy (IVT) in stroke patients through use of an enhanced task-specific electronic medical record (EMR) interface powered by natural language processing (NLP) techniques. The information processing algorithm utilized MetaMap to extract medical concepts from IVT eligibility criteria and expanded the concepts using the Unified Medical Language System Metathesaurus. Concepts identified from clinical notes by MetaMap were compared to those from IVT eligibility criteria. The task-specific EMR interface displays IVT-relevant information by highlighting phrases that contain matched concepts. Clinical usability was assessed with clinicians staffing the acute stroke team by comparing user performance while using the task-specific and the current EMR interfaces. The algorithm identified IVT-relevant concepts with micro-averaged precisions, recalls, and F1 measures of 0.998, 0.812, and 0.895 at the phrase level and of 1, 0.972, and 0.986 at the document level. Users using the task-specific interface achieved a higher accuracy score than those using the current interface (91% versus 80%, p = 0.016) in assessing the IVT eligibility criteria. The completion time between the interfaces was statistically similar (2.46 min versus 1.70 min, p = 0.754). Although the information processing algorithm had room for improvement, the task-specific EMR interface significantly reduced errors in assessing IVT eligibility criteria. The study findings provide evidence to support an NLP enhanced EMR system to facilitate IVT decision-making by presenting meaningful and timely information to clinicians, thereby offering a new avenue for improvements in acute stroke care.
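
    The F1 measures quoted in this abstract follow from the reported precision and recall, since F1 is their harmonic mean. The short check below uses only the figures given above; the function name `f_measure` and its `beta` parameter are illustrative choices, not anything taken from the study.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Precision and recall as reported in the abstract.
phrase_f1 = f_measure(0.998, 0.812)
document_f1 = f_measure(1.0, 0.972)
print(round(phrase_f1, 3), round(document_f1, 3))  # 0.895 0.986
```

    Both values agree with the abstract to three decimal places. The same identity need not hold when a score is averaged over cross-validation folds, so it should not be used to check F-measures reported that way.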

  10. Ascent guidance algorithm using lidar wind measurements

    NASA Technical Reports Server (NTRS)

    Cramer, Evin J.; Bradt, Jerre E.; Hardtla, John W.

    1990-01-01

    The formulation of a general nonlinear programming guidance algorithm that incorporates wind measurements in the computation of ascent guidance steering commands is discussed. A nonlinear programming (NLP) algorithm that is designed to solve a very general problem has the potential to address the diversity demanded by future launch systems. Using B-splines for the command functional form allows the NLP algorithm to adjust the shape of the command profile to achieve optimal performance. The algorithm flexibility is demonstrated by simulation of ascent with dynamic loading constraints through a set of random wind profiles with and without wind sensing capability.

  11. Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation.

    PubMed

    Ferraro, Jeffrey P; Daumé, Hal; Duvall, Scott L; Chapman, Wendy W; Harkema, Henk; Haug, Peter J

    2013-01-01

    Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives. Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt. The evaluated POS taggers drop in accuracy by 8.5-15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3-91.0% on clinical texts. ClinAdapt reports 93.2-93.9%. ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks.
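
    For readers unfamiliar with the baseline named above, Easy Adapt is usually described as a feature-augmentation trick: every feature is duplicated into a shared "general" copy and a copy specific to the example's domain, so a learner can decide per feature whether behavior transfers between domains. The sketch below illustrates that idea only. It is not ClinAdapt, whose lexical-generation rule the abstract does not specify, and the function name and feature strings are hypothetical.

```python
def easy_adapt(features, domain):
    """Augment a sparse feature dict with general and domain-specific copies.

    features: {feature_name: value}
    domain: "source" (e.g. general English) or "target" (e.g. clinical text)
    The copy for the other domain is left out, which in a sparse
    representation is equivalent to setting it to zero.
    """
    if domain not in ("source", "target"):
        raise ValueError("domain must be 'source' or 'target'")
    augmented = {}
    for name, value in features.items():
        augmented["general:" + name] = value
        augmented[domain + ":" + name] = value
    return augmented

# Hypothetical token features for a POS tagger.
clinical_token = easy_adapt({"word=pt": 1, "suffix=ed": 1}, "target")
newswire_token = easy_adapt({"word=pt": 1}, "source")
```

    After augmentation any standard tagger or classifier can be trained on the pooled source and target examples. A feature whose tag distribution differs in clinical text (an abbreviation such as "pt", for instance) can then receive a target-specific weight, while shared behavior is captured by the general copy.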

  12. The Effect of Neuro-Linguistic Programming (NLP) on Reading Comprehension in English for Specific Purposes Courses

    ERIC Educational Resources Information Center

    Farahani, Fahimeh

    2018-01-01

    Neuro-Linguistic Programming (NLP) has potential to help language learners; however, it has received scant attention. The present study was an attempt to investigate the effect of NLP techniques on reading comprehension of English as a Foreign Language (EFL) learners at an English for Specific Purposes (ESP) course. To achieve this goal, two…

  13. NLP-1: a DNA intercalating hypoxic cell radiosensitizer and cytotoxin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Panicucci, R.; Heal, R.; Laderoute, K.

    The 2-nitroimidazole linked phenanthridine, NLP-1 (5-(3-(2-nitro-1-imidazoyl)-propyl)-phenanthridinium bromide), was synthesized with the rationale of targeting the nitroimidazole to DNA via the phenanthridine ring. The drug is soluble in aqueous solution (greater than 25 mM) and stable at room temperature. It binds to DNA with a binding constant 1/30 that of ethidium bromide. At a concentration of 0.5 mM, NLP-1 is 8 times more toxic to hypoxic than aerobic cells at 37 degrees C. This concentration is 40 times less than the concentration of misonidazole, a non-intercalating 2-nitroimidazole, required for the same degree of hypoxic cell toxicity. The toxicity of NLP-1 is reduced at least 10-fold at 0 degrees C. Its ability to radiosensitize hypoxic cells is similar to misonidazole at 0 degrees C. Thus the putative targeting of the 2-nitroimidazole, NLP-1, to DNA, via its phenanthridine group, enhances its hypoxic toxicity, but not its radiosensitizing ability under the present test conditions. NLP-1 represents a lead compound for intercalating 2-nitroimidazoles with selective toxicity for hypoxic cells.

  14. Extraction of phenotypic traits from taxonomic descriptions for the tree of life using natural language processing.

    PubMed

    Endara, Lorena; Cui, Hong; Burleigh, J Gordon

    2018-03-01

    Phenotypic data sets are necessary to elucidate the genealogy of life, but assembling phenotypic data for taxa across the tree of life can be technically challenging and prohibitively time consuming. We describe a semi-automated protocol to facilitate and expedite the assembly of phenotypic character matrices of plants from formal taxonomic descriptions. This pipeline uses new natural language processing (NLP) techniques and a glossary of over 9000 botanical terms. Our protocol includes the Explorer of Taxon Concepts (ETC), an online application that assembles taxon-by-character matrices from taxonomic descriptions, and MatrixConverter, a Java application that enables users to evaluate and discretize the characters extracted by ETC. We demonstrate this protocol using descriptions from Araucariaceae. The NLP pipeline unlocks the phenotypic data found in taxonomic descriptions and makes them usable for evolutionary analyses.

  15. How Confounder Strength Can Affect Allocation of Resources in Electronic Health Records.

    PubMed

    Lynch, Kristine E; Whitcomb, Brian W; DuVall, Scott L

    2018-01-01

    When electronic health record (EHR) data are used, multiple approaches may be available for measuring the same variable, introducing potentially confounding factors. While additional information may be gleaned and residual confounding reduced through resource-intensive assessment methods such as natural language processing (NLP), whether the added benefits offset the added cost of the additional resources is not straightforward. We evaluated the implications of misclassification of a confounder when using EHRs. Using a combination of simulations and real data surrounding hospital readmission, we considered smoking as a potential confounder. We compared ICD-9 diagnostic code assignment, which is an easily available measure but has the possibility of substantial misclassification of smoking status, with NLP, a method of determining smoking status that is more expensive and time-consuming than ICD-9 code assignment but has less potential for misclassification. Classification of smoking status with NLP consistently produced less residual confounding than the use of ICD-9 codes; however, when minimal confounding was present, differences between the approaches were small. When considerable confounding is present, investing in a superior measurement tool becomes advantageous.

  16. Identification and functional analysis of the NLP-encoding genes from the phytopathogenic oomycete Phytophthora capsici.

    PubMed

    Chen, Xiao-Ren; Huang, Shen-Xin; Zhang, Ye; Sheng, Gui-Lin; Li, Yan-Peng; Zhu, Feng

    2018-03-23

    Phytophthora capsici is a hemibiotrophic, phytopathogenic oomycete that infects a wide range of crops, resulting in significant economic losses worldwide. By means of a diverse arsenal of secreted effector proteins, hemibiotrophic pathogens may manipulate plant cell death to establish a successful infection and colonization. In this study, we described the analysis of the gene family encoding necrosis- and ethylene-inducing peptide 1 (Nep1)-like proteins (NLPs) in P. capsici, and identified 39 real NLP genes and 26 NLP pseudogenes. Out of the 65 predicted NLP genes, 48 occur in groups with two or more genes, whereas the remainder appears to be singletons distributed randomly among the genome. Phylogenetic analysis of the 39 real NLPs delineated three groups. Key residues/motif important for the effector activities are degenerated in most NLPs, including the nlp24 peptide consisting of the conserved region I (11-aa immunogenic part) and conserved region II (the heptapeptide GHRHDWE motif) that is important for phytotoxic activity. Transcriptional profiling of eight selected NLP genes indicated that they were differentially expressed during the developmental and plant infection phases of P. capsici. Functional analysis of ten cloned NLPs demonstrated that Pc11951, Pc107869, Pc109174 and Pc118548 were capable of inducing cell death in the Solanaceae, including Nicotiana benthamiana and hot pepper. This study provides an overview of the P. capsici NLP gene family, laying a foundation for further elucidating the pathogenicity mechanism of this devastating pathogen.

  17. Neuropeptide Secreted from a Pacemaker Activates Neurons to Control a Rhythmic Behavior

    PubMed Central

    Wang, Han; Girskis, Kelly; Janssen, Tom; Chan, Jason P.; Dasgupta, Krishnakali; Knowles, James A.; Schoofs, Liliane; Sieburth, Derek

    2013-01-01

    Background: Rhythmic behaviors are driven by endogenous biological clocks in pacemakers, which must reliably transmit timing information to target tissues that execute rhythmic outputs. During the defecation motor program in C. elegans, calcium oscillations in the pacemaker (intestine), which occur about every 50 seconds, trigger rhythmic enteric muscle contractions through downstream GABAergic neurons that innervate enteric muscles. However, the identity of the timing signal released by the pacemaker and the mechanism underlying the delivery of timing information to the GABAergic neurons are unknown. Results: Here we show that a neuropeptide-like protein (NLP-40) released by the pacemaker triggers a single rapid calcium transient in the GABAergic neurons during each defecation cycle. We find that mutants lacking nlp-40 have normal pacemaker function, but lack enteric muscle contractions. NLP-40 undergoes calcium-dependent release that is mediated by the calcium sensor, SNT-2/synaptotagmin. We identify AEX-2, the G protein-coupled receptor on the GABAergic neurons, as the receptor of NLP-40. Functional calcium imaging reveals that NLP-40 and AEX-2/GPCR are both necessary for rhythmic activation of these neurons. Furthermore, acute application of synthetic NLP-40-derived peptide depolarizes the GABAergic neurons in vivo. Conclusions: Our results show that NLP-40 carries the timing information from the pacemaker via calcium-dependent release and delivers it to the GABAergic neurons by instructing their activation. Thus, we propose that rhythmic release of neuropeptides can deliver temporal information from pacemakers to downstream neurons to execute rhythmic behaviors. PMID:23583549

  18. Nesfatin-1-Like Peptide Encoded in Nucleobindin-1 in Goldfish is a Novel Anorexigen Modulated by Sex Steroids, Macronutrients and Daily Rhythm

    PubMed Central

    Sundarrajan, Lakshminarasimhan; Blanco, Ayelén Melisa; Bertucci, Juan Ignacio; Ramesh, Naresh; Canosa, Luis Fabián; Unniappan, Suraj

    2016-01-01

    Nesfatin-1 is an 82 amino acid anorexigen encoded in a secreted precursor nucleobindin-2 (NUCB2). NUCB2 was named so due to its high sequence similarity with nucleobindin-1 (NUCB1). It was recently reported that NUCB1 encodes an insulinotropic nesfatin-1-like peptide (NLP) in mice. Here, we aimed to characterize NLP in fish. RT-qPCR showed NUCB1 expression in both central and peripheral tissues. Western blot analysis and/or fluorescence immunohistochemistry determined NUCB1/NLP in the brain, pituitary, testis, ovary and gut of goldfish. NUCB1 mRNA expression in goldfish pituitary and gut displayed a daily rhythmic pattern of expression. Pituitary NUCB1 mRNA expression was downregulated by estradiol, while testosterone upregulated its expression in female goldfish brain. High carbohydrate and fat suppressed NUCB1 mRNA expression in the brain and gut. Intraperitoneal injection of synthetic rat NLP and goldfish NLP at 10 and 100 ng/g body weight doses caused potent inhibition of food intake in goldfish. NLP injection also downregulated the expression of mRNAs encoding orexigens, preproghrelin and orexin-A, and upregulated anorexigen cocaine and amphetamine regulated transcript mRNA in goldfish brain. Collectively, these results provide the first set of results supporting the anorectic action of NLP, and the regulation of tissue specific expression of goldfish NUCB1. PMID:27329836

  19. Toward a complete dataset of drug-drug interaction information from publicly available sources.

    PubMed

    Ayvaz, Serkan; Horn, John; Hassanzadeh, Oktie; Zhu, Qian; Stan, Johann; Tatonetti, Nicholas P; Vilar, Santiago; Brochhausen, Mathias; Samwald, Matthias; Rastegar-Mojarad, Majid; Dumontier, Michel; Boyce, Richard D

    2015-06-01

    Although potential drug-drug interactions (PDDIs) are a significant source of preventable drug-related harm, there is currently no single complete source of PDDI information. In the current study, all publicly available sources of PDDI information that could be identified using a comprehensive and broad search were combined into a single dataset. The combined dataset merged fourteen different sources including 5 clinically-oriented information sources, 4 Natural Language Processing (NLP) Corpora, and 5 Bioinformatics/Pharmacovigilance information sources. As a comprehensive PDDI source, the merged dataset might benefit the pharmacovigilance text mining community by making it possible to compare the representativeness of NLP corpora for PDDI text extraction tasks, and specifying elements that can be useful for future PDDI extraction purposes. An analysis of the overlap between and across the data sources showed that there was little overlap. Even comprehensive PDDI lists such as DrugBank, KEGG, and the NDF-RT had less than 50% overlap with each other. Moreover, all of the comprehensive lists had incomplete coverage of two data sources that focus on PDDIs of interest in most clinical settings. Based on this information, we think that systems that provide access to the comprehensive lists, such as APIs into RxNorm, should be careful to inform users that the lists may be incomplete with respect to PDDIs that drug experts suggest clinicians be aware of. In spite of the low degree of overlap, several dozen cases were identified where PDDI information provided in drug product labeling might be augmented by the merged dataset. Moreover, the combined dataset was also shown to improve the performance of an existing PDDI NLP pipeline and a recently published PDDI pharmacovigilance protocol. Future work will focus on improvement of the methods for mapping between PDDI information sources, identifying methods to improve the use of the merged dataset in PDDI NLP algorithms, integrating high-quality PDDI information from the merged dataset into Wikidata, and making the combined dataset accessible as Semantic Web Linked Data. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  20. Using the Natural Language Paradigm (NLP) to increase vocalizations of older adults with cognitive impairments.

    PubMed

    Leblanc, Linda A; Geiger, Kaneen B; Sautter, Rachael A; Sidener, Tina M

    2007-01-01

    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated in a multiple baseline design across participants. Data were collected on appropriate and inappropriate vocalizations with appropriate vocalizations coded as prompted or unprompted during baseline and treatment sessions. All participants experienced increases in appropriate speech during NLP with variable response patterns. Additionally, the two participants with substantial inappropriate vocalizations showed decreases in inappropriate speech. Implications for intervention in day programs are discussed.

  1. Building a common pipeline for rule-based document classification.

    PubMed

    Patterson, Olga V; Ginter, Thomas; DuVall, Scott L

    2013-01-01

    Instance-based classification of clinical text is a widely used natural language processing task employed as a step for patient classification, document retrieval, or information extraction. Rule-based approaches rely on concept identification and context analysis in order to determine the appropriate class. We propose a five-step process that enables even small research teams to develop simple but powerful rule-based NLP systems by taking advantage of a common UIMA AS based pipeline for classification. Our proposed methodology coupled with the general-purpose solution provides researchers with access to the data locked in clinical text in cases of limited human resources and compact timelines.

  2. Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension

    ERIC Educational Resources Information Center

    Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.

    2017-01-01

    This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…

  3. Net-centric ACT-R-Based Cognitive Architecture with DEVS Unified Process

    DTIC Science & Technology

    2011-04-01

    effort has been spent in analyzing various forms of requirement specifications, viz, state-based, Natural Language based, UML-based, Rule-based, BPMN ...requirement specifications in one of the chosen formats such as BPMN, DoDAF, Natural Language Processing (NLP) based, UML-based, DSL or simply

  4. Phosphorylation of Nlp by Plk1 negatively regulates its dynein-dynactin-dependent targeting to the centrosome.

    PubMed

    Casenghi, Martina; Barr, Francis A; Nigg, Erich A

    2005-11-01

    When cells enter mitosis the microtubule (MT) network undergoes a profound rearrangement, in part due to alterations in the MT nucleating and anchoring properties of the centrosome. Ninein and the ninein-like protein (Nlp) are centrosomal proteins involved in MT organisation in interphase cells. We show that the overexpression of these two proteins induces the fragmentation of the Golgi, and causes lysosomes to disperse toward the cell periphery. The ability of Nlp and ninein to perturb the cytoplasmic distribution of these organelles depends on their ability to interact with the dynein-dynactin motor complex. Our data also indicate that dynactin is required for the targeting of Nlp and ninein to the centrosome. Furthermore, phosphorylation of Nlp by the polo-like kinase 1 (Plk1) negatively regulates its association with dynactin. These findings uncover a mechanism through which Plk1 helps to coordinate changes in MT organisation with cell cycle progression, by controlling the dynein-dynactin-dependent transport of centrosomal proteins.

  5. Neuropeptide secreted from a pacemaker activates neurons to control a rhythmic behavior.

    PubMed

    Wang, Han; Girskis, Kelly; Janssen, Tom; Chan, Jason P; Dasgupta, Krishnakali; Knowles, James A; Schoofs, Liliane; Sieburth, Derek

    2013-05-06

    Rhythmic behaviors are driven by endogenous biological clocks in pacemakers, which must reliably transmit timing information to target tissues that execute rhythmic outputs. During the defecation motor program in C. elegans, calcium oscillations in the pacemaker (intestine), which occur about every 50 s, trigger rhythmic enteric muscle contractions through downstream GABAergic neurons that innervate enteric muscles. However, the identity of the timing signal released by the pacemaker and the mechanism underlying the delivery of timing information to the GABAergic neurons are unknown. Here, we show that a neuropeptide-like protein (NLP-40) released by the pacemaker triggers a single rapid calcium transient in the GABAergic neurons during each defecation cycle. We find that mutants lacking nlp-40 have normal pacemaker function, but lack enteric muscle contractions. NLP-40 undergoes calcium-dependent release that is mediated by the calcium sensor, SNT-2/synaptotagmin. We identify AEX-2, the G-protein-coupled receptor on the GABAergic neurons, as the receptor for NLP-40. Functional calcium imaging reveals that NLP-40 and AEX-2/GPCR are both necessary for rhythmic activation of these neurons. Furthermore, acute application of synthetic NLP-40-derived peptide depolarizes the GABAergic neurons in vivo. Our results show that NLP-40 carries the timing information from the pacemaker via calcium-dependent release and delivers it to the GABAergic neurons by instructing their activation. Thus, we propose that rhythmic release of neuropeptides can deliver temporal information from pacemakers to downstream neurons to execute rhythmic behaviors. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. TEES 2.2: Biomedical Event Extraction for Diverse Corpora

    PubMed Central

    2015-01-01

    Background: The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. Results: The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. Conclusions: The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented. PMID:26551925

  7. TEES 2.2: Biomedical Event Extraction for Diverse Corpora.

    PubMed

    Björne, Jari; Salakoski, Tapio

    2015-01-01

    The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented.

  8. Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora

    PubMed Central

    Al-Thubaity, Abdulmohsen; Alqifari, Reem

    2014-01-01

    Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework. PMID:25610910

  9. Proposed framework for the evaluation of standalone corpora processing systems: an application to Arabic corpora.

    PubMed

    Al-Thubaity, Abdulmohsen; Al-Khalifa, Hend; Alqifari, Reem; Almazrua, Manal

    2014-01-01

    Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework.

  10. Indexing Anatomical Phrases in Neuro-Radiology Reports to the UMLS 2005AA

    PubMed Central

    Bashyam, Vijayaraghavan; Taira, Ricky K.

    2005-01-01

    This work describes a methodology to index anatomical phrases to the 2005AA release of the Unified Medical Language System (UMLS). A phrase chunking tool based on Natural Language Processing (NLP) was developed to identify semantically coherent phrases within medical reports. Using this phrase chunker, a set of 2,551 unique anatomical phrases was extracted from brain radiology reports. These phrases were mapped to the 2005AA release of the UMLS using a vector space model. Precision for the task of indexing unique phrases was 0.87. PMID:16778995

  11. Designing Rules for Accounting Transaction Identification based on Indonesian NLP

    NASA Astrophysics Data System (ADS)

    Iswandi, I.; Suwardi, I. S.; Maulidevi, N. U.

    2017-03-01

    Accounting transactions are recorded on the basis of evidence of the transactions, such as invoices, receipts, letters of intent, electricity bills, telephone bills, etc. In this paper, we propose a design of rules to identify the entities located on a sales invoice. Several entities are identified in a sales invoice, namely: invoice date, company name, invoice number, product id, product name, quantity and total price. These entities are identified using a named entity recognition method. The entities generated from the rules are used as a basis for automating data input into the accounting system.

  12. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes.

    PubMed

    Khalifa, Abdulrahman; Meystre, Stéphane

    2015-12-01

    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status among other factors found in health records of diabetic patients. In addition, the task involved detecting medications, and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Behind the scenes: A medical natural language processing project.

    PubMed

    Wu, Joy T; Dernoncourt, Franck; Gehrmann, Sebastian; Tyler, Patrick D; Moseley, Edward T; Carlson, Eric T; Grant, David W; Li, Yeran; Welt, Jonathan; Celi, Leo Anthony

    2018-04-01

    Advancement of Artificial Intelligence (AI) capabilities in medicine can help address many pressing problems in healthcare. However, AI research endeavors in healthcare may not be clinically relevant, may have unrealistic expectations, or may not be explicit enough about their limitations. A diverse and well-functioning multidisciplinary team (MDT) can help identify appropriate and achievable AI research agendas in healthcare, and advance medical AI technologies by developing AI algorithms as well as addressing the shortage of appropriately labeled datasets for machine learning. In this paper, our team of engineers, clinicians and machine learning experts share their experience and lessons learned from their two-year-long collaboration on a natural language processing (NLP) research project. We highlight specific challenges encountered in cross-disciplinary teamwork, dataset creation for NLP research, and expectation setting for current medical AI technologies. Copyright © 2017. Published by Elsevier B.V.

  14. From Web Directories to Ontologies: Natural Language Processing Challenges

    NASA Astrophysics Data System (ADS)

    Zaihrayeu, Ilya; Sun, Lei; Giunchiglia, Fausto; Pan, Wei; Ju, Qi; Chi, Mingmin; Huang, Xuanjing

    Hierarchical classifications are used pervasively by humans as a means to organize their data and knowledge about the world. One of their main advantages is that natural language labels, used to describe their contents, are easily understood by human users. However, at the same time, this is also one of their main disadvantages as these same labels are ambiguous and very hard to be reasoned about by software agents. This fact creates an insuperable hindrance for classifications to being embedded in the Semantic Web infrastructure. This paper presents an approach to converting classifications into lightweight ontologies, and it makes the following contributions: (i) it identifies the main NLP problems related to the conversion process and shows how they are different from the classical problems of NLP; (ii) it proposes heuristic solutions to these problems, which are especially effective in this domain; and (iii) it evaluates the proposed solutions by testing them on DMoz data.

  15. On Dataless Hierarchical Text Classification (Author’s Manuscript)

    DTIC Science & Technology

    2014-07-27

    compound talk.politics.mideast politics mideast israel arab jews jewish muslim talk.politics.misc politics gay homosexual sexual alt.atheism atheism...tion in NLP tasks; it was further used in several NLP works, such as by Liang (2005), to measure words’ distributional similarity. This method...embedding trained by neural networks has been used widely in the NLP community and has become a hot trend recently. In this paper, we test the suitability

  16. Implicitly-Defined Neural Networks for Sequence Labeling

    DTIC Science & Technology

    2016-09-09

    this is to improve performance on long-range dependencies, and to improve stability (solution drift) in NLP tasks. We choose an implicit neural network...there have been NLP tasks, and there are many effective approaches to dealing with them. In the context of HMMs, there are the “Forward-Backward...Malyska for interesting discussion of related work, and Liz Salesky for NLP application suggestions! Tagger WSJ Accuracy Word vectors only 0.9626 Single

  17. Combating Weapons of Mass Destruction: Models, Complexity, and Algorithms in Complex Dynamic and Evolving Networks

    DTIC Science & Technology

    2015-11-01

    [Figure: NMI versus µ for N = 1000 and N = 5000 (panels A and B); methods compared: SCD, SCD-NLP, Blondel, Oslom, Infomap. ...Networks with minC, maxC unconstrained.]

  18. Discovery of nitrate-CPK-NLP signalling in central nutrient-growth networks

    PubMed Central

    Liu, Kun-hsiang; Niu, Yajie; Konishi, Mineko; Wu, Yue; Du, Hao; Sun Chung, Hoo; Li, Lei; Boudsocq, Marie; McCormack, Matthew; Maekawa, Shugo; Ishida, Tetsuya; Zhang, Chao; Shokat, Kevan; Yanagisawa, Shuichi; Sheen, Jen

    2018-01-01

    Nutrient signalling integrates and coordinates gene expression, metabolism and growth. However, its primary molecular mechanisms remain incompletely understood in plants and animals. Here we report novel Ca2+ signalling triggered by nitrate with live imaging of an ultrasensitive biosensor in Arabidopsis leaves and roots. A nitrate-sensitized and targeted functional genomic screen identifies subgroup III Ca2+-sensor protein kinases (CPKs) as master regulators orchestrating primary nitrate responses. A chemical switch with the engineered CPK10(M141G) kinase enables conditional analyses of cpk10,30,32 to define comprehensive nitrate-associated regulatory and developmental programs, circumventing embryo lethality. Nitrate-CPK signalling phosphorylates conserved NIN-LIKE PROTEIN (NLP) transcription factors (TFs) to specify reprogramming of gene sets for downstream TFs, transporters, N-assimilation, C/N-metabolism, redox, signalling, hormones, and proliferation. Conditional cpk10,30,32 and nlp7 similarly impair nitrate-stimulated system-wide shoot growth and root establishment. The nutrient-coupled Ca2+ signalling network integrates transcriptome and cellular metabolism with shoot-root coordination and developmental plasticity in shaping organ biomass and architecture. PMID:28489820

  19. Monoamines differentially modulate neuropeptide release from distinct sites within a single neuron pair.

    PubMed

    Clark, Tobias; Hapiak, Vera; Oakes, Mitchell; Mills, Holly; Komuniecki, Richard

    2018-01-01

    Monoamines and neuropeptides often modulate the same behavior, but monoaminergic-peptidergic crosstalk remains poorly understood. In Caenorhabditis elegans, the adrenergic-like ligands, tyramine (TA) and octopamine (OA) require distinct subsets of neuropeptides in the two ASI sensory neurons to inhibit nociception. TA selectively increases the release of ASI neuropeptides encoded by nlp-14 or nlp-18 from either synaptic/perisynaptic regions of ASI axons or the ASI soma, respectively, and OA selectively increases the release of ASI neuropeptides encoded by nlp-9 asymmetrically, from only the synaptic/perisynaptic region of the right ASI axon. The predicted amino acid preprosequences of genes encoding either TA- or OA-dependent neuropeptides differed markedly. However, these distinct preprosequences were not sufficient to confer monoamine-specificity and additional N-terminal peptide-encoding sequence was required. Collectively, our results demonstrate that TA and OA specifically and differentially modulate the release of distinct subsets of neuropeptides from different subcellular sites within the ASIs, highlighting the complexity of monoaminergic/peptidergic modulation, even in animals with a relatively simple nervous system.

  20. Monoamines differentially modulate neuropeptide release from distinct sites within a single neuron pair

    PubMed Central

    Oakes, Mitchell; Mills, Holly; Komuniecki, Richard

    2018-01-01

    Monoamines and neuropeptides often modulate the same behavior, but monoaminergic-peptidergic crosstalk remains poorly understood. In Caenorhabditis elegans, the adrenergic-like ligands, tyramine (TA) and octopamine (OA) require distinct subsets of neuropeptides in the two ASI sensory neurons to inhibit nociception. TA selectively increases the release of ASI neuropeptides encoded by nlp-14 or nlp-18 from either synaptic/perisynaptic regions of ASI axons or the ASI soma, respectively, and OA selectively increases the release of ASI neuropeptides encoded by nlp-9 asymmetrically, from only the synaptic/perisynaptic region of the right ASI axon. The predicted amino acid preprosequences of genes encoding either TA- or OA-dependent neuropeptides differed markedly. However, these distinct preprosequences were not sufficient to confer monoamine-specificity and additional N-terminal peptide-encoding sequence was required. Collectively, our results demonstrate that TA and OA specifically and differentially modulate the release of distinct subsets of neuropeptides from different subcellular sites within the ASIs, highlighting the complexity of monoaminergic/peptidergic modulation, even in animals with a relatively simple nervous system. PMID:29723289

  1. Discovery of nitrate-CPK-NLP signalling in central nutrient-growth networks.

    PubMed

    Liu, Kun-Hsiang; Niu, Yajie; Konishi, Mineko; Wu, Yue; Du, Hao; Sun Chung, Hoo; Li, Lei; Boudsocq, Marie; McCormack, Matthew; Maekawa, Shugo; Ishida, Tetsuya; Zhang, Chao; Shokat, Kevan; Yanagisawa, Shuichi; Sheen, Jen

    2017-05-18

    Nutrient signalling integrates and coordinates gene expression, metabolism and growth. However, its primary molecular mechanisms remain incompletely understood in plants and animals. Here we report unique Ca2+ signalling triggered by nitrate with live imaging of an ultrasensitive biosensor in Arabidopsis leaves and roots. A nitrate-sensitized and targeted functional genomic screen identifies subgroup III Ca2+-sensor protein kinases (CPKs) as master regulators that orchestrate primary nitrate responses. A chemical switch with the engineered mutant CPK10(M141G) circumvents embryo lethality and enables conditional analyses of cpk10 cpk30 cpk32 triple mutants to define comprehensive nitrate-associated regulatory and developmental programs. Nitrate-coupled CPK signalling phosphorylates conserved NIN-LIKE PROTEIN (NLP) transcription factors to specify the reprogramming of gene sets for downstream transcription factors, transporters, nitrogen assimilation, carbon/nitrogen metabolism, redox, signalling, hormones and proliferation. Conditional cpk10 cpk30 cpk32 and nlp7 mutants similarly impair nitrate-stimulated system-wide shoot growth and root establishment. The nutrient-coupled Ca2+ signalling network integrates transcriptome and cellular metabolism with shoot-root coordination and developmental plasticity in shaping organ biomass and architecture.

  2. Corpora Processing and Computational Scaffolding for a Web-Based English Learning Environment: The CANDLE Project

    ERIC Educational Resources Information Center

    Liou, Hsien-Chin; Chang, Jason S; Chen, Hao-Jan; Lin, Chih-Cheng; Liaw, Meei-Ling; Gao, Zhao-Ming; Jang, Jyh-Shing Roger; Yeh, Yuli; Chuang, Thomas C.; You, Geeng-Neng

    2006-01-01

    This paper describes the development of an innovative web-based environment for English language learning with advanced data-driven and statistical approaches. The project uses various corpora, including a Chinese-English parallel corpus ("Sinorama") and various natural language processing (NLP) tools to construct effective English…

  3. Automating Electronic Clinical Data Capture for Quality Improvement and Research: The CERTAIN Validation Project of Real World Evidence.

    PubMed

    Devine, Emily Beth; Van Eaton, Erik; Zadworny, Megan E; Symons, Rebecca; Devlin, Allison; Yanez, David; Yetisgen, Meliha; Keyloun, Katelyn R; Capurro, Daniel; Alfonso-Cristancho, Rafael; Flum, David R; Tarczy-Hornoch, Peter

    2018-05-22

    The availability of high fidelity electronic health record (EHR) data is a hallmark of the learning health care system. Washington State's Surgical Care Outcomes and Assessment Program (SCOAP) is a network of hospitals participating in quality improvement (QI) registries wherein data are manually abstracted from EHRs. To create the Comparative Effectiveness Research and Translation Network (CERTAIN), we semi-automated SCOAP data abstraction using a centralized federated data model, created a central data repository (CDR), and assessed whether these data could be used as real world evidence for QI and research. Describe the validation processes and complexities involved and lessons learned. Investigators installed a commercial CDR to retrieve and store data from disparate EHRs. Manual and automated abstraction systems were conducted in parallel (10/2012-7/2013) and validated in three phases using the EHR as the gold standard: 1) ingestion, 2) standardization, and 3) concordance of automated versus manually abstracted cases. Information retrieval statistics were calculated. Four unaffiliated health systems provided data. Between 6 and 15 percent of data elements were abstracted: 51 to 86 percent from structured data; the remainder using natural language processing (NLP). In phase 1, data ingestion from 12 out of 20 feeds reached 95 percent accuracy. In phase 2, 55 percent of structured data elements performed with 96 to 100 percent accuracy; NLP with 89 to 91 percent accuracy. In phase 3, concordance ranged from 69 to 89 percent. Information retrieval statistics were consistently above 90 percent. Semi-automated data abstraction may be useful, although raw data collected as a byproduct of health care delivery is not immediately available for use as real world evidence. New approaches to gathering and analyzing extant data are required.

  4. Expression of an oxalate decarboxylase impairs the necrotic effect induced by Nep1-like protein (NLP) of Moniliophthora perniciosa in transgenic tobacco.

    PubMed

    da Silva, Leonardo F; Dias, Cristiano V; Cidade, Luciana C; Mendes, Juliano S; Pirovani, Carlos P; Alvim, Fátima C; Pereira, Gonçalo A G; Aragão, Francisco J L; Cascardo, Júlio C M; Costa, Marcio G C

    2011-07-01

    Oxalic acid (OA) and Nep1-like proteins (NLP) are recognized as elicitors of programmed cell death (PCD) in plants, which is crucial for the pathogenic success of necrotrophic plant pathogens and involves reactive oxygen species (ROS). To determine the importance of oxalate as a source of ROS for OA- and NLP-induced cell death, a full-length cDNA coding for an oxalate decarboxylase (FvOXDC) from the basidiomycete Flammulina velutipes, which converts OA into CO2 and formate, was overexpressed in tobacco plants. The transgenic plants contained less OA and more formic acid compared with the control plants and showed enhanced resistance to cell death induced by exogenous OA and MpNEP2, an NLP of the hemibiotrophic fungus Moniliophthora perniciosa. This resistance was correlated with the inhibition of ROS formation in the transgenic plants inoculated with OA, MpNEP2, or a combination of both PCD elicitors. Taken together, these results have established a pivotal function for oxalate as a source of ROS required for the PCD-inducing activity of OA and NLP. The results also indicate that FvOXDC represents a potentially novel source of resistance against OA- and NLP-producing pathogens such as M. perniciosa, the causal agent of witches' broom disease of cacao (Theobroma cacao L.).

  5. A factorization approach to next-to-leading-power threshold logarithms

    NASA Astrophysics Data System (ADS)

    Bonocore, D.; Laenen, E.; Magnea, L.; Melville, S.; Vernazza, L.; White, C. D.

    2015-06-01

    Threshold logarithms become dominant in partonic cross sections when the selected final state forces gluon radiation to be soft or collinear. Such radiation factorizes at the level of scattering amplitudes, and this leads to the resummation of threshold logarithms which appear at leading power in the threshold variable. In this paper, we consider the extension of this factorization to include effects suppressed by a single power of the threshold variable. Building upon the Low-Burnett-Kroll-Del Duca (LBKD) theorem, we propose a decomposition of radiative amplitudes into universal building blocks, which contain all effects ultimately responsible for next-to-leading-power (NLP) threshold logarithms in hadronic cross sections for electroweak annihilation processes. In particular, we provide a NLO evaluation of the radiative jet function, responsible for the interference of next-to-soft and collinear effects in these cross sections. As a test, using our expression for the amplitude, we reproduce all abelian-like NLP threshold logarithms in the NNLO Drell-Yan cross section, including the interplay of real and virtual emissions. Our results are a significant step towards developing a generally applicable resummation formalism for NLP threshold effects, and illustrate the breakdown of next-to-soft theorems for gauge theory amplitudes at loop level.

  6. A homogeneous superconducting magnet design using a hybrid optimization algorithm

    NASA Astrophysics Data System (ADS)

    Ni, Zhipeng; Wang, Qiuliang; Liu, Feng; Yan, Luguang

    2013-12-01

    This paper employs a hybrid optimization algorithm with a combination of linear programming (LP) and nonlinear programming (NLP) to design the highly homogeneous superconducting magnets for magnetic resonance imaging (MRI). The whole work is divided into two stages. The first LP stage provides a global optimal current map with several non-zero current clusters, and the mathematical model for the LP was updated by taking into account the maximum axial and radial magnetic field strength limitations. In the second NLP stage, the non-zero current clusters were discretized into practical solenoids. The superconducting conductor consumption was set as the objective function both in the LP and NLP stages to minimize the construction cost. In addition, the peak-peak homogeneity over the volume of imaging (VOI), the scope of 5 Gauss fringe field, and maximum magnetic field strength within superconducting coils were set as constraints. The detailed design process for a dedicated 3.0 T animal MRI scanner was presented. The homogeneous magnet produces a magnetic field quality of 6.0 ppm peak-peak homogeneity over a 16 cm by 18 cm elliptical VOI, and the 5 Gauss fringe field was limited within a 1.5 m by 2.0 m elliptical region.

  7. Feature generation and representations for protein-protein interaction classification.

    PubMed

    Lan, Man; Tan, Chew Lim; Su, Jian

    2009-10-01

    Automatically detecting articles relevant to protein-protein interaction (PPI) is a crucial step for large-scale biological database curation. Previous work adopted POS tagging, shallow parsing and sentence splitting techniques, but these achieved worse performance than the simple bag-of-words representation. In this paper, we generated and investigated multiple types of feature representations in order to further improve the performance of the PPI text classification task. Besides the traditional domain-independent bag-of-words approach and the term weighting methods, we also explored other domain-dependent features, i.e. protein-protein interaction trigger keywords, protein named entities and the advanced ways of incorporating Natural Language Processing (NLP) output. The integration of these multiple features has been evaluated on the BioCreAtIvE II corpus. The experimental results showed that both the advanced way of using NLP output and the integration of bag-of-words and NLP output improved the performance of text classification. Specifically, in comparison with the best performance achieved in the BioCreAtIvE II IAS, the feature-level and classifier-level integration of multiple features improved classification performance by 2.71% and 3.95%, respectively.

  8. Weight maintenance through behaviour modification with a cooking course or neurolinguistic programming.

    PubMed

    Sørensen, Lone Brinkmann; Greve, Tine; Kreutzer, Martin; Pedersen, Ulla; Nielsen, Claus Meyer; Toubro, Søren; Astrup, Arne

    2011-01-01

    We compared the effect on weight regain of behaviour modification consisting of either a gourmet cooking course or neurolinguistic programming (NLP) therapy. Fifty-six overweight and obese subjects participated. The first step was a 12-week weight loss program. Participants achieving at least 8% weight loss were randomized to five months of either NLP therapy or a course in gourmet cooking. Follow-up occurred after two and three years. Forty-nine participants lost at least 8% of their initial body weight and were randomized to the next step. The NLP group lost an additional 1.8 kg and the cooking group lost 0.2 kg during the five months of weight maintenance (NS). The dropout rate in the cooking group was 4%, compared with 26% in the NLP group (p=0.04). There was no difference in weight maintenance after two and three years of follow-up. In conclusion, weight loss in overweight and obese participants was maintained equally efficiently with a healthy cooking course or NLP therapy, but the dropout rate was lower during the active cooking treatment.

  9. Prediction of advertisement preference by fusing EEG response and sentiment analysis.

    PubMed

    Gauba, Himaanshu; Kumar, Pradeep; Roy, Partha Pratim; Singh, Priyanka; Dogra, Debi Prosad; Raman, Balasubramanian

    2017-08-01

    This paper presents a novel approach to predict the rating of video advertisements based on a multimodal framework combining physiological analysis of the user and global sentiment ratings available on the internet. We have fused the user's electroencephalogram (EEG) waves and the corresponding global textual comments on the video to understand the user's preference more precisely. In our framework, the users were asked to watch the video advertisement while EEG signals were recorded simultaneously. Valence scores were obtained using self-report for each video. A higher valence corresponds to intrinsic attractiveness of the user. Furthermore, the multimedia data, comprising the comments posted by global viewers, were retrieved and processed using a Natural Language Processing (NLP) technique for sentiment analysis. Textual contents from review comments were analyzed to obtain a score reflecting the sentiment of the video. A regression technique based on Random forest was used to predict the rating of an advertisement using EEG data. Finally, the EEG-based rating is combined with the NLP-based sentiment score to improve the overall prediction. The study was carried out using 15 video clips of advertisements available online. Twenty-five participants were involved in our study to analyze our proposed system. The results are encouraging and suggest that the proposed multimodal approach can achieve lower RMSE in rating prediction as compared to the prediction using only EEG data.
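
    The abstract does not say how the EEG-based rating and the sentiment score are combined. As an illustration only, the sketch below assumes a simple weighted average (the `weight` parameter and both function names are hypothetical) and shows how RMSE would be computed to compare predictions against self-reported ratings. The numbers in the usage lines are made-up toy values, not data from the study.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))


def fuse(eeg_ratings, sentiment_scores, weight=0.5):
    """Weighted average of two rating sources (assumed fusion rule).

    weight is the share given to the EEG-based rating; the remainder
    goes to the sentiment score. Both inputs must be on the same scale.
    """
    return [weight * e + (1 - weight) * s
            for e, s in zip(eeg_ratings, sentiment_scores)]


# Toy values for illustration only.
eeg = [2.0, 4.0, 3.5]
sentiment = [3.0, 3.0, 4.5]
truth = [2.5, 3.5, 4.0]
print("EEG only:", rmse(eeg, truth))
print("fused:   ", rmse(fuse(eeg, sentiment), truth))
```

    Whether fusion actually lowers RMSE depends on the data; the sketch only shows how the comparison could be set up.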

  10. Comparison of linear and nonlinear programming approaches for "worst case dose" and "minmax" robust optimization of intensity-modulated proton therapy dose distributions.

    PubMed

    Zaghian, Maryam; Cao, Wenhua; Liu, Wei; Kardar, Laleh; Randeniya, Sharmalee; Mohan, Radhe; Lim, Gino

    2017-03-01

    Robust optimization of intensity-modulated proton therapy (IMPT) takes uncertainties into account during spot weight optimization and leads to dose distributions that are resilient to uncertainties. Previous studies demonstrated benefits of linear programming (LP) for IMPT in terms of delivery efficiency by considerably reducing the number of spots required for the same quality of plans. However, a reduction in the number of spots may lead to loss of robustness. The purpose of this study was to evaluate and compare the performance in terms of plan quality and robustness of two robust optimization approaches using LP and nonlinear programming (NLP) models. The so-called "worst case dose" and "minmax" robust optimization approaches and conventional planning target volume (PTV)-based optimization approach were applied to designing IMPT plans for five patients: two with prostate cancer, one with skull-based cancer, and two with head and neck cancer. For each approach, both LP and NLP models were used. Thus, for each case, six sets of IMPT plans were generated and assessed: LP-PTV-based, NLP-PTV-based, LP-worst case dose, NLP-worst case dose, LP-minmax, and NLP-minmax. The four robust optimization methods behaved differently from patient to patient, and no method emerged as superior to the others in terms of nominal plan quality and robustness against uncertainties. The plans generated using LP-based robust optimization were more robust regarding patient setup and range uncertainties than were those generated using NLP-based robust optimization for the prostate cancer patients. However, the robustness of plans generated using NLP-based methods was superior for the skull-based and head and neck cancer patients. Overall, LP-based methods were suitable for the less challenging cancer cases in which all uncertainty scenarios were able to satisfy tight dose constraints, while NLP performed better in more difficult cases in which tight dose limits were hard to meet in most uncertainty scenarios. For robust optimization, the worst case dose approach was less sensitive to uncertainties than was the minmax approach for the prostate and skull-based cancer patients, whereas the minmax approach was superior for the head and neck cancer patients. The robustness of the IMPT plans was remarkably better after robust optimization than after PTV-based optimization, and the NLP-PTV-based optimization outperformed the LP-PTV-based optimization regarding robustness of clinical target volume coverage. In addition, plans generated using LP-based methods had notably fewer scanning spots than did those generated using NLP-based methods. © 2017 The Authors. Journal of Applied Clinical Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

  11. Gender Differences in the Primary Representational System according to Neurolinguistic Programming.

    ERIC Educational Resources Information Center

    Cassiere, M. F.; And Others

    Neurolinguistic Programming (NLP) is a currently popular therapeutic modality in which individuals organize information through three basic sensory systems, one of which is the Primary Representational System (PRS). This study was designed to investigate gender differences in PRS according to the predicate preference method. It was expected that…

  12. Significant lexical relationships

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pedersen, T.; Kayaalp, M.; Bruce, R.

    Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
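
    The abstract does not name the exact conditional test it uses. As an illustration only, the sketch below assumes Fisher's exact test applied to a 2x2 contingency table of bigram counts, which is one well-known exact conditional test. The function name and the cell layout are hypothetical choices for this example. With the table margins held fixed, the top-left cell follows a hypergeometric distribution, so the p-value can be summed directly without any large-sample approximation.

```python
from math import comb


def fisher_exact_right_tail(n11, n12, n21, n22):
    """One-sided (right-tail) exact p-value for a 2x2 table.

    Assumed cell layout for a candidate bigram "w1 w2":
      n11: w1 followed by w2
      n12: w1 followed by any other word
      n21: any other word followed by w2
      n22: neither
    """
    row1 = n11 + n12
    col1 = n11 + n21
    total = n11 + n12 + n21 + n22
    # Sum hypergeometric probabilities for tables at least as extreme
    # as the observed one (n11 or larger), margins held fixed.
    tail = 0
    for k in range(n11, min(row1, col1) + 1):
        tail += comb(row1, k) * comb(total - row1, col1 - k)
    return tail / comb(total, col1)


# Small table where the tail is easy to check by hand: 17/70.
print(fisher_exact_right_tail(3, 1, 1, 3))
```

    Because the computation uses exact integer arithmetic until the final division, it stays valid for the rare events (very small cell counts) that the abstract identifies as the problem case for asymptotic tests.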

  13. ADESSA: A Real-Time Decision Support Service for Delivery of Semantically Coded Adverse Drug Event Data

    PubMed Central

    Duke, Jon D.; Friedlin, Jeff

    2010-01-01

    Evaluating medications for potential adverse events is a time-consuming process, typically involving manual lookup of information by physicians. This process can be expedited by CDS systems that support dynamic retrieval and filtering of adverse drug events (ADE’s), but such systems require a source of semantically-coded ADE data. We created a two-component system that addresses this need. First we created a natural language processing application which extracts adverse events from Structured Product Labels and generates a standardized ADE knowledge base. We then built a decision support service that consumes a Continuity of Care Document and returns a list of patient-specific ADE’s. Our database currently contains 534,125 ADE’s from 5602 product labels. An NLP evaluation of 9529 ADE’s showed recall of 93% and precision of 95%. On a trial set of 30 CCD’s, the system provided adverse event data for 88% of drugs and returned these results in an average of 620ms. PMID:21346964
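
    For readers less familiar with the evaluation figures above, the following minimal sketch shows how precision and recall are derived from counts and how they combine into an F1 score. The function names and the counts in the first usage line are hypothetical; only the 93% recall and 95% precision come from the abstract, and the F1 value is derived here rather than reported by the authors.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision and recall from raw extraction counts."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
p, r = precision_recall(true_pos=95, false_pos=5, false_neg=7)
print(p, r)

# F1 implied by the precision and recall stated in the abstract.
print(round(f1_score(0.95, 0.93), 3))
```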

  14. A reduced successive quadratic programming strategy for errors-in-variables estimation.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tjoa, I.-B.; Biegler, L. T.; Carnegie-Mellon Univ.

    Parameter estimation problems in process engineering represent a special class of nonlinear optimization problems, because the maximum likelihood structure of the objective function can be exploited. Within this class, the errors in variables method (EVM) is particularly interesting. Here we seek a weighted least-squares fit to the measurements with an underdetermined process model. Thus, both the number of variables and degrees of freedom available for optimization increase linearly with the number of data sets. Large optimization problems of this type can be particularly challenging and expensive to solve because, for general-purpose nonlinear programming (NLP) algorithms, the computational effort increases at least quadratically with problem size. In this study we develop a tailored NLP strategy for EVM problems. The method is based on a reduced Hessian approach to successive quadratic programming (SQP), but with the decomposition performed separately for each data set. This leads to the elimination of all variables but the model parameters, which are determined by a QP coordination step. In this way the computational effort remains linear in the number of data sets. Moreover, unlike previous approaches to the EVM problem, global and superlinear properties of the SQP algorithm apply naturally. Also, the method directly incorporates inequality constraints on the model parameters (although not on the fitted variables). This approach is demonstrated on five example problems with up to 102 degrees of freedom. Compared to general-purpose NLP algorithms, large improvements in computational performance are observed.
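
    The abstract gives no formulas. For orientation, a common textbook way to write the errors-in-variables problem is sketched below; all symbols are assumptions introduced here, not the authors' notation. With measurements \tilde{x}_i for data sets i = 1, ..., N, a measurement covariance V, fitted variables x_i and model parameters \theta:

```latex
\min_{\theta,\; x_1, \dots, x_N} \;
  \sum_{i=1}^{N} (x_i - \tilde{x}_i)^{\mathsf{T}} V^{-1} (x_i - \tilde{x}_i)
\quad \text{subject to} \quad
  f(x_i, \theta) = 0, \qquad i = 1, \dots, N .
```

    Each additional data set contributes its own vector x_i, which is why the number of variables grows linearly with N, and why eliminating the x_i per data set, as the abstract describes, leaves a problem in \theta alone.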

  15. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages

    ERIC Educational Resources Information Center

    Jarman, Jay

    2011-01-01

    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  16. Construct Validity in TOEFL iBT Speaking Tasks: Insights from Natural Language Processing

    ERIC Educational Resources Information Center

    Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S.

    2016-01-01

    This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…

  17. Advanced Natural Language Processing and Temporal Mining for Clinical Discovery

    ERIC Educational Resources Information Center

    Mehrabi, Saeed

    2016-01-01

    There has been a vast and growing amount of healthcare data, especially with the rapid adoption of electronic health records (EHRs) as a result of the HITECH Act of 2009. It is estimated that around 80% of the clinical information resides in the unstructured narrative of an EHR. Recently, natural language processing (NLP) techniques have offered…

  18. Performance and carcass characteristics of guinea fowl fed on dietary Neem (Azadirachta indica) leaf powder as a growth promoter.

    PubMed

    Singh, M K; Singh, S K; Sharma, R K; Singh, B; Kumar, Sh; Joshi, S K; Kumar, S; Sathapathy, S

    2015-01-01

    The present work aimed at studying growth pattern and carcass traits in pearl grey guinea fowl fed on dietary Neem (Azadirachta indica) leaf powder (NLP) over a period of 12 weeks. Day-old guinea fowl keets (n=120) were randomly assigned to four treatment groups, each with 3 replicates. The first treatment was designated as control (T0) in which no supplement was added to the feed, while in treatments T1, T2 and T3, NLP was provided as 1, 2 and 3 g per kg of feed, respectively. The results revealed a significant increase in body weight at 12 weeks: 1229.7 g for T1, 1249.8 g for T2, and 1266.2 g for T3, compared to 1220.0 g for the control group (P<0.05). The results also showed that the supplementation of NLP significantly increased feed intake (P≤0.05) which might be due to the hypoglycaemic activity of Neem. A significant increase was also found in the feed conversion ratio (FCR) of the treated groups over the control, showing that feeding NLP to the treated groups has lowered their residual feed efficiency. The results of the study demonstrate the beneficial effects of supplementing NLP on body weight gain and dressed yield in the treated groups in guinea fowl. NLP is, therefore, suggested to be used as a feed supplement in guinea fowl for higher profitability.

  19. Evaluation of Nanolipoprotein Particles (NLPs) as an In Vivo Delivery Platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fischer, Nicholas O.; Weilhammer, Dina R.; Dunkle, Alexis

    Nanoparticles hold great promise for the delivery of therapeutics, yet limitations remain with regards to the use of these nanosystems for efficient long-lasting targeted delivery of therapeutics, including imparting functionality to the platform, in vivo stability, drug entrapment efficiency and toxicity. In order to begin to address these limitations, we evaluated the functionality, stability, cytotoxicity, toxicity, immunogenicity and in vivo biodistribution of nanolipoprotein particles (NLPs), which are mimetics of naturally occurring high-density lipoproteins (HDLs). We also found that a wide range of molecules could be reliably conjugated to the NLP, including proteins, single-stranded DNA, and small molecules. The NLP was also found to be relatively stable in complex biological fluids and displayed no cytotoxicity in vitro at doses as high as 320 µg/ml. In addition, we observed that in vivo administration of the NLP daily for 14 consecutive days did not induce significant weight loss or result in lesions on excised organs. Furthermore, the NLPs did not display overt immunogenicity with respect to antibody generation. Finally, the biodistribution of the NLP in vivo was found to be highly dependent on the route of administration, where intranasal administration resulted in prolonged retention in the lung tissue. Though only a select number of NLP compositions were evaluated, the findings of this study suggest that the NLP platform holds promise for use as both a targeted and non-targeted in vivo delivery vehicle for a range of therapeutics.

  20. Evaluation of Nanolipoprotein Particles (NLPs) as an In Vivo Delivery Platform

    PubMed Central

    Fischer, Nicholas O.; Weilhammer, Dina R.; Dunkle, Alexis; Thomas, Cynthia; Hwang, Mona; Corzett, Michele; Lychak, Cheri; Mayer, Wasima; Urbin, Salustra; Collette, Nicole; Chiun Chang, Jiun; Loots, Gabriela G.; Rasley, Amy; Blanchette, Craig D.

    2014-01-01

    Nanoparticles hold great promise for the delivery of therapeutics, yet limitations remain with regards to the use of these nanosystems for efficient long-lasting targeted delivery of therapeutics, including imparting functionality to the platform, in vivo stability, drug entrapment efficiency and toxicity. To begin to address these limitations, we evaluated the functionality, stability, cytotoxicity, toxicity, immunogenicity and in vivo biodistribution of nanolipoprotein particles (NLPs), which are mimetics of naturally occurring high-density lipoproteins (HDLs). We found that a wide range of molecules could be reliably conjugated to the NLP, including proteins, single-stranded DNA, and small molecules. The NLP was also found to be relatively stable in complex biological fluids and displayed no cytotoxicity in vitro at doses as high as 320 µg/ml. In addition, we observed that in vivo administration of the NLP daily for 14 consecutive days did not induce significant weight loss or result in lesions on excised organs. Furthermore, the NLPs did not display overt immunogenicity with respect to antibody generation. Finally, the biodistribution of the NLP in vivo was found to be highly dependent on the route of administration, where intranasal administration resulted in prolonged retention in the lung tissue. Although only a select number of NLP compositions were evaluated, the findings of this study suggest that the NLP platform holds promise for use as both a targeted and non-targeted in vivo delivery vehicle for a range of therapeutics. PMID:24675794

  1. Task-Driven Dynamic Text Summarization

    ERIC Educational Resources Information Center

    Workman, Terri Elizabeth

    2011-01-01

    The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often…

  2. Stefanyshyn-Piper works with NLP-Vaccine-2 on MDDK

    NASA Image and Video Library

    2008-11-19

    S126-E-008302 (19 Nov. 2008) --- Astronaut Heidemarie M. Stefanyshyn-Piper, STS-126 mission specialist, works with the Microbe Group Activation Pack containing eight Fluid Processing Apparatuses on the middeck of Space Shuttle Endeavour while docked with the International Space Station.

  3. Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts.

    PubMed

    Liao, Katherine P; Ananthakrishnan, Ashwin N; Kumar, Vishesh; Xia, Zongqi; Cagan, Andrew; Gainer, Vivian S; Goryachev, Sergey; Chen, Pei; Savova, Guergana K; Agniel, Denis; Churchill, Susanne; Lee, Jaeyoung; Murphy, Shawn N; Plenge, Robert M; Szolovits, Peter; Kohane, Isaac; Shaw, Stanley Y; Karlson, Elizabeth W; Cai, Tianxi

    2015-01-01

    Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses which can allow study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; (2) to study the impact of adding narrative data extracted using natural language processing (NLP) in the algorithm. Additionally, we demonstrate how to implement CAD algorithm to compare risk across 3 chronic diseases in a preliminary study. We studied 3 established EMR based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g. ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p<0.0001) after adjusting for traditional cardiovascular risk factors. We developed and validated a CAD algorithm that performed well across diverse patient populations. The addition of NLP into the CAD algorithm improved the sensitivity of the algorithm, particularly in cohorts where the prevalence of CAD was low. 
Preliminary data suggest that CAD risk was significantly lower in RA and IBD compared to DM.
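
    The sensitivity, specificity, and PPV figures quoted in this abstract are all ratios taken from a confusion matrix of algorithm output against a reference standard. A minimal sketch of those definitions follows; the class name, function names, and counts are hypothetical illustrations, not data or code from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a phenotype algorithm against a reference standard."""
    tp: int  # algorithm positive, reference positive
    fp: int  # algorithm positive, reference negative
    tn: int  # algorithm negative, reference negative
    fn: int  # algorithm negative, reference positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cases that the algorithm finds."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-cases that the algorithm correctly rejects."""
    return c.tn / (c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Fraction of algorithm-positive subjects that are true cases."""
    return c.tp / (c.tp + c.fp)


# Hypothetical validation sample (assumed numbers, NOT from the paper).
example = ConfusionCounts(tp=90, fp=10, tn=890, fn=10)

print(f"sensitivity={sensitivity(example):.3f}")
print(f"specificity={specificity(example):.3f}")
print(f"ppv={ppv(example):.3f}")
```

    With these assumed counts the sketch gives a PPV of 0.90 and a specificity above 0.95, the same operating region the abstract reports. Adding a data source such as NLP-extracted mentions would show up here as fewer false negatives, which raises sensitivity without necessarily changing PPV.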

  4. Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts

    PubMed Central

    Liao, Katherine P.; Ananthakrishnan, Ashwin N.; Kumar, Vishesh; Xia, Zongqi; Cagan, Andrew; Gainer, Vivian S.; Goryachev, Sergey; Chen, Pei; Savova, Guergana K.; Agniel, Denis; Churchill, Susanne; Lee, Jaeyoung; Murphy, Shawn N.; Plenge, Robert M.; Szolovits, Peter; Kohane, Isaac; Shaw, Stanley Y.; Karlson, Elizabeth W.; Cai, Tianxi

    2015-01-01

    Background Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses which can allow study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; (2) to study the impact of adding narrative data extracted using natural language processing (NLP) in the algorithm. Additionally, we demonstrate how to implement CAD algorithm to compare risk across 3 chronic diseases in a preliminary study. Methods and Results We studied 3 established EMR based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g. ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p<0.0001) after adjusting for traditional cardiovascular risk factors. Conclusions We developed and validated a CAD algorithm that performed well across diverse patient populations. 
The addition of NLP into the CAD algorithm improved the sensitivity of the algorithm, particularly in cohorts where the prevalence of CAD was low. Preliminary data suggest that CAD risk was significantly lower in RA and IBD compared to DM. PMID:26301417

  5. The Old Brain, the New Mirror: Matching Teaching and Learning Styles in Foreign Language Class (Based on Neuro-Linguistic Programming).

    ERIC Educational Resources Information Center

    Knowles, John K.

    The process of matching teaching materials and methods to the student's learning style and ability level in foreign language classes is explored. The Neuro-Linguistic Programming (NLP) model offers a diagnostic process for the identification of style. This process can be applied to the language learning setting as a way of presenting material to…

  6. Extraction of CYP chemical interactions from biomedical literature using natural language processing methods.

    PubMed

    Jiao, Dazhi; Wild, David J

    2009-02-01

    This paper proposes a system that automatically extracts CYP protein and chemical interactions from journal article abstracts, using natural language processing (NLP) and text mining methods. In our system, we employ a maximum entropy based learning method, using results from syntactic, semantic, and lexical analysis of texts. We first present our system architecture and then discuss the data set for training our machine learning based models and the methods in building components in our system, such as part of speech (POS) tagging, Named Entity Recognition (NER), dependency parsing, and relation extraction. An evaluation of the system is conducted at the end, yielding very promising results: The POS, dependency parsing, and NER components in our system have achieved a very high level of accuracy as measured by precision, ranging from 85.9% to 98.5%, and the precision and the recall of the interaction extraction component are 76.0% and 82.6%, and for the overall system are 68.4% and 72.2%, respectively.
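
    The abstract reports precision and recall separately and does not give an F-measure. As a hedged illustration, the balanced F-measure can be derived from the reported pairs; the function name `f_measure` and the choice of beta = 1 are assumptions of this sketch, not part of the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # both inputs are zero, so the score is defined as zero
    return (1 + beta ** 2) * precision * recall / denominator


# Precision/recall pairs as reported in the abstract.
interaction_f1 = f_measure(0.760, 0.826)  # interaction extraction component
overall_f1 = f_measure(0.684, 0.722)      # overall system

print(f"interaction extraction F1: {interaction_f1:.3f}")
print(f"overall system F1: {overall_f1:.3f}")
```

    Because the harmonic mean is pulled toward the smaller of its two inputs, each derived score lies between the reported precision and recall and closer to the precision.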

  7. An information extraction framework for cohort identification using electronic health records.

    PubMed

    Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G

    2013-01-01

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.

  8. Internship Abstract and Final Reflection

    NASA Technical Reports Server (NTRS)

    Sandor, Edward

    2016-01-01

    The primary objective for this internship is the evaluation of an embedded natural language processor (NLP) as a way to introduce voice control into future space suits. An embedded natural language processor would provide an astronaut hands-free control for making adjustments to the environment of the space suit and checking status of consumables procedures and navigation. Additionally, the use of an embedded NLP could potentially reduce crew fatigue, increase the crewmember's situational awareness during extravehicular activity (EVA) and improve the ability to focus on mission critical details. The use of an embedded NLP may be valuable for other human spaceflight applications desiring hands-free control as well. An embedded NLP is unique because it is a small device that performs language tasks, including speech recognition, which normally require powerful processors. The dedicated device could perform speech recognition locally with a smaller form-factor and lower power consumption than traditional methods.

  9. An RLP23-SOBIR1-BAK1 complex mediates NLP-triggered immunity.

    PubMed

    Albert, Isabell; Böhm, Hannah; Albert, Markus; Feiler, Christina E; Imkampe, Julia; Wallmeroth, Niklas; Brancato, Caterina; Raaymakers, Tom M; Oome, Stan; Zhang, Heqiao; Krol, Elzbieta; Grefen, Christopher; Gust, Andrea A; Chai, Jijie; Hedrich, Rainer; Van den Ackerveken, Guido; Nürnberger, Thorsten

    2015-10-05

    Plants and animals employ innate immune systems to cope with microbial infection. Pattern-triggered immunity relies on the recognition of microbe-derived patterns by pattern recognition receptors (PRRs). Necrosis and ethylene-inducing peptide 1-like proteins (NLPs) constitute plant immunogenic patterns that are unique, as these proteins are produced by multiple prokaryotic (bacterial) and eukaryotic (fungal, oomycete) species. Here we show that the leucine-rich repeat receptor protein (LRR-RP) RLP23 binds in vivo to a conserved 20-amino-acid fragment found in most NLPs (nlp20), thereby mediating immune activation in Arabidopsis thaliana. RLP23 forms a constitutive, ligand-independent complex with the LRR receptor kinase (LRR-RK) SOBIR1 (Suppressor of Brassinosteroid insensitive 1 (BRI1)-associated kinase (BAK1)-interacting receptor kinase 1), and recruits a second LRR-RK, BAK1, into a tripartite complex upon ligand binding. Stable, ectopic expression of RLP23 in potato (Solanum tuberosum) confers nlp20 pattern recognition and enhanced immunity to destructive oomycete and fungal plant pathogens, such as Phytophthora infestans and Sclerotinia sclerotiorum. PRRs that recognize widespread microbial patterns might be particularly suited for engineering immunity in crop plants.

  10. Structural centrosome aberrations sensitize polarized epithelia to basal cell extrusion.

    PubMed

    Ganier, Olivier; Schnerch, Dominik; Nigg, Erich A

    2018-06-01

    Centrosome aberrations disrupt tissue architecture and may confer invasive properties to cancer cells. Here we show that structural centrosome aberrations, induced by overexpression of either Ninein-like protein (NLP) or CEP131/AZI1, sensitize polarized mammalian epithelia to basal cell extrusion. While unperturbed epithelia typically dispose of damaged cells through apical dissemination into luminal cavities, certain oncogenic mutations cause a switch in directionality towards basal cell extrusion, raising the potential for metastatic cell dissemination. Here we report that NLP-induced centrosome aberrations trigger the preferential extrusion of damaged cells towards the basal surface of epithelial monolayers. This switch in directionality from apical to basal dissemination coincides with a profound reorganization of the microtubule cytoskeleton, which in turn prevents the contractile ring repositioning that is required to support extrusion towards the apical surface. While the basal extrusion of cells harbouring NLP-induced centrosome aberrations requires exogenously induced cell damage, structural centrosome aberrations induced by excess CEP131 trigger the spontaneous dissemination of dying cells towards the basal surface from MDCK cysts. Thus, similar to oncogenic mutations, structural centrosome aberrations can favour basal extrusion of damaged cells from polarized epithelia. Assuming that additional mutations may promote cell survival, this process could sensitize epithelia to disseminate potentially metastatic cells. © 2018 The Authors.

  11. All-optical polarization control and noise cleaning based on a nonlinear lossless polarizer

    NASA Astrophysics Data System (ADS)

    Barozzi, Matteo; Vannucci, Armando; Picchi, Giorgio

    2015-01-01

    We propose an all-optical fiber-based device able to accomplish both polarization control and OSNR enhancement of an amplitude modulated optical signal, affected by unpolarized additive white Gaussian noise, at the same time. The proposed noise cleaning device is made of a nonlinear lossless polarizer (NLP), that performs polarization control, followed by an ideal polarizing filter that removes the orthogonally polarized half of additive noise. The NLP transforms every input signal polarization into a unique, well defined output polarization (without any loss of signal energy) and its task is to impose a signal polarization aligned with the transparent eigenstate of the polarizing filter. In order to effectively control the polarization of the modulated signal, we show that two different NLP configurations (with counter- or co-propagating pump laser) are needed, as a function of the signal polarization coherence time. The NLP is designed so that polarization attraction is effective only on the "noiseless" (i.e., information-bearing) component of the signal and not on noise, that remains unpolarized at the NLP output. Hence, the proposed device is able to discriminate signal power (that is preserved) from in-band noise power (that is partly suppressed). Since signal repolarization is detrimental if applied to polarization-multiplexed formats, the noise cleaner application is limited here to "legacy" links, with 10 Gb/s OOK modulation, still representing the most common format in deployed networks. By employing the appropriate NLP configurations, we obtain an OSNR gain close to 3dB. Furthermore, we show how the achievable OSNR gain can be estimated theoretically.
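
    The 3 dB figure follows from the filter removing half of the unpolarized noise while leaving the signal intact. A short worked derivation under idealized assumptions (perfect repolarization by the NLP, an ideal polarizing filter; the symbols P_s and P_n for signal and in-band noise power are introduced here for illustration):

```latex
\[
\mathrm{OSNR}_{\mathrm{in}} = \frac{P_s}{P_n},
\qquad
\mathrm{OSNR}_{\mathrm{out}} = \frac{P_s}{P_n/2} = 2\,\mathrm{OSNR}_{\mathrm{in}},
\]
\[
G_{\mathrm{dB}} = 10\log_{10}\!\left(\frac{\mathrm{OSNR}_{\mathrm{out}}}{\mathrm{OSNR}_{\mathrm{in}}}\right)
= 10\log_{10} 2 \approx 3.01\ \mathrm{dB}.
\]
```

    This is an upper bound for the idealized case; any residual misalignment between the signal polarization and the filter's transparent eigenstate costs signal power, which is consistent with the abstract reporting a gain "close to" 3 dB.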

  12. AutoMap User’s Guide

    DTIC Science & Technology

    2006-10-01

    Hierarchy of Pre-Processing Techniques 3. NLP (Natural Language Processing) Utilities 3.1 Named-Entity Recognition 3.1.1 Example for Named-Entity... Recognition 3.2 Symbol RemovalN-Gram Identification: Bi-Grams 4. Stemming 4.1 Stemming Example 5. Delete List 5.1 Open a Delete List 5.1.1 Small...iterative and involves several key processes: • Named-Entity Recognition Named-Entity Recognition is an Automap feature that allows you to

  13. Comparison of a semi-automatic annotation tool and a natural language processing application for the generation of clinical statement entries.

    PubMed

    Lin, Ching-Heng; Wu, Nai-Yuan; Lai, Wei-Shao; Liou, Der-Ming

    2015-01-01

    Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, p<0.0001), but a similar F-measure to that of the SAAP (0.89 vs 0.87). For the Procedure terms, the F-measure was not significantly different among the three pipelines. The combination of a semi-automatic annotation approach and the NLP application seems to be a solution for generating entry-level interoperable clinical documents. © The Author 2014. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  14. Automated Outcome Classification of Computed Tomography Imaging Reports for Pediatric Traumatic Brain Injury.

    PubMed

    Yadav, Kabir; Sarioglu, Efsun; Choi, Hyeong Ah; Cartwright, Walter B; Hinds, Pamela S; Chamberlain, James M

    2016-02-01

    The authors have previously demonstrated highly reliable automated classification of free-text computed tomography (CT) imaging reports using a hybrid system that pairs linguistic (natural language processing) and statistical (machine learning) techniques. Previously performed for identifying the outcome of orbital fracture in unprocessed radiology reports from a clinical data repository, the performance has not been replicated for more complex outcomes. To validate automated outcome classification performance of a hybrid natural language processing (NLP) and machine learning system for brain CT imaging reports. The hypothesis was that our system has performance characteristics for identifying pediatric traumatic brain injury (TBI). This was a secondary analysis of a subset of 2,121 CT reports from the Pediatric Emergency Care Applied Research Network (PECARN) TBI study. For that project, radiologists dictated CT reports as free text, which were then deidentified and scanned as PDF documents. Trained data abstractors manually coded each report for TBI outcome. Text was extracted from the PDF files using optical character recognition. The data set was randomly split evenly for training and testing. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for negation, certainty, and temporal status. A random subset stratified by site was analyzed using descriptive quantitative content analysis to confirm identification of TBI findings based on the National Institute of Neurological Disorders and Stroke (NINDS) Common Data Elements project. Findings were coded for presence or absence, weighted by frequency of mentions, and past/future/indication modifiers were filtered. After combining with the manual reference standard, a decision tree classifier was created using data mining tools WEKA 3.7.5 and Salford Predictive Miner 7.0. 
Performance of the decision tree classifier was evaluated on the test patient reports. The prevalence of TBI in the sampled population was 159 of 2,217 (7.2%). The automated classification for pediatric TBI is comparable to our prior results, with the notable exception of lower positive predictive value. Manual review of misclassified reports, 95.5% of which were false-positives, revealed that a sizable number of false-positive errors were due to differing outcome definitions between NINDS TBI findings and PECARN clinical important TBI findings and report ambiguity not meeting definition criteria. A hybrid NLP and machine learning automated classification system continues to show promise in coding free-text electronic clinical data. For complex outcomes, it can reliably identify negative reports, but manual review of positive reports may be required. As such, it can still streamline data collection for clinical research and performance improvement. © 2016 by the Society for Academic Emergency Medicine.

  15. Automated Outcome Classification of Computed Tomography Imaging Reports for Pediatric Traumatic Brain Injury

    PubMed Central

    Yadav, Kabir; Sarioglu, Efsun; Choi, Hyeong-Ah; Cartwright, Walter B.; Hinds, Pamela S.; Chamberlain, James M.

    2016-01-01

    Background The authors have previously demonstrated highly reliable automated classification of free text computed tomography (CT) imaging reports using a hybrid system that pairs linguistic (natural language processing) and statistical (machine learning) techniques. Previously performed for identifying the outcome of orbital fracture in unprocessed radiology reports from a clinical data repository, the performance has not been replicated for more complex outcomes. Objectives To validate automated outcome classification performance of a hybrid natural language processing (NLP) and machine learning system for brain CT imaging reports. The hypothesis was that our system has performance characteristics for identifying pediatric traumatic brain injury (TBI). Methods This was a secondary analysis of a subset of 2,121 CT reports from the Pediatric Emergency Care Applied Research Network (PECARN) TBI study. For that project, radiologists dictated CT reports as free text, which were then de-identified and scanned as PDF documents. Trained data abstractors manually coded each report for TBI outcome. Text was extracted from the PDF files using optical character recognition. The dataset was randomly split evenly for training and testing. Training patient reports were used as input to the Medical Language Extraction and Encoding (MedLEE) NLP tool to create structured output containing standardized medical terms and modifiers for negation, certainty, and temporal status. A random subset stratified by site was analyzed using descriptive quantitative content analysis to confirm identification of TBI findings based upon the National Institute of Neurological Disorders and Stroke Common Data Elements project. Findings were coded for presence or absence, weighted by frequency of mentions, and past/future/indication modifiers were filtered. 
After combining with the manual reference standard, a decision tree classifier was created using data mining tools WEKA 3.7.5 and Salford Predictive Miner 7.0. Performance of the decision tree classifier was evaluated on the test patient reports. Results The prevalence of TBI in the sampled population was 159 out of 2,217 (7.2%). The automated classification for pediatric TBI is comparable to our prior results, with the notable exception of lower positive predictive value (PPV). Manual review of misclassified reports, 95.5% of which were false positives, revealed that a sizable number of false-positive errors were due to differing outcome definitions between NINDS TBI findings and PECARN clinical important TBI findings, and report ambiguity not meeting definition criteria. Conclusions A hybrid NLP and machine learning automated classification system continues to show promise in coding free-text electronic clinical data. For complex outcomes, it can reliably identify negative reports, but manual review of positive reports may be required. As such, it can still streamline data collection for clinical research and performance improvement. PMID:26766600

  16. Dissecting the Signaling Mechanisms Underlying Recognition and Preference of Food Odors

    PubMed Central

    Harris, Gareth; Shen, Yu; Ha, Heonick; Donato, Alessandra; Wallis, Samuel; Zhang, Xiaodong

    2014-01-01

    Food is critical for survival. Many animals, including the nematode Caenorhabditis elegans, use sensorimotor systems to detect and locate preferred food sources. However, the signaling mechanisms underlying food-choice behaviors are poorly understood. Here, we characterize the molecular signaling that regulates recognition and preference between different food odors in C. elegans. We show that the major olfactory sensory neurons, AWB and AWC, play essential roles in this behavior. A canonical Gα-protein, together with guanylate cyclases and cGMP-gated channels, is needed for the recognition of food odors. The food-odor-evoked signal is transmitted via glutamatergic neurotransmission from AWC and through AMPA and kainate-like glutamate receptor subunits. In contrast, peptidergic signaling is required to generate preference between different food odors while being dispensable for the recognition of the odors. We show that this regulation is achieved by the neuropeptide NLP-9 produced in AWB, which acts with its putative receptor NPR-18, and by the neuropeptide NLP-1 produced in AWC. In addition, another set of sensory neurons inhibits food-odor preference. These mechanistic logics, together with a previously mapped neural circuit underlying food-odor preference, provide a functional network linking sensory response, transduction, and downstream receptors to process complex olfactory information and generate the appropriate behavioral decision essential for survival. PMID:25009271

  17. Generation of an annotated reference standard for vaccine adverse event reports.

    PubMed

    Foster, Matthew; Pandey, Abhishek; Kreimeyer, Kory; Botsis, Taxiarchis

    2018-07-05

    As part of a collaborative project between the US Food and Drug Administration (FDA) and the Centers for Disease Control and Prevention for the development of a web-based natural language processing (NLP) workbench, we created a corpus of 1000 Vaccine Adverse Event Reporting System (VAERS) reports annotated for 36,726 clinical features, 13,365 temporal features, and 22,395 clinical-temporal links. This paper describes the final corpus, as well as the methodology used to create it, so that clinical NLP researchers outside FDA can evaluate the utility of the corpus to aid their own work. The creation of this standard went through four phases: pre-training, pre-production, production-clinical feature annotation, and production-temporal annotation. The pre-production phase used a double annotation followed by adjudication strategy to refine and finalize the annotation model while the production phases followed a single annotation strategy to maximize the number of reports in the corpus. An analysis of 30 reports randomly selected as part of a quality control assessment yielded accuracies of 0.97, 0.96, and 0.83 for clinical features, temporal features, and clinical-temporal associations, respectively and speaks to the quality of the corpus. Copyright © 2018 Elsevier Ltd. All rights reserved.

  18. Text-based Analytics for Biosurveillance

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Charles, Lauren E.; Smith, William P.; Rounds, Jeremiah

    The ability to prevent, mitigate, or control a biological threat depends on how quickly the threat is identified and characterized. Ensuring the timely delivery of data and analytics is an essential aspect of providing adequate situational awareness in the face of a disease outbreak. This chapter outlines an analytic pipeline for supporting an advanced early warning system that can integrate multiple data sources and provide situational awareness of potential and occurring disease situations. The pipeline includes real-time automated data analysis founded on natural language processing (NLP), semantic concept matching, and machine learning techniques to enrich content with metadata related to biosurveillance. Online news articles are presented as an example use case for the pipeline, but the processes can be generalized to any textual data. In this chapter, the mechanics of a streaming pipeline are briefly discussed as well as the major steps required to provide targeted situational awareness. The text-based analytic pipeline includes various processing steps as well as identifying article relevance to biosurveillance (e.g., relevance algorithm) and article feature extraction (who, what, where, why, how, and when).

  19. The information exchange.

    PubMed

    Hendron, Brid

    2015-02-01

    This article has been written to highlight the importance of unconscious communication in the dental environment using Neuro-Linguistic Programming (NLP) principles. A single aspect of unconscious communication is described to demonstrate the value to dental team members of studying NLP in order to improve their communication skills.

  20. Observations concerning Research Literature on Neuro-Linguistic Programming.

    ERIC Educational Resources Information Center

    Einspruch, Eric L.; Forman, Bruce D.

    1985-01-01

    Identifies six categories of design and methodological errors contained in the 39 empirical studies of neurolinguistic programming (NLP) documented through April 1984. Representative reports reflecting each category are discussed. Suggestions are offered for improving the quality of research on NLP. (Author/MCF)

  1. Neurolinguistic Programming Examined: Imagery, Sensory Mode, and Communication.

    ERIC Educational Resources Information Center

    Fromme, Donald K.; Daniell, Jennifer

    1984-01-01

    Tested Neurolinguistic Programming (NLP) assumptions by examining intercorrelations among response times of students (N=64) for extracting visual, auditory, and kinesthetic information from alphabetic images. Large positive intercorrelations were obtained, the only outcome not compatible with NLP. Good visualizers were significantly better in…

  2. Expression and Association of the Yersinia pestis Translocon Proteins, YopB and YopD, Are Facilitated by Nanolipoprotein Particles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coleman, Matthew A.; Cappuccio, Jenny A.; Blanchette, Craig D.

    Yersinia pestis enters host cells and evades host defenses, in part, through interactions between Yersinia pestis proteins and host membranes. One such interaction is through the type III secretion system, which uses a highly conserved and ordered complex for Yersinia pestis outer membrane effector protein translocation called the injectisome. The portion of the injectisome that interacts directly with host cell membranes is referred to as the translocon. The translocon is believed to form a pore allowing effector molecules to enter host cells. To facilitate mechanistic studies of the translocon, we have developed a cell-free approach for expressing translocon pore proteins as a complex supported in a bilayer membrane mimetic nano-scaffold known as a nanolipoprotein particle (NLP). Initial results show cell-free expression of Yersinia pestis outer membrane proteins YopB and YopD was enhanced in the presence of liposomes. However, these complexes tended to aggregate and precipitate. With the addition of co-expressed NLP-forming components, the YopB and/or YopD complex was rendered soluble, increasing the yield of protein for biophysical studies. Biophysical methods such as Atomic Force Microscopy and Fluorescence Correlation Spectroscopy were used to confirm that the soluble YopB/D complex was associated with NLPs. An interaction between the YopB/D complex and NLP was validated by immunoprecipitation. The YopB/D translocon complex embedded in a NLP provides a platform for protein interaction studies between pathogen and host proteins. Ultimately, these studies will help elucidate the poorly understood mechanism which enables this pathogen to inject effector proteins into host cells, thus evading host defenses.

  3. Expression and Association of the Yersinia pestis Translocon Proteins, YopB and YopD, Are Facilitated by Nanolipoprotein Particles

    DOE PAGES

    Coleman, Matthew A.; Cappuccio, Jenny A.; Blanchette, Craig D.; ...

    2016-03-25

    Yersinia pestis enters host cells and evades host defenses, in part, through interactions between Yersinia pestis proteins and host membranes. One such interaction is through the type III secretion system, which uses a highly conserved and ordered complex for Yersinia pestis outer membrane effector protein translocation called the injectisome. The portion of the injectisome that interacts directly with host cell membranes is referred to as the translocon. The translocon is believed to form a pore allowing effector molecules to enter host cells. To facilitate mechanistic studies of the translocon, we have developed a cell-free approach for expressing translocon pore proteins as a complex supported in a bilayer membrane mimetic nano-scaffold known as a nanolipoprotein particle (NLP). Initial results show cell-free expression of Yersinia pestis outer membrane proteins YopB and YopD was enhanced in the presence of liposomes. However, these complexes tended to aggregate and precipitate. With the addition of co-expressed NLP-forming components, the YopB and/or YopD complex was rendered soluble, increasing the yield of protein for biophysical studies. Biophysical methods such as Atomic Force Microscopy and Fluorescence Correlation Spectroscopy were used to confirm that the soluble YopB/D complex was associated with NLPs. An interaction between the YopB/D complex and NLP was validated by immunoprecipitation. The YopB/D translocon complex embedded in a NLP provides a platform for protein interaction studies between pathogen and host proteins. Ultimately, these studies will help elucidate the poorly understood mechanism which enables this pathogen to inject effector proteins into host cells, thus evading host defenses.

  4. Data-Informed Language Learning

    ERIC Educational Resources Information Center

    Godwin-Jones, Robert

    2017-01-01

    Although data collection has been used in language learning settings for some time, it is only in recent decades that large corpora have become available, along with efficient tools for their use. Advances in natural language processing (NLP) have enabled rich tagging and annotation of corpus data, essential for their effective use in language…

  5. Generating a Spanish Affective Dictionary with Supervised Learning Techniques

    ERIC Educational Resources Information Center

    Bermudez-Gonzalez, Daniel; Miranda-Jiménez, Sabino; García-Moreno, Raúl-Ulises; Calderón-Nepamuceno, Dora

    2016-01-01

    Nowadays, machine learning techniques are being used in several Natural Language Processing (NLP) tasks such as Opinion Mining (OM). OM is used to analyse and determine the affective orientation of texts. Usually, OM approaches use affective dictionaries in order to conduct sentiment analysis. These lexicons are labeled manually with affective…

  6. Working Effectively with People: Contributions of Neurolinguistic Programming (NLP) to Visual Literacy.

    ERIC Educational Resources Information Center

    Ragan, Janet M.; Ragan, Tillman J.

    1982-01-01

    Briefly summarizes history of neurolinguistic programming, which set out to model elements and processes of effective communication and to reduce these to formulas that can be taught to others. Potential areas of inquiry for neurolinguistic programmers which should be of concern to visual literacists are discussed. (MBR)

  7. Leveraging Code Comments to Improve Software Reliability

    ERIC Educational Resources Information Center

    Tan, Lin

    2009-01-01

    Commenting source code has long been a common practice in software development. This thesis, consisting of three pieces of work, made novel use of the code comments written in natural language to improve software reliability. Our solution combines Natural Language Processing (NLP), Machine Learning, Statistics, and Program Analysis techniques to…

  8. Lexical Link Analysis Application: Improving Web Service to Acquisition Visibility Portal

    DTIC Science & Technology

    2013-09-30

    during the Empire Challenge 2008 and 2009 (EC08/09) field experiments and for numerous other field experiments of new technologies during Trident Warrior...Empirical Methods in Natural Language Processing and Very Large Corpora (EMNLP/VLC-2000) (pp. 63–70). Retrieved from http://nlp.stanford.edu/manning

  9. English Complex Verb Constructions: Identification and Inference

    ERIC Educational Resources Information Center

    Tu, Yuancheng

    2012-01-01

    The fundamental problem faced by automatic text understanding in Natural Language Processing (NLP) is to identify semantically related pieces of text and integrate them together to compute the meaning of the whole text. However, the principle of compositionality runs into trouble very quickly when real language is examined with its frequent…

  10. An Information Extraction Framework for Cohort Identification Using Electronic Health Records

    PubMed Central

    Liu, Hongfang; Bielinski, Suzette J.; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B.; Jonnalagadda, Siddhartha R.; Ravikumar, K.E.; Wu, Stephen T.; Kullo, Iftikhar J.; Chute, Christopher G

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework. PMID:24303255

  11. On the formation of noise-like pulses in fiber ring cavity configurations

    NASA Astrophysics Data System (ADS)

    Jeong, Yoonchan; Vazquez-Zuniga, Luis Alonso; Lee, Seungjong; Kwon, Youngchul

    2014-12-01

    We give an overview of the current status of fiber-based noise-like pulse (NLP) research conducted over the past decade, together with presenting the newly conducted, systematic study on their temporal, spectral, and coherence characteristics in nonlinear polarization rotation (NPR)-based erbium-doped fiber ring cavity configurations. Firstly, our study includes experimental investigations on the characteristic features of NLPs both in the net anomalous dispersion regime and in the net normal dispersion regime, in comparison with coherent optical pulses that can alternatively be obtained from the same cavity configurations, i.e., with the conventional and dissipative solitons. Secondly, our study includes numerical simulations on the formation of NLPs, utilizing a simplified, scalar-field model based on the characteristic transfer function of the NPR mechanism in conjunction with the split-step Fourier algorithm, which offer a great help in exploring the interrelationship between the NLP formation and various cavity parameters, and eventually present good agreement with the experimental results. We stress that if the cavity operates with excessively high gain, i.e., higher than the levels just required for generating coherent mode-locked pulses, i.e., conventional solitons and dissipative solitons, it may trigger NLPs, depending on the characteristic transfer function of the NPR mechanism induced in the cavity. In particular, the NPR transfer function is characterized by the critical saturation power and the linear loss ratio. Finally, we also report on the applications of the fiber-based NLP sources, including supercontinuum generation in a master-oscillator power amplifier configuration seeded by a fiber-based NLP source, as one typical example. We expect that the NLP-related research area will continue to expand, and that NLP-based sources will also find more applications in the future.

  12. Using Medical Text Extraction, Reasoning and Mapping System (MTERMS) to Process Medication Information in Outpatient Clinical Notes

    PubMed Central

    Zhou, Li; Plasek, Joseph M; Mahoney, Lisa M; Karipineni, Neelima; Chang, Frank; Yan, Xuemin; Chang, Fenny; Dimaggio, Dana; Goldman, Debora S.; Rocha, Roberto A.

    2011-01-01

    Clinical information is often coded using different terminologies, and therefore is not interoperable. Our goal is to develop a general natural language processing (NLP) system, called Medical Text Extraction, Reasoning and Mapping System (MTERMS), which encodes clinical text using different terminologies and simultaneously establishes dynamic mappings between them. MTERMS applies a modular, pipeline approach flowing from a preprocessor, semantic tagger, terminology mapper, context analyzer, and parser to structure inputted clinical notes. Evaluators manually reviewed 30 free-text and 10 structured outpatient clinical notes compared to MTERMS output. MTERMS achieved an overall F-measure of 90.6 and 94.0 for free-text and structured notes respectively for medication and temporal information. The local medication terminology had 83.0% coverage compared to RxNorm’s 98.0% coverage for free-text notes. 61.6% of mappings between the terminologies are exact match. Capture of duration was significantly improved (91.7% vs. 52.5%) from systems in the third i2b2 challenge. PMID:22195230

  13. The eyes don't have it: lie detection and Neuro-Linguistic Programming.

    PubMed

    Wiseman, Richard; Watt, Caroline; ten Brinke, Leanne; Porter, Stephen; Couper, Sara-Louise; Rankin, Calum

    2012-01-01

    Proponents of Neuro-Linguistic Programming (NLP) claim that certain eye-movements are reliable indicators of lying. According to this notion, a person looking up to their right suggests a lie whereas looking up to their left is indicative of truth telling. Despite widespread belief in this claim, no previous research has examined its validity. In Study 1 the eye movements of participants who were lying or telling the truth were coded, but did not match the NLP patterning. In Study 2 one group of participants were told about the NLP eye-movement hypothesis whilst a second control group were not. Both groups then undertook a lie detection test. No significant differences emerged between the two groups. Study 3 involved coding the eye movements of both liars and truth tellers taking part in high profile press conferences. Once again, no significant differences were discovered. Taken together the results of the three studies fail to support the claims of NLP. The theoretical and practical implications of these findings are discussed.

  14. The Eyes Don’t Have It: Lie Detection and Neuro-Linguistic Programming

    PubMed Central

    Wiseman, Richard; Watt, Caroline; ten Brinke, Leanne; Porter, Stephen; Couper, Sara-Louise; Rankin, Calum

    2012-01-01

    Proponents of Neuro-Linguistic Programming (NLP) claim that certain eye-movements are reliable indicators of lying. According to this notion, a person looking up to their right suggests a lie whereas looking up to their left is indicative of truth telling. Despite widespread belief in this claim, no previous research has examined its validity. In Study 1 the eye movements of participants who were lying or telling the truth were coded, but did not match the NLP patterning. In Study 2 one group of participants were told about the NLP eye-movement hypothesis whilst a second control group were not. Both groups then undertook a lie detection test. No significant differences emerged between the two groups. Study 3 involved coding the eye movements of both liars and truth tellers taking part in high profile press conferences. Once again, no significant differences were discovered. Taken together the results of the three studies fail to support the claims of NLP. The theoretical and practical implications of these findings are discussed. PMID:22808128

  15. BioNLP Shared Task--The Bacteria Track.

    PubMed

    Bossy, Robert; Jourde, Julien; Manine, Alain-Pierre; Veber, Philippe; Alphonse, Erick; van de Guchte, Maarten; Bessières, Philippe; Nédellec, Claire

    2012-06-26

    We present the BioNLP 2011 Shared Task Bacteria Track, the first Information Extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task is aimed at extracting gene renaming and gene name synonymy in PubMed abstracts. The Bacteria Gene Interaction is a gene/protein interaction extraction task from individual sentences. The interactions have been categorized into ten different sub-types, thus giving a detailed account of genetic regulations at the molecular level. Finally, the Bacteria Biotopes task focuses on the localization and environment of bacteria mentioned in textbook articles. We describe the process of creation for the three corpora, including document acquisition and manual annotation, as well as the metrics used to evaluate the participants' submissions. Three teams submitted to the Bacteria Gene Renaming task; the best team achieved an F-score of 87%. For the Bacteria Gene Interaction task, the only participant's score had reached a global F-score of 77%, although the system efficiency varies significantly from one sub-type to another. Three teams submitted to the Bacteria Biotopes task with very different approaches; the best team achieved an F-score of 45%. However, the detailed study of the participating systems efficiency reveals the strengths and weaknesses of each participating system. The three tasks of the Bacteria Track offer participants a chance to address a wide range of issues in Information Extraction, including entity recognition, semantic typing and coreference resolution. We found common trends in the most efficient systems: the systematic use of syntactic dependencies and machine learning. Nevertheless, the originality of the Bacteria Biotopes task encouraged the use of interesting novel methods and techniques, such as term compositionality, scopes wider than the sentence.

  16. NLP-12 engages different UNC-13 proteins to potentiate tonic and evoked release.

    PubMed

    Hu, Zhitao; Vashlishan-Murray, Amy B; Kaplan, Joshua M

    2015-01-21

    A neuropeptide (NLP-12) and its receptor (CKR-2) potentiate tonic and evoked ACh release at Caenorhabditis elegans neuromuscular junctions. Increased evoked release is mediated by a presynaptic pathway (egl-30 Gαq and egl-8 PLCβ) that produces DAG, and by DAG binding to short and long UNC-13 proteins. Potentiation of tonic ACh release persists in mutants deficient for egl-30 Gαq and egl-8 PLCβ and requires DAG binding to UNC-13L (but not UNC-13S). Thus, NLP-12 adjusts tonic and evoked release by distinct mechanisms. Copyright © 2015 the authors 0270-6474/15/351038-05$15.00/0.

  17. System Architecture for Temporal Information Extraction, Representation and Reasoning in Clinical Narrative Reports

    PubMed Central

    Zhou, Li; Friedman, Carol; Parsons, Simon; Hripcsak, George

    2005-01-01

    Exploring temporal information in narrative Electronic Medical Records (EMRs) is essential and challenging. We propose an architecture for an integrated approach to process temporal information in clinical narrative reports. The goal is to initiate and build a foundation that supports applications which assist healthcare practice and research by including the ability to determine the time of clinical events (e.g., past vs. present). Key components include: (1) a temporal constraint structure for temporal expressions and the development of an associated tagger; (2) a Natural Language Processing (NLP) system for encoding and extracting medical events and associating them with formalized temporal data; (3) a post-processor, with a knowledge-based subsystem to help discover implicit information, that resolves temporal expressions and deals with issues such as granularity and vagueness; and (4) a reasoning mechanism which models clinical reports as Simple Temporal Problems (STPs). PMID:16779164

  18. WHU at TREC KBA Vital Filtering Track 2014

    DTIC Science & Technology

    2014-11-01

    view the problem as a classification problem and use Stanford NLP Toolkit to extract necessary information. Various kinds of features are leveraged to...profile of an entity. Our approach is to view the problem as a classification problem and use Stanford NLP Toolkit to extract necessary information

  19. Neuro-Linguistic Programming: A Discussion of Why and How.

    ERIC Educational Resources Information Center

    Partridge, Susan

    Intended for teachers, this article offers a definition of neuro-linguistic programming (NLP), discusses its relevance to instruction, and provides illustrations of the implementation of neuro-linguistic programming in instructional contexts. NLP is defined as an approach to instruction that recognizes the familiar visual, auditory, and…

  20. A missense mutation in the agouti signaling protein gene (ASIP) is associated with the no light points coat phenotype in donkeys.

    PubMed

    Abitbol, Marie; Legrand, Romain; Tiret, Laurent

    2015-04-08

    Seven donkey breeds are recognized by the French studbook and are characterized by a black, bay or grey coat colour including light cream-to-white points (LP). Occasionally, Normand bay donkeys give birth to dark foals that lack LP and display the no light points (NLP) pattern. This pattern is more frequent and officially recognized in American miniature donkeys. The LP (or pangare) phenotype resembles that of the light bellied agouti pattern in mouse, while the NLP pattern resembles that of the mammalian recessive black phenotype; both phenotypes are associated with the agouti signaling protein gene (ASIP). We used a panel of 127 donkeys to identify a recessive missense c.349 T > C variant in ASIP that was shown to be in complete association with the NLP phenotype. This variant results in a cysteine to arginine substitution at position 117 in the ASIP protein. This cysteine is highly-conserved among vertebrate ASIP proteins and was previously shown by mutagenesis experiments to lie within a functional site. Altogether, our results strongly support that the identified mutation is causative of the NLP phenotype. Thus, we propose to name the c.[349 T > C] allele in donkeys, the a(nlp) allele, which enlarges the panel of coat colour alleles in donkeys and ASIP recessive loss-of-function alleles in animals.

  1. Temporal data representation, normalization, extraction, and reasoning: A review from clinical domain

    PubMed Central

    Madkour, Mohcine; Benhaddou, Driss; Tao, Cui

    2016-01-01

    Background and Objective: We live our lives by the calendar and the clock, but time is also an abstraction, even an illusion. The sense of time can be both domain-specific and complex, and is often left implicit, requiring significant domain knowledge to accurately recognize and harness. In the clinical domain, the momentum gained from recent advances in infrastructure and governance practices has enabled the collection of tremendous amounts of data at each moment in time. Electronic Health Records (EHRs) have paved the way to making these data available for practitioners and researchers. However, temporal data representation, normalization, extraction and reasoning are very important in order to mine such massive data and therefore for constructing the clinical timeline. The objective of this work is to provide an overview of the problem of constructing a timeline at the clinical point of care and to summarize the state-of-the-art in processing temporal information of clinical narratives. Methods: This review surveys the methods used in three important areas: modeling and representing of time, medical NLP methods for extracting time, and methods of time reasoning and processing. The review emphasizes the existing gap between present methods and semantic web technologies and considers the possible combinations. Results: The main finding of this review is the importance of time processing not only in constructing timelines and clinical decision support systems but also as a vital component of EHR data models and operations. Conclusions: Extracting temporal information in clinical narratives is a challenging task. The inclusion of ontologies and semantic web will lead to better assessment of the annotation task and, together with medical NLP techniques, will help resolve granularity and co-reference resolution problems. PMID:27040831

  2. 49 CFR 563.8 - Data format.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... the first acceleration data point; (3) The number of the last point (NLP), which is an integer that...; and (4) NLP - NFP + 1 acceleration values sequentially beginning with the acceleration at time NFP * TS and continue sampling the acceleration at TS increments in time until the time NLP * TS is reached...
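The counting rule in this excerpt can be illustrated with a minimal sketch. It assumes only what the excerpt states: NFP and NLP are integer indices of the first and last points, TS is the time step, and samples run from NFP * TS to NLP * TS in TS increments. The function name `sample_times` and the example values are hypothetical and are not taken from the regulation.

```python
def sample_times(nfp: int, nlp: int, ts: float) -> list[float]:
    """Time stamps, relative to time zero, of the recorded acceleration points.

    Assumption: nfp and nlp are the integer point numbers (NFP, NLP) and
    ts is the time step (TS), as described in the excerpt above.
    """
    if nlp < nfp:
        raise ValueError("NLP must not be smaller than NFP")
    # NLP - NFP + 1 values, starting at NFP * TS and ending at NLP * TS.
    return [k * ts for k in range(nfp, nlp + 1)]


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the count and the end points.
    times = sample_times(nfp=2, nlp=5, ts=0.25)
    print(len(times), times[0], times[-1])
```

Multiplying the index by TS, instead of adding TS repeatedly, avoids accumulating floating-point error over long records.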

  3. 49 CFR 563.8 - Data format

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... number of the last point (NLP), which is an integer that when multiplied by the TS equals the time relative to time zero of the last acceleration data point; and (4) NLP - NFP + 1 acceleration values... increments in time until the time NLP * TS is reached. [73 FR 2183, Jan. 14, 2008] ...

  4. Automatic Selection of Suitable Sentences for Language Learning Exercises

    ERIC Educational Resources Information Center

    Pilán, Ildikó; Volodina, Elena; Johansson, Richard

    2013-01-01

    In our study we investigated second and foreign language (L2) sentence readability, an area little explored so far in the case of several languages, including Swedish. The outcome of our research consists of two methods for sentence selection from native language corpora based on Natural Language Processing (NLP) and machine learning (ML)…

  5. Human-Level Natural Language Understanding: False Progress and Real Challenges

    ERIC Educational Resources Information Center

    Bignoli, Perrin G.

    2013-01-01

    The field of Natural Language Processing (NLP) focuses on the study of how utterances composed of human-level languages can be understood and generated. Typically, there are considered to be three intertwined levels of structure that interact to create meaning in language: syntax, semantics, and pragmatics. Not only is a large amount of…

  6. Nonlinear-programming mathematical modeling of coal blending for power plant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang Longhua; Zhou Junhu; Yao Qiang

    At present most blending work is guided by experience or linear programming (LP), which cannot properly reflect the complicated characteristics of coal. Experimental and theoretical research shows that most coal blend properties cannot always be expressed as a linear function of the properties of the individual coals in the blend. The authors introduced nonlinear functions or processes (including neural network and fuzzy mathematics), established on experiments directed by the authors and other researchers, to quantitatively describe the complex coal blend parameters. Finally, nonlinear-programming (NLP) mathematical modeling of coal blending is introduced and utilized in the Hangzhou Coal Blending Center. Predictions based on the new method differed from those based on LP modeling. The authors conclude that it is very important to introduce NLP modeling, instead of LP modeling, into the work of coal blending.

  7. Validation of psoriatic arthritis diagnoses in electronic medical records using natural language processing

    PubMed Central

    Cai, Tianxi; Karlson, Elizabeth W.

    2013-01-01

    Objectives: To test whether data extracted from full text patient visit notes from an electronic medical record (EMR) would improve the classification of PsA compared to an algorithm based on codified data. Methods: From the > 1,350,000 adults in a large academic EMR, all 2318 patients with a billing code for PsA were extracted and 550 were randomly selected for chart review and algorithm training. Using codified data and phrases extracted from narrative data using natural language processing, 31 predictors were extracted and three random forest algorithms trained using coded, narrative, and combined predictors. The receiver operating characteristic (ROC) curve was used to identify the optimal algorithm and a cut point was chosen to achieve the maximum sensitivity possible at a 90% positive predictive value (PPV). The algorithm was then used to classify the remaining 1768 charts and finally validated in a random sample of 300 cases predicted to have PsA. Results: The PPV of a single PsA code was 57% (95% CI 55%–58%). Using a combination of coded data and NLP, the random forest algorithm reached a PPV of 90% (95% CI 86%–93%) at a sensitivity of 87% (95% CI 83%–91%) in the training data. The PPV was 93% (95% CI 89%–96%) in the validation set. Adding NLP predictors to codified data increased the area under the ROC curve (p < 0.001). Conclusions: Using NLP with text notes from electronic medical records improved the performance of the prediction algorithm significantly. Random forests were a useful tool to accurately classify psoriatic arthritis cases to enable epidemiological research. PMID:20701955
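The cut-point rule in this abstract (maximum sensitivity at a 90% PPV) can be sketched as follows. This is an illustration only, not the authors' code. It assumes each chart has a classifier score and a chart-review label, and it reads "at a 90% PPV" as "PPV of at least 0.90". The function name `choose_cut_point` and the toy data are hypothetical.

```python
def choose_cut_point(scores, labels, min_ppv=0.90):
    """Return (threshold, sensitivity, ppv) with the highest sensitivity
    among thresholds whose PPV is at least min_ppv, or None if none qualifies.

    Assumption: a case is predicted positive when its score >= threshold.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return None
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        ppv = tp / (tp + fp)        # t is one of the scores, so tp + fp >= 1
        sens = tp / total_pos
        if ppv >= min_ppv and (best is None or sens > best[1]):
            best = (t, sens, ppv)
    return best


if __name__ == "__main__":
    # Toy scores and chart-review labels (1 = confirmed case), made up for illustration.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    labels = [1, 1, 1, 0, 1, 0]
    print(choose_cut_point(scores, labels))
```

Lowering the threshold raises sensitivity but usually lowers PPV, so the rule keeps the least strict threshold that still meets the PPV floor.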

  8. Dependency-based Siamese long short-term memory network for learning sentence representations

    PubMed Central

    Zhu, Wenhao; Ni, Jianyue; Wei, Baogang; Lu, Zhiguo

    2018-01-01

    Textual representations play an important role in the field of natural language processing (NLP). The efficiency of NLP tasks, such as text comprehension and information extraction, can be significantly improved with proper textual representations. As neural networks are gradually applied to learn the representation of words and phrases, fairly efficient models of learning short text representations have been developed, such as the continuous bag of words (CBOW) and skip-gram models, and they have been extensively employed in a variety of NLP tasks. Because of the complex structure generated by the longer text lengths, such as sentences, algorithms appropriate for learning short textual representations are not applicable for learning long textual representations. One method of learning long textual representations is the Long Short-Term Memory (LSTM) network, which is suitable for processing sequences. However, the standard LSTM does not adequately address the primary sentence structure (subject, predicate and object), which is an important factor for producing appropriate sentence representations. To resolve this issue, this paper proposes the dependency-based LSTM model (D-LSTM). The D-LSTM divides a sentence representation into two parts: a basic component and a supporting component. The D-LSTM uses a pre-trained dependency parser to obtain the primary sentence information and generate supporting components, and it also uses a standard LSTM model to generate the basic sentence components. A weight factor that can adjust the ratio of the basic and supporting components in a sentence is introduced to generate the sentence representation. Compared with the representation learned by the standard LSTM, the sentence representation learned by the D-LSTM contains a greater amount of useful information. The experimental results show that the D-LSTM is superior to the standard LSTM for sentences involving compositional knowledge (SICK) data. PMID:29513748

  9. Assessing Primary Representational System (PRS) Preference for Neurolinguistic Programming (NLP) Using Three Methods.

    ERIC Educational Resources Information Center

    Dorn, Fred J.

    1983-01-01

    Considered three methods of identifying Primary Representational System (PRS)--an interview, a word list, and a self-report--in a study of 120 college students. Results suggested the three methods offer little to counselors either collectively or individually. Results did not validate the PRS construct, suggesting the need for further research.…

  10. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing

    ERIC Educational Resources Information Center

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine

    2008-01-01

    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  11. MedEx/J: A One-Scan Simple and Fast NLP Tool for Japanese Clinical Texts.

    PubMed

    Aramaki, Eiji; Yano, Ken; Wakamiya, Shoko

    2017-01-01

    Because of recent replacement of physical documents with electronic medical records (EMR), the importance of information processing in the medical field has increased. In light of this trend, we have been developing MedEx/J, which retrieves important Japanese language information from medical reports. MedEx/J executes two tasks simultaneously: (1) term extraction, and (2) positive and negative event classification. We designate this approach as a one-scan approach, providing simplicity of systems and reasonable accuracy. MedEx/J performance on the two tasks is described herein: (1) term extraction (F-measure with β = 1 of 0.87) and (2) positive-negative classification (F-measure with β = 1 of 0.63). This paper also presents discussion and explains remaining issues in the medical natural language processing field.

  12. Neuro-linguistic programming and application in treatment of phobias.

    PubMed

    Karunaratne, Mahishika

    2010-11-01

    Phobias are a prevalent and often debilitating mental health problem all over the world. This article aims to explore what is known about the use of Neuro-linguistic Programming (NLP) as a treatment for this condition. Whilst there is abundant experiential evidence from NLP practitioners attesting to the efficacy of this method as a treatment for phobias, experimental research in this area is somewhat limited. This paper reviews evidence available in literature produced in the UK and US and reveals that NLP is a successful treatment for phobias as well as being particularly efficient due to the relatively brief time period it takes to effect an improvement. Copyright © 2010 Elsevier Ltd. All rights reserved.

  13. DE and NLP Based QPLS Algorithm

    NASA Astrophysics Data System (ADS)

    Yu, Xiaodong; Huang, Dexian; Wang, Xiong; Liu, Bo

    As a novel evolutionary computing technique, Differential Evolution (DE) has been considered to be an effective optimization method for complex optimization problems, and has achieved many successful applications in engineering. In this paper, a new algorithm of Quadratic Partial Least Squares (QPLS) based on Nonlinear Programming (NLP) is presented. DE is used to solve the NLP so as to calculate the optimal input weights and the parameters of the inner relationship. The simulation results, based on the soft measurement of diesel oil solidifying point on a real crude distillation unit, demonstrate the superiority of the proposed algorithm over linear PLS and over QPLS based on Sequential Quadratic Programming (SQP) in terms of fitting accuracy and computational costs.

  14. Adaptive estimation of nonlinear parameters of a nonholonomic spherical robot using a modified fuzzy-based speed gradient algorithm

    NASA Astrophysics Data System (ADS)

    Roozegar, Mehdi; Mahjoob, Mohammad J.; Ayati, Moosa

    2017-05-01

    This paper deals with adaptive estimation of the unknown parameters and states of a pendulum-driven spherical robot (PDSR), which is a nonlinear in parameters (NLP) chaotic system with parametric uncertainties. Firstly, the mathematical model of the robot is deduced by applying the Newton-Euler methodology for a system of rigid bodies. Then, based on the speed gradient (SG) algorithm, the states and unknown parameters of the robot are estimated online for different step length gains and initial conditions. The estimated parameters are updated adaptively according to the error between estimated and true state values. Since the errors of the estimated states and parameters as well as the convergence rates depend significantly on the value of step length gain, this gain should be chosen optimally. Hence, a heuristic fuzzy logic controller is employed to adjust the gain adaptively. Simulation results indicate that the proposed approach is highly encouraging for identification of this NLP chaotic system even if the initial conditions change and the uncertainties increase; therefore, it is reliable to be implemented on a real robot.

  15. Neuro-Linguistic Programming: Developing Effective Communication in the Classroom.

    ERIC Educational Resources Information Center

    Torres, Cresencio; Katz, Judy H.

    Neuro-Linguistic Programming (NLP) is a method that teachers can use to increase their communication effectiveness by matching their communication patterns with those of their students. The basic premise of NLP is that people operate and make sense of their experience through information received from the world around them. This information is…

  16. Research Findings on Neurolinguistic Programming: Nonsupportive Data or an Untestable Theory?

    ERIC Educational Resources Information Center

    Sharpley, Christopher F.

    1987-01-01

    Examines the experimental literature on neurolinguistic programming (NLP). Sharpley (1984) and Einspruch and Forman (1985) concluded that the effectiveness of this therapy was yet to be demonstrated. Presents data from seven recent studies that further question the basic tenets of NLP and their application in counseling situations. (Author/KS)

  17. Neuro-Linguistic Programming as an Innovation in Education and Teaching

    ERIC Educational Resources Information Center

    Tosey, Paul; Mathison, Jane

    2010-01-01

    Neuro-linguistic programming (NLP)--an emergent, contested approach to communication and personal development created in the 1970s--has become increasingly familiar in education and teaching. There is little academic work on NLP to date. This article offers an informed introduction to, and appraisal of, the field for educators. We review the…

  18. 49 CFR 563.8 - Data format.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... point (NLP), which is an integer that when multiplied by the TS equals the time relative to time zero of the last acceleration data point; and (4) NLP—NFP + 1 acceleration values sequentially beginning with... until the time NLP * TS is reached. [73 FR 2183, Jan. 14, 2008] § 563.8, Nt. Effective Date Note: At 76...

  19. Noise-like pulse generation in an ytterbium-doped fiber laser using tungsten disulphide

    NASA Astrophysics Data System (ADS)

    Zhang, Wenping; Song, Yanrong; Guoyu, Heyang; Xu, Runqin; Dong, Zikai; Li, Kexuan; Tian, Jinrong; Gong, Shuang

    2017-12-01

    We demonstrated the noise-like pulse (NLP) generation in an ytterbium-doped fiber (YDF) laser with tungsten disulphide (WS2). Stable fundamental mode locking and second-order harmonic mode locking were observed. The saturable absorber (SA) was a WS2-polyvinyl alcohol film. The modulation depth of the WS2 film was 2.4%, and the saturable optical intensity was 155 MW cm-2. Based on this SA, the fundamental NLP with a pulse width of 20 ns and repetition rate of 7 MHz were observed. The autocorrelation trace of output pulses had a coherent spike, which came from NLP. The average pulse width of the spike was 550 fs on the top of a broad pedestal. The second-order harmonic NLP had a spectral bandwidth of 1.3 nm and pulse width of 10 ns. With the pump power of 400 mW, the maximum output power was 22.2 mW. To the best of our knowledge, this is the first time a noise-like mode locking in an YDF laser based on WS2-SA in an all normal dispersion regime was obtained.

  20. LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    VERSPOOR, KARIN; LIN, SHOU-DE

    An N-gram language model aims at capturing statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
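
    As an illustration of the standard N-gram model that this abstract takes as its starting point, the following is a minimal bigram sketch. It is not the authors' semantics-enhanced model; the function names, the toy corpus, and the unsmoothed maximum-likelihood estimate are all assumptions made for the example.

```python
from collections import Counter


def train_bigram(sentences):
    """Count histories and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        # Sentence boundary markers let the model score first and last words.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that acts as a history
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return unigrams, bigrams


def bigram_prob(unigrams, bigrams, prev, word):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    if unigrams[prev] == 0:
        return 0.0  # unseen history: no estimate without smoothing
    return bigrams[(prev, word)] / unigrams[prev]


# Hypothetical two-sentence corpus; "bank" has a different sense in each.
corpus = ["the bank approved the loan", "the river bank flooded"]
unigrams, bigrams = train_bigram(corpus)
p_bank_after_the = bigram_prob(unigrams, bigrams, "the", "bank")
```

    The model records only which word follows which, so both senses of "bank" share the same counts. This is the limitation for word sense disambiguation that the abstract describes.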

  1. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations

    PubMed Central

    Miyao, Yusuke; Collier, Nigel

    2017-01-01

    Background: Work on pharmacovigilance systems using texts from PubMed and Twitter typically target at different elements and use different annotation guidelines resulting in a scenario where there is no comparable set of documents from both Twitter and PubMed annotated in the same manner. Objective: This study aimed to provide a comparable corpus of texts from PubMed and Twitter that can be used to study drug reports from these two sources of information, allowing researchers in the area of pharmacovigilance using natural language processing (NLP) to perform experiments to better understand the similarities and differences between drug reports in Twitter and PubMed. Methods: We produced a corpus comprising 1000 tweets and 1000 PubMed sentences selected using the same strategy and annotated at entity level by the same experts (pharmacists) using the same set of guidelines. Results: The resulting corpus, annotated by two pharmacists, comprises semantically correct annotations for a set of drugs, diseases, and symptoms. This corpus contains the annotations for 3144 entities, 2749 relations, and 5003 attributes. Conclusions: We present a corpus that is unique in its characteristics as this is the first corpus for pharmacovigilance curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest for researchers willing to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) when using data from Twitter and from PubMed. We hope that given the comprehensive set of drug names and the annotated entities and relations, this corpus becomes a standard resource to compare results from different pharmacovigilance studies in the area of NLP. PMID:28468748

  2. TwiMed: Twitter and PubMed Comparable Corpus of Drugs, Diseases, Symptoms, and Their Relations.

    PubMed

    Alvaro, Nestor; Miyao, Yusuke; Collier, Nigel

    2017-05-03

    Work on pharmacovigilance systems using texts from PubMed and Twitter typically target at different elements and use different annotation guidelines resulting in a scenario where there is no comparable set of documents from both Twitter and PubMed annotated in the same manner. This study aimed to provide a comparable corpus of texts from PubMed and Twitter that can be used to study drug reports from these two sources of information, allowing researchers in the area of pharmacovigilance using natural language processing (NLP) to perform experiments to better understand the similarities and differences between drug reports in Twitter and PubMed. We produced a corpus comprising 1000 tweets and 1000 PubMed sentences selected using the same strategy and annotated at entity level by the same experts (pharmacists) using the same set of guidelines. The resulting corpus, annotated by two pharmacists, comprises semantically correct annotations for a set of drugs, diseases, and symptoms. This corpus contains the annotations for 3144 entities, 2749 relations, and 5003 attributes. We present a corpus that is unique in its characteristics as this is the first corpus for pharmacovigilance curated from Twitter messages and PubMed sentences using the same data selection and annotation strategies. We believe this corpus will be of particular interest for researchers willing to compare results from pharmacovigilance systems (eg, classifiers and named entity recognition systems) when using data from Twitter and from PubMed. We hope that given the comprehensive set of drug names and the annotated entities and relations, this corpus becomes a standard resource to compare results from different pharmacovigilance studies in the area of NLP. ©Nestor Alvaro, Yusuke Miyao, Nigel Collier. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 03.05.2017.

  3. BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.

    PubMed

    Sogancioglu, Gizem; Öztürk, Hakime; Özgür, Arzucan

    2017-07-15

    The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text. We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods. The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems up to 42.6% in terms of the Pearson correlation metric. A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/ . gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr. © The Author 2017. Published by Oxford University Press. All rights reserved. 
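
    The BIOSSES abstract above mentions string similarity measures and evaluation by Pearson correlation against human annotations. A minimal sketch of both ideas follows. The token-level Jaccard measure is a stand-in chosen for illustration and is not confirmed to be one of the measures used in BIOSSES; the function names are hypothetical.

```python
import math


def jaccard(sentence_a, sentence_b):
    """Token-set overlap between two sentences, in [0, 1]."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pearson(xs, ys):
    """Pearson correlation between system scores and gold-standard scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)
```

    In an evaluation of this kind, a similarity function such as jaccard would be applied to each sentence pair in the benchmark, and pearson would compare the resulting scores with the averaged human ratings.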

  4. Translation of Japanese Noun Compounds at Super-Function Based MT System

    NASA Astrophysics Data System (ADS)

    Zhao, Xin; Ren, Fuji; Kuroiwa, Shingo

    Noun compounds are a frequently encountered construction in natural language processing (NLP), consisting of a sequence of two or more nouns which functions syntactically as one noun. The translation of noun compounds has become a major issue in Machine Translation (MT) due to their frequency of occurrence and high productivity. In our previous studies on Super-Function Based Machine Translation (SFBMT), we have found that noun compounds are very frequently used and difficult to translate correctly; the overgeneration of noun compounds can be dangerous as it may introduce ambiguity in the translation. In this paper, we discuss the challenges in handling Japanese noun compounds in an SFBMT system and present a shallow method for translating noun compounds by using a word level translation dictionary and target language monolingual corpus.

  5. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications

    PubMed Central

    Masanz, James J; Ogren, Philip V; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin C; Chute, Christopher G

    2010-01-01

    We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text. PMID:20819853

  6. Dissecting the signaling mechanisms underlying recognition and preference of food odors.

    PubMed

    Harris, Gareth; Shen, Yu; Ha, Heonick; Donato, Alessandra; Wallis, Samuel; Zhang, Xiaodong; Zhang, Yun

    2014-07-09

    Food is critical for survival. Many animals, including the nematode Caenorhabditis elegans, use sensorimotor systems to detect and locate preferred food sources. However, the signaling mechanisms underlying food-choice behaviors are poorly understood. Here, we characterize the molecular signaling that regulates recognition and preference between different food odors in C. elegans. We show that the major olfactory sensory neurons, AWB and AWC, play essential roles in this behavior. A canonical Gα-protein, together with guanylate cyclases and cGMP-gated channels, is needed for the recognition of food odors. The food-odor-evoked signal is transmitted via glutamatergic neurotransmission from AWC and through AMPA and kainate-like glutamate receptor subunits. In contrast, peptidergic signaling is required to generate preference between different food odors while being dispensable for the recognition of the odors. We show that this regulation is achieved by the neuropeptide NLP-9 produced in AWB, which acts with its putative receptor NPR-18, and by the neuropeptide NLP-1 produced in AWC. In addition, another set of sensory neurons inhibits food-odor preference. These mechanistic logics, together with a previously mapped neural circuit underlying food-odor preference, provide a functional network linking sensory response, transduction, and downstream receptors to process complex olfactory information and generate the appropriate behavioral decision essential for survival.

  7. The ACODEA Framework: Developing Segmentation and Classification Schemes for Fully Automatic Analysis of Online Discussions

    ERIC Educational Resources Information Center

    Mu, Jin; Stegmann, Karsten; Mayfield, Elijah; Rose, Carolyn; Fischer, Frank

    2012-01-01

    Research related to online discussions frequently faces the problem of analyzing huge corpora. Natural Language Processing (NLP) technologies may allow automating this analysis. However, the state-of-the-art in machine learning and text mining approaches yields models that do not transfer well between corpora related to different topics. Also,…

  8. The Impact of Anonymization for Automated Essay Scoring

    ERIC Educational Resources Information Center

    Shermis, Mark D.; Lottridge, Sue; Mayfield, Elijah

    2015-01-01

    This study investigated the impact of anonymizing text on predicted scores made by two kinds of automated scoring engines: one that incorporates elements of natural language processing (NLP) and one that does not. Eight data sets (N = 22,029) were used to form both training and test sets in which the scoring engines had access to both text and…

  9. Entity Relation Detection with Factorial Hidden Markov Models and Maximum Entropy Discriminant Latent Dirichlet Allocations

    ERIC Educational Resources Information Center

    Li, Dingcheng

    2011-01-01

    Coreference resolution (CR) and entity relation detection (ERD) aim at finding predefined relations between pairs of entities in text. CR focuses on resolving identity relations while ERD focuses on detecting non-identity relations. Both CR and ERD are important as they can potentially improve other natural language processing (NLP) related tasks…

  10. Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features.

    PubMed

    Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela

    2015-05-01

    Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.

  11. Social media based NPL system to find and retrieve ARM data: Concept paper

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra

    Information connectivity and retrieval has a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at a rapid rate and database technology is improving and having a profound effect. Almost all online applications are storing and retrieving information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may not be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP) we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter [5], our Cassandra DB, and our log database. Using the Twitter API and NLP we can give the public the ability to ask questions of our database and get automated responses.

  12. A Sibling-Mediated Intervention for Children with Autism Spectrum Disorder: Using the Natural Language Paradigm (NLP)

    ERIC Educational Resources Information Center

    Spector, Vicki; Charlop, Marjorie H.

    2018-01-01

    We taught three typically developing siblings to occasion speech by implementing the Natural Language Paradigm (NLP) with their brothers with autism spectrum disorder (ASD). A non-concurrent multiple baseline design across children with ASD and sibling dyads was used. Ancillary behaviors of happiness, play, and joint attention for the children…

  13. Applications of NLP Techniques to Computer-Assisted Authoring of Test Items for Elementary Chinese

    ERIC Educational Resources Information Center

    Liu, Chao-Lin; Lin, Jen-Hsiang; Wang, Yu-Chun

    2010-01-01

    The authors report an implemented environment for computer-assisted authoring of test items and provide a brief discussion about the applications of NLP techniques for computer assisted language learning. Test items can serve as a tool for language learners to examine their competence in the target language. The authors apply techniques for…

  14. 49 CFR 563.8 - Data format.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... number of the last point (NLP), which is an integer that when multiplied by the TS equals the time relative to time zero of the last acceleration data point; and (4) NLP—NFP + 1 acceleration values... increments in time until the time NLP * TS is reached. [73 FR 2183, Jan. 14, 2008, as amended at 76 FR 47488...

  15. Parent-Implemented Natural Language Paradigm to Increase Language and Play in Children with Autism

    ERIC Educational Resources Information Center

    Gillett, Jill N.; LeBlanc, Linda A.

    2007-01-01

    Three parents of children with autism were taught to implement the Natural Language Paradigm (NLP). Data were collected on parent implementation, multiple measures of child language, and play. The parents were able to learn to implement the NLP procedures quickly and accurately with beneficial results for their children. Increases in the overall…

  16. Applying "What Works" in Psychology to Enhancing Examination Success in Schools: The Potential Contribution of NLP

    ERIC Educational Resources Information Center

    Kudliskis, Voldis; Burden, Robert

    2009-01-01

    The strengths and weaknesses of Neuro-Linguistic Programming (NLP) are described with reference to its origins, previous research and comments from critics and supporters. A case is made for this allegedly theoretical approach to provide the kind of outcomes focused intervention that psychology and psychologists can offer to schools. In…

  17. Research on trust-region algorithms for nonlinear programming. Final technical report, 1 January 1990--31 December 1992

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dennis, J.E. Jr.; Tapia, R.A.

    Goal of the research was to develop and test effective, robust algorithms for general nonlinear programming (NLP) problems, particularly large or otherwise expensive NLP problems. We discuss the research conducted over the 3-year period Jan. 1990-Dec. 1992. We also describe current and future directions of our research.

  18. 49 CFR 563.8 - Data format.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... number of the last point (NLP), which is an integer that when multiplied by the TS equals the time relative to time zero of the last acceleration data point; and (4) NLP—NFP + 1 acceleration values... increments in time until the time NLP * TS is reached. [73 FR 2183, Jan. 14, 2008, as amended at 76 FR 47488...

  19. Optimizing graph-based patterns to extract biomedical events from the literature

    PubMed Central

    2015-01-01

    In BioNLP-ST 2013: We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs of input sentences. Our system was able to address both the GENIA (GE) task focusing on 13 molecular biology related event types and the Cancer Genetics (CG) task targeting a challenging group of 40 cancer biology related event types with varying arguments concerning 18 kinds of biological entities. In addition to adapting our system to the two tasks, we also attempted to integrate semantics into the graph matching scheme using a distributional similarity model for more events, and evaluated the event extraction impact of using paths of all possible lengths as key context dependencies beyond using only the shortest paths in our system. We achieved a 46.38% F-score in the CG task (ranking 3rd) and a 48.93% F-score in the GE task (ranking 4th). After BioNLP-ST 2013: We explored three ways to further extend our event extraction system in our previously published work: (1) We allow non-essential nodes to be skipped, and incorporated a node skipping penalty into the subgraph distance function of our approximate subgraph matching algorithm. (2) Instead of assigning a unified subgraph distance threshold to all patterns of an event type, we learned a customized threshold for each pattern. (3) We implemented the well-known Empirical Risk Minimization (ERM) principle to optimize the event pattern set by balancing prediction errors on training data against regularization. When evaluated on the official GE task test data, these extensions help to improve the extraction precision from 62% to 65%. However, the overall F-score stays equivalent to the previous performance due to a 1% drop in recall. PMID:26551594

  20. Neuro-Linguistic Programming, Matching Sensory Predicates, and Rapport.

    ERIC Educational Resources Information Center

    Schmedlen, George W.; And Others

    A key task for the therapist in psychotherapy is to build trust and rapport with the client. Neuro-Linguistic Programming (NLP) practitioners believe that matching the sensory modality (representational system) of a client's predicates (verbs, adverbs, and adjectives) improves rapport. In this study, 16 volunteer subjects participated in two…

  1. The Military Language Tutor (MILT)

    DTIC Science & Technology

    1998-11-01

    interactive tutor in a Pentium based laptop computer. The first version of MILT with keyboard input was designed for Spanish and Arabic and can recognize... NLP ). The goal of the MILT design team was an authoring system which would require no formal external training and which could be learned within four

  2. Applying Active Learning to Assertion Classification of Concepts in Clinical Text

    PubMed Central

    Chen, Yukun; Mani, Subramani; Xu, Hua

    2012-01-01

    Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because of the involvement of physicians. Active learning, an approach that actively samples from a large pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its uses in clinical NLP. This paper reports an application of active learning to a clinical text classification task: to determine the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their uses. The outcome is reported in the global ALC score, based on the Area under the average Learning Curve of the AUC (Area Under the Curve) score. Results showed that when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC – 0.7715) than the passive learning method (random sampling) (ALC – 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort. PMID:22127105
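
    The 62.5% figure in the abstract above follows directly from the two sample counts, and the core idea of pool-based active learning can be sketched in a few lines. The selection rule shown here (pick the pool item whose predicted probability is closest to 0.5) is a generic uncertainty-sampling heuristic assumed for illustration; the abstract does not specify which algorithms the authors implemented, and the function names are hypothetical.

```python
def most_uncertain(probabilities):
    """Index of the pool item whose positive-class probability is closest to 0.5."""
    return min(range(len(probabilities)), key=lambda i: abs(probabilities[i] - 0.5))


def annotation_reduction(passive_samples, active_samples):
    """Fraction of annotation effort saved relative to random sampling."""
    return 1 - active_samples / passive_samples


# Hypothetical model confidences for four unlabeled concepts in the pool.
pool_probabilities = [0.95, 0.52, 0.10, 0.70]
next_to_annotate = most_uncertain(pool_probabilities)

# Sample counts reported in the abstract for reaching an AUC of 0.79.
saved = annotation_reduction(passive_samples=32, active_samples=12)
```

    With the abstract's counts, saved evaluates to 0.625, that is, (32 - 12) / 32, matching the reported 62.5% reduction.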

  3. Efficacy of neurolinguistic programming training on mental health in nursing and midwifery students.

    PubMed

    Sahebalzamani, Mohammad

    2014-09-01

    Neurolinguistic programming (NLP) refers to the science and art of reaching success and perfection. It is a collection of the skills based on human beings' psychological characteristics through which the individuals obtain the ability to use their personal capabilities as much as possible. This study aimed to investigate the efficacy of NLP training on mental health in nursing and midwifery students in Islamic Azad University Tehran Medical Sciences branch. In this quasi-experimental study, the study population comprised all nursing and midwifery students in Islamic Azad University, Tehran Medical branch, of whom 52 were selected and assigned to two groups through random sampling. Data collection tool was Goldberg General Health Questionnaire (28-item version). After primary evaluation, NLP training was given in five 120-min sessions and the groups were re-evaluated. The obtained data were analyzed. In the nursing group, paired t-test showed a significant difference in the scores of mental health (with 39 points decrease), physical signs (with 7.96 scores decrease), anxiety (with 10.75 scores decrease), social function (with 7.05 scores decrease) and depression (with 9.38 scores decrease). In the midwifery group, it showed a significant difference in mental health (with 22.63 scores decrease), physical signs (with 6.54 scores decrease), anxiety (with nine scores decrease), and depression (with 8.38 scores decrease). This study showed that NLP strategies are effective in the improvement of general health and its various dimensions. Therefore, it is essential to conduct structured and executive programs concerning NLP among the students.

  4. Efficacy of neurolinguistic programming training on mental health in nursing and midwifery students

    PubMed Central

    Sahebalzamani, Mohammad

    2014-01-01

    Background: Neurolinguistic programming (NLP) refers to the science and art of reaching success and perfection. It is a collection of the skills based on human beings’ psychological characteristics through which the individuals obtain the ability to use their personal capabilities as much as possible. This study aimed to investigate the efficacy of NLP training on mental health in nursing and midwifery students in Islamic Azad University Tehran Medical Sciences branch. Materials and Methods: In this quasi-experimental study, the study population comprised all nursing and midwifery students in Islamic Azad University, Tehran Medical branch, of whom 52 were selected and assigned to two groups through random sampling. Data collection tool was Goldberg General Health Questionnaire (28-item version). After primary evaluation, NLP training was given in five 120-min sessions and the groups were re-evaluated. The obtained data were analyzed. Results: In the nursing group, paired t-test showed a significant difference in the scores of mental health (with 39 points decrease), physical signs (with 7.96 scores decrease), anxiety (with 10.75 scores decrease), social function (with 7.05 scores decrease) and depression (with 9.38 scores decrease). In the midwifery group, it showed a significant difference in mental health (with 22.63 scores decrease), physical signs (with 6.54 scores decrease), anxiety (with nine scores decrease), and depression (with 8.38 scores decrease). Conclusions: This study showed that NLP strategies are effective in the improvement of general health and its various dimensions. Therefore, it is essential to conduct structured and executive programs concerning NLP among the students. PMID:25400679

  5. Basic quantitative assessment of visual performance in patients with very low vision.

    PubMed

    Bach, Michael; Wilke, Michaela; Wilhelm, Barbara; Zrenner, Eberhart; Wilke, Robert

    2010-02-01

    A variety of approaches to developing visual prostheses are being pursued: subretinal, epiretinal, via the optic nerve, or via the visual cortex. This report presents a method of comparing their efficacy at genuinely improving visual function, starting at no light perception (NLP). A test battery (a computer program, Basic Assessment of Light and Motion [BaLM]) was developed in four basic visual dimensions: (1) light perception (light/no light), with an unstructured large-field stimulus; (2) temporal resolution, with single versus double flash discrimination; (3) localization of light, where a wedge extends from the center into four possible directions; and (4) motion, with a coarse pattern moving in one of four directions. Two- or four-alternative, forced-choice paradigms were used. The participants' responses were self-paced and delivered with a keypad. The feasibility of the BaLM was tested in 73 eyes of 51 patients with low vision. The light and time test modules discriminated between NLP and light perception (LP). The localization and motion modules showed no significant response for NLP but discriminated between LP and hand movement (HM). All four modules reached their ceilings in the acuity categories higher than HM. BaLM results systematically differed between the very-low-acuity categories NLP, LP, and HM. Light and time yielded similar results, as did localization and motion; still, for assessing the visual prostheses with differing temporal characteristics, they are not redundant. The results suggest that this simple test battery provides a quantitative assessment of visual function in the very-low-vision range from NLP to HM.

  6. An overview of computer-based natural language processing

    NASA Technical Reports Server (NTRS)

    Gevarter, W. B.

    1983-01-01

    Computer based Natural Language Processing (NLP) is the key to enabling humans and their computer based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  7. Teaching Assistants, Neuro-Linguistic Programming (NLP) and Special Educational Needs: "Reframing" the Learning Experience for Students with Mild SEN

    ERIC Educational Resources Information Center

    Kudliskis, Voldis

    2014-01-01

    This study examines how an understanding of two NLP concepts, the meta-model of language and the implementation of reframing, could be used to help teaching assistants enhance class-based interactions with students with mild SEN. Participants (students) completed a pre-intervention and a post-intervention "Beliefs About my Learning…

  8. Single-shot spectroscopy of broadband Yb fiber laser

    NASA Astrophysics Data System (ADS)

    Suzuki, Masayuki; Yoneya, Shin; Kuroda, Hiroto

    2017-02-01

    We report real-time single-shot spectroscopy of a broadband Yb-doped fiber (YDF) laser, which is based on nonlinear polarization evolution, by using a time-stretched dispersive Fourier transformation technique. We measured 8000 consecutive single-shot spectra of mode locking and noise-like pulse (NLP) operation, because our broadband YDF oscillator can operate separately in the mode-locking and NLP regimes by controlling the pump LD power and the angle of the waveplates. A shot-to-shot spectral fluctuation was observed in NLP. To investigate the pulse formation dynamics, we measured, for the first time, the spectral evolution in the initial fluctuations of the mode-locked broadband YDF laser at intracavity dispersions of 1500 and 6200 fs². In both cases, the build-up time between cw and steady-state mode locking was estimated to be 50 µs; the dynamics of the spectral evolution between cw and mode locking, however, were completely different. A strong shot-to-shot spectral fluctuation, as seen in the NLP spectra, was observed in the initial timescale of 20 µs at the intracavity dispersion of 1500 fs². These new findings should help in understanding the birth of broadband spectral formation in fiber laser oscillators.

  9. Natural language processing-based COTS software and related technologies survey.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  10. Training parents to use the natural language paradigm to increase their autistic children's speech.

    PubMed Central

    Laski, K E; Charlop, M H; Schreibman, L

    1988-01-01

    Parents of four nonverbal and four echolalic autistic children were trained to increase their children's speech by using the Natural Language Paradigm (NLP), a loosely structured procedure conducted in a play environment with a variety of toys. Parents were initially trained to use the NLP in a clinic setting, with subsequent parent-child speech sessions occurring at home. The results indicated that following training, parents increased the frequency with which they required their children to speak (i.e., modeled words and phrases, prompted answers to questions). Correspondingly, all children increased the frequency of their verbalizations in three nontraining settings. Thus, the NLP appears to be an efficacious program for parents to learn and use in the home to increase their children's speech. PMID:3225256

  11. A Sibling-Mediated Intervention for Children with Autism Spectrum Disorder: Using the Natural Language Paradigm (NLP).

    PubMed

    Spector, Vicki; Charlop, Marjorie H

    2018-05-01

    We taught three typically developing siblings to occasion speech by implementing the Natural Language Paradigm (NLP) with their brothers with autism spectrum disorder (ASD). A non-concurrent multiple baseline design across children with ASD and sibling dyads was used. Ancillary behaviors of happiness, play, and joint attention for the children with ASD were recorded. Generalization of speech for the children with ASD across setting and peers was also measured. During baseline, the children with ASD displayed few target speech behaviors and the siblings inconsistently occasioned speech from their brothers. After sibling training, however, they successfully delivered NLP, and in turn, for two of the brothers with ASD, speech reached criterion. Implications of this research suggest the inclusion of siblings in interventions.

  12. Text de-identification for privacy protection: a study of its impact on clinical text information content.

    PubMed

    Meystre, Stéphane M; Ferrández, Óscar; Friedlin, F Jeffrey; South, Brett R; Shen, Shuying; Samore, Matthew H

    2014-08-01

    As more and more electronic clinical information is becoming easier to access for secondary uses such as clinical research, approaches that enable faster and more collaborative research while protecting patient privacy and confidentiality are becoming more important. Clinical text de-identification offers such advantages but is typically a tedious manual process. Automated Natural Language Processing (NLP) methods can alleviate this process, but their impact on subsequent uses of the automatically de-identified clinical narratives has only barely been investigated. In the context of a larger project to develop and investigate automated text de-identification for Veterans Health Administration (VHA) clinical notes, we studied the impact of automated text de-identification on clinical information in a stepwise manner. Our approach started with a high-level assessment of clinical notes informativeness and formatting, and ended with a detailed study of the overlap of select clinical information types and Protected Health Information (PHI). To investigate the informativeness (i.e., document type information, select clinical data types, and interpretation or conclusion) of VHA clinical notes, we used five different existing text de-identification systems. The informativeness was only minimally altered by these systems while formatting was only modified by one system. To examine the impact of de-identification on clinical information extraction, we compared counts of SNOMED-CT concepts found by an open source information extraction application in the original (i.e., not de-identified) version of a corpus of VHA clinical notes, and in the same corpus after de-identification. Only about 1.2-3% less SNOMED-CT concepts were found in de-identified versions of our corpus, and many of these concepts were PHI that was erroneously identified as clinical information. 
    To study this impact in more detail and assess how generalizable our findings were, we examined the overlap between select clinical information annotated in the 2010 i2b2 NLP challenge corpus and automatic PHI annotations from our best-of-breed VHA clinical text de-identification system (nicknamed 'BoB'). Overall, only 0.81% of the clinical information exactly overlapped with PHI, and 1.78% partly overlapped. We conclude that automated text de-identification's impact on clinical information is small, but not negligible, and that improved clinical acronyms and eponyms disambiguation could significantly reduce this impact. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Negation’s Not Solved: Generalizability Versus Optimizability in Clinical Natural Language Processing

    PubMed Central

    Wu, Stephen; Miller, Timothy; Masanz, James; Coarr, Matt; Halgrim, Scott; Carrell, David; Clark, Cheryl

    2014-01-01

    A review of published work in clinical natural language processing (NLP) may suggest that the negation detection task has been “solved.” This work proposes that an optimizable solution does not equal a generalizable solution. We introduce a new machine learning-based Polarity Module for detecting negation in clinical text, and extensively compare its performance across domains. Using four manually annotated corpora of clinical text, we show that negation detection performance suffers when there is no in-domain development (for manual methods) or training data (for machine learning-based methods). Various factors (e.g., annotation guidelines, named entity characteristics, the amount of data, and lexical and syntactic context) play a role in making generalizability difficult, but none completely explains the phenomenon. Furthermore, generalizability remains challenging because it is unclear whether to use a single source for accurate data, combine all sources into a single model, or apply domain adaptation methods. The most reliable means to improve negation detection is to manually annotate in-domain training data (or, perhaps, manually modify rules); this is a strategy for optimizing performance, rather than generalizing it. These results suggest a direction for future work in domain-adaptive and task-adaptive methods for clinical NLP. PMID:25393544

  14. Identifying Suicide Ideation and Suicidal Attempts in a Psychiatric Clinical Research Database using Natural Language Processing.

    PubMed

    Fernandes, Andrea C; Dutta, Rina; Velupillai, Sumithra; Sanyal, Jyoti; Stewart, Robert; Chandran, David

    2018-05-09

    Research into suicide prevention has been hampered by methodological limitations such as low sample size and recall bias. Recently, Natural Language Processing (NLP) strategies have been used with Electronic Health Records to increase information extraction from free text notes as well as structured fields concerning suicidality and this allows access to much larger cohorts than previously possible. This paper presents two novel NLP approaches - a rule-based approach to classify the presence of suicide ideation and a hybrid machine learning and rule-based approach to identify suicide attempts in a psychiatric clinical database. Good performance of the two classifiers in the evaluation study suggest they can be used to accurately detect mentions of suicide ideation and attempt within free-text documents in this psychiatric database. The novelty of the two approaches lies in the malleability of each classifier if a need to refine performance, or meet alternate classification requirements arises. The algorithms can also be adapted to fit infrastructures of other clinical datasets given sufficient clinical recording practice knowledge, without dependency on medical codes or additional data extraction of known risk factors to predict suicidal behaviour.

  15. Applying quality by design (QbD) concept for fabrication of chitosan coated nanoliposomes.

    PubMed

    Pandey, Abhijeet P; Karande, Kiran P; Sonawane, Raju O; Deshmukh, Prashant K

    2014-03-01

    In the present investigation, a quality by design (QbD) strategy was successfully applied to the fabrication of chitosan-coated nanoliposomes (CH-NLPs) encapsulating a hydrophilic drug. The effects of the processing variables on the particle size, encapsulation efficiency (%EE) and coating efficiency (%CE) of CH-NLPs (prepared using a modified ethanol injection method) were investigated. The concentrations of lipid, cholesterol, drug and chitosan; stirring speed, sonication time; organic:aqueous phase ratio; and temperature were identified as the key factors after risk analysis for conducting a screening design study. A separate study was designed to investigate the robustness of the predicted design space. The particle size, %EE and %CE of the optimized CH-NLPs were 111.3 nm, 33.4% and 35.2%, respectively. The observed responses were in accordance with the predicted response, which confirms the suitability and robustness of the design space for CH-NLP formulation. In conclusion, optimization of the selected key variables will help minimize the problems related to size, %EE and %CE that are generally encountered when scaling up processes for NLP formulations. The robustness of the design space will help minimize both intra-batch and inter-batch variations, which are quite common in the pharmaceutical industry.

  16. Multilingual Information Retrieval in Thoracic Radiology: Feasibility Study

    PubMed Central

    Castilla, André Coutinho; Furuie, Sérgio Shiguemi; Mendonça, Eneida A.

    2014-01-01

    Most of the essential information contained in the Electronic Medical Record is stored as text, imposing several difficulties on automated data extraction and retrieval. Natural language processing is an approach that can unlock clinical information from free text. The proposed methodology uses the specialized natural language processor MEDLEE, developed for the English language. To use this processor on Portuguese medical texts, chest x-ray reports were machine translated (MT) into English. The result of the serial coupling of MT and NLP is tagged text, which needs further investigation for extracting clinical findings. The objective of this experiment was to investigate normal reports and reports with device description in a set of 165 chest x-ray reports. We obtained sensitivity and specificity of 1 and 0.71 for the first condition and 0.97 and 0.97 for the second, respectively. The reference was formed by the opinion of two radiologists. The results of this experiment indicate the viability of extracting clinical findings from chest x-ray reports through coupling MT and NLP. PMID:17911745

  17. Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource

    PubMed Central

    Koike, Asako; Kobayashi, Yoshiyuki; Takagi, Toshihisa

    2003-01-01

    Protein kinases play a crucial role in the regulation of cellular functions. Various kinds of information about these molecules are important for understanding signaling pathways and organism characteristics. We have developed the Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes. It contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein–protein, protein–gene, and protein–compound interaction data, domain information, and structural information. It also provides an automatic pathway graphic image interface. The protein, gene, and compound interactions are automatically extracted from abstracts for all genes and proteins by natural-language processing (NLP). The method of automatic extraction uses phrase patterns and the GENA protein, gene, and compound name dictionary, which was developed by our group. With this database, pathways are easily compared among species using data with more than 47,000 protein interactions and protein kinase ortholog tables. The database is available for querying and browsing at http://kinasedb.ontology.ims.u-tokyo.ac.jp/. PMID:12799355

  18. TextHunter – A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research

    PubMed Central

    Jackson MSc, Richard G.; Ball, Michael; Patel, Rashmi; Hayes, Richard D.; Dobson, Richard J.B.; Stewart, Robert

    2014-01-01

    Observational research using data from electronic health records (EHR) is a rapidly growing area, which promises both increased sample size and data richness - therefore unprecedented study power. However, in many medical domains, large amounts of potentially valuable data are contained within the free text clinical narrative. Manually reviewing free text to obtain desired information is an inefficient use of researcher time and skill. Previous work has demonstrated the feasibility of applying Natural Language Processing (NLP) to extract information. However, in real world research environments, the demand for NLP skills outweighs supply, creating a bottleneck in the secondary exploitation of the EHR. To address this, we present TextHunter, a tool for the creation of training data, construction of concept extraction machine learning models and their application to documents. Using confidence thresholds to ensure high precision (>90%), we achieved recall measurements as high as 99% in real world use cases. PMID:25954379
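
    The idea of using a confidence threshold to hold precision above a target while measuring the resulting recall can be sketched as follows. This is a minimal illustration, not TextHunter's implementation: the function names, the 0.90 default target, and the scored examples are all hypothetical.

```python
def precision_recall_at(threshold, scored):
    """Precision and recall when only predictions with confidence >= threshold are kept.

    scored: list of (confidence, is_relevant) pairs (hypothetical data format).
    """
    kept = [label for conf, label in scored if conf >= threshold]
    total_relevant = sum(1 for _, label in scored if label)
    true_pos = sum(1 for label in kept if label)
    precision = true_pos / len(kept) if kept else 1.0
    recall = true_pos / total_relevant if total_relevant else 0.0
    return precision, recall


def lowest_threshold_for_precision(scored, target=0.90):
    """Return (threshold, precision, recall) for the smallest observed
    confidence value whose precision meets the target, or None."""
    for t in sorted({conf for conf, _ in scored}):
        p, r = precision_recall_at(t, scored)
        if p >= target:
            return t, p, r
    return None


# Hypothetical classifier outputs, for illustration only.
scored = [(0.95, True), (0.90, True), (0.85, True), (0.80, False),
          (0.70, True), (0.60, False), (0.40, False)]
result = lowest_threshold_for_precision(scored)
print(result)
```

    Choosing the lowest qualifying threshold keeps as many predictions as possible, which is what maximizes recall under the precision constraint in this toy setting.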

  19. Bengali-English Relevant Cross Lingual Information Access Using Finite Automata

    NASA Astrophysics Data System (ADS)

    Banerjee, Avishek; Bhattacharyya, Swapan; Hazra, Simanta; Mondal, Shatabdi

    2010-10-01

    CLIR techniques search unrestricted texts and typically extract terms and relationships from bilingual electronic dictionaries or bilingual text collections, using them to translate query and/or document representations into a compatible set of representations with a common feature set. In this paper, we focus on a dictionary-based approach that combines a bilingual data dictionary with statistics-based methods to avoid the problem of ambiguity; the paper also addresses the development of the human-computer interface aspects of NLP (Natural Language Processing). Intelligent web search in a regional language such as Bengali depends on two major aspects: CLIA (Cross Language Information Access) and NLP. In our previous work with IIT, KGP, we developed content-based CLIA, in which content-based searching is trained on Bengali corpora with the help of a Bengali data dictionary. Here we introduce intelligent search, which recognizes the sense of a sentence and offers a more realistic approach to human-computer interaction.

  20. Validation of Case Finding Algorithms for Hepatocellular Cancer from Administrative Data and Electronic Health Records using Natural Language Processing

    PubMed Central

    Sada, Yvonne; Hou, Jason; Richardson, Peter; El-Serag, Hashem; Davila, Jessica

    2013-01-01

    Background: Accurate identification of hepatocellular cancer (HCC) cases from automated data is needed for efficient and valid quality improvement initiatives and research. We validated HCC ICD-9 codes, and evaluated whether natural language processing (NLP) by the Automated Retrieval Console (ARC) for document classification improves HCC identification. Methods: We identified a cohort of patients with ICD-9 codes for HCC during 2005–2010 from Veterans Affairs administrative data. Pathology and radiology reports were reviewed to confirm HCC. The positive predictive value (PPV), sensitivity, and specificity of ICD-9 codes were calculated. A split validation study of pathology and radiology reports was performed to develop and validate ARC algorithms. Reports were manually classified as diagnostic of HCC or not. ARC generated document classification algorithms using the Clinical Text Analysis and Knowledge Extraction System. ARC performance was compared to manual classification. PPV, sensitivity, and specificity of ARC were calculated. Results: 1138 patients with HCC were identified by ICD-9 codes. Based on manual review, 773 had HCC. The HCC ICD-9 code algorithm had a PPV of 0.67, sensitivity of 0.95, and specificity of 0.93. For a random subset of 619 patients, we identified 471 pathology reports for 323 patients and 943 radiology reports for 557 patients. The pathology ARC algorithm had PPV of 0.96, sensitivity of 0.96, and specificity of 0.97. The radiology ARC algorithm had PPV of 0.75, sensitivity of 0.94, and specificity of 0.68. Conclusion: A combined approach of ICD-9 codes and NLP of pathology and radiology reports improves HCC case identification in automated data. PMID:23929403
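
    The PPV, sensitivity, and specificity figures in this record are ratios of confusion-matrix counts. The sketch below shows the standard definitions. Only the two counts 1138 and 773 come from the abstract; the full confusion matrix is not given there, so the counts passed to the function are hypothetical and the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (ppv, sensitivity, specificity) from confusion-matrix counts."""
    ppv = tp / (tp + fp)            # share of flagged cases that are true cases
    sensitivity = tp / (tp + fn)    # share of true cases that were flagged
    specificity = tn / (tn + fp)    # share of non-cases that were not flagged
    return ppv, sensitivity, specificity


# Counts stated in the abstract: 1138 patients flagged by ICD-9 codes,
# 773 of them confirmed on manual review.
flagged, confirmed = 1138, 773
ppv_from_abstract = confirmed / flagged
print(round(ppv_from_abstract, 3))

# Hypothetical counts, for illustration only (not from the study).
ppv, sens, spec = diagnostic_metrics(tp=90, fp=10, fn=5, tn=95)
print(ppv, sens, spec)
```

    The ratio 773/1138 works out to roughly 0.68, close to the PPV of 0.67 reported for the ICD-9 code algorithm.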

  1. Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies

    PubMed Central

    Fan, Jung-Wei; Friedman, Carol

    2011-01-01

    Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe once the performance issues are addressed, it could serve as an aide in a semi-supervised solution. PMID:21549857
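
    The abstract reports recall and precision separately. As a small worked example, the two can be combined into a balanced F-measure (the harmonic mean); the resulting value is derived here and is not a figure reported by the authors. The function name and the `beta` parameter are illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as stated in the abstract.
f1 = f_measure(precision=0.737, recall=0.644)
print(round(f1, 3))
```

    With these inputs the balanced F-measure is approximately 0.687.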

  2. Our personal space.

    PubMed

    Suthers, M

    2000-10-01

    Neuro Linguistic Programming (NLP) as a model of human behaviour is presented. Its basic tenets and the factors that give rise to the physiological and emotional response to an external event are described. A number of psychotherapeutic interventions are also described, along with the influence of NLP on sporting and academic success. Finally, an exploration of these ideas for the purpose of contributing to personal well-being is given.

  3. Enhancing the photoelectrochemical response of TiO2 nanotubes through their nanodecoration by pulsed-laser-deposited Ag nanoparticles

    NASA Astrophysics Data System (ADS)

    Trabelsi, K.; Hajjaji, A.; Gaidi, M.; Bessais, B.; El Khakani, M. A.

    2017-08-01

    We report on the pulsed laser deposition (PLD) based nanodecoration of titanium dioxide (TiO2) nanotube arrays (NTAs) by Ag nanoparticles (NPs). We focus here on the investigation of the effect of the number of laser ablation pulses (NLP) of the silver target on both the average size of the Ag-NPs and the photoelectrochemical conversion efficiency of the Ag-NP decorated TiO2-NT based photoanodes. By varying the NLP, we were able to not only control the size of the PLD-deposited Ag nanoparticles from 20 to ~50 nm, but also to increase concomitantly the surface coverage of the TiO2 NTAs by Ag-NPs. The red-shifting of the surface plasmon resonance peak of the PLD-deposited Ag-NPs deposited onto quartz substrates confirmed the increase of their size as the NLP is increased from 500 to 10 000. By investigating the photo-electrochemical properties of Ag-NP decorated TiO2-NTAs, by means of linear sweep cyclic voltammetry under UV-Vis illumination, we found that the generated photocurrent is sensitive to the size of the Ag-NPs and reaches a maximum value at NLP = 500 (i.e., Ag-NP size of ~20 nm). For NLP = 500, the photoconversion efficiency of the Ag-NP decorated TiO2-NTAs is shown to reach a maximum of 4.5% (at 0.5 V vs Ag/AgCl). The photocurrent enhancement of Ag-NP decorated TiO2-NTAs is believed to result from the additional light harvesting enabled by the ability of Ag-NPs to absorb visible irradiation caused by various localized surface plasmon resonances, which in turn depend on the size and interdistance of the Ag nanoparticles.

  4. The nucleoplasmin homolog NLP mediates centromere clustering and anchoring to the nucleolus.

    PubMed

    Padeken, Jan; Mendiburo, María José; Chlamydas, Sarantis; Schwarz, Hans-Jürgen; Kremmer, Elisabeth; Heun, Patrick

    2013-04-25

    Centromere clustering during interphase is a phenomenon known to occur in many different organisms and cell types, yet neither the factors involved nor their physiological relevance is well understood. Using Drosophila tissue culture cells and flies, we identified a network of proteins, including the nucleoplasmin-like protein (NLP), the insulator protein CTCF, and the nucleolus protein Modulo, to be essential for the positioning of centromeres. Artificial targeting further demonstrated that NLP and CTCF are sufficient for clustering, while Modulo serves as the anchor to the nucleolus. Centromere clustering was found to depend on centric chromatin rather than specific DNA sequences. Moreover, unclustering of centromeres results in the spatial destabilization of pericentric heterochromatin organization, leading to partial defects in the silencing of repetitive elements, defects during chromosome segregation, and genome instability. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Controlling the vocabulary for anatomy.

    PubMed Central

    Baud, R. H.; Lovis, C.; Rassinoux, A. M.; Ruch, P.; Geissbuhler, A.

    2002-01-01

    When confronted with the representation of human anatomy, natural language processing (NLP) system designers are facing an unsolved and frequent problem: the lack of a suitable global reference. The available sources in electronic format are numerous, but none fits adequately all the constraints and needs of language analysis. These sources are usually incomplete, difficult to use or tailored to specific needs. The anatomist's or ontologist's view does not necessarily match that of the linguist. The purpose of this paper is to review most recognized sources of knowledge in anatomy usable for linguistic analysis. Their potential and limits are emphasized according to this point of view. Focus is given on the role of the consensus work of the International Federation of Associations of Anatomists (IFAA) giving the Terminologia Anatomica. PMID:12463780

  6. Automated concept-level information extraction to reduce the need for custom software and rules development.

    PubMed

    D'Avolio, Leonard W; Nguyen, Thien M; Goryachev, Sergey; Fiore, Louis D

    2011-01-01

    Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval. A 'learn by example' approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance. Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks. Discussion With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach for more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation. Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation is available for download.

  7. Optimization-Based Selection of Influential Agents in a Rural Afghan Social Network

    DTIC Science & Technology

    2010-06-01

    nonlethal targeting model, a nonlinear programming ( NLP ) optimization formulation that identifies the k US agent assignment strategy producing the greatest...leader social network, and 3) the nonlethal targeting model, a nonlinear programming ( NLP ) optimization formulation that identifies the k US agent...NATO Coalition in Afghanistan. 55 for Afghanistan ( [54], [31], [48], [55], [30]). While Arab tribes tend to be more hierarchical, Pashtun tribes are

  8. Using electronic medical records to enable large-scale studies in psychiatry: treatment resistant depression as a model

    PubMed Central

    Perlis, R. H.; Iosifescu, D. V.; Castro, V. M.; Murphy, S. N.; Gainer, V. S.; Minnier, J.; Cai, T.; Goryachev, S.; Zeng, Q.; Gallagher, P. J.; Fava, M.; Weilburg, J. B.; Churchill, S. E.; Kohane, I. S.; Smoller, J. W.

    2013-01-01

    Background Electronic medical records (EMR) provide a unique opportunity for efficient, large-scale clinical investigation in psychiatry. However, such studies will require development of tools to define treatment outcome. Method Natural language processing (NLP) was applied to classify notes from 127 504 patients with a billing diagnosis of major depressive disorder, drawn from out-patient psychiatry practices affiliated with multiple, large New England hospitals. Classifications were compared with results using billing data (ICD-9 codes) alone and to a clinical gold standard based on chart review by a panel of senior clinicians. These cross-sectional classifications were then used to define longitudinal treatment outcomes, which were compared with a clinician-rated gold standard. Results Models incorporating NLP were superior to those relying on billing data alone for classifying current mood state (area under receiver operating characteristic curve of 0.85–0.88 v. 0.54–0.55). When these cross-sectional visits were integrated to define longitudinal outcomes and incorporate treatment data, 15% of the cohort remitted with a single antidepressant treatment, while 13% were identified as failing to remit despite at least two antidepressant trials. Non-remitting patients were more likely to be non-Caucasian (p<0.001). Conclusions The application of bioinformatics tools such as NLP should enable accurate and efficient determination of longitudinal outcomes, enabling existing EMR data to be applied to clinical research, including biomarker investigations. Continued development will be required to better address moderators of outcome such as adherence and co-morbidity. PMID:21682950

  9. Creation of a simple natural language processing tool to support an imaging utilization quality dashboard.

    PubMed

    Swartz, Jordan; Koziatek, Christian; Theobald, Jason; Smith, Silas; Iturrate, Eduardo

    2017-05-01

    Testing for venous thromboembolism (VTE) is associated with cost and risk to patients (e.g. radiation). To assess the appropriateness of imaging utilization at the provider level, it is important to know that provider's diagnostic yield (percentage of tests positive for the diagnostic entity of interest). However, determining diagnostic yield typically requires either time-consuming, manual review of radiology reports or the use of complex and/or proprietary natural language processing software. The objectives of this study were twofold: 1) to develop and implement a simple, user-configurable, and open-source natural language processing tool to classify radiology reports with high accuracy and 2) to use the results of the tool to design a provider-specific VTE imaging dashboard, consisting of both utilization rate and diagnostic yield. Two physicians reviewed a training set of 400 lower extremity ultrasound (UTZ) and computed tomography pulmonary angiogram (CTPA) reports to understand the language used in VTE-positive and VTE-negative reports. The insights from this review informed the arguments to the five modifiable parameters of the NLP tool. A validation set of 2,000 studies was then independently classified by the reviewers and by the tool; the classifications were compared and the performance of the tool was calculated. The tool was highly accurate in classifying the presence and absence of VTE for both the UTZ (sensitivity 95.7%; 95% CI 91.5-99.8, specificity 100%; 95% CI 100-100) and CTPA reports (sensitivity 97.1%; 95% CI 94.3-99.9, specificity 98.6%; 95% CI 97.8-99.4). The diagnostic yield was then calculated at the individual provider level and the imaging dashboard was created. We have created a novel NLP tool designed for users without a background in computer programming, which has been used to classify venous thromboembolism reports with a high degree of accuracy. The tool is open-source and available for download at http://iturrate.com/simpleNLP. Results obtained using this tool can be applied to enhance quality by presenting information about utilization and yield to providers via an imaging dashboard. Copyright © 2017 Elsevier B.V. All rights reserved.
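
    The quantities this abstract reports (sensitivity, specificity, and per-provider diagnostic yield) can be computed from simple counts, as in the minimal sketch below. It is not the authors' tool: the class name, function name, provider labels, and all counts are hypothetical and chosen only for illustration, with the reviewers' labels treated as the reference standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of tool output versus reviewer labels (reviewer = reference)."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def sensitivity(self) -> float:
        # Share of reviewer-positive reports the tool also called positive.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of reviewer-negative reports the tool also called negative.
        return self.tn / (self.tn + self.fp)


def diagnostic_yield(results: list[bool]) -> float:
    """Fraction of a provider's studies classified positive."""
    return sum(results) / len(results) if results else 0.0


# Hypothetical counts, not taken from the study.
cm = Confusion(tp=45, fp=2, fn=5, tn=148)
print(f"sensitivity={cm.sensitivity:.3f} specificity={cm.specificity:.3f}")

# Hypothetical per-provider classifications (True = VTE-positive report).
by_provider = {
    "provider_a": [True, False, False, False],
    "provider_b": [False] * 5,
}
for name, results in by_provider.items():
    print(name, f"yield={diagnostic_yield(results):.2f}")
```

    A dashboard of the kind described would pair each provider's yield with a utilization rate, which needs a denominator (for example, patient encounters) that the abstract does not specify.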

  10. Applying Natural Language Processing to Understand Motivational Profiles for Maintaining Physical Activity After a Mobile App and Accelerometer-Based Intervention: The mPED Randomized Controlled Trial.

    PubMed

    Fukuoka, Yoshimi; Lindgren, Teri G; Mintz, Yonatan Dov; Hooper, Julie; Aswani, Anil

    2018-06-20

    Regular physical activity is associated with reduced risk of chronic illnesses. Despite various types of successful physical activity interventions, maintenance of activity over the long term is extremely challenging. The aims of this original paper are to 1) describe physical activity engagement post intervention, 2) identify motivational profiles using natural language processing (NLP) and clustering techniques in a sample of women who completed the physical activity intervention, and 3) compare sociodemographic and clinical data among these identified cluster groups. This cross-sectional analysis examined 203 women who completed a 12-month study exit (telephone) interview in the mobile phone-based physical activity education study. The mobile phone-based physical activity education study was a randomized, controlled trial to test the efficacy of the app and accelerometer intervention and its sustainability over a 9-month period. All subjects returned the accelerometer and stopped accessing the app at the last 9-month research office visit. Physical engagement and motivational profiles were assessed by both closed and open-ended questions, such as "Since your 9-month study visit, has your physical activity been more, less, or about the same (compared to the first 9 months of the study)?" and, "What motivates you the most to be physically active?" NLP and cluster analysis were used to classify motivational profiles. Descriptive statistics were used to compare participants' baseline characteristics among identified groups. Approximately half of the 2 intervention groups (Regular and Plus) reported that they were still wearing an accelerometer and engaging in brisk walking as they were directed during the intervention phases. These numbers in the 2 intervention groups were much higher than in the control group (overall P=.01 and P=.003, respectively). Three clusters were identified through NLP and named as the Weight Loss group (n=19), the Illness Prevention group (n=138), and the Health Promotion group (n=46). The Weight Loss group was significantly younger than the Illness Prevention and Health Promotion groups (overall P<.001). The Illness Prevention group had a larger number of Caucasians as compared to the Weight Loss group (P=.001), which was composed mostly of those who identified as African American, Hispanic, or mixed race. Additionally, the Health Promotion group tended to have lower BMI scores compared to the Illness Prevention group (overall P=.02). However, no difference was noted in the baseline moderate-to-vigorous intensity activity level among the 3 groups (overall P>.05). The findings could be relevant to tailoring a physical activity maintenance intervention. Furthermore, NLP and cluster analysis are useful methods for analyzing short free text to differentiate motivational profiles. As more sophisticated NLP tools are developed in the future, the potential of NLP application in behavioral research will broaden. ClinicalTrials.gov NCT01280812; https://clinicaltrials.gov/ct2/show/NCT01280812 (Archived by WebCite at http://www.webcitation.org/70IkGagAJ). ©Yoshimi Fukuoka, Teri G Lindgren, Yonatan Dov Mintz, Julie Hooper, Anil Aswani. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 20.06.2018.

  11. An Intelligent Decision Support System for Workforce Forecast

    DTIC Science & Technology

    2011-01-01

    ARIMA) model to forecast the demand for construction skills in Hong Kong. This model was based...Decision Trees ARIMA Rule Based Forecasting Segmentation Forecasting Regression Analysis Simulation Modeling Input-Output Models LP and NLP Markovian...data • When results are needed as a set of easily interpretable rules 4.1.4 ARIMA Auto-regressive, integrated, moving-average (ARIMA) models

  12. Using Machine Learning and Natural Language Processing Algorithms to Automate the Evaluation of Clinical Decision Support in Electronic Medical Record Systems.

    PubMed

    Szlosek, Donald A; Ferrett, Jonathan

    2016-01-01

    As the number of clinical decision support systems (CDSSs) incorporated into electronic medical records (EMRs) increases, so does the need to evaluate their effectiveness. The use of medical record review and similar manual methods for evaluating decision rules is laborious and inefficient. The authors use machine learning and Natural Language Processing (NLP) algorithms to accurately evaluate a clinical decision support rule through an EMR system, and they compare it against manual evaluation. Modeled after the EMR system EPIC at Maine Medical Center, we developed a dummy data set containing physician notes in free text for 3,621 artificial patient records undergoing a head computed tomography (CT) scan for mild traumatic brain injury after the incorporation of an electronic best practice approach. We validated the accuracy of the Best Practice Advisories (BPA) using three machine learning algorithms: C-Support Vector Classification (SVC), Decision Tree Classifier (DecisionTreeClassifier), and k-nearest neighbors classifier (KNeighborsClassifier). Their accuracy for adjudicating the occurrence of a mild traumatic brain injury was compared against manual review. We then used the best of the three algorithms to evaluate the effectiveness of the BPA, and we compared the algorithm's evaluation of the BPA to that of manual review. The electronic best practice approach was found to have a sensitivity of 98.8 percent (96.83-100.0), specificity of 10.3 percent, PPV = 7.3 percent, and NPV = 99.2 percent when reviewed manually by abstractors. Though all the machine learning algorithms were observed to have a high level of prediction, the SVC displayed the highest with a sensitivity 93.33 percent (92.49-98.84), specificity of 97.62 percent (96.53-98.38), PPV = 50.00, NPV = 99.83. The SVC algorithm was observed to have a sensitivity of 97.9 percent (94.7-99.86), specificity 10.30 percent, PPV 7.25 percent, and NPV 99.2 percent for evaluating the best practice approach, after accounting for 17 cases (0.66 percent) where the patient records had to be reviewed manually due to the NLP system's inability to capture the proper diagnosis. CDSSs incorporated into EMRs can be evaluated in an automatic fashion by using NLP and machine learning techniques.
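
    The combination of very high sensitivity, low specificity, low PPV, and high NPV reported for the best practice approach is what Bayes' rule predicts when the condition is uncommon. The sketch below shows the standard calculation. The prevalence of 6.7 percent is an assumption picked for illustration so that the output lands near the reported manual-review values; it is not a figure stated in the abstract, and the function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    tp = sensitivity * prevalence                  # true positive mass
    fn = (1.0 - sensitivity) * prevalence          # false negative mass
    tn = specificity * (1.0 - prevalence)          # true negative mass
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positive mass
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity are the reported manual-review values;
# the prevalence of 0.067 is an illustrative assumption.
ppv, npv = predictive_values(sensitivity=0.988, specificity=0.103,
                             prevalence=0.067)
print(f"PPV={ppv:.1%} NPV={npv:.1%}")
```

    With a specificity near 10 percent, most alerts fire on patients without the condition, so PPV stays close to the prevalence itself even though very few true cases are missed.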

  13. Transfer Learning for Adaptive Relation Extraction

    DTIC Science & Technology

    2011-09-13

    other NLP tasks, however, supervised learning approach fails when there is not a sufficient amount of labeled data for training, which is often the case...always 12 Syntactic Pattern Relation Instance Relation Type (Subtype) arg-2 arg-1 Arab leaders OTHER-AFF (Ethnic) his father PER-SOC (Family) South...for x. For sequence labeling tasks in NLP, linear-chain conditional random field has been rather successful. It is an undirected graphical model in

  14. An exploratory study of neuro linguistic programming and communication anxiety

    NASA Astrophysics Data System (ADS)

    Brunner, Lois M.

    1993-12-01

    This thesis is an exploratory study of Neuro-Linguistic Programming (NLP), and its capabilities to provide a technique or a composite technique that will reduce the anxiety associated with making an oral brief or presentation before a group, sometimes referred to as Communication Apprehension. The composite technique comes from NLP and Time Line Therapy, which is an extension to NLP. Student volunteers (17) from a Communications course given by the Administrative Sciences Department were taught this technique. For each volunteer, an informational oral presentation was made and videotaped before the training and another informational oral presentation made and videotaped following the training. The before and after training presentations for each individual volunteer were evaluated against criteria for communications anxiety and analyzed to determine if there was a noticeable reduction of anxiety after the training. Anxiety was reduced in all of the volunteers in this study.

  15. Global optimization algorithm for heat exchanger networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Quesada, I.; Grossmann, I.E.

    This paper deals with the global optimization of heat exchanger networks with fixed topology. It is shown that if linear area cost functions are assumed, as well as arithmetic mean driving force temperature differences in networks with isothermal mixing, the corresponding nonlinear programming (NLP) optimization problem involves linear constraints and a sum of linear fractional functions in the objective which are nonconvex. A rigorous algorithm is proposed that is based on a convex NLP underestimator that involves linear and nonlinear estimators for fractional and bilinear terms which provide a tight lower bound to the global optimum. This NLP problem is used within a spatial branch and bound method for which branching rules are given. Basic properties of the proposed method are presented, and its application is illustrated with several example problems. The results show that the proposed method only requires few nodes in the branch and bound search.
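
    The idea of a convex underestimator for a bilinear term can be illustrated with the McCormick envelope, a standard linear relaxation of w = x*y over a box. It is used here as a stand-in and is not necessarily the estimator proposed in the paper; the function name and the bounds are illustrative assumptions.

```python
def mccormick_lower(x: float, y: float,
                    x_lo: float, x_hi: float,
                    y_lo: float, y_hi: float) -> float:
    """Lower bound on x*y that is convex (piecewise linear) in (x, y).

    Both pieces follow from expanding a nonnegative product:
      (x - x_lo) * (y - y_lo) >= 0  and  (x_hi - x) * (y_hi - y) >= 0.
    """
    return max(x_lo * y + x * y_lo - x_lo * y_lo,
               x_hi * y + x * y_hi - x_hi * y_hi)


def gap_at_center(x_lo: float, x_hi: float, y_lo: float, y_hi: float) -> float:
    """Distance between x*y and its underestimator at the box midpoint."""
    x, y = (x_lo + x_hi) / 2.0, (y_lo + y_hi) / 2.0
    return x * y - mccormick_lower(x, y, x_lo, x_hi, y_lo, y_hi)


# Illustrative box; splitting it along x shrinks the relaxation gap.
print(gap_at_center(0.0, 2.0, 1.0, 3.0))  # full box
print(gap_at_center(0.0, 1.0, 1.0, 3.0))  # left half after branching on x
```

    The shrinking gap on the smaller box is the reason a spatial branch and bound search converges: each branching step tightens the lower bound supplied by the relaxation.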

  16. Angular momentum projection for a Nilsson mean-field plus pairing model

    NASA Astrophysics Data System (ADS)

    Wang, Yin; Pan, Feng; Launey, Kristina D.; Luo, Yan-An; Draayer, J. P.

    2016-06-01

    The angular momentum projection for the axially deformed Nilsson mean-field plus a modified standard pairing (MSP) or the nearest-level pairing (NLP) model is proposed. Both the exact projection, in which all intrinsic states are taken into consideration, and the approximate projection, in which only intrinsic states with K = 0 are taken in the projection, are considered. The analysis shows that the approximate projection with only K = 0 intrinsic states seems reasonable, of which the configuration subspace considered is greatly reduced. As simple examples for the model application, low-lying spectra and electromagnetic properties of 18O and 18Ne are described by using both the exact and approximate angular momentum projection of the MSP or the NLP, while those of 20Ne and 24Mg are described by using the approximate angular momentum projection of the MSP or NLP.

  17. Double Parton Fragmentation Function and its Evolution in Quarkonium Production

    NASA Astrophysics Data System (ADS)

    Kang, Zhong-Bo

    2014-01-01

    We summarize the results of a recent study on a new perturbative QCD factorization formalism for the production of heavy quarkonia of large transverse momentum p_T at collider energies. This new factorization formalism includes both the leading power (LP) and next-to-leading power (NLP) contributions to the cross section in the m_Q^2/p_T^2 expansion for heavy quark mass m_Q. For the NLP contribution, the so-called double parton fragmentation functions are involved, whose evolution equations have been derived. We estimate fragmentation functions in the non-relativistic QCD formalism, and find that their contributions reproduce the bulk of the large enhancement found in explicit NLO calculations in the color singlet model. Heavy quarkonia produced from NLP channels prefer longitudinal polarization, in contrast to the single parton fragmentation function. This might shed some light on the heavy quarkonium polarization puzzle.

  18. Automatic detection of protected health information from clinic narratives.

    PubMed

    Yang, Hui; Garibaldi, Jonathan M

    2015-12-01

    This paper presents a natural language processing (NLP) system that was designed to participate in the 2014 i2b2 de-identification challenge. The challenge task aims to identify and classify seven main Protected Health Information (PHI) categories and 25 associated sub-categories. A hybrid model was proposed which combines machine learning techniques with keyword-based and rule-based approaches to deal with the complexity inherent in PHI categories. Our proposed approaches exploit a rich set of linguistic features, both syntactic and word surface-oriented, which are further enriched by task-specific features and regular expression template patterns to characterize the semantics of various PHI categories. Our system achieved promising accuracy on the challenge test data with an overall micro-averaged F-measure of 93.6%, the top result in this de-identification challenge. Copyright © 2015 Elsevier Inc. All rights reserved.

  19. An Investigation of the "e-rater"® Automated Scoring Engine's Grammar, Usage, Mechanics, and Style Microfeatures and Their Aggregation Model. Research Report. ETS RR-17-04

    ERIC Educational Resources Information Center

    Chen, Jing; Zhang, Mo; Bejar, Isaac I.

    2017-01-01

    Automated essay scoring (AES) generally computes essay scores as a function of macrofeatures derived from a set of microfeatures extracted from the text using natural language processing (NLP). In the "e-rater"® automated scoring engine, developed at "Educational Testing Service" (ETS) for the automated scoring of essays, each…

  20. Acquiring Information from Wider Scope to Improve Event Extraction

    DTIC Science & Technology

    2012-05-01

    solve all the problems might be hard or even impossible: Word sense disambiguation is already a hard NLP task, and normalizing different expressions...blindfolded woman seen being shot in the head by a hooded militant on a video obtained but not aired by the Arab television station Al-Jazeera. She...imbalance Why are we interested in unsupervised topic features? There is a problem that arises in the evaluation of almost all the tasks in NLP, concerning
