Sample records for a knowledge extraction system

  1. Knowledge-Driven Event Extraction in Russian: Corpus-Based Linguistic Resources

    PubMed Central

    Solovyev, Valery; Ivanov, Vladimir

    2016-01-01

    Automatic event extraction from text is an important step in knowledge acquisition and knowledge base population. Manual work is indispensable in developing an extraction system, whether for corpus annotation or for creating the vocabularies and patterns of a knowledge-based system. Recent work has focused on adapting existing systems (for extraction from English texts) to new domains. Event extraction in other languages has received little study due to the lack of resources and algorithms necessary for natural language processing. In this paper we define a set of linguistic resources that are necessary for developing a knowledge-based event extraction system in Russian: a vocabulary of subordination models, a vocabulary of event triggers, and a vocabulary of Frame Elements that serve as basic building blocks for semantic patterns. We propose a set of methods for creating such vocabularies in Russian and other languages using the Google Books NGram Corpus. The methods are evaluated in the development of an event extraction system for Russian. PMID:26955386
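
    As a rough illustration of the vocabulary-building step this record describes, the sketch below scores candidate event triggers by how often they precede known event nouns in bigram frequency counts. The record format, the sample data, the event nouns, and the frequency threshold are all hypothetical stand-ins for the Google Books NGram Corpus processing used in the paper.

```python
# Sketch: selecting candidate event triggers by how often they precede known
# event nouns in bigram frequency counts. The "word1 word2<TAB>count" record
# format, the sample data, and the threshold are hypothetical; the paper
# derives its vocabularies from the Google Books NGram Corpus.
from collections import Counter

def candidate_triggers(bigram_records, event_nouns, min_count=10):
    """bigram_records: iterable of 'word1 word2\tcount' strings."""
    scores = Counter()
    for record in bigram_records:
        ngram, count = record.rstrip("\n").split("\t")
        w1, w2 = ngram.split(" ")
        if w2.lower() in event_nouns and int(count) >= min_count:
            scores[w1.lower()] += int(count)
    return scores.most_common()

sample = [
    "объявить война\t120",    # "declare war"
    "начать переговоры\t85",  # "start negotiations"
    "вести переговоры\t230",  # "conduct negotiations"
    "большой дом\t500",       # not an event context
]
print(candidate_triggers(sample, {"война", "переговоры"}))
```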

  2. Background Knowledge in Learning-Based Relation Extraction

    ERIC Educational Resources Information Center

    Do, Quang Xuan

    2012-01-01

    In this thesis, we study the importance of background knowledge in relation extraction systems. We not only demonstrate the benefits of leveraging background knowledge to improve the systems' performance but also propose a principled framework that allows one to effectively incorporate knowledge into statistical machine learning models for…

  3. FEX: A Knowledge-Based System For Planimetric Feature Extraction

    NASA Astrophysics Data System (ADS)

    Zelek, John S.

    1988-10-01

    Topographical planimetric features include natural surfaces (rivers, lakes) and man-made surfaces (roads, railways, bridges). In conventional planimetric feature extraction, a photointerpreter manually interprets and extracts features from imagery on a stereoplotter. Visual planimetric feature extraction is a very labour intensive operation. The advantages of automating feature extraction include: time and labour savings; accuracy improvements; and planimetric data consistency. FEX (Feature EXtraction) combines techniques from image processing, remote sensing and artificial intelligence for automatic feature extraction. The feature extraction process co-ordinates the information and knowledge in a hierarchical data structure. The system simulates the reasoning of a photointerpreter in determining the planimetric features. Present efforts have concentrated on the extraction of road-like features in SPOT imagery. Keywords: Remote Sensing, Artificial Intelligence (AI), SPOT, image understanding, knowledge base, apars.

  4. Using decision-tree classifier systems to extract knowledge from databases

    NASA Technical Reports Server (NTRS)

    St.clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.

    1990-01-01

    One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.
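
    The record above describes extracting knowledge from databases with decision-tree classifiers. Below is a minimal sketch of that idea using scikit-learn on an invented fault-diagnosis table: each root-to-leaf path of the fitted tree reads as an IF-THEN rule. The concept strength metric (CSM) mentioned in the abstract is not reproduced here.

```python
# Sketch: extracting human-readable IF-THEN rules from a decision tree trained
# on database records. The toy fault-diagnosis table below is invented, and
# the paper's concept strength metric is not implemented.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

records = pd.DataFrame({
    "temperature": [70, 95, 72, 99, 68, 97],
    "vibration":   [0.2, 0.9, 0.3, 0.8, 0.1, 0.7],
    "fault":       ["none", "bearing", "none", "bearing", "none", "bearing"],
})

X, y = records[["temperature", "vibration"]], records["fault"]
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Each root-to-leaf path of the tree is one extracted piece of knowledge.
print(export_text(tree, feature_names=["temperature", "vibration"]))
```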

  5. An information extraction framework for cohort identification using electronic health records.

    PubMed

    Liu, Hongfang; Bielinski, Suzette J; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B; Jonnalagadda, Siddhartha R; Ravikumar, K E; Wu, Stephen T; Kullo, Iftikhar J; Chute, Christopher G

    2013-01-01

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework.

  6. Knowledge extraction from evolving spiking neural networks with rank order population coding.

    PubMed

    Soltic, Snjezana; Kasabov, Nikola

    2010-12-01

    This paper demonstrates how knowledge can be extracted from evolving spiking neural networks with rank order population coding. Knowledge discovery is a very important feature of intelligent systems. Yet, a disproportionately small amount of research is centered on the issue of knowledge extraction from spiking neural networks, which are considered to be the third generation of artificial neural networks. The lack of knowledge representation compatibility is becoming a major detriment to end users of these networks. We show that high-level knowledge can be obtained from evolving spiking neural networks. More specifically, we propose a method for fuzzy rule extraction from an evolving spiking network with rank order population coding. The proposed method was used for knowledge discovery on two benchmark taste recognition problems, where the knowledge learnt by an evolving spiking neural network was extracted in the form of zero-order Takagi-Sugeno fuzzy IF-THEN rules.
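
    Zero-order Takagi-Sugeno rules of the kind mentioned above have constant consequents, and inference is a firing-strength-weighted average of those constants. The sketch below evaluates two such rules with Gaussian membership functions; the rule parameters and inputs are illustrative, not values extracted from any spiking network.

```python
# Sketch: evaluating zero-order Takagi-Sugeno IF-THEN rules of the kind that
# could be extracted from a trained network. Membership functions, rule
# parameters, and the example input are illustrative, not the paper's values.
import numpy as np

def gauss(x, centre, sigma):
    return np.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

# Each rule: IF x1 is A1 AND x2 is A2 THEN y = constant (zero-order consequent)
rules = [
    {"centres": [0.2, 0.8], "sigmas": [0.1, 0.1], "consequent": 1.0},  # class "sweet"
    {"centres": [0.7, 0.3], "sigmas": [0.1, 0.1], "consequent": 0.0},  # class "bitter"
]

def infer(x):
    strengths = np.array([
        np.prod([gauss(xi, c, s) for xi, c, s in zip(x, r["centres"], r["sigmas"])])
        for r in rules
    ])
    # Weighted average of constant consequents (standard zero-order TS inference).
    return float(np.dot(strengths, [r["consequent"] for r in rules]) / strengths.sum())

print(infer([0.25, 0.75]))   # close to 1.0 -> the first rule dominates
```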

  7. Using Best Practices to Extract, Organize, and Reuse Embedded Decision Support Content Knowledge Rules from Mature Clinical Systems.

    PubMed

    DesAutels, Spencer J; Fox, Zachary E; Giuse, Dario A; Williams, Annette M; Kou, Qing-Hua; Weitkamp, Asli; Patel, Neal R; Bettinsoli Giuse, Nunzia

    2016-01-01

    Clinical decision support (CDS) knowledge, embedded over time in mature medical systems, presents an interesting and complex opportunity for information organization, maintenance, and reuse. To have a holistic view of all decision support requires an in-depth understanding of each clinical system as well as expert knowledge of the latest evidence. This approach to clinical decision support presents an opportunity to unify and externalize the knowledge within rules-based decision support. Driven by an institutional need to prioritize decision support content for migration to new clinical systems, the Center for Knowledge Management and Health Information Technology teams applied their unique expertise to extract content from individual systems, organize it through a single extensible schema, and present it for discovery and reuse through a newly created Clinical Support Knowledge Acquisition and Archival Tool (CS-KAAT). CS-KAAT can build and maintain the underlying knowledge infrastructure needed by clinical systems.

  8. An Information Extraction Framework for Cohort Identification Using Electronic Health Records

    PubMed Central

    Liu, Hongfang; Bielinski, Suzette J.; Sohn, Sunghwan; Murphy, Sean; Wagholikar, Kavishwar B.; Jonnalagadda, Siddhartha R.; Ravikumar, K.E.; Wu, Stephen T.; Kullo, Iftikhar J.; Chute, Christopher G

    Information extraction (IE), a natural language processing (NLP) task that automatically extracts structured or semi-structured information from free text, has become popular in the clinical domain for supporting automated systems at point-of-care and enabling secondary use of electronic health records (EHRs) for clinical and translational research. However, a high performance IE system can be very challenging to construct due to the complexity and dynamic nature of human language. In this paper, we report an IE framework for cohort identification using EHRs that is a knowledge-driven framework developed under the Unstructured Information Management Architecture (UIMA). A system to extract specific information can be developed by subject matter experts through expert knowledge engineering of the externalized knowledge resources used in the framework. PMID:24303255

  9. KAM (Knowledge Acquisition Module): A tool to simplify the knowledge acquisition process

    NASA Technical Reports Server (NTRS)

    Gettig, Gary A.

    1988-01-01

    Analysts, knowledge engineers and information specialists are faced with increasing volumes of time-sensitive data in text form, either as free text or highly structured text records. Rapid access to the relevant data in these sources is essential. However, due to the volume and organization of the contents, and limitations of human memory and association, frequently: (1) important information is not located in time; (2) reams of irrelevant data are searched; and (3) interesting or critical associations are missed due to physical or temporal gaps involved in working with large files. The Knowledge Acquisition Module (KAM) is a microcomputer-based expert system designed to assist knowledge engineers, analysts, and other specialists in extracting useful knowledge from large volumes of digitized text and text-based files. KAM formulates non-explicit, ambiguous, or vague relations, rules, and facts into a manageable and consistent formal code. A library of system rules or heuristics is maintained to control the extraction of rules, relations, assertions, and other patterns from the text. These heuristics can be added, deleted or customized by the user. The user can further control the extraction process with optional topic specifications. This allows the user to cluster extracts based on specific topics. Because KAM formalizes diverse knowledge, it can be used by a variety of expert systems and automated reasoning applications. KAM can also perform important roles in computer-assisted training and skill development. Current research efforts include the applicability of neural networks to aid in the extraction process and the conversion of these extracts into standard formats.

  10. Automated extraction of knowledge for model-based diagnostics

    NASA Technical Reports Server (NTRS)

    Gonzalez, Avelino J.; Myler, Harley R.; Towhidnejad, Massood; Mckenzie, Frederic D.; Kladke, Robin R.

    1990-01-01

    The concept of accessing computer aided design (CAD) databases and extracting a process model automatically is investigated as a possible source for the generation of knowledge bases for model-based reasoning systems. The resulting system, referred to as automated knowledge generation (AKG), uses an object-oriented programming structure and constraint techniques as well as an internal database of component descriptions to generate a frame-based structure that describes the model. The procedure has been designed to be general enough to be easily coupled to CAD systems that feature a database capable of providing label and connectivity data from the drawn system. The AKG system is capable of defining knowledge bases in formats required by various model-based reasoning tools.

  11. Using Best Practices to Extract, Organize, and Reuse Embedded Decision Support Content Knowledge Rules from Mature Clinical Systems

    PubMed Central

    DesAutels, Spencer J.; Fox, Zachary E.; Giuse, Dario A.; Williams, Annette M.; Kou, Qing-hua; Weitkamp, Asli; Patel, Neal R.; Bettinsoli Giuse, Nunzia

    2016-01-01

    Clinical decision support (CDS) knowledge, embedded over time in mature medical systems, presents an interesting and complex opportunity for information organization, maintenance, and reuse. To have a holistic view of all decision support requires an in-depth understanding of each clinical system as well as expert knowledge of the latest evidence. This approach to clinical decision support presents an opportunity to unify and externalize the knowledge within rules-based decision support. Driven by an institutional need to prioritize decision support content for migration to new clinical systems, the Center for Knowledge Management and Health Information Technology teams applied their unique expertise to extract content from individual systems, organize it through a single extensible schema, and present it for discovery and reuse through a newly created Clinical Support Knowledge Acquisition and Archival Tool (CS-KAAT). CS-KAAT can build and maintain the underlying knowledge infrastructure needed by clinical systems. PMID:28269846

  12. OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression

    PubMed Central

    Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel

    2008-01-01

    Background Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available online. PMID:18237434

  13. Reactive extraction at liquid-liquid systems

    NASA Astrophysics Data System (ADS)

    Wieszczycka, Karolina

    2018-01-01

    The chapter summarizes the state of knowledge about metal transport in two-phase systems. The first part of this review focuses on the distribution law and the determination of the main factors in classical solvent extraction (solubility and polarity of the solute, as well as inter- and intramolecular interactions). The next part of the chapter is devoted to reactive solvent extraction and to molecular modeling, which requires knowledge of the type of extractant, complexation mechanisms, metal ion speciation and oxidation during complex formation, and other parameters needed to understand the extraction process. The kinetic data required for proper modeling, simulation, and design of critical separation processes are also discussed. Extraction in liquid-solid systems using solvent-impregnated resins is largely analogous to the corresponding solvent extraction, so this subject is also presented across all aspects of the separation process (equilibrium, mechanism, kinetics).

  14. Enhancing acronym/abbreviation knowledge bases with semantic information.

    PubMed

    Torii, Manabu; Liu, Hongfang

    2007-10-11

    In the biomedical domain, a terminology knowledge base that associates acronyms/abbreviations (denoted as SFs) with their definitions (denoted as LFs) is highly needed. For the construction of such a terminology knowledge base, we investigate the feasibility of building a system that automatically assigns semantic categories to LFs extracted from text. Given a collection of (SF, LF) pairs derived from text, we i) assess the coverage of LFs and (SF, LF) pairs in the UMLS and justify the need for a semantic category assignment system; and ii) automatically derive name phrases annotated with semantic categories and construct a system using machine learning. Utilizing ADAM, an existing collection of (SF, LF) pairs extracted from MEDLINE, our system achieved an F-measure of 87% when assigning eight UMLS-based semantic groups to LFs. The system has been incorporated into a web interface which integrates SF knowledge from multiple SF knowledge bases. Web site: http://gauss.dbb.georgetown.edu/liblab/SFThesurus.
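
    A minimal sketch of the semantic category assignment step described above: a bag-of-words classifier mapping long forms (LFs) to coarse semantic groups. The training pairs and group labels are invented placeholders; the paper derives its training data from the UMLS and the ADAM collection.

```python
# Sketch: assigning a coarse semantic group to long forms (LFs) with a
# bag-of-words classifier. The training pairs and group labels are invented
# placeholders, not the UMLS/ADAM data used in the paper.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_lfs = [
    "magnetic resonance imaging", "computed tomography",
    "tumor necrosis factor", "epidermal growth factor receptor",
    "congestive heart failure", "chronic obstructive pulmonary disease",
]
train_groups = ["Procedure", "Procedure", "Chemical", "Chemical",
                "Disorder", "Disorder"]

model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(train_lfs, train_groups)

print(model.predict(["positron emission tomography",
                     "vascular endothelial growth factor"]))
```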

  15. Extracting semantically enriched events from biomedical literature

    PubMed Central

    2012-01-01

    Background Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Results Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP’09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP’09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. Conclusions We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare. PMID:22621266

  16. Extracting semantically enriched events from biomedical literature.

    PubMed

    Miwa, Makoto; Thompson, Paul; McNaught, John; Kell, Douglas B; Ananiadou, Sophia

    2012-05-23

    Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them. Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57-87% on the BioNLP'09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP'09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task. We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.

  17. A Proposal of 3-dimensional Self-organizing Memory and Its Application to Knowledge Extraction from Natural Language

    NASA Astrophysics Data System (ADS)

    Sakakibara, Kai; Hagiwara, Masafumi

    In this paper, we propose a 3-dimensional self-organizing memory and describe its application to knowledge extraction from natural language. First, the proposed system extracts relations between words using JUMAN (a morpheme analysis system) and KNP (a syntax analysis system) and stores them in a short-term memory. In the short-term memory, the relations are attenuated as processing proceeds; however, relations that appear frequently are stored in the long-term memory without attenuation. The relations in the long-term memory are placed into the proposed 3-dimensional self-organizing memory. We used a new learning algorithm called ``Potential Firing'' in the learning phase. In the recall phase, the proposed system recalls relational knowledge from the learned knowledge based on the input sentence, using a new recall algorithm called ``Waterfall Recall''. We added a function that answers questions posed in natural language with ``yes/no'' in order to confirm the validity of the proposed system by evaluating the number of correct answers.
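
    The short-term/long-term memory behaviour described above can be sketched as attenuation plus frequency-based promotion. The decay rate, promotion threshold, and toy relations below are illustrative parameters, not values from the paper.

```python
# Sketch: an attenuating short-term memory with frequency-based promotion to
# long-term memory. Decay rate and promotion threshold are illustrative.
class RelationMemory:
    def __init__(self, decay=0.9, promote_at=2.5):
        self.short_term = {}          # relation -> activation strength
        self.long_term = set()        # relations kept without attenuation
        self.decay = decay
        self.promote_at = promote_at

    def observe(self, relation):
        """Reinforce a (word, relation, word) triple extracted from a sentence."""
        self.short_term[relation] = self.short_term.get(relation, 0.0) + 1.0
        if self.short_term[relation] >= self.promote_at:
            self.long_term.add(relation)

    def step(self):
        """Attenuate short-term traces as processing proceeds."""
        self.short_term = {r: s * self.decay for r, s in self.short_term.items()
                           if s * self.decay > 0.05}

mem = RelationMemory()
for sentence_relations in [[("dog", "is-a", "animal")], [("dog", "is-a", "animal")],
                           [("dog", "chases", "cat")], [("dog", "is-a", "animal")]]:
    for rel in sentence_relations:
        mem.observe(rel)
    mem.step()
print(mem.long_term)   # the frequently seen relation has been promoted
```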

  18. Automatic acquisition of domain and procedural knowledge

    NASA Technical Reports Server (NTRS)

    Ferber, H. J.; Ali, M.

    1988-01-01

    The design concept and performance of AKAS, an automated knowledge-acquisition system for the development of expert systems, are discussed. AKAS was developed using the FLES knowledge base for the electrical system of the B-737 aircraft and employs a 'learn by being told' strategy. The system comprises four basic modules, a system administration module, a natural-language concept-comprehension module, a knowledge-classification/extraction module, and a knowledge-incorporation module; details of the module architectures are explored.

  19. Knowledge Representation Of CT Scans Of The Head

    NASA Astrophysics Data System (ADS)

    Ackerman, Laurens V.; Burke, M. W.; Rada, Roy

    1984-06-01

    We have been investigating diagnostic knowledge models which assist in the automatic classification of medical images by combining information extracted from each image with knowledge specific to that class of images. In a more general sense we are trying to integrate verbal and pictorial descriptions of disease via representations of knowledge, study automatic hypothesis generation as related to clinical medicine, evolve new mathematical image measures while integrating them into the total diagnostic process, and investigate ways to augment the knowledge of the physician. Specifically, we have constructed an artificial intelligence knowledge model using the technique of a production system blending pictorial and verbal knowledge about the respective CT scan and patient history. It is an attempt to tie together different sources of knowledge representation, picture feature extraction and hypothesis generation. Our knowledge reasoning and representation system (KRRS) works with data at the conscious reasoning level of the practicing physician while at the visual perceptional level we are building another production system, the picture parameter extractor (PPE). This paper describes KRRS and its relationship to PPE.

  20. PKDE4J: Entity and relation extraction for public knowledge discovery.

    PubMed

    Song, Min; Kim, Won Chul; Lee, Dahee; Heo, Go Eun; Kang, Keun Young

    2015-10-01

    Due to an enormous number of scientific publications that cannot be handled manually, there is a rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have primarily focused on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with the Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also has fairly good performance in terms of accuracy as well as the ability to configure text-processing components. We demonstrate its competitive performance by evaluating it on many corpora and find that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction. Copyright © 2015 Elsevier Inc. All rights reserved.
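
    A toy sketch of the dictionary-based entity extraction and rule-based relation extraction pipeline that PKDE4J integrates. The dictionaries, trigger phrases, and example sentence are invented; the actual system builds on Stanford CoreNLP and a much richer rule set.

```python
# Sketch: dictionary-based entity extraction followed by a simple rule-based
# relation extractor. Dictionaries, triggers, and sentence are toy examples.
import re

GENE_DICT = {"TP53", "BRCA1", "EGFR"}
DISEASE_DICT = {"breast cancer", "lung cancer"}
TRIGGERS = {"associated with", "linked to"}

def extract_entities(sentence):
    found = []
    for term in GENE_DICT | DISEASE_DICT:
        for m in re.finditer(re.escape(term), sentence, flags=re.IGNORECASE):
            label = "Gene" if term in GENE_DICT else "Disease"
            found.append((m.group(0), label, m.start()))
    return sorted(found, key=lambda e: e[2])

def extract_relations(sentence):
    ents = extract_entities(sentence)
    rels = []
    for i, (t1, l1, s1) in enumerate(ents):
        for t2, l2, s2 in ents[i + 1:]:
            between = sentence[s1 + len(t1):s2].lower()
            # Rule: a Gene and a Disease joined by a trigger phrase.
            if {l1, l2} == {"Gene", "Disease"} and any(t in between for t in TRIGGERS):
                rels.append((t1, "associated_with", t2))
    return rels

print(extract_relations("BRCA1 mutations are associated with breast cancer."))
```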

  1. Web-Based Knowledge Exchange through Social Links in the Workplace

    ERIC Educational Resources Information Center

    Filipowski, Tomasz; Kazienko, Przemyslaw; Brodka, Piotr; Kajdanowicz, Tomasz

    2012-01-01

    Knowledge exchange between employees is an essential feature of recent commercial organisations on the competitive market. Based on the data gathered by various information technology (IT) systems, social links can be extracted and exploited in knowledge exchange systems of a new kind. Users of such a system ask their queries and the system…

  2. A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

    PubMed

    García-Remesal, Miguel; Maojo, Victor; Crespo, José

    2010-01-01

    In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The latter are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. For this set, we achieved 87.76% precision and 97.70% recall. This method can facilitate different research tasks. These include text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences.
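
    A regular expression can stand in for the preliminary finite-state recognizer, followed by a small knowledge-based filter that discards false positives, as sketched below. The length threshold and filtering heuristics are illustrative, not the rules from the authors' knowledge base.

```python
# Sketch: a regex stand-in for the finite-state recogniser plus a simple
# knowledge-based filter. Threshold and heuristics are illustrative only.
import re

CANDIDATE = re.compile(r"[ACGTU\s\-]{12,}", re.IGNORECASE)

def extract_sequences(text, min_len=12):
    results = []
    for match in CANDIDATE.finditer(text):
        seq = re.sub(r"[\s\-]", "", match.group(0)).upper()
        # Knowledge-based refinement: drop short hits and ordinary words that
        # happen to use only the letters A, C, G, T, U.
        if len(seq) >= min_len and len(set(seq)) >= 3:
            results.append(seq)
    return results

text = "The primer 5'-ACG TGC AAT GCT TAG GCA-3' amplified the target region."
print(extract_sequences(text))
```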

  3. Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov

    PubMed Central

    Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V

    2016-01-01

    Objective: Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. Methods: We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. Results and Discussion: The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. PMID:27013523
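
    A schematic sketch of gene-alteration pair extraction from eligibility-criteria text using simple patterns. The gene list and patterns are illustrative stand-ins for the information extraction system described above.

```python
# Sketch: pulling (gene, alteration) pairs out of eligibility-criteria text
# with simple patterns. Gene list, patterns, and text are illustrative.
import re

GENES = r"(EGFR|BRAF|KRAS|ALK|HER2)"
PATTERNS = [
    re.compile(GENES + r"\s+(mutation|amplification|fusion|deletion)", re.I),
    re.compile(GENES + r"\s+([A-Z]\d{2,4}[A-Z])"),          # e.g. BRAF V600E
]

def gene_alterations(criteria_text):
    pairs = set()
    for pattern in PATTERNS:
        for gene, alteration in pattern.findall(criteria_text):
            pairs.add((gene.upper(), alteration))
    return sorted(pairs)

text = ("Inclusion: histologically confirmed melanoma with BRAF V600E; "
        "patients with EGFR mutation or ALK fusion are eligible.")
print(gene_alterations(text))
```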

  4. Extracting genetic alteration information for personalized cancer therapy from ClinicalTrials.gov.

    PubMed

    Xu, Jun; Lee, Hee-Jin; Zeng, Jia; Wu, Yonghui; Zhang, Yaoyun; Huang, Liang-Chin; Johnson, Amber; Holla, Vijaykumar; Bailey, Ann M; Cohen, Trevor; Meric-Bernstam, Funda; Bernstam, Elmer V; Xu, Hua

    2016-07-01

    Clinical trials investigating drugs that target specific genetic alterations in tumors are important for promoting personalized cancer therapy. The goal of this project is to create a knowledge base of cancer treatment trials with annotations about genetic alterations from ClinicalTrials.gov. We developed a semi-automatic framework that combines advanced text-processing techniques with manual review to curate genetic alteration information in cancer trials. The framework consists of a document classification system to identify cancer treatment trials from ClinicalTrials.gov and an information extraction system to extract gene and alteration pairs from the Title and Eligibility Criteria sections of clinical trials. By applying the framework to trials at ClinicalTrials.gov, we created a knowledge base of cancer treatment trials with genetic alteration annotations. We then evaluated each component of the framework against manually reviewed sets of clinical trials and generated descriptive statistics of the knowledge base. The automated cancer treatment trial identification system achieved a high precision of 0.9944. Together with the manual review process, it identified 20 193 cancer treatment trials from ClinicalTrials.gov. The automated gene-alteration extraction system achieved a precision of 0.8300 and a recall of 0.6803. After validation by manual review, we generated a knowledge base of 2024 cancer trials that are labeled with specific genetic alteration information. Analysis of the knowledge base revealed the trend of increased use of targeted therapy for cancer, as well as top frequent gene-alteration pairs of interest. We expect this knowledge base to be a valuable resource for physicians and patients who are seeking information about personalized cancer therapy. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  5. Deep Logic Networks: Inserting and Extracting Knowledge From Deep Belief Networks.

    PubMed

    Tran, Son N; d'Avila Garcez, Artur S

    2018-02-01

    Developments in deep learning have seen the use of layerwise unsupervised learning combined with supervised learning for fine-tuning. With this layerwise approach, a deep network can be seen as a more modular system that lends itself well to learning representations. In this paper, we investigate whether such modularity can be useful for the insertion of background knowledge into deep networks and whether it can improve learning performance when such knowledge is available, as well as for the extraction of knowledge from trained deep networks and whether it can offer a better understanding of the representations learned by such networks. To this end, we use a simple symbolic language, a set of logical rules that we call confidence rules, and show that it is suitable for the representation of quantitative reasoning in deep networks. We show by knowledge extraction that confidence rules can offer a low-cost representation for layerwise networks (or restricted Boltzmann machines). We also show that layerwise extraction can produce an improvement in the accuracy of deep belief networks. Furthermore, the proposed symbolic characterization of deep networks provides a novel method for the insertion of prior knowledge and training of deep networks. With the use of this method, a deep neural-symbolic system is proposed and evaluated, with the experimental results indicating that modularity through the use of confidence rules and knowledge insertion can be beneficial to network performance.
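
    The sketch below illustrates the general idea of reading rule-like structure off a restricted Boltzmann machine's weight matrix, one rule per hidden unit. It is a schematic stand-in, not the authors' confidence-rule algorithm; the weights, feature names, and threshold are invented.

```python
# Sketch: weight-based rule extraction from an RBM, one rule per hidden unit.
# Schematic illustration only; weights, names, and threshold are invented.
import numpy as np

visible_names = ["fever", "cough", "rash", "fatigue"]
# Rows: visible units, columns: hidden units (toy weights).
W = np.array([[ 2.1, -0.2],
              [ 1.8,  0.1],
              [-1.5,  2.4],
              [ 0.3,  1.9]])

def extract_rules(W, names, threshold=1.0):
    rules = []
    for j in range(W.shape[1]):
        literals = []
        for i, w in enumerate(W[:, j]):
            if abs(w) >= threshold:
                # Positive weight -> positive literal, negative -> negated literal.
                literals.append((names[i] if w > 0 else f"NOT {names[i]}", abs(w)))
        rules.append((f"h{j}", literals))
    return rules

for head, body in extract_rules(W, visible_names):
    body_str = " AND ".join(f"{lit} [{conf:.1f}]" for lit, conf in body)
    print(f"{head} <-> {body_str}")
```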

  6. New Method for Knowledge Management Focused on Communication Pattern in Product Development

    NASA Astrophysics Data System (ADS)

    Noguchi, Takashi; Shiba, Hajime

    In the field of manufacturing, the importance of utilizing knowledge and know-how has been growing. Against this background, new methods are needed to efficiently accumulate and extract effective knowledge and know-how. To facilitate the extraction of the knowledge and know-how needed by engineers, we first defined business process information, which includes schedule/progress information, document data, information about communication among the parties concerned, and information that links these three types of information. Based on our definitions, we proposed an IT system (FlexPIM: Flexible and collaborative Process Information Management) to register and accumulate business process information with the least effort. In order to efficiently extract effective information from huge volumes of accumulated business process information, focusing attention on “actions” and communication patterns, we propose a new extraction method using communication patterns. The validity of this method has been verified for several communication patterns.

  7. A knowledge-based decision support system in bioinformatics: an application to protein complex extraction

    PubMed Central

    2013-01-01

    Background We introduce a Knowledge-based Decision Support System (KDSS) in order to address the Protein Complex Extraction problem. Using a Knowledge Base (KB) coding the expertise about the proposed scenario, our KDSS is able to suggest both strategies and tools, according to the features of the input dataset. Our system provides a navigable workflow for the current experiment and furthermore it offers support in the configuration and running of every processing component of that workflow. This last feature makes our system a crossover between classical DSS and Workflow Management Systems. Results We briefly present the KDSS' architecture and basic concepts used in the design of the knowledge base and the reasoning component. The system is then tested using a subset of the Saccharomyces cerevisiae Protein-Protein interaction dataset. We used this subset because it has been well studied in the literature by several research groups in the field of complex extraction: in this way we could easily compare the results obtained through our KDSS with theirs. Our system suggests both a preprocessing and a clustering strategy, and for each of them it proposes and eventually runs suitable algorithms. Our system's final results are then composed of a workflow of tasks, which can be reused for other experiments, and the specific numerical results for that particular trial. Conclusions The proposed approach, using the KDSS' knowledge base, provides a novel workflow that gives the best results with regard to the other workflows produced by the system. This workflow and its numeric results have been compared with other approaches to PPI network analysis found in the literature, offering similar results. PMID:23368995

  8. NASA's online machine aided indexing system

    NASA Technical Reports Server (NTRS)

    Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.

    1993-01-01

    This report describes the NASA Lexical Dictionary, a machine aided indexing system used online at the National Aeronautics and Space Administration's Center for Aerospace Information (CASI). This system is comprised of a text processor that is based on the computational, non-syntactic analysis of input text, and an extensive 'knowledge base' that serves to recognize and translate text-extracted concepts. The structure and function of the various NLD system components are described in detail. Methods used for the development of the knowledge base are discussed. Particular attention is given to a statistically-based text analysis program that provides the knowledge base developer with a list of concept-specific phrases extracted from large textual corpora. Production and quality benefits resulting from the integration of machine aided indexing at CASI are discussed along with a number of secondary applications of NLD-derived systems including on-line spell checking and machine aided lexicography.

  9. A model for indexing medical documents combining statistical and symbolic knowledge.

    PubMed

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-10-11

    To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. The use of several terminologies leads to more precise indexing. The improvement achieved in the model's implementation performance as a result of using semantic relationships is encouraging.
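
    The combination of terminology matching with corpus statistics can be sketched as follows: candidate index terms are looked up in a small terminology and ranked by a term-frequency/inverse-document-frequency score. The terminology, codes, and corpus are toy data, and IDF merely stands in for the domain statistics used in the paper.

```python
# Sketch: ranking candidate index terms for a discharge summary by combining
# a terminology lookup with a corpus statistic (IDF). All data here are toys.
import math
from collections import Counter

TERMINOLOGY = {"myocardial infarction": "I21", "hypertension": "I10",
               "pneumonia": "J18", "chest pain": "R07"}

corpus = [
    "patient admitted with chest pain and hypertension",
    "acute myocardial infarction treated with angioplasty",
    "community acquired pneumonia, hypertension noted",
]

def idf(term, docs):
    df = sum(term in d for d in docs)
    return math.log((1 + len(docs)) / (1 + df)) + 1.0

def rank_terms(document, docs=corpus):
    tf = Counter({term: document.lower().count(term) for term in TERMINOLOGY})
    scored = [(term, TERMINOLOGY[term], tf[term] * idf(term, docs))
              for term in TERMINOLOGY if tf[term] > 0]
    return sorted(scored, key=lambda x: x[2], reverse=True)

print(rank_terms("elderly patient with hypertension presenting acute myocardial infarction"))
```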

  10. A Model for Indexing Medical Documents Combining Statistical and Symbolic Knowledge.

    PubMed Central

    Avillach, Paul; Joubert, Michel; Fieschi, Marius

    2007-01-01

    OBJECTIVES: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context. METHODS: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs). RESULTS: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases. CONCLUSIONS: The use of several terminologies leads to more precise indexing. The improvement achieved in the model’s implementation performances as a result of using semantic relationships is encouraging. PMID:18693792

  11. Development of a knowledge acquisition tool for an expert system flight status monitor

    NASA Technical Reports Server (NTRS)

    Disbrow, J. D.; Duke, E. L.; Regenie, V. A.

    1986-01-01

    Two of the main issues in artificial intelligence today are knowledge acquisition and knowledge representation. The Dryden Flight Research Facility of NASA's Ames Research Center is presently involved in the design and implementation of an expert system flight status monitor that will provide expertise and knowledge to aid the flight systems engineer in monitoring today's advanced high-performance aircraft. The flight status monitor can be divided into two sections: the expert system itself and the knowledge acquisition tool. The knowledge acquisition tool, the means it uses to extract knowledge from the domain expert, and how that knowledge is represented for computer use are discussed. An actual aircraft system has been codified by this tool with great success. Future real-time use of the expert system has been facilitated by using the knowledge acquisition tool to easily generate a logically consistent and complete knowledge base.

  12. Development of a knowledge acquisition tool for an expert system flight status monitor

    NASA Technical Reports Server (NTRS)

    Disbrow, J. D.; Duke, E. L.; Regenie, V. A.

    1986-01-01

    Two of the main issues in artificial intelligence today are knowledge acquisition and knowledge representation. The Dryden Flight Research Facility of NASA's Ames Research Center is presently involved in the design and implementation of an expert system flight status monitor that will provide expertise and knowledge to aid the flight systems engineer in monitoring today's advanced high-performance aircraft. The flight status monitor can be divided into two sections: the expert system itself and the knowledge acquisition tool. This paper discusses the knowledge acquisition tool, the means it uses to extract knowledge from the domain expert, and how that knowledge is represented for computer use. An actual aircraft system has been codified by this tool with great success. Future real-time use of the expert system has been facilitated by using the knowledge acquisition tool to easily generate a logically consistent and complete knowledge base.

  13. User-centered evaluation of Arizona BioPathway: an information extraction, integration, and visualization system.

    PubMed

    Quiñones, Karin D; Su, Hua; Marshall, Byron; Eggers, Shauna; Chen, Hsinchun

    2007-09-01

    Explosive growth in biomedical research has made automated information extraction, knowledge integration, and visualization increasingly important and critically needed. The Arizona BioPathway (ABP) system extracts and displays biological regulatory pathway information from the abstracts of journal articles. This study uses relations extracted from more than 200 PubMed abstracts presented in a tabular and graphical user interface with built-in search and aggregation functionality. This paper presents a task-centered assessment of the usefulness and usability of the ABP system focusing on its relation aggregation and visualization functionalities. Results suggest that our graph-based visualization is more efficient in supporting pathway analysis tasks and is perceived as more useful and easier to use as compared to a text-based literature-viewing method. Relation aggregation significantly contributes to knowledge-acquisition efficiency. Together, the graphic and tabular views in the ABP Visualizer provide a flexible and effective interface for pathway relation browsing and analysis. Our study contributes to pathway-related research and biological information extraction by assessing the value of a multiview, relation-based interface that supports user-controlled exploration of pathway information across multiple granularities.

  14. Document Exploration and Automatic Knowledge Extraction for Unstructured Biomedical Text

    NASA Astrophysics Data System (ADS)

    Chu, S.; Totaro, G.; Doshi, N.; Thapar, S.; Mattmann, C. A.; Ramirez, P.

    2015-12-01

    We describe our work on building a web-browser based document reader with a built-in exploration tool and automatic concept extraction of medical entities for biomedical text. Vast amounts of biomedical information are offered in unstructured text form through scientific publications and R&D reports. Utilizing text mining can help us to mine information and extract relevant knowledge from a plethora of biomedical text. The ability to employ such technologies to aid researchers in coping with information overload is greatly desirable. In recent years, there has been an increased interest in automatic biomedical concept extraction [1, 2] and intelligent PDF reader tools with the ability to search on content and find related articles [3]. Such reader tools are typically desktop applications and are limited to specific platforms. Our goal is to provide researchers with a simple tool to aid them in finding, reading, and exploring documents. Thus, we propose a web-based document explorer, which we called Shangri-Docs, which combines a document reader with automatic concept extraction and highlighting of relevant terms. Shangri-Docs also provides the ability to evaluate a wide variety of document formats (e.g., PDF, Word, PPT, text, etc.) and to exploit the linked nature of the Web and personal content by performing searches on content from public sites (e.g., Wikipedia, PubMed) and private cataloged databases simultaneously. Shangri-Docs utilizes Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) [4] and the Unified Medical Language System (UMLS) to automatically identify and highlight terms and concepts, such as specific symptoms, diseases, drugs, and anatomical sites, mentioned in the text. cTAKES was originally designed specifically to extract information from clinical medical records. Our investigation leads us to extend the automatic knowledge extraction process of cTAKES to the biomedical research domain by improving the ontology-guided information extraction process. We will describe our experience and implementation of our system and share lessons learned from our development. We will also discuss ways in which this could be adapted to other science fields. [1] Funk et al., 2014. [2] Kang et al., 2014. [3] Utopia Documents, http://utopiadocs.com [4] Apache cTAKES, http://ctakes.apache.org

  15. Ontology-Based Search of Genomic Metadata.

    PubMed

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.

  16. Harvesting Intelligence in Multimedia Social Tagging Systems

    NASA Astrophysics Data System (ADS)

    Giannakidou, Eirini; Kaklidou, Foteini; Chatzilari, Elisavet; Kompatsiaris, Ioannis; Vakali, Athena

    As more people adopt tagging practices, social tagging systems tend to form rich knowledge repositories that enable the extraction of patterns reflecting the way content semantics is perceived by the web users. This is of particular importance, especially in the case of multimedia content, since the availability of such content in the web is very high and its efficient retrieval using textual annotations or content-based automatically extracted metadata still remains a challenge. It is argued that complementing multimedia analysis techniques with knowledge drawn from web social annotations may facilitate multimedia content management. This chapter focuses on analyzing tagging patterns and combining them with content feature extraction methods, generating, thus, intelligence from multimedia social tagging systems. Emphasis is placed on using all available "tracks" of knowledge, that is tag co-occurrence together with semantic relations among tags and low-level features of the content. Towards this direction, a survey on the theoretical background and the adopted practices for analysis of multimedia social content are presented. A case study from Flickr illustrates the efficiency of the proposed approach.
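
    The first "track" of knowledge mentioned above, tag co-occurrence, can be computed with a few lines of code. The tag assignments below are toy examples standing in for Flickr annotations.

```python
# Sketch: building a tag co-occurrence count from social tagging data.
# The tag sets are toy examples standing in for Flickr annotations.
from collections import Counter
from itertools import combinations

tagged_photos = [
    {"sunset", "beach", "sea"},
    {"sunset", "sea", "holiday"},
    {"beach", "holiday", "family"},
]

cooccurrence = Counter()
for tags in tagged_photos:
    for a, b in combinations(sorted(tags), 2):
        cooccurrence[(a, b)] += 1

# Strongly co-occurring tag pairs hint at shared content semantics.
for pair, count in cooccurrence.most_common(5):
    print(pair, count)
```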

  17. Information Extraction from Unstructured Text for the Biodefense Knowledge Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Samatova, N F; Park, B; Krishnamurthy, R

    2005-04-29

    The Bio-Encyclopedia at the Biodefense Knowledge Center (BKC) is being constructed to allow early detection of emerging biological threats to homeland security. It requires highly structured information extracted from a variety of data sources. However, the quantity of new and vital information available from everyday sources cannot be assimilated by hand, and therefore reliable high-throughput information extraction techniques are much anticipated. In support of the BKC, Lawrence Livermore National Laboratory and Oak Ridge National Laboratory, together with the University of Utah, are developing an information extraction system built around the bioterrorism domain. This paper reports two important pieces of our effort integrated into the system: key phrase extraction and semantic tagging. Whereas two key phrase extraction technologies developed during the course of the project help identify relevant texts, our state-of-the-art semantic tagging system can pinpoint phrases related to emerging biological threats. Also we are enhancing and tailoring the Bio-Encyclopedia by augmenting semantic dictionaries and extracting details of important events, such as suspected disease outbreaks. Some of these technologies have already been applied to large corpora of free text sources vital to the BKC mission, including ProMED-mail, PubMed abstracts, and the DHS's Information Analysis and Infrastructure Protection (IAIP) news clippings. In order to address the challenges involved in incorporating such large amounts of unstructured text, the overall system is focused on precise extraction of the most relevant information for inclusion in the BKC.

  18. Technical design and system implementation of region-line primitive association framework

    NASA Astrophysics Data System (ADS)

    Wang, Min; Xing, Jinjin; Wang, Jie; Lv, Guonian

    2017-08-01

    Apart from regions, image edge lines are an important information source, and they deserve more attention in object-based image analysis (OBIA) than they currently receive. In the region-line primitive association framework (RLPAF), we promote straight-edge lines as line primitives to achieve powerful OBIAs. Along with regions, straight lines become basic units for subsequent extraction and analysis of OBIA features. This study develops a new software system called remote-sensing knowledge finder (RSFinder) to implement RLPAF for engineering application purposes. This paper introduces the extended technical framework, a comprehensively designed feature set, key technology, and software implementation. To our knowledge, RSFinder is the world's first OBIA system based on two types of primitives, namely, regions and lines. It is fundamentally different from other well-known region-only-based OBIA systems, such as eCognition and the ENVI feature extraction module. This paper provides an important reference for the development of similarly structured OBIA systems and line-based remote sensing information extraction algorithms.

  19. Concept of operations for knowledge discovery from Big Data across enterprise data warehouses

    NASA Astrophysics Data System (ADS)

    Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.

    2013-05-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as volume, velocity, variety and complexity of enterprise data keeps increasing, the next generation analysts are facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering are newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.

  20. Knowledge Acquisition of Generic Queries for Information Retrieval

    PubMed Central

    Seol, Yoon-Ho; Johnson, Stephen B.; Cimino, James J.

    2002-01-01

    Several studies have identified clinical questions posed by health care professionals to understand the nature of information needs during clinical practice. To support access to digital information sources, it is necessary to integrate the information needs with a computer system. We have developed a conceptual guidance approach in information retrieval, based on a knowledge base that contains the patterns of information needs. The knowledge base uses a formal representation of clinical questions based on the UMLS knowledge sources, called the Generic Query model. To improve the coverage of the knowledge base, we investigated a method for extracting plausible clinical questions from the medical literature. This poster presents the Generic Query model, shows how it is used to represent the patterns of clinical questions, and describes the framework used to extract knowledge from the medical literature.

  1. Support patient search on pathology reports with interactive online learning based data extraction.

    PubMed

    Zheng, Shuai; Lu, James J; Appin, Christina; Brat, Daniel; Wang, Fusheng

    2015-01-01

    Structural reporting enables semantic understanding and prompt retrieval of clinical findings about patients. While synoptic pathology reporting provides templates for data entries, information in pathology reports remains primarily in narrative free text form. Extracting data of interest from narrative pathology reports could significantly improve the representation of the information and enable complex structured queries. However, manual extraction is tedious and error-prone, and automated tools are often constructed with a fixed training dataset and not easily adaptable. Our goal is to extract data from pathology reports to support advanced patient search with a highly adaptable semi-automated data extraction system, which can adjust and self-improve by learning from a user's interaction with minimal human effort. We have developed an online machine learning based information extraction system called IDEAL-X. With its graphical user interface, the system's data extraction engine automatically annotates values for users to review upon loading each report text. The system analyzes users' corrections regarding these annotations with online machine learning, and incrementally enhances and refines the learning model as reports are processed. The system also takes advantage of customized controlled vocabularies, which can be adaptively refined during the online learning process to further assist the data extraction. As the accuracy of automatic annotation improves over time, the effort of human annotation is gradually reduced. After all reports are processed, a built-in query engine can be applied to conveniently define queries based on extracted structured data. We have evaluated the system with a dataset of anatomic pathology reports from 50 patients. Extracted data elements include demographic data, diagnosis, genetic marker, and procedure. The system achieves F-1 scores of around 95% for the majority of tests. Extracting data from pathology reports could enable more accurate knowledge to support biomedical research and clinical diagnosis. IDEAL-X provides a bridge that takes advantage of online machine learning based data extraction and the knowledge from human feedback. By combining iterative online learning and adaptive controlled vocabularies, IDEAL-X can deliver highly adaptive and accurate data extraction to support patient search.
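
    The record above describes an interactive loop (automatic annotation, user correction, incremental model update) without implementation details. A minimal sketch of such an online-learning loop, using scikit-learn's SGDClassifier with partial_fit, is given below; the label set, feature hashing, and suggest/correct functions are illustrative assumptions rather than IDEAL-X's actual design.

      # Minimal sketch of an online-learning annotation loop (not IDEAL-X itself).
      # Assumption: each candidate value is represented by the bag-of-words of its
      # surrounding context, and the user either accepts or corrects the suggested label.
      from sklearn.feature_extraction.text import HashingVectorizer
      from sklearn.linear_model import SGDClassifier

      LABELS = ["diagnosis", "procedure", "genetic_marker", "other"]   # hypothetical label set

      vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False)
      model = SGDClassifier()

      def suggest(context):
          """Propose a label for a candidate value; fall back before any training."""
          try:
              return model.predict(vectorizer.transform([context]))[0]
          except Exception:          # model not fitted yet
              return "other"

      def learn_from_correction(context, corrected_label):
          """Incrementally refine the model with the user's (possibly corrected) label."""
          X = vectorizer.transform([context])
          model.partial_fit(X, [corrected_label], classes=LABELS)

      # Simulated review session: the engine suggests, the user corrects, the model refines.
      for context, gold in [("glioblastoma who grade iv left frontal", "diagnosis"),
                            ("craniotomy with tumor resection performed", "procedure")]:
          if suggest(context) != gold:
              learn_from_correction(context, gold)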

  2. Intelligent control for modeling of real-time reservoir operation, part II: artificial neural network with operating rule curves

    NASA Astrophysics Data System (ADS)

    Chang, Ya-Ting; Chang, Li-Chiu; Chang, Fi-John

    2005-04-01

    To bridge the gap between academic research and actual operation, we propose an intelligent control system for reservoir operation. The methodology includes two major processes: knowledge acquisition and implementation, and the inference system. In this study, a genetic algorithm (GA) and a fuzzy rule base (FRB) are used to extract knowledge based on the historical inflow data with a design objective function and on the operating rule curves, respectively. The adaptive network-based fuzzy inference system (ANFIS) is then used to implement the knowledge, to create the fuzzy inference system, and then to estimate the optimal reservoir operation. To investigate its applicability and practicability, the Shihmen reservoir, Taiwan, is used as a case study. For the purpose of comparison, a simulation of the currently used M-5 operating rule curve is also performed. The results demonstrate that (1) the GA is an efficient way to search for optimal input-output patterns, (2) the FRB can extract the knowledge from the operating rule curves, and (3) the ANFIS models built on different types of knowledge can produce much better performance than the traditional M-5 curves in real-time reservoir operation. Moreover, we show that the model can be more intelligent for reservoir operation if more information (or knowledge) is involved.

  3. EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning.

    PubMed

    Zhao, Chao; Jiang, Jingchi; Guan, Yi; Guo, Xitong; He, Bin

    2018-05-01

    Electronic medical records (EMRs) contain medical knowledge that can be used for clinical decision support (CDS). Our objective is to develop a general system that can extract and represent knowledge contained in EMRs to support three CDS tasks-test recommendation, initial diagnosis, and treatment plan recommendation-given the condition of a patient. We extracted four kinds of medical entities from records and constructed an EMR-based medical knowledge network (EMKN), in which nodes are entities and edges reflect their co-occurrence in a record. Three bipartite subgraphs (bigraphs) were extracted from the EMKN, one to support each task. One part of the bigraph was the given condition (e.g., symptoms), and the other was the condition to be inferred (e.g., diseases). Each bigraph was regarded as a Markov random field (MRF) to support the inference. We proposed three graph-based energy functions and three likelihood-based energy functions. Two of these functions are based on knowledge representation learning and can provide distributed representations of medical entities. Two EMR datasets and three metrics were utilized to evaluate the performance. As a whole, the evaluation results indicate that the proposed system outperformed the baseline methods. The distributed representation of medical entities does reflect similarity relationships with respect to knowledge level. Combining EMKN and MRF is an effective approach for general medical knowledge representation and inference. Different tasks, however, require individually designed energy functions. Copyright © 2018 Elsevier B.V. All rights reserved.
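
    The EMKN construction step described above, entities as nodes with edges weighted by co-occurrence within a record, is straightforward to prototype. The sketch below builds such a network with networkx from toy records; the entity lists and the simple co-occurrence score are assumptions for illustration, and the paper's MRF energy functions are not reproduced.

      # Toy construction of an EMR-based medical knowledge network (EMKN-style):
      # nodes are extracted entities, edge weights count co-occurrence within a record.
      from itertools import combinations
      import networkx as nx

      # Hypothetical records, each already reduced to its extracted entities.
      records = [
          {"symptom:cough", "symptom:fever", "disease:pneumonia", "test:chest_xray"},
          {"symptom:fever", "disease:influenza", "treatment:oseltamivir"},
          {"symptom:cough", "disease:pneumonia", "treatment:antibiotics"},
      ]

      emkn = nx.Graph()
      for entities in records:
          for a, b in combinations(sorted(entities), 2):
              if emkn.has_edge(a, b):
                  emkn[a][b]["weight"] += 1
              else:
                  emkn.add_edge(a, b, weight=1)

      # A bipartite "given condition -> inferred condition" view, e.g. symptoms vs. diseases,
      # can then be read off the graph for one of the three CDS tasks.
      symptoms = [n for n in emkn if n.startswith("symptom:")]
      for d in [n for n in emkn if n.startswith("disease:")]:
          score = sum(emkn[d][s]["weight"] for s in symptoms if emkn.has_edge(d, s))
          print(d, score)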

  4. Automated Extraction of Substance Use Information from Clinical Texts.

    PubMed

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.

  5. Establishment of a Digital Knowledge Conversion Architecture Design Learning with High User Acceptance

    ERIC Educational Resources Information Center

    Wu, Yun-Wu; Weng, Apollo; Weng, Kuo-Hua

    2017-01-01

    The purpose of this study is to design a knowledge conversion and management digital learning system for architecture design learning, helping students to share, extract, use and create their design knowledge through web-based interactive activities based on socialization, internalization, combination and externalization process in addition to…

  6. Towards building a disease-phenotype knowledge base: extracting disease-manifestation relationship from literature

    PubMed Central

    Xu, Rong; Li, Li; Wang, QuanQiu

    2013-01-01

    Motivation: Systems approaches to studying phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repurposing. Currently, systematic study of disease phenotypic relationships on a phenome-wide scale is limited because large-scale machine-understandable disease–phenotype relationship knowledge bases are often unavailable. Here, we present an automatic approach to extract disease–manifestation (D-M) pairs (one specific type of disease–phenotype relationship) from the wide body of published biomedical literature. Data and Methods: Our method leverages external knowledge and limits the amount of human effort required. For the text corpus, we used 119 085 682 MEDLINE sentences (21 354 075 citations). First, we used D-M pairs from existing biomedical ontologies as prior knowledge to automatically discover D-M–specific syntactic patterns. We then extracted additional pairs from MEDLINE using the learned patterns. Finally, we analysed correlations between disease manifestations and disease-associated genes and drugs to demonstrate the potential of this newly created knowledge base in disease gene discovery and drug repurposing. Results: In total, we extracted 121 359 unique D-M pairs with a high precision of 0.924. Among the extracted pairs, 120 419 (99.2%) have not been captured in existing structured knowledge sources. We have shown that disease manifestations correlate positively with both disease-associated genes and drug treatments. Conclusions: The main contribution of our study is the creation of a large-scale and accurate D-M phenotype relationship knowledge base. This unique knowledge base, when combined with existing phenotypic, genetic and proteomic datasets, can have profound implications in our deeper understanding of disease etiology and in rapid drug repurposing. Availability: http://nlp.case.edu/public/data/DMPatternUMLS/ Contact: rxx@case.edu PMID:23828786
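
    The pattern-learning idea above (take known D-M pairs, find sentences mentioning both, and keep the connecting text as a pattern) can be illustrated with a small bootstrap loop. The sentences, seed pairs, and frequency threshold below are toy assumptions; the authors' learned patterns are syntactic rather than plain surface strings.

      # Toy bootstrap of disease-manifestation (D-M) extraction patterns from seed pairs.
      import re
      from collections import Counter

      seed_pairs = [("marfan syndrome", "aortic dilatation"),
                    ("parkinson disease", "resting tremor")]
      sentences = [
          "Marfan syndrome is characterized by aortic dilatation and lens dislocation.",
          "Parkinson disease is characterized by resting tremor and rigidity.",
          "Cystic fibrosis is characterized by chronic cough.",
      ]

      # 1) Learn surface patterns from sentences that contain a known D-M pair.
      patterns = Counter()
      for disease, manifestation in seed_pairs:
          for s in sentences:
              low = s.lower()
              if disease in low and manifestation in low:
                  between = low.split(disease, 1)[1].split(manifestation, 1)[0].strip()
                  patterns[between] += 1

      # 2) Apply frequent patterns to propose new candidate pairs.
      for pattern, count in patterns.items():
          if count < 2:                       # hypothetical frequency threshold
              continue
          regex = re.compile(r"(.+?)\s+" + re.escape(pattern) + r"\s+(.+?)[.,]", re.I)
          for s in sentences:
              m = regex.search(s)
              if m:
                  print("candidate D-M pair:", m.group(1).strip(), "->", m.group(2).strip())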

  7. Domain-independent information extraction in unstructured text

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irwin, N.H.

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal, as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  8. New developments of a knowledge based system (VEG) for inferring vegetation characteristics

    NASA Technical Reports Server (NTRS)

    Kimes, D. S.; Harrison, P. A.; Harrison, P. R.

    1992-01-01

    An extraction technique for inferring physical and biological surface properties of vegetation using nadir and/or directional reflectance data as input has been developed. A knowledge-based system (VEG) accepts spectral data of an unknown target as input, determines the best strategy for inferring the desired vegetation characteristic, applies the strategy to the target data, and provides a rigorous estimate of the accuracy of the inference. Progress in developing the system is presented. VEG combines methods from remote sensing and artificial intelligence, and integrates input spectral measurements with diverse knowledge bases. VEG has been developed to (1) infer spectral hemispherical reflectance from any combination of nadir and/or off-nadir view angles; (2) test and develop new extraction techniques on an internal spectral database; (3) browse, plot, or analyze directional reflectance data in the system's spectral database; (4) discriminate between user-defined vegetation classes using spectral and directional reflectance relationships; and (5) infer unknown view angles from known view angles (known as view angle extension).

  9. XML-based data model and architecture for a knowledge-based grid-enabled problem-solving environment for high-throughput biological imaging.

    PubMed

    Ahmed, Wamiq M; Lenz, Dominik; Liu, Jia; Paul Robinson, J; Ghafoor, Arif

    2008-03-01

    High-throughput biological imaging uses automated imaging devices to collect a large number of microscopic images for analysis of biological systems and validation of scientific hypotheses. Efficient manipulation of these datasets for knowledge discovery requires high-performance computational resources, efficient storage, and automated tools for extracting and sharing such knowledge among different research sites. Newly emerging grid technologies provide powerful means for exploiting the full potential of these imaging techniques. Efficient utilization of grid resources requires the development of knowledge-based tools and services that combine domain knowledge with analysis algorithms. In this paper, we first investigate how grid infrastructure can facilitate high-throughput biological imaging research, and present an architecture for providing knowledge-based grid services for this field. We identify two levels of knowledge-based services. The first level provides tools for extracting spatiotemporal knowledge from image sets and the second level provides high-level knowledge management and reasoning services. We then present cellular imaging markup language, an extensible markup language-based language for modeling of biological images and representation of spatiotemporal knowledge. This scheme can be used for spatiotemporal event composition, matching, and automated knowledge extraction and representation for large biological imaging datasets. We demonstrate the expressive power of this formalism by means of different examples and extensive experimental results.

  10. Proceedings of Joint RL/AFOSR Workshop on Intelligent Information Systems Held at Griffiss AFB, New York on October 22-23, 1991

    DTIC Science & Technology

    1992-04-01

    AND SCHEDULING" TIM FINN, UNIVERSITY OF MARYLAND, BALTIMORE COUNTY E. " EXTRACTING RULES FROM SOFTWARE FOR KNOWLEDGE-BASES" NOAH S. PRYWES, UNIVERSITY...Databases for Planning and Scheduling" Tim Finin, Unisys Corporation 8:30 - 9:00 " Extracting Rules from Software for Knowledge Baseso Noah Prywes, U. of...Space Requirements are Tractable E.G.: FEM, Multiplication Routines, Sorting Programs Lebmwmy fo Al Roseew d. The Ohio Male Unlversity A-2 Type 2

  11. Main Road Extraction from ZY-3 Grayscale Imagery Based on Directional Mathematical Morphology and VGI Prior Knowledge in Urban Areas

    PubMed Central

    Liu, Bo; Wu, Huayi; Wang, Yandong; Liu, Wenming

    2015-01-01

    Main road features extracted from remotely sensed imagery play an important role in many civilian and military applications, such as updating Geographic Information System (GIS) databases, urban structure analysis, spatial data matching and road navigation. Current methods for road feature extraction from high-resolution imagery are typically based on threshold value segmentation. It is difficult however, to completely separate road features from the background. We present a new method for extracting main roads from high-resolution grayscale imagery based on directional mathematical morphology and prior knowledge obtained from the Volunteered Geographic Information found in the OpenStreetMap. The two salient steps in this strategy are: (1) using directional mathematical morphology to enhance the contrast between roads and non-roads; (2) using OpenStreetMap roads as prior knowledge to segment the remotely sensed imagery. Experiments were conducted on two ZiYuan-3 images and one QuickBird high-resolution grayscale image to compare our proposed method to other commonly used techniques for road feature extraction. The results demonstrated the validity and better performance of the proposed method for urban main road feature extraction. PMID:26397832
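
    The first step described above, directional mathematical morphology, can be approximated by opening the image with line-shaped structuring elements at several orientations and keeping the maximum response. The sketch below uses OpenCV on a synthetic image; the kernel length, angle set, and threshold are assumptions, and the OpenStreetMap-guided segmentation step is omitted.

      # Directional morphological opening: enhance bright, elongated (road-like) structures.
      import numpy as np
      import cv2

      def line_kernel(length, angle_deg):
          """Binary line-shaped structuring element of a given length and orientation."""
          k = np.zeros((length, length), dtype=np.uint8)
          theta = np.deg2rad(angle_deg)
          c = length // 2
          dx, dy = np.cos(theta), np.sin(theta)
          p1 = (int(round(c - dx * c)), int(round(c - dy * c)))
          p2 = (int(round(c + dx * c)), int(round(c + dy * c)))
          cv2.line(k, p1, p2, 1, 1)
          return k

      def directional_opening(gray, length=21, angles=(0, 45, 90, 135)):
          """Maximum response over openings with line kernels at several orientations."""
          responses = [cv2.morphologyEx(gray, cv2.MORPH_OPEN, line_kernel(length, a))
                       for a in angles]
          return np.max(np.stack(responses), axis=0)

      # Synthetic example: a bright diagonal "road" on a dark, noisy background.
      img = (np.random.rand(200, 200) * 40).astype(np.uint8)
      cv2.line(img, (10, 190), (190, 10), 200, 3)
      enhanced = directional_opening(img)
      roads = (enhanced > 100).astype(np.uint8)     # hypothetical threshold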

  12. Semantic extraction and processing of medical records for patient-oriented visual index

    NASA Astrophysics Data System (ADS)

    Zheng, Weilin; Dong, Wenjie; Chen, Xiangjiao; Zhang, Jianguo

    2012-02-01

    To obtain a comprehensive and complete understanding of a patient's healthcare status, doctors need to search patient medical records across different healthcare information systems, such as PACS, RIS, HIS, and USIS, as a reference for diagnosis and treatment decisions. However, these procedures are time-consuming and tedious. To solve this kind of problem, we developed a patient-oriented visual index system (VIS) that uses visualization technology to show health status and to retrieve the patient's examination information stored in each system through a 3D human model. In this presentation, we present a new approach for extracting semantic and characteristic information from medical record systems such as RIS/USIS to create the 3D visual index. This approach includes the following steps: (1) building a medical characteristic semantic knowledge base; (2) developing a natural language processing (NLP) engine to perform semantic analysis and logical judgment on text-based medical records; (3) applying the knowledge base and NLP engine to medical records to extract medical characteristics (e.g., positive focus information), and then mapping the extracted information to related organs/parts of the 3D human model to create the visual index. We tested the procedures on 559 radiological reports containing 853 focuses and correctly extracted information for 828 of them, a focus extraction success rate of about 97.1%.

  13. The smooth (tractor) operator: insights of knowledge engineering.

    PubMed

    Cullen, Ralph H; Smarr, Cory-Ann; Serrano-Baquero, Daniel; McBride, Sara E; Beer, Jenay M; Rogers, Wendy A

    2012-11-01

    The design of and training for complex systems requires in-depth understanding of task demands imposed on users. In this project, we used the knowledge engineering approach (Bowles et al., 2004) to assess the task of mowing in a citrus grove. Knowledge engineering is divided into four phases: (1) Establish goals. We defined specific goals based on the stakeholders involved. The main goal was to identify operator demands to support improvement of the system. (2) Create a working model of the system. We reviewed product literature, analyzed the system, and conducted expert interviews. (3) Extract knowledge. We interviewed tractor operators to understand their knowledge base. (4) Structure knowledge. We analyzed and organized operator knowledge to inform project goals. We categorized the information and developed diagrams to display the knowledge effectively. This project illustrates the benefits of knowledge engineering as a qualitative research method to inform technology design and training. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  14. Extraction of the human cerebral ventricular system from MRI: inclusion of anatomical knowledge and clinical perspective

    NASA Astrophysics Data System (ADS)

    Aziz, Aamer; Hu, Qingmao; Nowinski, Wieslaw L.

    2004-04-01

    The human cerebral ventricular system is a complex structure that is essential for well-being, and changes in it reflect disease. It is clinically imperative that the ventricular system be studied in detail, and computer-assisted algorithms are therefore essential. We have developed a novel (patent pending) and robust anatomical knowledge-driven algorithm for automatic extraction of the cerebral ventricular system from MRI. The algorithm is not only unique in its image processing aspects but also incorporates knowledge of neuroanatomy, radiological properties, and variability of the ventricular system. The ventricular system is divided into six 3D regions based on the anatomy and its variability. Within each ventricular region a 2D region of interest (ROI) is defined and then further subdivided into sub-regions. Strict conditions that detect and prevent leakage into the extra-ventricular space are specified for each sub-region based on anatomical knowledge. Each ROI is processed to calculate its local statistics and the local intensity ranges of cerebrospinal fluid, grey matter, and white matter; a seed point is set within the ROI; the region is grown directionally in 3D; anti-leakage conditions are checked and growing is corrected if leakage occurs; and all unconnected grown regions are finally connected by relaxing the growing conditions. The algorithm was tested qualitatively and quantitatively on normal and pathological MRI cases and worked well. In this paper we discuss in more detail the inclusion of anatomical knowledge in the algorithm and the usefulness of our approach from a clinical perspective.
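
    The core operation in the record above, seeded and intensity-bounded region growing with leakage control, can be sketched generically. The breadth-first growth below uses fixed intensity bounds and a simple size-based leak test on a toy volume; the actual algorithm's anatomical ROIs, per-sub-region conditions, and directional growth are not reproduced.

      # Minimal 3D seeded region growing with an intensity window and a crude leak check.
      from collections import deque
      import numpy as np

      def grow_region(volume, seed, lo, hi, max_voxels=50000):
          """Grow a 6-connected region from `seed` while intensities stay within [lo, hi]."""
          mask = np.zeros(volume.shape, dtype=bool)
          mask[seed] = True
          queue, grown = deque([seed]), 0
          while queue:
              z, y, x = queue.popleft()
              grown += 1
              if grown > max_voxels:            # crude anti-leakage condition
                  raise RuntimeError("growth exceeded expected size; possible leak")
              for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                 (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                  nz, ny, nx = z + dz, y + dy, x + dx
                  if (0 <= nz < volume.shape[0] and 0 <= ny < volume.shape[1]
                          and 0 <= nx < volume.shape[2] and not mask[nz, ny, nx]
                          and lo <= volume[nz, ny, nx] <= hi):
                      mask[nz, ny, nx] = True
                      queue.append((nz, ny, nx))
          return mask

      # Toy volume: a dark (CSF-like) block inside brighter tissue.
      vol = np.full((40, 40, 40), 120.0)
      vol[15:25, 15:25, 15:25] = 20.0
      ventricle_mask = grow_region(vol, seed=(20, 20, 20), lo=0.0, hi=40.0)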

  15. Apache Clinical Text and Knowledge Extraction System (cTAKES) | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    The tool extracts deep phenotypic information from the clinical narrative at the document-, episode-, and patient-level. The final output is FHIR compliant patient-level phenotypic summary which can be consumed by research warehouses or the DeepPhe native visualization tool.

  16. Concept of Operations for Collaboration and Discovery from Big Data Across Enterprise Data Warehouses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R

    2013-01-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, the next generation of analysts is facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering are newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.

  17. A Bayesian framework for extracting human gait using strong prior knowledge.

    PubMed

    Zhou, Ziheng; Prügel-Bennett, Adam; Damper, Robert I

    2006-11-01

    Extracting full-body motion of walking people from monocular video sequences in complex, real-world environments is an important and difficult problem, going beyond simple tracking, whose satisfactory solution demands an appropriate balance between use of prior knowledge and learning from data. We propose a consistent Bayesian framework for introducing strong prior knowledge into a system for extracting human gait. In this work, the strong prior is built from a simple articulated model having both time-invariant (static) and time-variant (dynamic) parameters. The model is easily modified to cater to situations such as walkers wearing clothing that obscures the limbs. The statistics of the parameters are learned from high-quality (indoor laboratory) data and the Bayesian framework then allows us to "bootstrap" to accurate gait extraction on the noisy images typical of cluttered, outdoor scenes. To achieve automatic fitting, we use a hidden Markov model to detect the phases of images in a walking cycle. We demonstrate our approach on silhouettes extracted from fronto-parallel ("sideways on") sequences of walkers under both high-quality indoor and noisy outdoor conditions. As well as high-quality data with synthetic noise and occlusions added, we also test walkers with rucksacks, skirts, and trench coats. Results are quantified in terms of chamfer distance and average pixel error between automatically extracted body points and corresponding hand-labeled points. No one part of the system is novel in itself, but the overall framework makes it feasible to extract gait from very much poorer quality image sequences than hitherto. This is confirmed by comparing person identification by gait using our method and a well-established baseline recognition algorithm.

  18. Eliciting and Representing High-Level Knowledge Requirements to Discover Ecological Knowledge in Flower-Visiting Data

    PubMed Central

    2016-01-01

    Observations of individual organisms (data) can be combined with expert ecological knowledge of species, especially causal knowledge, to model and extract from flower–visiting data useful information about behavioral interactions between insect and plant organisms, such as nectar foraging and pollen transfer. We describe and evaluate a method to elicit and represent such expert causal knowledge of behavioral ecology, and discuss the potential for wider application of this method to the design of knowledge-based systems for knowledge discovery in biodiversity and ecosystem informatics. PMID:27851814

  19. A Generic Framework for Extraction of Knowledge from Social Web Sources (Social Networking Websites) for an Online Recommendation System

    ERIC Educational Resources Information Center

    Sathick, Javubar; Venkat, Jaya

    2015-01-01

    Mining social web data is a challenging task and finding user interest for personalized and non-personalized recommendation systems is another important task. Knowledge sharing among web users has become crucial in determining usage of web data and personalizing content in various social websites as per the user's wish. This paper aims to design a…

  20. Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis.

    PubMed

    Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron; Thompson, Julie Dawn

    2009-01-01

    The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented.

  1. Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis

    PubMed Central

    Aniba, Mohamed Radhouene; Siguenza, Sophie; Friedrich, Anne; Plewniak, Frédéric; Poch, Olivier; Marchler-Bauer, Aron

    2009-01-01

    The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented. PMID:18971242

  2. Concept recognition for extracting protein interaction relations from biomedical text

    PubMed Central

    Baumgartner, William A; Lu, Zhiyong; Johnson, Helen L; Caporaso, J Gregory; Paquette, Jesse; Lindemann, Anna; White, Elizabeth K; Medvedeva, Olga; Cohen, K Bretonnel; Hunter, Lawrence

    2008-01-01

    Background: Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing. Results: Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist. Conclusion: Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet . PMID:18834500

  3. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications

    PubMed Central

    Masanz, James J; Ogren, Philip V; Zheng, Jiaping; Sohn, Sunghwan; Kipper-Schuler, Karin C; Chute, Christopher G

    2010-01-01

    We aim to build and evaluate an open-source natural language processing system for information extraction from electronic medical record clinical free-text. We describe and evaluate our system, the clinical Text Analysis and Knowledge Extraction System (cTAKES), released open-source at http://www.ohnlp.org. The cTAKES builds on existing open-source technologies—the Unstructured Information Management Architecture framework and OpenNLP natural language processing toolkit. Its components, specifically trained for the clinical domain, create rich linguistic and semantic annotations. Performance of individual components: sentence boundary detector accuracy=0.949; tokenizer accuracy=0.949; part-of-speech tagger accuracy=0.936; shallow parser F-score=0.924; named entity recognizer and system-level evaluation F-score=0.715 for exact and 0.824 for overlapping spans, and accuracy for concept mapping, negation, and status attributes for exact and overlapping spans of 0.957, 0.943, 0.859, and 0.580, 0.939, and 0.839, respectively. Overall performance is discussed against five applications. The cTAKES annotations are the foundation for methods and modules for higher-level semantic processing of clinical free-text. PMID:20819853

  4. A tutorial on information retrieval: basic terms and concepts

    PubMed Central

    Zhou, Wei; Smalheiser, Neil R; Yu, Clement

    2006-01-01

    This informal tutorial is intended for investigators and students who would like to understand the workings of information retrieval systems, including the most frequently used search engines: PubMed and Google. Having a basic knowledge of the terms and concepts of information retrieval should improve the efficiency and productivity of searches. As well, this knowledge is needed in order to follow current research efforts in biomedical information retrieval and text mining that are developing new systems not only for finding documents on a given topic, but extracting and integrating knowledge across documents. PMID:16722601

  5. Competitive-Cooperative Automated Reasoning from Distributed and Multiple Source of Data

    NASA Astrophysics Data System (ADS)

    Fard, Amin Milani

    Knowledge extraction from distributed database systems has been investigated during the past decade in order to analyze billions of information records. In this work, a competitive deduction approach for a heterogeneous data grid environment is proposed using classic data mining and statistical methods. By applying a game-theoretic concept in a multi-agent model, we design a policy for hierarchical knowledge discovery and inference fusion. To demonstrate the system in operation, a sample multi-expert system has also been developed.

  6. ECO: A Framework for Entity Co-Occurrence Exploration with Faceted Navigation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Halliday, K. D.

    2010-08-20

    Even as highly structured databases and semantic knowledge bases become more prevalent, a substantial amount of human knowledge is reported as written prose. Typical textual reports, such as news articles, contain information about entities (people, organizations, and locations) and their relationships. Automatically extracting such relationships from large text corpora is a key component of corporate and government knowledge bases. The primary goal of the ECO project is to develop a scalable framework for extracting and presenting these relationships for exploration using an easily navigable faceted user interface. ECO uses entity co-occurrence relationships to identify related entities. The system aggregates and indexes information on each entity pair, allowing the user to rapidly discover and mine relational information.

  7. Assessing an AI knowledge-base for asymptomatic liver diseases.

    PubMed

    Babic, A; Mathiesen, U; Hedin, K; Bodemar, G; Wigertz, O

    1998-01-01

    Discovering previously unseen knowledge from clinical data is important in the field of asymptomatic liver diseases. Avoiding liver biopsy, which is used as the ultimate confirmation of diagnosis, by basing the decision on relevant laboratory findings alone would be an essential form of support. The system, based on Quinlan's ID3 algorithm, was simple and efficient in extracting the sought knowledge. Basic principles of applying AI systems are therefore described and complemented with medical evaluation. Some of the diagnostic rules were found to be useful as decision algorithms, i.e., they could be directly applied in clinical work and made part of the knowledge base of the Liver Guide, an automated decision support system.
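
    Quinlan's ID3, mentioned above, grows a decision tree by repeatedly splitting on the attribute with the highest information gain, and each root-to-leaf path reads as a diagnostic rule. The sketch below approximates this with scikit-learn's entropy-criterion decision tree on synthetic laboratory values; the feature names and data are illustrative, not the study's clinical dataset.

      # ID3-style rule extraction approximated with an entropy-criterion decision tree.
      import numpy as np
      from sklearn.tree import DecisionTreeClassifier, export_text

      rng = np.random.default_rng(0)
      # Hypothetical laboratory features: ALT, AST, GGT (arbitrary units).
      X = rng.normal(loc=[40, 35, 50], scale=[15, 12, 25], size=(200, 3))
      # Synthetic label: "abnormal" when ALT and GGT are jointly elevated.
      y = ((X[:, 0] > 55) & (X[:, 2] > 70)).astype(int)

      tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
      tree.fit(X, y)

      # Each root-to-leaf path prints as a human-readable decision rule.
      print(export_text(tree, feature_names=["ALT", "AST", "GGT"]))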

  8. Neural network explanation using inversion.

    PubMed

    Saad, Emad W; Wunsch, Donald C

    2007-01-01

    An important drawback of many artificial neural networks (ANN) is their lack of explanation capability [Andrews, R., Diederich, J., & Tickle, A. B. (1996). A survey and critique of techniques for extracting rules from trained artificial neural networks. Knowledge-Based Systems, 8, 373-389]. This paper starts with a survey of algorithms which attempt to explain the ANN output. We then present HYPINV, a new explanation algorithm which relies on network inversion, i.e., calculating the ANN input which produces a desired output. HYPINV is a pedagogical algorithm that extracts rules in the form of hyperplanes. It is able to generate rules with arbitrarily desired fidelity, maintaining a fidelity-complexity tradeoff. To our knowledge, HYPINV is the only pedagogical rule extraction method that extracts hyperplane rules from continuous or binary attribute neural networks. Different network inversion techniques, involving gradient descent as well as an evolutionary algorithm, are presented. An information theoretic treatment of rule extraction is presented. HYPINV is applied to example synthetic problems, to a real aerospace problem, and compared with similar algorithms using benchmark problems.
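
    The inversion step at the heart of HYPINV, holding the trained weights fixed and running gradient descent on the input until the network produces a desired output, is straightforward to demonstrate. The tiny network, target output, and learning rate below are assumptions for illustration (using PyTorch), and the hyperplane rule extraction built on top of the inversion is not shown.

      # Gradient-based network inversion: find an input that yields a desired output.
      import torch
      import torch.nn as nn

      torch.manual_seed(0)
      net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1), nn.Sigmoid())
      for p in net.parameters():            # trained weights stay fixed during inversion
          p.requires_grad_(False)

      target = torch.tensor([[0.9]])        # desired network output
      x = torch.zeros(1, 2, requires_grad=True)
      optimizer = torch.optim.Adam([x], lr=0.05)

      for _ in range(500):
          optimizer.zero_grad()
          loss = (net(x) - target).pow(2).mean()
          loss.backward()
          optimizer.step()

      print("inverted input:", x.detach().numpy(), "network output:", net(x).item())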

  9. [System evaluation on Ginkgo Biloba extract in the treatment of acute cerebral infarction].

    PubMed

    Wang, Lin; Zhang, Tao; Bai, Kezhen

    2015-10-01

    To evaluate the effect and safety of Ginkgo Biloba extract in the treatment of acute cerebral infarction, the Wanfang, China National Knowledge Infrastructure (CNKI) and VIPU databases were screened for literature on Ginkgo Biloba extract in the treatment of acute cerebral infarction, including clinical randomized controlled trials, and a meta-analysis was performed with the RevMan 4.2 system. Compared with the control group, treatment with Ginkgo Biloba extract enhanced efficacy in the treatment of acute cerebral infarction (OR: 1.60-5.53) and improved the neural function defect score [WMD -3.12 (95% CI: -3.96 to -2.28)]. Ginkgo Biloba extract is beneficial to the improvement of neurological function in patients with acute cerebral infarction and is safe for patients.

  10. Valx: A system for extracting and structuring numeric lab test comparison statements from text

    PubMed Central

    Hao, Tianyong; Liu, Hongfang; Weng, Chunhua

    2017-01-01

    Objectives To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. Methods Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes 7 steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable - numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. Results The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 Diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. Conclusions Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community. PMID:26940748
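
    Step 2 of the pipeline above, extracting the numeric value, unit, and comparison operator, is essentially pattern matching, and a stripped-down version can be written with regular expressions. The variable lexicon, operator set, and example criteria below are toy assumptions and do not reflect Valx's full seven-step normalization.

      # Stripped-down extraction of numeric lab-test comparison statements.
      import re

      VARIABLES = {"hba1c": "HbA1c", "glucose": "glucose"}      # toy variable lexicon
      PATTERN = re.compile(
          r"(?P<var>hba1c|glucose)\s*(?P<op><=|>=|<|>|=)\s*"
          r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>%|mg/dl|mmol/l)?",
          re.IGNORECASE)

      criteria = [
          "HbA1c <= 7.5% at screening",
          "Fasting glucose > 126 mg/dL on two occasions",
      ]

      for text in criteria:
          for m in PATTERN.finditer(text):
              print({"variable": VARIABLES[m.group("var").lower()],
                     "operator": m.group("op"),
                     "value": float(m.group("value")),
                     "unit": (m.group("unit") or "").lower()})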

  11. Valx: A System for Extracting and Structuring Numeric Lab Test Comparison Statements from Text.

    PubMed

    Hao, Tianyong; Liu, Hongfang; Weng, Chunhua

    2016-05-17

    To develop an automated method for extracting and structuring numeric lab test comparison statements from text and evaluate the method using clinical trial eligibility criteria text. Leveraging semantic knowledge from the Unified Medical Language System (UMLS) and domain knowledge acquired from the Internet, Valx takes seven steps to extract and normalize numeric lab test expressions: 1) text preprocessing, 2) numeric, unit, and comparison operator extraction, 3) variable identification using hybrid knowledge, 4) variable - numeric association, 5) context-based association filtering, 6) measurement unit normalization, and 7) heuristic rule-based comparison statements verification. Our reference standard was the consensus-based annotation among three raters for all comparison statements for two variables, i.e., HbA1c and glucose, identified from all of Type 1 and Type 2 diabetes trials in ClinicalTrials.gov. The precision, recall, and F-measure for structuring HbA1c comparison statements were 99.6%, 98.1%, 98.8% for Type 1 diabetes trials, and 98.8%, 96.9%, 97.8% for Type 2 diabetes trials, respectively. The precision, recall, and F-measure for structuring glucose comparison statements were 97.3%, 94.8%, 96.1% for Type 1 diabetes trials, and 92.3%, 92.3%, 92.3% for Type 2 diabetes trials, respectively. Valx is effective at extracting and structuring free-text lab test comparison statements in clinical trial summaries. Future studies are warranted to test its generalizability beyond eligibility criteria text. The open-source Valx enables its further evaluation and continued improvement among the collaborative scientific community.

  12. Managing interoperability and complexity in health systems.

    PubMed

    Bouamrane, M-M; Tao, C; Sarkar, I N

    2015-01-01

    In recent years, we have witnessed substantial progress in the use of clinical informatics systems to support clinicians during episodes of care, manage specialised domain knowledge, perform complex clinical data analysis and improve the management of health organisations' resources. However, the vision of fully integrated health information eco-systems, which provide relevant information and useful knowledge at the point-of-care, remains elusive. This journal Focus Theme reviews some of the enduring challenges of interoperability and complexity in clinical informatics systems. Furthermore, a range of approaches are proposed in order to address, harness and resolve some of the many remaining issues towards a greater integration of health information systems and extraction of useful or new knowledge from heterogeneous electronic data repositories.

  13. A knowledge-base generating hierarchical fuzzy-neural controller.

    PubMed

    Kandadai, R M; Tien, J M

    1997-01-01

    We present an innovative fuzzy-neural architecture that is able to automatically generate a knowledge base, in an extractable form, for use in hierarchical knowledge-based controllers. The knowledge base is in the form of a linguistic rule base appropriate for a fuzzy inference system. First, we modify Berenji and Khedkar's (1992) GARIC architecture to enable it to automatically generate a knowledge base; a pseudosupervised learning scheme using reinforcement learning and error backpropagation is employed. Next, we further extend this architecture to a hierarchical controller that is able to generate its own knowledge base. Example applications are provided to underscore its viability.

  14. Educational System Efficiency Improvement Using Knowledge Discovery in Databases

    ERIC Educational Resources Information Center

    Lukaš, Mirko; Leškovic, Darko

    2007-01-01

    This study describes one possible way of using ICT in an education system. We treated the educational system like a business company and developed an appropriate model for clustering the student population. Modern educational systems are forced to extract the most necessary and purposeful information from a large amount of available data. Clustering…

  15. Knowledge Discovery from Posts in Online Health Communities Using Unified Medical Language System.

    PubMed

    Chen, Donghua; Zhang, Runtong; Liu, Kecheng; Hou, Lei

    2018-06-19

    Patient-reported posts in Online Health Communities (OHCs) contain a variety of valuable information that can help establish knowledge-based online support for online patients. However, utilizing these reports to improve online patient services in the absence of appropriate medical and healthcare expert knowledge is difficult. Thus, we propose a comprehensive knowledge discovery method that is based on the Unified Medical Language System for the analysis of narrative posts in OHCs. First, we propose a domain-knowledge support framework for OHCs to provide a basis for post analysis. Second, we develop a Knowledge-Involved Topic Modeling (KI-TM) method to extract and expand explicit knowledge within the text. We propose four metrics, namely, explicit knowledge rate, latent knowledge rate, knowledge correlation rate, and perplexity, for the evaluation of the KI-TM method. Our experimental results indicate that our proposed method outperforms existing methods in terms of providing knowledge support. Our method enhances knowledge support for online patients and can help develop intelligent OHCs in the future.

  16. A Method of Sharing Tacit Knowledge by a Bulletin Board Link to Video Scene and an Evaluation in the Field of Nursing Skill

    NASA Astrophysics Data System (ADS)

    Shimada, Satoshi; Azuma, Shouzou; Teranaka, Sayaka; Kojima, Akira; Majima, Yukie; Maekawa, Yasuko

    We developed a system in which knowledge can be discovered and shared cooperatively within an organization, based on the SECI model of knowledge management. This system realizes three processes by the following methods. (1) A video demonstrating a skill is segmented into a number of scenes according to its contents, and tacit knowledge is shared in each scene. (2) Tacit knowledge is extracted through a bulletin board linked to each scene. (3) Knowledge is acquired by repeatedly viewing the video scene together with the comments that describe the technical content to be practiced. We conducted experiments in which the system was used by nurses working in general hospitals. The experimental results show that practical nursing know-how can be collected by utilizing a bulletin board linked to video scenes. The results of this study confirmed the possibility of expressing the tacit knowledge embedded in nurses' empirical nursing skills by using video images as cues.

  17. Learning a Health Knowledge Graph from Electronic Medical Records.

    PubMed

    Rotmensch, Maya; Halpern, Yoni; Tlimat, Abdulhakim; Horng, Steven; Sontag, David

    2017-07-20

    Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
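
    The noisy OR model that performed best combines per-cause activation probabilities with a leak term: P(symptom | diseases) = 1 - (1 - leak) * prod over present diseases of (1 - p). A minimal sketch of the inference side with made-up parameters follows; estimating the parameters from 273,174 records is not reproduced.

      # Noisy-OR sketch: P(symptom | diseases) = 1 - (1 - leak) * prod(1 - p[d, s]).
      import numpy as np

      diseases = ["influenza", "pneumonia"]
      symptoms = ["fever", "cough", "chest_pain"]
      leak = 0.01                                    # background probability of each symptom
      # Hypothetical activation probabilities p[d, s]: chance disease d alone triggers symptom s.
      p = np.array([[0.8, 0.6, 0.1],                 # influenza
                    [0.7, 0.8, 0.5]])                # pneumonia

      def symptom_prob(present):
          """Noisy-OR probability of each symptom given the set of present diseases."""
          keep = np.full(len(symptoms), 1.0 - leak)
          for d in present:
              keep *= 1.0 - p[diseases.index(d)]
          return 1.0 - keep

      # Rank single-disease hypotheses by likelihood of the observed findings.
      observed = np.array([1, 1, 0])                 # fever and cough present, no chest pain
      for d in diseases:
          probs = symptom_prob([d])
          likelihood = np.prod(np.where(observed == 1, probs, 1.0 - probs))
          print(d, round(float(likelihood), 4))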

  18. Effective low-level processing for interferometric image enhancement

    NASA Astrophysics Data System (ADS)

    Joo, Wonjong; Cha, Soyoung S.

    1995-09-01

    The hybrid operation of digital image processing and a knowledge-based AI system has been recognized as a desirable approach for the automated evaluation of noise-ridden interferograms. Early noise/data reduction, before the phase is extracted, is essential for the success of the knowledge-based processing. In this paper, new concepts for effective, interactive low-level processing operators, namely a background-matched filter and a directional-smoothing filter, are developed and tested with transonic aerodynamic interferograms. The results indicate that these new operators have promising advantages in noise/data reduction over conventional ones, leading to the success of high-level, intelligent phase extraction.

  19. Biomedical question answering using semantic relations.

    PubMed

    Hristovski, Dimitar; Dinevski, Dejan; Kastrin, Andrej; Rindflesch, Thomas C

    2015-01-16

    The proliferation of the scientific literature in the field of biomedicine makes it difficult to keep abreast of current knowledge, even for domain experts. While general Web search engines and specialized information retrieval (IR) systems have made important strides in recent decades, the problem of accurate knowledge extraction from the biomedical literature is far from solved. Classical IR systems usually return a list of documents that have to be read by the user to extract relevant information. This tedious and time-consuming work can be lessened with automatic Question Answering (QA) systems, which aim to provide users with direct and precise answers to their questions. In this work we propose a novel methodology for QA based on semantic relations extracted from the biomedical literature. We extracted semantic relations with the SemRep natural language processing system from 122,421,765 sentences, which came from 21,014,382 MEDLINE citations (i.e., the complete MEDLINE distribution up to the end of 2012). A total of 58,879,300 semantic relation instances were extracted and organized in a relational database. The QA process is implemented as a search in this database, which is accessed through a Web-based application, called SemBT (available at http://sembt.mf.uni-lj.si ). We conducted an extensive evaluation of the proposed methodology in order to estimate the accuracy of extracting a particular semantic relation from a particular sentence. Evaluation was performed by 80 domain experts. In total 7,510 semantic relation instances belonging to 2,675 distinct relations were evaluated 12,083 times. The instances were evaluated as correct 8,228 times (68%). In this work we propose an innovative methodology for biomedical QA. The system is implemented as a Web-based application that is able to provide precise answers to a wide range of questions. A typical question is answered within a few seconds. The tool has some extensions that make it especially useful for interpretation of DNA microarray results.
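
    Implementing QA as a lookup over stored subject-predicate-object relations is a simple pattern to illustrate. The SQLite schema, predications, and query below are toy assumptions that mimic searching SemRep-style relations; they are not SemBT's actual database.

      # Toy relational store of extracted semantic relations and a question-style lookup.
      import sqlite3

      conn = sqlite3.connect(":memory:")
      conn.execute("""CREATE TABLE predication
                      (subject TEXT, predicate TEXT, object TEXT, pmid TEXT)""")
      conn.executemany(
          "INSERT INTO predication VALUES (?, ?, ?, ?)",
          [("metformin", "TREATS", "type 2 diabetes mellitus", "11111111"),
           ("aspirin",   "TREATS", "myocardial infarction",    "22222222"),
           ("TNF",       "CAUSES", "inflammation",             "33333333")])

      def what_treats(disease):
          """Answer 'What treats X?' by searching the stored TREATS relations."""
          return conn.execute(
              "SELECT subject, pmid FROM predication WHERE predicate = 'TREATS' AND object = ?",
              (disease,)).fetchall()

      print(what_treats("type 2 diabetes mellitus"))   # [('metformin', '11111111')]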

  20. Effect of HEH[EHP] impurities on the ALSEP solvent extraction process

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Holfeltz, Vanessa E.; Campbell, Emily L.; Peterman, Dean R.

    In solvent extraction processes, organic phase impurities can negatively impact separation factors, hydrolytic performance, and overall system robustness. This affects the process-level viability of a separation concept and necessitates knowledge of the behavior and mechanisms to control impurities in the solvent. The most widespread way through which impurities are introduced into a system is through impure extractants and/or diluents used to prepare the solvent, and often development of new purification schemes to achieve the desired level of purity is needed. In this work, the acidic extractant, 2-ethylhexylphosphonic acid mono-2-ethylhexyl ester (HEH[EHP])—proposed for application in extractive processes aimed at separating trivalent minor actinides from lanthanides and other fission products—is characterized with respect to its common impurities and their impact on Am(III) stripping in the Actinide Lanthanide SEParation (ALSEP) system. To control impurities in HEH[EHP], existing purification technologies commonly applied for the acidic organophosphorus reagents are reviewed, and a new method specific to HEH[EHP] purification is presented.

  1. Automatic information extraction from unstructured mammography reports using distributed semantics.

    PubMed

    Gupta, Anupama; Banerjee, Imon; Rubin, Daniel L

    2018-02-01

    To date, the methods developed for automated extraction of information from radiology reports are mainly rule-based or dictionary-based, and, therefore, require substantial manual effort to build these systems. Recent efforts to develop automated systems for entity detection have been undertaken, but little work has been done to automatically extract relations and their associated named entities in narrative radiology reports with accuracy comparable to rule-based methods. Our goal is to extract relations in an unsupervised way from radiology reports without specifying prior domain knowledge. We propose a hybrid approach for information extraction that combines a dependency-based parse tree with distributed semantics for generating structured information frames about particular findings/abnormalities from free-text mammography reports. The proposed IE system obtains an F1-score of 0.94 in terms of completeness of the content in the information frames, which outperforms a state-of-the-art rule-based system in this domain by a significant margin. The proposed system can be leveraged in a variety of applications, such as decision support and information retrieval, and may also easily scale to other radiology domains, since there is no need to tune the system with hand-crafted information extraction rules. Copyright © 2018 Elsevier Inc. All rights reserved.
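
    The dependency-parse half of the hybrid approach can be illustrated with spaCy: parse a report sentence and attach modifiers and a location phrase to a finding via head-child relations. The sentence, finding lexicon, and frame fields below are toy assumptions, and the distributed-semantics component that the paper combines with the parse tree is not shown.

      # Dependency-based sketch: attach modifiers and a location phrase to a finding term.
      # Requires: pip install spacy && python -m spacy download en_core_web_sm
      import spacy

      nlp = spacy.load("en_core_web_sm")
      doc = nlp("There is a small spiculated mass in the left breast.")

      FINDING_TERMS = {"mass", "calcification", "asymmetry"}    # hypothetical finding lexicon

      frames = []
      for token in doc:
          if token.lemma_.lower() in FINDING_TERMS:
              modifiers = [c.text for c in token.children if c.dep_ in ("amod", "compound")]
              # Location via a prepositional phrase attached to the finding, if present.
              location = [t.text for prep in token.children if prep.dep_ == "prep"
                          for t in prep.subtree if t.dep_ in ("amod", "pobj")]
              frames.append({"finding": token.text,
                             "modifiers": modifiers,
                             "location": " ".join(location)})
      print(frames)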

  2. Enhancing biomedical text summarization using semantic relation extraction.

    PubMed

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from large amount of biomedical literature efficiently. In this paper, we present a method for generating text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves the performance and our results are better than the MEAD system, a well-known tool for text summarization.

  3. Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

    PubMed Central

    Roos, Marco; Marshall, M Scott; Gibson, Andrew P; Schuemie, Martijn; Meij, Edgar; Katrenko, Sophia; van Hage, Willem Robert; Krommydas, Konstantinos; Adriaans, Pieter W

    2009-01-01

    Background Hypothesis generation in molecular and cellular biology is an empirical process in which knowledge derived from prior experiments is distilled into a comprehensible model. The requirement of automated support is exemplified by the difficulty of considering all relevant facts that are contained in the millions of documents available from PubMed. Semantic Web provides tools for sharing prior knowledge, while information retrieval and information extraction techniques enable its extraction from literature. Their combination makes prior knowledge available for computational analysis and inference. While some tools provide complete solutions that limit the control over the modeling and extraction processes, we seek a methodology that supports control by the experimenter over these critical processes. Results We describe progress towards automated support for the generation of biomolecular hypotheses. Semantic Web technologies are used to structure and store knowledge, while a workflow extracts knowledge from text. We designed minimal proto-ontologies in OWL for capturing different aspects of a text mining experiment: the biological hypothesis, text and documents, text mining, and workflow provenance. The models fit a methodology that allows focus on the requirements of a single experiment while supporting reuse and posterior analysis of extracted knowledge from multiple experiments. Our workflow is composed of services from the 'Adaptive Information Disclosure Application' (AIDA) toolkit as well as a few others. The output is a semantic model with putative biological relations, with each relation linked to the corresponding evidence. Conclusion We demonstrated a 'do-it-yourself' approach for structuring and extracting knowledge in the context of experimental research on biomolecular mechanisms. The methodology can be used to bootstrap the construction of semantically rich biological models using the results of knowledge extraction processes. Models specific to particular experiments can be constructed that, in turn, link with other semantic models, creating a web of knowledge that spans experiments. Mapping mechanisms can link to other knowledge resources such as OBO ontologies or SKOS vocabularies. AIDA Web Services can be used to design personalized knowledge extraction procedures. In our example experiment, we found three proteins (NF-Kappa B, p21, and Bax) potentially playing a role in the interplay between nutrients and epigenetic gene regulation. PMID:19796406

  4. A Knowledge-Based Approach to Automatic Detection of Equipment Alarm Sounds in a Neonatal Intensive Care Unit Environment.

    PubMed

    Raboshchuk, Ganna; Nadeu, Climent; Jancovic, Peter; Lilja, Alex Peiro; Kokuer, Munevver; Munoz Mahamud, Blanca; Riverola De Veciana, Ana

    2018-01-01

    A large number of alarm sounds triggered by biomedical equipment occur frequently in the noisy environment of a neonatal intensive care unit (NICU) and play a key role in providing healthcare. In this paper, our work on the development of an automatic system for detection of acoustic alarms in that difficult environment is presented. Such an automatic detection system is needed for investigating how a preterm infant reacts to auditory stimuli of the NICU environment and for improved real-time patient monitoring. The approach presented in this paper consists of using the available knowledge about each alarm class in the design of the detection system. The information about the frequency structure is used in the feature extraction stage, and the time structure knowledge is incorporated at the post-processing stage. Several alternative methods are compared for feature extraction, modeling, and post-processing. The detection performance is evaluated with real data recorded in the NICU of the hospital, using both frame-level and period-level metrics. The experimental results show that the inclusion of both spectral and temporal information improves the baseline detection performance by more than 60%.
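
    A minimal sketch of the knowledge-based design described above: the known frequency band of an alarm class drives the feature extraction (band energy per frame), and the known minimum duration drives the post-processing (run-length filtering). The band edges, thresholds, and toy signal are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (not the authors' system): detect an alarm class by the
# energy in its known frequency band and require a minimum duration.
import numpy as np
from scipy.signal import spectrogram

def detect_alarm(x, fs, band=(950.0, 1050.0), min_frames=5):
    f, t, Sxx = spectrogram(x, fs=fs, nperseg=1024, noverlap=512)
    in_band = (f >= band[0]) & (f <= band[1])
    band_energy = Sxx[in_band, :].sum(axis=0)            # frame-level feature
    active = band_energy > 10 * np.median(band_energy)   # frame-level decision
    # Temporal post-processing: keep only runs of active frames that are
    # long enough to match the alarm's known on-time.
    detections, run_start = [], None
    for i, a in enumerate(np.append(active, False)):
        if a and run_start is None:
            run_start = i
        elif not a and run_start is not None:
            if i - run_start >= min_frames:
                detections.append((t[run_start], t[i - 1]))
            run_start = None
    return detections

# Toy usage: a 1 kHz tone burst embedded in noise.
fs = 16000
sig = 0.01 * np.random.randn(fs * 2)
sig[fs // 2: fs] += 0.5 * np.sin(2 * np.pi * 1000 * np.arange(fs // 2) / fs)
print(detect_alarm(sig, fs))
```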

  5. A Knowledge-Based Approach to Automatic Detection of Equipment Alarm Sounds in a Neonatal Intensive Care Unit Environment

    PubMed Central

    Nadeu, Climent; Jančovič, Peter; Lilja, Alex Peiró; Köküer, Münevver; Muñoz Mahamud, Blanca; Riverola De Veciana, Ana

    2018-01-01

    A large number of alarm sounds triggered by biomedical equipment occur frequently in the noisy environment of a neonatal intensive care unit (NICU) and play a key role in providing healthcare. In this paper, our work on the development of an automatic system for detection of acoustic alarms in that difficult environment is presented. Such an automatic detection system is needed for investigating how a preterm infant reacts to auditory stimuli of the NICU environment and for improved real-time patient monitoring. The approach presented in this paper consists of using the available knowledge about each alarm class in the design of the detection system. The information about the frequency structure is used in the feature extraction stage, and the time structure knowledge is incorporated at the post-processing stage. Several alternative methods are compared for feature extraction, modeling, and post-processing. The detection performance is evaluated with real data recorded in the NICU of the hospital, using both frame-level and period-level metrics. The experimental results show that the inclusion of both spectral and temporal information improves the baseline detection performance by more than 60%. PMID:29404227

  6. GDRMS: a system for automatic extraction of the disease-centre relation

    NASA Astrophysics Data System (ADS)

    Yang, Ronggen; Zhang, Yue; Gong, Lejun

    2012-01-01

    With the rapid increase in biomedical literature, the deluge of new articles is leading to information overload. Extracting the available knowledge from the huge amount of biomedical literature has become a major challenge. GDRMS is developed as a tool that extracts relationships between diseases and genes, and between genes and genes, from the biomedical literature using text mining technology. It is a rule-based system that also provides disease-centre network visualization, constructs a disease-gene database, and provides a gene engine for understanding the function of a gene. The main focus of GDRMS is to provide a valuable opportunity for the research community to explore the relationship between disease and gene with respect to the etiology of disease.
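
    The sketch below illustrates the flavor of a rule-based disease-gene relation extractor such as the one described above; it is not GDRMS itself. The disease and gene dictionaries and the trigger patterns are small hypothetical placeholders.

```python
# Minimal sketch (not GDRMS): flag a disease-gene relation when a known
# disease name, a known gene symbol, and a trigger phrase co-occur in one
# sentence. Dictionaries and triggers are illustrative placeholders.
import re

DISEASES = {"breast cancer", "leukemia", "alzheimer's disease"}
GENES    = {"BRCA1", "TP53", "APOE"}
TRIGGERS = re.compile(r"\b(associated with|causes?|regulates?|mutated in)\b", re.I)

def extract_relations(text):
    relations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        low = sentence.lower()
        hit_diseases = [d for d in DISEASES if d in low]
        hit_genes = [g for g in GENES if re.search(rf"\b{g}\b", sentence)]
        if hit_diseases and hit_genes and TRIGGERS.search(sentence):
            for d in hit_diseases:
                for g in hit_genes:
                    relations.append((g, d, sentence.strip()))
    return relations

text = ("Mutations in BRCA1 are associated with breast cancer. "
        "TP53 is frequently studied in other contexts.")
print(extract_relations(text))
```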

  7. Linear feature extraction from radar imagery: SBIR (Small Business Innovative Research), phase 2, option 2

    NASA Astrophysics Data System (ADS)

    Milgram, David L.; Kahn, Philip; Conner, Gary D.; Lawton, Daryl T.

    1988-12-01

    The goal of this effort is to develop and demonstrate prototype processing capabilities for a knowledge-based system to automatically extract and analyze features from Synthetic Aperture Radar (SAR) imagery. This effort constitutes Phase 2 funding through the Defense Small Business Innovative Research (SBIR) Program. Previous work examined the feasibility of and technology issues involved in the development of an automated linear feature extraction system. This final report documents this examination and the technologies involved in automating this image understanding task. In particular, it reports on a major software delivery containing an image processing algorithmic base, a perceptual structures manipulation package, a preliminary hypothesis management framework and an enhanced user interface.

  8. Leveraging Semantic Knowledge in IRB Databases to Improve Translation Science

    PubMed Central

    Hurdle, John F.; Botkin, Jeffery; Rindflesch, Thomas C.

    2007-01-01

    We introduce the notion that research administrative databases (RADs), such as those increasingly used to manage information flow in the Institutional Review Board (IRB), offer a novel, useful, and mine-able data source overlooked by informaticists. As a proof of concept, using an IRB database we extracted all titles and abstracts from system startup through January 2007 (n=1,876); formatted these in a pseudo-MEDLINE format; and processed them through the SemRep semantic knowledge extraction system. Even though SemRep is tuned to find semantic relations in MEDLINE citations, we found that it performed comparably well on the IRB texts. When adjusted to eliminate non-healthcare IRB submissions (e.g., economic and education studies), SemRep extracted an average of 7.3 semantic relations per IRB abstract (compared to an average of 11.1 for MEDLINE citations) with a precision of 70% (compared to 78% for MEDLINE). We conclude that RADs, as represented by IRB data, are mine-able with existing tools, but that performance will improve as these tools are tuned for RAD structures. PMID:18693856

  9. Ontology-based reusable clinical document template production system.

    PubMed

    Nam, Sejin; Lee, Sungin; Kim, James G Boram; Kim, Hong-Gee

    2012-01-01

    Clinical documents embody professional clinical knowledge. This paper shows an effective clinical document template (CDT) production system that uses a clinical description entity (CDE) model, a CDE ontology, and a knowledge management system called STEP that manages ontology-based clinical description entities. The ontology represents CDEs and their inter-relations, and the STEP system stores and manages CDE ontology-based information regarding CDTs. The system also provides Web Services interfaces for search and reasoning over clinical entities. The system was populated with entities and relations extracted from 35 CDTs that were used in admission, discharge, and progress reports, as well as those used in nursing and operation functions. A clinical document template editor is shown that uses STEP.

  10. Antimicrobial thin films based on ayurvedic plants extracts embedded in a bioactive glass matrix

    NASA Astrophysics Data System (ADS)

    Floroian, L.; Ristoscu, C.; Candiani, G.; Pastori, N.; Moscatelli, M.; Mihailescu, N.; Negut, I.; Badea, M.; Gilca, M.; Chiesa, R.; Mihailescu, I. N.

    2017-09-01

    Ayurvedic medicine is one of the oldest medical systems. It is an example of a coherent traditional system with a time-tested and precise algorithm for medicinal plant selection, based on several ethnopharmacophore descriptors, knowledge of which enables the user to choose the optimal plant for the treatment of a given pathology. This work aims to link traditional knowledge with biomedical science by using traditional ayurvedic plant extracts with antimicrobial effect in the form of thin films for implant protection. We report on the transfer of novel composites of bioactive glass mixed with antimicrobial plant extracts and polymer by matrix-assisted pulsed laser evaporation into uniform thin layers onto stainless steel implant-like surfaces. The deposited films were comprehensively characterized by complementary analyses: Fourier transform infrared spectroscopy, glow discharge optical emission spectroscopy, scanning electron microscopy, atomic force microscopy, electrochemical impedance spectroscopy, UV-VIS absorption spectroscopy, and antimicrobial tests. The results emphasize the multifunctionality of these coatings, which halt the leakage of metal and metal oxides into the biological fluids and eventually to inner organs (through the use of a polymer), speed up osseointegration (due to the bioactive glass), exert antimicrobial effects (through the ayurvedic plant extracts), and decrease the implant price (through the use of cheaper stainless steel).

  11. Drug side effect extraction from clinical narratives of psychiatry and psychology patients

    PubMed Central

    Kocher, Jean-Pierre A; Chute, Christopher G; Savova, Guergana K

    2011-01-01

    Objective To extract physician-asserted drug side effects from electronic medical record clinical narratives. Materials and methods Pattern matching rules were manually developed through examining keywords and expression patterns of side effects to discover an individual side effect and causative drug relationship. A combination of machine learning (C4.5) using side effect keyword features and pattern matching rules was used to extract sentences that contain side effect and causative drug pairs, enabling the system to discover most side effect occurrences. Our system was implemented as a module within the clinical Text Analysis and Knowledge Extraction System. Results The system was tested in the domain of psychiatry and psychology. The rule-based system extracting side effects and causative drugs produced an F score of 0.80 (0.55 excluding allergy section). The hybrid system identifying side effect sentences had an F score of 0.75 (0.56 excluding allergy section) but covered more side effect and causative drug pairs than individual side effect extraction. Discussion The rule-based system was able to identify most side effects expressed by clear indication words. More sophisticated semantic processing is required to handle complex side effect descriptions in the narrative. We demonstrated that our system can be trained to identify sentences with complex side effect descriptions that can be submitted to a human expert for further abstraction. Conclusion Our system was able to extract most physician-asserted drug side effects. It can be used in either an automated mode for side effect extraction or semi-automated mode to identify side effect sentences that can significantly simplify abstraction by a human expert. PMID:21946242

  12. Materials Knowledge Systems in Python - A Data Science Framework for Accelerated Development of Hierarchical Materials.

    PubMed

    Brough, David B; Wheeler, Daniel; Kalidindi, Surya R

    2017-03-01

    There is a critical need for customized analytics that take into account the stochastic nature of the internal structure of materials at multiple length scales in order to extract relevant and transferable knowledge. Data-driven Process-Structure-Property (PSP) linkages provide a systemic, modular and hierarchical framework for community-driven curation of materials knowledge and its transference to design and manufacturing experts. The Materials Knowledge Systems in Python project (PyMKS) is the first open source materials data science framework that can be used to create high value PSP linkages for hierarchical materials that can be leveraged by experts in the materials science and engineering, manufacturing, machine learning and data science communities. This paper describes the main functions available from this repository, along with illustrations of how these can be accessed, utilized, and potentially further refined by the broader community of researchers.
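
    To make the notion of a data-driven PSP linkage concrete, the conceptual sketch below computes simple 2-point autocorrelations of synthetic two-phase microstructures, reduces them with PCA, and fits a linear structure-property model. It deliberately uses plain NumPy and scikit-learn rather than the PyMKS API, and the synthetic data and "property" are illustrative.

```python
# Conceptual sketch of a structure-property linkage (NOT the PyMKS API):
# FFT-based 2-point autocorrelations -> PCA -> linear regression.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def autocorrelation(micro):
    """2-point autocorrelation of a binary microstructure via FFT."""
    F = np.fft.fftn(micro)
    corr = np.fft.ifftn(F * np.conj(F)).real / micro.size
    return np.fft.fftshift(corr).ravel()

# Synthetic data: 200 random 21x21 two-phase structures and a toy
# "property" that depends on the volume fraction plus noise.
micros = rng.random((200, 21, 21)) < rng.uniform(0.2, 0.8, (200, 1, 1))
props = micros.mean(axis=(1, 2)) * 10.0 + rng.normal(0, 0.1, 200)

X = np.array([autocorrelation(m.astype(float)) for m in micros])
X_low = PCA(n_components=5).fit_transform(X)   # low-dimensional structure descriptors
model = LinearRegression().fit(X_low, props)   # structure-property linkage
print("R^2 on training data:", model.score(X_low, props))
```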

  13. Materials Knowledge Systems in Python - A Data Science Framework for Accelerated Development of Hierarchical Materials

    PubMed Central

    Brough, David B; Wheeler, Daniel; Kalidindi, Surya R.

    2017-01-01

    There is a critical need for customized analytics that take into account the stochastic nature of the internal structure of materials at multiple length scales in order to extract relevant and transferable knowledge. Data-driven Process-Structure-Property (PSP) linkages provide a systemic, modular and hierarchical framework for community-driven curation of materials knowledge and its transference to design and manufacturing experts. The Materials Knowledge Systems in Python project (PyMKS) is the first open source materials data science framework that can be used to create high value PSP linkages for hierarchical materials that can be leveraged by experts in the materials science and engineering, manufacturing, machine learning and data science communities. This paper describes the main functions available from this repository, along with illustrations of how these can be accessed, utilized, and potentially further refined by the broader community of researchers. PMID:28690971

  14. ViDI: Virtual Diagnostics Interface. Volume 1; The Future of Wind Tunnel Testing

    NASA Technical Reports Server (NTRS)

    Fleming, Gary A. (Technical Monitor); Schwartz, Richard J.

    2004-01-01

    The quality of data acquired in a given test facility ultimately resides within the fidelity and implementation of the instrumentation systems. Over the last decade, the emergence of robust optical techniques has vastly expanded the envelope of measurement possibilities. At the same time the capabilities for data processing, data archiving and data visualization required to extract the highest level of knowledge from these global, on and off body measurement techniques have equally expanded. Yet today, while the instrumentation has matured to the production stage, an optimized solution for gaining knowledge from the gigabytes of data acquired per test (or even per test point) is lacking. A technological void has to be filled in order to possess a mechanism for near-real time knowledge extraction during wind tunnel experiments. Under these auspices, the Virtual Diagnostics Interface, or ViDI, was developed.

  15. A knowledge-based object recognition system for applications in the space station

    NASA Technical Reports Server (NTRS)

    Dhawan, Atam P.

    1988-01-01

    A knowledge-based three-dimensional (3D) object recognition system is being developed. The system uses primitive-based hierarchical relational and structural matching for the recognition of 3D objects in the two-dimensional (2D) image for interpretation of the 3D scene. At present, the pre-processing, low-level preliminary segmentation, rule-based segmentation, and feature extraction are completed. The data structure of the primitive viewing knowledge base (PVKB) is also completed. Algorithms and programs based on attribute-tree matching for decomposing the segmented data into valid primitives were developed. Frame-based structural and relational descriptions of some objects were created and stored in a knowledge base. This knowledge base of frame-based descriptions was developed on the MICROVAX-AI microcomputer in a LISP environment. Both a simulated 3D scene of simple non-overlapping objects and real camera images of low-complexity 3D objects have been successfully interpreted.

  16. Knowledge representation and management: transforming textual information into useful knowledge.

    PubMed

    Rassinoux, A-M

    2010-01-01

    To summarize current outstanding research in the field of knowledge representation and management. Synopsis of the articles selected for the IMIA Yearbook 2010. Four interesting papers dealing with structured knowledge have been selected for the section on knowledge representation and management. Combining the newest techniques in computational linguistics and natural language processing with the latest methods in statistical data analysis, machine learning and text mining has proved to be efficient for turning unstructured textual information into meaningful knowledge. Three of the four papers selected for the section corroborate this approach and describe various experiments conducted to extract meaningful knowledge from unstructured free texts, such as extracting cancer disease characteristics from pathology reports, extracting protein-protein interactions from biomedical papers, and extracting knowledge to support hypothesis generation in molecular biology from the Medline literature. Finally, the last paper addresses the level of formally representing and structuring information within clinical terminologies in order to render such information easily available and shareable among the health informatics community. Delivering common, powerful tools able to automatically extract meaningful information from the huge amount of electronically available unstructured free text is an essential step towards promoting sharing and reusability across applications, domains, and institutions, thus contributing to building capacities worldwide.

  17. Framework for automatic information extraction from research papers on nanocrystal devices

    PubMed Central

    Yoshioka, Masaharu; Hara, Shinjiro; Newton, Marcus C

    2015-01-01

    To support nanocrystal device development, we have been working on a computational framework to utilize information in research papers on nanocrystal devices. We developed an annotated corpus called “NaDev” (Nanocrystal Device Development) for this purpose. We also proposed an automatic information extraction system called “NaDevEx” (Nanocrystal Device Automatic Information Extraction Framework). NaDevEx aims at extracting information from research papers on nanocrystal devices using the NaDev corpus and machine-learning techniques. However, the characteristics of NaDevEx were not examined in detail. In this paper, we conduct system evaluation experiments for NaDevEx using the NaDev corpus. We discuss three main issues: system performance, compared with human annotators; the effect of paper type (synthesis or characterization) on system performance; and the effects of domain knowledge features (e.g., a chemical named entity recognition system and list of names of physical quantities) on system performance. We found that overall system performance was 89% in precision and 69% in recall. If we consider identification of terms that intersect with correct terms for the same information category as the correct identification, i.e., loose agreement (in many cases, we can find that appropriate head nouns such as temperature or pressure loosely match between two terms), the overall performance is 95% in precision and 74% in recall. The system performance is almost comparable with results of human annotators for information categories with rich domain knowledge information (source material). However, for other information categories, given the relatively large number of terms that exist only in one paper, recall of individual information categories is not high (39–73%); however, precision is better (75–97%). The average performance for synthesis papers is better than that for characterization papers because of the lack of training examples for characterization papers. Based on these results, we discuss future research plans for improving the performance of the system. PMID:26665057

  18. Framework for automatic information extraction from research papers on nanocrystal devices.

    PubMed

    Dieb, Thaer M; Yoshioka, Masaharu; Hara, Shinjiro; Newton, Marcus C

    2015-01-01

    To support nanocrystal device development, we have been working on a computational framework to utilize information in research papers on nanocrystal devices. We developed an annotated corpus called " NaDev" (Nanocrystal Device Development) for this purpose. We also proposed an automatic information extraction system called "NaDevEx" (Nanocrystal Device Automatic Information Extraction Framework). NaDevEx aims at extracting information from research papers on nanocrystal devices using the NaDev corpus and machine-learning techniques. However, the characteristics of NaDevEx were not examined in detail. In this paper, we conduct system evaluation experiments for NaDevEx using the NaDev corpus. We discuss three main issues: system performance, compared with human annotators; the effect of paper type (synthesis or characterization) on system performance; and the effects of domain knowledge features (e.g., a chemical named entity recognition system and list of names of physical quantities) on system performance. We found that overall system performance was 89% in precision and 69% in recall. If we consider identification of terms that intersect with correct terms for the same information category as the correct identification, i.e., loose agreement (in many cases, we can find that appropriate head nouns such as temperature or pressure loosely match between two terms), the overall performance is 95% in precision and 74% in recall. The system performance is almost comparable with results of human annotators for information categories with rich domain knowledge information (source material). However, for other information categories, given the relatively large number of terms that exist only in one paper, recall of individual information categories is not high (39-73%); however, precision is better (75-97%). The average performance for synthesis papers is better than that for characterization papers because of the lack of training examples for characterization papers. Based on these results, we discuss future research plans for improving the performance of the system.

  19. Linear feature extraction from radar imagery: SBIR (Small Business Innovative Research) phase 2, option 1

    NASA Astrophysics Data System (ADS)

    Conner, Gary D.; Milgram, David L.; Lawton, Daryl T.; McConnell, Christopher C.

    1988-04-01

    The goal of this effort is to develop and demonstrate prototype processing capabilities for a knowledge-based system to automatically extract and analyze linear features from synthetic aperture radar (SAR) imagery. This effort constitutes Phase 2 funding through the Defense Small Business Innovative Research (SBIR) Program. Previous work examined the feasibility of and technology issues involved in the development of an automated linear feature extraction system. This Option 1 Final Report documents this examination and the technologies involved in automating this image understanding task. In particular, it reports on a major software delivery containing an image processing algorithmic base, a perceptual structures manipulation package, a preliminary hypothesis management framework and an enhanced user interface.

  20. Enhancing Biomedical Text Summarization Using Semantic Relation Extraction

    PubMed Central

    Shang, Yue; Li, Yanpeng; Lin, Hongfei; Yang, Zhihao

    2011-01-01

    Automatic text summarization for a biomedical concept can help researchers to get the key points of a certain topic from a large amount of biomedical literature efficiently. In this paper, we present a method for generating a text summary for a given biomedical concept, e.g., H1N1 disease, from multiple documents based on semantic relation extraction. Our approach includes three stages: 1) We extract semantic relations in each sentence using the semantic knowledge representation tool SemRep. 2) We develop a relation-level retrieval method to select the relations most relevant to each query concept and visualize them in a graphic representation. 3) For relations in the relevant set, we extract informative sentences that can interpret them from the document collection to generate the text summary using an information retrieval based method. Our major focus in this work is to investigate the contribution of semantic relation extraction to the task of biomedical text summarization. The experimental results on summarization for a set of diseases show that the introduction of semantic knowledge improves performance, and our results are better than those of the MEAD system, a well-known tool for text summarization. PMID:21887336

  1. A bioinformatics knowledge discovery in text application for grid computing

    PubMed Central

    Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco

    2009-01-01

    Background A fundamental activity in biomedical research is Knowledge Discovery, which has the ability to search through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems and achieve intensive use of ICT resources. Methods The development of a grid application for Knowledge Discovery in Text using a middleware-based methodology is presented. The system must be able to build a user application model and to process jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operational requirements led to the design of a middleware that is specialized using user application modules. It includes a graphical user interface for access to a node search system, a load balancing system and a transfer optimizer to reduce communication costs. Results A prototype of the middleware solution and its performance evaluation in terms of the speed-up factor are shown. It was written in JAVA on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. Conclusion In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Database computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities. PMID:19534749

  2. A bioinformatics knowledge discovery in text application for grid computing.

    PubMed

    Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco

    2009-06-16

    A fundamental activity in biomedical research is Knowledge Discovery, which has the ability to search through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems and achieve intensive use of ICT resources. The development of a grid application for Knowledge Discovery in Text using a middleware-based methodology is presented. The system must be able to build a user application model and to process jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operational requirements led to the design of a middleware that is specialized using user application modules. It includes a graphical user interface for access to a node search system, a load balancing system and a transfer optimizer to reduce communication costs. A prototype of the middleware solution and its performance evaluation in terms of the speed-up factor are shown. It was written in JAVA on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Database computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities.

  3. Using UMLS to construct a generalized hierarchical concept-based dictionary of brain functions for information extraction from the fMRI literature.

    PubMed

    Hsiao, Mei-Yu; Chen, Chien-Chung; Chen, Jyh-Horng

    2009-10-01

    With rapid progress in the field, a great many fMRI studies are published every year, to the extent that it is now becoming difficult for researchers to keep up with the literature, since reading papers is extremely time-consuming and labor-intensive. Thus, automatic information extraction has become an important issue. In this study, we used the Unified Medical Language System (UMLS) to construct a hierarchical concept-based dictionary of brain functions. To the best of our knowledge, this is the first generalized dictionary of this kind. We also developed an information extraction system for recognizing, mapping and classifying terms relevant to human brain study. The precision and recall of our system were on a par with those of human experts in term recognition, term mapping and term classification. The approach presented in this paper offers an alternative to the more laborious, manual-entry approach to information extraction.
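
    A minimal sketch of dictionary-based term recognition and hierarchical mapping in the spirit of the system described above (not the authors' code): each recognized brain-function term is mapped to its position in a small toy hierarchy. The dictionary entries are illustrative and are not drawn from UMLS.

```python
# Minimal sketch: recognize brain-function terms with a small concept
# dictionary and map each hit to its place in a toy hierarchy.
BRAIN_FUNCTION_DICT = {
    "working memory":   ["cognition", "memory", "working memory"],
    "episodic memory":  ["cognition", "memory", "episodic memory"],
    "face recognition": ["cognition", "perception", "face recognition"],
}

def recognize_terms(text):
    low = text.lower()
    hits = []
    for term, path in BRAIN_FUNCTION_DICT.items():
        start = low.find(term)
        if start != -1:
            hits.append({"term": term, "span": (start, start + len(term)),
                         "hierarchy": " > ".join(path)})
    return hits

sentence = ("Activation in the dorsolateral prefrontal cortex was observed "
            "during the working memory task.")
for hit in recognize_terms(sentence):
    print(hit)
```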

  4. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations

    PubMed Central

    2017-01-01

    Evidence-based dietary information represented as unstructured text is crucial information that needs to be accessed in order to help dietitians follow the new knowledge that arrives daily in newly published scientific reports. Different named-entity recognition (NER) methods have been introduced previously to extract useful information from the biomedical literature. They focus on, for example, extracting gene mentions, protein mentions, relationships between genes and proteins, chemical concepts, and relationships between drugs and diseases. In this paper, we present a novel NER method, called drNER, for knowledge extraction of evidence-based dietary information. To the best of our knowledge, this is the first attempt at extracting dietary concepts. DrNER is a rule-based NER that consists of two phases. The first involves the detection and determination of entity mentions, and the second involves the selection and extraction of the entities. We evaluate the method by using text corpora from heterogeneous sources, including text from several scientifically validated web sites and text from scientific publications. Evaluation of the method showed that drNER gives good results and can be used for knowledge extraction of evidence-based dietary recommendations. PMID:28644863

  5. Assessing the role of a medication-indication resource in the treatment relation extraction from clinical text

    PubMed Central

    Bejan, Cosmin Adrian; Wei, Wei-Qi; Denny, Joshua C

    2015-01-01

    Objective To evaluate the contribution of the MEDication Indication (MEDI) resource and SemRep for identifying treatment relations in clinical text. Materials and methods We first processed clinical documents with SemRep to extract the Unified Medical Language System (UMLS) concepts and the treatment relations between them. Then, we incorporated MEDI into a simple algorithm that identifies treatment relations between two concepts if they match a medication-indication pair in this resource. For a better coverage, we expanded MEDI using ontology relationships from RxNorm and UMLS Metathesaurus. We also developed two ensemble methods, which combined the predictions of SemRep and the MEDI algorithm. We evaluated our selected methods on two datasets, a Vanderbilt corpus of 6864 discharge summaries and the 2010 Informatics for Integrating Biology and the Bedside (i2b2)/Veteran's Affairs (VA) challenge dataset. Results The Vanderbilt dataset included 958 manually annotated treatment relations. A double annotation was performed on 25% of relations with high agreement (Cohen's κ = 0.86). The evaluation consisted of comparing the manual annotated relations with the relations identified by SemRep, the MEDI algorithm, and the two ensemble methods. On the first dataset, the best F1-measure results achieved by the MEDI algorithm and the union of the two resources (78.7 and 80, respectively) were significantly higher than the SemRep results (72.3). On the second dataset, the MEDI algorithm achieved better precision and significantly lower recall values than the best system in the i2b2 challenge. The two systems obtained comparable F1-measure values on the subset of i2b2 relations with both arguments in MEDI. Conclusions Both SemRep and MEDI can be used to extract treatment relations from clinical text. Knowledge-based extraction with MEDI outperformed use of SemRep alone, but superior performance was achieved by integrating both systems. The integration of knowledge-based resources such as MEDI into information extraction systems such as SemRep and the i2b2 relation extractors may improve treatment relation extraction from clinical text. PMID:25336593
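
    The core MEDI-based step described above can be pictured as a simple pair lookup: a treatment relation is asserted when a drug concept and a problem concept match a medication-indication entry, and the result can be unioned with relations from a separate extractor such as SemRep. The sketch below is an illustrative stand-in, with a tiny hypothetical table rather than the real MEDI resource.

```python
# Minimal sketch (not the actual MEDI implementation): assert a treatment
# relation when a drug-problem pair appears in a medication-indication
# table, then union with relations from another extractor.
MEDI_PAIRS = {
    ("metformin", "type 2 diabetes"),
    ("lisinopril", "hypertension"),
    ("albuterol", "asthma"),
}

def medi_relations(drugs, problems):
    return {(d, p) for d in drugs for p in problems if (d, p) in MEDI_PAIRS}

def ensemble_union(medi_rels, other_rels):
    return medi_rels | other_rels

drugs = ["metformin", "aspirin"]
problems = ["type 2 diabetes", "chest pain"]
other_rels = {("aspirin", "chest pain")}      # pretend output of a second extractor
print(ensemble_union(medi_relations(drugs, problems), other_rels))
```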

  6. Recommendation System Based On Association Rules For Distributed E-Learning Management Systems

    NASA Astrophysics Data System (ADS)

    Mihai, Gabroveanu

    2015-09-01

    Traditional Learning Management Systems are installed on a single server, where learning materials and user data are kept. To increase its performance, the Learning Management System can be installed on multiple servers; learning materials and user data can be distributed across these servers, yielding a Distributed Learning Management System. In this paper, the prototype of a recommendation system based on association rules for a Distributed Learning Management System is proposed. Information from LMS databases is analyzed using distributed data mining algorithms in order to extract association rules. The extracted rules are then used as inference rules to provide personalized recommendations. The quality of the provided recommendations is improved because the rules used to make the inferences are more accurate, since they aggregate knowledge from all e-Learning systems included in the Distributed Learning Management System.
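
    As a rough sketch of the recommendation idea described above (not the paper's prototype), the code below mines simple one-antecedent association rules (support and confidence) from per-learner access logs and uses them to suggest the next learning resource. The sessions, thresholds, and resource names are illustrative.

```python
# Minimal sketch: mine one-to-one association rules from access logs and
# use them as inference rules for recommendations.
from collections import Counter
from itertools import permutations

sessions = [
    {"intro", "linear_algebra", "calculus"},
    {"intro", "linear_algebra"},
    {"intro", "calculus", "statistics"},
    {"linear_algebra", "statistics"},
]

MIN_SUPPORT, MIN_CONF = 0.4, 0.6
n = len(sessions)
item_count = Counter(i for s in sessions for i in s)
pair_count = Counter((a, b) for s in sessions for a, b in permutations(s, 2))

rules = {}
for (a, b), c in pair_count.items():
    support, confidence = c / n, c / item_count[a]
    if support >= MIN_SUPPORT and confidence >= MIN_CONF:
        rules.setdefault(a, []).append((b, confidence))

def recommend(viewed):
    scores = Counter()
    for item in viewed:
        for target, conf in rules.get(item, []):
            if target not in viewed:
                scores[target] += conf
    return [item for item, _ in scores.most_common(3)]

print(recommend({"intro"}))
```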

  7. A semantic model for multimodal data mining in healthcare information systems.

    PubMed

    Iakovidis, Dimitris; Smailis, Christos

    2012-01-01

    Electronic health records (EHRs) are representative examples of multimodal/multisource data collections, including measurements, images and free texts. The diversity of such information sources and the increasing amounts of medical data produced by healthcare institutes annually pose significant challenges in data mining. In this paper we present a novel semantic model that describes knowledge extracted from the lowest level of a data mining process, where information is represented by multiple features, i.e., measurements or numerical descriptors extracted from measurements, images, texts or other medical data, forming multidimensional feature spaces. Knowledge collected by manual annotation or extracted by unsupervised data mining from one or more feature spaces is modeled through generalized qualitative spatial semantics. This model enables a unified representation of knowledge across multimodal data repositories. It contributes to bridging the semantic gap by enabling direct links between low-level features and higher-level concepts, e.g., those describing body parts, anatomies and pathological findings. The proposed model has been developed in the web ontology language based on description logics (OWL-DL) and can be applied to a variety of data mining tasks in medical informatics. Its utility is demonstrated for automatic annotation of medical data.
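
    The sketch below illustrates the general idea of linking a low-level feature to a higher-level clinical concept in a semantic graph; it is a simplified RDF example using rdflib, not the paper's OWL-DL model, and the namespace, properties, and values are hypothetical.

```python
# Minimal sketch: link a numeric image feature to a clinical concept in a
# small RDF graph and query the link back.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/ehr#")
g = Graph()
g.bind("ex", EX)

# A texture feature extracted from a liver CT slice, annotated with the
# concept it characterizes and its numeric value.
g.add((EX.feature_42, RDF.type, EX.TextureFeature))
g.add((EX.feature_42, EX.hasValue, Literal(0.73)))
g.add((EX.feature_42, EX.characterizes, EX.LiverLesion))

# Which features are linked to the LiverLesion concept?
results = g.query(
    "SELECT ?f ?v WHERE { ?f ex:characterizes ex:LiverLesion ; ex:hasValue ?v }",
    initNs={"ex": EX},
)
for f, v in results:
    print(f, v)
```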

  8. Using Evolved Fuzzy Neural Networks for Injury Detection from Isokinetic Curves

    NASA Astrophysics Data System (ADS)

    Couchet, Jorge; Font, José María; Manrique, Daniel

    In this paper we propose an evolutionary fuzzy neural network system for extracting knowledge from a set of time series containing medical information. The series represent isokinetic curves obtained from a group of patients exercising the knee joint on an isokinetic dynamometer. The system has two parts: i) it analyses the time series input in order to generate a simplified model of an isokinetic curve; ii) it applies a grammar-guided genetic program to obtain a knowledge base represented by a fuzzy neural network. Once the knowledge base has been generated, the system is able to perform knee injury detection. The results suggest that evolved fuzzy neural networks perform better than non-evolutionary approaches and have a high accuracy rate during both the training and testing phases. Additionally, they are robust, as the system is able to self-adapt to changes in the problem without human intervention.

  9. Inductive System Health Monitoring

    NASA Technical Reports Server (NTRS)

    Iverson, David L.

    2004-01-01

    The Inductive Monitoring System (IMS) software was developed to provide a technique to automatically produce health monitoring knowledge bases for systems that are either difficult to model (simulate) with a computer or which require computer models that are too complex to use for real-time monitoring. IMS uses nominal data sets collected either directly from the system or from simulations to build a knowledge base that can be used to detect anomalous behavior in the system. Machine learning and data mining techniques are used to characterize typical system behavior by extracting general classes of nominal data from archived data sets. IMS is able to monitor the system by comparing real-time operational data with these classes. We present a description of the learning and monitoring methods used by IMS and summarize some recent IMS results.
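
    A minimal sketch in the spirit of IMS (not NASA's implementation): nominal training vectors are clustered, and a new observation is flagged as anomalous when its distance to the nearest cluster centre exceeds a tolerance learned from the nominal data. The data, cluster count, and percentile threshold are illustrative.

```python
# Minimal sketch: learn classes of nominal behaviour by clustering, then
# flag observations far from every learned class as anomalous.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
nominal = rng.normal(loc=[0.0, 5.0], scale=0.5, size=(500, 2))   # archived nominal data

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(nominal)
dists = np.min(km.transform(nominal), axis=1)
threshold = np.percentile(dists, 99)            # tolerance from nominal spread

def is_anomalous(sample):
    d = np.min(km.transform(np.atleast_2d(sample)), axis=1)[0]
    return d > threshold

print(is_anomalous([0.1, 5.2]))   # near nominal behaviour -> False
print(is_anomalous([3.0, 1.0]))   # far from every class  -> True
```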

  10. A structural informatics approach to mine kinase knowledge bases.

    PubMed

    Brooijmans, Natasja; Mobilio, Dominick; Walker, Gary; Nilakantan, Ramaswamy; Denny, Rajiah A; Feyfant, Eric; Diller, David; Bikker, Jack; Humblet, Christine

    2010-03-01

    In this paper, we describe a combination of structural informatics approaches developed to mine data extracted from existing structure knowledge bases (Protein Data Bank and the GVK database) with a focus on kinase ATP-binding site data. In contrast to existing systems that retrieve and analyze protein structures, our techniques are centered on a database of ligand-bound geometries in relation to residues lining the binding site and transparent access to ligand-based SAR data. We illustrate the systems in the context of the Abelson kinase and related inhibitor structures.

  11. Model of experts for decision support in the diagnosis of leukemia patients.

    PubMed

    Corchado, Juan M; De Paz, Juan F; Rodríguez, Sara; Bajo, Javier

    2009-07-01

    Recent advances in the field of biomedicine, specifically in the field of genomics, have led to an increase in the information available for conducting expression analysis. Expression analysis is a technique used in transcriptomics, a branch of genomics that deals with the study of messenger ribonucleic acid (mRNA) and the extraction of information contained in the genes. This increase in information is reflected in the exon arrays, which require the use of new techniques in order to extract the information. The purpose of this study is to provide a tool based on a mixture of experts model that allows the analysis of the information contained in the exon arrays, from which automatic classifications for decision support in diagnoses of leukemia patients can be made. The proposed model integrates several cooperative algorithms characterized by their efficiency in data processing, filtering, classification and knowledge extraction. The Cancer Institute of the University of Salamanca is making an effort to develop tools to automate the evaluation of data and to facilitate the analysis of information. This proposal is a step forward in this direction and the first step toward the development of a mixture of experts tool that integrates different cognitive and statistical approaches to deal with the analysis of exon arrays. The mixture of experts model presented within this work provides great capacities for learning and adaptation to the characteristics of the problem in consideration, using novel algorithms in each of the stages of the analysis process that can be easily configured and combined, and provides results that notably improve those provided by the existing methods for exon array analysis. The material used consists of data from exon arrays provided by the Cancer Institute that contain samples from leukemia patients. The methodology used consists of a system based on a mixture of experts. Each of the experts incorporates novel artificial intelligence techniques that improve the process of carrying out various tasks such as pre-processing, filtering, classification and extraction of knowledge. This article details the manner in which individual experts are combined so that together they generate a system capable of extracting knowledge, thus permitting patients to be classified in an automatic and efficient manner that is also comprehensible for medical personnel. The system has been tested in a real setting and has been used for classifying patients who suffer from different forms of leukemia at various stages. Personnel from the Cancer Institute supervised and participated throughout the testing period. Preliminary results are promising, notably improving the results obtained with previously used tools. The medical staff from the Cancer Institute considers the tools that have been developed to be positive and very useful in a supporting capacity for carrying out their daily tasks. Additionally, the mixture of experts supplies a tool for extracting the information necessary to explain, in simple terms, the associations that have been made. That is, it permits knowledge to be extracted for each classification made and generalized for use in subsequent classifications. This allows for a large amount of learning and adaptation within the proposed system.

  12. Modelling and representation issues in automated feature extraction from aerial and satellite images

    NASA Astrophysics Data System (ADS)

    Sowmya, Arcot; Trinder, John

    New digital systems for the processing of photogrammetric and remote sensing images have led to new approaches to information extraction for mapping and Geographic Information System (GIS) applications, with the expectation that data can become more readily available at a lower cost and with greater currency. Demands for mapping and GIS data are increasing as well for environmental assessment and monitoring. Hence, researchers from the fields of photogrammetry and remote sensing, as well as computer vision and artificial intelligence, are bringing together their particular skills for automating these tasks of information extraction. The paper will review some of the approaches used in knowledge representation and modelling for machine vision, and give examples of their applications in research for image understanding of aerial and satellite imagery.

  13. Automatic Line Network Extraction from Aerial Imagery of Urban Areas through Knowledge Based Image Analysis

    DTIC Science & Technology

    1989-08-01

    Final technical report (December 1989) on automatic line network extraction from aerial imagery of urban areas through knowledge-based image analysis. Recoverable keywords: pattern recognition, blackboard-oriented symbolic processing, knowledge-based image analysis, image understanding, aerial imagery, urban areas.

  14. Concurrent evolution of feature extractors and modular artificial neural networks

    NASA Astrophysics Data System (ADS)

    Hannak, Victor; Savakis, Andreas; Yang, Shanchieh Jay; Anderson, Peter

    2009-05-01

    This paper presents a new approach for the design of feature-extracting recognition networks that do not require expert knowledge in the application domain. Feature-Extracting Recognition Networks (FERNs) are composed of interconnected functional nodes (feurons), which serve as feature extractors, followed by a subnetwork of traditional neural nodes (neurons) that act as classifiers. A concurrent evolutionary process (CEP) is used to search the space of feature extractors and neural networks in order to obtain an optimal recognition network that simultaneously performs feature extraction and recognition. By constraining the hill-climbing search functionality of the CEP on specific parts of the solution space, i.e., individually limiting the evolution of feature extractors and neural networks, it was demonstrated that concurrent evolution is a necessary component of the system. Application of this approach to a handwritten digit recognition task illustrates that the proposed methodology is capable of producing recognition networks that perform in line with other methods, without the need for expert knowledge in image processing.

  15. Designing and Implementation of Fuzzy Case-based Reasoning System on Android Platform Using Electronic Discharge Summary of Patients with Chronic Kidney Diseases

    PubMed Central

    Tahmasebian, Shahram; Langarizadeh, Mostafa; Ghazisaeidi, Marjan; Mahdavi-Mazdeh, Mitra

    2016-01-01

    Introduction: Case-based reasoning (CBR) systems are one of the effective methods for finding the nearest solution to a current problem. These systems are used in various spheres, including industry, business, and economics. The medical field is no exception, and these systems are nowadays used in various aspects of diagnosis and treatment. Methodology: In this study, the effective parameters were first extracted from the structured discharge summaries prepared for patients with chronic kidney diseases using data mining methods. Then, through a meeting with experts in nephrology and using data mining methods, the weights of the parameters were derived. Finally, a fuzzy system was employed to compare the similarity of the current case with previous cases, and the system was implemented on the Android platform. Discussion: The data from electronic discharge records of patients with chronic kidney diseases were entered into the system. The measure of similarity was assessed using the algorithm provided in the system and then compared with other known methods in CBR systems. Conclusion: The developed clinical fuzzy CBR system can be used within a knowledge management framework for registering specific therapeutic methods, as a knowledge-sharing environment for experts in a specific domain, and as a powerful tool at the point of care. PMID:27708490
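
    The case-retrieval step described above can be sketched as a weighted fuzzy similarity between attribute vectors, as in the illustrative code below. This is not the study's application: the attributes, weights, and tolerances are hypothetical placeholders.

```python
# Minimal sketch: score a new chronic-kidney-disease case against stored
# cases with expert-style weights and triangular fuzzy membership of each
# attribute difference.
def triangular_similarity(diff, tolerance):
    """1.0 for identical values, falling linearly to 0 at the tolerance."""
    return max(0.0, 1.0 - abs(diff) / tolerance)

WEIGHTS = {"creatinine": 0.5, "gfr": 0.3, "age": 0.2}
TOLERANCE = {"creatinine": 2.0, "gfr": 30.0, "age": 20.0}

def case_similarity(new_case, stored_case):
    score = sum(w * triangular_similarity(new_case[a] - stored_case[a], TOLERANCE[a])
                for a, w in WEIGHTS.items())
    return score / sum(WEIGHTS.values())

case_base = [
    {"id": 1, "creatinine": 4.1, "gfr": 22.0, "age": 61},
    {"id": 2, "creatinine": 1.2, "gfr": 85.0, "age": 35},
]
new = {"creatinine": 3.8, "gfr": 25.0, "age": 58}
best = max(case_base, key=lambda c: case_similarity(new, c))
print("most similar stored case:", best["id"])
```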

  16. Refractive index variance of cells and tissues measured by quantitative phase imaging.

    PubMed

    Shan, Mingguang; Kandel, Mikhail E; Popescu, Gabriel

    2017-01-23

    The refractive index distribution of cells and tissues governs their interaction with light and can report on morphological modifications associated with disease. Through intensity-based measurements, refractive index information can be extracted only via scattering models that approximate light propagation. As a result, current knowledge of refractive index distributions across various tissues and cell types remains limited. Here we use quantitative phase imaging and the statistical dispersion relation (SDR) to extract information about the refractive index variance in a variety of specimens. Due to the phase-resolved measurement in three-dimensions, our approach yields refractive index results without prior knowledge about the tissue thickness. With the recent progress in quantitative phase imaging systems, we anticipate that using SDR will become routine in assessing tissue optical properties.

  17. Artificial intelligence within the chemical laboratory.

    PubMed

    Winkel, P

    1994-01-01

    Various techniques within the area of artificial intelligence such as expert systems and neural networks may play a role during the problem-solving processes within the clinical biochemical laboratory. Neural network analysis provides a non-algorithmic approach to information processing, which results in the ability of the computer to form associations and to recognize patterns or classes among data. It belongs to the machine learning techniques which also include probabilistic techniques such as discriminant function analysis and logistic regression and information theoretical techniques. These techniques may be used to extract knowledge from example patients to optimize decision limits and identify clinically important laboratory quantities. An expert system may be defined as a computer program that can give advice in a well-defined area of expertise and is able to explain its reasoning. Declarative knowledge consists of statements about logical or empirical relationships between things. Expert systems typically separate declarative knowledge residing in a knowledge base from the inference engine: an algorithm that dynamically directs and controls the system when it searches its knowledge base. A tool is an expert system without a knowledge base. The developer of an expert system uses a tool by entering knowledge into the system. Many, if not the majority of problems encountered at the laboratory level are procedural. A problem is procedural if it is possible to write up a step-by-step description of the expert's work or if it can be represented by a decision tree. To solve problems of this type only small expert system tools and/or conventional programming are required.(ABSTRACT TRUNCATED AT 250 WORDS)

  18. Composition of extracts of airborne grain dusts: lectins and lymphocyte mitogens.

    PubMed Central

    Olenchock, S A; Lewis, D M; Mull, J C

    1986-01-01

    Airborne grain dusts are heterogeneous materials that can elicit acute and chronic respiratory pathophysiology in exposed workers. Previous characterizations of the dusts include the identification of viable microbial contaminants, mycotoxins, and endotoxins. We provide information on the lectin-like activity of grain dust extracts and its possible biological relationship. Hemagglutination of erythrocytes and immunochemical modulation by antibody to specific lectins showed the presence of these substances in extracts of airborne dusts from barley, corn, and rye. Proliferation of normal rat splenic lymphocytes in vitro provided evidence for direct biological effects on the cells of the immune system. These data expand the knowledge of the composition of grain dusts (extracts), and suggest possible mechanisms that may contribute to respiratory disease in grain workers. PMID:3709474

  19. Recent progress in automatically extracting information from the pharmacogenomic literature

    PubMed Central

    Garten, Yael; Coulet, Adrien; Altman, Russ B

    2011-01-01

    The biomedical literature holds our understanding of pharmacogenomics, but it is dispersed across many journals. In order to integrate our knowledge, connect important facts across publications and generate new hypotheses, we must organize and encode the contents of the literature. By creating databases of structured pharmacogenomic knowledge, we can make the value of the literature much greater than the sum of the individual reports. We can, for example, generate candidate gene lists or interpret surprising hits in genome-wide association studies. Text mining automatically adds structure to the unstructured knowledge embedded in millions of publications, and recent years have seen a surge in work on biomedical text mining, some specific to the pharmacogenomics literature. These methods enable extraction of specific types of information and can also provide answers to general, systemic queries. In this article, we describe the main tasks of text mining in the context of pharmacogenomics, summarize recent applications and anticipate the next phase of text mining applications. PMID:21047206

  20. Development of a GIService based on spatial data mining for location choice of convenience stores in Taipei City

    NASA Astrophysics Data System (ADS)

    Jung, Chinte; Sun, Chih-Hong

    2006-10-01

    Motivated by the increasing accessibility of technology, more and more spatial data are being made digitally available. How to extract the valuable knowledge from these large (spatial) databases is becoming increasingly important to businesses, as well. It is essential to be able to analyze and utilize these large datasets, convert them into useful knowledge, and transmit them through GIS-enabled instruments and the Internet, conveying the key information to business decision-makers effectively and benefiting business entities. In this research, we combine the techniques of GIS, spatial decision support system (SDSS), spatial data mining (SDM), and ArcGIS Server to achieve the following goals: (1) integrate databases from spatial and non-spatial datasets about the locations of businesses in Taipei, Taiwan; (2) use the association rules, one of the SDM methods, to extract the knowledge from the integrated databases; and (3) develop a Web-based SDSS GIService as a location-selection tool for business by the product of ArcGIS Server.
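
    Since the record above hinges on association rules as the spatial data mining method, a toy computation of rule support and confidence may help. The transactions and item names are invented; in the study the items would be spatial and non-spatial attributes of candidate store sites.

    ```python
    # Toy association-rule mining: support and confidence over invented
    # "site attribute" transactions (not the study's actual data).

    transactions = [
        {"near_MRT", "high_foot_traffic", "store_profitable"},
        {"near_MRT", "high_foot_traffic", "store_profitable"},
        {"near_MRT", "low_rent"},
        {"high_foot_traffic", "store_profitable"},
    ]

    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    def confidence(antecedent, consequent):
        return support(antecedent | consequent) / support(antecedent)

    lhs, rhs = {"near_MRT", "high_foot_traffic"}, {"store_profitable"}
    print("support:", support(lhs | rhs))      # 0.5
    print("confidence:", confidence(lhs, rhs)) # 1.0
    ```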

  1. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high-performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task-specific heuristics or rules. In comparison, the state-of-the-art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule-based system employing task-specific post-processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with the APG kernel, which attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence-level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than the APG kernel for the BioInfer dataset in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate high-performance Chemical-Induced Disease relation extraction without employing external knowledge sources or task-specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed that the ASM kernel is effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as CID sentence-level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.
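
    To make the graph-kernel idea concrete, the following is a deliberately simplified sketch: it scores two dependency graphs by counting shared labelled edges, a crude stand-in for the all-path and subgraph-matching kernels discussed above. The sentences, node names and dependency labels are invented for illustration.

    ```python
    # Crude "graph kernel" sketch: similarity of two dependency graphs measured
    # as the number of labelled edges they share. Illustrative only; the APG and
    # ASM kernels are considerably richer.

    import networkx as nx

    def dep_graph(edges):
        g = nx.DiGraph()
        for head, dep, label in edges:
            g.add_edge(head, dep, label=label)
        return g

    def edge_set(g):
        return {(u, v, d["label"]) for u, v, d in g.edges(data=True)}

    def simple_edge_kernel(g1, g2):
        """Unnormalised count of labelled edges common to both graphs."""
        return len(edge_set(g1) & edge_set(g2))

    # Entities abstracted to placeholder node labels, as is common in relation extraction.
    g_a = dep_graph([("TRIGGER", "CHEMICAL", "nsubj"), ("TRIGGER", "DISEASE", "dobj"),
                     ("TRIGGER", "patients", "nmod")])
    g_b = dep_graph([("TRIGGER", "CHEMICAL", "nsubj"), ("TRIGGER", "DISEASE", "dobj")])
    print(simple_edge_kernel(g_a, g_b))  # 2 shared labelled edges
    ```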

  2. Verification and Validation of KBS with Neural Network Components

    NASA Technical Reports Server (NTRS)

    Wen, Wu; Callahan, John

    1996-01-01

    Artificial Neural Networks (ANNs) play an important role in developing robust Knowledge Based Systems (KBS). The ANN-based components used in these systems learn to give appropriate predictions through training with correct input-output data patterns. Unlike a traditional KBS that depends on a rule database and a production engine, an ANN-based system mimics the decisions of an expert without explicitly formulating if-then rules. In fact, ANNs demonstrate their superiority when such if-then rules are hard for a human expert to generate. Verification of a traditional knowledge-based system is based on proof of the consistency and completeness of the rule knowledge base and the correctness of the production engine. These techniques, however, cannot be directly applied to ANN-based components. In this position paper, we propose a verification and validation procedure for KBS with ANN-based components. The essence of the procedure is to obtain an accurate system specification through incremental modification of the specifications using an ANN rule extraction algorithm.
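
    One common family of rule-extraction strategies fits an interpretable surrogate model to the network's own predictions and reads off approximate if-then rules from it. The sketch below illustrates that generic idea only; it is not the specific extraction algorithm used in the paper, and the dataset is synthetic.

    ```python
    # Surrogate-tree rule extraction sketch: train a small decision tree on an
    # ANN's predictions and print its if-then structure. Generic illustration,
    # not the paper's algorithm.

    from sklearn.datasets import make_classification
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = make_classification(n_samples=500, n_features=4, random_state=0)
    ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X, y)

    # The surrogate is trained on the ANN's outputs, not the true labels,
    # so its rules approximate the network's learned behaviour.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, ann.predict(X))
    print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
    ```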

  3. Real-time diagnostics for a reusable rocket engine

    NASA Technical Reports Server (NTRS)

    Guo, T. H.; Merrill, W.; Duyar, A.

    1992-01-01

    A hierarchical, decentralized diagnostic system is proposed for the Real-Time Diagnostic System component of the Intelligent Control System (ICS) for reusable rocket engines. The proposed diagnostic system has three layers of information processing: condition monitoring, fault mode detection, and expert system diagnostics. The condition monitoring layer is the first level of signal processing. Here, important features of the sensor data are extracted. These processed data are then used by the higher level fault mode detection layer to do preliminary diagnosis on potential faults at the component level. Because of the closely coupled nature of the rocket engine propulsion system components, it is expected that a given engine condition may trigger more than one fault mode detector. Expert knowledge is needed to resolve the conflicting reports from the various failure mode detectors. This is the function of the diagnostic expert layer. Here, the heuristic nature of this decision process makes it desirable to use an expert system approach. Implementation of the real-time diagnostic system described above requires a wide spectrum of information processing capability. Generally, in the condition monitoring layer, fast data processing is often needed for feature extraction and signal conditioning. This is usually followed by some detection logic to determine the selected faults on the component level. Three different techniques are used to attack different fault detection problems in the NASA LeRC ICS testbed simulation. The first technique employed is the neural network application for real-time sensor validation which includes failure detection, isolation, and accommodation. The second approach demonstrated is the model-based fault diagnosis system using on-line parameter identification. Besides these model based diagnostic schemes, there are still many failure modes which need to be diagnosed by the heuristic expert knowledge. The heuristic expert knowledge is implemented using a real-time expert system tool called G2 by Gensym Corp. Finally, the distributed diagnostic system requires another level of intelligence to oversee the fault mode reports generated by component fault detectors. The decision making at this level can best be done using a rule-based expert system. This level of expert knowledge is also implemented using G2.

  4. NOUS: Construction and Querying of Dynamic Knowledge Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choudhury, Sutanay; Agarwal, Khushbu; Purohit, Sumit

    The ability to construct domain-specific knowledge graphs (KG) and perform question-answering or hypothesis generation is a transformative capability. Despite their value, automated construction of knowledge graphs remains an expensive technical challenge that is beyond the reach of most enterprises and academic institutions. We propose an end-to-end framework for developing custom knowledge graph driven analytics for arbitrary application domains. The uniqueness of our system lies A) in its combination of curated KGs along with knowledge extracted from unstructured text, B) support for advanced trending and explanatory questions on a dynamic KG, and C) the ability to answer queries where the answer is embedded across multiple data sources.
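
    The combination of curated and text-extracted triples in one queryable graph can be pictured with a toy in-memory store. The triples and the pattern-query helper below are invented for illustration; NOUS itself targets dynamic, large-scale graphs.

    ```python
    # Toy knowledge graph: curated triples plus triples "extracted" from text,
    # with a simple wildcard pattern query. Purely illustrative.

    curated   = [("aspirin", "treats", "headache")]
    extracted = [("aspirin", "inhibits", "COX-1"), ("ibuprofen", "inhibits", "COX-1")]
    kg = curated + extracted

    def query(kg, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        return [t for t in kg
                if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]

    print(query(kg, p="inhibits", o="COX-1"))
    # [('aspirin', 'inhibits', 'COX-1'), ('ibuprofen', 'inhibits', 'COX-1')]
    ```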

  5. Lessons from the business sector for successful knowledge management in health care: a systematic review.

    PubMed

    Kothari, Anita; Hovanec, Nina; Hastie, Robyn; Sibbald, Shannon

    2011-07-25

    The concept of knowledge management has been prevalent in the business sector for decades. Only recently has knowledge management been receiving attention by the health care sector, in part due to the ever growing amount of information that health care practitioners must handle. It has become essential to develop a way to manage the information coming in to and going out of a health care organization. The purpose of this paper was to summarize previous studies from the business literature that explored specific knowledge management tools, with the aim of extracting lessons that could be applied in the health domain. We searched seven databases using keywords such as "knowledge management", "organizational knowledge", and "business performance". We included articles published between 2000-2009; we excluded non-English articles. 83 articles were reviewed and data were extracted to: (1) uncover reasons for initiating knowledge management strategies, (2) identify potential knowledge management strategies/solutions, and (3) describe facilitators and barriers to knowledge management. KM strategies include such things as training sessions, communication technologies, process mapping and communities of practice. Common facilitators and barriers to implementing these strategies are discussed in the business literature, but rigorous studies about the effectiveness of such initiatives are lacking. The health care sector is at a pinnacle place, with incredible opportunities to design, implement (and evaluate) knowledge management systems. While more research needs to be done on how best to do this in healthcare, the lessons learned from the business sector can provide a foundation on which to build.

  6. Lessons from the business sector for successful knowledge management in health care: A systematic review

    PubMed Central

    2011-01-01

    Background The concept of knowledge management has been prevalent in the business sector for decades. Only recently has knowledge management been receiving attention by the health care sector, in part due to the ever growing amount of information that health care practitioners must handle. It has become essential to develop a way to manage the information coming in to and going out of a health care organization. The purpose of this paper was to summarize previous studies from the business literature that explored specific knowledge management tools, with the aim of extracting lessons that could be applied in the health domain. Methods We searched seven databases using keywords such as "knowledge management", "organizational knowledge", and "business performance". We included articles published between 2000-2009; we excluded non-English articles. Results 83 articles were reviewed and data were extracted to: (1) uncover reasons for initiating knowledge management strategies, (2) identify potential knowledge management strategies/solutions, and (3) describe facilitators and barriers to knowledge management. Conclusions KM strategies include such things as training sessions, communication technologies, process mapping and communities of practice. Common facilitators and barriers to implementing these strategies are discussed in the business literature, but rigorous studies about the effectiveness of such initiatives are lacking. The health care sector is at a pinnacle place, with incredible opportunities to design, implement (and evaluate) knowledge management systems. While more research needs to be done on how best to do this in healthcare, the lessons learned from the business sector can provide a foundation on which to build. PMID:21787403

  7. Knowledge Discovery for Smart Grid Operation, Control, and Situation Awareness -- A Big Data Visualization Platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gu, Yi; Jiang, Huaiguang; Zhang, Yingchen

    In this paper, a big data visualization platform is designed to discover hidden useful knowledge for smart grid (SG) operation, control and situation awareness. The proliferation of smart sensors at both the grid side and the customer side can provide large volumes of heterogeneous data that collect information across all time spectrums. Extracting useful knowledge from this big-data pool is still challenging. In this paper, Apache Spark, an open-source cluster computing framework, is used to process the big data to effectively discover the hidden knowledge. A high-speed communication architecture utilizing the Open System Interconnection (OSI) model is designed to transmit the data to a visualization platform. This visualization platform uses Google Earth, a global geographic information system (GIS), to link the geographic information with the SG knowledge and visualize the information in a user-defined fashion. The University of Denver's campus grid is used as an SG test bench and several demonstrations are presented for the proposed platform.

  8. Knowledge Acquisition and Management for the NASA Earth Exchange (NEX)

    NASA Astrophysics Data System (ADS)

    Votava, P.; Michaelis, A.; Nemani, R. R.

    2013-12-01

    NASA Earth Exchange (NEX) is a data, computing and knowledge collaboratory that houses NASA satellite, climate and ancillary data where a focused community can come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform with access to large supercomputing resources. As more and more projects are being executed on NEX, we are increasingly focusing on capturing the knowledge of NEX users and providing mechanisms for sharing it with the community in order to facilitate reuse and accelerate research. There are many possible knowledge contributions to NEX: a wiki entry on the NEX portal contributed by a developer, information extracted from a publication in an automated way, or a workflow captured during code execution on the supercomputing platform. The goal of the NEX knowledge platform is to capture and organize this information and make it easily accessible to the NEX community and beyond. The knowledge acquisition process consists of three main facets - data and metadata, workflows and processes, and web-based information. Once the knowledge is acquired, it is processed in a number of ways ranging from custom metadata parsers to entity extraction using natural language processing techniques. The processed information is linked with existing taxonomies and aligned with an internal ontology (which heavily reuses a number of external ontologies). This forms a knowledge graph that can then be used to improve users' search query results as well as provide additional analytics capabilities to the NEX system. Such a knowledge graph will be an important building block in creating a dynamic knowledge base for the NEX community where knowledge is both generated and easily shared.

  9. On-the-spot lung cancer differential diagnosis by label-free, molecular vibrational imaging and knowledge-based classification

    NASA Astrophysics Data System (ADS)

    Gao, Liang; Li, Fuhai; Thrall, Michael J.; Yang, Yaliang; Xing, Jiong; Hammoudi, Ahmad A.; Zhao, Hong; Massoud, Yehia; Cagle, Philip T.; Fan, Yubo; Wong, Kelvin K.; Wang, Zhiyong; Wong, Stephen T. C.

    2011-09-01

    We report the development and application of a knowledge-based coherent anti-Stokes Raman scattering (CARS) microscopy system for label-free imaging, pattern recognition, and classification of cells and tissue structures for differentiating lung cancer from non-neoplastic lung tissues and identifying lung cancer subtypes. A total of 1014 CARS images were acquired from 92 fresh frozen lung tissue samples. The established pathological workup and diagnostic cellular features were used as prior knowledge for the establishment of a knowledge-based CARS system using a machine learning approach. This system functions to separate normal, non-neoplastic, and subtypes of lung cancer tissues based on extracted quantitative features describing fibrils and cell morphology. The knowledge-based CARS system showed the ability to distinguish lung cancer from normal and non-neoplastic lung tissue with 91% sensitivity and 92% specificity. Small cell carcinomas were distinguished from non-small cell carcinomas with 100% sensitivity and specificity. As an adjunct to submitting tissue samples to routine pathology, our novel system recognizes the patterns of fibril and cell morphology, enabling medical practitioners to perform differential diagnosis of lung lesions in mere minutes. The demonstration of the strategy is also a necessary step toward in vivo point-of-care diagnosis of precancerous and cancerous lung lesions with a fiber-based CARS microendoscope.

  10. A semi-supervised learning framework for biomedical event extraction based on hidden topics.

    PubMed

    Zhou, Deyu; Zhong, Dayou

    2015-05-01

    Scientists have devoted decades of effort to understanding the interactions between proteins or RNA production. This information might augment current knowledge on drug reactions or the development of certain diseases. Nevertheless, due to its lack of explicit structure, the life science literature, one of the most important sources of this information, prevents computer-based systems from accessing it. Therefore, biomedical event extraction, which automatically acquires knowledge of molecular events from research articles, has recently attracted community-wide efforts. Most approaches are based on statistical models and require large-scale annotated corpora to precisely estimate model parameters; however, such corpora are usually difficult to obtain in practice. Therefore, employing un-annotated data through semi-supervised learning for biomedical event extraction is a feasible solution and attracts growing interest. In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are automatically assigned event annotations based on their distances to sentences in the annotated corpus. More specifically, not only the structures of the sentences but also the hidden topics embedded in the sentences are used to describe the distance. The sentences and newly assigned event annotations, together with the annotated corpus, are employed for training. Experiments were conducted on the multi-level event extraction corpus, a gold-standard corpus. Experimental results show that the proposed framework achieves more than a 2.2% improvement in F-score on biomedical event extraction when compared to the state-of-the-art approach. The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system, and that the similarity between sentences may be precisely described by hidden topics and sentence structures. Copyright © 2015 Elsevier B.V. All rights reserved.
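
    The core idea of projecting sentences into a hidden-topic space and transferring labels from the nearest annotated neighbour can be sketched in a few lines. The sentences, labels and topic counts below are invented, and the paper's actual distance also incorporates sentence structure, which this sketch omits.

    ```python
    # Sketch of label transfer via topic-space similarity: fit a topic model on
    # annotated + un-annotated sentences, then copy the label of the closest
    # annotated sentence. Toy data; illustrative only.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.metrics.pairwise import cosine_similarity

    annotated = ["MEK phosphorylates ERK in stimulated cells",
                 "p53 binds the promoter of target genes"]
    labels    = ["Phosphorylation", "Binding"]
    unlabeled = ["ERK is phosphorylated by activated MEK"]

    vec = CountVectorizer().fit(annotated + unlabeled)
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    topics = lda.fit_transform(vec.transform(annotated + unlabeled))

    # Similarity of the un-annotated sentence to each annotated sentence in topic space.
    sims = cosine_similarity(topics[-1:], topics[:len(annotated)])[0]
    print("transferred label:", labels[sims.argmax()])
    ```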

  11. Analogy between gambling and measurement-based work extraction

    NASA Astrophysics Data System (ADS)

    Vinkler, Dror A.; Permuter, Haim H.; Merhav, Neri

    2016-04-01

    In information theory, one area of interest is gambling, where mutual information characterizes the maximal gain in wealth growth rate due to knowledge of side information; the betting strategy that achieves this maximum is named the Kelly strategy. In the field of physics, it was recently shown that mutual information can characterize the maximal amount of work that can be extracted from a single heat bath using measurement-based control protocols, i.e. using ‘information engines’. However, to the best of our knowledge, no relation between gambling and information engines has been presented before. In this paper, we briefly review the two concepts and then demonstrate an analogy between gambling, where bits are converted into wealth, and information engines, where bits representing measurements are converted into energy. From this analogy follows an extension of gambling to the continuous-valued case, which is shown to be useful for investments in currency exchange rates or in the stock market using options. Moreover, the analogy enables us to use well-known methods and results from one field to solve problems in the other. We present three such cases: maximum work extraction when the probability distributions governing the system and measurements are unknown, work extraction when some energy is lost in each cycle, e.g. due to friction, and an analysis of systems with memory. In all three cases, the analogy enables us to use known results in order to obtain new ones.
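
    For reference, the two textbook results the analogy connects can be written side by side; the notation below is generic and not taken verbatim from the paper.

    ```latex
    % Side information raises the Kelly growth rate by the mutual information,
    % and measurement-based control bounds extractable work by the same quantity.
    \begin{align}
      \Delta W_{\text{growth}} &= I(X;Y)
        && \text{(Kelly betting with side information, bits per round)}\\
      W_{\text{ext}} &\le k_B T \ln 2 \; I(X;Y)
        && \text{(information engine, } I \text{ measured in bits)}
    \end{align}
    ```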

  12. Calibration of Viking imaging system pointing, image extraction, and optical navigation measure

    NASA Technical Reports Server (NTRS)

    Breckenridge, W. G.; Fowler, J. W.; Morgan, E. M.

    1977-01-01

    Pointing control and knowledge accuracy of Viking Orbiter science instruments is controlled by the scan platform. Calibration of the scan platform and the imaging system was accomplished through mathematical models. The calibration procedure and results obtained for the two Viking spacecraft are described. Included are both ground and in-flight scan platform calibrations, and the additional calibrations unique to optical navigation.

  13. Medical document anonymization with a semantic lexicon.

    PubMed Central

    Ruch, P.; Baud, R. H.; Rassinoux, A. M.; Bouillon, P.; Robert, G.

    2000-01-01

    We present an original system for locating and removing personally-identifying information in patient records. In this experiment, anonymization is seen as a particular case of knowledge extraction. We use natural language processing tools provided by the MEDTAG framework: a semantic lexicon specialized in medicine, and a toolkit for word-sense and morpho-syntactic tagging. The system finds 98-99% of all personally-identifying information. PMID:11079980
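
    As a purely illustrative contrast to the MEDTAG-based approach above, the sketch below shows the bare-bones idea of lexicon- and pattern-driven scrubbing of personally identifying information. The name lexicon and date pattern are invented toy rules, not part of the described system.

    ```python
    # Toy PII scrubber: lexicon lookup for names plus a regex for dates.
    # Illustrative only; the real system uses a semantic lexicon and
    # morpho-syntactic tagging.

    import re

    NAME_LEXICON = {"Dupont", "Martin"}            # stand-in for a lexicon of person names
    DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")

    def anonymize(text):
        for name in NAME_LEXICON:
            text = text.replace(name, "[NAME]")
        return DATE_PATTERN.sub("[DATE]", text)

    print(anonymize("Mr Dupont was admitted on 12/03/1999."))
    # Mr [NAME] was admitted on [DATE].
    ```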

  14. A framework for extracting and representing project knowledge contexts using topic models and dynamic knowledge maps

    NASA Astrophysics Data System (ADS)

    Xu, Jin; Li, Zheng; Li, Shuliang; Zhang, Yanyan

    2015-07-01

    There is still a lack of effective paradigms and tools for analysing and discovering the contents and relationships of project knowledge contexts in the field of project management. In this paper, a new framework for extracting and representing project knowledge contexts using topic models and dynamic knowledge maps under big data environments is proposed and developed. The conceptual paradigm, theoretical underpinning, extended topic model, and illustration examples of the ontology model for project knowledge maps are presented, with further research work envisaged.

  15. Unsupervised Ontology Generation from Unstructured Text. CRESST Report 827

    ERIC Educational Resources Information Center

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2013-01-01

    Ontologies are a vital component of most knowledge acquisition systems, and recently there has been a huge demand for generating ontologies automatically since manual or supervised techniques are not scalable. In this paper, we introduce "OntoMiner", a rule-based, iterative method to extract and populate ontologies from unstructured or…

  16. PREDOSE: A Semantic Web Platform for Drug Abuse Epidemiology using Social Media

    PubMed Central

    Cameron, Delroy; Smith, Gary A.; Daniulaityte, Raminta; Sheth, Amit P.; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z.; Falck, Russel

    2013-01-01

    Objectives The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel Semantic Web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO) (pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC). A combination of lexical, pattern-based and semantics-based techniques is used together with the domain knowledge to extract fine-grained semantic information from UGC. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Methods Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, routes of administration, etc. The DAO is also used to help recognize three types of data, namely: 1) entities, 2) relationships and 3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information from UGC, and querying, search, trend analysis and overall content analysis of social media related to prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. Results A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. 
Extracted semantic information is currently in use in an online discovery support system, by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. Conclusion A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in future. PMID:23892295

  17. Research on complex 3D tree modeling based on L-system

    NASA Astrophysics Data System (ADS)

    Gang, Chen; Bin, Chen; Yuming, Liu; Hui, Li

    2018-03-01

    The L-system, as a fractal iterative system, can simulate complex geometric patterns. Based on field observation data of trees and the knowledge of forestry experts, this paper extracted modeling constraint rules and obtained an L-system rule set. Using self-developed L-system modeling software, the rule set was parsed to generate complex 3D tree models. The results showed that the geometric modeling method based on L-systems can be used to describe the morphological structure of complex trees and generate 3D tree models.
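
    The iterative string rewriting at the heart of an L-system is easy to show directly. The axiom and production rules below are the textbook bracketed-tree example, not the rule set extracted in the paper; a modeller would interpret the resulting string as drawing and branching instructions.

    ```python
    # Minimal L-system rewriting: apply production rules to an axiom for a fixed
    # number of iterations. Textbook example rules, not the paper's rule set.

    RULES = {"X": "F[+X][-X]FX", "F": "FF"}

    def rewrite(axiom, rules, iterations):
        s = axiom
        for _ in range(iterations):
            s = "".join(rules.get(ch, ch) for ch in s)  # symbols without a rule are copied
        return s

    print(rewrite("X", RULES, 2))
    # FF[+F[+X][-X]FX][-F[+X][-X]FX]FFF[+X][-X]FX
    ```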

  18. Knowledge Integration to Make Decisions About Complex Systems: Sustainability of Energy Production from Agriculture

    ScienceCinema

    Danuso, Francesco

    2017-12-22

    A major bottleneck in improving the governance of complex systems is our ability to integrate different forms of knowledge into a decision support system (DSS). Preliminary aspects are the classification of different types of knowledge (a priori or general, a posteriori or specific, with uncertainty, numerical, textual, algorithmic, complete/incomplete, etc.), the definition of ontologies for knowledge management and the availability of proper tools like continuous simulation models, event-driven models, statistical approaches, computational methods (neural networks, evolutionary optimization, rule-based systems, etc.) and procedures for textual documentation. Following these views, a computer language for knowledge integration (SEMoLa: Simple, Easy Modelling Language) has been developed at the University of Udine. SEMoLa can handle models, data, metadata and textual knowledge; it implements and extends the system dynamics ontology (Forrester, 1968; Jørgensen, 1994) in which systems are modelled by the concepts of material, group, state, rate, parameter, internal and external events and driving variables. As an example, a SEMoLa model to improve the management and sustainability (economic, energetic, environmental) of agricultural farms is presented. The model (X-Farm) simulates a farm in which cereal and forage yield, oil seeds, milk, calves and wastes can be sold or reused. X-Farm is composed of integrated modules describing fields (crop and soil), feed and materials storage, machinery management, manpower management, animal husbandry, economic and energetic balances, seed oil extraction, manure and waste management, and biogas production from animal wastes and biomasses.

  19. Character-level neural network for biomedical named entity recognition.

    PubMed

    Gridach, Mourad

    2017-06-01

    Biomedical named entity recognition (BNER), which extracts important named entities such as genes and proteins, is a challenging task in automated systems that mine knowledge in biomedical texts. The previous state-of-the-art systems required large amounts of task-specific knowledge in the form of feature engineering, lexicons and data pre-processing to achieve high performance. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional long short-term memory (LSTM) and conditional random field (CRF) eliminating the need for most feature engineering tasks. We evaluate our system on two datasets: JNLPBA corpus and the BioCreAtIvE II Gene Mention (GM) corpus. We obtained state-of-the-art performance by outperforming the previous systems. To the best of our knowledge, we are the first to investigate the combination of deep neural networks, CRF, word embeddings and character-level representation in recognizing biomedical named entities. Copyright © 2017 Elsevier Inc. All rights reserved.
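
    To illustrate the character-level half of such an architecture, the sketch below encodes a word by running a small LSTM over its characters and concatenating the final hidden state with a word embedding. It is only the representation layer: the bidirectional sentence LSTM, the CRF tagging layer and the training loop are omitted, and the vocabulary sizes and dimensions are placeholders.

    ```python
    # Character + word representation sketch for a BNER tagger (representation
    # layer only; no CRF, no training). Sizes and vocabularies are placeholders.

    import torch
    import torch.nn as nn

    class CharWordEncoder(nn.Module):
        def __init__(self, n_chars, n_words, char_dim=25, char_hidden=25, word_dim=100):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.char_lstm = nn.LSTM(char_dim, char_hidden, batch_first=True)
            self.word_emb = nn.Embedding(n_words, word_dim)

        def forward(self, char_ids, word_id):
            # char_ids: (1, word_length), word_id: (1,)
            _, (h, _) = self.char_lstm(self.char_emb(char_ids))
            # Concatenate the word embedding with the final character-LSTM state.
            return torch.cat([self.word_emb(word_id), h[-1]], dim=-1)

    enc = CharWordEncoder(n_chars=60, n_words=5000)
    rep = enc(torch.tensor([[3, 7, 12, 4]]), torch.tensor([42]))
    print(rep.shape)  # torch.Size([1, 125])
    ```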

  20. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    PubMed

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  1. Biomedical discovery acceleration, with applications to craniofacial development.

    PubMed

    Leach, Sonia M; Tipney, Hannah; Feng, Weiguo; Baumgartner, William A; Kasliwal, Priyanka; Schuyler, Ronald P; Williams, Trevor; Spritz, Richard A; Hunter, Lawrence

    2009-03-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.

  2. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach

    PubMed Central

    2012-01-01

    Background Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. Methods We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. Results We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. Conclusions We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data. PMID:22759462

  3. Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach.

    PubMed

    Ratkovic, Zorana; Golik, Wiktoria; Warnier, Pierre

    2012-06-26

    Bacteria biotopes cover a wide range of diverse habitats including animal and plant hosts, natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format, such as a database. This information is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy). The automatic extraction of this information from texts will provide a great benefit to the field. We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend over sentence borders, and we developed domain-specific rules for dealing with bacteria anaphors. We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance of participating systems. New developments since then have increased the F-score by 4.1 points. We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.

  4. A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records

    PubMed Central

    Weissenbacher, Davy; Rivera, Robert; Beard, Rachel; Firago, Mari; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela

    2016-01-01

    Objective The metadata reflecting the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance this metadata by extracting more specific geographic information from related full-text articles and mapping them to their latitude/longitudes using knowledge derived from external geographical databases. Materials and Methods We developed a rule-based information extraction framework for linking GenBank records to the latitude/longitudes of the LOIH. Our system first extracts existing geospatial metadata from GenBank records and attempts to improve it by seeking additional, relevant geographic information from text and tables in related full-text PubMed Central articles. The final extracted locations of the records, based on data assimilated from these sources, are then disambiguated and mapped to their respective geo-coordinates. We evaluated our approach on a manually annotated dataset comprising of 5728 GenBank records for the influenza A virus. Results We found the precision, recall, and f-measure of our system for linking GenBank records to the latitude/longitudes of their LOIH to be 0.832, 0.967, and 0.894, respectively. Discussion Our system had a high level of accuracy for linking GenBank records to the geo-coordinates of the LOIH. However, it can be further improved by expanding our database of geospatial data, incorporating spell correction, and enhancing the rules used for extraction. Conclusion Our system performs reasonably well for linking GenBank records for the influenza A virus to the geo-coordinates of their LOIH based on record metadata and information extracted from related full-text articles. PMID:26911818

  5. A high-precision rule-based extraction system for expanding geospatial metadata in GenBank records.

    PubMed

    Tahsin, Tasnia; Weissenbacher, Davy; Rivera, Robert; Beard, Rachel; Firago, Mari; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela

    2016-09-01

    The metadata reflecting the location of the infected host (LOIH) of virus sequences in GenBank often lacks specificity. This work seeks to enhance this metadata by extracting more specific geographic information from related full-text articles and mapping them to their latitude/longitudes using knowledge derived from external geographical databases. We developed a rule-based information extraction framework for linking GenBank records to the latitude/longitudes of the LOIH. Our system first extracts existing geospatial metadata from GenBank records and attempts to improve it by seeking additional, relevant geographic information from text and tables in related full-text PubMed Central articles. The final extracted locations of the records, based on data assimilated from these sources, are then disambiguated and mapped to their respective geo-coordinates. We evaluated our approach on a manually annotated dataset comprising of 5728 GenBank records for the influenza A virus. We found the precision, recall, and f-measure of our system for linking GenBank records to the latitude/longitudes of their LOIH to be 0.832, 0.967, and 0.894, respectively. Our system had a high level of accuracy for linking GenBank records to the geo-coordinates of the LOIH. However, it can be further improved by expanding our database of geospatial data, incorporating spell correction, and enhancing the rules used for extraction. Our system performs reasonably well for linking GenBank records for the influenza A virus to the geo-coordinates of their LOIH based on record metadata and information extracted from related full-text articles. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
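
    The final step described in these two records, mapping a normalised location string to latitude/longitude via an external geographical database, can be pictured with a tiny gazetteer lookup. The gazetteer entries, coordinates and fallback rule below are invented for illustration, not the system's actual resources.

    ```python
    # Gazetteer lookup sketch: map an extracted location to coordinates,
    # falling back to the country-level entry when the more specific place is
    # unknown. Entries are illustrative only.

    GAZETTEER = {
        ("viet nam",): (14.06, 108.28),
        ("viet nam", "ha noi"): (21.03, 105.85),
    }

    def geocode(country, region=None):
        key = (country.lower(),) + ((region.lower(),) if region else ())
        return GAZETTEER.get(key) or GAZETTEER.get((country.lower(),))

    print(geocode("Viet Nam", "Ha Noi"))   # (21.03, 105.85)
    print(geocode("Viet Nam", "Da Nang"))  # (14.06, 108.28)  country-level fallback
    ```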

  6. SAMS--a systems architecture for developing intelligent health information systems.

    PubMed

    Yılmaz, Özgün; Erdur, Rıza Cenk; Türksever, Mustafa

    2013-12-01

    In this paper, SAMS, a novel architecture for developing intelligent health information systems, is proposed, and some strategies for developing such systems are discussed. Systems fulfilling this architecture will be able to store patients' electronic health records using OWL ontologies, share patient records among different hospitals and provide physicians with expertise to assist them in making decisions. The system is intelligent because it is rule-based, makes use of rule-based reasoning and has the ability to learn and evolve itself. The learning capability is provided by extracting rules from decisions previously given by physicians and then adding the extracted rules to the system. The proposed system is novel and original in all of these aspects. As a case study, a system conforming to the SAMS architecture is implemented for use by dentists in the dental domain. The use of the developed system is described with a scenario. For evaluation, the developed dental information system will be used and tried by a group of dentists. The development of this system proves the applicability of the SAMS architecture. By getting decision support from a system derived from this architecture, the cognitive gap between experienced and inexperienced physicians can be compensated for. Thus, patient satisfaction can be achieved, inexperienced physicians are supported in decision making and the personnel can improve their knowledge. A physician can diagnose a case, which he/she has never diagnosed before, using this system. With the help of this system, it will be possible to store general domain knowledge, and the personnel's need for medical guideline documents will be reduced.

  7. Radical scavenging activities of Rio Red grapefruits and Sour orange fruit extracts in different in vitro model systems.

    PubMed

    Jayaprakasha, G K; Girennavar, Basavaraj; Patil, Bhimanagouda S

    2008-07-01

    Antioxidant fractions from two different citrus species, Rio Red (Citrus paradisi Macf.) and Sour orange (Citrus aurantium L.), were extracted with five different polar solvents using a Soxhlet-type extractor. The total phenolic content of the extracts was determined by the Folin-Ciocalteu method. The ethyl acetate extracts of Rio Red and Sour orange were found to contain the maximum phenolics. The dried fractions were screened for their antioxidant activity potential using in vitro model systems such as 1,1-diphenyl-2-picrylhydrazyl (DPPH), the phosphomolybdenum method and nitroblue tetrazolium (NBT) reduction at different concentrations. The methanol:water (80:20) fraction of Rio Red showed the highest radical scavenging activity (42.5%, 77.8% and 92.1% at 250, 500 and 1000 ppm, respectively), while the methanol:water (80:20) fraction of Sour orange showed the lowest radical scavenging activity at all the tested concentrations. All citrus fractions showed good antioxidant capacity by the formation of the phosphomolybdenum complex at 200 ppm. In addition, superoxide radical scavenging activity was assayed using a non-enzymatic (NADH/phenazine methosulfate) superoxide-generating system. All the extracts showed variable superoxide radical scavenging activity. Moreover, the methanol:water (80:20) extract of Rio Red and the methanol extract of Sour orange exhibited marked reducing power in the potassium ferricyanide reduction method. The data obtained using the above in vitro models clearly establish the antioxidant potential of citrus fruit extracts. However, comprehensive studies need to be conducted to ascertain the in vivo bioavailability, safety and efficacy of such extracts in experimental animals. To the best of our knowledge, this is the first report on the antioxidant activity of different polar extracts from Rio Red and Sour oranges.
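
    The DPPH percentages quoted above are conventionally computed as percent inhibition from absorbance readings. The formula below is the standard one rather than one stated in the abstract, and the absorbance values are made up for illustration.

    ```python
    # Standard DPPH percent-inhibition calculation with invented absorbance values.

    def dpph_scavenging(a_control, a_sample):
        """Percent inhibition of the DPPH radical."""
        return (a_control - a_sample) / a_control * 100

    print(round(dpph_scavenging(a_control=0.90, a_sample=0.52), 1))  # 42.2
    ```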

  8. Establishment of Application Guidance for OTC non-Kampo Crude Drug Extract Products in Japan

    PubMed Central

    Somekawa, Layla; Maegawa, Hikoichiro; Tsukada, Shinsuke; Nakamura, Takatoshi

    2017-01-01

    Currently, there are no standardized regulatory systems for herbal medicinal products worldwide. Communication and sharing of knowledge between different regulatory systems will lead to mutual understanding and might help identify topics which deserve further discussion in the establishment of common standards. Regulatory information on traditional herbal medicinal products in Japan is updated by the establishment of Application Guidance for over-the-counter non-Kampo Crude Drug Extract Products. We would like to report on updated regulatory information on the new Application Guidance. Methods for comparison of Crude Drug Extract formulation and standard decoction and criteria for application and the key points to consider for each criterion are indicated in the guidance. Establishment of the guidance contributes to improvements in public health. We hope that the regulatory information about traditional herbal medicinal products in Japan will be of contribution to tackling the challenging task of regulating traditional herbal products worldwide. PMID:28894633

  9. Extracting Social Information from Chemosensory Cues: Consideration of Several Scenarios and Their Functional Implications

    PubMed Central

    Ben-Shaul, Yoram

    2015-01-01

    Across all sensory modalities, stimuli can vary along multiple dimensions. Efficient extraction of information requires sensitivity to those stimulus dimensions that provide behaviorally relevant information. To derive social information from chemosensory cues, sensory systems must embed information about the relationships between behaviorally relevant traits of individuals and the distributions of the chemical cues that are informative about these traits. In simple cases, the mere presence of one particular compound is sufficient to guide appropriate behavior. However, more generally, chemosensory information is conveyed via relative levels of multiple chemical cues, in non-trivial ways. The computations and networks needed to derive information from multi-molecule stimuli are distinct from those required by single molecule cues. Our current knowledge about how socially relevant information is encoded by chemical blends, and how it is extracted by chemosensory systems is very limited. This manuscript explores several scenarios and the neuronal computations required to identify them. PMID:26635515

  10. Building Scalable Knowledge Graphs for Earth Science

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Maskey, M.; Gatlin, P. N.; Zhang, J.; Duan, X.; Bugbee, K.; Christopher, S. A.; Miller, J. J.

    2017-12-01

    Estimates indicate that the world's information will grow by 800% in the next five years. In any given field, a single researcher or a team of researchers cannot keep up with this rate of knowledge expansion without the help of cognitive systems. Cognitive computing, defined as the use of information technology to augment human cognition, can help tackle large systemic problems. Knowledge graphs, one of the foundational components of cognitive systems, link key entities in a specific domain with other entities via relationships. Researchers could mine these graphs to make probabilistic recommendations and to infer new knowledge. At this point, however, there is a dearth of tools to generate scalable Knowledge graphs using existing corpus of scientific literature for Earth science research. Our project is currently developing an end-to-end automated methodology for incrementally constructing Knowledge graphs for Earth Science. Semantic Entity Recognition (SER) is one of the key steps in this methodology. SER for Earth Science uses external resources (including metadata catalogs and controlled vocabulary) as references to guide entity extraction and recognition (i.e., labeling) from unstructured text, in order to build a large training set to seed the subsequent auto-learning component in our algorithm. Results from several SER experiments will be presented as well as lessons learned.

  11. A Multi-Language System for Knowledge Extraction in E-Learning Videos

    ERIC Educational Resources Information Center

    Sood, Aparesh; Mittal, Ankush; Sarthi, Divya

    2006-01-01

    The existing multimedia software in E-Learning does not provide par excellence multimedia data service to the common user, hence E-Learning services are still short of intelligence and sophisticated end user tools for visualization and retrieval. An efficient approach to achieve the tasks such as, regional language narration, regional language…

  12. The minimal local-asperity hypothesis of early retinal lateral inhibition.

    PubMed

    Balboa, R M; Grzywacz, N M

    2000-07-01

    Recently we found that the theories related to information theory existent in the literature cannot explain the behavior of the extent of the lateral inhibition mediated by retinal horizontal cells as a function of background light intensity. These theories can explain the fall of the extent from intermediate to high intensities, but not its rise from dim to intermediate intensities. We propose an alternate hypothesis that accounts for the extent's bell-shape behavior. This hypothesis proposes that the lateral-inhibition adaptation in the early retina is part of a system to extract several image attributes, such as occlusion borders and contrast. To do so, this system would use prior probabilistic knowledge about the biological processing and relevant statistics in natural images. A key novel statistic used here is the probability of the presence of an occlusion border as a function of local contrast. Using this probabilistic knowledge, the retina would optimize the spatial profile of lateral inhibition to minimize attribute-extraction error. The two significant errors that this minimization process must reduce are due to the quantal noise in photoreceptors and the straddling of occlusion borders by lateral inhibition.

  13. Knowledge Modeling in Prior Art Search

    NASA Astrophysics Data System (ADS)

    Graf, Erik; Frommholz, Ingo; Lalmas, Mounia; van Rijsbergen, Keith

    This study explores the benefits of integrating knowledge representations in prior art patent retrieval. Key to the introduced approach is the utilization of human judgment available in the form of classifications assigned to patent documents. The paper first outlines in detail how a methodology for the extraction of knowledge from such a hierarchical classification system can be established. Further, potential ways of integrating this knowledge with existing Information Retrieval paradigms in a scalable and flexible manner are investigated. Finally, based on these integration strategies, the effectiveness in terms of recall and precision is evaluated in the context of a prior art search task for European patents. As a result of this evaluation, it can be established that, in general, the proposed knowledge expansion techniques are particularly beneficial to recall and, with respect to optimizing field retrieval settings, further result in significant precision gains.

  14. Self-Supervised Chinese Ontology Learning from Online Encyclopedias

    PubMed Central

    Shao, Zhiqing; Ruan, Tong

    2014-01-01

    Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised learning-based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine learning-based methods. First, we prove that the self-supervised machine learning method is practicable for Chinese relation extraction (at least for synonymy and hyponymy) statistically and experimentally and train some self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other famous ontologies and knowledge bases; the experiment results also indicate that the self-supervised models obviously enrich SSCO. PMID:24715819

  15. Self-supervised Chinese ontology learning from online encyclopedias.

    PubMed

    Hu, Fanghuai; Shao, Zhiqing; Ruan, Tong

    2014-01-01

    Constructing an ontology manually is a time-consuming, error-prone, and tedious task. We present SSCO, a self-supervised-learning-based Chinese ontology, which contains about 255 thousand concepts, 5 million entities, and 40 million facts. We explore the three largest online Chinese encyclopedias for ontology learning and describe how to transfer the structured knowledge in encyclopedias, including article titles, category labels, redirection pages, taxonomy systems, and InfoBox modules, into ontological form. In order to avoid the errors in encyclopedias and enrich the learnt ontology, we also apply some machine-learning-based methods. First, we prove statistically and experimentally that the self-supervised machine learning method is practicable for Chinese relation extraction (at least for synonymy and hyponymy) and train several self-supervised models (SVMs and CRFs) for synonymy extraction, concept-subconcept relation extraction, and concept-instance relation extraction; the advantage of our methods is that all training examples are automatically generated from the structural information of encyclopedias and a few general heuristic rules. Finally, we evaluate SSCO in two aspects, scale and precision; manual evaluation results show that the ontology has excellent precision, and high coverage is concluded by comparing SSCO with other well-known ontologies and knowledge bases; the experimental results also indicate that the self-supervised models markedly enrich SSCO.
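
    A minimal sketch of how such self-supervised training examples might be generated from encyclopedia structure is shown below; the input layout (a title mapped to its redirects and categories) and the relation labels are illustrative assumptions rather than the SSCO implementation.

        def generate_training_pairs(articles):
            """Generate labeled relation examples from encyclopedia structure.

            `articles` maps an article title to a dict with 'redirects' and
            'categories' lists, mirroring the structured knowledge mentioned
            above (redirection pages, category labels)."""
            pairs = []
            for title, info in articles.items():
                # Redirect pages name the same concept -> synonymy examples.
                for alias in info.get("redirects", []):
                    pairs.append((alias, title, "synonym"))
                # Category labels name broader concepts -> hyponymy examples.
                for category in info.get("categories", []):
                    pairs.append((title, category, "hyponym"))
            return pairs

    Pairs generated this way could then serve as automatically labeled training examples for the SVM and CRF models.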

  16. Knowledge Discovery and Data Mining: An Overview

    NASA Technical Reports Server (NTRS)

    Fayyad, U.

    1995-01-01

    The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.

  17. Lynx: a database and knowledge extraction engine for integrative medicine.

    PubMed

    Sulakhe, Dinanath; Balasubramanian, Sandhya; Xie, Bingqing; Feng, Bo; Taylor, Andrew; Wang, Sheng; Berrocal, Eduardo; Dave, Utpal; Xu, Jinbo; Börnigen, Daniela; Gilliam, T Conrad; Maltsev, Natalia

    2014-01-01

    We have developed Lynx (http://lynx.ci.uchicago.edu)--a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces.

  18. Automatic generation of nursing narratives from entity-attribute-value triplet for electronic nursing records system.

    PubMed

    Min, Yul Ha; Park, Hyeoun-Ae; Lee, Joo Yun; Jo, Soo Jung; Jeon, Eunjoo; Byeon, Namsoo; Choi, Seung Yong; Chung, Eunja

    2014-01-01

    The aim of this study is to develop and evaluate a natural language generation system to populate nursing narratives using detailed clinical models. Semantic, contextual, and syntactic knowledge was extracted. A natural language generation system linking these types of knowledge was developed. The quality of the generated nursing narratives was evaluated by three nurse experts using a five-point rating scale. With 82 detailed clinical models, a total of 66,888 nursing narratives in four different types of statement were generated. The mean scores were 4.66 for overall quality, 4.60 for content, 4.40 for grammaticality, 4.13 for writing style, and 4.60 for correctness. The system developed in this study generated nursing narratives with different levels of granularity. The generated nursing narratives can improve the semantic interoperability of nursing data documented in nursing records.
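
    As a rough sketch of how entity-attribute-value triplets can be turned into narrative text, the fragment below fills simple sentence templates; the statement types, templates, and example triplet are hypothetical and far simpler than the detailed clinical models used in the study.

        # Hypothetical templates keyed by statement type.
        TEMPLATES = {
            "assessment": "Patient's {entity} {attribute} is {value}.",
            "intervention": "Performed {entity} ({attribute}: {value}).",
        }

        def generate_narrative(statement_type, entity, attribute, value):
            """Fill a sentence template with an entity-attribute-value triplet."""
            return TEMPLATES[statement_type].format(
                entity=entity, attribute=attribute, value=value)

        print(generate_narrative("assessment", "pain", "intensity", "7/10"))
        # -> Patient's pain intensity is 7/10.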

  19. ADEpedia 2.0: Integration of Normalized Adverse Drug Events (ADEs) Knowledge from the UMLS.

    PubMed

    Jiang, Guoqian; Liu, Hongfang; Solbrig, Harold R; Chute, Christopher G

    2013-01-01

    A standardized Adverse Drug Events (ADEs) knowledge base that encodes known ADE knowledge can be very useful in improving ADE detection for drug safety surveillance. In our previous study, we developed ADEpedia, a standardized knowledge base of ADEs based on drug product labels. The objectives of the present study are 1) to integrate normalized ADE knowledge from the Unified Medical Language System (UMLS) into ADEpedia; and 2) to enrich the knowledge base with drug-disorder co-occurrence data from a 51-million-document electronic medical records (EMRs) system. We extracted 266,832 drug-disorder concept pairs from the UMLS, covering 14,256 (1.69%) distinct drug concepts and 19,006 (3.53%) distinct disorder concepts. Of these, 71,626 (26.8%) concept pairs from the UMLS co-occurred in the EMRs. We performed a preliminary evaluation of the utility of the UMLS ADE data. In conclusion, we have built an ADEpedia 2.0 framework that intends to integrate known ADE knowledge from disparate sources. The UMLS is a useful source of standardized ADE knowledge relevant to indications, contraindications and adverse effects, and is complementary to the ADE data from drug product labels. The statistics from EMRs would enable the meaningful use of ADE data for drug safety surveillance.
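
    A minimal sketch of the drug-disorder co-occurrence step is given below: it counts how often each concept pair appears in the same EMR document. Representing documents as sets of already-extracted concept identifiers is an assumption made for illustration.

        from collections import Counter
        from itertools import product

        def cooccurrence_counts(emr_documents, drug_concepts, disorder_concepts):
            """Count drug-disorder concept pairs that co-occur in a document.

            `emr_documents` is an iterable of sets of concept identifiers
            extracted from the clinical notes."""
            counts = Counter()
            for concepts in emr_documents:
                drugs = concepts & drug_concepts
                disorders = concepts & disorder_concepts
                counts.update(product(drugs, disorders))
            return counts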

  20. Psychological tools for knowledge acquisition

    NASA Technical Reports Server (NTRS)

    Rueter, Henry H.; Olson, Judith Reitman

    1988-01-01

    Knowledge acquisition is said to be the biggest bottleneck in the development of expert systems. The problem is getting the knowledge out of the expert's head and into a computer. In cognitive psychology, characterizing mental structures and explaining why experts are good at what they do is an important research area. Is there some way that the tools psychologists have developed to uncover mental structure can be used to benefit knowledge engineers? We think that the way to find out is to browse through the psychologist's toolbox to see what there is in it that might be of use to knowledge engineers. Expert system developers have relied on two standard methods for extracting knowledge from the expert: (1) the knowledge engineer engages in an intense bout of interviews with the expert or experts, or (2) the knowledge engineer becomes an expert himself, relying on introspection to uncover the basis of his own expertise. Unfortunately, these techniques have the difficulty that often the expert himself isn't consciously aware of the basis of his expertise. If the expert himself isn't conscious of how he solves problems, introspection is useless. Cognitive psychology has faced similar problems for many years and has developed exploratory methods that can be used to discover cognitive structure from simple data.

  1. Creating Shareable Clinical Decision Support Rules for a Pharmacogenomics Clinical Guideline Using Structured Knowledge Representation.

    PubMed

    Linan, Margaret K; Sottara, Davide; Freimuth, Robert R

    2015-01-01

    Pharmacogenomics (PGx) guidelines contain drug-gene relationships, therapeutic and clinical recommendations from which clinical decision support (CDS) rules can be extracted, rendered and then delivered through clinical decision support systems (CDSS) to provide clinicians with just-in-time information at the point of care. Several tools exist that can be used to generate CDS rules that are based on computer interpretable guidelines (CIG), but none have been previously applied to the PGx domain. We utilized the Unified Modeling Language (UML), the Health Level 7 virtual medical record (HL7 vMR) model, and standard terminologies to represent the semantics and decision logic derived from a PGx guideline, which were then mapped to the Health eDecisions (HeD) schema. The modeling and extraction processes developed here demonstrate how structured knowledge representations can be used to support the creation of shareable CDS rules from PGx guidelines.

  2. New auto-segment method of cerebral hemorrhage

    NASA Astrophysics Data System (ADS)

    Wang, Weijiang; Shen, Tingzhi; Dang, Hua

    2007-12-01

    A novel method for the automatic segmentation of cerebral hemorrhage (CH) in computerized tomography (CT) images is presented in this paper; it uses an expert system that models human knowledge about the CH segmentation problem. The algorithm follows a series of dedicated steps and extracts easily overlooked CH features identified from statistical analysis of a large set of real CH images, such as region area, region CT number, region smoothness, and statistical relationships between CH regions. A seven-step extraction mechanism ensures that these CH features are obtained correctly and efficiently. Using these features, a decision tree that models human knowledge about the CH segmentation problem is built, which ensures the soundness and accuracy of the algorithm. Finally, experiments were carried out to verify the correctness and reasonableness of the automatic segmentation; the good accuracy and fast speed make it suitable for wide application in practice.
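
    To illustrate the decision-tree step, the sketch below trains a small classifier on region-level features of the kind listed above (area, CT number, smoothness); the feature values, labels, and use of scikit-learn are illustrative assumptions, not the authors' implementation.

        from sklearn.tree import DecisionTreeClassifier

        # Hypothetical region features: [area (px), mean CT number (HU), smoothness];
        # label 1 = hemorrhage region, 0 = other tissue.
        X = [[850, 65.0, 0.82],
             [120, 30.0, 0.40],
             [960, 70.0, 0.78],
             [200, 25.0, 0.35]]
        y = [1, 0, 1, 0]

        clf = DecisionTreeClassifier(max_depth=3, random_state=0)
        clf.fit(X, y)

        # Classify a new candidate region.
        print(clf.predict([[700, 62.0, 0.75]]))  # -> [1], likely hemorrhage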

  3. Knowledge Discovery in Spectral Data by Means of Complex Networks

    PubMed Central

    Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro

    2013-01-01

    In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease. PMID:24957895

  4. Knowledge discovery in spectral data by means of complex networks.

    PubMed

    Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Jaimes-Reategui, Rider; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro

    2013-03-11

    In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
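
    A minimal sketch of the network-reconstruction idea is shown below: nodes are spectral bins and edges connect bins whose intensities are strongly correlated across subjects. Pearson correlation and a fixed threshold stand in for the disease-associated pattern criterion described by the authors.

        import numpy as np
        import networkx as nx

        def spectral_network(spectra, threshold=0.7):
            """Build a network from a (subjects x bins) spectral intensity matrix."""
            corr = np.corrcoef(spectra.T)  # bin-by-bin correlation matrix
            g = nx.Graph()
            n_bins = corr.shape[0]
            g.add_nodes_from(range(n_bins))
            for i in range(n_bins):
                for j in range(i + 1, n_bins):
                    if abs(corr[i, j]) >= threshold:
                        g.add_edge(i, j, weight=float(corr[i, j]))
            return g

    Structural features of the resulting graph (degree, clustering, and so on) can then be fed to standard data-mining classifiers.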

  5. Text Mining in Biomedical Domain with Emphasis on Document Clustering

    PubMed Central

    2017-01-01

    Objectives With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise. PMID:28875048
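
    As a small, self-contained example of the document-clustering step reviewed here, the sketch below clusters a handful of toy abstracts with TF-IDF features and k-means; the documents and cluster count are invented for illustration.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.cluster import KMeans

        documents = [
            "gene expression profiling in breast cancer",
            "tumor suppressor genes and cancer risk",
            "deep brain stimulation for Parkinson disease",
            "dopamine signalling in Parkinson disease models",
        ]

        X = TfidfVectorizer(stop_words="english").fit_transform(documents)
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
        print(labels)  # e.g. [0 0 1 1]: a cancer cluster and a Parkinson cluster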

  6. Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network.

    PubMed

    Yang, Zhongliang; Huang, Yongfeng; Jiang, Yiran; Sun, Yuxi; Zhang, Yu-Jin; Luo, Pengcheng

    2018-04-20

    Automatically extracting useful information from electronic medical records along with conducting disease diagnoses is a promising task for both clinical decision support (CDS) and natural language processing (NLP). Most existing systems are based on artificially constructed knowledge bases, and auxiliary diagnosis is then done by rule matching. In this study, we present a clinical intelligent decision approach based on Convolutional Neural Networks (CNN), which can automatically extract high-level semantic information from electronic medical records and then perform automatic diagnosis without the artificial construction of rules or knowledge bases. We use 18,590 collected real-world clinical electronic medical records to train and test the proposed model. Experimental results show that the proposed model can achieve 98.67% accuracy and 96.02% recall, which strongly supports that using a convolutional neural network to automatically learn high-level semantic features of electronic medical records and then conduct assisted diagnosis is feasible and effective.
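
    The sketch below shows a minimal text-CNN of the kind described, written with the Keras API; the vocabulary size, sequence length, number of diagnosis classes, and layer sizes are placeholders, not the architecture reported in the paper.

        import tensorflow as tf
        from tensorflow.keras import layers

        VOCAB_SIZE, MAX_LEN, NUM_CLASSES = 20000, 500, 32  # placeholders

        model = tf.keras.Sequential([
            layers.Input(shape=(MAX_LEN,)),
            layers.Embedding(VOCAB_SIZE, 128),           # token embeddings
            layers.Conv1D(128, 5, activation="relu"),     # n-gram feature maps
            layers.GlobalMaxPooling1D(),                  # strongest feature per map
            layers.Dense(64, activation="relu"),
            layers.Dropout(0.5),
            layers.Dense(NUM_CLASSES, activation="softmax"),
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(...) would then be run on tokenized, padded record sequences.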

  7. From data to information and knowledge for geospatial applications

    NASA Astrophysics Data System (ADS)

    Schenk, T.; Csatho, B.; Yoon, T.

    2006-12-01

    An ever-increasing number of airborne and spaceborne data-acquisition missions with various sensors produce a glut of data. Sensory data rarely contain information in an explicit form such that an application can directly use it. The processing and analyzing of data constitutes a real bottleneck; therefore, automating the processes of gaining useful information and knowledge from the raw data is of paramount interest. This presentation is concerned with the transition from data to information and knowledge. By data we refer to the sensor output, and we note that data very rarely provide direct answers for applications. For example, a pixel in a digital image or a laser point from a LIDAR system (data) has no direct relationship with elevation changes of topographic surfaces or the velocity of a glacier (information, knowledge). We propose to employ the computer vision paradigm to extract information and knowledge as it pertains to a wide range of geoscience applications. After introducing the paradigm we describe the major steps to be undertaken for extracting information and knowledge from sensory input data. Features play an important role in this process. Thus we focus on extracting features and their perceptual organization into higher-order constructs. We demonstrate these concepts with imaging data and laser point clouds. The second part of the presentation addresses the problem of combining data obtained by different sensors. An absolute prerequisite for successful fusion is to establish a common reference frame. We elaborate on the concept of sensor-invariant features that allow the registration of such disparate data sets as aerial/satellite imagery, 3D laser point clouds, and multi/hyperspectral imagery. Fusion takes place on the data level (sensor registration) and on the information level. We show how fusion increases the degree of automation for reconstructing topographic surfaces. Moreover, fused information gained from the three sensors results in a more abstract surface representation with a rich set of explicit surface information that can be readily used by an analyst for applications such as change detection.

  8. ZK DrugResist 2.0: A TextMiner to extract semantic relations of drug resistance from PubMed.

    PubMed

    Khalid, Zoya; Sezerman, Osman Ugur

    2017-05-01

    Extracting useful knowledge from unstructured textual data is a challenging task for biologists, since the biomedical literature is growing exponentially on a daily basis. Building automated methods for such tasks is gaining much attention from researchers. ZK DrugResist is an online tool that automatically extracts mutations and expression changes associated with drug resistance from PubMed. In this study we have extended our tool to include semantic relations extracted from biomedical text covering drug resistance and established a server including both of these features. Our system was tested for three relations, Resistance (R), Intermediate (I) and Susceptible (S), by applying a hybrid feature set. Over the last few decades the focus has shifted to hybrid approaches, as they provide better results; in our case this approach combines rule-based methods with machine learning techniques. The results showed 97.67% accuracy with 96% precision, recall and F-measure. The results outperform previously existing relation extraction systems and can thus facilitate computational analysis of drug resistance in complex diseases; the approach can further be applied to other areas of biomedicine. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Building a glaucoma interaction network using a text mining approach.

    PubMed

    Soliman, Maha; Nasraoui, Olfa; Cooper, Nigel G F

    2016-01-01

    The volume of biomedical literature and its underlying knowledge base is rapidly expanding, making it beyond the ability of a single human being to read through all the literature. Several automated methods have been developed to help make sense of this dilemma. The present study reports on the results of a text mining approach to extract gene interactions from the data warehouse of published experimental results which are then used to benchmark an interaction network associated with glaucoma. To the best of our knowledge, there is, as yet, no glaucoma interaction network derived solely from text mining approaches. The presence of such a network could provide a useful summative knowledge base to complement other forms of clinical information related to this disease. A glaucoma corpus was constructed from PubMed Central and a text mining approach was applied to extract genes and their relations from this corpus. The extracted relations between genes were checked using reference interaction databases and classified generally as known or new relations. The extracted genes and relations were then used to construct a glaucoma interaction network. Analysis of the resulting network indicated that it bears the characteristics of a small world interaction network. Our analysis showed the presence of seven glaucoma linked genes that defined the network modularity. A web-based system for browsing and visualizing the extracted glaucoma related interaction networks is made available at http://neurogene.spd.louisville.edu/GlaucomaINViewer/Form1.aspx. This study has reported the first version of a glaucoma interaction network using a text mining approach. The power of such an approach is in its ability to cover a wide range of glaucoma related studies published over many years. Hence, a bigger picture of the disease can be established. To the best of our knowledge, this is the first glaucoma interaction network to summarize the known literature. The major findings were a set of relations that could not be found in existing interaction databases and that were found to be new, in addition to a smaller subnetwork consisting of interconnected clusters of seven glaucoma genes. Future improvements can be applied towards obtaining a better version of this network.
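
    For illustration, the sketch below builds a small interaction network from extracted gene relations with networkx and computes simple structural statistics; the relations and their known/new status are invented examples, not findings of the study.

        import networkx as nx

        # Hypothetical text-mined gene relations, tagged as 'known' (present in a
        # reference interaction database) or 'new'.
        relations = [
            ("MYOC", "CYP1B1", "known"),
            ("OPTN", "TBK1", "known"),
            ("MYOC", "OPTN", "new"),
        ]

        g = nx.Graph()
        for gene_a, gene_b, status in relations:
            g.add_edge(gene_a, gene_b, status=status)

        print(g.number_of_nodes(), g.number_of_edges())
        print(nx.average_clustering(g))  # one small-world-style statistic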

  10. Artisanal Extraction and Traditional Knowledge Associated with Medicinal Use of Crabwood Oil (Carapa guianensis Aublet.) in a Peri-Urban Várzea Environment in the Amazon Estuary.

    PubMed

    Nardi, Mariane; Lira-Guedes, Ana Cláudia; Albuquerque Cunha, Helenilza Ferreira; Guedes, Marcelino Carneiro; Mustin, Karen; Gomes, Suellen Cristina Pantoja

    2016-01-01

    Várzea forests of the Amazon estuary contain species of importance to riverine communities. For example, the oil extracted from the seeds of crabwood trees is traditionally used to combat various illnesses and as such artisanal extraction processes have been maintained. The objectives of this study were to (1) describe the process involved in artisanal extraction of crabwood oil in the Fazendinha Protected Area, in the state of Amapá; (2) characterise the processes of knowledge transfer associated with the extraction and use of crabwood oil within a peri-urban riverine community; and (3) discern medicinal uses of the oil. The data were obtained using semistructured interviews with 13 community members involved in crabwood oil extraction and via direct observation. The process of oil extraction is divided into four stages: seed collection; cooking and resting of the seeds; shelling of the seeds and dough preparation; and oil collection. Oil extraction is carried out within the home for personal use, with surplus marketed within the community. More than 90% of the members of the community involved in extraction of crabwood oil highlighted the use of the oil to combat inflammation of the throat. Knowledge transfer occurs via oral transmission and through direct observation.

  11. Artisanal Extraction and Traditional Knowledge Associated with Medicinal Use of Crabwood Oil (Carapa guianensis Aublet.) in a Peri-Urban Várzea Environment in the Amazon Estuary

    PubMed Central

    Lira-Guedes, Ana Cláudia; Albuquerque Cunha, Helenilza Ferreira; Guedes, Marcelino Carneiro; Mustin, Karen; Gomes, Suellen Cristina Pantoja

    2016-01-01

    Várzea forests of the Amazon estuary contain species of importance to riverine communities. For example, the oil extracted from the seeds of crabwood trees is traditionally used to combat various illnesses and as such artisanal extraction processes have been maintained. The objectives of this study were to (1) describe the process involved in artisanal extraction of crabwood oil in the Fazendinha Protected Area, in the state of Amapá; (2) characterise the processes of knowledge transfer associated with the extraction and use of crabwood oil within a peri-urban riverine community; and (3) discern medicinal uses of the oil. The data were obtained using semistructured interviews with 13 community members involved in crabwood oil extraction and via direct observation. The process of oil extraction is divided into four stages: seed collection; cooking and resting of the seeds; shelling of the seeds and dough preparation; and oil collection. Oil extraction is carried out within the home for personal use, with surplus marketed within the community. More than 90% of the members of the community involved in extraction of crabwood oil highlighted the use of the oil to combat inflammation of the throat. Knowledge transfer occurs via oral transmission and through direct observation. PMID:27478479

  12. Design of Automatic Extraction Algorithm of Knowledge Points for MOOCs

    PubMed Central

    Chen, Haijian; Han, Dongmei; Zhao, Lina

    2015-01-01

    In recent years, Massive Open Online Courses (MOOCs) have become very popular among college students and have a powerful impact on academic institutions. In the MOOCs environment, knowledge discovery and knowledge sharing are very important, and they are currently often achieved by ontology techniques. In building an ontology, automatic extraction technology is crucial. Because general text mining algorithms are not very effective on online course material, we designed the automatic extraction of course knowledge points (AECKP) algorithm for online courses. It includes document classification, Chinese word segmentation, and POS tagging for each document. The Vector Space Model (VSM) is used to calculate similarity, and a weighting scheme is designed to optimize the TF-IDF output values; the terms with the highest scores are selected as knowledge points. Course documents of “C programming language” were selected for the experiment in this study. The results show that the proposed approach can achieve satisfactory accuracy and recall. PMID:26448738
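
    The sketch below illustrates the TF-IDF-based selection of candidate knowledge points; whitespace tokenization and the toy course snippets stand in for the Chinese word segmentation, POS tagging, and weighting scheme used by AECKP.

        from sklearn.feature_extraction.text import TfidfVectorizer

        course_docs = [
            "pointer arithmetic and memory allocation in C",
            "for loop while loop and control flow in C",
            "struct definition and pointer to struct in C",
        ]

        vectorizer = TfidfVectorizer()
        tfidf = vectorizer.fit_transform(course_docs)

        # Score each term by its maximum TF-IDF weight across documents and keep
        # the highest-scoring terms as candidate knowledge points.
        scores = tfidf.max(axis=0).toarray().ravel()
        terms = vectorizer.get_feature_names_out()
        print(sorted(zip(terms, scores), key=lambda x: -x[1])[:5])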

  13. Adaptive simplification of complex multiscale systems.

    PubMed

    Chiavazzo, Eliodoro; Karlin, Ilya

    2011-03-01

    A fully adaptive methodology is developed for reducing the complexity of large dissipative systems. This represents a significant step toward extracting essential physical knowledge from complex systems, by addressing the challenging problem of a minimal number of variables needed to exactly capture the system dynamics. Accurate reduced description is achieved, by construction of a hierarchy of slow invariant manifolds, with an embarrassingly simple implementation in any dimension. The method is validated with the autoignition of the hydrogen-air mixture where a reduction to a cascade of slow invariant manifolds is observed.

  14. Grist : grid-based data mining for astronomy

    NASA Technical Reports Server (NTRS)

    Jacob, Joseph C.; Katz, Daniel S.; Miller, Craig D.; Walia, Harshpreet; Williams, Roy; Djorgovski, S. George; Graham, Matthew J.; Mahabal, Ashish; Babu, Jogesh; Berk, Daniel E. Vanden

    2004-01-01

    The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the 'hyperatlas' project to serve high-resolution multi-wavelength imagery over the Internet. In support of these science and outreach objectives, the Grist framework will provide the enabling fabric to tie together distributed grid services in the areas of data access, federation, mining, subsetting, source extraction, image mosaicking, statistics, and visualization.

  15. Grist: Grid-based Data Mining for Astronomy

    NASA Astrophysics Data System (ADS)

    Jacob, J. C.; Katz, D. S.; Miller, C. D.; Walia, H.; Williams, R. D.; Djorgovski, S. G.; Graham, M. J.; Mahabal, A. A.; Babu, G. J.; vanden Berk, D. E.; Nichol, R.

    2005-12-01

    The Grist project is developing a grid-technology based system as a research environment for astronomy with massive and complex datasets. This knowledge extraction system will consist of a library of distributed grid services controlled by a workflow system, compliant with standards emerging from the grid computing, web services, and virtual observatory communities. This new technology is being used to find high redshift quasars, study peculiar variable objects, search for transients in real time, and fit SDSS QSO spectra to measure black hole masses. Grist services are also a component of the "hyperatlas" project to serve high-resolution multi-wavelength imagery over the Internet. In support of these science and outreach objectives, the Grist framework will provide the enabling fabric to tie together distributed grid services in the areas of data access, federation, mining, subsetting, source extraction, image mosaicking, statistics, and visualization.

  16. Identification of research hypotheses and new knowledge from scientific literature.

    PubMed

    Shardlow, Matthew; Batista-Navarro, Riza; Thompson, Paul; Nawaz, Raheel; McNaught, John; Ananiadou, Sophia

    2018-06-25

    Text mining (TM) methods have been used extensively to extract relations and events from the literature. In addition, TM techniques have been used to extract various types or dimensions of interpretative information, known as Meta-Knowledge (MK), from the context of relations and events, e.g. negation, speculation, certainty and knowledge type. However, most existing methods have focussed on the extraction of individual dimensions of MK, without investigating how they can be combined to obtain even richer contextual information. In this paper, we describe a novel, supervised method to extract new MK dimensions that encode Research Hypotheses (an author's intended knowledge gain) and New Knowledge (an author's findings). The method incorporates various features, including a combination of simple MK dimensions. We identify previously explored dimensions and then use a random forest to combine these with linguistic features into a classification model. To facilitate evaluation of the model, we have enriched two existing corpora annotated with relations and events, i.e., a subset of the GENIA-MK corpus and the EU-ADR corpus, by adding attributes to encode whether each relation or event corresponds to Research Hypothesis or New Knowledge. In the GENIA-MK corpus, these new attributes complement simpler MK dimensions that had previously been annotated. We show that our approach is able to assign different types of MK dimensions to relations and events with a high degree of accuracy. Firstly, our method is able to improve upon the previously reported state of the art performance for an existing dimension, i.e., Knowledge Type. Secondly, we also demonstrate high F1-score in predicting the new dimensions of Research Hypothesis (GENIA: 0.914, EU-ADR 0.802) and New Knowledge (GENIA: 0.829, EU-ADR 0.836). We have presented a novel approach for predicting New Knowledge and Research Hypothesis, which combines simple MK dimensions to achieve high F1-scores. The extraction of such information is valuable for a number of practical TM applications.
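
    A minimal sketch of the classification step is given below: simple meta-knowledge dimensions and a section cue are combined as features for a random forest. The feature encoding, toy examples, and labels are illustrative assumptions, not the annotated GENIA-MK or EU-ADR attributes.

        from sklearn.ensemble import RandomForestClassifier

        # Feature columns: [negated, speculated, certainty_level,
        #                   knowledge_type_code, in_results_section]
        X = [[0, 1, 1, 2, 0],   # hedged statement early in the paper
             [0, 0, 3, 1, 1],   # confident observation in the results
             [1, 1, 1, 2, 0],
             [0, 0, 3, 1, 1]]
        y = ["research_hypothesis", "new_knowledge",
             "research_hypothesis", "new_knowledge"]

        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
        print(clf.predict([[0, 0, 3, 1, 1]]))  # -> ['new_knowledge']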

  17. FY 1992-1993 RDT&E Descriptive Summaries: DARPA

    DTIC Science & Technology

    1991-02-01

    combining natural language and user workflow model information. * Determine effectiveness of auditory models as preprocessors for robust speech...for indexing and retrieving design knowledge. * Evaluate ability of message understanding systems to extract crisis-situation data from news wires...energy effects, underwater vehicles, neutrino detection, speech, tailored nuclear weapons, hypervelocity, nanosecond timing, and MAD/RPV. FY 1991 Planned

  18. Lynx: a database and knowledge extraction engine for integrative medicine

    PubMed Central

    Sulakhe, Dinanath; Balasubramanian, Sandhya; Xie, Bingqing; Feng, Bo; Taylor, Andrew; Wang, Sheng; Berrocal, Eduardo; Dave, Utpal; Xu, Jinbo; Börnigen, Daniela; Gilliam, T. Conrad; Maltsev, Natalia

    2014-01-01

    We have developed Lynx (http://lynx.ci.uchicago.edu)—a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces. PMID:24270788

  19. Mindtagger: A Demonstration of Data Labeling in Knowledge Base Construction.

    PubMed

    Shin, Jaeho; Ré, Christopher; Cafarella, Michael

    2015-08-01

    End-to-end knowledge base construction systems using statistical inference are enabling more people to automatically extract high-quality domain-specific information from unstructured data. As a result of deploying the DeepDive framework across several domains, we found new challenges in debugging and improving such end-to-end systems to construct high-quality knowledge bases. DeepDive has an iterative development cycle in which users improve the data. To help our users, we needed to develop principles for analyzing the system's errors as well as provide tooling for inspecting and labeling various data products of the system. We created guidelines for error analysis modeled after our colleagues' best practices, in which data labeling plays a critical role in every step of the analysis. To enable more productive and systematic data labeling, we created Mindtagger, a versatile tool that can be configured to support a wide range of tasks. In this demonstration, we show in detail which data labeling tasks are modeled in our error analysis guidelines and how each of them is performed using Mindtagger.

  20. Electronic health records (EHRs): supporting ASCO's vision of cancer care.

    PubMed

    Yu, Peter; Artz, David; Warner, Jeremy

    2014-01-01

    ASCO's vision for cancer care in 2030 is built on the expanding importance of panomics and big data, and envisions enabling better health for patients with cancer by the rapid transformation of systems biology knowledge into cancer care advances. This vision will be heavily dependent on the use of health information technology for computational biology and clinical decision support systems (CDSS). Computational biology will allow us to construct models of cancer biology that encompass the complexity of cancer panomics data and provide us with better understanding of the mechanisms governing cancer behavior. The Agency for Healthcare Research and Quality promotes CDSS based on clinical practice guidelines, which are knowledge bases that grow too slowly to match the rate of panomic-derived knowledge. CDSS that are based on systems biology models will be more easily adaptable to rapid advancements and translational medicine. We describe the characteristics of health data representation, a model for representing molecular data that supports data extraction and use for panomic-based clinical research, and argue for CDSS that are based on systems biology and are algorithm-based.

  1. KneeTex: an ontology-driven system for information extraction from MRI reports.

    PubMed

    Spasić, Irena; Zhao, Bo; Jones, Christopher B; Button, Kate

    2015-01-01

    In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and a wide range of conditions that may be associated with different anatomical entities. In this paper we describe KneeTex, an information extraction system that operates in this domain. As an ontology-driven information extraction system, KneeTex makes active use of an ontology to strongly guide and constrain text analysis. We used automatic term recognition to facilitate the development of a domain-specific ontology with sufficient detail and coverage for text mining applications. In combination with the ontology, high regularity of the sublanguage used in knee MRI reports allowed us to model its processing by a set of sophisticated lexico-semantic rules with minimal syntactic analysis. The main processing steps involve named entity recognition combined with coordination, enumeration, ambiguity and co-reference resolution, followed by text segmentation. Ontology-based semantic typing is then used to drive the template filling process. We adopted an existing ontology, TRAK (Taxonomy for RehAbilitation of Knee conditions), for use within KneeTex. The original TRAK ontology expanded from 1,292 concepts, 1,720 synonyms and 518 relationship instances to 1,621 concepts, 2,550 synonyms and 560 relationship instances. This provided KneeTex with a very fine-grained lexico-semantic knowledge base, which is highly attuned to the given sublanguage. Information extraction results were evaluated on a test set of 100 MRI reports. A gold standard consisted of 1,259 filled template records with the following slots: finding, finding qualifier, negation, certainty, anatomy and anatomy qualifier. KneeTex extracted information with precision of 98.00 %, recall of 97.63 % and F-measure of 97.81 %, the values of which are in line with human-like performance. KneeTex is an open-source, stand-alone application for information extraction from narrative reports that describe an MRI scan of the knee. Given an MRI report as input, the system outputs the corresponding clinical findings in the form of JavaScript Object Notation objects. The extracted information is mapped onto TRAK, an ontology that formally models knowledge relevant for the rehabilitation of knee conditions. As a result, formally structured and coded information allows for complex searches to be conducted efficiently over the original MRI reports, thereby effectively supporting epidemiologic studies of knee conditions.
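
    For illustration, a filled template record of the kind output by such a system could be serialized as a JSON object roughly as follows; the slot values below are invented, not taken from the evaluated reports.

        import json

        # Hypothetical filled template with the slots listed above.
        record = {
            "finding": "tear",
            "finding_qualifier": "complex",
            "negation": False,
            "certainty": "definite",
            "anatomy": "medial meniscus",
            "anatomy_qualifier": "posterior horn",
        }
        print(json.dumps(record, indent=2))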

  2. Generating disease-pertinent treatment vocabularies from MEDLINE citations.

    PubMed

    Wang, Liqin; Del Fiol, Guilherme; Bray, Bruce E; Haug, Peter J

    2017-01-01

    Healthcare communities have identified a significant need for disease-specific information. Disease-specific ontologies are useful in assisting the retrieval of disease-relevant information from various sources. However, building these ontologies is labor intensive. Our goal is to develop a system for the automated generation of disease-pertinent concepts from a popular knowledge resource for the building of disease-specific ontologies. A pipeline system was developed with an initial focus on generating disease-specific treatment vocabularies. It comprised the components of disease-specific citation retrieval, predication extraction, treatment predication extraction, treatment concept extraction, and relevance ranking. A semantic schema was developed to support the extraction of treatment predications and concepts. Four ranking approaches (i.e., occurrence, interest, degree centrality, and weighted degree centrality) were proposed to measure the relevance of treatment concepts to the disease of interest. We measured the performance of the four ranking approaches in terms of mean precision over the top 100 concepts for five diseases, as well as precision-recall curves against two reference vocabularies. The performance of the system was also compared to two baseline approaches. The pipeline system achieved a mean precision of 0.80 for the top 100 concepts with ranking by interest. There was no significant difference among the four ranking approaches (p=0.53). However, the pipeline-based system had significantly better performance than the two baselines. The pipeline system can be useful for the automated generation of disease-relevant treatment concepts from the biomedical literature. Copyright © 2016 Elsevier Inc. All rights reserved.
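
    As a small illustration of the degree-centrality ranking, the sketch below builds a graph from treatment predications and ranks the concepts it contains; the triples are invented examples, and this shows only one of the four relevance measures compared in the study.

        import networkx as nx

        # Hypothetical (subject, relation, object) predications extracted from
        # disease-specific citations.
        predications = [
            ("metformin", "TREATS", "type 2 diabetes"),
            ("insulin", "TREATS", "type 2 diabetes"),
            ("metformin", "COEXISTS_WITH", "insulin"),
        ]

        g = nx.Graph()
        for subj, _, obj in predications:
            g.add_edge(subj, obj)

        # Rank concepts by degree centrality.
        centrality = nx.degree_centrality(g)
        print(sorted(centrality.items(), key=lambda x: -x[1]))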

  3. Integrated system for automated financial document processing

    NASA Astrophysics Data System (ADS)

    Hassanein, Khaled S.; Wesolkowski, Slawo; Higgins, Ray; Crabtree, Ralph; Peng, Antai

    1997-02-01

    A system was developed that integrates intelligent document analysis with multiple character/numeral recognition engines in order to achieve high accuracy automated financial document processing. In this system, images are accepted in both their grayscale and binary formats. A document analysis module starts by extracting essential features from the document to help identify its type (e.g. personal check, business check, etc.). These features are also utilized to conduct a full analysis of the image to determine the location of interesting zones such as the courtesy amount and the legal amount. These fields are then made available to several recognition knowledge sources such as courtesy amount recognition engines and legal amount recognition engines through a blackboard architecture. This architecture allows all the available knowledge sources to contribute incrementally and opportunistically to the solution of the given recognition query. Performance results on a test set of machine printed business checks using the integrated system are also reported.

  4. Multiresolutional schemata for unsupervised learning of autonomous robots for 3D space operation

    NASA Technical Reports Server (NTRS)

    Lacaze, Alberto; Meystel, Michael; Meystel, Alex

    1994-01-01

    This paper describes a novel approach to the development of a learning control system for an autonomous space robot (ASR) which presents the ASR as a 'baby' -- that is, a system with no a priori knowledge of the world in which it operates, but with behavior acquisition techniques that allow it to build this knowledge from the experiences of actions within a particular environment (we will call it an Astro-baby). The learning techniques are rooted in the recursive algorithm for inductive generation of nested schemata molded from processes of early cognitive development in humans. The algorithm extracts data from the environment and, by means of correlation and abduction, creates schemata that are used for control. This system is robust enough to deal with a constantly changing environment because such changes provoke the creation of new schemata by generalizing from experiences, while still maintaining minimal computational complexity, thanks to the system's multiresolutional nature.

  5. A bioinformatics roadmap for the human vaccines project.

    PubMed

    Scheuermann, Richard H; Sinkovits, Robert S; Schenkelberg, Theodore; Koff, Wayne C

    2017-06-01

    Biomedical research has become a data intensive science in which high throughput experimentation is producing comprehensive data about biological systems at an ever-increasing pace. The Human Vaccines Project is a new public-private partnership, with the goal of accelerating development of improved vaccines and immunotherapies for global infectious diseases and cancers by decoding the human immune system. To achieve its mission, the Project is developing a Bioinformatics Hub as an open-source, multidisciplinary effort with the overarching goal of providing an enabling infrastructure to support the data processing, analysis and knowledge extraction procedures required to translate high throughput, high complexity human immunology research data into biomedical knowledge, to determine the core principles driving specific and durable protective immune responses.

  6. A logical model of cooperating rule-based systems

    NASA Technical Reports Server (NTRS)

    Bailin, Sidney C.; Moore, John M.; Hilberg, Robert H.; Murphy, Elizabeth D.; Bahder, Shari A.

    1989-01-01

    A model is developed to assist in the planning, specification, development, and verification of space information systems involving distributed rule-based systems. The model is based on an analysis of possible uses of rule-based systems in control centers. This analysis is summarized as a data-flow model for a hypothetical intelligent control center. From this data-flow model, the logical model of cooperating rule-based systems is extracted. This model consists of four layers of increasing capability: (1) communicating agents, (2) belief-sharing knowledge sources, (3) goal-sharing interest areas, and (4) task-sharing job roles.

  7. Cytotoxic and Cytolytic Cnidarian Venoms. A Review on Health Implications and Possible Therapeutic Applications

    PubMed Central

    Mariottini, Gian Luigi; Pane, Luigi

    2013-01-01

    The toxicity of Cnidaria is a subject of concern for its influence on human activities and public health. During the last decades, the mechanisms of cell injury caused by cnidarian venoms have been studied utilizing extracts from several Cnidaria that have been tested in order to evaluate some fundamental parameters, such as the activity on cell survival, functioning and metabolism, and to improve the knowledge about the mechanisms of action of these compounds. In agreement with the modern tendency to avoid the utilization of living animals in experiments and to substitute them with in vitro systems, established cell lines or primary cultures have been employed to test cnidarian extracts or derivatives. Several cnidarian venoms have been found to have cytotoxic properties and have also been shown to cause hemolytic effects. Some studied substances have been shown to affect tumour cells and microorganisms, making cnidarian extracts particularly interesting for their possible therapeutic employment. The review aims to emphasize the up-to-date knowledge about this subject, taking into consideration the importance of such venoms in human pathology, the health implications and the possible therapeutic application of these natural compounds. PMID:24379089

  8. Cytotoxic and cytolytic cnidarian venoms. A review on health implications and possible therapeutic applications.

    PubMed

    Mariottini, Gian Luigi; Pane, Luigi

    2013-12-27

    The toxicity of Cnidaria is a subject of concern for its influence on human activities and public health. During the last decades, the mechanisms of cell injury caused by cnidarian venoms have been studied utilizing extracts from several Cnidaria that have been tested in order to evaluate some fundamental parameters, such as the activity on cell survival, functioning and metabolism, and to improve the knowledge about the mechanisms of action of these compounds. In agreement with the modern tendency to avoid the utilization of living animals in experiments and to substitute them with in vitro systems, established cell lines or primary cultures have been employed to test cnidarian extracts or derivatives. Several cnidarian venoms have been found to have cytotoxic properties and have also been shown to cause hemolytic effects. Some studied substances have been shown to affect tumour cells and microorganisms, making cnidarian extracts particularly interesting for their possible therapeutic employment. The review aims to emphasize the up-to-date knowledge about this subject, taking into consideration the importance of such venoms in human pathology, the health implications and the possible therapeutic application of these natural compounds.

  9. Improvement of sand filter and constructed wetland design using an environmental decision support system.

    PubMed

    Turon, Clàudia; Comas, Joaquim; Torrens, Antonina; Molle, Pascal; Poch, Manel

    2008-01-01

    With the aim of improving effluent quality of waste stabilization ponds, different designs of vertical flow constructed wetlands and intermittent sand filters were tested on an experimental full-scale plant within the framework of a European project. The information extracted from this study was completed and updated with heuristic and bibliographic knowledge. The data and knowledge acquired were difficult to integrate into mathematical models because they involve qualitative information and expert reasoning. Therefore, it was decided to develop an environmental decision support system (EDSS-Filter-Design) as a tool to integrate mathematical models and knowledge-based techniques. This paper describes the development of this support tool, emphasizing the collection of data and knowledge and representation of this information by means of mathematical equations and a rule-based system. The developed support tool provides the main design characteristics of filters: (i) required surface, (ii) media type, and (iii) media depth. These design recommendations are based on wastewater characteristics, applied load, and required treatment level data provided by the user. The results of the EDSS-Filter-Design provide appropriate and useful information and guidelines on how to design filters, according to the expert criteria. The encapsulation of the information into a decision support system reduces the design period and provides a feasible, reasoned, and positively evaluated proposal.

  10. Challenges for automatically extracting molecular interactions from full-text articles.

    PubMed

    McIntosh, Tara; Curran, James R

    2009-09-24

    The increasing availability of full-text biomedical articles will allow more biomedical knowledge to be extracted automatically with greater reliability. However, most Information Retrieval (IR) and Extraction (IE) tools currently process only abstracts. The lack of corpora has limited the development of tools that are capable of exploiting the knowledge in full-text articles. As a result, there has been little investigation into the advantages of full-text document structure, and the challenges developers will face in processing full-text articles. We manually annotated passages from full-text articles that describe interactions summarised in a Molecular Interaction Map (MIM). Our corpus tracks the process of identifying facts to form the MIM summaries and captures any factual dependencies that must be resolved to extract the fact completely. For example, a fact in the results section may require a synonym defined in the introduction. The passages are also annotated with negated and coreference expressions that must be resolved. We describe the guidelines for identifying relevant passages and possible dependencies. The corpus includes 2162 sentences from 78 full-text articles. Our corpus analysis demonstrates the necessity of full-text processing; identifies the article sections where interactions are most commonly stated; and quantifies the proportion of interaction statements requiring coherent dependencies. Further, it allows us to report on the relative importance of identifying synonyms and resolving negated expressions. We also experiment with an oracle sentence retrieval system using the corpus as a gold-standard evaluation set. We introduce the MIM corpus, a unique resource that maps interaction facts in a MIM to annotated passages within full-text articles. It is an invaluable case study providing guidance to developers of biomedical IR and IE systems, and can be used as a gold-standard evaluation set for full-text IR tasks.

  11. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.

    PubMed

    Cameron, Delroy; Smith, Gary A; Daniulaityte, Raminta; Sheth, Amit P; Dave, Drashti; Chen, Lu; Anand, Gaurish; Carlson, Robert; Watkins, Kera Z; Falck, Russel

    2013-12-01

    The role of social media in biomedical knowledge mining, including clinical, medical and healthcare informatics, prescription drug abuse epidemiology and drug pharmacology, has become increasingly significant in recent years. Social media offers opportunities for people to share opinions and experiences freely in online communities, which may contribute information beyond the knowledge of domain professionals. This paper describes the development of a novel semantic web platform called PREDOSE (PREscription Drug abuse Online Surveillance and Epidemiology), which is designed to facilitate the epidemiologic study of prescription (and related) drug abuse practices using social media. PREDOSE uses web forum posts and domain knowledge, modeled in a manually created Drug Abuse Ontology (DAO--pronounced dow), to facilitate the extraction of semantic information from User Generated Content (UGC), through combination of lexical, pattern-based and semantics-based techniques. In a previous study, PREDOSE was used to obtain the datasets from which new knowledge in drug abuse research was derived. Here, we report on various platform enhancements, including an updated DAO, new components for relationship and triple extraction, and tools for content analysis, trend detection and emerging patterns exploration, which enhance the capabilities of the PREDOSE platform. Given these enhancements, PREDOSE is now more equipped to impact drug abuse research by alleviating traditional labor-intensive content analysis tasks. Using custom web crawlers that scrape UGC from publicly available web forums, PREDOSE first automates the collection of web-based social media content for subsequent semantic annotation. The annotation scheme is modeled in the DAO, and includes domain specific knowledge such as prescription (and related) drugs, methods of preparation, side effects, and routes of administration. The DAO is also used to help recognize three types of data, namely: (1) entities, (2) relationships and (3) triples. PREDOSE then uses a combination of lexical and semantic-based techniques to extract entities and relationships from the scraped content, and a top-down approach for triple extraction that uses patterns expressed in the DAO. In addition, PREDOSE uses publicly available lexicons to identify initial sentiment expressions in text, and then a probabilistic optimization algorithm (from related research) to extract the final sentiment expressions. Together, these techniques enable the capture of fine-grained semantic information, which facilitate search, trend analysis and overall content analysis using social media on prescription drug abuse. Moreover, extracted data are also made available to domain experts for the creation of training and test sets for use in evaluation and refinements in information extraction techniques. A recent evaluation of the information extraction techniques applied in the PREDOSE platform indicates 85% precision and 72% recall in entity identification, on a manually created gold standard dataset. In another study, PREDOSE achieved 36% precision in relationship identification and 33% precision in triple extraction, through manual evaluation by domain experts. Given the complexity of the relationship and triple extraction tasks and the abstruse nature of social media texts, we interpret these as favorable initial results. 
Extracted semantic information is currently in use in an online discovery support system by prescription drug abuse researchers at the Center for Interventions, Treatment and Addictions Research (CITAR) at Wright State University. A comprehensive platform for entity, relationship, triple and sentiment extraction from such abstruse texts has never before been developed for drug abuse research. PREDOSE has already demonstrated the importance of mining social media by providing data from which new findings in drug abuse research were uncovered. Given the recent platform enhancements, including the refined DAO, components for relationship and triple extraction, and tools for content, trend and emerging pattern analysis, it is expected that PREDOSE will play a significant role in advancing drug abuse epidemiology in the future. Copyright © 2013 Elsevier Inc. All rights reserved.
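
    The fragment below sketches the flavor of lexicon- and co-occurrence-based candidate triple extraction from forum posts; the drug and effect lexicons, the relation label, and the sentence-level heuristic are illustrative assumptions, whereas PREDOSE draws its entities and patterns from the Drug Abuse Ontology.

        import re

        DRUGS = {"loperamide", "kratom", "buprenorphine"}   # toy lexicon
        EFFECTS = {"withdrawal", "euphoria", "nausea"}       # toy lexicon

        def candidate_triples(post):
            """Propose (drug, 'associated_with', effect) triples when a drug
            term and an effect term occur in the same sentence."""
            triples = []
            for sentence in re.split(r"[.!?]", post.lower()):
                tokens = set(re.findall(r"[a-z]+", sentence))
                for drug in DRUGS & tokens:
                    for effect in EFFECTS & tokens:
                        triples.append((drug, "associated_with", effect))
            return triples

        print(candidate_triples("Kratom helped at first but the nausea was bad."))
        # -> [('kratom', 'associated_with', 'nausea')]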

  12. MMKG: An approach to generate metallic materials knowledge graph based on DBpedia and Wikipedia

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaoming; Liu, Xin; Li, Xin; Pan, Dongyu

    2017-02-01

    The research and development of metallic materials play an important role in today's society, and meanwhile a great deal of metallic materials knowledge is generated and made available on the Web (e.g., Wikipedia) for materials experts. However, due to the diversity and complexity of metallic materials knowledge, utilizing this knowledge can be inconvenient. The idea of a knowledge graph (e.g., DBpedia) provides a good way to organize the knowledge into a comprehensive entity network. Therefore, the motivation of our work is to generate a metallic materials knowledge graph (MMKG) using available knowledge on the Web. In this paper, an approach is proposed to build the MMKG based on DBpedia and Wikipedia. First, we use an algorithm based on directly linked sub-graph semantic distance (DLSSD) to preliminarily extract metallic materials entities from DBpedia according to some predefined seed entities; then, based on the results of the preliminary extraction, we use an algorithm that considers both semantic distance and string similarity (SDSS) to achieve the further extraction. Second, due to the absence of materials properties in DBpedia, we use an ontology-based method to extract properties knowledge from the HTML tables of corresponding Wikipedia Web pages to enrich the MMKG. A materials ontology is used to locate materials properties tables as well as to identify the structure of the tables. The proposed approach is evaluated by precision, recall, F1 and time performance, and the appropriate thresholds for the algorithms in our approach are determined through experiments. The experimental results show that our approach achieves the expected performance. A tool prototype has also been designed to facilitate the process of building the MMKG and to demonstrate the effectiveness of our approach.
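
    A very rough sketch of the seed-based entity expansion is shown below: candidate entities directly linked to a seed are accepted when their label is sufficiently similar to the seed label. The string-similarity score and threshold are crude stand-ins for the DLSSD and SDSS algorithms described above.

        from difflib import SequenceMatcher

        def expand_entities(seeds, linked, threshold=0.4):
            """Expand a set of seed entities with directly linked candidates
            whose labels resemble a seed label. `linked` maps an entity label
            to the set of labels it links to in the knowledge graph."""
            accepted = set(seeds)
            for seed in seeds:
                for candidate in linked.get(seed, set()):
                    score = SequenceMatcher(None, seed.lower(),
                                            candidate.lower()).ratio()
                    if score >= threshold:
                        accepted.add(candidate)
            return accepted

        print(expand_entities({"Stainless steel"},
                              {"Stainless steel": {"Carbon steel", "Chromium"}}))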

  13. Dynamic replanning of 3D automated reconstruction using situation graph trees and illumination adjustment

    NASA Astrophysics Data System (ADS)

    Kohler, Sophie; Far, Aïcha Beya; Hirsch, Ernest

    2007-01-01

    This paper presents an original approach for the optimal 3D reconstruction of manufactured workpieces based on a priori planning of the task, enhanced on-line through dynamic adjustment of the lighting conditions, and built around a cognitive intelligent sensory system using so-called Situation Graph Trees. The system explicitly takes into account structural knowledge related to image acquisition conditions, the type of illumination sources, the contents of the scene (e.g., CAD models and tolerance information), etc. The principle of the approach relies on two steps. First, a so-called initialization phase, leading to the a priori task plan, collects this structural knowledge. This knowledge is conveniently encoded, as a sub-part, in the Situation Graph Tree that forms the backbone of the planning system and exhaustively specifies the behavior of the application. Second, the image is iteratively evaluated under the control of this Situation Graph Tree. The information describing the quality of the workpiece under analysis is thus extracted and further exploited for, e.g., inspection tasks. Lastly, the approach enables dynamic adjustment of the Situation Graph Tree, allowing the system to adapt itself to the actual run-time conditions of the application and thus providing a self-learning capability.

  14. Crataegus special extract WS 1442: up-to-date review of experimental and clinical experiences.

    PubMed

    Zorniak, M; Szydlo, B; Krzeminski, T F

    2017-08-01

    Extracts and tinctures made from Crataegus spp. (Hawthorn) have been used as cardioprotective remedies since ancient times. The special extract WS 1442, manufactured by Dr. W. Schwabe Pharmaceuticals© and made from Crataegus spp. leaves and flowers, is one of the most studied and popular preparations obtained from Hawthorn. It is the integral and most important active component of herbal drugs such as Crataegutt® novo 450 and CardioMax®. This standardized extract contains 18.75% oligomeric procyanidins (OPC), which have beneficial cardioprotective properties and act as free-radical scavengers that protect ischemic heart tissue from the consequences of neutrophil elastase action. Moreover, WS 1442 also has proven vasorelaxant activity via effects on eNOS, and prevents ischemic heart tissue swelling by influencing calcium signaling pathways, thus restraining endothelial hyperpermeability. The actions of the WS 1442 special extract have been investigated in in vitro as well as in vivo studies, including large clinical trials. In this review, the authors present the current state of knowledge about the possible beneficial effects of the WS 1442 special extract on the cardiovascular system.

  15. [Bacopa Monnieri - activity and applications in medicine].

    PubMed

    Sokołowska, Lilianna; Bylka, Wiesława

    2015-01-01

    Medicinal preparations containing an extract of Bacopa monnieri have recently appeared. B. monnieri has been used in the Ayurvedic system of medicine for centuries, especially to enhance cognitive functions. Active compounds in this plant include numerous dammarane-type saponins, mainly bacosides A and B, as well as alkaloids and sterols. Pharmacological research on the extract of Bacopa monnieri supports the traditional uses of this plant. The results of clinical studies of the B. monnieri extract to date indicate a beneficial effect on memory, learning and concentration in adults and children, and an improvement in anxiety and depression after prolonged administration, although further clinical studies are needed to confirm the medical indications. The extract has been found to be well tolerated. The present review summarizes current knowledge of the mechanisms of action and also presents dietary supplements containing B. monnieri.

  16. Extracting nursing practice patterns from structured labor and delivery data sets.

    PubMed

    Hall, Eric S; Thornton, Sidney N

    2007-10-11

    This study was designed to demonstrate the feasibility of a computerized care process model that provides real-time case profiling and outcome forecasting. A methodology was defined for extracting nursing practice patterns from structured point-of-care data collected using the labor and delivery information system at Intermountain Healthcare. Data collected during January 2006 were retrieved from Intermountain Healthcare's enterprise data warehouse for use in the study. The knowledge discovery in databases process provided a framework for data analysis including data selection, preprocessing, data-mining, and evaluation. Development of an interactive data-mining tool and construction of a data model for stratification of patient records into profiles supported the goals of the study. Five benefits of the practice pattern extraction capability, which extend to other clinical domains, are listed with supporting examples.

  17. Rectification of elemental image set and extraction of lens lattice by projective image transformation in integral imaging.

    PubMed

    Hong, Keehoon; Hong, Jisoo; Jung, Jae-Hyun; Park, Jae-Hyeung; Lee, Byoungho

    2010-05-24

    We propose a new method for rectifying geometrical distortion in the elemental image set and extracting accurate lens lattice lines by projective image transformation. The information about distortion in the acquired elemental image set is found with the Hough transform algorithm. With this initial information about the distortions, the acquired elemental image set is rectified automatically, without prior knowledge of the characteristics of the pickup system, by a stratified image transformation procedure. Computer-generated elemental image sets with deliberately introduced distortion are used to verify the proposed rectification method. Experimentally captured elemental image sets are optically reconstructed before and after rectification by the proposed method. The experimental results support the validity of the proposed method, with high accuracy of image rectification and lattice extraction.
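
    The record above rests on two standard operations: detecting the lattice lines with a Hough transform and removing the projective distortion with a homography. A minimal OpenCV sketch of that generic pipeline is shown below; the file path, corner coordinates and target size are placeholders, and the paper's stratified transformation procedure is not reproduced.

        import cv2
        import numpy as np

        img = cv2.imread("elemental_image_set.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
        edges = cv2.Canny(img, 50, 150)

        # Detect candidate lattice lines with the standard Hough transform;
        # in a full pipeline the lattice corners would be derived from these lines.
        lines = cv2.HoughLines(edges, 1, np.pi / 180, 200)

        # Suppose the four outer corners of the distorted lattice were found
        # (hypothetical values below); map them to an axis-aligned rectangle
        # to undo the projective distortion.
        src = np.float32([[12, 18], [1010, 25], [1002, 760], [20, 752]])
        dst = np.float32([[0, 0], [1024, 0], [1024, 768], [0, 768]])
        H = cv2.getPerspectiveTransform(src, dst)
        rectified = cv2.warpPerspective(img, H, (1024, 768))
        cv2.imwrite("rectified.png", rectified)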

  18. Fuzzy Linguistic Knowledge Based Behavior Extraction for Building Energy Management Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dumidu Wijayasekara; Milos Manic

    2013-08-01

    A significant portion of world energy production is consumed by building Heating, Ventilation and Air Conditioning (HVAC) units. Thus, along with occupant comfort, energy efficiency is also an important factor in HVAC control. Modern buildings use advanced Multiple Input Multiple Output (MIMO) control schemes to realize these goals. However, since the performance of HVAC units depends on many criteria, including uncertainties in weather, the number of occupants, and thermal state, the performance of current state-of-the-art systems is sub-optimal. Furthermore, because of the large number of sensors in buildings and the high frequency of data collection, a large amount of information is available. Therefore, important behavior of buildings that compromises energy efficiency or occupant comfort is difficult to identify. This paper presents an easy-to-use and understandable framework for identifying such behavior. The presented framework uses a human-understandable knowledge base to extract important behavior of buildings and present it to users via a graphical user interface. The presented framework was tested on a building in the Pacific Northwest and was shown to be able to identify important behavior related to energy efficiency and occupant comfort.
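
    The framework described above encodes building-behavior knowledge as fuzzy linguistic rules. The short Python sketch below illustrates the general idea of evaluating one such rule with triangular membership functions; the variables, breakpoints and the rule itself are invented for illustration and are not taken from the paper's knowledge base.

        def tri(x, a, b, c):
            """Triangular membership function peaking at b over [a, c]."""
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x < b else (c - x) / (c - b)

        def rule_energy_waste(outdoor_temp_c, hvac_power_kw):
            """Degree to which 'outdoor temperature is mild AND HVAC power is high',
            a hypothetical rule flagging behavior that wastes energy."""
            mild_outdoor = tri(outdoor_temp_c, 10.0, 18.0, 26.0)
            high_power = tri(hvac_power_kw, 30.0, 60.0, 90.0)
            return min(mild_outdoor, high_power)   # min as the fuzzy AND operator

        # A firing strength close to 1 suggests behavior worth showing to the operator.
        print(rule_energy_waste(outdoor_temp_c=17.0, hvac_power_kw=58.0))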

  19. A Global Assessment on Climate Research Engaging Indigenous Knowledge Systems and Recommendations for Quality Standards of Research Practice in Indigenous Communities

    NASA Astrophysics Data System (ADS)

    Davíd-Chavez, D. M.; Gavin, M. C.

    2017-12-01

    Indigenous communities worldwide have maintained their own knowledge systems for millennia, informed by careful observation of the dynamics of environmental change. Withstanding centuries of challenges to their rights to maintain and practice these knowledge systems, Indigenous peoples continually speak to a need for quality standards for research in their communities. Although international and Indigenous peoples' working groups emphasize Indigenous knowledge systems, and the communities who hold them, as critical resources for understanding and adapting to climate change, there has yet to be a comprehensive, evidence-based analysis of how diverse knowledge systems are integrated in scientific studies. Do current research practices challenge or support Indigenous communities in their efforts to maintain and appropriately apply their knowledge systems? This study addresses this question using a systematic literature review and meta-analysis assessing levels of Indigenous community participation and decision-making in all stages of the research process (initiation, design, implementation, analysis, dissemination). Assessment is based on reported quality indicators such as: outputs that serve the community, ethical guidelines in practice (free, prior, and informed consent and intellectual property rights), and community access to findings. These indicators serve to identify patterns between levels of community participation and quality standards in practice. Meta-analysis indicates most climate studies practice an extractive model in which Indigenous knowledge systems are co-opted with minimal participation or decision-making authority from communities who hold them. Few studies report outputs that directly serve Indigenous communities, ethical guidelines in practice, or community access to findings. Studies reporting the most quality indicators were initiated in mutual agreement between Indigenous communities and outside researchers or by communities themselves. This study also draws from the researcher's experiences as an Indigenous scientist and includes recommendations for quality research practice. This global assessment provides an evidence base to inform our understanding of broader impacts related to research design.

  20. Rapid matching of stereo vision based on fringe projection profilometry

    NASA Astrophysics Data System (ADS)

    Zhang, Ruihua; Xiao, Yi; Cao, Jian; Guo, Hongwei

    2016-09-01

    Stereo matching is the most important core component of stereo vision, yet many problems in stereo matching technology remain to be solved. For smooth surfaces on which feature points are not easy to extract, this paper adds a projector to the stereo vision measurement system and applies fringe projection techniques: since corresponding points have the same phase in the left and right camera images, the phases extracted from the two images are used to realize rapid stereo matching. The mathematical model of the measurement system is established and the three-dimensional (3D) surface of the measured object is reconstructed. This measurement method not only broadens the application fields of optical 3D measurement technology and enriches knowledge in the field, but also provides a potential path toward commercialized measurement systems for practical projects, giving it significant scientific and economic value.
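
    As the record explains, correspondence is established by finding, in each image row, the right-camera pixel whose phase matches that of the left-camera pixel. A minimal NumPy sketch of that phase-matching idea under rectified-image assumptions is shown below (invented phase arrays, nearest-phase search), not the paper's full measurement pipeline.

        import numpy as np

        def match_row(phase_left_row, phase_right_row):
            """For each left pixel, return the right-column index with the closest phase.
            Assumes rectified images so correspondences stay within the same row."""
            # |phase_L[i] - phase_R[j]| for every pair (i, j) in the row
            diff = np.abs(phase_left_row[:, None] - phase_right_row[None, :])
            return np.argmin(diff, axis=1)

        # Toy unwrapped-phase rows standing in for real fringe-projection phase maps.
        phase_left = np.linspace(0.0, 20.0, 640)
        phase_right = np.linspace(0.5, 20.5, 640)   # shifted copy simulating disparity
        cols_right = match_row(phase_left, phase_right)
        disparity = np.arange(640) - cols_right
        print(disparity[:10])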

  1. The dynamics of discrete-time computation, with application to recurrent neural networks and finite state machine extraction.

    PubMed

    Casey, M

    1996-08-15

    Recurrent neural networks (RNNs) can learn to perform finite state computations. It is shown that an RNN performing a finite state computation must organize its state space to mimic the states in the minimal deterministic finite state machine that can perform that computation, and a precise description of the attractor structure of such systems is given. This knowledge effectively predicts activation space dynamics, which allows one to understand RNN computation dynamics in spite of complexity in activation dynamics. This theory provides a theoretical framework for understanding finite state machine (FSM) extraction techniques and can be used to improve training methods for RNNs performing FSM computations. This provides an example of a successful approach to understanding a general class of complex systems that has not been explicitly designed, e.g., systems that have evolved or learned their internal structure.
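
    A common family of FSM-extraction techniques of the kind this theory supports works by quantizing the RNN's hidden-state space (for example by clustering) and reading transitions off the quantized states. The Python sketch below illustrates that generic recipe on a toy recording of hidden states; it is not the paper's construction, and the clustering granularity and random activations are assumptions.

        import numpy as np
        from sklearn.cluster import KMeans

        # Toy recording: hidden states visited by an RNN while reading symbols 0/1.
        # hidden[t] is the state after consuming symbols[t]; hidden[0] is the start state.
        rng = np.random.default_rng(0)
        hidden = rng.normal(size=(200, 8))      # placeholder for real RNN activations
        symbols = rng.integers(0, 2, size=199)

        # 1) Quantize the continuous state space into a small number of discrete states.
        labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(hidden)

        # 2) Read off a transition table: (discrete state, input symbol) -> next discrete state.
        transitions = {}
        for t, sym in enumerate(symbols):
            transitions.setdefault((labels[t], int(sym)), labels[t + 1])

        print(sorted(transitions.items())[:5])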

  2. Dechlorination of DDT, DDD and DDE in soil (slurry) phase using magnesium/palladium system.

    PubMed

    Gautam, Sumit Kumar; Suresh, Sumathi

    2006-12-01

    Mg0/Pd4+ was able to dechlorinate >99% of extractable DDT (initial concentration of 10 mg DDT kg(-1) of soil) and >90% of extractable DDT (initial concentration of 50 mg DDT kg(-1) of soil) in soil slurry. Mg0/Pd4+ was also found to be effective in dechlorinating 50 mg kg(-1) of DDD and DDE in soil aged for varying time periods. GC-MS analyses revealed the formation of 1,1-diphenylethane as an end product from DDT, DDE and DDD. To the best of our knowledge, this is the first report describing the application of the Mg0/Pd4+ system for remediation of soil contaminated with DDT, DDD and DDE. We conclude that the reductive dechlorination reaction catalyzed by Mg0/Pd4+ may be a promising approach to remediate soil contaminated with DDT and its dechlorinated products such as DDD and DDE.

  3. A comparison of machine learning techniques for detection of drug target articles.

    PubMed

    Danger, Roxana; Segura-Bedmar, Isabel; Martínez, Paloma; Rosso, Paolo

    2010-12-01

    Important progress in treating diseases has been possible thanks to the identification of drug targets. Drug targets are the molecular structures whose abnormal activity, associated with a disease, can be modified by drugs, improving the health of patients. The pharmaceutical industry needs to prioritize their identification and validation in order to reduce long and costly drug development times. In the last two decades, our knowledge about drugs, their mechanisms of action and drug targets has rapidly increased. Nevertheless, most of this knowledge is hidden in millions of medical articles and textbooks. Extracting knowledge from this large amount of unstructured information is a laborious job, even for human experts. The identification of drug target articles, a crucial first step toward the automatic extraction of information from texts, is the aim of this paper. A comparison of several machine learning techniques has been performed in order to obtain a satisfactory classifier for detecting drug target articles using semantic information from biomedical resources such as the Unified Medical Language System. The best result has been achieved by a Fuzzy Lattice Reasoning classifier, which reaches a ROC area measure of 98%. Copyright © 2010 Elsevier Inc. All rights reserved.
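
    As an illustration of the kind of machine-learning comparison this record describes (not the paper's own features, which come from UMLS-based semantic resources, and not its Fuzzy Lattice Reasoning classifier), the sketch below compares two standard scikit-learn classifiers on a tiny invented text-classification task using ROC area.

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.model_selection import cross_val_score

        # Tiny invented corpus: 1 = abstract mentions a drug target, 0 = it does not.
        docs = [
            "inhibition of kinase X reduces tumor growth",
            "receptor Y is a promising target for antagonists",
            "blocking enzyme Z reverses the disease phenotype",
            "epidemiology of influenza in urban populations",
            "survey of hospital readmission rates",
            "cost analysis of outpatient care programs",
        ] * 5  # repeated so cross-validation has enough samples
        labels = [1, 1, 1, 0, 0, 0] * 5

        X = TfidfVectorizer().fit_transform(docs)
        for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
            auc = cross_val_score(clf, X, labels, cv=5, scoring="roc_auc").mean()
            print(type(clf).__name__, round(auc, 3))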

  4. OC-2-KB: A software pipeline to build an evidence-based obesity and cancer knowledge base.

    PubMed

    Lossio-Ventura, Juan Antonio; Hogan, William; Modave, François; Guo, Yi; He, Zhe; Hicks, Amanda; Bian, Jiang

    2017-11-01

    Obesity has been linked to several types of cancer. Access to adequate health information encourages people's participation in managing their own health, which ultimately improves their health outcomes. Nevertheless, the existing online information about the relationship between obesity and cancer is heterogeneous and poorly organized. A formal knowledge representation can help better organize and deliver quality health information. Currently, there are several efforts in the biomedical domain to convert unstructured data to structured data and store them in Semantic Web knowledge bases (KBs). In this demo paper, we present OC-2-KB (Obesity and Cancer to Knowledge Base), a system that is tailored to guide the automatic KB construction for managing obesity and cancer knowledge from free-text scientific literature (i.e., PubMed abstracts) in a systematic way. OC-2-KB has two important modules, which perform the acquisition of entities and the extraction and classification of relationships among these entities. We tested the OC-2-KB system on a data set with 23 manually annotated obesity and cancer PubMed abstracts and created a preliminary KB with 765 triples. We conducted a preliminary evaluation on this sample of triples and reported our evaluation results.

  5. PASTE: patient-centered SMS text tagging in a medication management system.

    PubMed

    Stenner, Shane P; Johnson, Kevin B; Denny, Joshua C

    2012-01-01

    To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages.
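
    The tagging approach in this record combines custom lexicons with pattern matching over patient text messages. The Python sketch below shows that general style of lexicon-driven tagging on an invented message; the lexicons and tag names are placeholders, not PASTE's actual resources.

        import re

        MED_NAMES = {"ibuprofen", "metformin", "lisinopril"}        # hypothetical medication lexicon
        ACTION_TERMS = {"took", "take", "missed", "skipped"}        # hypothetical action-term lexicon

        def tag_message(msg: str):
            """Return (token, tag) pairs for a patient SMS message."""
            tags = []
            for tok in re.findall(r"[a-zA-Z']+", msg.lower()):
                if tok in MED_NAMES:
                    tags.append((tok, "MED_NAME"))
                elif tok in ACTION_TERMS:
                    tags.append((tok, "ACTION"))
            return tags

        print(tag_message("Missed my metformin this morning, took ibuprofen for the headache"))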

  6. Semi-automated extraction of landslides in Taiwan based on SPOT imagery and DEMs

    NASA Astrophysics Data System (ADS)

    Eisank, Clemens; Hölbling, Daniel; Friedl, Barbara; Chen, Yi-Chin; Chang, Kang-Tsung

    2014-05-01

    The vast availability and improved quality of optical satellite data and digital elevation models (DEMs), as well as the need for complete and up-to-date landslide inventories at various spatial scales, have fostered the development of semi-automated landslide recognition systems. Among the tested approaches for designing such systems, object-based image analysis (OBIA) stood out as a highly promising methodology. OBIA offers a flexible, spatially enabled framework for effective landslide mapping. Most object-based landslide mapping systems, however, have been tailored to specific, mainly small-scale study areas or even to single landslides only. Even though reported mapping accuracies tend to be higher than for pixel-based approaches, accuracy values are still relatively low and depend on the particular study. There is still room to improve the applicability and objectivity of object-based landslide mapping systems. The presented study aims at developing a knowledge-based landslide mapping system implemented in an OBIA environment, i.e. Trimble eCognition. In comparison to previous knowledge-based approaches, the classification of segmentation-derived multi-scale image objects relies on digital landslide signatures. These signatures hold the common operational knowledge on digital landslide mapping, as reported by 25 Taiwanese landslide experts during personal semi-structured interviews. Specifically, the signatures include information on commonly used data layers, spectral and spatial features, and feature thresholds. The signatures guide the selection and implementation of mapping rules that were finally encoded in Cognition Network Language (CNL). Multi-scale image segmentation is optimized by using the improved Estimation of Scale Parameter (ESP) tool. The approach described above is developed and tested for mapping landslides in a sub-region of the Baichi catchment in Northern Taiwan based on SPOT imagery and a high-resolution DEM. An object-based accuracy assessment is conducted by quantitatively comparing extracted landslide objects with landslide polygons that were visually interpreted by local experts. The applicability and transferability of the mapping system are evaluated by comparing initial accuracies with those achieved for the following two tests: first, usage of a SPOT image from the same year, but for a different area within the Baichi catchment; second, usage of SPOT images from multiple years for the same region. The integration of the common knowledge via digital landslide signatures is new in object-based landslide studies. In combination with strategies to optimize image segmentation, this may lead to a more objective, transferable and stable knowledge-based system for the mapping of landslides from optical satellite data and DEMs.

  7. Describing knowledge encounters in healthcare: a mixed studies systematic review and development of a classification.

    PubMed

    Hurst, Dominic; Mickan, Sharon

    2017-03-14

    Implementation science seeks to promote the uptake of research and other evidence-based findings into practice, but for healthcare professionals, this is complex as practice draws on, in addition to scientific principles, rules of thumb and a store of practical wisdom acquired from a range of informational and experiential sources. The aims of this review were to identify sources of information and professional experiences encountered by healthcare workers and from this to build a classification system, for use in future observational studies, that describes influences on how healthcare professionals acquire and use information in their clinical practice. This was a mixed studies systematic review of observational studies. OVID MEDLINE, Embase and Google Scholar were searched using terms around information, knowledge or evidence and sharing, searching and utilisation, combined with terms relating to healthcare groups. Studies were eligible if one of the intentions was to identify information or experiential encounters by healthcare workers. Data were extracted by one author after piloting with another. Studies were assessed using the Mixed Methods Appraisal Tool (MMAT). The primary outcome extracted was the information source or professional experience encounter. Similar encounters were grouped together as single constructs. Our synthesis involved a mixed approach using the top-down logic of the Bliss Bibliographic Classification System (BC2) to generate classification categories and a bottom-up approach to develop descriptive codes (or "facets") for each category, from the data. The generic terms of BC2 were customised by an iterative process of thematic content analysis. Facets were developed by using available theory and keeping in mind the pragmatic end use of the classification. Eighty studies were included, from which 178 discrete knowledge encounters were extracted. Six classification categories were developed: what information or experience was encountered; how was the information or experience encountered; what was the mode of encounter; from whom did the information originate or with whom was the experience; how many participants were there; and where did the encounter take place. For each of these categories, relevant descriptive facets were identified. We have sought to identify and classify all knowledge encounters, and we have developed a faceted description of key categories which will support richer descriptions and interrogations of knowledge encounters in healthcare research.

  8. Text mining and its potential applications in systems biology.

    PubMed

    Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi

    2006-12-01

    With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.

  9. PEGylation of magnetic multi-walled carbon nanotubes for enhanced selectivity of dispersive solid phase extraction.

    PubMed

    Zeng, Qiong; Liu, Yi-Ming; Jia, Yan-Wei; Wan, Li-Hong; Liao, Xun

    2017-02-01

    Carbon nanotubes (CNTs) possess great potential as extraction sorbents in solid-phase extraction. They have been widely applied in biomedical research, whereas applications in natural product chemistry are rare. In this work, methoxypolyethylene glycol amine (mPEG-NH2) is covalently coupled to CNT-magnetic nanoparticles (CNTs-MNP) to prepare a novel magnetic nanocomposite (PEG-CNTs-MNP) for use as a dispersive solid-phase extraction (DSPE) sorbent. The average particle size was 86 nm, and the saturation magnetization was 52.30 emu/g. This nanocomposite exhibits excellent dispersibility in aqueous systems, high selectivity and fast binding kinetics when used for extraction of Z-ligustilide, the characteristic bioactive compound from two popular Asian herbal plants, R. chuanxiong and R. ligusticum. HPLC quantification of Z-ligustilide extracted from the standard sample solution showed a high recovery of 98.9%, and the extraction rate from the extracts of the above two herbs is around 70.0% for both. To our knowledge, this is the first report on using PEG-CNTs-MNP as a DSPE nanosorbent for selective extraction of natural products. This nanomaterial has promising applications in the isolation and enrichment of targeted components from complex matrices. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. A Foreign Object Damage Event Detector Data Fusion System for Turbofan Engines

    NASA Technical Reports Server (NTRS)

    Turso, James A.; Litt, Jonathan S.

    2004-01-01

    A Data Fusion System designed to provide a reliable assessment of the occurrence of Foreign Object Damage (FOD) in a turbofan engine is presented. The FOD-event feature-level fusion scheme combines knowledge of shifts in engine gas path performance, obtained using a Kalman filter, with bearing accelerometer signal features extracted via wavelet analysis, to positively identify a FOD event. A fuzzy inference system provides basic probability assignments (bpa), based on features extracted from the gas path analysis and bearing accelerometers, to a fusion algorithm based on the Dempster-Shafer-Yager Theory of Evidence. Details are provided on the wavelet transforms used to extract the foreign object strike features from the noisy data and on the Kalman filter-based gas path analysis. The system is demonstrated using a turbofan engine combined-effects model (CEM), which provides both gas path and rotor dynamic structural response and is suitable for rapid prototyping of control and diagnostic systems. The fusion of the disparate data can provide significantly more reliable detection of a FOD event than the use of either method alone. The use of fuzzy inference techniques combined with the Dempster-Shafer-Yager Theory of Evidence provides a theoretical justification for drawing conclusions based on imprecise or incomplete data.
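
    The fusion step in this record combines basic probability assignments from the gas-path and accelerometer analyses using a Dempster-Shafer-style rule of combination. The sketch below implements the classical Dempster rule for a two-hypothesis frame; the numeric masses are invented, and the Yager variant and the fuzzy inference front-end of the paper are not reproduced.

        from itertools import product

        # Frame of discernment: a FOD event did or did not occur.
        # Each bpa assigns mass to 'fod', 'no_fod', and the uncertain set 'theta'.
        def subsets(h):
            return {"fod", "no_fod"} if h == "theta" else {h}

        def combine(m1, m2):
            """Classical Dempster's rule of combination for two bpas over {fod, no_fod}."""
            combined, conflict = {"fod": 0.0, "no_fod": 0.0, "theta": 0.0}, 0.0
            for a, b in product(m1, m2):
                inter = subsets(a) & subsets(b)
                mass = m1[a] * m2[b]
                if not inter:
                    conflict += mass
                else:
                    key = "theta" if inter == {"fod", "no_fod"} else inter.pop()
                    combined[key] += mass
            return {k: v / (1.0 - conflict) for k, v in combined.items()}

        gas_path_bpa = {"fod": 0.6, "no_fod": 0.1, "theta": 0.3}   # hypothetical masses
        accel_bpa = {"fod": 0.7, "no_fod": 0.2, "theta": 0.1}
        print(combine(gas_path_bpa, accel_bpa))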

  11. The ability of in vitro antioxidant assays to predict the efficiency of a cod protein hydrolysate and brown seaweed extract to prevent oxidation in marine food model systems.

    PubMed

    Jónsdóttir, Rósa; Geirsdóttir, Margrét; Hamaguchi, Patricia Y; Jamnik, Polona; Kristinsson, Hordur G; Undeland, Ingrid

    2016-04-01

    The ability of different in vitro antioxidant assays to predict the efficiency of cod protein hydrolysate (CPH) and Fucus vesiculosus ethyl acetate extract (EA) towards lipid oxidation in haemoglobin-fortified washed cod mince and iron-containing cod liver oil emulsion was evaluated. The progression of oxidation was followed by sensory analysis, lipid hydroperoxides and thiobarbituric acid-reactive substances (TBARS) in both systems, as well as loss of redness and protein carbonyls in the cod system. The in vitro tests revealed high reducing capacity, high DPPH radical scavenging properties and a high oxygen radical absorbance capacity (ORAC) value of the EA which also inhibited lipid and protein oxidation in the cod model system. The CPH had a high metal chelating capacity and was efficient against oxidation in the cod liver oil emulsion. The results indicate that the F. vesiculosus extract has a potential as an excellent natural antioxidant against lipid oxidation in fish muscle foods while protein hydrolysates are more promising for fish oil emulsions. The usefulness of in vitro assays to predict the antioxidative properties of new natural ingredients in foods thus depends on the knowledge about the food systems, particularly the main pro-oxidants present. © 2015 Society of Chemical Industry.

  12. Toward a normalized clinical drug knowledge base in China-applying the RxNorm model to Chinese clinical drugs.

    PubMed

    Wang, Li; Zhang, Yaoyun; Jiang, Min; Wang, Jingqi; Dong, Jiancheng; Liu, Yun; Tao, Cui; Jiang, Guoqian; Zhou, Yi; Xu, Hua

    2018-07-01

    In recent years, electronic health record systems have been widely implemented in China, making clinical data available electronically. However, little effort has been devoted to making drug information exchangeable among these systems. This study aimed to build a Normalized Chinese Clinical Drug (NCCD) knowledge base, by applying and extending the information model of RxNorm to Chinese clinical drugs. Chinese drugs were collected from 4 major resources-China Food and Drug Administration, China Health Insurance Systems, Hospital Pharmacy Systems, and China Pharmacopoeia-for integration and normalization in NCCD. Chemical drugs were normalized using the information model in RxNorm without much change. Chinese patent drugs (i.e., Chinese herbal extracts), however, were represented using an expanded RxNorm model to incorporate the unique characteristics of these drugs. A hybrid approach combining automated natural language processing technologies and manual review by domain experts was then applied to drug attribute extraction, normalization, and further generation of drug names at different specification levels. Lastly, we reported the statistics of NCCD, as well as the evaluation results using several sets of randomly selected Chinese drugs. The current version of NCCD contains 16 976 chemical drugs and 2663 Chinese patent medicines, resulting in 19 639 clinical drugs, 250 267 unique concepts, and 2 602 760 relations. By manual review of 1700 chemical drugs and 250 Chinese patent drugs randomly selected from NCCD (about 10%), we showed that the hybrid approach could achieve an accuracy of 98.60% for drug name extraction and normalization. Using a collection of 500 chemical drugs and 500 Chinese patent drugs from other resources, we showed that NCCD achieved coverages of 97.0% and 90.0% for chemical drugs and Chinese patent drugs, respectively. Evaluation results demonstrated the potential to improve interoperability across various electronic drug systems in China.

  13. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing.

    PubMed

    Li, Chunhua; Zhao, Pengpeng; Sheng, Victor S; Xian, Xuefeng; Wu, Jian; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

  14. Overview of the Cancer Genetics and Pathway Curation tasks of BioNLP Shared Task 2013

    PubMed Central

    2015-01-01

    Background Since their introduction in 2009, the BioNLP Shared Task events have been instrumental in advancing the development of methods and resources for the automatic extraction of information from the biomedical literature. In this paper, we present the Cancer Genetics (CG) and Pathway Curation (PC) tasks, two event extraction tasks introduced in the BioNLP Shared Task 2013. The CG task focuses on cancer, emphasizing the extraction of physiological and pathological processes at various levels of biological organization, and the PC task targets reactions relevant to the development of biomolecular pathway models, defining its extraction targets on the basis of established pathway representations and ontologies. Results Six groups participated in the CG task and two groups in the PC task, together applying a wide range of extraction approaches including both established state-of-the-art systems and newly introduced extraction methods. The best-performing systems achieved F-scores of 55% on the CG task and 53% on the PC task, demonstrating a level of performance comparable to the best results achieved in similar previously proposed tasks. Conclusions The results indicate that existing event extraction technology can generalize to meet the novel challenges represented by the CG and PC task settings, suggesting that extraction methods are capable of supporting the construction of knowledge bases on the molecular mechanisms of cancer and the curation of biomolecular pathway models. The CG and PC tasks continue as open challenges for all interested parties, with data, tools and resources available from the shared task homepage. PMID:26202570

  15. Issues in knowledge representation to support maintainability: A case study in scientific data preparation

    NASA Technical Reports Server (NTRS)

    Chien, Steve; Kandt, R. Kirk; Roden, Joseph; Burleigh, Scott; King, Todd; Joy, Steve

    1992-01-01

    Scientific data preparation is the process of extracting usable scientific data from raw instrument data. This task involves noise detection (and subsequent noise classification and flagging or removal), extracting data from compressed forms, and construction of derivative or aggregate data (e.g. spectral densities or running averages). A software system called PIPE provides intelligent assistance to users developing scientific data preparation plans using a programming language called Master Plumber. PIPE provides this assistance capability by using a process description to create a dependency model of the scientific data preparation plan. This dependency model can then be used to verify syntactic and semantic constraints on processing steps to perform limited plan validation. PIPE also provides capabilities for using this model to assist in debugging faulty data preparation plans. In this case, the process model is used to focus the developer's attention upon those processing steps and data elements that were used in computing the faulty output values. Finally, the dependency model of a plan can be used to perform plan optimization and runtime estimation. These capabilities allow scientists to spend less time developing data preparation procedures and more time on scientific analysis tasks. Because the scientific data processing modules (called fittings) evolve to match scientists' needs, issues regarding maintainability are of prime importance in PIPE. This paper describes the PIPE system and describes how issues in maintainability affected the knowledge representation used in PIPE to capture knowledge about the behavior of fittings.
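
    The debugging capability described above relies on a dependency model that traces a faulty output back to the processing steps and data elements that produced it. The following Python sketch illustrates that generic idea with a toy dependency graph; the step and data names are invented and this is not Master Plumber or PIPE itself.

        # Toy dependency model: each data element maps to the step that produced it
        # and the input elements that step consumed.
        DEPENDENCIES = {
            "spectral_density": ("compute_fft", ["despiked_signal"]),
            "despiked_signal": ("remove_noise", ["raw_signal"]),
            "running_average": ("smooth", ["despiked_signal"]),
        }

        def trace(faulty_output):
            """Return the steps and data elements that could have contributed to a faulty output."""
            steps, data, stack = [], [], [faulty_output]
            while stack:
                element = stack.pop()
                if element in DEPENDENCIES:
                    step, inputs = DEPENDENCIES[element]
                    steps.append(step)
                    data.extend(inputs)
                    stack.extend(inputs)
            return steps, data

        print(trace("spectral_density"))   # focuses attention on compute_fft and remove_noise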

  16. Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems.

    PubMed

    Huang, Lifu; May, Jonathan; Pan, Xiaoman; Ji, Heng; Ren, Xiang; Han, Jiawei; Zhao, Lin; Hendler, James A

    2017-03-01

    The ability of automatically recognizing and typing entities in natural language without prior knowledge (e.g., predefined entity types) is a major challenge in processing such data. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework by combining symbolic and distributional semantics. We start from learning three types of representations for each entity mention: general semantic representation, specific context representation, and knowledge representation based on knowledge bases. Then we develop a novel joint hierarchical clustering and linking algorithm to type all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on genres (news and discussion forum) show comparable performance with state-of-the-art supervised typing systems trained from a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework.

  17. Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems

    PubMed Central

    Huang, Lifu; May, Jonathan; Pan, Xiaoman; Ji, Heng; Ren, Xiang; Han, Jiawei; Zhao, Lin; Hendler, James A.

    2017-01-01

    Abstract The ability of automatically recognizing and typing entities in natural language without prior knowledge (e.g., predefined entity types) is a major challenge in processing such data. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework by combining symbolic and distributional semantics. We start from learning three types of representations for each entity mention: general semantic representation, specific context representation, and knowledge representation based on knowledge bases. Then we develop a novel joint hierarchical clustering and linking algorithm to type all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on genres (news and discussion forum) show comparable performance with state-of-the-art supervised typing systems trained from a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework. PMID:28328252

  18. Gaussian Processes for Data-Efficient Learning in Robotics and Control.

    PubMed

    Deisenroth, Marc Peter; Fox, Dieter; Rasmussen, Carl Edward

    2015-02-01

    Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time-consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
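
    The approach in this record learns a probabilistic Gaussian process model of the system's transition dynamics from observed (state, action) to next-state data. A minimal scikit-learn sketch of fitting and querying such a GP transition model is given below (toy data and an RBF kernel as assumptions); it does not reproduce the paper's policy-search or planning machinery.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(0)

        # Toy transition data: inputs are (state, action), targets are the next state.
        states = rng.uniform(-1.0, 1.0, size=(60, 1))
        actions = rng.uniform(-0.5, 0.5, size=(60, 1))
        X = np.hstack([states, actions])
        y = (0.9 * states + 0.3 * actions).ravel() + 0.01 * rng.normal(size=60)

        # Non-parametric probabilistic dynamics model.
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel(1e-4))
        gp.fit(X, y)

        # Predictive mean and uncertainty at a new (state, action) pair; the uncertainty
        # is what model-based planning can exploit to avoid overconfident rollouts.
        mean, std = gp.predict(np.array([[0.2, 0.1]]), return_std=True)
        print(float(mean[0]), float(std[0]))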

  19. Building a knowledge base of severe adverse drug events based on AERS reporting data using semantic web technologies.

    PubMed

    Jiang, Guoqian; Wang, Liwei; Liu, Hongfang; Solbrig, Harold R; Chute, Christopher G

    2013-01-01

    A semantically coded knowledge base of adverse drug events (ADEs) with severity information is critical for clinical decision support systems and translational research applications. However it remains challenging to measure and identify the severity information of ADEs. The objective of the study is to develop and evaluate a semantic web based approach for building a knowledge base of severe ADEs based on the FDA Adverse Event Reporting System (AERS) reporting data. We utilized a normalized AERS reporting dataset and extracted putative drug-ADE pairs and their associated outcome codes in the domain of cardiac disorders. We validated the drug-ADE associations using ADE datasets from SIDe Effect Resource (SIDER) and the UMLS. We leveraged the Common Terminology Criteria for Adverse Event (CTCAE) grading system and classified the ADEs into the CTCAE in the Web Ontology Language (OWL). We identified and validated 2,444 unique Drug-ADE pairs in the domain of cardiac disorders, of which 760 pairs are in Grade 5, 775 pairs in Grade 4 and 2,196 pairs in Grade 3.

  20. Construction of phosphorylation interaction networks by text mining of full-length articles using the eFIP system.

    PubMed

    Tudor, Catalina O; Ross, Karen E; Li, Gang; Vijay-Shanker, K; Wu, Cathy H; Arighi, Cecilia N

    2015-01-01

    Protein phosphorylation is a reversible post-translational modification where a protein kinase adds a phosphate group to a protein, potentially regulating its function, localization and/or activity. Phosphorylation can affect protein-protein interactions (PPIs), abolishing interaction with previous binding partners or enabling new interactions. Extracting phosphorylation information coupled with PPI information from the scientific literature will facilitate the creation of phosphorylation interaction networks of kinases, substrates and interacting partners, toward knowledge discovery of functional outcomes of protein phosphorylation. Increasingly, PPI databases are interested in capturing the phosphorylation state of interacting partners. We have previously developed the eFIP (Extracting Functional Impact of Phosphorylation) text mining system, which identifies phosphorylated proteins and phosphorylation-dependent PPIs. In this work, we present several enhancements for the eFIP system: (i) text mining for full-length articles from the PubMed Central open-access collection; (ii) the integration of the RLIMS-P 2.0 system for the extraction of phosphorylation events with kinase, substrate and site information; (iii) the extension of the PPI module with new trigger words/phrases describing interactions and (iv) the addition of the iSimp tool for sentence simplification to aid in the matching of syntactic patterns. We enhance the website functionality to: (i) support searches based on protein roles (kinases, substrates, interacting partners) or using keywords; (ii) link protein entities to their corresponding UniProt identifiers if mapped and (iii) support visual exploration of phosphorylation interaction networks using Cytoscape. The evaluation of eFIP on full-length articles achieved 92.4% precision, 76.5% recall and 83.7% F-measure on 100 article sections. To demonstrate eFIP for knowledge extraction and discovery, we constructed phosphorylation-dependent interaction networks involving 14-3-3 proteins identified from cancer-related versus diabetes-related articles. Comparison of the phosphorylation interaction network of kinases, phosphoproteins and interactants obtained from eFIP searches, along with enrichment analysis of the protein set, revealed several shared interactions, highlighting common pathways discussed in the context of both diseases. © The Author(s) 2015. Published by Oxford University Press.

  1. Development of the IMB Model and an Evidence-Based Diabetes Self-management Mobile Application.

    PubMed

    Jeon, Eunjoo; Park, Hyeoun-Ae

    2018-04-01

    This study developed a diabetes self-management mobile application based on the information-motivation-behavioral skills (IMB) model, evidence extracted from clinical practice guidelines, and requirements identified through focus group interviews (FGIs) with diabetes patients. We developed a diabetes self-management (DSM) app in accordance with the following four stages of the system development life cycle. The functional and knowledge requirements of the users were extracted through FGIs with 19 diabetes patients. A system diagram, data models, a database, an algorithm, screens, and menus were designed. An Android app and server with an SSL protocol were developed. The DSM app algorithm and heuristics, as well as the usability of the DSM app were evaluated, and then the DSM app was modified based on heuristics and usability evaluation. A total of 11 requirement themes were identified through the FGIs. Sixteen functions and 49 knowledge rules were extracted. The system diagram consisted of a client part and server part, 78 data models, a database with 10 tables, an algorithm, and a menu structure with 6 main menus, and 40 user screens were developed. The DSM app was Android version 4.4 or higher for Bluetooth connectivity. The proficiency and efficiency scores of the algorithm were 90.96% and 92.39%, respectively. Fifteen issues were revealed through the heuristic evaluation, and the app was modified to address three of these issues. It was also modified to address five comments received by the researchers through the usability evaluation. The DSM app was developed based on behavioral change theory through IMB models. It was designed to be evidence-based, user-centered, and effective. It remains necessary to fully evaluate the effect of the DSM app on the DSM behavior changes of diabetes patients.

  2. Development of the IMB Model and an Evidence-Based Diabetes Self-management Mobile Application

    PubMed Central

    Jeon, Eunjoo

    2018-01-01

    Objectives This study developed a diabetes self-management mobile application based on the information-motivation-behavioral skills (IMB) model, evidence extracted from clinical practice guidelines, and requirements identified through focus group interviews (FGIs) with diabetes patients. Methods We developed a diabetes self-management (DSM) app in accordance with the following four stages of the system development life cycle. The functional and knowledge requirements of the users were extracted through FGIs with 19 diabetes patients. A system diagram, data models, a database, an algorithm, screens, and menus were designed. An Android app and server with an SSL protocol were developed. The DSM app algorithm and heuristics, as well as the usability of the DSM app were evaluated, and then the DSM app was modified based on heuristics and usability evaluation. Results A total of 11 requirement themes were identified through the FGIs. Sixteen functions and 49 knowledge rules were extracted. The system diagram consisted of a client part and server part, 78 data models, a database with 10 tables, an algorithm, and a menu structure with 6 main menus, and 40 user screens were developed. The DSM app was Android version 4.4 or higher for Bluetooth connectivity. The proficiency and efficiency scores of the algorithm were 90.96% and 92.39%, respectively. Fifteen issues were revealed through the heuristic evaluation, and the app was modified to address three of these issues. It was also modified to address five comments received by the researchers through the usability evaluation. Conclusions The DSM app was developed based on behavioral change theory through IMB models. It was designed to be evidence-based, user-centered, and effective. It remains necessary to fully evaluate the effect of the DSM app on the DSM behavior changes of diabetes patients. PMID:29770246

  3. Machine Learning for Knowledge Extraction from PHR Big Data.

    PubMed

    Poulymenopoulou, Michaela; Malamateniou, Flora; Vassilacopoulos, George

    2014-01-01

    Cloud computing, Internet of Things (IoT) and NoSQL database technologies can support a new generation of cloud-based PHR services that contain heterogeneous (unstructured, semi-structured and structured) patient data (health, social and lifestyle) from various sources, including automatically transmitted data from Internet-connected devices in the patient's living space (e.g. medical devices connected to patients at home care). The patient data stored in such PHR systems constitute big data, whose analysis with the use of appropriate machine learning algorithms is expected to improve diagnosis and treatment accuracy, to cut healthcare costs and, hence, to improve the overall quality and efficiency of healthcare provided. This paper describes a health data analytics engine which uses machine learning algorithms for analyzing cloud-based PHR big data for knowledge extraction, to support better healthcare delivery as regards disease diagnosis and prognosis. This engine comprises the data preparation, model generation and data analysis modules, and runs on the cloud taking advantage of the map/reduce paradigm provided by Apache Hadoop.
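
    The analytics engine in this record runs its analyses using the map/reduce paradigm over heterogeneous PHR records. The plain-Python sketch below imitates that paradigm in miniature, with a map step emitting key-value pairs and a reduce step aggregating them; the record fields are invented and no Hadoop API is used.

        from collections import defaultdict

        # Hypothetical PHR records mixing device readings and lifestyle entries.
        records = [
            {"patient": "p1", "type": "glucose", "value": 148},
            {"patient": "p1", "type": "glucose", "value": 132},
            {"patient": "p2", "type": "glucose", "value": 101},
        ]

        def map_phase(record):
            # Emit (key, value) pairs, as a Hadoop mapper would.
            yield (record["patient"], record["type"]), record["value"]

        def reduce_phase(key, values):
            # Aggregate all values for one key, as a reducer would.
            return key, sum(values) / len(values)

        shuffled = defaultdict(list)
        for rec in records:
            for key, value in map_phase(rec):
                shuffled[key].append(value)

        print([reduce_phase(k, v) for k, v in shuffled.items()])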

  4. Strategic Integration of Multiple Bioinformatics Resources for System Level Analysis of Biological Networks.

    PubMed

    D'Souza, Mark; Sulakhe, Dinanath; Wang, Sheng; Xie, Bing; Hashemifar, Somaye; Taylor, Andrew; Dubchak, Inna; Conrad Gilliam, T; Maltsev, Natalia

    2017-01-01

    Recent technological advances in genomics allow the production of biological data at unprecedented tera- and petabyte scales. Efficient mining of these vast and complex datasets for the needs of biomedical research critically depends on a seamless integration of the clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships. Such experimental data accumulated in publicly available databases should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. We present an integrated computational platform, Lynx (Sulakhe et al., Nucleic Acids Res 44:D882-D887, 2016) (http://lynx.cri.uchicago.edu), a web-based database and knowledge extraction engine. It provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization. It gives public access to the Lynx integrated knowledge base (LynxKB) and its analytical tools via user-friendly web services and interfaces. The Lynx service-oriented architecture supports annotation and analysis of high-throughput experimental data. Lynx tools assist the user in extracting meaningful knowledge from LynxKB and experimental data, and in the generation of weighted hypotheses regarding the genes and molecular mechanisms contributing to human phenotypes or conditions of interest. The goal of this integrated platform is to support the end-to-end analytical needs of various translational projects.

  5. Indicators and measurement tools for health system integration: a knowledge synthesis protocol.

    PubMed

    Oelke, Nelly D; Suter, Esther; da Silva Lima, Maria Alice Dias; Van Vliet-Brown, Cheryl

    2015-07-29

    Health system integration is a key component of health system reform with the goal of improving outcomes for patients, providers, and the health system. Although health systems continue to strive for better integration, current delivery of health services continues to be fragmented. A key gap in the literature is the lack of information on what successful integration looks like and how to measure achievement towards an integrated system. This multi-site study protocol builds on a prior knowledge synthesis completed by two of the primary investigators which identified 10 key principles that collectively support health system integration. The aim is to answer two research questions: What are appropriate indicators for each of the 10 key integration principles developed in our previous knowledge synthesis and what measurement tools are used to measure these indicators? To enhance generalizability of the findings, a partnership between Canada and Brazil was created as health system integration is a priority in both countries and they share similar contexts. This knowledge synthesis will follow an iterative scoping review process with emerging information from knowledge-user engagement leading to the refinement of research questions and study selection. This paper describes the methods for each phase of the study. Research questions were developed with stakeholder input. Indicator identification and prioritization will utilize a modified Delphi method and patient/user focus groups. Based on priority indicators, a search of the literature will be completed and studies screened for inclusion. Quality appraisal of relevant studies will be completed prior to data extraction. Results will be used to develop recommendations and key messages to be presented through integrated and end-of-grant knowledge translation strategies with researchers and knowledge-users from the three jurisdictions. This project will directly benefit policy and decision-makers by providing an easy accessible set of indicators and tools to measure health system integration across different contexts and cultures. Being able to evaluate the success of integration strategies and initiatives will lead to better health system design and improved health outcomes for patients.

  6. Semantic Analysis of Email Using Domain Ontologies and WordNet

    NASA Technical Reports Server (NTRS)

    Berrios, Daniel C.; Keller, Richard M.

    2005-01-01

    The problem of capturing and accessing knowledge in paper form has been supplanted by a problem of providing structure to vast amounts of electronic information. Systems that can construct semantic links for natural language documents like email messages automatically will be a crucial element of semantic email tools. We have designed an information extraction process that can leverage the knowledge already contained in an existing semantic web, recognizing references in email to existing nodes in a network of ontology instances by using linguistic knowledge and knowledge of the structure of the semantic web. We developed a heuristic score that uses several forms of evidence to detect references in email to existing nodes in the Semanticorganizer repository's network. While these scores cannot directly support automated probabilistic inference, they can be used to rank nodes by relevance and link those deemed most relevant to email messages.

  7. A novel probabilistic framework for event-based speech recognition

    NASA Astrophysics Data System (ADS)

    Juneja, Amit; Espy-Wilson, Carol

    2003-10-01

    One of the reasons for unsatisfactory performance of the state-of-the-art automatic speech recognition (ASR) systems is the inferior acoustic modeling of low-level acoustic-phonetic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal, but such a system for continuous speech recognition (CSR) is not known to exist. A probabilistic and statistical framework for CSR based on the idea of the representation of speech sounds by bundles of binary valued articulatory phonetic features is proposed. Multiple probabilistic sequences of linguistically motivated landmarks are obtained using binary classifiers of manner phonetic features-syllabic, sonorant and continuant-and the knowledge-based acoustic parameters (APs) that are acoustic correlates of those features. The landmarks are then used for the extraction of knowledge-based APs for source and place phonetic features and their binary classification. Probabilistic landmark sequences are constrained using manner class language models for isolated or connected word recognition. The proposed method could overcome the disadvantages encountered by the early acoustic-phonetic knowledge-based systems that led the ASR community to switch to systems highly dependent on statistical pattern analysis methods and probabilistic language or grammar models.

  8. Co-occurrence graphs for word sense disambiguation in the biomedical domain.

    PubMed

    Duque, Andres; Stevenson, Mark; Martinez-Romo, Juan; Araujo, Lourdes

    2018-05-01

    Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining more than 10% of accuracy improvement in some cases, while only requiring minimal external resources. Copyright © 2018 Elsevier B.V. All rights reserved.
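
    A minimal sketch of the graph-based idea, assuming a toy co-occurrence graph and an invented ambiguous term; it uses networkx's personalized PageRank rather than the authors' implementation or their PubMed-derived knowledge base.

```python
# Sketch of graph-based WSD with personalized PageRank (hypothetical concepts and weights).
import networkx as nx

# Co-occurrence graph: nodes are candidate medical concepts, edge weights are
# co-occurrence counts that would be harvested from scientific abstracts.
G = nx.Graph()
G.add_weighted_edges_from([
    ("cold_temperature", "hypothermia", 12),
    ("common_cold", "cough", 30),
    ("common_cold", "fever", 22),
    ("fever", "hypothermia", 3),
])

# Personalization vector: probability mass concentrated on the unambiguous context
# concepts found in the same abstract as the ambiguous term "cold".
context = {"cough": 0.5, "fever": 0.5}
personalization = {n: context.get(n, 0.0) for n in G.nodes}

scores = nx.pagerank(G, alpha=0.85, personalization=personalization, weight="weight")

# Choose the candidate sense with the highest score given this context.
candidates = ["cold_temperature", "common_cold"]
print(max(candidates, key=scores.get))
```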

  9. Knowledge Based Text Generation

    DTIC Science & Technology

    1989-08-01

    Number 4, October-December, 1985, pp. 219-242. de Joia, A. and Stenton, A., Terms in Linguistics: A Guide to Halliday, London: Batsford Academic and...extraction of text schemata and their corresponding rhetorical predicates; design of a system motivated by the desire for domain and language independence...semantics and semantics effects syntax. Functional Linguistic Framework Page 19 The design of GENNY was guided by the functional paradigm. Provided a

  10. Artificial Intelligence Project

    DTIC Science & Technology

    1990-01-01

    Artificial Intelligence Project at The University of Texas at Austin, University of Texas at Austin, Artificial Intelligence Laboratory AITR84-01. Novak...Texas at Austin, Artificial Intelligence Laboratory A187-52, April 1987. Novak, G. "GLISP: A Lisp-Based Programming System with Data Abstraction...of Texas at Austin, Artificial Intelligence Laboratory AITR85-14.) Rim, Hae-Chang, and Simmons, R. F. "Extracting Data Base Knowledge from Medical

  11. The Unified Medical Language System (UMLS): integrating biomedical terminology

    PubMed Central

    Bodenreider, Olivier

    2004-01-01

    The Unified Medical Language System (http://umlsks.nlm.nih.gov) is a repository of biomedical vocabularies developed by the US National Library of Medicine. The UMLS integrates over 2 million names for some 900 000 concepts from more than 60 families of biomedical vocabularies, as well as 12 million relations among these concepts. Vocabularies integrated in the UMLS Metathesaurus include the NCBI taxonomy, Gene Ontology, the Medical Subject Headings (MeSH), OMIM and the Digital Anatomist Symbolic Knowledge Base. UMLS concepts are not only inter-related, but may also be linked to external resources such as GenBank. In addition to data, the UMLS includes tools for customizing the Metathesaurus (MetamorphoSys), for generating lexical variants of concept names (lvg) and for extracting UMLS concepts from text (MetaMap). The UMLS knowledge sources are updated quarterly. All vocabularies are available at no fee for research purposes within an institution, but UMLS users are required to sign a license agreement. The UMLS knowledge sources are distributed on CD-ROM and by FTP. PMID:14681409

  12. The Unified Medical Language System (UMLS): integrating biomedical terminology.

    PubMed

    Bodenreider, Olivier

    2004-01-01

    The Unified Medical Language System (http://umlsks.nlm.nih.gov) is a repository of biomedical vocabularies developed by the US National Library of Medicine. The UMLS integrates over 2 million names for some 900,000 concepts from more than 60 families of biomedical vocabularies, as well as 12 million relations among these concepts. Vocabularies integrated in the UMLS Metathesaurus include the NCBI taxonomy, Gene Ontology, the Medical Subject Headings (MeSH), OMIM and the Digital Anatomist Symbolic Knowledge Base. UMLS concepts are not only inter-related, but may also be linked to external resources such as GenBank. In addition to data, the UMLS includes tools for customizing the Metathesaurus (MetamorphoSys), for generating lexical variants of concept names (lvg) and for extracting UMLS concepts from text (MetaMap). The UMLS knowledge sources are updated quarterly. All vocabularies are available at no fee for research purposes within an institution, but UMLS users are required to sign a license agreement. The UMLS knowledge sources are distributed on CD-ROM and by FTP.

  13. Reusing Design Knowledge Based on Design Cases and Knowledge Map

    ERIC Educational Resources Information Center

    Yang, Cheng; Liu, Zheng; Wang, Haobai; Shen, Jiaoqi

    2013-01-01

    Design knowledge was reused for innovative design work to support designers with product design knowledge and help designers who lack rich experiences to improve their design capacity and efficiency. First, based on the ontological model of product design knowledge constructed by taxonomy, implicit and explicit knowledge was extracted from some…

  14. A method of extracting ontology module using concept relations for sharing knowledge in mobile cloud computing environment.

    PubMed

    Lee, Keonsoo; Rho, Seungmin; Lee, Seok-Won

    2014-01-01

    In a mobile cloud computing environment, the cooperation of distributed computing objects is one of the most important requirements for providing successful cloud services. To satisfy this requirement, all the members who are employed in the cooperation group need to share knowledge for mutual understanding. Even though ontology can be the right tool for this goal, there are several issues in building the right ontology. As the cost and complexity of managing knowledge increase with the scale of the knowledge, reducing the size of the ontology is one of the critical issues. In this paper, we propose a method of extracting an ontology module to increase the utility of knowledge. For a given signature, this method extracts the ontology module, which is semantically self-contained to fulfill the needs of the service, by considering the syntactic structure and semantic relations of concepts. By employing this module, instead of the original ontology, the cooperation of computing objects can be performed with less computing load and complexity. In particular, when multiple external ontologies need to be combined for more complex services, this method can be used to optimize the size of the shared knowledge.

  15. Identification of phenolic compounds in red wine extract samples and zebrafish embryos by HPLC-ESI-LTQ-Orbitrap-MS.

    PubMed

    Vallverdú-Queralt, Anna; Boix, Nuria; Piqué, Ester; Gómez-Catalan, Jesús; Medina-Remon, Alexander; Sasot, Gemma; Mercader-Martí, Mercè; Llobet, Juan M; Lamuela-Raventos, Rosa M

    2015-08-15

    The zebrafish embryo is a highly interesting biological model with applications in different scientific fields, such as biomedicine, pharmacology and toxicology. In this study, we used liquid chromatography/electrospray ionisation-linear ion trap quadrupole-Orbitrap-mass spectrometry (HPLC/ESI-LTQ-Orbitrap-MS) to identify the polyphenol compounds in a red wine extract and zebrafish embryos. Phenolic compounds and anthocyanin metabolites were determined in zebrafish embryos previously exposed to the red wine extract. Compounds were identified by injection in a high-resolution system (LTQ-Orbitrap) using accurate mass measurements in MS, MS(2) and MS(3) modes. To our knowledge, this research constitutes the first comprehensive identification of phenolic compounds in zebrafish by HPLC coupled to high-resolution mass spectrometry. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Expediting analog design retargeting by design knowledge re-use and circuit synthesis: a practical example on a Delta-Sigma modulator

    NASA Astrophysics Data System (ADS)

    Webb, Matthew; Tang, Hua

    2016-08-01

    In the past decade or two, due to constant and rapid technology changes, analog design re-use, or design retargeting to newer technologies, has been brought to the table in order to expedite the design process and improve time-to-market. If properly conducted, analog design retargeting can significantly cut down the design cycle compared to designs starting from scratch. In this article, we present an empirical and general method for efficient analog design retargeting by design knowledge re-use and circuit synthesis (CS). The method first identifies the circuit blocks that compose the source system and extracts the performance parameter specifications of each circuit block. Then, for each circuit block, it scales the values of the design variables (DVs) from the source design to derive an initial design in the target technology. Depending on the performance of this initial target design, a design space is defined for synthesis. Subsequently, each circuit block is automatically synthesised using state-of-the-art analog synthesis tools based on a combination of global and local optimisation techniques to achieve performance specifications comparable to those extracted from the source system. Finally, the overall system is composed of the synthesised circuit blocks in the target technology. We illustrate the method using a practical example of a complex Delta-Sigma modulator (DSM) circuit.
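
    The design-variable scaling step can be pictured with a toy sketch; the block, parameter names, and the scaling rule below are illustrative assumptions, not the authors' actual retargeting procedure.

```python
# Illustrative scaling of design variables (DVs) from a source to a target technology node.
# The rule used here (geometries scale with minimum channel length, passives and biases
# are copied as a starting point) is an assumption made for this sketch.

def scale_design_variables(source_dvs, l_min_src, l_min_tgt):
    """Derive an initial target-technology design by scaling transistor geometries."""
    s = l_min_tgt / l_min_src               # geometric scaling factor between nodes
    target = {}
    for name, value in source_dvs.items():
        if name.startswith(("W_", "L_")):   # transistor widths/lengths scale with the node
            target[name] = value * s
        else:                               # capacitors, bias currents, etc. copied initially
            target[name] = value
    return target

# Hypothetical op-amp block migrated from a 180 nm to a 65 nm process.
source = {"W_in": 20e-6, "L_in": 0.36e-6, "C_comp": 2e-12, "I_bias": 50e-6}
print(scale_design_variables(source, 180e-9, 65e-9))
```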

  17. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines.

    PubMed

    Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua

    2017-11-24

    Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  18. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System.

    PubMed

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-02-20

    Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making processes in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), these approaches still cannot represent the complex relationships between devices when dynamic composition and collaboration occur, and they depend entirely on manual construction of a knowledge base, which limits scalability. In this paper, to address these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution with both ontological engineering methods, by extending SSN, and machine learning methods based on an entity linking (EL) model. To demonstrate the feasibility and performance, we evaluate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has a relatively high accuracy and that the time complexity is at a tolerable level. Advantages and disadvantages of SWoT4CPS, along with future work, are also discussed.

  19. First Extraction of Transversity from a Global Analysis of Electron-Proton and Proton-Proton Data

    NASA Astrophysics Data System (ADS)

    Radici, Marco; Bacchetta, Alessandro

    2018-05-01

    We present the first extraction of the transversity distribution in the framework of collinear factorization based on the global analysis of pion-pair production in deep-inelastic scattering and in proton-proton collisions with a transversely polarized proton. The extraction relies on the knowledge of dihadron fragmentation functions, which are taken from the analysis of electron-positron annihilation data. For the first time, the transversity is extracted from a global analysis similar to what is usually done for the spin-averaged and helicity distributions. The knowledge of transversity is important for, among other things, detecting possible signals of new physics in high-precision low-energy experiments.

  20. A quality score for coronary artery tree extraction results

    NASA Astrophysics Data System (ADS)

    Cao, Qing; Broersen, Alexander; Kitslaar, Pieter H.; Lelieveldt, Boudewijn P. F.; Dijkstra, Jouke

    2018-02-01

    Coronary artery trees (CATs) are often extracted to aid the fully automatic analysis of coronary artery disease on coronary computed tomography angiography (CCTA) images. Automatically extracted CATs often miss some arteries or include wrong extractions, which require manual corrections before performing successive steps. For analyzing a large number of datasets, a manual quality check of the extraction results is time-consuming. This paper presents a method to automatically calculate quality scores for extracted CATs in terms of the clinical significance of the extracted arteries and the completeness of the extracted CAT. Both right dominant (RD) and left dominant (LD) anatomical statistical models are generated and exploited in developing the quality score. To automatically determine which model should be used, a dominance type detection method is also designed. Experiments are performed on the automatically extracted and manually refined CATs from 42 datasets to evaluate the proposed quality score. In 39 (92.9%) cases, the proposed method assigns higher scores to the manually refined CATs than to the automatically extracted CATs. On a 100-point scale, the average scores for the automatically extracted and manually refined CATs are 82.0 (+/-15.8) and 88.9 (+/-5.4), respectively. The proposed quality score will assist the automatic processing of CAT extractions for large cohorts which contain both RD and LD cases. To the best of our knowledge, this is the first time that a general quality score for an extracted CAT has been presented.
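
    As a rough illustration of combining clinical significance with completeness, the sketch below scores an extracted tree against a fixed list of weighted segments; the segment names and weights are hypothetical and stand in for the paper's RD/LD anatomical statistical models.

```python
# Sketch of a completeness-weighted quality score for an extracted coronary artery tree.
# Segment names and clinical weights are hypothetical assumptions for this illustration.

EXPECTED_SEGMENTS = {           # weight ~ assumed clinical significance of the segment
    "LM": 10, "LAD": 10, "LCX": 8, "RCA": 10, "D1": 4, "OM1": 4, "PDA": 6,
}

def cat_quality_score(extracted_segments):
    """Return a 0-100 score: weighted fraction of expected segments that were extracted."""
    total = sum(EXPECTED_SEGMENTS.values())
    found = sum(w for seg, w in EXPECTED_SEGMENTS.items() if seg in extracted_segments)
    return 100.0 * found / total

# e.g. an incomplete automatic extraction that missed the circumflex territory.
print(cat_quality_score({"LM", "LAD", "RCA", "D1"}))
```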

  1. Enhanced nutraceutical potential of gamma irradiated black soybean extracts.

    PubMed

    Krishnan, Veda; Gothwal, Santosh; Dahuja, Anil; Vinutha, T; Singh, Bhupinder; Jolly, Monica; Praveen, Shelly; Sachdev, Archana

    2018-04-15

    Radiation processing of soybean varieties differing in seed coat colour was carried out at dose levels of 0.25, 0.5 and 1 kGy to evaluate their potential anti-proliferative and cytoprotective effects in an in vitro cell culture system. Irradiated and control black (Kalitur) and yellow (DS9712) soybean extracts were characterized in terms of total phenolics, flavonoids and anthocyanins, especially cyanidin-3-glucoside (C3G). Using an epithelial cell line, BEAS-2B, the potential cytoprotective effects of the soybean extracts were evaluated in terms of intracellular ROS levels and cell viability. The most relevant scavenging effect was found in Kalitur, with a 78% decrease in ROS, which correlated well with a 33% increase in C3G after a 1 kGy dose. The results evidenced a correspondence between in vitro antioxidant activity and a potential health property of black soybean extracts, exemplifying the nutraceutical role of C3G. To our knowledge, this study is the first report validating the cytoprotective effects of irradiated black soybean extracts. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Effects of Withania somnifera on Reproductive System: A Systematic Review of the Available Evidence

    PubMed Central

    Nazemyieh, Hossein; Fazljou, Seyed Mohammad Bagher; Nejatbakhsh, Fatemeh; Moini Jazani, Arezoo; Ahmadi AsrBadr, Yadollah

    2018-01-01

    Introduction Withania somnifera (WS), also known as ashwagandha, is a well-known medicinal plant used in traditional medicine in many countries for infertility treatment. The present study aimed to systematically review the therapeutic effects of WS on the reproductive system. Methods This systematic review study was designed in 2016. Required data were obtained from PubMed, Scopus, Google Scholar, Cochrane Library, Science Direct, Web of Knowledge, Web of Science, and manual search of articles, grey literature, reference checking, and expert contact. Results WS was found to improve reproductive system function in many ways. WS extract decreased infertility among male subjects due to improved semen quality, which is attributed to enhanced enzymatic activity in seminal plasma and decreased oxidative stress. Also, WS extract improved the balance of luteinizing hormone and follicle stimulating hormone, leading to folliculogenesis and increased gonadal weight, although some animal studies concluded that WS had reversible spermicidal and infertilizing effects in male subjects. Conclusion WS was found to enhance spermatogenesis and sperm-related indices in males and sexual behavior in females. However, given some available evidence of spermicidal features, further studies should focus on the extract preparation method and the dosage used in their study protocols. PMID:29670898

  3. A high-throughput solid-phase extraction microchip combined with inductively coupled plasma-mass spectrometry for rapid determination of trace heavy metals in natural water.

    PubMed

    Shih, Tsung-Ting; Hsieh, Cheng-Chuan; Luo, Yu-Ting; Su, Yi-An; Chen, Ping-Hung; Chuang, Yu-Chen; Sun, Yuh-Chang

    2016-04-15

    Herein, a hyphenated system combining a high-throughput solid-phase extraction (htSPE) microchip with inductively coupled plasma-mass spectrometry (ICP-MS) for rapid determination of trace heavy metals was developed. Rather than performing multiple analyses in parallel for the enhancement of analytical throughput, we improved the processing speed for individual samples by increasing the operation flow rate during SPE procedures. To this end, an innovative device combining a micromixer and a multi-channeled extraction unit was designed. Furthermore, a programmable valve manifold was used to interface the developed microchip and ICP-MS instrumentation in order to fully automate the system, leading to a dramatic reduction in operation time and human error. Under the optimized operation conditions for the established system, detection limits of 1.64-42.54 ng L(-1) for the analyte ions were achieved. Validation procedures demonstrated that the developed method could be satisfactorily applied to the determination of trace heavy metals in natural water. Each analysis could be readily accomplished within just 186 s using the established system. This represents, to the best of our knowledge, an unprecedented speed for the analysis of trace heavy metal ions. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Heterogeneous postsurgical data analytics for predictive modeling of mortality risks in intensive care units.

    PubMed

    Yun Chen; Hui Yang

    2014-01-01

    The rapid advancements of biomedical instrumentation and healthcare technology have resulted in data-rich environments in hospitals. However, the meaningful information extracted from these rich datasets is limited. There is a dire need to go beyond current medical practices and develop data-driven methods and tools that will enable and help (i) the handling of big data, (ii) the extraction of data-driven knowledge, and (iii) the exploitation of acquired knowledge for optimizing clinical decisions. The present study focuses on the prediction of mortality rates in Intensive Care Units (ICU) using patient-specific healthcare recordings. It is worth mentioning that postsurgical monitoring in the ICU leads to massive datasets with unique properties, e.g., variable heterogeneity, patient heterogeneity, and time asynchronization. To cope with the challenges in ICU datasets, we developed a postsurgical decision support system with a series of analytical tools, including data categorization, data pre-processing, feature extraction, feature selection, and predictive modeling. Experimental results show that the proposed data-driven methodology outperforms traditional approaches and yields better results based on the evaluation of real-world ICU data from 4000 subjects in the database. This research shows great potential for the use of data-driven analytics to improve the quality of healthcare services.
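
    A minimal sketch of the pre-processing, feature selection, and predictive modeling chain on synthetic data; the variable counts, imputation strategy, and classifier are assumptions and not the authors' decision support system.

```python
# Sketch of a preprocessing -> feature selection -> predictive modeling pipeline
# on synthetic stand-in data (the real ICU recordings are heterogeneous and asynchronous).
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 30))          # stand-in for postsurgical measurements
X[rng.random(X.shape) < 0.1] = np.nan    # simulate missing values
y = rng.integers(0, 2, size=4000)        # stand-in for mortality outcome

model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # data pre-processing
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=10)),       # feature selection
    ("clf", LogisticRegression(max_iter=1000)),     # predictive modeling
])

print(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
```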

  5. Spatial Knowledge Infrastructures - Creating Value for Policy Makers and Benefits the Community

    NASA Astrophysics Data System (ADS)

    Arnold, L. M.

    2016-12-01

    The spatial data infrastructure is arguably one of the most significant advancements in the spatial sector. It has been a game changer for governments, providing for the coordination and sharing of spatial data across organisations and the provision of accessible information to the broader community of users. Today, however, end-users such as policy-makers require far more from these spatial data infrastructures. They want more than just data; they want the knowledge that can be extracted from data, and they don't want to have to download, manipulate and process data in order to get the knowledge they seek. It's time for the spatial sector to reduce its focus on data in spatial data infrastructures and take a more proactive step in emphasising and delivering the knowledge value. Nowadays, decision-makers want to be able to query the data at will to meet their immediate need for knowledge. This is a new value proposal for the decision-making consumer and will require a shift in thinking. This paper presents a model for a Spatial Knowledge Infrastructure and underpinning methods that will realise a new real-time approach to delivering knowledge. The methods embrace the new capabilities afforded through the semantic web, domain and process ontologies, and natural language query processing. Semantic Web technologies today have the potential to transform the spatial industry into more than just a distribution channel for data. The Semantic Web RDF (Resource Description Framework) enables meaning to be drawn from data automatically. While pushing data out to end-users will remain a central role for data producers, the power of the semantic web is that end-users have the ability to marshal a broad range of spatial resources via a query to extract knowledge from available data. This can be done without actually having to configure systems specifically for the end-user. All data producers need do is make data accessible in RDF and the spatial analytics does the rest.
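
    A small sketch of the "query the data for knowledge" idea using rdflib and SPARQL; the ex: vocabulary, the parcels, and the flood-risk values are invented for illustration.

```python
# Sketch: spatial observations published as RDF, queried directly for a policy question.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/spatial#")   # hypothetical vocabulary for this sketch
g = Graph()
g.bind("ex", EX)

# A few hypothetical land-parcel observations a data producer might publish as RDF.
for pid, zone, flood_risk in [("p1", "residential", 0.8), ("p2", "industrial", 0.2),
                              ("p3", "residential", 0.6)]:
    parcel = EX[pid]
    g.add((parcel, RDF.type, EX.Parcel))
    g.add((parcel, EX.zone, Literal(zone)))
    g.add((parcel, EX.floodRisk, Literal(flood_risk)))

# A policy-maker's question expressed as a query rather than a data download:
# "which residential parcels have a flood risk above 0.5?"
query = """
PREFIX ex: <http://example.org/spatial#>
SELECT ?parcel ?risk WHERE {
  ?parcel a ex:Parcel ;
          ex:zone "residential" ;
          ex:floodRisk ?risk .
  FILTER (?risk > 0.5)
}
"""
for row in g.query(query):
    print(row.parcel, row.risk)
```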

  6. On the Automation of the MarkIII Data Analysis System.

    NASA Astrophysics Data System (ADS)

    Schwegmann, W.; Schuh, H.

    1999-03-01

    Faster and semiautomatic data analysis is an important contribution to the acceleration of the VLBI procedure. A concept for the automation of one of the most widely used VLBI software packages, the MarkIII Data Analysis System, was developed. Then, the program PWXCB, which extracts weather and cable calibration data from the station log-files, was automated by supplementing the existing Fortran77 program code. The new program XLOG and its results will be presented. Most of the tasks in the VLBI data analysis are very complex and their automation requires typical knowledge-based techniques. Thus, a knowledge-based system (KBS) for the support and guidance of the analyst is being developed using the AI workbench BABYLON, which is based on methods of artificial intelligence (AI). The advantages of a KBS for the MarkIII Data Analysis System and the steps required to build a KBS will be demonstrated. Examples of the current status of the project will be given, too.

  7. PASTE: patient-centered SMS text tagging in a medication management system

    PubMed Central

    Johnson, Kevin B; Denny, Joshua C

    2011-01-01

    Objective To evaluate the performance of a system that extracts medication information and administration-related actions from patient short message service (SMS) messages. Design Mobile technologies provide a platform for electronic patient-centered medication management. MyMediHealth (MMH) is a medication management system that includes a medication scheduler, a medication administration record, and a reminder engine that sends text messages to cell phones. The object of this work was to extend MMH to allow two-way interaction using mobile phone-based SMS technology. Unprompted text-message communication with patients using natural language could engage patients in their healthcare, but presents unique natural language processing challenges. The authors developed a new functional component of MMH, the Patient-centered Automated SMS Tagging Engine (PASTE). The PASTE web service uses natural language processing methods, custom lexicons, and existing knowledge sources to extract and tag medication information from patient text messages. Measurements A pilot evaluation of PASTE was completed using 130 medication messages anonymously submitted by 16 volunteers via a website. System output was compared with manually tagged messages. Results Verified medication names, medication terms, and action terms reached high F-measures of 91.3%, 94.7%, and 90.4%, respectively. The overall medication name F-measure was 79.8%, and the medication action term F-measure was 90%. Conclusion Other studies have demonstrated systems that successfully extract medication information from clinical documents using semantic tagging, regular expression-based approaches, or a combination of both approaches. This evaluation demonstrates the feasibility of extracting medication information from patient-generated medication messages. PMID:21984605

  8. Intelligent system for topic survey in MEDLINE by keyword recommendation and learning text characteristics.

    PubMed

    Tanaka, M; Nakazono, S; Matsuno, H; Tsujimoto, H; Kitamura, Y; Miyano, S

    2000-01-01

    We have implemented a system for assisting experts in selecting MEDLINE records for database construction purposes. This system has two specific features: the first is a learning mechanism which extracts characteristics in the abstracts of MEDLINE records of interest as patterns. These patterns reflect selection decisions by experts and are used for screening the records. The second is a keyword recommendation system which assists and supplements experts' knowledge in unexpected cases. Combined with a conventional keyword-based information retrieval system, this system may provide an efficient and comfortable environment for MEDLINE record selection by experts. Some computational experiments are provided to show that this idea is useful.

  9. Machine-aided indexing at NASA

    NASA Technical Reports Server (NTRS)

    Silvester, June P.; Genuardi, Michael T.; Klingbiel, Paul H.

    1994-01-01

    This report describes the NASA Lexical Dictionary (NLD), a machine-aided indexing system used online at the National Aeronautics and Space Administration's Center for AeroSpace Information (CASI). This system automatically suggests a set of candidate terms from NASA's controlled vocabulary for any designated natural language text input. The system comprises a text processor that is based on the computational, nonsyntactic analysis of input text and an extensive knowledge base that serves to recognize and translate text-extracted concepts. The functions of the various NLD system components are described in detail, and the production and quality benefits resulting from the implementation of machine-aided indexing at CASI are discussed.
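
    A toy sketch of the term-suggestion step: free text in, candidate controlled-vocabulary terms out. The mini phrase-to-term dictionary is invented; the actual NLD knowledge base is far larger and relies on nonsyntactic computational analysis rather than simple phrase matching.

```python
# Illustrative machine-aided indexing: suggest controlled-vocabulary terms for free text.
import re

# Hypothetical fragment of a lexical dictionary mapping text phrases to controlled terms.
LEXICAL_DICTIONARY = {
    "wind tunnel": "WIND TUNNEL TESTS",
    "turbulent boundary layer": "BOUNDARY LAYER TRANSITION",
    "composite material": "COMPOSITE MATERIALS",
}

def suggest_terms(text):
    """Return candidate controlled-vocabulary terms recognized in the input text."""
    text = text.lower()
    return sorted({term for phrase, term in LEXICAL_DICTIONARY.items()
                   if re.search(r"\b" + re.escape(phrase) + r"\b", text)})

abstract = "Wind tunnel measurements of a turbulent boundary layer over a composite material panel."
print(suggest_terms(abstract))
```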

  10. Total Protein Extraction for Metaproteomics Analysis of Methane Producing Biofilm: The Effects of Detergents

    PubMed Central

    Huang, Hung-Jen; Chen, Wei-Yu; Wu, Jer-Horng

    2014-01-01

    Protein recovery is crucial for shotgun metaproteomics to study the in situ functionality of microbial populations from complex biofilms, but it has so far been poorly addressed. To fill this knowledge gap, we systematically evaluated sample preparation with extraction buffers comprising four detergents for the metaproteomics analysis of a terephthalate-degrading methanogenic biofilm using an on-line two-dimensional liquid chromatography tandem mass spectrometry (2D-LC-MS/MS) system. In total, 1018 non-repeated proteins were identified with the four treatments. On the whole, each treatment could recover the biofilm proteins with specific distributions of molecular weight, hydrophobicity, and isoelectric point. The extraction buffers containing zwitterionic and anionic detergents were found to harvest the proteins with better efficiency and quality, allowing identification of up to 76.2% of the total identified proteins with the LC-MS/MS analysis. According to the annotation with a relevant metagenomic database, we further observed different taxonomic profiles of bacterial and archaeal members and discriminable patterns of functional expression among the extraction buffers used. Overall, the findings of the present study provide a first insight into the effect of detergents on the characteristics of extractable proteins from biofilm, and the developed protocol combined with nano 2D-LC/MS/MS analysis can improve metaproteomics studies on the microbial functionality of biofilms in wastewater treatment systems. PMID:24914765

  11. Dynamic edge warping - An experimental system for recovering disparity maps in weakly constrained systems

    NASA Technical Reports Server (NTRS)

    Boyer, K. L.; Wuescher, D. M.; Sarkar, S.

    1991-01-01

    Dynamic edge warping (DEW), a technique for recovering reasonably accurate disparity maps from uncalibrated stereo image pairs, is presented. No precise knowledge of the epipolar camera geometry is assumed. The technique is embedded in a system including structural stereopsis on the front end and robust estimation in digital photogrammetry on the other end, for the purpose of self-calibrating stereo image pairs. Once the relative camera orientation is known, the epipolar geometry is computed and the system can use this information to refine its representation of the object space. Such a system will find application in the autonomous extraction of terrain maps from stereo aerial photographs, for which camera position and orientation are unknown a priori, and for online autonomous calibration maintenance for robotic vision applications, in which the cameras are subject to vibration and other physical disturbances after calibration. This work thus forms a component of an intelligent system that begins with a pair of images and, having only vague knowledge of the conditions under which they were acquired, produces an accurate, dense, relative depth map. The resulting disparity map can also be used directly in some high-level applications involving qualitative scene analysis, spatial reasoning, and perceptual organization of the object space. The system as a whole substitutes high-level information and constraints for precise geometric knowledge in driving and constraining the early correspondence process.

  12. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

    PubMed Central

    Xian, Xuefeng; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost. PMID:28588611
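
    The rank-based selection idea can be sketched as follows, assuming a functional constraint on a capitalOf relation and invented confidence scores: spend limited crowd questions on candidate facts that are both uncertain and involved in the most constraint conflicts.

```python
# Sketch of selecting the most beneficial candidate facts for crowdsourcing.
# Facts, confidences, the constraint, and the benefit heuristic are illustrative assumptions.

# Candidate facts with extractor confidence.
facts = {
    ("Rome", "capitalOf", "Italy"): 0.9,
    ("Milan", "capitalOf", "Italy"): 0.4,
    ("Rome", "capitalOf", "France"): 0.2,
}

def conflicts(fact, all_facts):
    """Count facts clashing under an assumed functional constraint: a country has one
    capital, and a city is the capital of at most one country."""
    s, p, o = fact
    return sum(1 for (s2, p2, o2) in all_facts
               if (s2, p2, o2) != fact and p2 == p and (o2 == o or s2 == s))

def benefit(fact):
    """Benefit of asking ~ uncertainty of the fact * number of conflicts it participates in."""
    conf = facts[fact]
    return (1 - abs(conf - 0.5) * 2) * conflicts(fact, facts)

for fact in sorted(facts, key=benefit, reverse=True):
    print(fact, round(benefit(fact), 2))
```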

  13. EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.

    PubMed

    Boland, Mary Regina; Tu, Samson W; Carini, Simona; Sim, Ida; Weng, Chunhua

    2012-01-01

    Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models were developed but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation for temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME using an additional random sample of 20 eligibility criteria with temporal expressions that have no overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.

  14. Integrating Multiple On-line Knowledge Bases for Disease-Lab Test Relation Extraction.

    PubMed

    Zhang, Yaoyun; Soysal, Ergin; Moon, Sungrim; Wang, Jingqi; Tao, Cui; Xu, Hua

    2015-01-01

    A computable knowledge base containing relations between diseases and lab tests would be a great resource for many biomedical informatics applications. This paper describes our initial step towards establishing a comprehensive knowledge base of disease-lab test relations utilizing three public on-line resources. LabTestsOnline, MedlinePlus and Wikipedia are integrated to create a freely available, computable disease-lab test knowledge base. Disease and lab test concepts are identified using MetaMap, and relations between diseases and lab tests are determined based on source-specific rules. Experimental results demonstrate a high precision for relation extraction, with Wikipedia achieving the highest precision of 87%. Combining the three sources reached a recall of 51.40% when compared with a subset of disease-lab test relations extracted from a reference book. Moreover, we found additional disease-lab test relations from the on-line resources, indicating they are complementary to existing reference books for building a comprehensive disease-lab test relation knowledge base.

  15. Applications of artificial intelligence to digital photogrammetry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kretsch, J.L.

    1988-01-01

    The aim of this research was to explore the application of expert systems to digital photogrammetry, specifically to photogrammetric triangulation, feature extraction, and photogrammetric problem solving. In 1987, prototype expert systems were developed for doing system startup, interior orientation, and relative orientation in the mensuration stage. The system explored means of performing diagnostics during the process. In the area of feature extraction, the relationship of metric uncertainty to symbolic uncertainty was the topic of research. Error propagation through the Dempster-Shafer formalism for representing evidence was performed in order to find the variance in the calculated belief values due to errors in measurements made, together with the initial evidence needed to begin labeling observed image features with features in an object model. In photogrammetric problem solving, an expert system is under continuous development which seeks to solve photogrammetric problems using mathematical reasoning. The key to the approach used is the representation of knowledge directly in the form of equations, rather than in the form of if-then rules. Each variable in the equations is then treated as a goal to be solved.
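
    For context, the Dempster-Shafer combination step mentioned above works roughly as in the sketch below; the frame of discernment (road vs. river) and the mass assignments are invented for illustration.

```python
# A small sketch of Dempster's rule of combination over a hypothetical frame of
# image-feature labels; mass functions are dicts mapping frozensets to masses.
from itertools import product

def combine(m1, m2):
    """Dempster's rule: combine two mass functions, renormalizing away conflict."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb               # mass assigned to the empty set
    k = 1.0 - conflict                        # normalization constant
    return {s: w / k for s, w in combined.items()}

ROAD, RIVER = frozenset({"road"}), frozenset({"river"})
EITHER = ROAD | RIVER                         # ignorance: "road or river"

m_shape = {ROAD: 0.6, EITHER: 0.4}            # evidence from a linear-shape detector
m_tone  = {RIVER: 0.5, EITHER: 0.5}           # evidence from image tone

print(combine(m_shape, m_tone))
```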

  16. Knowledge mining from clinical datasets using rough sets and backpropagation neural network.

    PubMed

    Nahato, Kindie Biredagn; Harichandran, Khanna Nehemiah; Arputharaj, Kannan

    2015-01-01

    The availability of clinical datasets and knowledge mining methodologies encourages the researchers to pursue research in extracting knowledge from clinical datasets. Different data mining techniques have been used for mining rules, and mathematical models have been developed to assist the clinician in decision making. The objective of this research is to build a classifier that will predict the presence or absence of a disease by learning from the minimal set of attributes that has been extracted from the clinical dataset. In this work rough set indiscernibility relation method with backpropagation neural network (RS-BPNN) is used. This work has two stages. The first stage is handling of missing values to obtain a smooth data set and selection of appropriate attributes from the clinical dataset by indiscernibility relation method. The second stage is classification using backpropagation neural network on the selected reducts of the dataset. The classifier has been tested with hepatitis, Wisconsin breast cancer, and Statlog heart disease datasets obtained from the University of California at Irvine (UCI) machine learning repository. The accuracy obtained from the proposed method is 97.3%, 98.6%, and 90.4% for hepatitis, breast cancer, and heart disease, respectively. The proposed system provides an effective classification model for clinical datasets.
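
    A sketch of the two-stage shape of RS-BPNN on one of the UCI datasets mentioned above is given below; the rough-set indiscernibility reduct is approximated here by a mutual-information filter, so this illustrates the pipeline rather than the paper's actual attribute reduction.

```python
# Sketch: (1) reduce the attribute set, (2) classify with a backpropagation neural network.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

model = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=10),          # stand-in for the rough-set reduct
    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
)

print(cross_val_score(model, X, y, cv=5).mean())
```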

  17. Knowledge-based low-level image analysis for computer vision systems

    NASA Technical Reports Server (NTRS)

    Dhawan, Atam P.; Baxi, Himanshu; Ranganath, M. V.

    1988-01-01

    Two algorithms for entry-level image analysis and preliminary segmentation are proposed which are flexible enough to incorporate local properties of the image. The first algorithm involves pyramid-based multiresolution processing and a strategy to define and use interlevel and intralevel link strengths. The second algorithm, which is designed for selected window processing, extracts regions adaptively using local histograms. The preliminary segmentation and a set of features are employed as the input to an efficient rule-based low-level analysis system, resulting in suboptimal meaningful segmentation.

  18. DiMeX: A Text Mining System for Mutation-Disease Association Extraction.

    PubMed

    Mahmood, A S M Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K

    2016-01-01

    The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has been also evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases.
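
    The mutation-mention component can be pictured with a much-simplified sketch: a pair of regular expressions for protein substitution mentions. The patterns and example sentence are illustrative; DiMeX's own patterns and its gene-association step are considerably richer.

```python
# Illustrative extraction of protein substitution mentions (e.g. "V600E", "p.Arg175His").
import re

AA3 = "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
PATTERNS = [
    re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]\d{1,4}[ACDEFGHIKLMNPQRSTVWY]\b"),   # V600E
    re.compile(rf"\bp\.(?:{AA3})\d{{1,4}}(?:{AA3})\b"),                       # p.Arg175His
]

def find_mutations(text):
    hits = []
    for pat in PATTERNS:
        hits.extend(m.group(0) for m in pat.finditer(text))
    return hits

sentence = ("The BRAF V600E mutation and the TP53 p.Arg175His variant were associated "
            "with disease progression.")
print(find_mutations(sentence))
```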

  19. DiMeX: A Text Mining System for Mutation-Disease Association Extraction

    PubMed Central

    Mahmood, A. S. M. Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K.

    2016-01-01

    The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has been also evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases. PMID:27073839

  20. Knowledge Discovery from Databases: An Introductory Review.

    ERIC Educational Resources Information Center

    Vickery, Brian

    1997-01-01

    Introduces new procedures being used to extract knowledge from databases and discusses rationales for developing knowledge discovery methods. Methods are described for such techniques as classification, clustering, and the detection of deviations from pre-established norms. Examines potential uses of knowledge discovery in the information field.…

  1. End-User Evaluations of Semantic Web Technologies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCool, Rob; Cowell, Andrew J.; Thurman, David A.

    Stanford University's Knowledge Systems Laboratory (KSL) is working in partnership with Battelle Memorial Institute and IBM Watson Research Center to develop a suite of technologies for information extraction, knowledge representation & reasoning, and human-information interaction, collectively entitled 'Knowledge Associates for Novel Intelligence' (KANI). We have developed an integrated analytic environment composed of a collection of analyst associates, software components that aid the user at different stages of the information analysis process. An important part of our participatory design process has been to ensure our technologies and designs are tightly integrated with the needs and requirements of our end users. To this end, we perform a sequence of evaluations towards the end of the development process that ensure the technologies are both functional and usable. This paper reports on that process.

  2. System Architecture for Temporal Information Extraction, Representation and Reasoning in Clinical Narrative Reports

    PubMed Central

    Zhou, Li; Friedman, Carol; Parsons, Simon; Hripcsak, George

    2005-01-01

    Exploring temporal information in narrative Electronic Medical Records (EMRs) is essential and challenging. We propose an architecture for an integrated approach to process temporal information in clinical narrative reports. The goal is to initiate and build a foundation that supports applications which assist healthcare practice and research by including the ability to determine the time of clinical events (e.g., past vs. present). Key components include: (1) a temporal constraint structure for temporal expressions and the development of an associated tagger; (2) a Natural Language Processing (NLP) system for encoding and extracting medical events and associating them with formalized temporal data; (3) a post-processor, with a knowledge-based subsystem to help discover implicit information, that resolves temporal expressions and deals with issues such as granularity and vagueness; and (4) a reasoning mechanism which models clinical reports as Simple Temporal Problems (STPs). PMID:16779164
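
    As a sketch of the STP reasoning step, the snippet below encodes hypothetical clinical events and interval constraints as a distance graph and tightens it with Floyd-Warshall; a negative cycle would signal an inconsistent timeline. The events and bounds are invented.

```python
# Sketch of Simple Temporal Problem (STP) reasoning: interval constraints become edges in a
# distance graph, Floyd-Warshall computes the closure, negative self-distance = inconsistent.
INF = float("inf")
events = ["admission", "biopsy", "surgery"]   # hypothetical clinical events, times in days
n = len(events)
d = [[0 if i == j else INF for j in range(n)] for i in range(n)]

def add_constraint(i, j, lo, hi):
    """Constrain lo <= t[j] - t[i] <= hi."""
    d[i][j] = min(d[i][j], hi)
    d[j][i] = min(d[j][i], -lo)

add_constraint(0, 1, 1, 3)    # biopsy 1-3 days after admission
add_constraint(1, 2, 2, 10)   # surgery 2-10 days after biopsy
add_constraint(0, 2, 0, 7)    # surgery within a week of admission

# Floyd-Warshall closure over the distance graph.
for k in range(n):
    for i in range(n):
        for j in range(n):
            d[i][j] = min(d[i][j], d[i][k] + d[k][j])

print("consistent:", all(d[i][i] >= 0 for i in range(n)))
print("surgery occurs", -d[2][0], "to", d[0][2], "days after admission")
```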

  3. Mutual information, neural networks and the renormalization group

    NASA Astrophysics Data System (ADS)

    Koch-Janusz, Maciej; Ringel, Zohar

    2018-06-01

    Physical systems differing in their microscopic details often display strikingly similar behaviour when probed at macroscopic scales. Those universal properties, largely determining their physical characteristics, are revealed by the powerful renormalization group (RG) procedure, which systematically retains 'slow' degrees of freedom and integrates out the rest. However, the important degrees of freedom may be difficult to identify. Here we demonstrate a machine-learning algorithm capable of identifying the relevant degrees of freedom and executing RG steps iteratively without any prior knowledge about the system. We introduce an artificial neural network based on a model-independent, information-theoretic characterization of a real-space RG procedure, which performs this task. We apply the algorithm to classical statistical physics problems in one and two dimensions. We demonstrate RG flow and extract the Ising critical exponent. Our results demonstrate that machine-learning techniques can extract abstract physical concepts and consequently become an integral part of theory- and model-building.

  4. Large-scale extraction of accurate drug-disease treatment pairs from biomedical literature for drug repurposing

    PubMed Central

    2013-01-01

    Background A large-scale, highly accurate, machine-understandable drug-disease treatment relationship knowledge base is important for computational approaches to drug repurposing. The large body of published biomedical research articles and clinical case reports available on MEDLINE is a rich source of FDA-approved drug-disease indication as well as drug-repurposing knowledge that is crucial for applying FDA-approved drugs for new diseases. However, much of this information is buried in free text and not captured in any existing databases. The goal of this study is to extract a large number of accurate drug-disease treatment pairs from published literature. Results In this study, we developed a simple but highly accurate pattern-learning approach to extract treatment-specific drug-disease pairs from 20 million biomedical abstracts available on MEDLINE. We extracted a total of 34,305 unique drug-disease treatment pairs, the majority of which are not included in existing structured databases. Our algorithm achieved a precision of 0.904 and a recall of 0.131 in extracting all pairs, and a precision of 0.904 and a recall of 0.842 in extracting frequent pairs. In addition, we have shown that the extracted pairs strongly correlate with both drug target genes and therapeutic classes, therefore may have high potential in drug discovery. Conclusions We demonstrated that our simple pattern-learning relationship extraction algorithm is able to accurately extract many drug-disease pairs from the free text of biomedical literature that are not captured in structured databases. The large-scale, accurate, machine-understandable drug-disease treatment knowledge base that is resultant of our study, in combination with pairs from structured databases, will have high potential in computational drug repurposing tasks. PMID:23742147
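
    A toy version of pattern-based pair extraction, assuming two hand-written surface patterns and an invented sentence; the paper's learned patterns and MEDLINE-scale processing are of course far more extensive.

```python
# Illustrative pattern-based extraction of treatment-specific drug-disease pairs.
import re

PATTERNS = [
    re.compile(r"(?P<drug>[A-Z][a-z]+) (?:is|was) used to treat (?P<disease>[a-z][a-z ]+)"),
    re.compile(r"(?P<drug>[A-Z][a-z]+) in the treatment of (?P<disease>[a-z][a-z ]+)"),
]

def extract_pairs(text):
    pairs = set()
    for pat in PATTERNS:
        for m in pat.finditer(text):
            pairs.add((m.group("drug"), m.group("disease").strip()))
    return pairs

text = ("Metformin is used to treat hypertension. We evaluated Imatinib in the "
        "treatment of chronic myeloid leukemia.")
print(extract_pairs(text))
```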

  5. Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles.

    PubMed

    Xu, Rong; Wang, QuanQiu

    2015-02-01

    Anticancer drug-associated side effect knowledge often exists in multiple heterogeneous and complementary data sources. A comprehensive anticancer drug-side effect (drug-SE) relationship knowledge base is important for computation-based drug target discovery, drug toxicity prediction and drug repositioning. In this study, we present a two-step approach combining table classification and relationship extraction to extract drug-SE pairs from a large number of high-profile oncological full-text articles. The data consist of 31,255 tables downloaded from the Journal of Oncology (JCO). We first trained a statistical classifier to classify tables into SE-related and -unrelated categories. We then extracted drug-SE pairs from SE-related tables. We compared drug side effect knowledge extracted from JCO tables to that derived from FDA drug labels. Finally, we systematically analyzed relationships between anticancer drug-associated side effects and drug-associated gene targets, metabolism genes, and disease indications. The statistical table classifier is effective in classifying tables into SE-related and -unrelated categories (precision: 0.711; recall: 0.941; F1: 0.810). We extracted a total of 26,918 drug-SE pairs from SE-related tables with a precision of 0.605, a recall of 0.460, and an F1 of 0.520. Drug-SE pairs extracted from JCO tables are largely complementary to those derived from FDA drug labels; as many as 84.7% of the pairs extracted from JCO tables have not been included in a side effect database constructed from FDA drug labels. Side effects associated with anticancer drugs positively correlate with drug target genes, drug metabolism genes, and disease indications. Copyright © 2014 Elsevier Inc. All rights reserved.
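
    The F1 values quoted above are the harmonic mean of the reported precision and recall, which a quick check reproduces:

```python
# F1 as the harmonic mean of precision and recall, using the figures reported above.
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(round(f1(0.711, 0.941), 3))   # table classifier        -> ~0.810
print(round(f1(0.605, 0.460), 3))   # drug-SE pair extraction -> ~0.52
```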

  6. Knowledge and Policy: Research and Knowledge Transfer

    ERIC Educational Resources Information Center

    Ozga, Jenny

    2007-01-01

    Knowledge transfer (KT) is the emergent "third sector" of higher education activity--alongside research and teaching. Its commercialization origins are evidenced in its concerns to extract maximum value from research, and in the policy push to make research-based knowledge trapped in disciplinary silos more responsive to the growing…

  7. Effectiveness of Adaptive E-Learning Environments on Knowledge, Competence, and Behavior in Health Professionals and Students: Protocol for a Systematic Review and Meta-Analysis.

    PubMed

    Fontaine, Guillaume; Cossette, Sylvie; Maheu-Cadotte, Marc-André; Mailhot, Tanya; Deschênes, Marie-France; Mathieu-Dupuis, Gabrielle

    2017-07-05

    Adaptive e-learning environments (AEEs) can provide tailored instruction by adapting content, navigation, presentation, multimedia, and tools to each user's navigation behavior, individual objectives, knowledge, and preferences. AEEs can have various levels of complexity, ranging from systems using a simple adaptive functionality to systems using artificial intelligence. While AEEs are promising, their effectiveness for the education of health professionals and health professions students remains unclear. The purpose of this systematic review is to assess the effectiveness of AEEs in improving knowledge, competence, and behavior in health professionals and students. We will follow the Cochrane Collaboration and the Effective Practice and Organisation of Care (EPOC) Group guidelines on systematic review methodology. A systematic search of the literature will be conducted in 6 bibliographic databases (CINAHL, EMBASE, ERIC, PsycINFO, PubMed, and Web of Science) using the concepts "adaptive e-learning environments," "health professionals/students," and "effects on knowledge/skills/behavior." We will include randomized and nonrandomized controlled trials, in addition to controlled before-after, interrupted time series, and repeated measures studies published between 2005 and 2017. The title and the abstract of each study followed by a full-text assessment of potentially eligible studies will be independently screened by 2 review authors. Using the EPOC extraction form, 1 review author will conduct data extraction and a second author will validate the data extraction. The methodological quality of included studies will be independently assessed by 2 review authors using the EPOC risk of bias criteria. Included studies will be synthesized by a descriptive analysis. Where appropriate, data will be pooled using meta-analysis by applying the RevMan software version 5.1, considering the heterogeneity of studies. The review is in progress. We plan to submit the results in the beginning of 2018. Providing tailored instruction to health professionals and students is a priority in order to optimize learning and clinical outcomes. This systematic review will synthesize the best available evidence regarding the effectiveness of AEEs in improving knowledge, competence, and behavior in health professionals and students. It will provide guidance to policy makers, hospital managers, and researchers in terms of AEE development, implementation, and evaluation in health care. ©Guillaume Fontaine, Sylvie Cossette, Marc-André Maheu-Cadotte, Tanya Mailhot, Marie-France Deschênes, Gabrielle Mathieu-Dupuis. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 05.07.2017.

  8. Effectiveness of Adaptive E-Learning Environments on Knowledge, Competence, and Behavior in Health Professionals and Students: Protocol for a Systematic Review and Meta-Analysis

    PubMed Central

    Cossette, Sylvie; Maheu-Cadotte, Marc-André; Mailhot, Tanya; Deschênes, Marie-France; Mathieu-Dupuis, Gabrielle

    2017-01-01

    Background Adaptive e-learning environments (AEEs) can provide tailored instruction by adapting content, navigation, presentation, multimedia, and tools to each user’s navigation behavior, individual objectives, knowledge, and preferences. AEEs can have various levels of complexity, ranging from systems using a simple adaptive functionality to systems using artificial intelligence. While AEEs are promising, their effectiveness for the education of health professionals and health professions students remains unclear. Objective The purpose of this systematic review is to assess the effectiveness of AEEs in improving knowledge, competence, and behavior in health professionals and students. Methods We will follow the Cochrane Collaboration and the Effective Practice and Organisation of Care (EPOC) Group guidelines on systematic review methodology. A systematic search of the literature will be conducted in 6 bibliographic databases (CINAHL, EMBASE, ERIC, PsycINFO, PubMed, and Web of Science) using the concepts “adaptive e-learning environments,” “health professionals/students,” and “effects on knowledge/skills/behavior.” We will include randomized and nonrandomized controlled trials, in addition to controlled before-after, interrupted time series, and repeated measures studies published between 2005 and 2017. The title and the abstract of each study followed by a full-text assessment of potentially eligible studies will be independently screened by 2 review authors. Using the EPOC extraction form, 1 review author will conduct data extraction and a second author will validate the data extraction. The methodological quality of included studies will be independently assessed by 2 review authors using the EPOC risk of bias criteria. Included studies will be synthesized by a descriptive analysis. Where appropriate, data will be pooled using meta-analysis by applying the RevMan software version 5.1, considering the heterogeneity of studies. Results The review is in progress. We plan to submit the results in the beginning of 2018. Conclusions Providing tailored instruction to health professionals and students is a priority in order to optimize learning and clinical outcomes. This systematic review will synthesize the best available evidence regarding the effectiveness of AEEs in improving knowledge, competence, and behavior in health professionals and students. It will provide guidance to policy makers, hospital managers, and researchers in terms of AEE development, implementation, and evaluation in health care. Trial Registration PROSPERO International Prospective Register of Systematic Reviews: CRD42017065585; https://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42017065585 (Archived by WebCite® at http://www.webcitation.org/6rXGdDwf4) PMID:28679491

  9. Knowledge discovery and system biology in molecular medicine: an application on neurodegenerative diseases.

    PubMed

    Fattore, Matteo; Arrigo, Patrizio

    2005-01-01

    The possibility of studying an organism in terms of systems theory has been proposed in the past, but only the advancement of molecular biology techniques allows us to investigate the dynamical properties of a biological system in a more quantitative and rational way than before. These new techniques give only a basic-level view of an organism's functionality; comprehension of its dynamical behaviour depends on the ability to perform a multiple-level analysis. Functional genomics has stimulated interest in investigating the dynamical behaviour of an organism as a whole. These activities are commonly known as systems biology, and their interests range from molecules to organs. One of the more promising applications is disease modeling. The use of experimental models is a common procedure in pharmacological and clinical research; today this approach is supported by in silico predictive methods, and the investigation can be improved by a combination of experimental and computational tools. Machine learning (ML) tools are able to process heterogeneous data sources; taking this peculiarity into account, they can be fruitfully applied to support multilevel data processing (molecular, cellular and morphological), which is the prerequisite for formal model design, and they allow us to extract the knowledge needed for mathematical model development. The aim of our work is the development and implementation of a system that combines ML with dynamical model simulation. The program is addressed to the virtual analysis of the pathways involved in neurodegenerative diseases. These pathologies are multifactorial, and the relevance of the different factors has not yet been well elucidated. This is a very complex task; in order to test the integrative approach, our program has been limited to the analysis of the effects of a specific protein, cyclin-dependent kinase 5 (CDK5), which is involved in the induction of neuronal apoptosis. The system has a modular structure centred on a textual knowledge discovery approach; text mining is the only way to enhance the capability to extract, from multiple data sources, the information required for the dynamical simulator. The user may access the publicly available modules through the following site: http://biocomp.ge.ismac.cnr.it.

  10. Comparison of ambient solvent extraction methods for the analysis of fatty acids in non-starch lipids of flour and starch

    PubMed Central

    Bahrami, Niloufar; Yonekura, Lina; Linforth, Robert; Carvalho da Silva, Margarida; Hill, Sandra; Penson, Simon; Chope, Gemma; Fisk, Ian Denis

    2014-01-01

    BACKGROUND Lipids are minor components of flours, but are major determinants of baking properties and end-product quality. To the best of our knowledge, there is no single solvent system currently known that efficiently extracts all non-starch lipids from all flours without the risk of chemical, mechanical or thermal damage. This paper compares nine ambient solvent systems (monophasic and biphasic) with varying polarities: Bligh and Dyer (BD); modified Bligh and Dyer using HCl (BDHCL); modified BD using NaCl (BDNaCl); methanol–chloroform–hexane (3:2:1, v/v); Hara and Radin (hexane–isopropanol, 3:2, v/v); water-saturated n-butanol; chloroform; methanol and hexane for their ability to extract total non-starch lipids (separated by lipid classes) from wheat flour (Triticum aestivum L.). Seven ambient extraction protocols were further compared for their ability to extract total non-starch lipids from three alternative samples: barley flour (Hordeum vulgare L.), maize starch (Zea mays L.) and tapioca starch (Manihot esculenta Crantz). RESULTS For wheat flour the original BD method and those containing HCl or NaCl tended to extract the maximum lipid and a significant correlation between lipid extraction yield (especially the glycolipids and phospholipids) and the polarity of the solvent was observed. For the wider range of samples BD and BD HCl repeatedly offered the maximum extraction yield and using pooled standardized (by sample) data from all flours, total non-starch lipid extraction yield was positively correlated with solvent polarity (r = 0.5682, P < 0.05) and water ratio in the solvent mixture (r = 0.5299, P < 0.05). CONCLUSION In general, BD-based methods showed better extraction yields compared to methods without the addition of water and, most interestingly, there was much greater method dependence of lipid yields in the starches when compared to the flour samples, which is due to the differences in lipid profiles between the two sample types (flours and starches). PMID:24132804

  11. Quantitative knowledge acquisition for expert systems

    NASA Technical Reports Server (NTRS)

    Belkin, Brenda L.; Stengel, Robert F.

    1991-01-01

    A common problem in the design of expert systems is the definition of rules from data obtained in system operation or simulation. While it is relatively easy to collect data and to log the comments of human operators engaged in experiments, generalizing such information to a set of rules has not previously been a direct task. A statistical method is presented for generating rule bases from numerical data, motivated by an example based on aircraft navigation with multiple sensors. The specific objective is to design an expert system that selects a satisfactory suite of measurements from a dissimilar, redundant set, given an arbitrary navigation geometry and possible sensor failures. The systematic development of a Navigation Sensor Management (NSM) Expert System from Kalman filter covariance data is described. The method invokes two statistical techniques: Analysis of Variance (ANOVA) and the ID3 algorithm. The ANOVA technique indicates whether variations of problem parameters give statistically different covariance results, and the ID3 algorithm identifies the relationships between the problem parameters using probabilistic knowledge extracted from a simulation example set. Both are detailed.
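
    As a rough, hypothetical illustration of the general idea described above (decision-tree rule induction from simulated outcomes), the sketch below induces a tree from an invented table of navigation scenarios and prints it as rules. It uses scikit-learn's CART trees as a stand-in for ID3, and the feature names, labels, and thresholds are not from the original work.

        # Hypothetical sketch: induce selection rules from simulated covariance outcomes.
        # Stand-in for the ANOVA + ID3 pipeline described above; features and labels are invented.
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(0)
        X = np.column_stack([
            rng.integers(0, 4, 500),        # navigation geometry class
            rng.integers(0, 3, 500),        # number of simulated sensor failures
            rng.uniform(0.1, 5.0, 500),     # covariance trace from the simulated filter
        ])
        # Invented rule standing in for "statistically different covariance results".
        y = np.where((X[:, 2] < 1.5) & (X[:, 1] == 0), "suite_A", "suite_B")

        tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
        print(export_text(tree, feature_names=["geometry", "failures", "cov_trace"]))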

  12. Extracting Useful Semantic Information from Large Scale Corpora of Text

    ERIC Educational Resources Information Center

    Mendoza, Ray Padilla, Jr.

    2012-01-01

    Extracting and representing semantic information from large scale corpora is at the crux of computer-assisted knowledge generation. Semantic information depends on collocation extraction methods, mathematical models used to represent distributional information, and weighting functions which transform the space. This dissertation provides a…

  13. Filtering large-scale event collections using a combination of supervised and unsupervised learning for event trigger classification.

    PubMed

    Mehryary, Farrokh; Kaewphan, Suwisa; Hakala, Kai; Ginter, Filip

    2016-01-01

    Biomedical event extraction is one of the key tasks in biomedical text mining, supporting various applications such as database curation and hypothesis generation. Several systems, some of which have been applied at a large scale, have been introduced to solve this task. Past studies have shown that the identification of the phrases describing biological processes, also known as trigger detection, is a crucial part of event extraction, and notable overall performance gains can be obtained by solely focusing on this sub-task. In this paper we propose a novel approach for filtering falsely identified triggers from large-scale event databases, thus improving the quality of knowledge extraction. Our method relies on state-of-the-art word embeddings, event statistics gathered from the whole biomedical literature, and both supervised and unsupervised machine learning techniques. We focus on EVEX, an event database covering the whole PubMed and PubMed Central Open Access literature containing more than 40 million extracted events. The most frequent EVEX trigger words are hierarchically clustered, and the resulting cluster tree is pruned to identify words that can never act as triggers regardless of their context. For rarely occurring trigger words, we introduce a supervised approach trained on the combination of trigger word classification produced by the unsupervised clustering method and manual annotation. The method is evaluated on the official test set of the BioNLP Shared Task on Event Extraction. The evaluation shows that the method can be used to improve the performance of state-of-the-art event extraction systems. This successful effort also translates into removing 1,338,075 potentially incorrect events from EVEX, thus greatly improving the quality of the data. The method is not solely bound to the EVEX resource and can thus be used to improve the quality of any event extraction system or database. The data and source code for this work are available at: http://bionlp-www.utu.fi/trigger-clustering/.
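
    As a loose, hypothetical sketch of the clustering step described above (not the authors' code), the snippet below hierarchically clusters toy word vectors with SciPy, cuts the tree, and flags clusters containing no known trigger word as candidates for filtering; the word list, vectors, trigger set, and cut threshold are all invented.

        # Hypothetical sketch of hierarchical clustering of trigger-word embeddings.
        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster

        words = ["expression", "regulation", "binding", "protein", "cell", "mouse"]
        known_triggers = {"expression", "regulation", "binding"}          # assumed gold triggers
        vectors = np.random.default_rng(1).normal(size=(len(words), 50))  # stand-in embeddings

        Z = linkage(vectors, method="average", metric="cosine")
        labels = fcluster(Z, t=0.8, criterion="distance")                 # cut the cluster tree

        for cluster_id in sorted(set(labels)):
            members = [w for w, c in zip(words, labels) if c == cluster_id]
            if not known_triggers.intersection(members):
                # Words in clusters with no known trigger are candidates for removal.
                print("candidate non-trigger cluster:", members)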

  14. Symmetrical compression distance for arrhythmia discrimination in cloud-based big-data services.

    PubMed

    Lillo-Castellano, J M; Mora-Jiménez, I; Santiago-Mozos, R; Chavarría-Asso, F; Cano-González, A; García-Alberola, A; Rojo-Álvarez, J L

    2015-07-01

    The current development of cloud computing is completely changing the paradigm of data knowledge extraction in huge databases. An example of this technology in the cardiac arrhythmia field is the SCOOP platform, a national-level scientific cloud-based big data service for implantable cardioverter defibrillators. In this scenario, we propose a new methodology for automatic classification of intracardiac electrograms (EGMs) in a cloud computing system, designed for minimal signal preprocessing. A new compression-based similarity measure (CSM), the so-called weighted fast compression distance, is created for low computational burden and provides better performance when compared with other CSMs in the literature. Using simple machine learning techniques, a set of 6848 EGMs extracted from the SCOOP platform was classified into seven cardiac arrhythmia classes and one noise class, reaching nearly 90% accuracy when previous patient arrhythmia information was available and 63% otherwise, hence exceeding in all cases the classification provided by the majority class. Results show that this methodology can be used as a high-quality cloud computing service, providing support to physicians for improving knowledge of patient diagnosis.
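
    The weighted fast compression distance itself is not specified in this record; as background, one widely used CSM from the literature, the classic normalized compression distance (NCD), can be sketched in a few lines (the signal bytes below are invented).

        # Classic normalized compression distance, shown for illustration only;
        # this is not the paper's weighted fast compression distance.
        import zlib

        def ncd(x: bytes, y: bytes) -> float:
            cx, cy = len(zlib.compress(x)), len(zlib.compress(y))
            cxy = len(zlib.compress(x + y))
            return (cxy - min(cx, cy)) / max(cx, cy)

        # Toy electrogram-like byte sequences: similar signals compress well together.
        a = bytes([10, 12, 15, 12, 10] * 40)
        b = bytes([10, 13, 15, 12, 10] * 40)
        c = bytes([200, 3, 90, 7, 250] * 40)
        print(ncd(a, b), ncd(a, c))   # the first distance should be noticeably smaller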

  15. In Vivo Anti-Candida Activity of Phenolic Extracts and Compounds: Future Perspectives Focusing on Effective Clinical Interventions

    PubMed Central

    Martins, Natália; Barros, Lillian; Henriques, Mariana; Silva, Sónia; Ferreira, Isabel C. F. R.

    2015-01-01

    Candida species have increasingly attracted special attention in the medical community. In spite of the presence of Candida species as a human commensal, alarming rates of local and systemic infections have been observed, varying from moderate to severe impact. Currently available antifungal drugs have progressively lost their effectiveness, urgently highlighting the problem of microorganisms with acquired resistance. Natural matrices have been used for centuries for numerous purposes, including as highly effective antimicrobials. Increasing evidence gives particular emphasis to the contribution of phenolic extracts and related individual compounds. In vitro studies clearly confirm their prominent effects, but confirmation through in vivo studies, including the mechanisms of action involved, remains limited. Therefore, the present report aims to provide extensive knowledge about all these aspects, highlighting the most efficient phytochemical formulations, including therapeutic doses. Further studies should be encouraged to deepen knowledge in this area, namely clinical trials to provide safer and more effective antimicrobials than the current ones. PMID:26380266

  16. Knowledge Management Framework for Emerging Infectious Diseases Preparedness and Response: Design and Development of Public Health Document Ontology

    PubMed Central

    Zhang, Zhizun; Gonzalez, Mila C; Morse, Stephen S

    2017-01-01

    Background There are increasing concerns about our preparedness and timely coordinated response across the globe to cope with emerging infectious diseases (EIDs). This poses practical challenges that require exploiting novel knowledge management approaches effectively. Objective This work aims to develop an ontology-driven knowledge management framework that addresses the existing challenges in sharing and reusing public health knowledge. Methods We propose a systems engineering-inspired ontology-driven knowledge management approach. It decomposes public health knowledge into concepts and relations and organizes the elements of knowledge based on the teleological functions. Both knowledge and semantic rules are stored in an ontology and retrieved to answer queries regarding EID preparedness and response. Results A hybrid concept extraction was implemented in this work. The quality of the ontology was evaluated using the formal evaluation method Ontology Quality Evaluation Framework. Conclusions Our approach is a potentially effective methodology for managing public health knowledge. Accuracy and comprehensiveness of the ontology can be improved as more knowledge is stored. In the future, a survey will be conducted to collect queries from public health practitioners. The reasoning capacity of the ontology will be evaluated using the queries and hypothetical outbreaks. We suggest the importance of developing a knowledge sharing standard like the Gene Ontology for the public health domain. PMID:29021130

  17. Visual guidance of mobile platforms

    NASA Astrophysics Data System (ADS)

    Blissett, Rodney J.

    1993-12-01

    Two systems are described and results presented demonstrating aspects of real-time visual guidance of autonomous mobile platforms. The first approach incorporates prior knowledge in the form of rigid geometrical models linking visual references within the environment. The second approach is based on a continuous synthesis of information extracted from image tokens to generate a coarse-grained world model, from which potential obstacles are inferred. The use of these techniques in workplace applications is discussed.

  18. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System

    PubMed Central

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-01-01

    The Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making processes in CPS. Though several efforts have integrated semantics with the WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), they still cannot represent the complex relationships between devices when dynamic composition and collaboration occur, and they depend entirely on manual construction of a knowledge base, with low scalability. In this paper, to address these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution combining ontological engineering methods, by extending SSN, with machine learning methods based on an entity linking (EL) model. To test feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia has relatively high accuracy and the time complexity is at a tolerable level. Advantages and disadvantages of SWoT4CPS, together with future work, are also discussed. PMID:28230725

  19. Application of AI techniques to infer vegetation characteristics from directional reflectance(s)

    NASA Technical Reports Server (NTRS)

    Kimes, D. S.; Smith, J. A.; Harrison, P. A.; Harrison, P. R.

    1994-01-01

    Traditionally, the remote sensing community has relied totally on spectral knowledge to extract vegetation characteristics. However, there are other knowledge bases (KB's) that can be used to significantly improve the accuracy and robustness of inference techniques. Using AI (artificial intelligence) techniques, a KB system (VEG) was developed that integrates input spectral measurements with diverse KB's. These KB's consist of data sets of directional reflectance measurements, knowledge from the literature, and knowledge from experts, which are combined into an intelligent and efficient system for making vegetation inferences. VEG accepts spectral data of an unknown target as input, determines the best techniques for inferring the desired vegetation characteristic(s), applies the techniques to the target data, and provides a rigorous estimate of the accuracy of the inference. VEG was developed to: infer spectral hemispherical reflectance from any combination of nadir and/or off-nadir view angles; infer percent ground cover from any combination of nadir and/or off-nadir view angles; infer unknown view angle(s) from known view angle(s) (known as view angle extension); and discriminate between user-defined vegetation classes using spectral and directional reflectance relationships developed from an automated learning algorithm. The errors for these techniques were generally low, ranging from 2% to 15% (proportional root mean square). The system is designed to aid scientists in developing, testing, and applying new inference techniques using directional reflectance data.

  20. Expert Seeker: A People-Finder Knowledge Management System

    NASA Technical Reports Server (NTRS)

    Becerra-Fernandez, Irma

    2000-01-01

    The first objective of this report was to perform comprehensive research on industry models currently being used for similar purposes, in order to provide the Center with ideas of what is being done in this area by private companies and government agencies. The second objective was to evaluate the use of taxonomies or ontologies to describe and catalog the areas of expertise at GSFC. The creation of a knowledge taxonomy is necessary for information extraction, so that Expert Seeker can adequately search for and find experts in a particular area of expertise. The requirements for developing such a taxonomy are: provide minimal descriptive text; have the appropriate level of abstraction; facilitate browsing; be easy to use with fast data entry (both critical for success); be customized to the organization and its culture; cover the full extent of knowledge areas; be expandable, so that new skills can be added; and be complemented with free-text fields that give users the option to describe their knowledge in detail.

  1. Informing child welfare policy and practice: using knowledge discovery and data mining technology via a dynamic Web site.

    PubMed

    Duncan, Dean F; Kum, Hye-Chung; Weigensberg, Elizabeth Caplick; Flair, Kimberly A; Stewart, C Joy

    2008-11-01

    Proper management and implementation of an effective child welfare agency requires the constant use of information about the experiences and outcomes of children involved in the system, emphasizing the need for comprehensive, timely, and accurate data. In the past 20 years, there have been many advances in technology that can maximize the potential of administrative data to promote better evaluation and management in the field of child welfare. Specifically, this article discusses the use of knowledge discovery and data mining (KDD), which makes it possible to create longitudinal data files from administrative data sources, extract valuable knowledge, and make the information available via a user-friendly public Web site. This article demonstrates a successful project in North Carolina where knowledge discovery and data mining technology was used to develop a comprehensive set of child welfare outcomes available through a public Web site to facilitate information sharing of child welfare data to improve policy and practice.

  2. Extracting knowledge from the World Wide Web

    PubMed Central

    Henzinger, Monika; Lawrence, Steve

    2004-01-01

    The World Wide Web provides an unprecedented opportunity to automatically analyze a large sample of interests and activity in the world. We discuss methods for extracting knowledge from the web by randomly sampling and analyzing hosts and pages, and by analyzing the link structure of the web and how links accumulate over time. A variety of interesting and valuable information can be extracted, such as the distribution of web pages over domains, the distribution of interest in different areas, communities related to different topics, the nature of competition in different categories of sites, and the degree of communication between different communities or countries. PMID:14745041

  3. Progress and Prospects for Stem Cell Engineering

    PubMed Central

    Ashton, Randolph S.; Keung, Albert J.; Peltier, Joseph; Schaffer, David V.

    2018-01-01

    Stem cells offer tremendous biomedical potential owing to their abilities to self-renew and differentiate into cell types of multiple adult tissues. Researchers and engineers have increasingly developed novel discovery technologies, theoretical approaches, and cell culture systems to investigate microenvironmental cues and cellular signaling events that control stem cell fate. Many of these technologies facilitate high-throughput investigation of microenvironmental signals and the intracellular signaling networks and machinery processing those signals into cell fate decisions. As our aggregate empirical knowledge of stem cell regulation grows, theoretical modeling with systems and computational biology methods has and will continue to be important for developing our ability to analyze and extract important conceptual features of stem cell regulation from complex data. Based on this body of knowledge, stem cell engineers will continue to develop technologies that predictably control stem cell fate with the ultimate goal of being able to accurately and economically scale up these systems for clinical-grade production of stem cell therapeutics. PMID:22432628

  4. Smart Networked Elements in Support of ISHM

    NASA Technical Reports Server (NTRS)

    Oostdyk, Rebecca; Mata, Carlos; Perotti, Jose M.

    2008-01-01

    At the core of ISHM is the ability to extract information and knowledge from raw data. Conventional data acquisition systems sample and convert physical measurements to engineering units, which higher-level systems use to derive health and information about processes and systems. Although health management is essential at the top level, there are considerable advantages to implementing health-related functions at the sensor level. The distribution of processing to lower levels reduces bandwidth requirements, enhances data fusion, and improves the resolution for detection and isolation of failures in a system, subsystem, component, or process. The Smart Networked Element (SNE) has been developed to implement intelligent functions and algorithms at the sensor level in support of ISHM.

  5. Two frameworks for integrating knowledge in induction

    NASA Technical Reports Server (NTRS)

    Rosenbloom, Paul S.; Hirsh, Haym; Cohen, William W.; Smith, Benjamin D.

    1994-01-01

    The use of knowledge in inductive learning is critical for improving the quality of the concept definitions generated, reducing the number of examples required in order to learn effective concept definitions, and reducing the computation needed to find good concept definitions. Relevant knowledge may come in many forms (such as examples, descriptions, advice, and constraints) and from many sources (such as books, teachers, databases, and scientific instruments). How to extract the relevant knowledge from this plethora of possibilities, and then to integrate it together so as to appropriately affect the induction process is perhaps the key issue at this point in inductive learning. Here the focus is on the integration part of this problem; that is, how induction algorithms can, and do, utilize a range of extracted knowledge. Preliminary work on a transformational framework for defining knowledge-intensive inductive algorithms out of relatively knowledge-free algorithms is described, as is a more tentative problems-space framework that attempts to cover all induction algorithms within a single general approach. These frameworks help to organize what is known about current knowledge-intensive induction algorithms, and to point towards new algorithms.

  6. Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.

    2013-04-01

    In recent years, airborne Light Detection and Ranging (LiDAR) data have proved to be a valuable information resource for a vast number of applications, ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration, as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that explicitly specifies the knowledge used to extract features from airborne LiDAR data. The overall goal is to investigate the possibility of classifying features of interest from LiDAR data by means of a domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data are processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data are imported into the ontology-based reasoning process used to automatically classify the exported features of interest. Based on the ontology, it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machine-readable, and thus it is possible to run reasoning on top of them. Available reasoners (FACT++, JESS, Pellet) are used to check the consistency of the developed ontologies, and logical reasoning is performed to infer implicit relations between defined concepts. The ontology for the definition of a building is specified using the Web Ontology Language (OWL), the most widely used ontology language, which is based on Description Logics (DL). DL allows the description of internal properties of modelled concepts (roof typology, shape, area, height, etc.) and relationships between objects (IS_A, MEMBER_OF/INSTANCE_OF). It captures terminological knowledge (TBox) as well as assertional knowledge (ABox), which represents facts about concept instances, i.e. the buildings in the airborne LiDAR data. To assess classification accuracy, ground-truth data generated by visual interpretation are used, and classification results are reported in terms of precision and recall. The advantages of this approach are: (i) flexibility, (ii) transferability, and (iii) extensibility, i.e. the ontology can be extended with further concepts, data properties and object properties.
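
    The abstract gives no concrete thresholds; as a loose, reasoner-free illustration of the kind of class restrictions such a building ontology might encode (area, height, roof planarity), the Python sketch below applies invented thresholds to invented segment attributes. A real system would express these restrictions in OWL and delegate classification to a DL reasoner.

        # Hypothetical, reasoner-free sketch of ontology-style restrictions on LiDAR segments.
        segments = [  # invented attributes per region produced by region-growing segmentation
            {"id": 1, "area_m2": 180.0, "mean_height_m": 7.5, "planarity": 0.93},
            {"id": 2, "area_m2": 12.0,  "mean_height_m": 6.0, "planarity": 0.55},
            {"id": 3, "area_m2": 95.0,  "mean_height_m": 3.2, "planarity": 0.88},
        ]

        def is_building(seg) -> bool:
            # Thresholds are illustrative only; an ontology would state them as class restrictions.
            return seg["area_m2"] >= 30.0 and seg["mean_height_m"] >= 2.5 and seg["planarity"] >= 0.8

        print("classified as Building:", [s["id"] for s in segments if is_building(s)])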

  7. Efficient chemical-disease identification and relationship extraction using Wikipedia to improve recall

    PubMed Central

    Lowe, Daniel M.; O’Boyle, Noel M.; Sayle, Roger A.

    2016-01-01

    Awareness of the adverse effects of chemicals is important in biomedical research and healthcare. Text mining can allow timely and low-cost extraction of this knowledge from the biomedical literature. We extended our text mining solution, LeadMine, to identify diseases and chemical-induced disease relationships (CIDs). LeadMine is a dictionary/grammar-based entity recognizer and was used to recognize and normalize both chemicals and diseases to Medical Subject Headings (MeSH) IDs. The disease lexicon was obtained from three sources: MeSH, the Disease Ontology and Wikipedia. The Wikipedia dictionary was derived from pages with a disease/symptom box, or those where the page title appeared in the lexicon. Composite entities (e.g. heart and lung disease) were detected and mapped to their composite MeSH IDs. For CIDs, we developed a simple pattern-based system to find relationships within the same sentence. Our system was evaluated in the BioCreative V Chemical–Disease Relation task and achieved very good results for both disease concept ID recognition (F1-score: 86.12%) and CIDs (F1-score: 52.20%) on the test set. As our system was over an order of magnitude faster than other solutions evaluated on the task, we were able to apply the same system to the entirety of MEDLINE allowing us to extract a collection of over 250 000 distinct CIDs. PMID:27060160
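
    As a loose, hypothetical illustration of same-sentence pattern matching for chemical-induced disease relations (LeadMine itself is a dictionary/grammar-based engine and is not reproduced here), the sketch below combines toy dictionaries with a simple "<chemical>-induced <disease>" cue; the entries and identifiers are invented.

        # Hypothetical sketch: dictionary lookup plus a same-sentence "X-induced Y" pattern.
        import re

        chemicals = {"cisplatin": "D002945", "doxorubicin": "D004317"}    # toy MeSH-like IDs
        diseases = {"nephrotoxicity": "D007674", "cardiomyopathy": "D009202"}

        text = ("Cisplatin-induced nephrotoxicity remains a major concern. "
                "Doxorubicin was well tolerated in this cohort.")

        for sentence in re.split(r"(?<=[.!?])\s+", text):
            found_chem = [c for c in chemicals if c in sentence.lower()]
            found_dis = [d for d in diseases if d in sentence.lower()]
            for c in found_chem:
                for d in found_dis:
                    if re.search(rf"{c}-induced\s+{d}", sentence, flags=re.IGNORECASE):
                        print("CID:", chemicals[c], "->", diseases[d])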

  8. Determination of betulinic acid, oleanolic acid and ursolic acid from Achyranthes aspera L. using RP-UFLC-DAD analysis and evaluation of various parameters for their optimum yield.

    PubMed

    Pai, Sandeep R; Upadhya, Vinayak; Hegde, Harsha V; Joshi, Rajesh K; Kholkute, Sanjiva D

    2016-03-01

    Achyranthes aspera L. is a well known herb commonly used in the traditional system of Indian medicine to treat various disorders, such as cough, dysentery, gonorrhea, piles, kidney stone, pneumonia, renal dropsy, skin eruptions, snake bite, etc. Here, we used an RP-UFLC-DAD method for determining the triterpenoids betulinic acid (BA), oleanolic acid (OA) and ursolic acid (UA) from A. aspera. The optimum yield of these compounds was studied and evaluated using the following parameters: method of extraction, time of extraction, age of plant and plant part (leaves, stem and roots). Linear relationships in the RP-UFLC-DAD analysis were obtained in the range 0.05-100 µg/mL, with LODs of 0.035, 0.042 and 0.033 µg/mL for BA, OA and UA, respectively. Of the variables tested, extraction method and plant part used significantly affected content yield. Continuous shaking extraction (CSE) at ambient temperature gave better extraction efficiency than ultrasonic extraction (USE) or microwave-assisted extraction (MAE). The highest contents of BA, OA and UA were determined individually in leaf, stem and root extracts with CSE. The collective yield of these triterpenoids was higher in the leaf part exposed to the 15 min USE method. To the best of our knowledge, this study newly reports UA from A. aspera, and this was confirmed using ATR-FT-IR studies. The study explains in detail the distribution pattern of these major triterpenoids and the optimum extraction parameters.

  9. Stereo Image Ranging For An Autonomous Robot Vision System

    NASA Astrophysics Data System (ADS)

    Holten, James R.; Rogers, Steven K.; Kabrisky, Matthew; Cross, Steven

    1985-12-01

    The principles of stereo vision for three-dimensional data acquisition are well known and can be applied to the problem of an autonomous robot vehicle. Corresponding points in the two images are located, and the location of each point in three-dimensional space can then be calculated using the offset (disparity) between the points and knowledge of the camera positions and geometry. This research investigates the application of artificial intelligence knowledge representation techniques as a means to apply heuristics that relieve the computational intensity of the low-level image processing tasks. Specifically, a new technique for image feature extraction is presented. This technique, the Queen Victoria Algorithm, uses formal language productions to process the image and characterize its features. These characterized features are then used for stereo image feature registration to obtain the required ranging information. The results can be used by an autonomous robot vision system for environmental modeling and path finding.
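
    The abstract does not give the camera geometry; for a standard parallel-camera rig, depth follows from the disparity between matched points as Z = fB/d. The sketch below shows that relation with invented numbers; it illustrates generic stereo ranging, not the Queen Victoria Algorithm.

        # Minimal sketch of stereo triangulation: depth from disparity for one matched point pair.
        def depth_from_disparity(x_left_px, x_right_px, focal_px, baseline_m):
            disparity = x_left_px - x_right_px           # horizontal offset between matched points
            if disparity <= 0:
                raise ValueError("matched points must have positive disparity")
            return focal_px * baseline_m / disparity     # Z = f * B / d

        # Invented geometry: 700 px focal length, 0.30 m baseline, 14 px disparity -> 15.0 m.
        print(depth_from_disparity(412, 398, focal_px=700, baseline_m=0.30), "m")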

  10. Designing easy DNA extraction: Teaching creativity through laboratory practice.

    PubMed

    Susantini, Endang; Lisdiana, Lisa; Isnawati; Tanzih Al Haq, Aushia; Trimulyono, Guntur

    2017-05-01

    Subject material concerning deoxyribonucleic acid (DNA) structure in the format of creativity-driven laboratory practice offers a meaningful learning experience to students. Therefore, a laboratory practice that utilizes simple procedures and easy, safe, and affordable household materials should be promoted to students to develop their creativity. This study aimed to examine whether designing and conducting DNA extraction with household materials could foster students' creative thinking. We also described how this laboratory practice affected students' knowledge and views. A total of 47 students participated in this study. These students were grouped and asked to utilize available household materials and modify procedures using a hands-on worksheet. Results showed that this approach encouraged creative thinking as well as improved subject-related knowledge. Students also demonstrated positive views about content knowledge, social skills, and creative thinking skills. This study implies that extracting DNA with household materials is able to develop content knowledge, social skills, and creative thinking of the students. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):216-225, 2017.

  11. In vitro bioavailability and cellular bioactivity studies of flavonoids and flavonoid-rich plant extracts: questions, considerations and future perspectives.

    PubMed

    Gonzales, Gerard Bryan

    2017-08-01

    In vitro techniques are essential in elucidating biochemical mechanisms and for screening a wide range of possible bioactive candidates. The number of papers published reporting in vitro bioavailability and bioactivity of flavonoids and flavonoid-rich plant extracts is large and still increasing. However, even with the present knowledge on the bioavailability and metabolism of flavonoids after oral ingestion, certain inaccuracies still persist in the literature, such as the use of plant extracts to study bioactivity towards vascular cells. There is therefore a need to revisit, even question, these approaches in terms of their biological relevance. In this review, the bioavailability of flavonoid glycosides, the use of cell models for intestinal absorption and the use of flavonoid aglycones and flavonoid-rich plant extracts in in vitro bioactivity studies will be discussed. Here, we focus on the limitations of current in vitro systems and revisit the validity of some in vitro approaches, and not on the detailed mechanism of flavonoid absorption and bioactivity. Based on the results in the review, there is an apparent need for stricter guidelines on publishing in vitro data relating to the bioavailability and bioactivity of flavonoids and flavonoid-rich plant extracts.

  12. Collaborative human-machine analysis to disambiguate entities in unstructured text and structured datasets

    NASA Astrophysics Data System (ADS)

    Davenport, Jack H.

    2016-05-01

    Intelligence analysts demand rapid information fusion capabilities to develop and maintain accurate situational awareness and understanding of dynamic enemy threats in asymmetric military operations. The ability to extract relationships between people, groups, and locations from a variety of text datasets is critical to proactive decision making. The derived network of entities must be automatically created and presented to analysts to assist in decision making. DECISIVE ANALYTICS Corporation (DAC) provides capabilities to automatically extract entities, relationships between entities, semantic concepts about entities, and network models of entities from text and multi-source datasets. DAC's Natural Language Processing (NLP) Entity Analytics model entities as complex systems of attributes and interrelationships which are extracted from unstructured text via NLP algorithms. The extracted entities are automatically disambiguated via machine learning algorithms, and resolution recommendations are presented to the analyst for validation; the analyst's expertise is leveraged in this hybrid human/computer collaborative model. Military capability is enhanced by these NLP Entity Analytics because analysts can now create/update an entity profile with intelligence automatically extracted from unstructured text, thereby fusing entity knowledge from structured and unstructured data sources. Operational and sustainment costs are reduced since analysts do not have to manually tag and resolve entities.

  13. Concepts of Operations for Asteroid Rendezvous Missions Focused on Resources Utilization

    NASA Technical Reports Server (NTRS)

    Mueller, Robert P.; Sibille, Laurent; Sanders, Gerald B.; Jones, Christopher A.

    2014-01-01

    Several asteroids are the targets of international robotic space missions currently manifested or in the planning stage. This global interest reflects a need to study these celestial bodies for the scientific information they provide about our solar system, and to better understand how to mitigate the collision threats some of them pose to Earth. Another important objective of these missions is providing assessments of the potential resources that asteroids could provide to future space architectures. In this paper, we examine a series of possible mission operations focused on advancing both our knowledge of the types of asteroids suited for different forms of resource extraction, and the capabilities required to extract those resources for mission enhancing and enabling uses such as radiation protection, propulsion, life support, shelter and manufacturing. An evolutionary development and demonstration approach is recommended within the framework of a larger campaign that prepares for the first landings of humans on Mars. As is the case for terrestrial mining, the development and demonstration approach progresses from resource prospecting (understanding the resource, and mapping the 'ore body'), mining/extraction feasibility and product assessment, pilot operations, to full in-situ resource utilization (ISRU). Opportunities to gather specific knowledge for ISRU via resource prospecting during science missions to asteroids are also examined to maximize the pace of development of needed ISRU capabilities and technologies for deep space missions.

  14. Multilingual Content Extraction Extended with Background Knowledge for Military Intelligence

    DTIC Science & Technology

    2011-06-01

    Formalized content is extended with background knowledge (WordNet [Fel98], YAGO [SKW08]) so that new conclusions (logical inferences) can be drawn; for this purpose, theorem proving is applied to the formalized content. (Recoverable figure labels: External Knowledge; Formulas; Transformation; MRS to FOLE; WordNet; OpenCyc; YAGO; Logical Calculation; Background Knowledge; Axioms.)

  15. Use of an online extraction liquid chromatography quadrupole time-of-flight tandem mass spectrometry method for the characterization of polyphenols in Citrus paradisi cv. Changshanhuyu peel.

    PubMed

    Tong, Chaoying; Peng, Mijun; Tong, Runna; Ma, Ruyi; Guo, Keke; Shi, Shuyun

    2018-01-19

    Chemical profiling of natural products by high performance liquid chromatography (HPLC) is critical for understanding their clinical bioactivities, and sample pretreatment steps have been considered a bottleneck for analysis. Currently, concerted efforts have been made to develop sample pretreatment methods with high efficiency and low solvent and time consumption. Here, a simple and efficient online extraction (OLE) strategy coupled with HPLC-diode array detector-quadrupole time-of-flight tandem mass spectrometry (HPLC-DAD-QTOF-MS/MS) was developed for rapid chemical profiling. In the OLE strategy, a guard column packed with ground sample (2 mg), used in place of the sample loop, was connected to the manual injection valve, so that components were directly extracted and transferred to the HPLC-DAD-QTOF-MS/MS system by the mobile phase alone, without any extra time, solvent, instrumentation or operation. By comparison with offline heat-reflux extraction of Citrus paradisi cv. Changshanhuyu (Changshanhuyu) peel, the OLE strategy presented higher extraction efficiency, perhaps because of the high pressure and gradient elution mode. A total of twenty-two secondary metabolites were detected according to their retention times, UV spectra, exact masses, and fragmentation ions in MS/MS spectra, and nine of them were, to our knowledge, discovered in Changshanhuyu peel for the first time. It is concluded that the developed OLE-HPLC-DAD-QTOF-MS/MS system offers new perspectives for rapid chemical profiling of natural products. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Mixed Carboxylic Acid Production by Megasphaera elsdenii from Glucose and Lignocellulosic Hydrolysate

    DOE PAGES

    Nelson, Robert S.; Peterson, Darren J.; Karp, Eric M.; ...

    2017-03-01

    Here, volatile fatty acids (VFAs) can be readily produced from many anaerobic microbes and subsequently utilized as precursors to renewable biofuels and biochemicals. Megasphaera elsdenii represents a promising host for production of VFAs, butyric acid (BA) and hexanoic acid (HA). However, due to the toxicity of these acids, product removal via an extractive fermentation system is required to achieve high titers and productivities. Here, we examine multiple aspects of extractive separations to produce BA and HA from glucose and lignocellulosic hydrolysate with M. elsdenii. A mixture of oleyl alcohol and 10% (v/v) trioctylamine was selected as an extraction solvent due to its insignificant inhibitory effect on the bacteria. Batch extractive fermentations were conducted in the pH range of 5.0 to 6.5 to select the best cell growth rate and extraction efficiency combination. Subsequently, fed-batch pertractive fermentations were run over 230 h, demonstrating high BA and HA concentrations in the extracted fraction (57.2 g/L from ~190 g/L glucose) and productivity (0.26 g/L/h). To our knowledge, these are the highest combined acid titers and productivity values reported for M. elsdenii and bacterial mono-cultures from sugars. Lastly, the production of BA and HA (up to 17 g/L) from lignocellulosic sugars was demonstrated.

  17. Mixed Carboxylic Acid Production by Megasphaera elsdenii from Glucose and Lignocellulosic Hydrolysate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, Robert S.; Peterson, Darren J.; Karp, Eric M.

    Here, volatile fatty acids (VFAs) can be readily produced from many anaerobic microbes and subsequently utilized as precursors to renewable biofuels and biochemicals. Megasphaera elsdenii represents a promising host for production of VFAs, butyric acid (BA) and hexanoic acid (HA). However, due to the toxicity of these acids, product removal via an extractive fermentation system is required to achieve high titers and productivities. Here, we examine multiple aspects of extractive separations to produce BA and HA from glucose and lignocellulosic hydrolysate with M. elsdenii. A mixture of oleyl alcohol and 10% (v/v) trioctylamine was selected as an extraction solvent due to its insignificant inhibitory effect on the bacteria. Batch extractive fermentations were conducted in the pH range of 5.0 to 6.5 to select the best cell growth rate and extraction efficiency combination. Subsequently, fed-batch pertractive fermentations were run over 230 h, demonstrating high BA and HA concentrations in the extracted fraction (57.2 g/L from ~190 g/L glucose) and productivity (0.26 g/L/h). To our knowledge, these are the highest combined acid titers and productivity values reported for M. elsdenii and bacterial mono-cultures from sugars. Lastly, the production of BA and HA (up to 17 g/L) from lignocellulosic sugars was demonstrated.

  18. Investigation of Microbubble Cavitation-Induced Calcein Release from Cells In Vitro.

    PubMed

    Maciulevičius, Martynas; Tamošiūnas, Mindaugas; Jakštys, Baltramiejus; Jurkonis, Rytis; Venslauskas, Mindaugas Saulius; Šatkauskas, Saulius

    2016-12-01

    In the present study, microbubble (MB) cavitation signal analysis was performed together with calcein release evaluation in both the pressure and exposure duration domains of the acoustic field. A passive cavitation detection system was used to simultaneously measure MB scattering and attenuation signals for subsequent evaluation of extraction efficiency relative to MB cavitation activity. The results indicate that the decrease in the efficiency of extraction of calcein molecules from Chinese hamster ovary cells, as well as cell viability, is associated with MB cavitation activity and can be accurately predicted using inertial cavitation doses up to 0.18 V × s (R² > 0.9, p < 0.0001). No decrease in additional calcein release or cell viability was observed after complete MB sonodestruction was achieved. This indicates that the optimal exposure duration within which maximal sono-extraction efficiency is obtained coincides with the time necessary to achieve complete MB destruction. These results illustrate the importance of MB inertial cavitation in the sono-extraction process. To our knowledge, this study is the first to (i) investigate small-molecule extraction from cells via sonoporation and (ii) relate the extraction process to the quantitative characteristics of MB cavitation acoustic spectra. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  19. A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain.

    PubMed

    Hassanpour, Saeed; O'Connor, Martin J; Das, Amar K

    2013-08-12

    A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text. Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and compared the average rank of correctly identified rule definition or corresponding rule template using both our semantic-based approach and a standard term-based approach. We examined three separate scenarios: (1) the snippet of text contained a definition already in the knowledge base; (2) the snippet contained an alternative definition for a concept in the knowledge base; and (3) the snippet contained a definition not in the knowledge base. Our semantic-based approach had a higher average rank than the term-based approach for each of the three scenarios (scenario 1: 3.8 vs. 5.0; scenario 2: 2.8 vs. 4.9; and scenario 3: 4.5 vs. 6.2), with each comparison significant at the p-value of 0.05 using the Wilcoxon signed-rank test. Our work shows that leveraging existing domain knowledge in the information extraction of biomedical definitions significantly improves the correct identification of such knowledge within sentences. Our method can thus help researchers rapidly acquire knowledge about biomedical definitions that are specified and evolving within an ever-growing corpus of scientific publications.
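
    A minimal sketch of the kind of evaluation described above: per-item ranks of the correct definition under two methods are compared with a Wilcoxon signed-rank test. The rank values below are invented and do not reproduce the study's data.

        # Hypothetical sketch: comparing per-item ranks from two methods with a Wilcoxon test.
        from scipy.stats import wilcoxon

        semantic_ranks = [3, 5, 2, 4, 6, 3, 2, 5, 4, 3, 7, 2]   # rank of the correct definition
        term_ranks = [5, 6, 4, 6, 8, 4, 3, 7, 5, 5, 9, 3]

        stat, p = wilcoxon(semantic_ranks, term_ranks)
        print(f"mean rank semantic={sum(semantic_ranks) / len(semantic_ranks):.1f}, "
              f"term={sum(term_ranks) / len(term_ranks):.1f}, p={p:.3f}")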

  20. KnowEnG: a knowledge engine for genomics.

    PubMed

    Sinha, Saurabh; Song, Jun; Weinshilboum, Richard; Jongeneel, Victor; Han, Jiawei

    2015-11-01

    We describe here the vision, motivations, and research plans of the National Institutes of Health Center for Excellence in Big Data Computing at the University of Illinois, Urbana-Champaign. The Center is organized around the construction of "Knowledge Engine for Genomics" (KnowEnG), an E-science framework for genomics where biomedical scientists will have access to powerful methods of data mining, network mining, and machine learning to extract knowledge out of genomics data. The scientist will come to KnowEnG with their own data sets in the form of spreadsheets and ask KnowEnG to analyze those data sets in the light of a massive knowledge base of community data sets called the "Knowledge Network" that will be at the heart of the system. The Center is undertaking discovery projects aimed at testing the utility of KnowEnG for transforming big data to knowledge. These projects span a broad range of biological enquiry, from pharmacogenomics (in collaboration with Mayo Clinic) to transcriptomics of human behavior. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. REKRIATE: A Knowledge Representation System for Object Recognition and Scene Interpretation

    NASA Astrophysics Data System (ADS)

    Meystel, Alexander M.; Bhasin, Sanjay; Chen, X.

    1990-02-01

    What humans actually observe and how they comprehend this information is complex, owing to Gestalt processes and the interaction of context in predicting the course of thinking and reinforcing one idea while repressing another. How we extract knowledge from the scene, what we actually get from the scene, and what we bring from our own mechanisms of perception are areas separated by a thin, ill-defined line. The purpose of this paper is to present a system for Representing Knowledge and Recognizing and Interpreting Attention Trailed Entities, dubbed REKRIATE. It will be used as a tool for discovering the underlying principles involved in the knowledge representation required for conceptual learning. REKRIATE has some inherited knowledge and is given a vocabulary which is used to form rules for identification of the object. It has various modalities of sensing and has the ability to measure the distance between objects in the image as well as the similarity between different images of presumably the same object. All sensations received from the matrix of different sensors are put into an adequate form. The proposed methodology is applicable not only to pictorial or visual world representation, but to any sensing modality. It is based upon two premises: a) the inseparability of all domains of the world representation, including the linguistic domain as well as those formed by various sensor modalities; and b) the representation of the object at several levels of resolution simultaneously.

  2. A logic programming approach to medical errors in imaging.

    PubMed

    Rodrigues, Susana; Brandão, Paulo; Nelas, Luís; Neves, José; Alves, Victor

    2011-09-01

    In 2000, the Institute of Medicine reported disturbing numbers on the scope and impact of medical error in the process of health delivery. Nevertheless, a solution to this problem may lie in the adoption of adverse event reporting and learning systems that can help to identify hazards and risks. It is crucial to apply models that identify the root causes of adverse events and enhance the sharing of knowledge and experience. Progress in efforts to improve patient safety has been frustratingly slow. Some of this lack of progress may be attributed to the absence of systems that take into account the characteristics of information about the real world. In our daily lives, we formulate most of our decisions based on incomplete, uncertain and even forbidden or contradictory information. One's knowledge is based less on exact facts and more on hypotheses, perceptions or indications. From the data collected in our adverse event treatment and learning system for medical imaging, and through the use of Extended Logic Programming for knowledge representation and reasoning, together with new methodologies for problem solving, namely those based on the notion of agents and multi-agent systems, we intend to generate reports that identify the most relevant causes of error and define improvement strategies, drawing conclusions about the impact, place of occurrence, and form or type of event recorded in the healthcare institutions. The Eindhoven Classification Model was extended and adapted to the medical imaging field and used to classify the root causes of adverse events. Extended Logic Programming was used for knowledge representation with defective information, allowing for the modelling of the universe of discourse in terms of data and knowledge defaults. A systematization of the evolution of the body of knowledge about Quality of Information embedded in the Root Cause Analysis was accomplished. An adverse event reporting and learning system was developed based on the presented approach to medical errors in imaging. This system was deployed in two Portuguese healthcare institutions, with an appealing outcome. The system made it possible to verify that the majority of occurrences were concentrated in a few events that could be avoided. The developed system allowed automatic knowledge extraction, enabling report generation with strategies for the improvement of quality of care. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.

  3. Pulmonary rehabilitation referral and participation are commonly influenced by environment, knowledge, and beliefs about consequences: a systematic review using the Theoretical Domains Framework.

    PubMed

    Cox, Narelle S; Oliveira, Cristino C; Lahham, Aroub; Holland, Anne E

    2017-04-01

    What are the barriers and enablers of referral, uptake, attendance and completion of pulmonary rehabilitation for people with chronic obstructive pulmonary disease (COPD)? Systematic review of qualitative or quantitative studies reporting data relating to referral, uptake, attendance and/or completion in pulmonary rehabilitation. People aged >18years with a diagnosis of COPD and/or their healthcare professionals. Data were extracted regarding the nature of barriers and enablers of pulmonary rehabilitation referral and participation. Extracted data items were mapped to the Theoretical Domains Framework (TDF). A total of 6969 references were screened, with 48 studies included and 369 relevant items mapped to the TDF. The most frequently represented domain was 'Environment' (33/48 included studies, 37% of mapped items), which included items such as waiting time, burden of illness, travel, transport and health system resources. Other frequently represented domains were 'Knowledge' (18/48 studies, including items such as clinician knowledge of referral processes, patient understanding of rehabilitation content) and 'Beliefs about consequences' (15/48 studies, including items such as beliefs regarding role and safety of exercise, expectations of rehabilitation outcomes). Barriers to referral, uptake, attendance or completion represented 71% (n=183) of items mapped to the TDF. All domains of the TDF were represented; however, items were least frequently coded to the domains of 'Optimism' and 'Memory'. The methodological quality of included studies was fair (mean quality score 9/12, SD 2). Many factors - particularly those related to environment, knowledge, attitudes and behaviours - interact to influence referral, uptake, attendance and completion of pulmonary rehabilitation. Overcoming the challenges associated with the personal and/or healthcare system environment will be imperative to improving access and uptake of pulmonary rehabilitation. PROSPERO CRD42015015976. [Cox NS, Oliveira CC, Lahham A, Holland AE (2017) Pulmonary rehabilitation referral and participation are commonly influenced by environment, knowledge, and beliefs about consequences: a systematic review using the Theoretical Domains Framework. Journal of Physiotherapy 63: 84-93]. Copyright © 2017 Australian Physiotherapy Association. Published by Elsevier B.V. All rights reserved.

  4. Intelligent data reduction for autonomous power systems

    NASA Technical Reports Server (NTRS)

    Floyd, Stephen A.

    1988-01-01

    Since 1984, Marshall Space Flight Center has been actively engaged in research and development concerning autonomous power systems. Much of the work in this domain has dealt with the development and application of knowledge-based or expert systems to perform tasks previously accomplished only through intensive human involvement. One such task is the health status monitoring of electrical power systems. Such monitoring is a manpower-intensive task which is vital to mission success. The Hubble Space Telescope testbed and its associated Nickel Cadmium Battery Expert System (NICBES) were designated as the system on which the initial proof of concept for intelligent power system monitoring would be established. The key function performed by an engineer engaged in system monitoring is to analyze the raw telemetry data and identify from the whole only those elements which can be considered significant. This function requires engineering expertise on the functionality of the system, the mode of operation, and the efficient and effective reading of the telemetry data. Application of this expertise to extract the significant components of the data is referred to as data reduction. Such a function possesses characteristics which make it a prime candidate for the application of knowledge-based systems technologies. Such applications are investigated and recommendations are offered for the development of intelligent data reduction systems.

  5. Calling on a million minds for community annotation in WikiProteins

    PubMed Central

    Mons, Barend; Ashburner, Michael; Chichester, Christine; van Mulligen, Erik; Weeber, Marc; den Dunnen, Johan; van Ommen, Gert-Jan; Musen, Mark; Cockerill, Matthew; Hermjakob, Henning; Mons, Albert; Packer, Abel; Pacheco, Roberto; Lewis, Suzanna; Berkeley, Alfred; Melton, William; Barris, Nickolas; Wales, Jimmy; Meijssen, Gerard; Moeller, Erik; Roes, Peter Jan; Borner, Katy; Bairoch, Amos

    2008-01-01

    WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a 'million minds' to annotate a 'million concepts' and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing. PMID:18507872

  6. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper).

    PubMed

    Valdez, Joshua; Rueschman, Michael; Kim, Matthew; Redline, Susan; Sahoo, Satya S

    2016-10-01

    Extraction of structured information from biomedical literature is a complex and challenging problem due to the complexity of biomedical domain and lack of appropriate natural language processing (NLP) techniques. High quality domain ontologies model both data and metadata information at a fine level of granularity, which can be effectively used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task to support scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement. This is highlighted by the recent initiative by the US National Institutes of Health called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extract provenance metadata from published biomedical research literature using an ontology-enabled NLP platform as part of the Provenance for Clinical and Healthcare Research (ProvCaRe). The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation demonstrate that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata as compared to existing NLP pipelines such as MetaMap.

  7. Protection against Experimental Cryptococcosis following Vaccination with Glucan Particles Containing Cryptococcus Alkaline Extracts

    PubMed Central

    Lee, Chrono K.; Huang, Haibin; Shen, Zu T.; Lodge, Jennifer K.; Leszyk, John; Ostroff, Gary R.

    2015-01-01

    ABSTRACT A vaccine capable of protecting at-risk persons against infections due to Cryptococcus neoformans and Cryptococcus gattii could reduce the substantial global burden of human cryptococcosis. Vaccine development has been hampered though, by lack of knowledge as to which antigens are immunoprotective and the need for an effective vaccine delivery system. We made alkaline extracts from mutant cryptococcal strains that lacked capsule or chitosan. The extracts were then packaged into glucan particles (GPs), which are purified Saccharomyces cerevisiae cell walls composed primarily of β-1,3-glucans. Subcutaneous vaccination with the GP-based vaccines provided significant protection against subsequent pulmonary infection with highly virulent strains of C. neoformans and C. gattii. The alkaline extract derived from the acapsular strain was analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS), and the most abundant proteins were identified. Separation of the alkaline extract by size exclusion chromatography revealed fractions that conferred protection when loaded in GP-based vaccines. Robust Th1- and Th17-biased CD4+ T cell recall responses were observed in the lungs of vaccinated and infected mice. Thus, our preclinical studies have indicated promising cryptococcal vaccine candidates in alkaline extracts delivered in GPs. Ongoing studies are directed at identifying the individual components of the extracts that confer protection and thus would be promising candidates for a human vaccine. PMID:26695631

  8. Exploiting domain information for Word Sense Disambiguation of medical documents.

    PubMed

    Stevenson, Mark; Agirre, Eneko; Soroa, Aitor

    2012-01-01

    Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency.
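    The combination step can be sketched under simple assumptions: each candidate UMLS sense is represented as a bag of terms, and its score is the overlap with the local context plus the overlap with MeSH-derived domain key terms weighted by inverse document frequency. The function and variable names, sense profiles and counts below are illustrative, not the authors' implementation.

      import math
      from collections import Counter

      def idf(term, doc_freq, n_docs):
          # inverse document frequency weight for a domain key term
          return math.log(n_docs / (1 + doc_freq.get(term, 0)))

      def score_sense(sense_profile, local_context, domain_key_terms, doc_freq, n_docs):
          """Combine local-context overlap with IDF-weighted domain key-term overlap.
          sense_profile: bag of terms associated with one candidate sense (illustrative)."""
          profile = Counter(sense_profile)
          local_score = sum(profile[t] for t in local_context)
          domain_score = sum(profile[t] * idf(t, doc_freq, n_docs) for t in domain_key_terms)
          return local_score + domain_score

      def disambiguate(senses, local_context, domain_key_terms, doc_freq, n_docs):
          return max(senses, key=lambda s: score_sense(senses[s], local_context,
                                                       domain_key_terms, doc_freq, n_docs))

      # toy example with made-up sense profiles for the ambiguous term "cold"
      senses = {"common_cold": ["virus", "rhinitis", "infection"],
                "cold_temperature": ["temperature", "exposure", "hypothermia"]}
      print(disambiguate(senses, ["patient", "infection"], ["virus", "rhinitis"],
                         {"virus": 50, "rhinitis": 5}, n_docs=1000))   # -> common_cold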

  9. Exploiting domain information for Word Sense Disambiguation of medical documents

    PubMed Central

    Agirre, Eneko; Soroa, Aitor

    2011-01-01

    Objective Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. Design The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. Measurements A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. Results and discussion The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency. PMID:21900701

  10. Automated extraction and semantic analysis of mutation impacts from the biomedical literature

    PubMed Central

    2012-01-01

    Background Mutations as sources of evolution have long been the focus of attention in the biomedical literature. Accessing the mutational information and their impacts on protein properties facilitates research in various domains, such as enzymology and pharmacology. However, manually curating the rich and fast growing repository of biomedical literature is expensive and time-consuming. As a solution, text mining approaches have increasingly been deployed in the biomedical domain. While the detection of single-point mutations is well covered by existing systems, challenges still exist in grounding impacts to their respective mutations and recognizing the affected protein properties, in particular kinetic and stability properties together with physical quantities. Results We present an ontology model for mutation impacts, together with a comprehensive text mining system for extracting and analysing mutation impact information from full-text articles. Organisms, as sources of proteins, are extracted to help disambiguation of genes and proteins. Our system then detects mutation series to correctly ground detected impacts using novel heuristics. It also extracts the affected protein properties, in particular kinetic and stability properties, as well as the magnitude of the effects and validates these relations against the domain ontology. The output of our system can be provided in various formats, in particular by populating an OWL-DL ontology, which can then be queried to provide structured information. The performance of the system is evaluated on our manually annotated corpora. In the impact detection task, our system achieves a precision of 70.4%-71.1%, a recall of 71.3%-71.5%, and grounds the detected impacts with an accuracy of 76.5%-77%. The developed system, including resources, evaluation data and end-user and developer documentation is freely available under an open source license at http://www.semanticsoftware.info/open-mutation-miner. Conclusion We present Open Mutation Miner (OMM), the first comprehensive, fully open-source approach to automatically extract impacts and related relevant information from the biomedical literature. We assessed the performance of our work on manually annotated corpora and the results show the reliability of our approach. The representation of the extracted information into a structured format facilitates knowledge management and aids in database curation and correction. Furthermore, access to the analysis results is provided through multiple interfaces, including web services for automated data integration and desktop-based solutions for end user interactions. PMID:22759648

  11. Analysis of a Knowledge-Management-Based Process of Transferring Project Management Skills

    ERIC Educational Resources Information Center

    Ioi, Toshihiro; Ono, Masakazu; Ishii, Kota; Kato, Kazuhiko

    2012-01-01

    Purpose: The purpose of this paper is to propose a method for the transfer of knowledge and skills in project management (PM) based on techniques in knowledge management (KM). Design/methodology/approach: The literature contains studies on methods to extract experiential knowledge in PM, but few studies exist that focus on methods to convert…

  12. Molecular dynamics in principal component space.

    PubMed

    Michielssens, Servaas; van Erp, Titus S; Kutzner, Carsten; Ceulemans, Arnout; de Groot, Bert L

    2012-07-26

    A molecular dynamics algorithm in principal component space is presented. It is demonstrated that sampling can be improved without changing the ensemble by assigning masses to the principal components proportional to the inverse square root of the eigenvalues. The setup of the simulation requires no prior knowledge of the system; a short initial MD simulation to extract the eigenvectors and eigenvalues suffices. Independent measures indicated a 6-7 times faster sampling compared to a regular molecular dynamics simulation.
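    The mass-assignment step the abstract describes can be sketched with NumPy: diagonalize the covariance of a (frames × coordinates) trajectory and give each principal component a mass proportional to the inverse square root of its eigenvalue. The trajectory below is a random placeholder and the MD propagation itself is omitted.

      import numpy as np

      def pca_masses(trajectory, mass_scale=1.0):
          """trajectory: (n_frames, n_coords) array of Cartesian coordinates.
          Returns eigenvectors, eigenvalues and masses proportional to 1/sqrt(eigenvalue)."""
          centred = trajectory - trajectory.mean(axis=0)
          cov = np.cov(centred, rowvar=False)
          eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
          eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
          masses = mass_scale / np.sqrt(np.maximum(eigvals, 1e-12))
          return eigvecs, eigvals, masses

      traj = np.random.default_rng(0).normal(size=(500, 30))   # placeholder trajectory
      vecs, vals, masses = pca_masses(traj)
      print(masses[:5])   # large-amplitude (soft) modes receive small masses

    With this scaling, soft and stiff modes evolve on comparable timescales, which is the stated source of the sampling speed-up.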

  13. Building entity models through observation and learning

    NASA Astrophysics Data System (ADS)

    Garcia, Richard; Kania, Robert; Fields, MaryAnne; Barnes, Laura

    2011-05-01

    To support the missions and tasks of mixed robotic/human teams, future robotic systems will need to adapt to the dynamic behavior of both teammates and opponents. One of the basic elements of this adaptation is the ability to exploit both long and short-term temporal data. This adaptation allows robotic systems to predict/anticipate, as well as influence, future behavior for both opponents and teammates and will afford the system the ability to adjust its own behavior in order to optimize its ability to achieve the mission goals. This work is a preliminary step in the effort to develop online entity behavior models through a combination of learning techniques and observations. As knowledge is extracted from the system through sensor and temporal feedback, agents within the multi-agent system attempt to develop and exploit a basic movement model of an opponent. For the purpose of this work, extraction and exploitation is performed through the use of a discretized two-dimensional game. The game consists of a predetermined number of sentries attempting to keep an unknown intruder agent from penetrating their territory. The sentries utilize temporal data coupled with past opponent observations to hypothesize the probable locations of the opponent and thus optimize their guarding locations.

  14. A Simple and Rapid Method for Preparing a Cell-Free Bacterial Lysate for Protein Synthesis

    PubMed Central

    Kaduri, Maya; Shainsky-Roitman, Janna; Goldfeder, Mor; Ivanir, Eran; Benhar, Itai; Shoham, Yuval; Schroeder, Avi

    2016-01-01

    Cell-free protein synthesis (CFPS) systems are important laboratory tools that are used for various synthetic biology applications. Here, we present a simple and inexpensive laboratory-scale method for preparing a CFPS system from E. coli. The procedure uses basic lab equipment, a minimal set of reagents, and requires less than one hour to process the bacterial cell mass into a functional S30-T7 extract. BL21(DE3) and MRE600 E. coli strains were used to prepare the S30-T7 extract. The CFPS system was used to produce a set of fluorescent and therapeutic proteins of different molecular weights (up to 66 kDa). This system was able to produce 40–150 μg-protein/ml, with variations depending on the plasmid type, expressed protein and E. coli strain. Interestingly, the BL21-based CFPS exhibited stability and increased activity at 40 and 45°C. To the best of our knowledge, this is the most rapid and affordable lab-scale protocol for preparing a cell-free protein synthesis system, with high thermal stability and efficacy in producing therapeutic proteins. PMID:27768741

  15. A protocol for a systematic review of knowledge translation strategies in the allied health professions

    PubMed Central

    2011-01-01

    Background Knowledge translation (KT) aims to close the gap between knowledge and practice in order to realize the benefits of research through (a) improved health outcomes, (b) more effective health services and products, and (c) strengthened healthcare systems. While there is some understanding of strategies to put research findings into practice within nursing and medicine, we have limited knowledge of KT strategies in allied health professions. Given the interprofessional nature of healthcare, a lack of guidance for supporting KT strategies in the allied health professions is concerning. Our objective in this study is to systematically review published research on KT strategies in five allied health disciplines. Methods A medical research librarian will develop and implement search strategies designed to identify evidence that is relevant to each question of the review. Two reviewers will perform study selection and quality assessment using standard forms. For study selection, data will be extracted by two reviewers. For quality assessment, data will be extracted by one reviewer and verified by a second. Disagreements will be resolved through discussion or third party adjudication. Within each profession, data will be grouped and analyzed by research design and KT strategies using the Effective Practice and Organisation of Care Review Group classification scheme. An overall synthesis across professions will be conducted. Significance A uniprofessional approach to KT does not represent the interprofessional context it targets. Our findings will provide the first systematic overview of KT strategies used in allied health professionals' clinical practice, as well as a foundation to inform future KT interventions in allied healthcare settings. PMID:21635763

  16. Machine Reading for Extraction of Bacteria and Habitat Taxonomies

    PubMed Central

    Kordjamshidi, Parisa; Massa, Wouter; Provoost, Thomas; Moens, Marie-Francine

    2015-01-01

    There is a vast amount of scientific literature available from various resources such as the internet. Automating the extraction of knowledge from these resources is very helpful for biologists to easily access this information. This paper presents a system to extract the bacteria and their habitats, as well as the relations between them. We investigate to what extent current techniques are suited for this task and test a variety of models in this regard. We detect entities in a biological text and map the habitats into a given taxonomy. Our model uses a linear chain Conditional Random Field (CRF). For the prediction of relations between the entities, a model based on logistic regression is built. Designing a system upon these techniques, we explore several improvements for both the generation and selection of good candidates. One contribution to this lies in the extended flexibility of our ontology mapper that uses an advanced boundary detection and assigns the taxonomy elements to the detected habitats. Furthermore, we discover value in the combination of several distinct candidate generation rules. Using these techniques, we show results that significantly improve upon the state of the art for the BioNLP Bacteria Biotopes task. PMID:27077141
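    The relation-prediction side can be sketched with scikit-learn, assuming entities have already been tagged (the CRF step is not shown): each candidate (bacterium, habitat) pair is turned into a small feature dictionary and classified with logistic regression. The features, sentence and labels below are toy illustrations, not the task data.

      from sklearn.feature_extraction import DictVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.pipeline import make_pipeline

      def pair_features(bacterium, habitat, sentence_tokens):
          # illustrative features for one candidate (bacterium, habitat) pair
          return {
              "bacterium": bacterium.lower(),
              "habitat": habitat.lower(),
              "token_distance": abs(sentence_tokens.index(bacterium)
                                    - sentence_tokens.index(habitat)),
              "lives_in_sentence": "lives" in sentence_tokens,
          }

      # toy training pairs: 1 = relation holds, 0 = no relation
      sent = ["Bifidobacterium", "longum", "lives", "in", "the", "human", "gut"]
      X = [pair_features("Bifidobacterium", "gut", sent),
           pair_features("Bifidobacterium", "longum", sent)]
      y = [1, 0]

      model = make_pipeline(DictVectorizer(sparse=False), LogisticRegression())
      model.fit(X, y)
      print(model.predict([pair_features("Bifidobacterium", "gut", sent)]))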

  17. Rule Extracting based on MCG with its Application in Helicopter Power Train Fault Diagnosis

    NASA Astrophysics Data System (ADS)

    Wang, M.; Hu, N. Q.; Qin, G. J.

    2011-07-01

    In order to extract decision rules for fault diagnosis from incomplete historical test records for knowledge-based damage assessment of helicopter power train structures, a method that can directly extract the optimal generalized decision rules from incomplete information based on granular computing (GrC) was proposed. Based on a semantic analysis of unknown attribute values, the granule was extended to handle incomplete information. The maximum characteristic granule (MCG) was defined based on the characteristic relation, and the MCG was used to construct the resolution function matrix. The optimal general decision rule was introduced and, using the basic equivalent forms of propositional logic, the rules were extracted and reduced from the incomplete information table. Combined with a fault diagnosis example of a power train, the application approach of the method was presented, and the validity of this method in knowledge acquisition was demonstrated.
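    The granule construction can be sketched under the usual granular-computing conventions for incomplete tables, with "?" marking a lost value and "*" a do-not-care value; the characteristic set of an object collects all records indistinguishable from it on its known attribute values. This follows the common definition of the characteristic relation rather than the paper's exact formulation, and the test-record table is invented.

      def characteristic_set(table, attributes, x):
          """Objects indistinguishable from object x given incomplete attribute values.
          table: dict object -> dict attribute -> value; '?' lost, '*' do-not-care."""
          candidates = set(table)
          for a in attributes:
              v = table[x][a]
              if v in ("?", "*"):
                  continue                       # unknown value does not restrict the granule
              block = {y for y in table if table[y][a] in (v, "*")}
              candidates &= block
          return candidates

      # toy incomplete test-record table for gearbox condition attributes
      table = {
          "r1": {"vibration": "high", "temperature": "?",    "noise": "high"},
          "r2": {"vibration": "high", "temperature": "norm", "noise": "*"},
          "r3": {"vibration": "low",  "temperature": "norm", "noise": "low"},
      }
      print(characteristic_set(table, ["vibration", "temperature", "noise"], "r1"))  # {'r1', 'r2'}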

  18. Anomaly Detection for Next-Generation Space Launch Ground Operations

    NASA Technical Reports Server (NTRS)

    Spirkovska, Lilly; Iverson, David L.; Hall, David R.; Taylor, William M.; Patterson-Hine, Ann; Brown, Barbara; Ferrell, Bob A.; Waterman, Robert D.

    2010-01-01

    NASA is developing new capabilities that will enable future human exploration missions while reducing mission risk and cost. The Fault Detection, Isolation, and Recovery (FDIR) project aims to demonstrate the utility of integrated vehicle health management (IVHM) tools in the domain of ground support equipment (GSE) to be used for the next generation launch vehicles. In addition to demonstrating the utility of IVHM tools for GSE, FDIR aims to mature promising tools for use on future missions and document the level of effort - and hence cost - required to implement an application with each selected tool. One of the FDIR capabilities is anomaly detection, i.e., detecting off-nominal behavior. The tool we selected for this task uses a data-driven approach. Unlike rule-based and model-based systems that require manual extraction of system knowledge, data-driven systems take a radically different approach to reasoning. At the basic level, they start with data that represent nominal functioning of the system and automatically learn expected system behavior. The behavior is encoded in a knowledge base that represents "in-family" system operations. During real-time system monitoring or during post-flight analysis, incoming data is compared to that nominal system operating behavior knowledge base; a distance representing deviation from nominal is computed, providing a measure of how far "out of family" current behavior is. We describe the selected tool for FDIR anomaly detection - Inductive Monitoring System (IMS), how it fits into the FDIR architecture, the operations concept for the GSE anomaly monitoring, and some preliminary results of applying IMS to a Space Shuttle GSE anomaly.
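    A much simplified monitor in the spirit of IMS (not the actual IMS code) can be sketched as follows: cluster nominal telemetry vectors to form the "in-family" knowledge base, then score incoming vectors by their distance to the nearest cluster centre. The telemetry channels and values are placeholders.

      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(1)
      # synthetic nominal telemetry (e.g. voltage, temperature, flow), placeholder values
      nominal = rng.normal(loc=[5.0, 28.0, 1.2], scale=[0.2, 0.5, 0.05], size=(1000, 3))

      # learn "in-family" behaviour from nominal data
      model = KMeans(n_clusters=8, n_init=10, random_state=0).fit(nominal)

      def deviation(sample):
          # distance from the nearest nominal cluster centre = "out of family" measure
          d = np.linalg.norm(model.cluster_centers_ - sample, axis=1)
          return d.min()

      threshold = max(deviation(x) for x in nominal)     # crude bound from training data
      print(deviation(np.array([5.1, 28.2, 1.21])) <= threshold)   # nominal-looking sample
      print(deviation(np.array([3.0, 35.0, 0.4])) <= threshold)    # off-nominal sample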

  19. Knowledge Discovery and Data Mining in Iran's Climatic Researches

    NASA Astrophysics Data System (ADS)

    Karimi, Mostafa

    2013-04-01

    Advances in measurement technology and data collection mean that databases keep getting larger, and large databases require powerful tools for data analysis. The iterative process of acquiring knowledge from the information obtained by data processing is carried out in various forms in all scientific fields. However, when the data volume is large, traditional methods cannot cope with many of the problems. In recent years, the use of databases in various scientific fields, and especially of atmospheric databases in climatology, has expanded. In addition, the increasing amount of data generated by climate models poses a challenge for the extraction of hidden patterns and knowledge. The approach taken to this problem in recent years uses the knowledge discovery process and data mining techniques, drawing on concepts from machine learning, artificial intelligence and expert systems. Data mining is an analytical process for mining massive volumes of data; its ultimate goal is access to information and, finally, knowledge. Climatology is a science that uses varied and massive data, and the goal of climate data mining is to obtain information from varied and massive atmospheric and non-atmospheric data. In fact, knowledge discovery performs these activities in a logical, predetermined and almost automatic process. The goal of this research is to study the use of knowledge discovery and data mining techniques in Iranian climate research. To achieve this goal, a descriptive content analysis was carried out and the studies were classified by method and issue. The results show that in Iranian climatic research clustering, mostly k-means and Ward's method, has been the most frequently applied technique, and that precipitation and atmospheric circulation patterns are the most frequently addressed issues. Although several studies on geographic and climate issues have been carried out with statistical techniques such as clustering and pattern extraction, given the difference in nature between statistics and data mining it cannot be said that Iranian climate studies have truly used data mining and knowledge discovery techniques. It is nevertheless necessary to use the KDD approach and DM techniques in climatic studies, in particular for interpreting climate modelling results.
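    The two clustering methods the review found most common can be applied, for example, to a station-by-month precipitation matrix with scikit-learn; the data below are synthetic placeholders, not Iranian climate records.

      import numpy as np
      from sklearn.cluster import KMeans, AgglomerativeClustering

      # synthetic stand-in for a (stations x 12 monthly means) precipitation matrix
      rng = np.random.default_rng(42)
      precip = np.vstack([rng.gamma(shape=2.0, scale=s, size=(40, 12)) for s in (5, 20, 60)])

      kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(precip)
      ward_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(precip)

      print(np.bincount(kmeans_labels), np.bincount(ward_labels))  # regional cluster sizes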

  20. Medical knowledge discovery and management.

    PubMed

    Prior, Fred

    2009-05-01

    Although the volume of medical information is growing rapidly, the ability to rapidly convert this data into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.

  1. DNA Extraction Techniques for Use in Education

    ERIC Educational Resources Information Center

    Hearn, R. P.; Arblaster, K. E.

    2010-01-01

    DNA extraction provides a hands-on introduction to DNA and enables students to gain real life experience and practical knowledge of DNA. Students gain a sense of ownership and are more enthusiastic when they use their own DNA. A cost effective, simple protocol for DNA extraction and visualization was devised. Buccal mucosal epithelia provide a…

  2. Knowledge Management Framework for Emerging Infectious Diseases Preparedness and Response: Design and Development of Public Health Document Ontology.

    PubMed

    Zhang, Zhizun; Gonzalez, Mila C; Morse, Stephen S; Venkatasubramanian, Venkat

    2017-10-11

    There are increasing concerns about our preparedness and timely coordinated response across the globe to cope with emerging infectious diseases (EIDs). This poses practical challenges that require exploiting novel knowledge management approaches effectively. This work aims to develop an ontology-driven knowledge management framework that addresses the existing challenges in sharing and reusing public health knowledge. We propose a systems engineering-inspired ontology-driven knowledge management approach. It decomposes public health knowledge into concepts and relations and organizes the elements of knowledge based on the teleological functions. Both knowledge and semantic rules are stored in an ontology and retrieved to answer queries regarding EID preparedness and response. A hybrid concept extraction was implemented in this work. The quality of the ontology was evaluated using the formal evaluation method Ontology Quality Evaluation Framework. Our approach is a potentially effective methodology for managing public health knowledge. Accuracy and comprehensiveness of the ontology can be improved as more knowledge is stored. In the future, a survey will be conducted to collect queries from public health practitioners. The reasoning capacity of the ontology will be evaluated using the queries and hypothetical outbreaks. We suggest the importance of developing a knowledge sharing standard like the Gene Ontology for the public health domain. ©Zhizun Zhang, Mila C Gonzalez, Stephen S Morse, Venkat Venkatasubramanian. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 11.10.2017.

  3. A common type system for clinical natural language processing

    PubMed Central

    2013-01-01

    Background One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types. PMID:23286462

  4. A common type system for clinical natural language processing.

    PubMed

    Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G

    2013-01-03

    One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  5. Induction of belief decision trees from data

    NASA Astrophysics Data System (ADS)

    AbuDahab, Khalil; Xu, Dong-ling; Keane, John

    2012-09-01

    In this paper, a method for acquiring belief rule-bases by inductive inference from data is described and evaluated. Existing methods extract traditional rules inductively from data, with consequents that are believed to be either 100% true or 100% false. Belief rules can capture uncertain or incomplete knowledge using uncertain belief degrees in consequents. Instead of using single-valued consequents, each belief rule deals with a set of collectively exhaustive and mutually exclusive consequents. The proposed method extracts belief rules from data which contain uncertain or incomplete knowledge.
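    One simple way to obtain distributed consequents, sketched below under the assumption that belief degrees are estimated as relative frequencies, is to group records by antecedent value combination and let each observed consequent receive the fraction of matching records that support it; conflicting records then yield belief rules rather than 100%/0% rules. The attribute names and data are invented.

      from collections import Counter, defaultdict

      def induce_belief_rules(records):
          """records: list of (antecedent_tuple, consequent) pairs.
          Returns antecedent -> {consequent: belief degree}, degrees summing to 1."""
          counts = defaultdict(Counter)
          for antecedent, consequent in records:
              counts[antecedent][consequent] += 1
          rules = {}
          for antecedent, c in counts.items():
              total = sum(c.values())
              rules[antecedent] = {cons: n / total for cons, n in c.items()}
          return rules

      data = [(("income=high", "credit=good"), "approve"),
              (("income=high", "credit=good"), "approve"),
              (("income=high", "credit=good"), "reject"),    # conflicting evidence
              (("income=low",  "credit=poor"), "reject")]
      for ant, belief in induce_belief_rules(data).items():
          print(ant, belief)   # e.g. 2/3 belief in "approve", 1/3 in "reject"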

  6. Combining Deep and Handcrafted Image Features for Presentation Attack Detection in Face Recognition Systems Using Visible-Light Camera Sensors

    PubMed Central

    Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung

    2018-01-01

    Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases. PMID:29495417
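    The fusion step can be sketched as follows, with extract_cnn_features standing in for any pretrained CNN feature extractor (an assumption, not the paper's network) and a single-scale uniform LBP histogram approximating the handcrafted branch; the concatenated vector is classified with an SVM. The images and labels are random placeholders.

      import numpy as np
      from skimage.feature import local_binary_pattern
      from sklearn.svm import SVC

      def extract_lbp_features(gray_image, points=8, radius=1):
          # single-scale uniform LBP histogram, a stand-in for the multi-level LBP descriptor
          gray = np.asarray(gray_image, dtype=np.uint8)
          lbp = local_binary_pattern(gray, points, radius, method="uniform")
          hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2), density=True)
          return hist

      def extract_cnn_features(image):
          # placeholder for a pretrained CNN feature extractor (assumed, not the paper's network)
          return np.asarray(image, dtype=float).mean(axis=(0, 1))

      def hybrid_features(image):
          gray = image.mean(axis=2)
          return np.concatenate([extract_cnn_features(image), extract_lbp_features(gray)])

      rng = np.random.default_rng(0)
      images = rng.integers(0, 256, size=(20, 64, 64, 3))    # toy face crops
      labels = np.array([0] * 10 + [1] * 10)                 # 0 = real, 1 = attack
      X = np.array([hybrid_features(img) for img in images])
      clf = SVC(kernel="rbf").fit(X, labels)
      print(clf.predict(X[:3]))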

  7. BioNLP Shared Task--The Bacteria Track.

    PubMed

    Bossy, Robert; Jourde, Julien; Manine, Alain-Pierre; Veber, Philippe; Alphonse, Erick; van de Guchte, Maarten; Bessières, Philippe; Nédellec, Claire

    2012-06-26

    We present the BioNLP 2011 Shared Task Bacteria Track, the first Information Extraction challenge entirely dedicated to bacteria. It includes three tasks that cover different levels of biological knowledge. The Bacteria Gene Renaming supporting task is aimed at extracting gene renaming and gene name synonymy in PubMed abstracts. The Bacteria Gene Interaction is a gene/protein interaction extraction task from individual sentences. The interactions have been categorized into ten different sub-types, thus giving a detailed account of genetic regulations at the molecular level. Finally, the Bacteria Biotopes task focuses on the localization and environment of bacteria mentioned in textbook articles. We describe the process of creation for the three corpora, including document acquisition and manual annotation, as well as the metrics used to evaluate the participants' submissions. Three teams submitted to the Bacteria Gene Renaming task; the best team achieved an F-score of 87%. For the Bacteria Gene Interaction task, the only participant reached a global F-score of 77%, although the system's efficiency varies significantly from one sub-type to another. Three teams submitted to the Bacteria Biotopes task with very different approaches; the best team achieved an F-score of 45%. However, the detailed study of the participating systems' efficiency reveals the strengths and weaknesses of each participating system. The three tasks of the Bacteria Track offer participants a chance to address a wide range of issues in Information Extraction, including entity recognition, semantic typing and coreference resolution. We found common trends in the most efficient systems: the systematic use of syntactic dependencies and machine learning. Nevertheless, the originality of the Bacteria Biotopes task encouraged the use of interesting novel methods and techniques, such as term compositionality and scopes wider than the sentence.

  8. Combining Deep and Handcrafted Image Features for Presentation Attack Detection in Face Recognition Systems Using Visible-Light Camera Sensors.

    PubMed

    Nguyen, Dat Tien; Pham, Tuyen Danh; Baek, Na Rae; Park, Kang Ryoung

    2018-02-26

    Although face recognition systems have wide application, they are vulnerable to presentation attack samples (fake samples). Therefore, a presentation attack detection (PAD) method is required to enhance the security level of face recognition systems. Most of the previously proposed PAD methods for face recognition systems have focused on using handcrafted image features, which are designed by expert knowledge of designers, such as Gabor filter, local binary pattern (LBP), local ternary pattern (LTP), and histogram of oriented gradients (HOG). As a result, the extracted features reflect limited aspects of the problem, yielding a detection accuracy that is low and varies with the characteristics of presentation attack face images. The deep learning method has been developed in the computer vision research community, which is proven to be suitable for automatically training a feature extractor that can be used to enhance the ability of handcrafted features. To overcome the limitations of previously proposed PAD methods, we propose a new PAD method that uses a combination of deep and handcrafted features extracted from the images by visible-light camera sensor. Our proposed method uses the convolutional neural network (CNN) method to extract deep image features and the multi-level local binary pattern (MLBP) method to extract skin detail features from face images to discriminate the real and presentation attack face images. By combining the two types of image features, we form a new type of image features, called hybrid features, which has stronger discrimination ability than single image features. Finally, we use the support vector machine (SVM) method to classify the image features into real or presentation attack class. Our experimental results indicate that our proposed method outperforms previous PAD methods by yielding the smallest error rates on the same image databases.

  9. ESIP's Earth Science Knowledge Graph (ESKG) Testbed Project: An Automatic Approach to Building Interdisciplinary Earth Science Knowledge Graphs to Improve Data Discovery

    NASA Astrophysics Data System (ADS)

    McGibbney, L. J.; Jiang, Y.; Burgess, A. B.

    2017-12-01

    Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent knowledge already present across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources, e.g. Web pages, Web Services, etc., via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project enables the advancement of ESIP collaboration areas, including both Discovery and Semantic Technologies, by putting graph information right at our fingertips in an interactive, modern manner and reducing the effort of constructing ontologies. To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.
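    A minimal sketch of the graph-building idea, assuming a small controlled vocabulary of ES entities and treating sentence-level co-occurrence as a weak semantic link, can be written with networkx; ESKG itself uses richer NLP and knowledge extraction over federated agency sources, and the sentences and vocabulary below are invented.

      import itertools
      import networkx as nx

      sentences = [
          "MODIS sea surface temperature products support El Nino monitoring",
          "GRACE gravity data constrain groundwater depletion estimates",
          "Sea surface temperature anomalies correlate with El Nino events",
      ]
      vocabulary = {"sea surface temperature", "el nino", "grace", "groundwater", "modis"}

      graph = nx.Graph()
      for sentence in sentences:
          text = sentence.lower()
          present = [term for term in vocabulary if term in text]
          for a, b in itertools.combinations(sorted(present), 2):
              weight = graph.get_edge_data(a, b, {"weight": 0})["weight"] + 1
              graph.add_edge(a, b, weight=weight)   # co-occurrence as a weak semantic link

      print(list(graph.edges(data=True)))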

  10. Chemical name extraction based on automatic training data generation and rich feature set.

    PubMed

    Yan, Su; Spangler, W Scott; Chen, Ying

    2013-01-01

    The automation of extracting chemical names from text has significant value to biomedical and life science research. A major barrier in this task is the difficulty of getting a sizable and good quality data to train a reliable entity extraction model. Another difficulty is the selection of informative features of chemical names, since comprehensive domain knowledge on chemistry nomenclature is required. Leveraging random text generation techniques, we explore the idea of automatically creating training sets for the task of chemical name extraction. Assuming the availability of an incomplete list of chemical names, called a dictionary, we are able to generate well-controlled, random, yet realistic chemical-like training documents. We statistically analyze the construction of chemical names based on the incomplete dictionary, and propose a series of new features, without relying on any domain knowledge. Compared to state-of-the-art models learned from manually labeled data and domain knowledge, our solution shows better or comparable results in annotating real-world data with less human effort. Moreover, we report an interesting observation about the language for chemical names. That is, both the structural and semantic components of chemical names follow a Zipfian distribution, which resembles many natural languages.
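    The data-generation idea can be sketched as follows: insert names drawn from the incomplete dictionary into filler sentences and emit BIO token labels, producing as many labelled training examples as needed. The dictionary and filler text are toy placeholders, and the paper's statistical modelling of name structure and its feature set are not reproduced.

      import random

      dictionary = ["acetylsalicylic acid", "sodium chloride", "2,4-dinitrophenol"]
      filler = "the sample was treated with {} under reflux for two hours".split()

      def generate_example(rng):
          name = rng.choice(dictionary).split()
          tokens, labels = [], []
          for tok in filler:
              if tok == "{}":
                  tokens.extend(name)
                  labels.extend(["B-CHEM"] + ["I-CHEM"] * (len(name) - 1))
              else:
                  tokens.append(tok)
                  labels.append("O")
          return tokens, labels

      rng = random.Random(7)
      training_set = [generate_example(rng) for _ in range(5)]
      print(training_set[0])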

  11. Knowledge Discovery in Textual Documentation: Qualitative and Quantitative Analyses.

    ERIC Educational Resources Information Center

    Loh, Stanley; De Oliveira, Jose Palazzo M.; Gastal, Fabio Leite

    2001-01-01

    Presents an application of knowledge discovery in texts (KDT) concerning medical records of a psychiatric hospital. The approach helps physicians to extract knowledge about patients and diseases that may be used for epidemiological studies, for training professionals, and to support physicians to diagnose and evaluate diseases. (Author/AEF)

  12. Ion Channel ElectroPhysiology Ontology (ICEPO) - a case study of text mining assisted ontology development.

    PubMed

    Elayavilli, Ravikumar Komandur; Liu, Hongfang

    2016-01-01

    Computational modeling of biological cascades is of great interest to quantitative biologists. Biomedical text has been a rich source of quantitative information. Gathering quantitative parameters and values from biomedical text is a significant challenge in the early steps of computational modeling, as it involves huge manual effort. While automatically extracting such quantitative information from biomedical text may offer some relief, the lack of an ontological representation for a subdomain impedes the normalization of textual extractions to a standard representation. This may render textual extractions less meaningful to domain experts. In this work, we propose a rule-based approach to automatically extract relations involving quantitative data from biomedical text describing ion channel electrophysiology. We further translated the quantitative assertions extracted through text mining into a formal representation that may help in constructing an ontology for ion channel events, using a rule-based approach. We have developed the Ion Channel ElectroPhysiology Ontology (ICEPO) by integrating the information represented in closely related ontologies, such as the Cell Physiology Ontology (CPO) and the Cardiac Electro Physiology Ontology (CPEO), and the knowledge provided by domain experts. The rule-based system achieved an overall F-measure of 68.93% in extracting quantitative data assertions on an independently annotated blind data set. We further made an initial attempt at formalizing the quantitative data assertions extracted from the biomedical text into a formal representation that offers the potential to facilitate the integration of text mining into the ontological workflow, a novel aspect of this study. This work is a case study in which we created a platform that provides formal interaction between ontology development and text mining. We have achieved partial success in extracting quantitative assertions from the biomedical text and formalizing them in an ontological framework. The ICEPO ontology is available for download at http://openbionlp.org/mutd/supplementarydata/ICEPO/ICEPO.owl.
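    One rule of the kind described can be sketched as a regular expression that captures a channel property, a numeric value and a unit from a sentence; the property and unit lists below are illustrative, and the actual ICEPO pipeline uses a much richer rule set plus ontology-based normalization.

      import re

      PATTERN = re.compile(
          r"(?P<property>conductance|half-activation voltage|time constant)\s+of\s+"
          r"(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>pS|mV|ms)",
          re.IGNORECASE,
      )

      sentence = ("The mutant channel showed a single-channel conductance of 35 pS "
                  "and a half-activation voltage of -42.5 mV.")

      for match in PATTERN.finditer(sentence):
          # each match is one candidate quantitative assertion for ontology population
          print(match.group("property"), match.group("value"), match.group("unit"))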

  13. A novel 9 × 9 map-based solvent selection strategy for targeted counter-current chromatography isolation of natural products.

    PubMed

    Liang, Junling; Meng, Jie; Wu, Dingfang; Guo, Mengzhe; Wu, Shihua

    2015-06-26

    Counter-current chromatography (CCC) is an efficient liquid-liquid chromatography technique for separation and purification of complex mixtures such as natural product extracts and synthetic chemicals. However, CCC is still a challenging process requiring special technical knowledge, especially in the selection of appropriate solvent systems. In this work, we introduced a new 9 × 9 map-based solvent selection strategy for CCC isolation of targets, which permits more than 60 hexane-ethyl acetate-methanol-water (HEMWat) solvent systems to serve as starting candidates for the selection of solvent systems. Among these solvent systems, there are clear linear correlations between the partition coefficient (K) and the system numbers. Thus, an appropriate CCC solvent system (i.e., the sweet spot for K = 1) may be hit by measuring the K values of the target in only two random solvent systems. Beyond this, surprisingly, we found that through two sweet spots we could get a line ("sweet line") on which there are infinitely many sweet solvent systems suitable for CCC separation. In these sweet solvent systems, the target has the same partition coefficient (K) but different solubilities. Thus, a better sweet solvent system with higher sample solubility can be obtained for high-capacity CCC preparation. Furthermore, we found that there is a zone ("sweet zone") where all solvent systems have their own sweet partition coefficient values for the target in the range of 0.4 < K < 2.5, or the extended range of 0.25 < K < 16. All results were validated using 14 pure GUESSmix mimic natural products as standards and further confirmed by isolation of several targets, including honokiol and magnolol from the extracts of Magnolia officinalis Rehd. Et Wils and tanshinone IIA from Salvia miltiorrhiza Bunge. In practice, it is much easier to get a suitable solvent system simply by screening two to four HEMWat two-phase solvent systems to obtain the sweet line or sweet zone, without special knowledge or comprehensive standards as references. This is an important advancement for solvent system selection and will also be very useful for isolation of natural products, including Traditional Chinese Medicines. Copyright © 2015 Elsevier B.V. All rights reserved.
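    The two-measurement shortcut can be sketched directly from the reported linear relationship between K and the HEMWat system number: measure K in two systems, interpolate, and read off the system number where K is closest to 1. The system numbers and K values below are made up for illustration.

      def predict_sweet_system(sys_a, k_a, sys_b, k_b, target_k=1.0):
          """Interpolate the reported linear K-vs-system-number relationship to find
          the HEMWat system number where the target's partition coefficient is ~1."""
          slope = (k_b - k_a) / (sys_b - sys_a)
          return sys_a + (target_k - k_a) / slope

      # illustrative K measurements for one target in two arbitrarily chosen systems
      print(round(predict_sweet_system(sys_a=12, k_a=0.5, sys_b=18, k_b=2.1), 1))  # ~13.9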

  14. Induced lexico-syntactic patterns improve information extraction from online medical forums.

    PubMed

    Gupta, Sonal; MacLean, Diana L; Heer, Jeffrey; Manning, Christopher D

    2014-01-01

    To reliably extract two entity types, symptoms and conditions (SCs), and drugs and treatments (DTs), from patient-authored text (PAT) by learning lexico-syntactic patterns from data annotated with seed dictionaries. Despite the increasing quantity of PAT (eg, online discussion threads), tools for identifying medical entities in PAT are limited. When applied to PAT, existing tools either fail to identify specific entity types or perform poorly. Identification of SC and DT terms in PAT would enable exploration of efficacy and side effects for not only pharmaceutical drugs, but also for home remedies and components of daily care. We use SC and DT term dictionaries compiled from online sources to label several discussion forums from MedHelp (http://www.medhelp.org). We then iteratively induce lexico-syntactic patterns corresponding strongly to each entity type to extract new SC and DT terms. Our system is able to extract symptom descriptions and treatments absent from our original dictionaries, such as 'LADA', 'stabbing pain', and 'cinnamon pills'. Our system extracts DT terms with 58-70% F1 score and SC terms with 66-76% F1 score on two forums from MedHelp. We show improvements over MetaMap, OBA, a conditional random field-based classifier, and a previous pattern learning approach. Our entity extractor based on lexico-syntactic patterns is a successful and preferable technique for identifying specific entity types in PAT. To the best of our knowledge, this is the first paper to extract SC and DT entities from PAT. We exhibit learning of informal terms often used in PAT but missing from typical dictionaries. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
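    One bootstrapping iteration can be sketched as follows: sentences are labelled with the seed dictionary, the token contexts around seed matches are collected as candidate patterns, and the most frequent pattern is applied to harvest new candidate terms. The sentences and seeds are invented, and the scoring and filtering used in the actual system are omitted.

      from collections import Counter

      seed_symptoms = {"stabbing pain", "fatigue"}
      sentences = [
          "i have had stabbing pain in my left side for weeks",
          "i have had blurred vision since starting the medication",
          "the fatigue got worse after the new pills",
      ]

      def find_patterns(sentences, seeds, window=3):
          patterns = Counter()
          for s in sentences:
              tokens = s.split()
              padded = " " + s + " "
              for seed in seeds:
                  if " " + seed + " " in padded:
                      start = tokens.index(seed.split()[0])
                      left = tuple(tokens[max(0, start - window):start])
                      patterns[left] += 1          # left-context pattern -> entity slot
          return patterns

      def apply_pattern(sentences, pattern, span=2):
          for s in sentences:
              tokens = s.split()
              for i in range(len(tokens) - len(pattern)):
                  if tuple(tokens[i:i + len(pattern)]) == pattern:
                      yield " ".join(tokens[i + len(pattern):i + len(pattern) + span])

      patterns = find_patterns(sentences, seed_symptoms)
      best = patterns.most_common(1)[0][0]          # e.g. ('i', 'have', 'had')
      # harvests the seed 'stabbing pain' plus 'blurred vision' as a new candidate
      print(list(apply_pattern(sentences, best)))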

  15. Computer sciences

    NASA Technical Reports Server (NTRS)

    Smith, Paul H.

    1988-01-01

    The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.

  16. Subaperture correlation based digital adaptive optics for full field optical coherence tomography.

    PubMed

    Kumar, Abhishek; Drexler, Wolfgang; Leitgeb, Rainer A

    2013-05-06

    This paper proposes a sub-aperture correlation based numerical phase correction method for interferometric full field imaging systems provided the complex object field information can be extracted. This method corrects for the wavefront aberration at the pupil/ Fourier transform plane without the need of any adaptive optics, spatial light modulators (SLM) and additional cameras. We show that this method does not require the knowledge of any system parameters. In the simulation study, we consider a full field swept source OCT (FF SSOCT) system to show the working principle of the algorithm. Experimental results are presented for a technical and biological sample to demonstrate the proof of the principle.
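    The core geometry can be sketched, in a much simplified form, as follows: the Fourier-plane (pupil) field is divided into subapertures, each subaperture is propagated to an intensity image, and phase correlation against a central reference subimage yields per-subaperture shifts that are proportional to the local wavefront slope. The grid size and input field below are placeholders, and the paper's full phase reconstruction and correction are not shown.

      import numpy as np

      def subaperture_shifts(pupil_field, grid=4):
          """Estimate image shifts of each subaperture relative to the central one.
          pupil_field: complex 2-D array (Fourier plane of a full-field acquisition)."""
          n = pupil_field.shape[0]
          step = n // grid

          def subimage(i, j):
              sub = pupil_field[i * step:(i + 1) * step, j * step:(j + 1) * step]
              return np.abs(np.fft.ifft2(sub)) ** 2   # intensity image from one subaperture

          ref = subimage(grid // 2, grid // 2)
          shifts = np.zeros((grid, grid, 2))
          for i in range(grid):
              for j in range(grid):
                  img = subimage(i, j)
                  # phase correlation: peak location gives the relative shift
                  cross = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(img)))
                  peak = np.unravel_index(np.argmax(np.abs(cross)), cross.shape)
                  shifts[i, j] = [p if p <= s // 2 else p - s
                                  for p, s in zip(peak, cross.shape)]
          return shifts   # proportional to local wavefront slope; fit to recover the phase

      field = np.fft.fft2(np.random.default_rng(0).normal(size=(128, 128)))  # placeholder data
      print(subaperture_shifts(field).shape)    # (4, 4, 2) slope estimates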

  17. A knowledge-driven approach to cluster validity assessment.

    PubMed

    Bolshakova, Nadia; Azuaje, Francisco; Cunningham, Pádraig

    2005-05-15

    This paper presents an approach to assessing cluster validity based on similarity knowledge extracted from the Gene Ontology. The program is freely available for non-profit use on request from the authors.

  18. A new linoleiyl arabinopyranoside from the bark of Bauhinia racemosa Lam and a new flavonoidal glycoside from the leaves of Cordia dichotoma Linn.

    PubMed

    Rahman, Md Azizur; Akhtar, Juber

    2016-10-01

    Phytochemical investigation is very valuable for the ethnomedicinally important plants Bauhinia racemosa Lam (BR) and Cordia dichotoma Linn (CD), which are used for the cure of a variety of ailments. This study was thus designed for the phytochemical investigation of BR bark and CD leaves. Phytoconstituents were isolated from the methanolic extracts of the plants by column chromatography using silica gel as the stationary phase. The structures were established on the basis of their physicochemical and spectral data, i.e. IR, (1)H NMR, (13)C NMR and MS. Elution of the columns with different solvents furnished six compounds (1-6) from the methanolic extract of BR bark and three compounds (7-9) from the methanolic extract of CD leaves, which were structurally elucidated. The present phytochemical investigation reports several new compounds, increasing the existing knowledge of phytoconstituents from BR bark and CD leaves, which is very valuable as these drugs are used in the Indian traditional systems of medicine.

  19. Hot-compressed water extraction of polysaccharides from soy hulls.

    PubMed

    Liu, Hua-Min; Wang, Fei-Yun; Liu, Yu-Lan

    2016-07-01

    The polysaccharides of soy hulls were extracted with hot-compressed water at temperatures from 110 to 180°C and various treatment times (10-150 min) in a batch system. It was determined that a moderate temperature and a short time are suitable for the preparation of polysaccharides. The structure of xylan and the inter- and intra-chain hydrogen bonding of cellulose fibrils in the soy hulls were not significantly broken down. The polysaccharides obtained were primarily composed of α-L-arabinofuranosyl units, 4-O-methyl-glucuronic acid units and α-D-galactose units attached with substituted units. A sugar analysis indicated that arabinose was the major component, constituting 35.6-46.9% of the polysaccharide products extracted at 130°C, 140°C, and 150°C. This investigation contributes to the knowledge of the polysaccharides of soy by-products, which can reduce the environmental impact of waste from the food industries. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. A knowledgebase system to enhance scientific discovery: Telemakus

    PubMed Central

    Fuller, Sherrilynne S; Revere, Debra; Bugni, Paul F; Martin, George M

    2004-01-01

    Background With the rapid expansion of scientific research, the ability to effectively find or integrate new domain knowledge in the sciences is proving increasingly difficult. Efforts to improve and speed up scientific discovery are being explored on a number of fronts. However, much of this work is based on traditional search and retrieval approaches and the bibliographic citation presentation format remains unchanged. Methods Case study. Results The Telemakus KnowledgeBase System provides flexible new tools for creating knowledgebases to facilitate retrieval and review of scientific research reports. In formalizing the representation of the research methods and results of scientific reports, Telemakus offers a potential strategy to enhance the scientific discovery process. While other research has demonstrated that aggregating and analyzing research findings across domains augments knowledge discovery, the Telemakus system is unique in combining document surrogates with interactive concept maps of linked relationships across groups of research reports. Conclusion Based on how scientists conduct research and read the literature, the Telemakus KnowledgeBase System brings together three innovations in analyzing, displaying and summarizing research reports across a domain: (1) research report schema, a document surrogate of extracted research methods and findings presented in a consistent and structured schema format which mimics the research process itself and provides a high-level surrogate to facilitate searching and rapid review of retrieved documents; (2) research findings, used to index the documents, allowing searchers to request, for example, research studies which have studied the relationship between neoplasms and vitamin E; and (3) visual exploration interface of linked relationships for interactive querying of research findings across the knowledgebase and graphical displays of what is known as well as, through gaps in the map, what is yet to be tested. The rationale and system architecture are described and plans for the future are discussed. PMID:15507158

  1. Assembly of objects with not fully predefined shapes

    NASA Technical Reports Server (NTRS)

    Arlotti, M. A.; Dimartino, V.

    1989-01-01

    An assembly problem in a non-deterministic environment, i.e., where parts to be assembled have unknown shape, size and location, is described. The only knowledge used by the robot to perform the assembly operation is given by a connectivity rule and geometrical constraints concerning the parts. Once a set of geometrical features of the parts has been extracted by a vision system, applying such a rule allows the determination of the composition sequence. A suitable sensory apparatus allows control of the whole operation.

  2. Protection against Experimental Cryptococcosis following Vaccination with Glucan Particles Containing Cryptococcus Alkaline Extracts.

    PubMed

    Specht, Charles A; Lee, Chrono K; Huang, Haibin; Tipper, Donald J; Shen, Zu T; Lodge, Jennifer K; Leszyk, John; Ostroff, Gary R; Levitz, Stuart M

    2015-12-22

    A vaccine capable of protecting at-risk persons against infections due to Cryptococcus neoformans and Cryptococcus gattii could reduce the substantial global burden of human cryptococcosis. Vaccine development has been hampered though, by lack of knowledge as to which antigens are immunoprotective and the need for an effective vaccine delivery system. We made alkaline extracts from mutant cryptococcal strains that lacked capsule or chitosan. The extracts were then packaged into glucan particles (GPs), which are purified Saccharomyces cerevisiae cell walls composed primarily of β-1,3-glucans. Subcutaneous vaccination with the GP-based vaccines provided significant protection against subsequent pulmonary infection with highly virulent strains of C. neoformans and C. gattii. The alkaline extract derived from the acapsular strain was analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS), and the most abundant proteins were identified. Separation of the alkaline extract by size exclusion chromatography revealed fractions that conferred protection when loaded in GP-based vaccines. Robust Th1- and Th17-biased CD4(+) T cell recall responses were observed in the lungs of vaccinated and infected mice. Thus, our preclinical studies have indicated promising cryptococcal vaccine candidates in alkaline extracts delivered in GPs. Ongoing studies are directed at identifying the individual components of the extracts that confer protection and thus would be promising candidates for a human vaccine. The encapsulated yeast Cryptococcus neoformans and its closely related sister species, Cryptococcus gattii, are major causes of morbidity and mortality, particularly in immunocompromised persons. This study reports on the preclinical development of vaccines to protect at-risk populations from cryptococcosis. Antigens were extracted from Cryptococcus by treatment with an alkaline solution. The extracted antigens were then packaged into glucan particles, which are hollow yeast cell walls composed mainly of β-glucans. The glucan particle-based vaccines elicited robust T cell immune responses and protected mice from otherwise-lethal challenge with virulent strains of C. neoformans and C. gattii. The technology used for antigen extraction and subsequent loading into the glucan particle delivery system is relatively simple and can be applied to vaccine development against other pathogens. Copyright © 2015 Specht et al.

  3. The Adverse Drug Reactions from Patient Reports in Social Media Project: Five Major Challenges to Overcome to Operationalize Analysis and Efficiently Support Pharmacovigilance Process

    PubMed Central

    Dahamna, Badisse; Guillemin-Lanne, Sylvie; Darmoni, Stefan J; Faviez, Carole; Huot, Charles; Katsahian, Sandrine; Leroux, Vincent; Pereira, Suzanne; Richard, Christophe; Schück, Stéphane; Souvignet, Julien; Lillo-Le Louët, Agnès; Texier, Nathalie

    2017-01-01

    Background Adverse drug reactions (ADRs) are an important cause of morbidity and mortality. The classical pharmacovigilance process is limited by underreporting, which justifies the current interest in new knowledge sources such as social media. The Adverse Drug Reactions from Patient Reports in Social Media (ADR-PRISM) project aims to extract ADRs reported by patients in these media. We identified 5 major challenges to overcome to operationalize the analysis of patient posts: (1) variable quality of information on social media, (2) guarantee of data privacy, (3) response to pharmacovigilance expert expectations, (4) identification of relevant information within Web pages, and (5) robust and evolutive architecture. Objective This article aims to describe the current state of advancement of the ADR-PRISM project by focusing on the solutions we have chosen to address these 5 major challenges. Methods In this article, we propose methods and describe the advancement of this project on several aspects: (1) a quality-driven approach for selecting relevant social media for the extraction of knowledge on potential ADRs, (2) an assessment of ethical issues and French regulation for the analysis of data on social media, (3) an analysis of pharmacovigilance expert requirements when reviewing patient posts on the Internet, (4) an extraction method based on natural language processing, pattern-based matching, and selection of relevant medical concepts in reference terminologies, and (5) specifications of a component-based architecture for the monitoring system. Results Considering the 5 major challenges, we (1) selected a set of 21 validated criteria for selecting social media to support the extraction of potential ADRs, (2) proposed solutions to guarantee data privacy of patients posting on the Internet, (3) took into account pharmacovigilance expert requirements with use case diagrams and scenarios, (4) built domain-specific knowledge resources embedding a lexicon, morphological rules, context rules, semantic rules, syntactic rules, and post-analysis processing, and (5) proposed a component-based architecture that allows storage of big data and accessibility to third-party applications through Web services. Conclusions We demonstrated the feasibility of implementing a component-based architecture that allows collection of patient posts on the Internet, near real-time processing of those posts including annotation, and storage in big data structures. In the next steps, we will evaluate the posts identified by the system in social media to clarify the interest and relevance of such an approach to improve conventional pharmacovigilance processes based on spontaneous reporting. PMID:28935617

  4. Efficient chemical-disease identification and relationship extraction using Wikipedia to improve recall.

    PubMed

    Lowe, Daniel M; O'Boyle, Noel M; Sayle, Roger A

    2016-01-01

    Awareness of the adverse effects of chemicals is important in biomedical research and healthcare. Text mining can allow timely and low-cost extraction of this knowledge from the biomedical literature. We extended our text mining solution, LeadMine, to identify diseases and chemical-induced disease relationships (CIDs). LeadMine is a dictionary/grammar-based entity recognizer and was used to recognize and normalize both chemicals and diseases to Medical Subject Headings (MeSH) IDs. The disease lexicon was obtained from three sources: MeSH, the Disease Ontology and Wikipedia. The Wikipedia dictionary was derived from pages with a disease/symptom box, or those where the page title appeared in the lexicon. Composite entities (e.g. heart and lung disease) were detected and mapped to their composite MeSH IDs. For CIDs, we developed a simple pattern-based system to find relationships within the same sentence. Our system was evaluated in the BioCreative V Chemical-Disease Relation task and achieved very good results for both disease concept ID recognition (F1-score: 86.12%) and CIDs (F1-score: 52.20%) on the test set. As our system was over an order of magnitude faster than other solutions evaluated on the task, we were able to apply the same system to the entirety of MEDLINE allowing us to extract a collection of over 250 000 distinct CIDs. © The Author(s) 2016. Published by Oxford University Press.
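
    The within-sentence, pattern-based relationship step described above can be illustrated with a minimal Python sketch. The mini-lexicons, placeholder identifiers and the single trigger pattern are illustrative assumptions, not LeadMine's actual dictionaries or grammar.

```python
import re

# Hypothetical mini-lexicons standing in for the MeSH / Disease Ontology / Wikipedia
# dictionaries; the identifiers are placeholders, not real MeSH IDs.
CHEMICALS = {"cisplatin": "CHEM:0001", "aspirin": "CHEM:0002"}
DISEASES = {"nephrotoxicity": "DIS:0001", "asthma": "DIS:0002"}

# A simple trigger for chemical-induced disease wording within one sentence.
TRIGGER = re.compile(r"\b(induced|caused|associated with)\b", re.IGNORECASE)

def extract_cids(sentence):
    """Return (chemical ID, disease ID) pairs when both co-occur with a trigger."""
    text = sentence.lower()
    if not TRIGGER.search(text):
        return []
    chems = [cid for term, cid in CHEMICALS.items() if term in text]
    diseases = [did for term, did in DISEASES.items() if term in text]
    return [(c, d) for c in chems for d in diseases]

print(extract_cids("Cisplatin-induced nephrotoxicity remains a major clinical problem."))
# [('CHEM:0001', 'DIS:0001')]
```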

  5. Toward End-to-End Face Recognition Through Alignment Learning

    NASA Astrophysics Data System (ADS)

    Zhong, Yuanyi; Chen, Jiansheng; Huang, Bo

    2017-08-01

    Plenty of effective methods have been proposed for face recognition during the past decade. Although these methods differ essentially in many aspects, a common practice among them is to specifically align the facial area based on the prior knowledge of human face structure before feature extraction. In most systems, the face alignment module is implemented independently. This has actually caused difficulties in the designing and training of end-to-end face recognition models. In this paper we study the possibility of alignment learning in end-to-end face recognition, in which neither prior knowledge on facial landmarks nor artificially defined geometric transformations are required. Specifically, spatial transformer layers are inserted in front of the feature extraction layers in a Convolutional Neural Network (CNN) for face recognition. Only human identity clues are used for driving the neural network to automatically learn the most suitable geometric transformation and the most appropriate facial area for the recognition task. To ensure reproducibility, our model is trained purely on the publicly available CASIA-WebFace dataset, and is tested on the Labeled Faces in the Wild (LFW) dataset. We have achieved a verification accuracy of 99.08%, which is comparable to state-of-the-art single-model-based methods.
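
    A minimal sketch of the alignment-learning idea, assuming a PyTorch implementation rather than the authors' code: a spatial transformer block with a small localization network, initialized to the identity transform, is placed in front of the recognition CNN. The layer sizes and the 112x112 input resolution are arbitrary illustration choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTransformer(nn.Module):
    """Learns an affine alignment of the input face driven only by the downstream loss."""
    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(10, 6),
        )
        # Start from the identity transform, i.e. "no alignment" at initialization.
        self.localization[-1].weight.data.zero_()
        self.localization[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.localization(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

faces = torch.randn(4, 3, 112, 112)    # hypothetical mini-batch of face crops
aligned = SpatialTransformer()(faces)  # feed `aligned` into the feature extraction layers
print(aligned.shape)                   # torch.Size([4, 3, 112, 112])
```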

  6. Structural classification of proteins using texture descriptors extracted from the cellular automata image.

    PubMed

    Kavianpour, Hamidreza; Vasighi, Mahdi

    2017-02-01

    Nowadays, having knowledge about cellular attributes of proteins has an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding the protein functionality and folding patterns. Computational methods and intelligence systems can have an important role in performing structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to the surrounding hydrophobicity index. Many important features which are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The extracted features from these images are used to build a classification model by support vector machine. Compared to previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help in revealing some inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.
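
    A rough Python sketch of the pipeline outlined above: a hydrophobicity-based binary encoding of the sequence, an elementary cellular automaton image grown from it, and toy texture statistics that could feed a classifier. The residue grouping, the CA rule and the features chosen here are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

# Hypothetical two-letter reduced alphabet: 1 = hydrophobic residue, 0 = otherwise.
HYDROPHOBIC = set("AVLIMFWCY")

def encode(seq):
    return np.array([1 if aa in HYDROPHOBIC else 0 for aa in seq], dtype=np.uint8)

def cellular_automaton_image(row, steps=64, rule=110):
    """Evolve an elementary cellular automaton; each generation becomes one image row."""
    rule_bits = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    rows = [row]
    for _ in range(steps - 1):
        left, right = np.roll(row, 1), np.roll(row, -1)
        row = rule_bits[4 * left + 2 * row + right]
        rows.append(row)
    return np.stack(rows)

img = cellular_automaton_image(encode("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
features = [img.mean(), img.var()]  # toy texture descriptors for an SVM
print(img.shape, features)
```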

  7. Intelligent Diagnostic Assistant for Complicated Skin Diseases through C5's Algorithm.

    PubMed

    Jeddi, Fatemeh Rangraz; Arabfard, Masoud; Kermany, Zahra Arab

    2017-09-01

    An intelligent diagnostic assistant can be used for the complicated diagnosis of skin diseases, which are among the most common causes of disability. The aim of this study was to design and implement a computerized intelligent diagnostic assistant for complicated skin diseases through C5's algorithm. An applied-developmental study was done in 2015. The knowledge base was developed based on interviews with dermatologists through questionnaires and checklists. Knowledge representation was obtained from the training data in the database using Microsoft Office Excel. Clementine software and C5's algorithm were applied to draw the decision tree. Analysis of test accuracy was performed based on rules extracted using inference chains. The rules extracted from the decision tree were entered into the CLIPS programming environment, and the intelligent diagnostic assistant was then designed. The rules were defined using the forward-chaining inference technique and were entered into the CLIPS programming environment as RULE constructs. The accuracy and error rates obtained in the training phase from the decision tree were 99.56% and 0.44%, respectively. The accuracy of the decision tree was 98% and the error was 2% in the test phase. The intelligent diagnostic assistant can be used as a reliable system with high accuracy, sensitivity, specificity, and agreement.
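
    To make the rule-based inference step concrete, here is a small Python sketch of forward chaining over decision-tree-style rules. The findings and rules are invented for illustration and do not come from the study's knowledge base, which was encoded in CLIPS.

```python
# Hypothetical rules of the kind extracted from a C5.0-style decision tree.
RULES = [
    ({"scaly_patches": True, "itching": True}, "psoriasis"),
    ({"butterfly_rash": True}, "lupus"),
    ({"vesicles": True, "burning_pain": True}, "herpes zoster"),
]

def forward_chain(findings):
    """Fire every rule whose conditions are all satisfied by the observed findings."""
    return [diagnosis for conditions, diagnosis in RULES
            if all(findings.get(k) == v for k, v in conditions.items())]

print(forward_chain({"scaly_patches": True, "itching": True, "vesicles": False}))
# ['psoriasis']
```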

  8. Structural health monitoring feature design by genetic programming

    NASA Astrophysics Data System (ADS)

    Harvey, Dustin Y.; Todd, Michael D.

    2014-09-01

    Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.

  9. Whither systems medicine?

    PubMed Central

    Apweiler, Rolf; Beissbarth, Tim; Berthold, Michael R; Blüthgen, Nils; Burmeister, Yvonne; Dammann, Olaf; Deutsch, Andreas; Feuerhake, Friedrich; Franke, Andre; Hasenauer, Jan; Hoffmann, Steve; Höfer, Thomas; Jansen, Peter LM; Kaderali, Lars; Klingmüller, Ursula; Koch, Ina; Kohlbacher, Oliver; Kuepfer, Lars; Lammert, Frank; Maier, Dieter; Pfeifer, Nico; Radde, Nicole; Rehm, Markus; Roeder, Ingo; Saez-Rodriguez, Julio; Sax, Ulrich; Schmeck, Bernd; Schuppert, Andreas; Seilheimer, Bernd; Theis, Fabian J; Vera, Julio; Wolkenhauer, Olaf

    2018-01-01

    New technologies to generate, store and retrieve medical and research data are inducing a rapid change in clinical and translational research and health care. Systems medicine is the interdisciplinary approach wherein physicians and clinical investigators team up with experts from biology, biostatistics, informatics, mathematics and computational modeling to develop methods to use new and stored data to the benefit of the patient. We here provide a critical assessment of the opportunities and challenges arising out of systems approaches in medicine and from this provide a definition of what systems medicine entails. Based on our analysis of current developments in medicine and healthcare and associated research needs, we emphasize the role of systems medicine as a multilevel and multidisciplinary methodological framework for informed data acquisition and interdisciplinary data analysis to extract previously inaccessible knowledge for the benefit of patients. PMID:29497170

  10. PRO-Elicere: A Study for Create a New Process of Dependability Analysis of Space Computer Systems

    NASA Astrophysics Data System (ADS)

    da Silva, Glauco; Netto Lahoz, Carlos Henrique

    2013-09-01

    This paper presents a new approach to computer system dependability analysis, called PRO-ELICERE, which introduces data mining concepts and intelligent decision-support mechanisms to analyze the potential hazards and failures of a critical computer system. Some techniques and tools that support traditional dependability analysis are also presented, and the concept of knowledge discovery and intelligent databases for critical computer systems is briefly discussed. The paper then introduces the PRO-ELICERE process, an intelligent approach to automating ELICERE, a process created to extract non-functional requirements for critical computer systems. PRO-ELICERE can be used in the V&V activities of projects at the Institute of Aeronautics and Space, such as the Brazilian Satellite Launcher (VLS-1).

  11. Using graph theory to analyze biological networks

    PubMed Central

    2011-01-01

    Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need to investigate a system, not only as individual components but as a whole, emerges. This can be done by examining the elementary constituents individually and then examining how these are connected. The myriad components of a system and their interactions are best characterized as networks, and they are mainly represented as graphs in which thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling combined with knowledge extraction will help us to better understand the biological significance of the system. PMID:21527005
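
    As a concrete example of such network profiling, the following Python sketch uses networkx to compute degree, betweenness centrality and clustering coefficients on a toy interaction network; the gene names and edges are illustrative only.

```python
import networkx as nx

# A toy protein-interaction network; nodes and edges are illustrative, not curated data.
G = nx.Graph()
G.add_edges_from([("TP53", "MDM2"), ("TP53", "BAX"), ("MDM2", "AKT1"),
                  ("AKT1", "MTOR"), ("BAX", "BCL2"), ("BCL2", "AKT1")])

degree = dict(G.degree())                   # local connectivity
betweenness = nx.betweenness_centrality(G)  # nodes bridging network regions
clustering = nx.clustering(G)               # local neighborhood density

for node in G:
    print(node, degree[node], round(betweenness[node], 3), round(clustering[node], 3))
```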

  12. A digital system for surface reconstruction

    USGS Publications Warehouse

    Zhou, Weiyang; Brock, Robert H.; Hopkins, Paul F.

    1996-01-01

    A digital photogrammetric system, STEREO, was developed to determine three dimensional coordinates of points of interest (POIs) defined with a grid on a textureless and smooth-surfaced specimen. Two CCD cameras were set up with unknown orientation and recorded digital images of a reference model and a specimen. Points on the model were selected as control or check points for calibrating or assessing the system. A new algorithm for edge-detection called local maximum convolution (LMC) helped extract the POIs from the stereo image pairs. The system then matched the extracted POIs and used a least squares “bundle” adjustment procedure to solve for the camera orientation parameters and the coordinates of the POIs. An experiment with STEREO found that the standard deviation of the residuals at the check points was approximately 24%, 49% and 56% of the pixel size in the X, Y and Z directions, respectively. The average of the absolute values of the residuals at the check points was approximately 19%, 36% and 49% of the pixel size in the X, Y and Z directions, respectively. With the graphical user interface, STEREO demonstrated a high degree of automation and its operation does not require special knowledge of photogrammetry, computers or image processing.

  13. IMAGE 100: The interactive multispectral image processing system

    NASA Technical Reports Server (NTRS)

    Schaller, E. S.; Towles, R. W.

    1975-01-01

    The need for rapid, cost-effective extraction of useful information from vast quantities of multispectral imagery available from aircraft or spacecraft has resulted in the design, implementation and application of a state-of-the-art processing system known as IMAGE 100. Operating on the general principle that all objects or materials possess unique spectral characteristics or signatures, the system uses this signature uniqueness to identify similar features in an image by simultaneously analyzing signatures in multiple frequency bands. Pseudo-colors, or themes, are assigned to features having identical spectral characteristics. These themes are displayed on a color CRT, and may be recorded on tape, film, or other media. The system was designed to incorporate key features such as interactive operation, user-oriented displays and controls, and rapid-response machine processing. Owing to these features, the user can readily control and/or modify the analysis process based on his knowledge of the input imagery. Effective use can be made of conventional photographic interpretation skills and state-of-the-art machine analysis techniques in the extraction of useful information from multispectral imagery. This approach results in highly accurate multitheme classification of imagery in seconds or minutes rather than the hours often involved in processing using other means.

  14. Using knowledge for indexing health web resources in a quality-controlled gateway.

    PubMed

    Joubert, Michel; Darmoni, Stefan J; Avillach, Paul; Dahamna, Badisse; Fieschi, Marius

    2008-01-01

    The aim of this study is to provide indexers with MeSH terms to be considered as major ones in a list of terms automatically extracted from a document. We propose a method combining symbolic knowledge - the UMLS Metathesaurus and Semantic Network - and statistical knowledge drawn from co-occurrences of terms in the CISMeF database (a French-language quality-controlled health gateway) using data mining measures. The method was tested on a CISMeF corpus of 293 resources. The processed records contained a proportion of 0.37±0.26 major terms. The method produced lists of terms in which the proportion of terms initially marked as major was 0.54±0.31. The method we propose reduces the number of terms that seem not useful for content description of resources, such as "check tags", but retains the most descriptive ones. Discarding these terms is accounted for by: 1) the removal, using semantic knowledge, of associations of concepts bearing no real medical significance, and 2) the removal, using statistical knowledge, of non-statistically significant associations of terms. This method can effectively assist indexers in their daily work and will soon be applied in the CISMeF system.
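
    A small Python sketch of the statistical half of such an approach, assuming a lift-style co-occurrence measure over indexing records. The terms, records and the choice of lift as the data mining measure are illustrative assumptions, not the exact measures applied to the CISMeF database.

```python
from collections import Counter
from itertools import combinations

# Hypothetical indexing records: each resource is described by a set of MeSH terms.
records = [
    {"asthma", "child", "therapy"},
    {"asthma", "therapy", "guideline"},
    {"asthma", "child"},
    {"diabetes", "therapy"},
]

pair_counts, term_counts, n = Counter(), Counter(), len(records)
for terms in records:
    term_counts.update(terms)
    pair_counts.update(combinations(sorted(terms), 2))

def lift(a, b):
    """How much more often two terms co-occur than expected under independence."""
    pair = pair_counts[tuple(sorted((a, b)))]
    return (pair / n) / ((term_counts[a] / n) * (term_counts[b] / n))

print(round(lift("asthma", "therapy"), 2))
```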

  15. Shape Adaptive, Robust Iris Feature Extraction from Noisy Iris Images

    PubMed Central

    Ghodrati, Hamed; Dehghani, Mohammad Javad; Danyali, Habibolah

    2013-01-01

    In the current iris recognition systems, the noise-removal step is only used to detect noisy parts of the iris region, and features extracted from those parts are excluded in the matching step. However, depending on the filter structure used in feature extraction, the noisy parts may influence relevant features. To the best of our knowledge, the effect of noise factors on feature extraction has not been considered in the previous works. This paper investigates the effect of shape adaptive wavelet transform and shape adaptive Gabor-wavelet for feature extraction on the iris recognition performance. In addition, an effective noise-removing approach is proposed in this paper. The contribution is to detect eyelashes and reflections by calculating appropriate thresholds by a procedure called statistical decision making. The eyelids are segmented by parabolic Hough transform in the normalized iris image to decrease the computational burden through omitting the rotation term. The iris is localized by an accurate and fast algorithm based on a coarse-to-fine strategy. The principle of mask code generation, which flags the noisy bits in an iris code so that they can be excluded in the matching step, is presented in detail. Experimental results show that using the shape adaptive Gabor-wavelet technique improves the recognition rate. PMID:24696801

  16. Shape adaptive, robust iris feature extraction from noisy iris images.

    PubMed

    Ghodrati, Hamed; Dehghani, Mohammad Javad; Danyali, Habibolah

    2013-10-01

    In the current iris recognition systems, the noise-removal step is only used to detect noisy parts of the iris region, and features extracted from those parts are excluded in the matching step. However, depending on the filter structure used in feature extraction, the noisy parts may influence relevant features. To the best of our knowledge, the effect of noise factors on feature extraction has not been considered in the previous works. This paper investigates the effect of shape adaptive wavelet transform and shape adaptive Gabor-wavelet for feature extraction on the iris recognition performance. In addition, an effective noise-removing approach is proposed in this paper. The contribution is to detect eyelashes and reflections by calculating appropriate thresholds by a procedure called statistical decision making. The eyelids are segmented by parabolic Hough transform in the normalized iris image to decrease the computational burden through omitting the rotation term. The iris is localized by an accurate and fast algorithm based on a coarse-to-fine strategy. The principle of mask code generation, which flags the noisy bits in an iris code so that they can be excluded in the matching step, is presented in detail. Experimental results show that using the shape adaptive Gabor-wavelet technique improves the recognition rate.
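
    The role of the mask code in matching can be sketched as a masked fractional Hamming distance, shown below in Python; this is a generic formulation, not the authors' implementation, and the code length and corrupted-bit counts are arbitrary illustration values.

```python
import numpy as np

def masked_hamming_distance(code_a, code_b, mask_a, mask_b):
    """Fractional Hamming distance over bits marked valid in both mask codes.

    Bits flagged as noisy (eyelashes, reflections, eyelids) are excluded from matching.
    """
    valid = mask_a & mask_b
    n_valid = valid.sum()
    if n_valid == 0:
        return 1.0  # no comparable bits; treat as a non-match
    disagreements = ((code_a ^ code_b) & valid).sum()
    return disagreements / n_valid

rng = np.random.default_rng(0)
code_a = rng.integers(0, 2, 2048, dtype=np.uint8)
code_b = code_a.copy()
code_b[:100] ^= 1                     # corrupt 100 bits...
mask = np.ones(2048, dtype=np.uint8)
mask[:100] = 0                        # ...but flag them as noisy in the mask code
print(masked_hamming_distance(code_a, code_b, mask, mask))  # 0.0
```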

  17. PASBio: predicate-argument structures for event extraction in molecular biology

    PubMed Central

    Wattarujeekrit, Tuangthong; Shah, Parantu K; Collier, Nigel

    2004-01-01

    Background The exploitation of information extraction (IE), a technology aiming to provide instances of structured representations from free-form text, has been rapidly growing within the molecular biology (MB) research community to keep track of the latest results reported in literature. IE systems have traditionally used shallow syntactic patterns for matching facts in sentences but such approaches appear inadequate to achieve high accuracy in MB event extraction due to complex sentence structure. A consensus in the IE community is emerging on the necessity for exploiting deeper knowledge structures such as through the relations between a verb and its arguments shown by predicate-argument structure (PAS). PAS is of interest as structures typically correspond to events of interest and their participating entities. For this to be realized within IE a key knowledge component is the definition of PAS frames. PAS frames for non-technical domains such as newswire are already being constructed in several projects such as PropBank, VerbNet, and FrameNet. Knowledge from PAS should enable more accurate applications in several areas where sentence understanding is required like machine translation and text summarization. In this article, we explore the need to adapt PAS for the MB domain and specify PAS frames to support IE, as well as outlining the major issues that require consideration in their construction. Results We introduce PASBio by extending a model based on PropBank to the MB domain. The hypothesis we explore is that PAS holds the key for understanding relationships describing the roles of genes and gene products in mediating their biological functions. We chose predicates describing gene expression, molecular interactions and signal transduction events with the aim of covering a number of research areas in MB. Analysis was performed on sentences containing a set of verbal predicates from MEDLINE and full text journals. Results confirm the necessity to analyze PAS specifically for MB domain. Conclusions At present PASBio contains the analyzed PAS of over 30 verbs, publicly available on the Internet for use in advanced applications. In the future we aim to expand the knowledge base to cover more verbs and the nominal form of each predicate. PMID:15494078

  18. A Framework to Learn Physics from Atomically Resolved Images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vlcek, L.; Maksov, A.; Pan, M.

    Here, we present a generalized framework for the extraction of physics, i.e., knowledge, from atomically resolved images, and show its utility by applying it to a model system of segregation of chalcogen atoms in an FeSe0.45Te0.55 superconductor system. We emphasize that the framework can be used for any imaging data for which a generative physical model exists. Consider that a generative physical model can produce a very large number of configurations, not all of which are observable. By applying a microscope function to a subset of this generated data, we form a simulated dataset on which statistics can be computed.

  19. Identification of phenanthrene derivatives in Aerides rosea (Orchidaceae) using the combined systems HPLC-ESI-HRMS/MS and HPLC-DAD-MS-SPE-UV-NMR.

    PubMed

    Cakova, Veronika; Urbain, Aurélie; Antheaume, Cyril; Rimlinger, Nicole; Wehrung, Patrick; Bonté, Frédéric; Lobstein, Annelise

    2015-01-01

    In our continued efforts to contribute to the general knowledge on the chemical diversity of orchids, we have decided to focus our investigations on the Aeridinae subtribe. Following our previous phytochemical study of Vanda coerulea, which has led to the identification of phenanthrene derivatives, a closely related species, Aerides rosea Lodd. ex Lindl. & Paxton, was chosen for investigation. The objective was to identify new secondary metabolites, and to avoid isolating those already known, by means of the combined systems HPLC-DAD (diode-array detector) with high-resolution tandem mass spectrometry (HRMS/MS) and HPLC-DAD-MS-SPE (solid-phase extraction)-UV-NMR. A dereplication strategy was developed using a HPLC-DAD-HRMS/MS targeted method and applied to fractions from A. rosea stem extract. Characterisation of unknown minor compounds was then performed using the combined HPLC-DAD-MS-SPE-UV-NMR system. The dereplication method allowed the characterisation of four compounds (gigantol, imbricatin, methoxycoelonin and coelonin), previously isolated from Vanda coerulea stem extract. The analyses of two fractions permitted the identification of five additional minor constituents including one phenanthropyran, two phenanthrene and two dihydrophenanthrene derivatives. The full set of NMR data of each compound was obtained from microgram quantities. Nine secondary metabolites were characterised in A. rosea stems, utilising HPLC systems combined with high-resolution analytical systems. Two of them are newly described phenanthrene derivatives: aerosanthrene (5-methoxyphenanthrene-2,3,7-triol) and aerosin (3-methoxy-9,10-dihydro-2,5,7-phenanthrenetriol). Copyright © 2014 John Wiley & Sons, Ltd.

  20. The explosion at institute: modeling and analyzing the situation awareness factor.

    PubMed

    Naderpour, Mohsen; Lu, Jie; Zhang, Guangquan

    2014-12-01

    In 2008 a runaway chemical reaction caused an explosion at a methomyl unit in West Virginia, USA, killing two employees, injuring eight people, evacuating more than 40,000 residents adjacent to the facility, disrupting traffic on a nearby highway and causing significant business loss and interruption. Although the accident was formally investigated, the role of the situation awareness (SA) factor, i.e., a correct understanding of the situation, and appropriate models to maintain SA, remain unexplained. This paper extracts details of abnormal situations within the methomyl unit and models them into a situational network using dynamic Bayesian networks. A fuzzy logic system is used to resemble the operator's thinking when confronted with these abnormal situations. The combined situational network and fuzzy logic system make it possible for the operator to assess such situations dynamically to achieve accurate SA. The findings show that the proposed structure provides a useful graphical model that facilitates the inclusion of prior background knowledge and the updating of this knowledge when new information is available from monitoring systems. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.

    PubMed

    Santos, Carlos; Eggle, Daniela; States, David J

    2005-04-15

    Wnt signaling is a very active area of research with highly relevant publications appearing at a rate of more than one per day. Building and maintaining databases describing signal transduction networks is a time-consuming and demanding task that requires careful literature analysis and extensive domain-specific knowledge. For instance, more than 50 factors involved in Wnt signal transduction have been identified as of late 2003. In this work we describe a natural language processing (NLP) system that is able to identify references to biological interaction networks in free text and automatically assembles a protein association and interaction map. A 'gold standard' set of names and assertions was derived by manual scanning of the Wnt genes website (http://www.stanford.edu/~rnusse/wntwindow.html) including 53 interactions involved in Wnt signaling. This system was used to analyze a corpus of peer-reviewed articles related to Wnt signaling including 3369 Pubmed and 1230 full text papers. Names for key Wnt-pathway associated proteins and biological entities are identified using a chi-squared analysis of noun phrases over-represented in the Wnt literature as compared to the general signal transduction literature. Interestingly, we identified several instances where generic terms were used on the website when more specific terms occur in the literature, and one typographic error on the Wnt canonical pathway. Using the named entity list and performing an exhaustive assertion extraction of the corpus, 34 of the 53 interactions in the 'gold standard' Wnt signaling set were successfully identified (64% recall). In addition, the automated extraction found several interactions involving key Wnt-related molecules which were missing or different from those in the canonical diagram, and these were confirmed by manual review of the text. These results suggest that a combination of NLP techniques for information extraction can form a useful first-pass tool for assisting human annotation and maintenance of signal pathway databases. The pipeline software components are freely available on request to the authors. dstates@umich.edu http://stateslab.bioinformatics.med.umich.edu/software.html.
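
    The chi-squared over-representation test for candidate noun phrases can be sketched as below; the counts for "beta-catenin" and the corpus sizes are hypothetical, and the 2x2 contingency formulation is one plausible reading of the analysis described, not the authors' exact procedure.

```python
from scipy.stats import chi2_contingency

def overrepresentation(term_wnt, total_wnt, term_bg, total_bg):
    """Chi-squared test that a noun phrase is over-represented in the Wnt corpus
    relative to the general signal-transduction background corpus."""
    table = [[term_wnt, total_wnt - term_wnt],
             [term_bg, total_bg - term_bg]]
    chi2, p, _, _ = chi2_contingency(table)
    return chi2, p

# Hypothetical counts: 350 of 50,000 Wnt noun phrases vs 120 of 500,000 background ones.
chi2, p = overrepresentation(350, 50_000, 120, 500_000)
print(round(chi2, 1), p)
```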

  2. Argumentation Based Joint Learning: A Novel Ensemble Learning Approach

    PubMed Central

    Xu, Junyi; Yao, Li; Li, Le

    2015-01-01

    Recently, ensemble learning methods have been widely used to improve classification performance in machine learning. In this paper, we present a novel ensemble learning method: argumentation based multi-agent joint learning (AMAJL), which integrates ideas from multi-agent argumentation, ensemble learning, and association rule mining. In AMAJL, argumentation technology is introduced as an ensemble strategy to integrate multiple base classifiers and generate a high performance ensemble classifier. We design an argumentation framework named Arena as a communication platform for knowledge integration. Through argumentation based joint learning, high quality individual knowledge can be extracted, and thus a refined global knowledge base can be generated and used independently for classification. We perform numerous experiments on multiple public datasets using AMAJL and other benchmark methods. The results demonstrate that our method can effectively extract high quality knowledge for ensemble classifier and improve the performance of classification. PMID:25966359

  3. Bridging knowledge, policies and practices across the ageing and disability fields: a protocol for a scoping review to inform the development of a taxonomy.

    PubMed

    Nalder, Emily Joan; Putnam, Michelle; Salvador-Carulla, Luis; Spindel, Andria; Batliwalla, Zinnia; Lenton, Erica

    2017-10-25

    Bridging is a term used to describe activities, or tasks, used to promote collaboration and knowledge exchange across fields. This paper reports the protocol for a scoping review which aims to identify and characterise peer reviewed evidence describing bridging activities, between the ageing and disability fields. The purpose is to clarify the concepts underpinning bridging to inform the development of a taxonomy, and identify research strengths and gaps. A scoping review will be conducted. We will search Medline, Cumulative Index to Nursing and Allied Health Literature, Embase, PsycInfo, Sociological Abstracts and the Cochrane Library, to identify peer reviewed publications (reviews, experimental, observational, qualitative designs and expert commentaries) describing bridging activities. Grey literature, and articles not published in English will be excluded. Two investigators will independently complete article selection and data abstraction to minimise bias. A data extraction form will be iteratively developed and information from each publication will be extracted: (1) bibliographic, (2) methodological, (3) demographic, and (4) bridging information. Qualitative content analysis will be used to describe key concepts related to bridging. To our knowledge, this will be the first scoping review to describe bridging of ageing and disability knowledge, services and policies. The findings will inform the development of a taxonomy to define models of bridging that can be implemented and further evaluated to enable integrated care and improve systems and services for those ageing with disability. Ethics is not required because this is a scoping review of published literature. Findings will be disseminated through stakeholder meetings, conference presentations and peer reviewed publication. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  4. Data Mining and Knowledge Discovery tools for exploiting big Earth-Observation data

    NASA Astrophysics Data System (ADS)

    Espinoza Molina, D.; Datcu, M.

    2015-04-01

    The continuous increase in the size of the archives and in the variety and complexity of Earth-Observation (EO) sensors require new methodologies and tools that allow the end-user to access a large image repository, to extract and to infer knowledge about the patterns hidden in the images, to retrieve dynamically a collection of relevant images, and to support the creation of emerging applications (e.g.: change detection, global monitoring, disaster and risk management, image time series, etc.). In this context, we are concerned with providing a platform for data mining and knowledge discovery content from EO archives. The platform's goal is to implement a communication channel between Payload Ground Segments and the end-user who receives the content of the data coded in an understandable format associated with semantics that is ready for immediate exploitation. It will provide the user with automated tools to explore and understand the content of highly complex images archives. The challenge lies in the extraction of meaningful information and understanding observations of large extended areas, over long periods of time, with a broad variety of EO imaging sensors in synergy with other related measurements and data. The platform is composed of several components such as 1.) ingestion of EO images and related data providing basic features for image analysis, 2.) query engine based on metadata, semantics and image content, 3.) data mining and knowledge discovery tools for supporting the interpretation and understanding of image content, 4.) semantic definition of the image content via machine learning methods. All these components are integrated and supported by a relational database management system, ensuring the integrity and consistency of Terabytes of Earth Observation data.

  5. Toward Scalable Trustworthy Computing Using the Human-Physiology-Immunity Metaphor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hively, Lee M; Sheldon, Frederick T

    The cybersecurity landscape consists of an ad hoc patchwork of solutions. Optimal cybersecurity is difficult for various reasons: complexity, immense data and processing requirements, resource-agnostic cloud computing, practical time-space-energy constraints, inherent flaws in 'Maginot Line' defenses, and the growing number and sophistication of cyberattacks. This article defines the high-priority problems and examines the potential solution space. In that space, achieving scalable trustworthy computing and communications is possible through real-time knowledge-based decisions about cyber trust. This vision is based on the human-physiology-immunity metaphor and the human brain's ability to extract knowledge from data and information. The article outlines future steps toward scalable trustworthy systems requiring a long-term commitment to solve the well-known challenges.

  6. Mining Personal Data Using Smartphones and Wearable Devices: A Survey

    PubMed Central

    Rehman, Muhammad Habib ur; Liew, Chee Sun; Wah, Teh Ying; Shuja, Junaid; Daghighi, Babak

    2015-01-01

    The staggering growth in smartphone and wearable device use has led to a massive scale generation of personal (user-specific) data. To explore, analyze, and extract useful information and knowledge from the deluge of personal data, one has to leverage these devices as the data-mining platforms in ubiquitous, pervasive, and big data environments. This study presents the personal ecosystem where all computational resources, communication facilities, storage and knowledge management systems are available in user proximity. An extensive review on recent literature has been conducted and a detailed taxonomy is presented. The performance evaluation metrics and their empirical evidences are sorted out in this paper. Finally, we have highlighted some future research directions and potentially emerging application areas for personal data mining using smartphones and wearable devices. PMID:25688592

  7. HPLC-ESI-QTOF-MS as a powerful analytical tool for characterising phenolic compounds in olive-leaf extracts.

    PubMed

    Quirantes-Piné, Rosa; Lozano-Sánchez, Jesús; Herrero, Miguel; Ibáñez, Elena; Segura-Carretero, Antonio; Fernández-Gutiérrez, Alberto

    2013-01-01

    Olea europaea L. leaves may be considered a cheap, easily available natural source of phenolic compounds. In a previous study we evaluated the possibility of obtaining bioactive phenolic compounds from olive leaves by pressurised liquid extraction (PLE) for their use as natural anti-oxidants. The alimentary use of these kinds of extract makes comprehensive knowledge of their composition essential. To undertake a comprehensive characterisation of two olive-leaf extracts obtained by PLE using high-performance liquid chromatography coupled to electrospray ionisation and quadrupole time-of-flight mass spectrometry (HPLC-ESI-QTOF-MS). Olive leaves were extracted by PLE using ethanol and water as extraction solvents at 150°C and 200°C respectively. Separation was carried out in a HPLC system equipped with a C₁₈-column working in a gradient elution programme coupled to ESI-QTOF-MS operating in negative ion mode. This analytical platform was able to detect 48 compounds and tentatively identify 31 different phenolic compounds in these extracts, including secoiridoids, simple phenols, flavonoids, cinnamic-acid derivatives and benzoic acids. Lucidumoside C was also identified for the first time in olive leaves. The coupling of HPLC-ESI-QTOF-MS led to the in-depth characterisation of the olive-leaf extracts on the basis of mass accuracy, true isotopic pattern and tandem mass spectrometry (MS/MS) spectra. We may conclude therefore that this analytical tool is very valuable in the study of phenolic compounds in plant matrices. Copyright © 2012 John Wiley & Sons, Ltd.

  8. Managing biological networks by using text mining and computer-aided curation

    NASA Astrophysics Data System (ADS)

    Yu, Seok Jong; Cho, Yongseong; Lee, Min-Ho; Lim, Jongtae; Yoo, Jaesoo

    2015-11-01

    In order to understand a biological mechanism in a cell, a researcher must collect a huge number of protein interactions, together with supporting experimental data, from experiments and the literature. Text mining systems that extract biological interactions from papers have been used to construct biological networks for a few decades. Even though the text mining of literature is necessary to construct a biological network, few systems with a text mining tool are available for biologists who want to construct their own biological networks. We have developed a biological network construction system called BioKnowledge Viewer that can generate a biological interaction network by using a text mining tool and biological taggers. It also includes Boolean simulation software that provides a biological modeling system for simulating the model built with the text mining tool. A user can download PubMed articles and construct a biological network by using the Multi-level Knowledge Emergence Model (KMEM), MetaMap, and A Biomedical Named Entity Recognizer (ABNER) as a text mining tool. To evaluate the system, we constructed an aging-related biological network that consists of 9,415 nodes (genes), using manual curation. With network analysis, we found that several genes, including JNK, AP-1, and BCL-2, were highly related in the aging biological network. We provide a semi-automatic curation environment so that users can obtain a graph database for managing text mining results that are generated in the server system and can navigate the network with BioKnowledge Viewer, which is freely available at http://bioknowledgeviewer.kisti.re.kr.

  9. Emerging medical informatics with case-based reasoning for aiding clinical decision in multi-agent system.

    PubMed

    Shen, Ying; Colloc, Joël; Jacquet-Andrieu, Armelle; Lei, Kai

    2015-08-01

    This research aims to depict the methodological steps and tools for the combined operation of case-based reasoning (CBR) and a multi-agent system (MAS) to expose the ontological application in the field of clinical decision support. The multi-agent architecture covers the whole cycle of clinical decision-making and is adaptable to many medical aspects such as the diagnosis, prognosis, treatment, and therapeutic monitoring of gastric cancer. In the multi-agent architecture, the ontological agent type employs the domain knowledge to ease the extraction of similar clinical cases and provide treatment suggestions to patients and physicians. The ontological agent is used for the extension of the domain hierarchy and the interpretation of input requests. Case-based reasoning memorizes and restores experience data for solving similar problems, with the help of a matching approach and defined interfaces of ontologies. A typical case is developed to illustrate the implementation of the knowledge acquisition and restitution of medical experts. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Atlasmaker: A Grid-based Implementation of the Hyperatlas

    NASA Astrophysics Data System (ADS)

    Williams, R.; Djorgovski, S. G.; Feldmann, M. T.; Jacob, J.

    2004-07-01

    The Atlasmaker project is using Grid technology, in combination with NVO interoperability, to create new knowledge resources in astronomy. The product is a multi-faceted, multi-dimensional, scientifically trusted image atlas of the sky, made by federating many different surveys at different wavelengths, times, resolutions, polarizations, etc. The Atlasmaker software does resampling and mosaicking of image collections, and is well-suited to operate with the Hyperatlas standard. Requests can be satisfied via on-demand computations or by accessing a data cache. Computed data is stored in a distributed virtual file system, such as the Storage Resource Broker (SRB). We expect these atlases to be a new and powerful paradigm for knowledge extraction in astronomy, as well as a magnificent way to build educational resources. The system is being incorporated into the data analysis pipeline of the Palomar-Quest synoptic survey, and is being used to generate all-sky atlases from the 2MASS, SDSS, and DPOSS surveys for joint object detection.

  11. The Analysis of Image Segmentation Hierarchies with a Graph-based Knowledge Discovery System

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Cooke, Diane J.; Ketkar, Nikhil; Aksoy, Selim

    2008-01-01

    Currently available pixel-based analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. A general consensus is that object-based image analysis (OBIA) is required to effectively analyze this type of data. OBIA is usually a two-stage process: image segmentation followed by an analysis of the segmented objects. We are exploring an approach to OBIA in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software developed at NASA GSFC are analyzed by the Subdue graph-based knowledge discovery system developed by a team at Washington State University. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and provide results on real and simulated data. We also discuss planned improvements designed to more effectively and completely convey the hierarchical segmentation information to Subdue and to improve processing efficiency.

  12. Toward the integration of expert knowledge and instrumental data to control food processes: application to Camembert-type cheese ripening.

    PubMed

    Sicard, M; Perrot, N; Leclercq-Perlat, M-N; Baudrit, C; Corrieu, G

    2011-01-01

    Modeling the cheese ripening process remains a challenge because of its complexity. We still lack the knowledge necessary to understand the interactions that take place at different levels of scale during the process. However, information may be gathered from expert knowledge. Combining this expertise with knowledge extracted from experimental databases may allow a better understanding of the entire ripening process. The aim of this study was to elicit expert knowledge and to check its validity to assess the evolution of organoleptic quality during a dynamic food process: Camembert cheese ripening. Experiments on a pilot scale were carried out at different temperatures and relative humidities to obtain contrasting ripening kinetics. During these experiments, macroscopic evolution was evaluated from an expert's point of view and instrumental measurements were carried out to simultaneously monitor microbiological, physicochemical, and biochemical kinetics. A correlation of 76% was established between the microbiological, physicochemical, and biochemical data and the sensory phases measured according to expert knowledge, highlighting the validity of the experts' measurements. In the future, it is hoped that this expert knowledge may be integrated into food process models to build better decision-aid systems that will make it possible to preserve organoleptic qualities by linking them to other phenomena at the microscopic level. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  13. Aggregating Concept Map Data to Investigate the Knowledge of Beginning CS Students

    ERIC Educational Resources Information Center

    Mühling, Andreas

    2016-01-01

    Concept maps have a long history in educational settings as a tool for teaching, learning, and assessing. As an assessment tool, they are predominantly used to extract the structural configuration of learners' knowledge. This article presents an investigation of the knowledge structures of a large group of beginning CS students. The investigation…

  14. Knowledge Preservation for Design of Rocket Systems

    NASA Technical Reports Server (NTRS)

    Moreman, Douglas

    2002-01-01

    An engineer at NASA Lewis RC presented a challenge to us at Southern University. Our response to that challenge, stated circa 1993, has evolved into the Knowledge Preservation Project which is here reported. The stated problem was to capture some of the knowledge of retiring NASA engineers and make it useful to younger engineers via computers. We evolved that initial challenge to this - design a system of tools such that, with this system, people might efficiently capture and make available via commonplace computers, deep knowledge of retiring NASA engineers. In the process of proving some of the concepts of this system, we would (and did) capture knowledge from some specific engineers and, so, meet the original challenge along the way to meeting the new. Some of the specific knowledge acquired, particularly that on the RL- 10 engine, was directly relevant to design of rocket engines. We considered and rejected some of the techniques popular in the days we began - specifically "expert systems" and "oral histories". We judged that these old methods had too high a cost per sentence preserved. That cost could be measured in hours of labor of a "knowledge professional". We did spend, particularly in the grant preceding this one, some time creating a couple of "concept maps", one of the latest ideas of the day, but judged this also to be costly in time of a specially trained knowledge-professional. We reasoned that the cost in specialized labor could be lowered if less time were spent being selective about sentences from the engineers and in crafting replacements for those sentences. The trade-off would seem to be that our set of sentences would be less dense in information, but we found a computer-based way around this seeming defect. Our plan, details of which we have been carrying out, was to find methods of extracting information from experts which would be capable of gaining cooperation, and interest, of senior engineers and using their time in a way they would find worthy (and, so, they would give more of their time and recruit time of other engineers as well). We studied these four ways of creating text: 1) the old way, via interviews and discussions - one of our team working with one expert, 2) a group-discussion led by one of the experts themselves and on a topic which inspires interaction of the experts, 3) a spoken dissertation by one expert practiced in giving talks, 4) expropriating, and modifying for our system, some existing reports (such as "oral histories" from the Smithsonian Institution).

  15. Automated DICOM metadata and volumetric anatomical information extraction for radiation dosimetry

    NASA Astrophysics Data System (ADS)

    Papamichail, D.; Ploussi, A.; Kordolaimi, S.; Karavasilis, E.; Papadimitroulas, P.; Syrgiamiotis, V.; Efstathopoulos, E.

    2015-09-01

    Patient-specific dosimetry calculations based on simulation techniques have as a prerequisite the modeling of the modality system and the creation of voxelized phantoms. This procedure requires knowledge of the scanning parameters and patient information included in a DICOM file, as well as image segmentation. However, the extraction of this information is complicated and time-consuming. The objective of this study was to develop a simple graphical user interface (GUI) to (i) automatically extract metadata from every slice image of a DICOM file in a single query and (ii) interactively specify the regions of interest (ROI) without explicit access to the radiology information system. The user-friendly application was developed in the Matlab environment. The user can select a series of DICOM files and manage their text and graphical data. The metadata are automatically formatted and presented to the user as a Microsoft Excel file. The volumetric maps are formed by interactively specifying the ROIs and by assigning a specific value to every ROI. The result is stored in DICOM format for data and trend analysis. The developed GUI is easy, fast and constitutes a very useful tool for individualized dosimetry. One of the future goals is to incorporate remote access to a PACS server.
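
    A minimal Python sketch of the batch metadata-extraction step, assuming the pydicom library rather than the study's Matlab GUI; the tag list and file names are illustrative.

```python
import pydicom

def extract_metadata(paths, tags=("PatientID", "Modality", "SliceThickness", "KVP")):
    """Collect selected DICOM header fields from every slice of a series.

    Reading stops before pixel data, since only the header metadata is needed.
    """
    rows = []
    for path in paths:
        ds = pydicom.dcmread(path, stop_before_pixels=True)
        rows.append({tag: getattr(ds, tag, None) for tag in tags})
    return rows

# Hypothetical usage with two slice files; rows could then be exported to a spreadsheet.
# for row in extract_metadata(["slice_001.dcm", "slice_002.dcm"]):
#     print(row)
```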

  16. Comprehensive automation of the solid phase extraction gas chromatographic mass spectrometric analysis (SPE-GC/MS) of opioids, cocaine, and metabolites from serum and other matrices.

    PubMed

    Lerch, Oliver; Temme, Oliver; Daldrup, Thomas

    2014-07-01

    The analysis of opioids, cocaine, and metabolites from blood serum is a routine task in forensic laboratories. Commonly, the employed methods include many manual or partly automated steps like protein precipitation, dilution, solid phase extraction, evaporation, and derivatization preceding a gas chromatography (GC)/mass spectrometry (MS) or liquid chromatography (LC)/MS analysis. In this study, a comprehensively automated method was developed from a validated, partly automated routine method. This was possible by replicating method parameters on the automated system. Only marginal optimization of parameters was necessary. The automation relying on an x-y-z robot after manual protein precipitation includes the solid phase extraction, evaporation of the eluate, derivatization (silylation with N-methyl-N-trimethylsilyltrifluoroacetamide, MSTFA), and injection into a GC/MS. A quantitative analysis of almost 170 authentic serum samples and more than 50 authentic samples of other matrices like urine, different tissues, and heart blood for cocaine, benzoylecgonine, methadone, morphine, codeine, 6-monoacetylmorphine, dihydrocodeine, and 7-aminoflunitrazepam was conducted with both methods, proving that the analytical results are equivalent even near the limits of quantification (low ng/ml range). To the best of our knowledge, this is the first application reported in the literature employing this sample preparation system.

  17. HPIminer: A text mining system for building and visualizing human protein interaction networks and pathways.

    PubMed

    Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar

    2015-04-01

    Knowledge of protein-protein interactions (PPI) and their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable to understand the mechanism of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Data mining for blood glucose prediction and knowledge discovery in diabetic patients: the METABO diabetes modeling and management system.

    PubMed

    Georga, Eleni; Protopappas, Vasilios; Guillen, Alejandra; Fico, Giuseppe; Ardigo, Diego; Arredondo, Maria Teresa; Exarchos, Themis P; Polyzos, Demosthenes; Fotiadis, Dimitrios I

    2009-01-01

    METABO is a diabetes monitoring and management system which aims at recording and interpreting the patient's context, as well as at providing decision support to both the patient and the doctor. The METABO system consists of (a) a Patient's Mobile Device (PMD), (b) different types of unobtrusive biosensors, (c) a Central Subsystem (CS) located remotely at the hospital and (d) the Control Panel (CP) from which physicians can follow up their patients and also gain access to the CS. METABO provides a multi-parametric monitoring system which facilitates the efficient and systematic recording of dietary, physical activity, medication and medical information (continuous and discontinuous glucose measurements). Based on all recorded contextual information, data mining schemes that run in the PMD are responsible for modeling patients' metabolism, predicting hypo/hyper-glycaemic events, and providing the patient with short- and long-term alerts. In addition, all past and recently-recorded data are analyzed to extract patterns of behavior, discover new knowledge and provide explanations to the physician through the CP. Advanced tools in the CP allow the physician to prescribe personalized treatment plans and frequently quantify the patient's adherence to treatment.

  19. Automated extraction of Biomarker information from pathology reports.

    PubMed

    Lee, Jeongeun; Song, Hyun-Je; Yoon, Eunsil; Park, Seong-Bae; Park, Sung-Hye; Seo, Jeong-Wook; Park, Peom; Choi, Jinwook

    2018-05-21

    Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports. We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a "slide paragraph" unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital. High F-scores were obtained for parsing biomarker name and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, which represents a richer result than biomarker profiles obtained based on the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms determined by expanded dictionary look-up and text similarity-based search. Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
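
    The "slide paragraph" parsing idea lends itself to a compact illustration. Below is a toy sketch (not the authors' context-free grammar or data model) that uses a regular expression to pull biomarker name/result pairs from a single immunohistochemistry line; the report format shown is hypothetical.

```python
# Toy sketch (not the authors' context-free grammar): pull biomarker
# name/result pairs from one immunohistochemistry "slide paragraph".
import re

# Hypothetical report line format, e.g. "Ki-67: positive (30%), EGFR: negative"
PAIR = re.compile(
    r"(?P<name>[A-Za-z0-9\-]+)\s*:\s*(?P<result>positive|negative|equivocal)",
    re.IGNORECASE)

def parse_slide_paragraph(text: str) -> list[tuple[str, str]]:
    return [(m.group("name"), m.group("result").lower()) for m in PAIR.finditer(text)]

print(parse_slide_paragraph("Ki-67: Positive (30%), EGFR: negative, p53: equivocal"))
# [('Ki-67', 'positive'), ('EGFR', 'negative'), ('p53', 'equivocal')]
```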

  20. Smart homes and ambient assisted living applications: from data to knowledge-empowering or overwhelming older adults? Contribution of the IMIA Smart Homes and Ambient Assisted Living Working Group.

    PubMed

    Demiris, G; Thompson, H

    2011-01-01

    As health care systems face limited resources and workforce shortages to address the complex needs of older adult populations, innovative approaches utilizing information technology can support aging. Smart Home and Ambient Assisted Living (SHAAL) systems utilize advanced and ubiquitous technologies, including sensors and other devices that are integrated into the residential infrastructure or are wearable, to capture data describing activities of daily living and health-related events. This paper highlights how data from SHAAL systems can lead to information and knowledge that ultimately improve clinical outcomes and quality of life for older adults as well as the quality of health care services. We conducted a review of personal health record applications specifically for older adults and approaches to using information to improve elder care. We present a framework that showcases how data captured from SHAAL systems can be processed to provide meaningful information that becomes part of a personal health record. Synthesis and visualization of information resulting from SHAAL systems can lead to knowledge and support education, delivery of tailored interventions and, if needed, transitions in care. Such actions can involve multiple stakeholders as part of shared decision making. SHAAL systems have the potential to support aging and improve quality of life and decision making for older adults and their families. The framework presented in this paper demonstrates how emphasis needs to be placed on extracting meaningful information from new innovative systems that will support decision making. The challenge for informatics designers and researchers is to facilitate an evolution of SHAAL systems expanding beyond demonstration projects to actual interventions that will improve health care for older adults.

  1. Managing knowledge business intelligence: A cognitive analytic approach

    NASA Astrophysics Data System (ADS)

    Surbakti, Herison; Ta'a, Azman

    2017-10-01

    The purpose of this paper is to identify and analyze the integration of Knowledge Management (KM) and Business Intelligence (BI) in order to achieve a competitive edge in the context of intellectual capital. The methodology includes a review of the literature and an analysis of interview data from managers in the corporate sector, together with models established by different authors. BI technologies have a strong association with the process of KM for attaining competitive advantage. KM is strongly influenced by human and social factors and turns them into the most valuable assets when an efficient system is run under BI tactics and technologies. However, predictive analytics is grounded in the field of BI. Extracting tacit knowledge, to be used as a new source for BI analysis, is a big challenge. Advanced analytic methods that address the diversity of the data corpus, structured and unstructured, require a cognitive approach to provide estimative results and to yield actionable descriptive, predictive and prescriptive results. This is a big challenge nowadays, and this paper aims to elaborate on it in this initial work.

  2. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.

  3. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.
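
    MYCIN-style evidence combination, mentioned in the two abstracts above, reduces to a small formula. The sketch below shows the standard certainty-factor combination rule as commonly defined in the expert-systems literature; it is an illustration, not code taken from the authors' CLIPS rule base.

```python
# Minimal sketch of MYCIN-style certainty-factor combination, as commonly
# defined in the literature (not taken from the authors' CLIPS rule base).
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two certainty factors, each in [-1, 1]."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)          # both pieces of evidence support
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)          # both pieces of evidence oppose
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # conflicting evidence

# Two independent pieces of evidence for the "news" class:
print(combine_cf(0.6, 0.5))   # ~0.8
# Conflicting evidence:
print(combine_cf(0.6, -0.4))  # ~0.33
```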

  4. Hyperspectral Feature Detection Onboard the Earth Observing One Spacecraft using Superpixel Segmentation and Endmember Extraction

    NASA Technical Reports Server (NTRS)

    Thompson, David R.; Bornstein, Benjamin; Bue, Brian D.; Tran, Daniel Q.; Chien, Steve A.; Castano, Rebecca

    2012-01-01

    We present a demonstration of onboard hyperspectral image processing with the potential to reduce mission downlink requirements. The system detects spectral endmembers and then uses them to map units of surface material. This summarizes the content of the scene, reveals spectral anomalies warranting fast response, and reduces data volume by two orders of magnitude. We have integrated this system into the Autonomous Sciencecraft Experiment for operational use onboard the Earth Observing One (EO-1) Spacecraft. The system does not require prior knowledge about spectra of interest. We report on a series of trial overflights in which identical spacecraft commands are effective for autonomous spectral discovery and mapping for varied target features, scenes and imaging conditions.

  5. Extraction and Classification of Human Gait Features

    NASA Astrophysics Data System (ADS)

    Ng, Hu; Tan, Wooi-Haw; Tong, Hau-Lee; Abdullah, Junaidi; Komiya, Ryoichi

    In this paper, a new approach is proposed for extracting human gait features from a walking human based on silhouette images. The approach consists of six stages: clearing the background noise of the image by morphological opening; measuring the width and height of the human silhouette; dividing the enhanced human silhouette into six body segments based on anatomical knowledge; applying a morphological skeleton operation to obtain the body skeleton; applying the Hough transform to obtain the joint angles from the body segment skeletons; and measuring the distance between the bottoms of the right and left legs from the body segment skeletons. The joint angles and step size, together with the height and width of the human silhouette, are collected and used for gait analysis. The experimental results demonstrate that the proposed system is feasible and achieves satisfactory results.
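
    Two of the stages listed above, noise removal by morphological opening and line extraction from the skeleton with the Hough transform, can be sketched directly with OpenCV. The snippet below is an illustration only; the kernel size, Hough parameters and the use of cv2.ximgproc.thinning (from opencv-contrib) are assumptions, not the authors' settings.

```python
# Illustrative sketch of two of the stages described above: morphological
# opening to remove background noise, then a probabilistic Hough transform
# on a skeletonized silhouette segment to recover limb line segments.
import cv2
import numpy as np

def clean_and_find_lines(silhouette: np.ndarray):
    """silhouette: binary uint8 image (0/255)."""
    kernel = np.ones((3, 3), np.uint8)
    cleaned = cv2.morphologyEx(silhouette, cv2.MORPH_OPEN, kernel)  # denoise
    skeleton = cv2.ximgproc.thinning(cleaned)   # body skeleton (opencv-contrib)
    lines = cv2.HoughLinesP(skeleton, rho=1, theta=np.pi / 180,
                            threshold=20, minLineLength=15, maxLineGap=5)
    return lines  # each line: (x1, y1, x2, y2); joint angles follow from atan2
```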

  6. Towards Robust Self-Calibration for Handheld 3d Line Laser Scanning

    NASA Astrophysics Data System (ADS)

    Bleier, M.; Nüchter, A.

    2017-11-01

    This paper studies self-calibration of a structured light system, which reconstructs 3D information using video from a static consumer camera and a handheld cross line laser projector. Intersections between the individual laser curves and geometric constraints on the relative position of the laser planes are exploited to achieve dense 3D reconstruction. This is possible without any prior knowledge of the movement of the projector. However, inaccurately extracted laser lines introduce noise in the detected intersection positions and therefore distort the reconstruction result. Furthermore, when scanning objects with specular reflections, such as glossy painted or metallic surfaces, the reflections are often extracted from the camera image as erroneous laser curves. In this paper we investigate how robust estimates of the parameters of the laser planes can be obtained despite noisy detections.
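
    One generic way to obtain robust plane parameters in the presence of outlier detections is RANSAC; the sketch below fits a plane to 3D laser points while ignoring erroneous samples. This is an illustration of the general robust-estimation idea, not the authors' specific calibration procedure, and the iteration count and tolerance are placeholder values.

```python
# Generic RANSAC plane fit (a sketch of one robust-estimation option, not the
# authors' calibration method) for a set of 3-D laser points.
import numpy as np

def ransac_plane(points: np.ndarray, iters: int = 200, tol: float = 0.01):
    """points: (N, 3) array. Returns ((normal, d), inlier_count)."""
    best_inliers, best_model = 0, None
    rng = np.random.default_rng(0)
    for _ in range(iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:
            continue                     # degenerate (collinear) sample
        n = n / norm
        d = -n @ sample[0]
        dist = np.abs(points @ n + d)    # point-to-plane distances
        inliers = int((dist < tol).sum())
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (n, d)
    return best_model, best_inliers
```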

  7. Experiences of healthcare professionals of having their significant other admitted to an acute care facility: a qualitative systematic review.

    PubMed

    Sabyani, Hussamaldeen; Wiechula, Richard; Magarey, Judy; Donnelly, Frank

    2017-05-01

    Most healthcare professionals at some time will experience having a significant other admitted to an acute care hospital. The knowledge and understanding that these individuals possess because of their professional practice can potentially alter this experience. Expectations of staff and other family members (FMs) can potentially increase the burden on these health professionals. All FMs of patients should have their needs and expectations considered; however, this review specifically addresses what may be unique for healthcare professionals. To synthesize the qualitative evidence on the experiences of healthcare professionals when their significant others are admitted to an acute care hospital. The current review considered studies reporting the experiences of healthcare professionals, specifically registered nurses (RNs) and physicians. The experiences of RNs and physicians when a significant other is admitted to an acute care facility. Qualitative studies that have examined the phenomenon of interest including, but not limited to, designs such as phenomenology and grounded theory. The search strategy aimed to find both published and unpublished studies with no date restrictions. Only studies published in English were considered for inclusion in this review. Qualitative papers selected for retrieval were assessed using the standardized critical appraisal instrument from the Joanna Briggs Institute Qualitative Assessment and Review Instrument (JBI-QARI). Data were extracted from the seven included papers using the standardized data extraction tool from JBI-QARI. The data were synthesized using the JBI approach to meta-synthesis by meta-aggregation using the JBI-QARI software and methods. Seven studies of moderate quality were included in the review. Forty findings were extracted and aggregated to create 10 categories, from which five synthesized findings were derived: CONCLUSION: In contrast to "lay" FMs, health professionals possess additional knowledge and understanding that alter their perceptions and expectations, and the expectations others have of them. This knowledge and understanding can be an advantage in navigating a complex health system but may also result in an additional burden such as role conflict.

  8. Comparative analysis of prodigiosin isolated from endophyte Serratia marcescens.

    PubMed

    Khanam, B; Chandra, R

    2018-03-01

    Extraction of pigments from endophytes is an uphill task. Until now, there have been no efficient methods available to extract the maximum amount of prodigiosin from Serratia marcescens. This is one of the important endophytes of Beta vulgaris L. The present work was carried out for the comparative study of six different extraction methods, namely homogenization, ultrasonication, freezing and thawing, heat treatment, organic solvents and inorganic acids, to evaluate the efficiency of prodigiosin yield. Our results demonstrated that the highest extraction was observed with ultrasonication (98·1 ± 1·7%) while the lowest was obtained with the freezing and thawing (31·8 ± 3·8%) method. However, thin layer chromatography, high-performance liquid chromatography and Fourier transform infrared data suggest that the bioactive pigment in the extract was prodigiosin. To the best of our knowledge, this is the first comprehensive study of extraction methods and identification and purification of prodigiosin from cell biomass of Ser. marcescens isolated from Beta vulgaris L. The prodigiosin family is a potent drug with anticancer, antimalarial, antibacterial, antifungal, antiproliferative and immunosuppressive activities. Moreover, it has immense potential in the pharmaceutical, food and textile industries. From the industrial perspective, it is essential to achieve purified, high-yield and cost-effective extraction of prodigiosin. To the best of our knowledge, this is the first comprehensive study on prodigiosin extraction and also the first report on the endophyte Serratia marcescens isolated from Beta vulgaris L. The significance of our results lies in extracting a high amount of good-quality prodigiosin for commercial application. © 2017 The Society for Applied Microbiology.

  9. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2.

    PubMed

    Chen, W; Kowatch, R; Lin, S; Splaingard, M; Huang, Y

    2015-01-01

    Nationwide Children's Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. 26,550 concepts were extracted, with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents, such as demographic information, sleep study lab results, medications, procedures, diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against annotations by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly from a few weeks to a few seconds. Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allow fast and effective interactive cohort identification through the i2b2 platform for research and clinical use.
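
    The regular expression parser for numeric concepts can be illustrated with a toy example. The concept names, patterns and report text below are hypothetical stand-ins, not the hospital's production code.

```python
# Toy regular-expression parser (not the hospital's production parser) for
# numeric concepts in semi-structured sleep study text.
import re

# Hypothetical concept patterns; a real system would need many more.
NUMERIC_CONCEPTS = {
    "AHI": re.compile(r"apnea[- ]hypopnea index[^0-9]*([\d.]+)", re.I),
    "SpO2_nadir": re.compile(r"oxygen saturation nadir[^0-9]*([\d.]+)\s*%", re.I),
}

def parse_numeric(text: str) -> dict:
    facts = {}
    for concept, pattern in NUMERIC_CONCEPTS.items():
        m = pattern.search(text)
        if m:
            facts[concept] = float(m.group(1))
    return facts

report = "Apnea-Hypopnea Index: 12.4 events/hr. Oxygen saturation nadir was 87 %."
print(parse_numeric(report))  # {'AHI': 12.4, 'SpO2_nadir': 87.0}
```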

  10. Interactive Cohort Identification of Sleep Disorder Patients Using Natural Language Processing and i2b2

    PubMed Central

    Chen, W.; Kowatch, R.; Lin, S.; Splaingard, M.

    2015-01-01

    Summary Nationwide Children’s Hospital established an i2b2 (Informatics for Integrating Biology & the Bedside) application for sleep disorder cohort identification. Discrete data were gleaned from semistructured sleep study reports. The system was shown to work more efficiently than the traditional manual chart review method, and it also enabled searching capabilities that were previously not possible. Objective We report on the development and implementation of the sleep disorder i2b2 cohort identification system using natural language processing of semi-structured documents. Methods We developed a natural language processing approach to automatically parse concepts and their values from semi-structured sleep study documents. Two parsers were developed: a regular expression parser for extracting numeric concepts and an NLP-based tree parser for extracting textual concepts. Concepts were further organized into i2b2 ontologies based on document structures and in-domain knowledge. Results 26,550 concepts were extracted, with 99% being textual concepts. 1.01 million facts were extracted from sleep study documents, such as demographic information, sleep study lab results, medications, procedures, diagnoses, among others. The average accuracy of terminology parsing was over 83% when compared against annotations by experts. The system is capable of capturing both standard and non-standard terminologies. The time for cohort identification has been reduced significantly from a few weeks to a few seconds. Conclusion Natural language processing was shown to be powerful for quickly converting large amounts of semi-structured or unstructured clinical data into discrete concepts, which, in combination with intuitive domain-specific ontologies, allow fast and effective interactive cohort identification through the i2b2 platform for research and clinical use. PMID:26171080

  11. Knowledge guided information fusion for segmentation of multiple sclerosis lesions in MRI images

    NASA Astrophysics Data System (ADS)

    Zhu, Chaozhe; Jiang, Tianzi

    2003-05-01

    In this work, T1-, T2- and PD-weighted MR images of multiple sclerosis (MS) patients, providing information on the properties of tissues from different aspects, are treated as three independent information sources for the detection and segmentation of MS lesions. Based on information fusion theory, a knowledge guided information fusion framework is proposed to accomplish 3-D segmentation of MS lesions. This framework consists of three parts: (1) information extraction, (2) information fusion, and (3) decision. Information provided by the different spectral images is extracted and modeled separately for each spectrum using fuzzy sets, aiming at managing the uncertainty and ambiguity in the images due to noise and the partial volume effect. In the second part, the possible fuzzy map of MS lesions in each spectral image is constructed from the extracted information under the guidance of experts' knowledge, and then the final fuzzy map of MS lesions is constructed through the fusion of the fuzzy maps obtained from the different spectra. Finally, 3-D segmentation of MS lesions is derived from the final fuzzy map. Experimental results show that this method is fast and accurate.
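
    The fusion step can be illustrated with a minimal sketch: per-spectrum fuzzy membership maps are combined with a fusion operator and the result is thresholded into a crisp lesion mask. The minimum operator and the threshold used below are placeholder choices; the paper's actual, knowledge-guided fusion operators may differ.

```python
# Illustrative sketch: fuse per-spectrum fuzzy lesion-membership maps with a
# conjunctive operator (minimum) and threshold the result. The paper's actual
# fusion operators are guided by expert knowledge and may differ.
import numpy as np

def fuse_and_segment(mu_t1, mu_t2, mu_pd, threshold=0.5):
    fused = np.minimum.reduce([mu_t1, mu_t2, mu_pd])   # conjunctive fusion
    return fused >= threshold                           # crisp lesion mask

# Toy 2x2 membership maps standing in for 3-D volumes.
m1 = np.array([[0.9, 0.2], [0.7, 0.1]])
m2 = np.array([[0.8, 0.3], [0.6, 0.4]])
m3 = np.array([[0.7, 0.1], [0.9, 0.2]])
print(fuse_and_segment(m1, m2, m3))
# [[ True False]
#  [ True False]]
```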

  12. Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

    PubMed Central

    Bell, Lindsey; Chowdhary, Rajesh; Liu, Jun S.; Niu, Xufeng; Zhang, Jinfeng

    2011-01-01

    A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses—the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs. PMID:21738677
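
    The hypothesis-generation idea of BFSP can be sketched compactly: walk the graph breadth-first, score each indirect path by the product of its edge probabilities, and prune branches below a threshold. The graph, entities and probabilities below are hypothetical, and the scoring is a simplified stand-in for the paper's algorithms.

```python
# Sketch of breadth-first hypothesis search over a probability-weighted
# bio-entity graph: indirect paths are scored by the product of their edge
# probabilities and pruned below a threshold. Entities are hypothetical.
from collections import deque

graph = {  # adjacency: node -> {neighbor: edge probability}
    "geneA": {"proteinB": 0.9, "pathwayX": 0.4},
    "proteinB": {"diseaseD": 0.8},
    "pathwayX": {"diseaseD": 0.7},
}

def hypotheses(start: str, min_prob: float = 0.3, max_len: int = 3):
    results = []
    queue = deque([(start, [start], 1.0)])
    while queue:
        node, path, prob = queue.popleft()
        for nbr, p in graph.get(node, {}).items():
            new_prob = prob * p
            if nbr in path or new_prob < min_prob or len(path) > max_len:
                continue                 # prune cycles and weak branches
            if len(path) >= 2:           # indirect relationship -> hypothesis
                results.append((start, nbr, new_prob, path + [nbr]))
            queue.append((nbr, path + [nbr], new_prob))
    return sorted(results, key=lambda r: -r[2])

print(hypotheses("geneA"))
```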

  13. Information, intelligence, and interface: the pillars of a successful medical information system.

    PubMed

    Hadzikadic, M; Harrington, A L; Bohren, B F

    1995-01-01

    This paper addresses three key issues facing developers of clinical and/or research medical information systems. 1. INFORMATION. The basic function of every database is to store information about the phenomenon under investigation. There are many ways to organize information in a computer; however, only a few will prove optimal for any real life situation. Computer Science theory has developed several approaches to database structure, with relational theory leading in popularity among end users [8]. Strict conformance to the rules of relational database design rewards the user with consistent data and flexible access to that data. A properly defined database structure minimizes redundancy, i.e., multiple storage of the same information. Redundancy introduces problems when updating a database, since the repeated value has to be updated in all locations--missing even a single value corrupts the whole database, and incorrect reports are produced [8]. To avoid such problems, relational theory offers a formal mechanism for determining the number and content of data files. These files not only preserve the conceptual schema of the application domain, but allow a virtually unlimited number of reports to be efficiently generated. 2. INTELLIGENCE. Flexible access enables the user to harvest additional value from collected data. This value is usually gained via reports defined at the time of database design. Although these reports are indispensable, with proper tools more information can be extracted from the database. For example, machine learning, a sub-discipline of artificial intelligence, has been successfully used to extract knowledge from databases of varying size by uncovering correlations among fields and records [1-6, 9]. This knowledge, represented in the form of decision trees, production rules, and probabilistic networks, clearly adds a flavor of intelligence to the data collection and manipulation system. 3. INTERFACE. Despite the obvious importance of collecting data and extracting knowledge, current systems often impede these processes. Problems stem from the lack of user friendliness and functionality. To overcome these problems, several features of a successful human-computer interface have been identified [7], including the following "golden" rules of dialog design [7]: consistency, use of shortcuts for frequent users, informative feedback, organized sequence of actions, simple error handling, easy reversal of actions, user-oriented focus of control, and reduced short-term memory load. To this list of rules, we added visual representation of both data and query results, since our experience has demonstrated that users react much more positively to visual rather than textual information. In our design of the Orthopaedic Trauma Registry--under development at the Carolinas Medical Center--we have made every effort to follow the above rules. The results were rewarding--the end users actually want not only to use the product, but also to participate in its development.

  14. A collaborative filtering-based approach to biomedical knowledge discovery.

    PubMed

    Lever, Jake; Gakkhar, Sitanshu; Gottlieb, Michael; Rashnavadi, Tahereh; Lin, Santina; Siu, Celia; Smith, Maia; Jones, Martin R; Krzywinski, Martin; Jones, Steven J M; Wren, Jonathan

    2018-02-15

    The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. sjones@bcgsc.ca. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
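
    The core of the SVD approach can be illustrated in a few lines: factor a concept co-occurrence matrix, keep a reduced rank, and score pairs that never co-occurred by their reconstructed value. The matrix below is a tiny hypothetical example, not data from the paper, and the full pipeline (thresholding, evaluation against future publications) is omitted.

```python
# Sketch of the general idea (not the authors' exact pipeline): factor a
# concept co-occurrence matrix with a truncated SVD and score unseen pairs
# by their reconstructed value.
import numpy as np

# Tiny hypothetical co-occurrence counts between four biomedical concepts.
C = np.array([[0, 5, 3, 0],
              [5, 0, 4, 1],
              [3, 4, 0, 0],
              [0, 1, 0, 0]], dtype=float)

U, s, Vt = np.linalg.svd(C)
k = 2                                              # reduced rank
C_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # low-rank reconstruction

# Concepts 0 and 3 never co-occur in the "literature"; their reconstructed
# score is the discovery prediction.
print(round(C_hat[0, 3], 3))
```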

  15. Extracting ballistic forensic intelligence: microstamped firearms deliver data for illegal firearm traffic mapping: technology, implementation, and applications

    NASA Astrophysics Data System (ADS)

    Ohar, Orest P.; Lizotte, Todd E.

    2009-08-01

    Over the years law enforcement has become increasingly complex, driving a need for a better level of organization of knowledge within policing. The use of COMPSTAT or other Geospatial Information Systems (GIS) for crime mapping and analysis has provided opportunities for careful analysis of crime trends. By identifying hotspots within communities, data collected and entered into these systems can be analyzed to determine how, when and where law enforcement assets can be deployed efficiently. This paper will introduce, in detail, a powerful new law enforcement and forensic investigative technology called Intentional Firearm Microstamping (IFM). Once embedded and deployed into firearms, IFM will provide data for identifying and tracking the sources of illegally trafficked firearms within the borders of the United States and across the border with Mexico. Intentional Firearm Microstamping is a micro-code technology that leverages a laser-based micromachining process to form optimally located, microscopic "intentional structures and marks" on components within a firearm. Thus, when the firearm is fired, these IFM structures transfer an identifying tracking code onto the expended cartridge that is ejected from the firearm. Intentional Firearm Microstamped structures are laser-micromachined alphanumeric and encoded geometric tracking numbers, linked to the serial number of the firearm. IFM codes can be extracted quickly and used without the need to recover the firearm. Furthermore, through the process of extraction, IFM codes can be quantitatively verified to a higher level of certainty as compared to traditional forensic matching techniques. IFM provides critical intelligence capable of identifying straw purchasers, trafficking routes and networks across state borders and can be used on firearms illegally exported across international borders. This paper will outline IFM applications for supporting intelligence-led policing initiatives and IFM implementation strategies, describe how IFM overcomes the firearm's stochastic properties, explain the code extraction technologies that can be used by forensic investigators, and discuss the applications where the extracted data will benefit geospatial information systems for forensic intelligence.

  16. Normalization and standardization of electronic health records for high-throughput phenotyping: the SHARPn consortium

    PubMed Central

    Pathak, Jyotishman; Bailey, Kent R; Beebe, Calvin E; Bethard, Steven; Carrell, David S; Chen, Pei J; Dligach, Dmitriy; Endle, Cory M; Hart, Lacey A; Haug, Peter J; Huff, Stanley M; Kaggal, Vinod C; Li, Dingcheng; Liu, Hongfang; Marchant, Kyle; Masanz, James; Miller, Timothy; Oniki, Thomas A; Palmer, Martha; Peterson, Kevin J; Rea, Susan; Savova, Guergana K; Stancl, Craig R; Sohn, Sunghwan; Solbrig, Harold R; Suesse, Dale B; Tao, Cui; Taylor, David P; Westberg, Les; Wu, Stephen; Zhuo, Ning; Chute, Christopher G

    2013-01-01

    Research objective To develop scalable informatics infrastructure for normalization of both structured and unstructured electronic health record (EHR) data into a unified, concept-based model for high-throughput phenotype extraction. Materials and methods Software tools and applications were developed to extract information from EHRs. Representative and convenience samples of both structured and unstructured data from two EHR systems—Mayo Clinic and Intermountain Healthcare—were used for development and validation. Extracted information was standardized and normalized to meaningful use (MU) conformant terminology and value set standards using Clinical Element Models (CEMs). These resources were used to demonstrate semi-automatic execution of MU clinical-quality measures modeled using the Quality Data Model (QDM) and an open-source rules engine. Results Using CEMs and open-source natural language processing and terminology services engines—namely, Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) and Common Terminology Services (CTS2)—we developed a data-normalization platform that ensures data security, end-to-end connectivity, and reliable data flow within and across institutions. We demonstrated the applicability of this platform by executing a QDM-based MU quality measure that determines the percentage of patients between 18 and 75 years with diabetes whose most recent low-density lipoprotein cholesterol test result during the measurement year was <100 mg/dL on a randomly selected cohort of 273 Mayo Clinic patients. The platform identified 21 and 18 patients for the denominator and numerator of the quality measure, respectively. Validation results indicate that all identified patients meet the QDM-based criteria. Conclusions End-to-end automated systems for extracting clinical information from diverse EHR systems require extensive use of standardized vocabularies and terminologies, as well as robust information models for storing, discovering, and processing that information. This study demonstrates the application of modular and open-source resources for enabling secondary use of EHR data through normalization into standards-based, comparable, and consistent format for high-throughput phenotyping to identify patient cohorts. PMID:24190931

  17. Morphometric information to reduce the semantic gap in the characterization of microscopic images of thyroid nodules.

    PubMed

    Macedo, Alessandra A; Pessotti, Hugo C; Almansa, Luciana F; Felipe, Joaquim C; Kimura, Edna T

    2016-07-01

    The analyses of several systems for medical-image processing typically support the extraction of image attributes, but do not include some information that characterizes images. For example, morphometry can be applied to find new information about the visual content of an image. The extension of information may result in knowledge. Subsequently, the results of such mappings can be applied to recognize exam patterns, thus improving the accuracy of image retrieval and allowing a better interpretation of exam results. Although successfully applied in breast lesion images, the morphometric approach is still poorly explored in thyroid lesions due to the high subjectivity of thyroid examinations. This paper presents a theoretical-practical study, considering Computer Aided Diagnosis (CAD) and Morphometry, to reduce the semantic discontinuity between medical image features and human interpretation of image content. The proposed method aggregates the content of microscopic images characterized by morphometric information and other image attributes extracted by traditional object extraction algorithms. This method carries out segmentation, feature extraction, image labeling and classification. Morphometric analysis was included as an object extraction method in order to verify the improvement of its accuracy for automatic classification of microscopic images. To validate this proposal and verify the utility of morphometric information to characterize thyroid images, a CAD system was created to classify real thyroid image-exams into Papillary Cancer, Goiter and Non-Cancer. Results showed that morphometric information can improve the accuracy and precision of image retrieval and the interpretation of results in computer-aided diagnosis. For example, in the scenario where all the extractors are combined with the morphometric information, the CAD system had its best performance (70% precision in Papillary cases). Results indicate a positive use of morphometric information from images to reduce semantic discontinuity between human interpretation and image characterization. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  18. An in vitro study on the risk of non-allergic type-I like hypersensitivity to Momordica charantia.

    PubMed

    Sagkan, Rahsan Ilikci

    2013-10-26

    Momordica charantia (MC) is a tropical plant that is extensively used in folk medicine. However, knowledge about the side effects of this plant is relatively limited compared with knowledge about its therapeutic effects. The aim of this study is to reveal the effects of non-allergic type-I like hypersensitivity to MC through an experiment designed in vitro. In the present study, the expression of CD63 and CD203c on peripheral blood basophils against different dilutions of MC extracts was measured using flow cytometry and the results were compared with one another. In addition, intra-assay CVs of the tested extracts were calculated to assess the precision and reproducibility of the test results. It was observed that the fruit extract of MC at 1/100 and 1/1000 dilutions significantly increased active basophils compared to the same extract at a 1/10000 dilution. In conclusion, Momordica charantia may elicit a non-allergic type-I like hypersensitivity reaction in especially susceptible individuals.

  19. About Distributed Simulation-based Optimization of Forming Processes using a Grid Architecture

    NASA Astrophysics Data System (ADS)

    Grauer, Manfred; Barth, Thomas

    2004-06-01

    The permanently increasing complexity of products and their manufacturing processes, combined with a shorter "time-to-market", leads to more and more use of simulation and optimization software systems for product design. Finding a "good" design of a product implies the solution of computationally expensive optimization problems based on the results of simulation. Due to the computational load caused by the solution of these problems, the requirements on the Information & Telecommunication (IT) infrastructure of an enterprise or research facility are shifting from stand-alone resources towards the integration of software and hardware resources in a distributed environment for high-performance computing. Resources can either comprise software systems, hardware systems, or communication networks. An appropriate IT infrastructure must provide the means to integrate all these resources and enable their use even across a network to cope with requirements from geographically distributed scenarios, e.g. in computational engineering and/or collaborative engineering. Integrating experts' knowledge into the optimization process is inevitable in order to reduce the complexity caused by the number of design variables and the high dimensionality of the design space. Hence, utilization of knowledge-based systems must be supported by providing data management facilities as a basis for knowledge extraction from product data. In this paper, the focus is put on a distributed problem solving environment (PSE) capable of providing access to a variety of necessary resources and services. A distributed approach integrating simulation and optimization on a network of workstations and cluster systems is presented. For geometry generation, the CAD system CATIA is used, which is coupled with the FEM simulation system INDEED for simulation of sheet-metal forming processes and the problem solving environment OpTiX for distributed optimization.

  20. Ontology-guided data preparation for discovering genotype-phenotype relationships.

    PubMed

    Coulet, Adrien; Smaïl-Tabbone, Malika; Benlian, Pascale; Napoli, Amedeo; Devignes, Marie-Dominique

    2008-04-25

    Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning. This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results. The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.

  1. Proceedings of the Lunar Materials Technology Symposium

    NASA Technical Reports Server (NTRS)

    1992-01-01

    The meeting was organized around a possible lunar outpost scenario, featuring industrial technologies, systems, and components applicable to the extraction, processing, and fabrication of local materials. Acknowledged space resources experts as well as investigators from outside the field whose knowledge could be applied to space development activities were brought together. Presentations came from a variety of specialists in fields such as minerals processing, environmental control, and communications. The sessions of the symposium were divided into the following areas: resource characterization, energy management, materials processing, environment control, and automation and communications.

  2. FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.

    PubMed

    Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong

    2015-11-01

    Recently, the use of social media for health information exchange has been expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers have paid much attention to social media in medical education, patient-pharmacist communication, adverse drug reaction detection, the impact of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to extract useful knowledge effectively from the huge amount of textual comments in social media. Therefore, this study aims to propose a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme by combining a Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rule (AR) discovery to extract knowledge from social media. In our FIR scheme, the Fuzzy ART network is first employed to segment comments. Next, for each customer segment, we use the LSI technique to retrieve important keywords. Then, in order to make the extracted keywords understandable, association rule mining is applied to organize these extracted keywords into metadata. These extracted useful voices of customers are then transformed into design needs by using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques, which acquire too many keywords to get key points, our FIR scheme can extract understandable metadata from social media.
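
    The LSI step of the scheme (TF-IDF weighting followed by a truncated SVD) can be sketched with scikit-learn. The comments below are toy examples, and the snippet covers only the keyword-retrieval stage, not the Fuzzy ART segmentation or the association-rule step.

```python
# Sketch of the LSI step only (scikit-learn assumed), applied to one customer
# segment's comments to surface salient terms; this is not the full FIR scheme.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

comments = ["battery life is too short", "love the camera, battery drains fast",
            "camera quality is great", "short battery life ruins the phone"]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(comments)                    # TF-IDF term-document matrix
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)                                         # latent semantic dimensions

terms = np.array(vec.get_feature_names_out())
top = np.argsort(-np.abs(svd.components_[0]))[:5]  # top terms on 1st latent topic
print(terms[top])
```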

  3. Data Mining.

    ERIC Educational Resources Information Center

    Benoit, Gerald

    2002-01-01

    Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…

  4. Automatic 3D power line reconstruction of multi-angular imaging power line inspection system

    NASA Astrophysics Data System (ADS)

    Zhang, Wuming; Yan, Guangjian; Wang, Ning; Li, Qiaozhi; Zhao, Wei

    2007-06-01

    We develop a multi-angular imaging power line inspection system. Its main objective is to monitor the relative distance between the high-voltage power line and surrounding objects, and to raise an alert if the warning threshold is exceeded. Our multi-angular imaging power line inspection system generates a DSM of the power line passage, which comprises the ground surface and ground objects, for example trees and houses. For the purpose of revealing the dangerous regions, where ground objects are too close to the power line, 3D power line information should be extracted at the same time. In order to improve the automation level of extraction and reduce labour costs and human errors, an automatic 3D power line reconstruction method is proposed and implemented. This is achieved by using the epipolar constraint and prior knowledge of the pole tower's height. After that, the proper 3D power line information can be obtained by space intersection using the found homologous projections. The flight experiment result shows that the proposed method can successfully reconstruct the 3D power line, and the measurement accuracy of the relative distance satisfies the user requirement of 0.5 m.
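
    The space-intersection step, triangulating a 3D point from homologous projections in two views, can be sketched with the standard linear (DLT) method. The projection matrices and image points below are placeholder values; the real system would use calibrated cameras and many points along each power line.

```python
# Sketch of the space-intersection step: linear (DLT) triangulation of a 3D
# point from homologous image points in two views. Projection matrices and
# image coordinates here are hypothetical placeholders.
import numpy as np

def triangulate(P1: np.ndarray, P2: np.ndarray, x1, x2) -> np.ndarray:
    """P1, P2: 3x4 projection matrices; x1, x2: (u, v) image points."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                   # null-space vector = homogeneous 3D point
    return X[:3] / X[3]

P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # camera at origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # shifted 1 unit in X
print(triangulate(P1, P2, (0.0, 0.0), (-0.5, 0.0)))          # approximately [0. 0. 2.]
```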

  5. Biotechnology for Solar System Exploration

    NASA Astrophysics Data System (ADS)

    Steele, A.; Maule, J.; Toporski, J.; Parro-Garcia, V.; Briones, C.; Schweitzer, M.; McKay, D.

    With the advent of a new era of astrobiology missions in the exploration of the solar system and the search for evidence of life elsewhere, we present a new approach to this goal: the integration of biotechnology. We have reviewed the current list of biotechnology techniques which are applicable to miniaturization, automation and integration into a combined flight platform. Amongst the techniques reviewed are: the use of antibodies; fluorescent detection strategies; protein and DNA chip technology; surface plasmon resonance and its relation to other techniques; micro-electronic machining (MEMS, where applicable to biological systems); nanotechnology (e.g. molecular motors); lab-on-a-chip technology (including PCR); mass spectrometry (i.e. MALDI-TOF); fluid handling and extraction technologies; Chemical Force Microscopy (CFM); and Raman spectroscopy. We have begun to integrate this knowledge into a single flight instrument approach for the sole purpose of combining several mutually confirming tests for life, organic and/or microbial contamination, as well as prebiotic and abiotic organic chemicals. We will present several innovative designs for new instrumentation, including pre-engineering design drawings of a protein chip reader for space flight and fluid handling strategies. We will also review the use of suitable extraction methodologies for use on different solar system bodies.

  6. Information Extraction Using Controlled English to Support Knowledge-Sharing and Decision-Making

    DTIC Science & Technology

    2012-06-01

    ... terminology or language variants. CE-based information extraction will greatly facilitate the processes in the cognitive and social domains that enable forces ... a processor is run to turn the atomic CE into a more "stylistically felicitous" CE, using techniques such as aggregating all information about an entity.

  7. An Interval Type-2 Neural Fuzzy System for Online System Identification and Feature Elimination.

    PubMed

    Lin, Chin-Teng; Pal, Nikhil R; Wu, Shang-Lin; Liu, Yu-Ting; Lin, Yang-Yin

    2015-07-01

    We propose an integrated mechanism for discarding derogatory features and extraction of fuzzy rules based on an interval type-2 neural fuzzy system (NFS); in fact, it is a more general scheme that can discard bad features, irrelevant antecedent clauses, and even irrelevant rules. High-dimensional input variables and a large number of rules not only increase the computational complexity of NFSs but also reduce their interpretability. Therefore, a mechanism for simultaneous extraction of fuzzy rules and reducing the impact of (or eliminating) the inferior features is necessary. The proposed approach, namely an interval type-2 Neural Fuzzy System for online System Identification and Feature Elimination (IT2NFS-SIFE), uses type-2 fuzzy sets to model uncertainties associated with information and data in designing the knowledge base. The consequent part of the IT2NFS-SIFE is of Takagi-Sugeno-Kang type with interval weights. The IT2NFS-SIFE possesses a self-evolving property that can automatically generate fuzzy rules. The poor features can be discarded through the concept of a membership modulator. The antecedent and modulator weights are learned using a gradient descent algorithm. The consequent part weights are tuned via the rule-ordered Kalman filter algorithm to enhance learning effectiveness. Simulation results show that IT2NFS-SIFE not only simplifies the system architecture by eliminating derogatory/irrelevant antecedent clauses, rules, and features but also maintains excellent performance.

  8. Opinion mining feature-level using Naive Bayes and feature extraction based analysis dependencies

    NASA Astrophysics Data System (ADS)

    Sanda, Regi; Baizal, Z. K. Abdurahman; Nhita, Fhira

    2015-12-01

    The development of the internet and technology has had a major impact, giving rise to a new kind of business called e-commerce. Many e-commerce sites provide convenience in transactions, and consumers can also provide reviews or opinions on the products they purchased. These opinions can be used by both consumers and producers. Consumers can learn the advantages and disadvantages of particular features of a product. Producers can analyse their own strengths and weaknesses as well as those of competitors' products. The large number of opinions requires a method that lets the reader grasp the point of the opinions as a whole. The idea emerged from review summarization, which summarizes the overall opinion based on the sentiment and features it contains. In this study, the domain that is the main focus is digital cameras. This research consisted of four steps: 1) giving the system the knowledge to recognize the semantic orientation of an opinion; 2) identifying the features of the product; 3) identifying whether an opinion is positive or negative; 4) summarizing the results. This research discusses methods such as Naïve Bayes for sentiment classification, a feature extraction algorithm based on dependency analysis, which is one of the tools in Natural Language Processing (NLP), and a knowledge-based dictionary which is useful for handling implicit features. The end result of the research is a summary of consumer reviews organized by feature and sentiment. With the proposed method, the accuracy of sentiment classification is 81.2% for positive test data and 80.2% for negative test data, and the accuracy of feature extraction reaches 90.3%.
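
    The sentiment-classification step can be sketched with a standard bag-of-words Naive Bayes classifier. The snippet below (scikit-learn assumed) uses toy English review sentences; it does not reproduce the paper's dependency-based feature extraction or its dictionary for implicit features.

```python
# Sketch of the sentiment-classification step only (scikit-learn assumed),
# on toy English review sentences; the paper's dependency-based feature
# extraction and dictionary handling are not reproduced here.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_texts = ["the lens is sharp and bright", "battery lasts all day",
               "autofocus is slow and noisy", "screen is dim and hard to read"]
train_labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words counts feed a multinomial Naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

print(model.predict(["the screen is bright", "battery is slow to charge"]))
```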

  9. Application of the EVEX resource to event extraction and network construction: Shared Task entry and result analysis

    PubMed Central

    2015-01-01

    Background Modern methods for mining biomolecular interactions from literature typically make predictions based solely on the immediate textual context, in effect a single sentence. No prior work has been published on extending this context to the information automatically gathered from the whole biomedical literature. Thus, our motivation for this study is to explore whether mutually supporting evidence, aggregated across several documents can be utilized to improve the performance of the state-of-the-art event extraction systems. In this paper, we describe our participation in the latest BioNLP Shared Task using the large-scale text mining resource EVEX. We participated in the Genia Event Extraction (GE) and Gene Regulation Network (GRN) tasks with two separate systems. In the GE task, we implemented a re-ranking approach to improve the precision of an existing event extraction system, incorporating features from the EVEX resource. In the GRN task, our system relied solely on the EVEX resource and utilized a rule-based conversion algorithm between the EVEX and GRN formats. Results In the GE task, our re-ranking approach led to a modest performance increase and resulted in the first rank of the official Shared Task results with 50.97% F-score. Additionally, in this paper we explore and evaluate the usage of distributed vector representations for this challenge. In the GRN task, we ranked fifth in the official results with a strict/relaxed SER score of 0.92/0.81 respectively. To try and improve upon these results, we have implemented a novel machine learning based conversion system and benchmarked its performance against the original rule-based system. Conclusions For the GRN task, we were able to produce a gene regulatory network from the EVEX data, warranting the use of such generic large-scale text mining data in network biology settings. A detailed performance and error analysis provides more insight into the relatively low recall rates. In the GE task we demonstrate that both the re-ranking approach and the word vectors can provide slight performance improvement. A manual evaluation of the re-ranking results pinpoints some of the challenges faced in applying large-scale text mining knowledge to event extraction. PMID:26551766

  10. Towards a decision support system for hand dermatology.

    PubMed

    Mazzola, Luca; Cavazzina, Alice; Pinciroli, Francesco; Bonacina, Stefano; Pigatto, Paolo; Ayala, Fabio; De Pità, Ornella; Marceglia, Sara

    2014-01-01

    The complexity of medical diagnosis is faced by practitioners relying mainly on their experience, which is acquired during daily practice and on-the-job training. Given the complexity and extensiveness of the subject, supporting tools that include knowledge extracted by highly specialized practitioners can be valuable. In the present work, a Decision Support System (DSS) for hand dermatology was developed based on data coming from a Visit Report Form (VRF). Using a Bayesian approach and the significance of factor differences from the population average for each case, we demonstrated the potential of creating an enhanced VRF that includes a diagnosis probability distribution based on the DSS rules applied to the specific patient situation.
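
    The Bayesian computation is not specified in the abstract; the sketch below shows one way a diagnosis probability distribution could be derived from observed findings under a naive-Bayes assumption. Every prior, likelihood, finding, and diagnosis name here is invented for illustration and is not taken from the system.

      # Hypothetical naive-Bayes style posterior over diagnoses given findings;
      # priors, likelihoods, and diagnosis names are illustrative only.
      priors = {"contact dermatitis": 0.5, "psoriasis": 0.3, "eczema": 0.2}
      likelihood = {  # P(finding present | diagnosis), invented numbers
          "contact dermatitis": {"itching": 0.9, "scaling": 0.4},
          "psoriasis":          {"itching": 0.5, "scaling": 0.9},
          "eczema":             {"itching": 0.8, "scaling": 0.6},
      }
      findings = ["itching", "scaling"]

      # Posterior proportional to prior times the product of finding likelihoods.
      scores = {}
      for dx, prior in priors.items():
          p = prior
          for f in findings:
              p *= likelihood[dx][f]
          scores[dx] = p
      total = sum(scores.values())
      posterior = {dx: p / total for dx, p in scores.items()}
      print(posterior)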

  11. Neutron and positron techniques for fluid transfer system analysis and remote temperature and stress measurement

    NASA Astrophysics Data System (ADS)

    Stewart, P. A. E.

    1987-05-01

    Present and projected applications of penetrating radiation techniques to gas turbine research and development are considered. Approaches discussed include the visualization and measurement of metal component movement using high energy X-rays, the measurement of metal temperatures using epithermal neutrons, the measurement of metal stresses using thermal neutron diffraction, and the visualization and measurement of oil and fuel systems using either cold neutron radiography or emitting isotope tomography. By selecting the radiation appropriate to the problem, the desired data can be probed for and obtained through imaging or signal acquisition, and the necessary information can then be extracted with digital image processing or knowledge based image manipulation and pattern recognition.

  12. A novel hand-type detection technique with fingerprint sensor

    NASA Astrophysics Data System (ADS)

    Abe, Narishige; Shinzaki, Takashi

    2013-05-01

    In large-scale biometric authentication systems such as the US-Visit (USA), a 10-fingerprints scanner which simultaneously captures four fingerprints is used. In traditional systems, specific hand-types (left or right) are indicated, but it is difficult to detect hand-type due to the hand rotation and the opening and closing of fingers. In this paper, we evaluated features that were extracted from hand images (which were captured by a general optical scanner) that are considered to be effective for detecting hand-type. Furthermore, we extended the knowledge to real fingerprint images, and evaluated the accuracy with which it detects hand-type. We obtained an accuracy of about 80% with only three fingers (index, middle, ring finger).

  13. Hierarchical representation and machine learning from faulty jet engine behavioral examples to detect real time abnormal conditions

    NASA Technical Reports Server (NTRS)

    Gupta, U. K.; Ali, M.

    1988-01-01

    The theoretical basis and operation of LEBEX, a machine-learning system for jet-engine performance monitoring, are described. The behavior of the engine is modeled in terms of four parameters (the rotational speeds of the high- and low-speed sections and the exhaust and combustion temperatures), and parameter variations indicating malfunction are transformed into structural representations involving instances and events. LEBEX extracts descriptors from a set of training data on normal and faulty engines, represents them hierarchically in a knowledge base, and uses them to diagnose and predict faults on a real-time basis. Diagrams of the system architecture and printouts of typical results are shown.

  14. An Optimized DNA Analysis Workflow for the Sampling, Extraction, and Concentration of DNA obtained from Archived Latent Fingerprints.

    PubMed

    Solomon, April D; Hytinen, Madison E; McClain, Aryn M; Miller, Marilyn T; Dawson Cruz, Tracey

    2018-01-01

    DNA profiles have been obtained from fingerprints, but there is limited knowledge regarding DNA analysis from archived latent fingerprints-touch DNA "sandwiched" between adhesive and paper. Thus, this study sought to comparatively analyze a variety of collection and analytical methods in an effort to seek an optimized workflow for this specific sample type. Untreated and treated archived latent fingerprints were utilized to compare different biological sampling techniques, swab diluents, DNA extraction systems, DNA concentration practices, and post-amplification purification methods. Archived latent fingerprints disassembled and sampled via direct cutting, followed by DNA extracted using the QIAamp® DNA Investigator Kit, and concentration with Centri-Sep™ columns increased the odds of obtaining an STR profile. Using the recommended DNA workflow, 9 of the 10 samples provided STR profiles, which included 7-100% of the expected STR alleles and two full profiles. Thus, with carefully selected procedures, archived latent fingerprints can be a viable DNA source for criminal investigations including cold/postconviction cases. © 2017 American Academy of Forensic Sciences.

  15. Multi-Excitonic Quantum Dot Molecules

    NASA Astrophysics Data System (ADS)

    Scheibner, M.; Stinaff, E. A.; Doty, M. F.; Ware, M. E.; Bracker, A. S.; Gammon, D.; Ponomarev, I. V.; Reinecke, T. L.; Korenev, V. L.

    2006-03-01

    With the ability to create coupled pairs of quantum dots, the next step towards the realization of semiconductor based quantum information processing devices can be taken. However, so far little knowledge has been gained on these artificial molecules. Our photoluminescence experiments on single InAs/GaAs quantum dot molecules provide the systematics of coupled quantum dots by delineating the spectroscopic features of several key charge configurations in such quantum systems, including X, X^+,X^2+, XX, XX^+ (with X being the neutral exciton). We extract general rules which determine the formation of molecular states of coupled quantum dots. These include the fact that quantum dot molecules provide the possibility to realize various spin configurations and to switch the electron hole exchange interaction on and off by shifting charges inside the molecule. This knowledge will be valuable in developing implementations for quantum information processing.

  16. Computer-based extraction of phenotypic features of human congenital anomalies from the digital literature with natural language processing techniques.

    PubMed

    Karakülah, Gökhan; Dicle, Oğuz; Koşaner, Ozgün; Suner, Aslı; Birant, Çağdaş Can; Berber, Tolga; Canbek, Sezin

    2014-01-01

    The lack of laboratory tests for the diagnosis of most congenital anomalies makes the physical examination of the case crucial for diagnosing the anomaly, and cases in the diagnostic phase are mostly evaluated in the light of the literature knowledge. In this respect, for accurate diagnosis, it is of great importance to provide the decision maker with decision support by presenting the literature knowledge about a particular case. Here, we demonstrated a methodology for automatically scanning and determining phenotypic features from case reports related to congenital anomalies in the literature with text and natural language processing methods, and we created a framework for an information source for a potential diagnostic decision support system for congenital anomalies.

  17. Enhancing clinical concept extraction with distributional semantics

    PubMed Central

    Cohen, Trevor; Wu, Stephen; Gonzalez, Graciela

    2011-01-01

    Extracting concepts (such as drugs, symptoms, and diagnoses) from clinical narratives constitutes a basic enabling technology to unlock the knowledge within and support more advanced reasoning applications such as diagnosis explanation, disease progression modeling, and intelligent analysis of the effectiveness of treatment. The recent release of annotated training sets of de-identified clinical narratives has contributed to the development and refinement of concept extraction methods. However, as the annotation process is labor-intensive, training data are necessarily limited in the concepts and concept patterns covered, which impacts the performance of supervised machine learning applications trained with these data. This paper proposes an approach to minimize this limitation by combining supervised machine learning with empirical learning of semantic relatedness from the distribution of the relevant words in additional unannotated text. The approach uses a sequential discriminative classifier (Conditional Random Fields) to extract the mentions of medical problems, treatments and tests from clinical narratives. It takes advantage of all Medline abstracts indexed as being of the publication type “clinical trials” to estimate the relatedness between words in the i2b2/VA training and testing corpora. In addition to the traditional features such as dictionary matching, pattern matching and part-of-speech tags, we also used as a feature words that appear in similar contexts to the word in question (that is, words that have a similar vector representation measured with the commonly used cosine metric, where vector representations are derived using methods of distributional semantics). To the best of our knowledge, this is the first effort exploring the use of distributional semantics, the semantics derived empirically from unannotated text often using vector space models, for a sequence classification task such as concept extraction. Therefore, we first experimented with different sliding window models and found the model with parameters that led to best performance in a preliminary sequence labeling task. The evaluation of this approach, performed against the i2b2/VA concept extraction corpus, showed that incorporating features based on the distribution of words across a large unannotated corpus significantly aids concept extraction. Compared to a supervised-only approach as a baseline, the micro-averaged f-measure for exact match increased from 80.3% to 82.3% and the micro-averaged f-measure based on inexact match increased from 89.7% to 91.3%. These improvements are highly significant according to the bootstrap resampling method and also considering the performance of other systems. Thus, distributional semantic features significantly improve the performance of concept extraction from clinical narratives by taking advantage of word distribution information obtained from unannotated data. PMID:22085698
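
    The distributional-semantics feature described above rests on cosine similarity between word vectors derived from a large unannotated corpus. A minimal sketch of that similarity computation follows; the toy co-occurrence vectors are illustrative and are not the Medline-derived vectors used in the study.

      import numpy as np

      # Toy co-occurrence vectors for three words (illustrative only; the study
      # derived its vectors from Medline "clinical trials" abstracts).
      vectors = {
          "fever":   np.array([3.0, 0.0, 5.0, 1.0]),
          "pyrexia": np.array([2.0, 0.0, 4.0, 1.0]),
          "aspirin": np.array([0.0, 6.0, 1.0, 0.0]),
      }

      def cosine(a, b):
          # Cosine similarity, the metric commonly used in distributional semantics.
          return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

      # Words whose vectors lie close to a target word can then be added as
      # features for the sequence classifier alongside dictionary and POS features.
      target = "fever"
      print(sorted(((cosine(vectors[target], v), w)
                    for w, v in vectors.items() if w != target), reverse=True))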

  18. Effect of acoustic frequency and power density on the aqueous ultrasonic-assisted extraction of grape pomace (Vitis vinifera L.) - a response surface approach.

    PubMed

    González-Centeno, María Reyes; Knoerzer, Kai; Sabarez, Henry; Simal, Susana; Rosselló, Carmen; Femenia, Antoni

    2014-11-01

    Aqueous ultrasound-assisted extraction (UAE) of grape pomace was investigated by Response Surface Methodology (RSM) to evaluate the effect of acoustic frequency (40, 80, 120kHz), ultrasonic power density (50, 100, 150W/L) and extraction time (5, 15, 25min) on total phenolics, total flavonols and antioxidant capacity. All the process variables showed a significant effect on the aqueous UAE of grape pomace (p<0.05). The Box-Behnken Design (BBD) generated satisfactory mathematical models which accurately explain the behavior of the system; allowing to predict both the extraction yield of phenolic and flavonol compounds, and also the antioxidant capacity of the grape pomace extracts. The optimal UAE conditions for all response factors were a frequency of 40kHz, a power density of 150W/L and 25min of extraction time. Under these conditions, the aqueous UAE would achieve a maximum of 32.31mg GA/100g fw for total phenolics and 2.04mg quercetin/100g fw for total flavonols. Regarding the antioxidant capacity, the maximum predicted values were 53.47 and 43.66mg Trolox/100g fw for CUPRAC and FRAP assays, respectively. When comparing with organic UAE, in the present research, from 12% to 38% of total phenolic bibliographic values were obtained, but using only water as the extraction solvent, and applying lower temperatures and shorter extraction times. To the best of the authors' knowledge, no studies specifically addressing the optimization of both acoustic frequency and power density during aqueous-UAE of plant materials have been previously published. Copyright © 2014 Elsevier B.V. All rights reserved.
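
    The abstract reports a Box-Behnken design analyzed by Response Surface Methodology; below is a minimal sketch of fitting the usual second-order response-surface polynomial by least squares. The design points and response values are invented for illustration and are not the study's measurements.

      import numpy as np

      # Hypothetical Box-Behnken runs for the three factors in the abstract
      # (frequency kHz, power density W/L, time min) with invented responses.
      X = np.array([[40, 50, 15], [40, 150, 15], [120, 50, 15], [120, 150, 15],
                    [40, 100, 5], [40, 100, 25], [120, 100, 5], [120, 100, 25],
                    [80, 50, 5], [80, 50, 25], [80, 150, 5], [80, 150, 25],
                    [80, 100, 15], [80, 100, 15], [80, 100, 15]], dtype=float)
      y = np.array([18.2, 30.5, 15.9, 26.8, 16.4, 31.0, 14.7, 25.2,
                    13.8, 24.1, 22.6, 30.9, 23.5, 23.9, 23.2])

      def design_matrix(X):
          f, p, t = X.T
          # Intercept, linear, interaction, and quadratic terms of the model.
          return np.column_stack([np.ones(len(X)), f, p, t,
                                  f * p, f * t, p * t, f ** 2, p ** 2, t ** 2])

      coef, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
      optimum = np.array([[40.0, 150.0, 25.0]])   # optimum settings reported above
      print(design_matrix(optimum) @ coef)        # model-predicted response there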

  19. Evaluating the state of the art in coreference resolution for electronic medical records

    PubMed Central

    Bodnari, Andreea; Shen, Shuying; Forbush, Tyler; Pestian, John; South, Brett R

    2012-01-01

    Background The fifth i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records conducted a systematic review on resolution of noun phrase coreference in medical records. Informatics for Integrating Biology and the Bedside (i2b2) and the Veterans Affair (VA) Consortium for Healthcare Informatics Research (CHIR) partnered to organize the coreference challenge. They provided the research community with two corpora of medical records for the development and evaluation of the coreference resolution systems. These corpora contained various record types (ie, discharge summaries, pathology reports) from multiple institutions. Methods The coreference challenge provided the community with two annotated ground truth corpora and evaluated systems on coreference resolution in two ways: first, it evaluated systems for their ability to identify mentions of concepts and to link together those mentions. Second, it evaluated the ability of the systems to link together ground truth mentions that refer to the same entity. Twenty teams representing 29 organizations and nine countries participated in the coreference challenge. Results The teams' system submissions showed that machine-learning and rule-based approaches worked best when augmented with external knowledge sources and coreference clues extracted from document structure. The systems performed better in coreference resolution when provided with ground truth mentions. Overall, the systems struggled in solving coreference resolution for cases that required domain knowledge. PMID:22366294

  20. The Influence of Ecological and Conventional Plant Production Systems on Soil Microbial Quality under Hops (Humulus lupulus)

    PubMed Central

    Oszust, Karolina; Frąc, Magdalena; Gryta, Agata; Bilińska, Nina

    2014-01-01

    Knowledge about microbial activity and diversity under hop production is still limited. We assumed that different systems of hop production (within the same soil and climatic conditions) significantly influence the composition of soil microbial populations and their functional activity (metabolic potential). We therefore compared a set of soil microbial properties in a field experiment covering two hop production systems: (a) ecological, based on the use of probiotic preparations and organic fertilization, and (b) conventional, using chemical pesticides and mineral fertilizers. Soil analyses included the following microbial properties: the total number of microorganisms, a range of soil enzyme activities, and the catabolic potential assessed with Biolog EcoPlates®. Moreover, the abundance of ammonia-oxidizing archaea (AOA) was characterized by terminal restriction fragment length polymorphism (T-RFLP) analysis of PCR-amplified ammonia monooxygenase α-subunit (amoA) gene products. The conventional and ecological hop production systems affected the soil microbial state in different, season-dependent ways. The favorable effect on soil microbial activity observed under ecological production was most probably due to the application of livestock-based manure and fermented plant extracts. No negative influence on the conventional hopyard soil was revealed. Both types of production fulfilled fertilizing demands; under ecological production this was achieved with livestock-based manure fertilizers and fermented plant extracts. PMID:24897025

  1. Design of a decision-support architecture for management of remotely monitored patients.

    PubMed

    Basilakis, Jim; Lovell, Nigel H; Redmond, Stephen J; Celler, Branko G

    2010-09-01

    Telehealth is the provision of health services at a distance. Typically, this occurs in unsupervised or remote environments, such as a patient's home. We describe one such telehealth system and the integration of extracted clinical measurement parameters with a decision-support system (DSS). An enterprise application-server framework, combined with a rules engine and statistical analysis tools, is used to analyze the acquired telehealth data, searching for trends and shifts in parameter values, as well as identifying individual measurements that exceed predetermined or adaptive thresholds. An overarching business process engine is used to manage the core DSS knowledge base and coordinate workflow outputs of the DSS. The primary role for such a DSS is to provide an effective means to reduce the data overload and to provide a means of health risk stratification to allow appropriate targeting of clinical resources to best manage the health of the patient. In this way, the system may ultimately influence changes in workflow by targeting scarce clinical resources to patients of most need. A single case study extracted from an initial pilot trial of the system, in patients with chronic obstructive pulmonary disease and chronic heart failure, will be reviewed to illustrate the potential benefit of integrating telehealth and decision support in the management of both acute and chronic disease.
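
    The abstract describes combining a rules engine with statistical trend, shift, and threshold checks on telehealth measurements. A minimal sketch of that style of check follows; the parameter, thresholds, window size, and trend test are illustrative assumptions, not the system's actual rule base.

      import statistics

      # Illustrative daily SpO2 readings for one remotely monitored patient.
      readings = [95, 94, 95, 93, 92, 91, 90]

      THRESHOLD = 92          # hypothetical fixed alert threshold
      TREND_WINDOW = 5        # hypothetical window for shift/trend detection

      alerts = []
      if readings[-1] < THRESHOLD:
          alerts.append("latest measurement below threshold")

      # Crude shift check: mean of the recent window vs. the preceding baseline.
      recent = readings[-TREND_WINDOW:]
      baseline = readings[:-TREND_WINDOW] or recent
      if statistics.mean(recent) < statistics.mean(baseline) - 1.5:
          alerts.append("downward shift in recent measurements")

      print(alerts or ["no alert"])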

  2. Plenoptic layer-based modeling for image based rendering.

    PubMed

    Pearson, James; Brookes, Mike; Dragotti, Pier Luigi

    2013-09-01

    Image based rendering is an attractive alternative to model based rendering for generating novel views because of its lower complexity and potential for photo-realistic results. To reduce the number of images necessary for alias-free rendering, some geometric information for the 3D scene is normally necessary. In this paper, we present a fast automatic layer-based method for synthesizing an arbitrary new view of a scene from a set of existing views. Our algorithm takes advantage of the knowledge of the typical structure of multiview data to perform occlusion-aware layer extraction. In addition, the number of depth layers used to approximate the geometry of the scene is chosen based on plenoptic sampling theory with the layers placed non-uniformly to account for the scene distribution. The rendering is achieved using a probabilistic interpolation approach and by extracting the depth layer information on a small number of key images. Numerical results demonstrate that the algorithm is fast and yet is only 0.25 dB away from the ideal performance achieved with the ground-truth knowledge of the 3D geometry of the scene of interest. This indicates that there are measurable benefits from following the predictions of plenoptic theory and that they remain true when translated into a practical system for real world data.

  3. Knowledge Representation of the Melody and Rhythm in Koto Songs

    NASA Astrophysics Data System (ADS)

    Deguchi, Sachiko; Shirai, Katsuhiko

    This paper describes the knowledge representation of the melody and rhythm in koto songs based on the structure of the domain: the scale, melisma (the melody in a syllable), and bar. We have encoded koto scores and extracted 2,3,4-note melodic patterns sequentially from the voice part of koto scores. The 2,3,4-note patterns used in the melisma are limited and the percentages of top patterns are high. The 3,4-note melodic patterns are examined at each scale degree. These patterns are more restricted than the patterns that are possible under the constraint of the scale. These typical patterns on the scale represent the knowledge of koto players. We have analyzed rhythms in two different ways. We have extracted rhythms depending on each melodic pattern, while we have extracted rhythms depending on each bar. The former are complicated and the latter are typical. This result indicates that koto players recognize melodic patterns and rhythmic patterns independently. Our analyses show the melodic patterns and rhythmic patterns that are acquired by koto players. These patterns will be applied to the description of variations of the melisma to build a score database. These patterns will also be applied to a composition and education. The melodic patterns can be extracted from other genres of Japanese traditional music, foreign old folk songs or chants by using this method.
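
    The sequential extraction of 2-, 3-, and 4-note melodic patterns described above can be illustrated with a short sketch; the melody encoding (a plain list of scale degrees) and the example sequence are assumptions for illustration, not the paper's koto score encoding.

      from collections import Counter

      # Hypothetical melody encoded as scale degrees; extract sequential
      # 2-, 3-, and 4-note patterns and count how often each occurs.
      melody = [1, 2, 3, 2, 1, 5, 4, 3, 2, 1, 2, 3]

      def ngram_counts(seq, n):
          return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))

      for n in (2, 3, 4):
          print(n, ngram_counts(melody, n).most_common(3))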

  4. Dynamic knowledge representation using agent-based modeling: ontology instantiation and verification of conceptual models.

    PubMed

    An, Gary

    2009-01-01

    The sheer volume of biomedical research threatens to overwhelm the capacity of individuals to effectively process this information. Adding to this challenge is the multiscale nature of both biological systems and the research community as a whole. Given this volume and rate of generation of biomedical information, the research community must develop methods for robust representation of knowledge in order for individuals, and the community as a whole, to "know what they know." Despite increasing emphasis on "data-driven" research, the fact remains that researchers guide their research using intuitively constructed conceptual models derived from knowledge extracted from publications, knowledge that is generally qualitatively expressed using natural language. Agent-based modeling (ABM) is a computational modeling method that is suited to translating the knowledge expressed in biomedical texts into dynamic representations of the conceptual models generated by researchers. The hierarchical object-class orientation of ABM maps well to biomedical ontological structures, facilitating the translation of ontologies into instantiated models. Furthermore, ABM is suited to producing the nonintuitive behaviors that often "break" conceptual models. Verification in this context is focused at determining the plausibility of a particular conceptual model, and qualitative knowledge representation is often sufficient for this goal. Thus, utilized in this fashion, ABM can provide a powerful adjunct to other computational methods within the research process, as well as providing a metamodeling framework to enhance the evolution of biomedical ontologies.

  5. Concept maps: A tool for knowledge management and synthesis in web-based conversational learning.

    PubMed

    Joshi, Ankur; Singh, Satendra; Jaswal, Shivani; Badyal, Dinesh Kumar; Singh, Tejinder

    2016-01-01

    Web-based conversational learning provides an opportunity for shared knowledge base creation through collaboration and collective wisdom extraction. Usually, the amount of information generated in such forums is very large and multidimensional (in alignment with the desirable preconditions for constructivist knowledge creation), and sometimes the nature of the expected new information cannot be anticipated in advance. Thus, concept maps (crafted from constructed data) as "process summary" tools may be a solution to improve critical thinking and learning by making connections between the facts or knowledge shared by the participants during online discussion. This exploratory paper begins with a description of this innovation, tried on a web-based interaction platform (email list management software), FAIMER-Listserv, and the qualitative evidence generated through peer feedback. This process description is further supported by a theoretical construct which shows how social constructivism (inclusive of autonomy and complexity) affects conversational learning. The paper rationalizes the use of concept maps as a mid-summary tool for extracting information and making further sense of this apparent intricacy.

  6. Knowledge Discovery in Variant Databases Using Inductive Logic Programming

    PubMed Central

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D.

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/. PMID:23589683

  7. Knowledge discovery in variant databases using inductive logic programming.

    PubMed

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.

  8. Qualitative and Quantitative Phytochemical Analysis of Different Extracts from Thymus algeriensis Aerial Parts.

    PubMed

    Boutaoui, Nassima; Zaiter, Lahcene; Benayache, Fadila; Benayache, Samir; Carradori, Simone; Cesa, Stefania; Giusti, Anna Maria; Campestre, Cristina; Menghini, Luigi; Innosa, Denise; Locatelli, Marcello

    2018-02-20

    This study was performed to evaluate the metabolite recovery from different extraction methods applied to Thymus algeriensis aerial parts. A high-performance liquid chromatographic method using photodiode array detector with gradient elution has been developed and validated for the simultaneous estimation of different phenolic compounds in the extracts and in their corresponding purified fractions. The experimental results show that microwave-assisted aqueous extraction for 15 min at 100 °C gave the most phenolics-enriched extract, reducing extraction time without degradation effects on bioactives. Sixteen compounds were identified in this extract, 11 phenolic compounds and five flavonoids, all known for their biological activities. Color analysis and determination of chlorophylls and carotenoids implemented the knowledge of the chemical profile of this plant.

  9. Monitoring System for Storm Readiness and Recovery of Test Facilities: Integrated System Health Management (ISHM) Approach

    NASA Technical Reports Server (NTRS)

    Figueroa, Fernando; Morris, Jon; Turowski, Mark; Franzl, Richard; Walker, Mark; Kapadia, Ravi; Venkatesh, Meera; Schmalzel, John

    2010-01-01

    Severe weather events are likely occurrences on the Mississippi Gulf Coast. It is important to rapidly diagnose and mitigate the effects of storms on Stennis Space Center's rocket engine test complex to avoid delays to critical test article programs, reduce costs, and maintain safety. An Integrated Systems Health Management (ISHM) approach and technologies are employed to integrate environmental (weather) monitoring, structural modeling, and the suite of available facility instrumentation to provide information for readiness before storms, rapid initial damage assessment to guide mitigation planning, ongoing assurance as repairs are effected, and finally support for recertification. The system is named the Katrina Storm Monitoring System (KStorMS). Integrated Systems Health Management (ISHM) describes a comprehensive set of capabilities that provide insight into the behavior and health of a system. Knowing the status of a system allows decision makers to effectively plan and execute their mission. For example, early insight into component degradation and impending failures provides more time to develop workaround strategies and more effectively plan for maintenance. Failures of system elements generally occur over time. Information extracted from sensor data, combined with system-wide knowledge bases and methods for information extraction and fusion, inference, and decision making, can be used to detect incipient failures. If failures do occur, it is critical to detect and isolate them, and suggest an appropriate course of action. ISHM enables determining the condition (health) of every element in a complex system-of-systems or SoS (detecting anomalies, diagnosing causes, predicting future anomalies), and providing data, information, and knowledge (DIaK) to control systems for safe and effective operation. ISHM capability is achieved by using a wide range of technologies that enable anomaly detection, diagnostics, prognostics, and advice for control: (1) anomaly detection algorithms and strategies, (2) fusion of DIaK for anomaly detection (model-based, numerical, statistical, empirical, expert-based, qualitative, etc.), (3) diagnostics/prognostics strategies and methods, (4) user interface, (5) advanced control strategies, (6) integration architectures/frameworks, (7) embedding of intelligence. Many of these technologies are mature, and they are being used in the KStorMS. The paper describes the design, implementation, and operation of the KStorMS, and discusses its further evolution to support other needs such as condition-based maintenance (CBM).
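
    Among the capabilities listed above are statistical anomaly-detection strategies applied to sensor data; a minimal sketch of one such strategy, a rolling z-score check on a single sensor stream, is given below. The data, window size, and threshold are illustrative assumptions and not KStorMS details.

      import statistics

      # Illustrative sensor stream; flag samples that deviate strongly from the
      # statistics of a trailing window (one simple anomaly-detection strategy).
      stream = [10.1, 10.0, 10.2, 9.9, 10.1, 10.0, 14.8, 10.2, 10.1]
      WINDOW, Z_LIMIT = 5, 3.0

      anomalies = []
      for i in range(WINDOW, len(stream)):
          window = stream[i - WINDOW:i]
          mu = statistics.mean(window)
          sigma = statistics.pstdev(window) or 1e-9   # avoid division by zero
          z = (stream[i] - mu) / sigma
          if abs(z) > Z_LIMIT:
              anomalies.append((i, stream[i], round(z, 1)))

      print(anomalies)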

  10. Optimizing the extraction, storage, and analysis of airborne endotoxins

    USDA-ARS?s Scientific Manuscript database

    While the Limulus amebocyte lysate (LAL) assay is part of most procedures to assess airborne endotoxin exposure, there is no universally agreed upon standard procedure. The purpose of this study was to fill in additional knowledge gaps with respect to the extraction, storage, and analysis of endotox...

  11. Respiratory-aspirated 35-mm hairpin successfully retrieved with a Teflon® snare system under fluoroscopic guidance via a split endotracheal tube: a useful technique in cases of failed extraction by bronchoscopy and avoiding the need for a thoracotomy.

    PubMed

    Gill, S S; Pease, R A; Ashwin, C J; Gill, S S; Tait, N P

    2012-09-01

    Respiratory foreign body aspiration (FBA) is a common global health problem requiring prompt recognition and early treatment to prevent potentially fatal complications. The majority of FBAs are due to organic objects and treatment is usually via either endoscopic or surgical extraction. FBA of a straight hairpin has been described as a unique entity in the literature, occurring most commonly in females, particularly during adolescence. When inserting hairpins, the pins are typically held between the teeth with the head tilted backwards while the hair is tied with both hands. This position increases the risk of aspiration, particularly if there is any sudden coughing or laughing. To our knowledge, this is the first case report of a 35-mm straight metallic hairpin foreign body that has been successfully retrieved by a radiological snare system under fluoroscopic guidance. This was achieved with the use of a split endotracheal tube, and therefore avoided the need for a thoracotomy in an adolescent female patient.

  12. A Weighted Deep Representation Learning Model for Imbalanced Fault Diagnosis in Cyber-Physical Systems.

    PubMed

    Wu, Zhenyu; Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang

    2018-04-05

    Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods.
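
    The model above pairs an under-sampling policy with a weighted cost-sensitive loss to counter class imbalance. Below is a minimal sketch of the class-weighted loss component in PyTorch; the class counts, weighting rule, and tiny tensors are illustrative, and this is not the wLRCL-D architecture itself.

      import torch
      import torch.nn as nn

      # Hypothetical imbalanced class counts: many normal samples, few faulty ones.
      class_counts = torch.tensor([900.0, 100.0])          # [normal, faulty]
      weights = class_counts.sum() / (2 * class_counts)    # inverse-frequency weights

      # Class-weighted cross entropy penalizes mistakes on the rare class more.
      criterion = nn.CrossEntropyLoss(weight=weights)

      logits = torch.tensor([[2.0, 0.5], [0.2, 0.1]])      # toy network outputs
      targets = torch.tensor([0, 1])                       # true labels
      print(criterion(logits, targets))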

  13. Application of off-line two-dimensional high-performance countercurrent chromatography on the chloroform-soluble extract of Cuscuta auralis seeds.

    PubMed

    Rho, Taewoong; Yoon, Kee Dong

    2018-05-01

    In this study, the chloroform-soluble extract of Cuscuta auralis was separated successfully using off-line two-dimensional high-performance countercurrent chromatography, yielding a γ-pyrone, two alkaloids, a flavonoid, and four lignans. The first-dimensional countercurrent separation using a methylene chloride/methanol/water (11:6:5, v/v/v) system yielded three subfractions (fractions I-III). The second-dimensional countercurrent separations, conducted on fractions I-III using n-hexane/ethyl acetate/methanol/water/acetic acid (5:5:5:5:0, 3:7:3:7:0, and 1:9:1:9:0.01, v/v/v/v/v) systems, gave maltol (1), (-)-(13S)-cuscutamine (2), (+)-(13R)-cuscutamine (3), (+)-pinoresinol (4), (+)-epipinoresinol (5), kaempferol (6), piperitol (7), and (9R)-hydroxy-d-sesamin (8). To the best of our knowledge, maltol was identified for the first time in Cuscuta species. Furthermore, this report details the first full assignment of spectroscopic data of two cuscutamine epimers, (-)-(13S)-cuscutamine and (+)-(13R)-cuscutamine. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. A Weighted Deep Representation Learning Model for Imbalanced Fault Diagnosis in Cyber-Physical Systems

    PubMed Central

    Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang

    2018-01-01

    Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods. PMID:29621131

  15. Project X: competitive intelligence data mining and analysis

    NASA Astrophysics Data System (ADS)

    Gilmore, John F.; Pagels, Michael A.; Palk, Justin

    2001-03-01

    Competitive Intelligence (CI) is a systematic and ethical program for gathering and analyzing information about your competitors' activities and general business trends to further your own company's goals. CI allows companies to gather extensive information on their competitors and to analyze what the competition is doing in order to maintain or gain a competitive edge. In commercial business this potentially translates into millions of dollars in annual savings or losses. The Internet provides an overwhelming portal of information for CI analysis. The problem is how a company can automate the translation of voluminous information into valuable and actionable knowledge. This paper describes Project X, an agent-based data mining system specifically developed for extracting and analyzing competitive information from the Internet. Project X gathers CI information from a variety of sources including online newspapers, corporate websites, industry sector reporting sites, speech archiving sites, video news casts, stock news sites, weather sites, and rumor sites. It uses individual industry specific (e.g., pharmaceutical, financial, aerospace, etc.) commercial sector ontologies to form the knowledge filtering and discovery structures/content required to filter and identify valuable competitive knowledge. Project X is described in detail and an example competitive intelligence case is shown demonstrating the system's performance and utility for business intelligence.

  16. The Adverse Drug Reactions from Patient Reports in Social Media Project: Five Major Challenges to Overcome to Operationalize Analysis and Efficiently Support Pharmacovigilance Process.

    PubMed

    Bousquet, Cedric; Dahamna, Badisse; Guillemin-Lanne, Sylvie; Darmoni, Stefan J; Faviez, Carole; Huot, Charles; Katsahian, Sandrine; Leroux, Vincent; Pereira, Suzanne; Richard, Christophe; Schück, Stéphane; Souvignet, Julien; Lillo-Le Louët, Agnès; Texier, Nathalie

    2017-09-21

    Adverse drug reactions (ADRs) are an important cause of morbidity and mortality. Classical Pharmacovigilance process is limited by underreporting which justifies the current interest in new knowledge sources such as social media. The Adverse Drug Reactions from Patient Reports in Social Media (ADR-PRISM) project aims to extract ADRs reported by patients in these media. We identified 5 major challenges to overcome to operationalize the analysis of patient posts: (1) variable quality of information on social media, (2) guarantee of data privacy, (3) response to pharmacovigilance expert expectations, (4) identification of relevant information within Web pages, and (5) robust and evolutive architecture. This article aims to describe the current state of advancement of the ADR-PRISM project by focusing on the solutions we have chosen to address these 5 major challenges. In this article, we propose methods and describe the advancement of this project on several aspects: (1) a quality driven approach for selecting relevant social media for the extraction of knowledge on potential ADRs, (2) an assessment of ethical issues and French regulation for the analysis of data on social media, (3) an analysis of pharmacovigilance expert requirements when reviewing patient posts on the Internet, (4) an extraction method based on natural language processing, pattern based matching, and selection of relevant medical concepts in reference terminologies, and (5) specifications of a component-based architecture for the monitoring system. Considering the 5 major challenges, we (1) selected a set of 21 validated criteria for selecting social media to support the extraction of potential ADRs, (2) proposed solutions to guarantee data privacy of patients posting on Internet, (3) took into account pharmacovigilance expert requirements with use case diagrams and scenarios, (4) built domain-specific knowledge resources embeding a lexicon, morphological rules, context rules, semantic rules, syntactic rules, and post-analysis processing, and (5) proposed a component-based architecture that allows storage of big data and accessibility to third-party applications through Web services. We demonstrated the feasibility of implementing a component-based architecture that allows collection of patient posts on the Internet, near real-time processing of those posts including annotation, and storage in big data structures. In the next steps, we will evaluate the posts identified by the system in social media to clarify the interest and relevance of such approach to improve conventional pharmacovigilance processes based on spontaneous reporting. ©Cedric Bousquet, Badisse Dahamna, Sylvie Guillemin-Lanne, Stefan J Darmoni, Carole Faviez, Charles Huot, Sandrine Katsahian, Vincent Leroux, Suzanne Pereira, Christophe Richard, Stéphane Schück, Julien Souvignet, Agnès Lillo-Le Louët, Nathalie Texier. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 21.09.2017.

  17. The Science and Art of Eyebrow Transplantation by Follicular Unit Extraction

    PubMed Central

    Gupta, Jyoti; Kumar, Amrendra; Chouhan, Kavish; Ariganesh, C; Nandal, Vinay

    2017-01-01

    Eyebrows constitute a very important and prominent feature of the face. With growing information, eyebrow transplant has become a popular procedure. However, though it is a small area it requires a lot of precision and knowledge regarding anatomy, designing of brows, extraction and implantation technique. This article gives a comprehensive view regarding eyebrow transplant with special emphasis on follicular unit extraction technique, which has become the most popular technique. PMID:28852290

  18. Integrating clinicians, knowledge and data: expert-based cooperative analysis in healthcare decision support

    PubMed Central

    2010-01-01

    Background Decision support in health systems is a highly difficult task, due to the inherent complexity of the processes and structures involved. Method This paper introduces a new hybrid methodology, Expert-based Cooperative Analysis (EbCA), which incorporates explicit prior expert knowledge in data analysis methods, and elicits implicit or tacit expert knowledge (IK) to improve decision support in healthcare systems. EbCA has been applied to two different case studies, showing its usability and versatility: 1) benchmarking of small mental health areas based on technical efficiency estimated by EbCA-Data Envelopment Analysis (EbCA-DEA), and 2) case-mix of schizophrenia based on functional dependency using Clustering Based on Rules (ClBR). In both cases comparisons with classical procedures using qualitative explicit prior knowledge were made. Bayesian predictive validity measures were used for comparison with expert panel results. Overall agreement was tested by Intraclass Correlation Coefficient in case "1" and kappa in both cases. Results EbCA is a new methodology composed of 6 steps: 1) data collection and data preparation; 2) acquisition of "Prior Expert Knowledge" (PEK) and design of the "Prior Knowledge Base" (PKB); 3) PKB-guided analysis; 4) support-interpretation tools to evaluate results and detect inconsistencies (here Implicit Knowledge (IK) might be elicited); 5) incorporation of elicited IK in the PKB, repeated until a satisfactory solution is reached; 6) post-processing of results for decision support. EbCA has been useful for incorporating PEK in two different analysis methods (DEA and Clustering), applied respectively to assess the technical efficiency of small mental health areas and the case-mix of schizophrenia based on functional dependency. Differences in results obtained with classical approaches were mainly related to the IK which could be elicited by using EbCA and had major implications for decision making in both cases. Discussion This paper presents EbCA and shows the convenience of complementing classical data analysis with PEK as a means to extract relevant knowledge in complex health domains. One of the major benefits of EbCA is iterative elicitation of IK. Both explicit and tacit or implicit expert knowledge are critical to guide the scientific analysis of very complex decisional problems such as those found in health system research. PMID:20920289

  19. Integrating clinicians, knowledge and data: expert-based cooperative analysis in healthcare decision support.

    PubMed

    Gibert, Karina; García-Alonso, Carlos; Salvador-Carulla, Luis

    2010-09-30

    Decision support in health systems is a highly difficult task, due to the inherent complexity of the processes and structures involved. This paper introduces a new hybrid methodology, Expert-based Cooperative Analysis (EbCA), which incorporates explicit prior expert knowledge in data analysis methods, and elicits implicit or tacit expert knowledge (IK) to improve decision support in healthcare systems. EbCA has been applied to two different case studies, showing its usability and versatility: 1) benchmarking of small mental health areas based on technical efficiency estimated by EbCA-Data Envelopment Analysis (EbCA-DEA), and 2) case-mix of schizophrenia based on functional dependency using Clustering Based on Rules (ClBR). In both cases comparisons with classical procedures using qualitative explicit prior knowledge were made. Bayesian predictive validity measures were used for comparison with expert panel results. Overall agreement was tested by Intraclass Correlation Coefficient in case "1" and kappa in both cases. EbCA is a new methodology composed of 6 steps: 1) data collection and data preparation; 2) acquisition of "Prior Expert Knowledge" (PEK) and design of the "Prior Knowledge Base" (PKB); 3) PKB-guided analysis; 4) support-interpretation tools to evaluate results and detect inconsistencies (here Implicit Knowledge (IK) might be elicited); 5) incorporation of elicited IK in the PKB, repeated until a satisfactory solution is reached; 6) post-processing of results for decision support. EbCA has been useful for incorporating PEK in two different analysis methods (DEA and Clustering), applied respectively to assess the technical efficiency of small mental health areas and the case-mix of schizophrenia based on functional dependency. Differences in results obtained with classical approaches were mainly related to the IK which could be elicited by using EbCA and had major implications for decision making in both cases. This paper presents EbCA and shows the convenience of complementing classical data analysis with PEK as a means to extract relevant knowledge in complex health domains. One of the major benefits of EbCA is iterative elicitation of IK. Both explicit and tacit or implicit expert knowledge are critical to guide the scientific analysis of very complex decisional problems such as those found in health system research.

  20. Separation of presampling and postsampling modulation transfer functions in infrared sensor systems

    NASA Astrophysics Data System (ADS)

    Espinola, Richard L.; Olson, Jeffrey T.; O'Shea, Patrick D.; Hodgkin, Van A.; Jacobs, Eddie L.

    2006-05-01

    New methods of measuring the modulation transfer function (MTF) of electro-optical sensor systems are investigated. These methods are designed to allow the separation and extraction of presampling and postsampling components from the total system MTF. The presampling MTF includes all the effects prior to the sampling stage of the imaging process, such as optical blur and detector shape. The postsampling MTF includes all the effects after sampling, such as interpolation filters and display characteristics. Simulation and laboratory measurements are used to assess the utility of these techniques. Knowledge of these components and inclusion into sensor models, such as the U.S. Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate's NVThermIP, will allow more accurate modeling and complete characterization of sensor performance.

  1. An ontology-based method for secondary use of electronic dental record data.

    PubMed

    Schleyer, Titus Kl; Ruttenberg, Alan; Duncan, William; Haendel, Melissa; Torniai, Carlo; Acharya, Amit; Song, Mei; Thyvalikakath, Thankam P; Liu, Kaihong; Hernandez, Pedro

    2013-01-01

    A key question for healthcare is how to operationalize the vision of the Learning Healthcare System, in which electronic health record data become a continuous information source for quality assurance and research. This project presents an initial, ontology-based, method for secondary use of electronic dental record (EDR) data. We defined a set of dental clinical research questions; constructed the Oral Health and Disease Ontology (OHD); analyzed data from a commercial EDR database; and created a knowledge base, with the OHD used to represent clinical data about 4,500 patients from a single dental practice. Currently, the OHD includes 213 classes and reuses 1,658 classes from other ontologies. We have developed an initial set of SPARQL queries to allow extraction of data about patients, teeth, surfaces, restorations and findings. Further work will establish a complete, open and reproducible workflow for extracting and aggregating data from a variety of EDRs for research and quality assurance.
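
    The abstract mentions SPARQL queries over the OHD-based knowledge base for extracting data about patients, teeth, and restorations. A minimal sketch of issuing such a query with rdflib follows; the namespace IRI, class names, properties, and the tiny in-memory graph are hypothetical placeholders, not the actual OHD vocabulary or SPARQL queries developed in the project.

      from rdflib import Graph, Namespace, RDF

      EX = Namespace("http://example.org/ohd#")   # placeholder namespace, not OHD's
      g = Graph()

      # Tiny in-memory stand-in for an EDR-derived knowledge base (illustrative only).
      g.add((EX.restoration1, RDF.type, EX.Restoration))
      g.add((EX.restoration1, EX.onTooth, EX.tooth3))
      g.add((EX.tooth3, EX.belongsTo, EX.patient42))

      # Hypothetical query shape: list restorations with their tooth and patient.
      query = """
      PREFIX ex: <http://example.org/ohd#>
      SELECT ?patient ?tooth ?restoration WHERE {
          ?restoration a ex:Restoration ;
                       ex:onTooth ?tooth .
          ?tooth ex:belongsTo ?patient .
      }
      """
      for row in g.query(query):
          print(row.patient, row.tooth, row.restoration)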

  2. Low-ν Flux and Total Charged-Current Cross Sections in MINERvA

    NASA Astrophysics Data System (ADS)

    Ren, Lu

    2014-03-01

    The MINERνA experiment measures neutrino and antineutrino interaction cross sections on carbon and other nuclei. Cross section measurements require accurate knowledge of the incident neutrino flux. The "low-ν" flux technique uses a standard-candle cross section for events with low energy transfer to the hadronic system to determine the incident flux. MINERνA will use low-ν fluxes for neutrinos and antineutrinos to tune production models used in beam simulations and to extract total cross sections as a function of energy. We present the low-ν flux technique adapted for the MINERνA data samples and preliminary results for the extracted low-ν fluxes in MINERνA. MINERνA will extend the range of antineutrino charged-current cross section measurements to lower energies, which are of interest to future accelerator oscillation experiments.
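
    The abstract does not spell out the low-ν relation; in its usual form (a sketch of the standard argument, not the MINERνA analysis itself), the charged-current event rate with hadronic energy transfer ν below a small cutoff ν0 tracks the flux shape, because the corresponding cross section becomes approximately independent of neutrino energy:

      N(E_\nu,\ \nu < \nu_0) \;=\; \Phi(E_\nu)\,\sigma_{\nu<\nu_0}(E_\nu)
      \;\approx\; \Phi(E_\nu)\times\mathrm{const} \qquad (\nu_0 \ll E_\nu),

    so the shape of Φ(E_ν) can be read off from the low-ν event sample up to an overall normalization.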

  3. Motor Fault Diagnosis Based on Short-time Fourier Transform and Convolutional Neural Network

    NASA Astrophysics Data System (ADS)

    Wang, Li-Hua; Zhao, Xiao-Ping; Wu, Jia-Xin; Xie, Yang-Yang; Zhang, Yong-Hong

    2017-11-01

    With the rapid development of mechanical equipment, the mechanical health monitoring field has entered the era of big data. However, manual feature extraction suffers from low efficiency and poor accuracy when handling big data. In this study, the research object was the asynchronous motor in a drivetrain diagnostics simulator system. The vibration signals of motors with different faults were collected. The raw signal was pretreated using the short-time Fourier transform (STFT) to obtain the corresponding time-frequency map. Then, features of the time-frequency map were adaptively extracted using a convolutional neural network (CNN). The effects of the pretreatment method and of the network hyperparameters on diagnostic accuracy were investigated experimentally. The experimental results showed that the influence of the preprocessing method is small, and that the batch size is the main factor affecting accuracy and training efficiency. Feature visualization showed that, in the case of big data, the extracted CNN features can represent complex mapping relationships between the signal and the health status, and can also remove the requirement for prior knowledge and engineering experience that traditional diagnosis methods rely on for feature extraction. This paper proposes a new method, based on STFT and CNN, which can complete motor fault diagnosis tasks more intelligently and accurately.
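
    A minimal sketch of the STFT preprocessing step described above, turning a vibration signal into a time-frequency magnitude map that a CNN could consume; the synthetic signal, sampling rate, and STFT parameters are illustrative assumptions, not those of the study.

      import numpy as np
      from scipy.signal import stft

      fs = 12_000                                  # hypothetical sampling rate, Hz
      t = np.arange(0, 1.0, 1 / fs)
      # Synthetic stand-in for a motor vibration signal: two tones plus noise.
      signal = (np.sin(2 * np.pi * 60 * t)
                + 0.5 * np.sin(2 * np.pi * 1_800 * t)
                + 0.1 * np.random.randn(t.size))

      # Short-time Fourier transform -> time-frequency map for the CNN input.
      freqs, times, Zxx = stft(signal, fs=fs, nperseg=256, noverlap=128)
      tf_map = np.abs(Zxx)
      print(tf_map.shape)                          # (frequency bins, time frames)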

  4. Cell-cycle regulation of formin-mediated actin cable assembly

    PubMed Central

    Miao, Yansong; Wong, Catherine C. L.; Mennella, Vito; Michelot, Alphée; Agard, David A.; Holt, Liam J.; Yates, John R.; Drubin, David G.

    2013-01-01

    Assembly of appropriately oriented actin cables nucleated by formin proteins is necessary for many biological processes in diverse eukaryotes. However, compared with knowledge of how nucleation of dendritic actin filament arrays by the actin-related protein-2/3 complex is regulated, the in vivo regulatory mechanisms for actin cable formation are less clear. To gain insights into mechanisms for regulating actin cable assembly, we reconstituted the assembly process in vitro by introducing microspheres functionalized with the C terminus of the budding yeast formin Bni1 into extracts prepared from yeast cells at different cell-cycle stages. EM studies showed that unbranched actin filament bundles were reconstituted successfully in the yeast extracts. Only extracts enriched in the mitotic cyclin Clb2 were competent for actin cable assembly, and cyclin-dependent kinase 1 activity was indispensible. Cyclin-dependent kinase 1 activity also was found to regulate cable assembly in vivo. Here we present evidence that formin cell-cycle regulation is conserved in vertebrates. The use of the cable-reconstitution system to test roles for the key actin-binding proteins tropomyosin, capping protein, and cofilin provided important insights into assembly regulation. Furthermore, using mass spectrometry, we identified components of the actin cables formed in yeast extracts, providing the basis for comprehensive understanding of cable assembly and regulation. PMID:24133141

  5. Fundus Image Features Extraction for Exudate Mining in Coordination with Content Based Image Retrieval: A Study

    NASA Astrophysics Data System (ADS)

    Gururaj, C.; Jayadevappa, D.; Tunga, Satish

    2018-02-01

    Medical field has seen a phenomenal improvement over the previous years. The invention of computers with appropriate increase in the processing and internet speed has changed the face of the medical technology. However there is still scope for improvement of the technologies in use today. One of the many such technologies of medical aid is the detection of afflictions of the eye. Although a repertoire of research has been accomplished in this field, most of them fail to address how to take the detection forward to a stage where it will be beneficial to the society at large. An automated system that can predict the current medical condition of a patient after taking the fundus image of his eye is yet to see the light of the day. Such a system is explored in this paper by summarizing a number of techniques for fundus image features extraction, predominantly hard exudate mining, coupled with Content Based Image Retrieval to develop an automation tool. The knowledge of the same would bring about worthy changes in the domain of exudates extraction of the eye. This is essential in cases where the patients may not have access to the best of technologies. This paper attempts at a comprehensive summary of the techniques for Content Based Image Retrieval (CBIR) or fundus features image extraction, and few choice methods of both, and an exploration which aims to find ways to combine these two attractive features, and combine them so that it is beneficial to all.

  6. Component spectra extraction from terahertz measurements of unknown mixtures.

    PubMed

    Li, Xian; Hou, D B; Huang, P J; Cai, J H; Zhang, G X

    2015-10-20

    The aim of this work is to extract component spectra from unknown mixtures in the terahertz region. To that end, a method, hard modeling factor analysis (HMFA), was applied to resolve terahertz spectral matrices collected from the unknown mixtures. This method does not require any expertise of the user and allows the consideration of nonlinear effects such as peak variations or peak shifts. It describes the spectra using a peak-based nonlinear mathematic model and builds the component spectra automatically by recombination of the resolved peaks through correlation analysis. Meanwhile, modifications on the method were made to take the features of terahertz spectra into account and to deal with the artificial baseline problem that troubles the extraction process of some terahertz spectra. In order to validate the proposed method, simulated wideband terahertz spectra of binary and ternary systems and experimental terahertz absorption spectra of amino acids mixtures were tested. In each test, not only the number of pure components could be correctly predicted but also the identified pure spectra had a good similarity with the true spectra. Moreover, the proposed method associated the molecular motions with the component extraction, making the identification process more physically meaningful and interpretable compared to other methods. The results indicate that the HMFA method with the modifications can be a practical tool for identifying component terahertz spectra in completely unknown mixtures. This work reports the solution to this kind of problem in the terahertz region for the first time, to the best of the authors' knowledge, and represents a significant advance toward exploring physical or chemical mechanisms of unknown complex systems by terahertz spectroscopy.
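
    HMFA itself is not reproduced here, but the peak-based spectral modeling it relies on can be illustrated by fitting a small sum-of-peaks model to a synthetic absorption curve; the peak shapes, positions, widths, and the fitting routine are illustrative assumptions rather than the method's actual model.

      import numpy as np
      from scipy.optimize import curve_fit

      def two_gaussians(x, a1, c1, w1, a2, c2, w2):
          # Simple peak-based model: a sum of two Gaussian peaks.
          return (a1 * np.exp(-((x - c1) / w1) ** 2)
                  + a2 * np.exp(-((x - c2) / w2) ** 2))

      freq = np.linspace(0.5, 3.0, 300)            # THz axis (illustrative)
      clean = two_gaussians(freq, 1.0, 1.2, 0.1, 0.6, 2.1, 0.15)
      spectrum = clean + 0.02 * np.random.randn(freq.size)

      # Recover the peak parameters from the noisy spectrum.
      p0 = [0.8, 1.1, 0.2, 0.5, 2.0, 0.2]          # rough initial guesses
      params, _ = curve_fit(two_gaussians, freq, spectrum, p0=p0)
      print(np.round(params, 2))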

  7. Reversal of pentylenetetrazole-altered swimming and neural activity-regulated gene expression in zebrafish larvae by valproic acid and valerian extract.

    PubMed

    Torres-Hernández, Bianca A; Colón, Luis R; Rosa-Falero, Coral; Torrado, Aranza; Miscalichi, Nahira; Ortíz, José G; González-Sepúlveda, Lorena; Pérez-Ríos, Naydi; Suárez-Pérez, Erick; Bradsher, John N; Behra, Martine

    2016-07-01

    Ethnopharmacology has documented hundreds of psychoactive plants awaiting exploitation for drug discovery. A robust and inexpensive in vivo system allowing systematic screening would be critical to exploiting this knowledge. The objective of this study was to establish a cheap and accurate screening method which can be used to test the psychoactive efficacy of complex mixtures of unknown composition, such as crude plant extracts. We used automated recording of zebrafish larval swimming behavior during light vs. dark periods, which we reproducibly altered with an anxiogenic compound, pentylenetetrazole (PTZ). First, we reversed this PTZ-altered swimming by co-treatment with a well-defined synthetic anxiolytic drug, valproic acid (VPA). Next, we aimed at reversing it by adding crude root extracts of Valeriana officinalis (Val) from which VPA was originally derived. Finally, we assessed how expression of neural activity-regulated genes (c-fos, npas4a, and bdnf) known to be upregulated by PTZ treatment was affected in the presence of Val. Both VPA and Val significantly reversed the PTZ-altered swimming behaviors. Notably, Val at higher doses affected swimming independently of the presence of PTZ. A strong regulation of all three neural-activity genes was observed in Val-treated larvae, which fully supported the behavioral results. We demonstrated in a combined behavioral-molecular approach the strong psychoactivity of a natural extract of unknown composition made from V. officinalis. Our results highlight the efficacy and sensitivity of such an approach, thereby offering a novel in vivo screening system amenable to high-throughput testing of promising ethnobotanical candidates.

  8. Fundus Image Features Extraction for Exudate Mining in Coordination with Content Based Image Retrieval: A Study

    NASA Astrophysics Data System (ADS)

    Gururaj, C.; Jayadevappa, D.; Tunga, Satish

    2018-06-01

    The medical field has seen phenomenal improvement over recent years. The advent of computers, together with increases in processing power and internet speed, has changed the face of medical technology. However, there is still scope for improving the technologies in use today. One such area of medical aid is the detection of afflictions of the eye. Although a large body of research has been produced in this field, most of it fails to address how to take detection forward to a stage where it benefits society at large. An automated system that can predict the current medical condition of a patient from a fundus image of the eye is yet to see the light of day. Such a system is explored in this paper by summarizing a number of techniques for fundus image feature extraction, predominantly hard exudate mining, coupled with Content Based Image Retrieval to develop an automation tool. Such knowledge would bring about worthwhile changes in the domain of exudate extraction from the eye, which is essential where patients may not have access to the best technologies. This paper attempts a comprehensive summary of techniques for Content Based Image Retrieval (CBIR) and fundus image feature extraction, reviews a few choice methods of each, and explores ways to combine the two to the benefit of all.

  9. The Service Environment for Enhanced Knowledge and Research (SEEKR) Framework

    NASA Astrophysics Data System (ADS)

    King, T. A.; Walker, R. J.; Weigel, R. S.; Narock, T. W.; McGuire, R. E.; Candey, R. M.

    2011-12-01

    The Service Environment for Enhanced Knowledge and Research (SEEKR) Framework is a configurable service oriented framework to enable the discovery, access and analysis of data shared in a community. The SEEKR framework integrates many existing independent services through the use of web technologies and standard metadata. Services are hosted on systems by using an application server and are callable by using REpresentational State Transfer (REST) protocols. Messages and metadata are transferred with eXtensible Markup Language (XML) encoding which conform to a published XML schema. Space Physics Archive Search and Extract (SPASE) metadata is central to utilizing the services. Resources (data, documents, software, etc.) are described with SPASE and the associated Resource Identifier is used to access and exchange resources. The configurable options for the service can be set by using a web interface. Services are packaged as web application resource (WAR) files for direct deployment on application servers such as Tomcat or Jetty. We discuss the composition of the SEEKR framework, how new services can be integrated and the steps necessary to deploying the framework. The SEEKR Framework emerged from NASA's Virtual Magnetospheric Observatory (VMO) and other systems, and we present an overview of these systems from a SEEKR Framework perspective.
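
    The abstract says services are called over REST and that SPASE Resource Identifiers are the keys for accessing resources. The sketch below shows what such a client call could look like; the host URL, path, and parameter name are purely hypothetical placeholders, not the documented SEEKR or VMO API.

    ```python
    # Hypothetical client call: the endpoint URL, path, and parameter names are
    # illustrative assumptions only, not the documented SEEKR/VMO interface.
    import requests
    import xml.etree.ElementTree as ET

    BASE_URL = "https://example.org/seekr"                    # hypothetical service host
    resource_id = "spase://VMO/NumericalData/EXAMPLE/PT1M"    # hypothetical SPASE Resource ID

    # REST-style GET expected to return an XML document conforming to a SPASE schema.
    resp = requests.get(f"{BASE_URL}/resource", params={"id": resource_id}, timeout=30)
    resp.raise_for_status()

    root = ET.fromstring(resp.text)
    print("root element of returned SPASE/XML document:", root.tag)
    ```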

  10. Women's Awareness and Knowledge of Abortion Laws: A Systematic Review.

    PubMed

    Assifi, Anisa R; Berger, Blair; Tunçalp, Özge; Khosla, Rajat; Ganatra, Bela

    2016-01-01

    Incorrect knowledge of laws may affect how women enter the health system or seek services, and it likely contributes to the disconnect between official laws and practical applications of the laws that influence women's access to safe, legal abortion services. To provide a synthesis of evidence of women's awareness and knowledge of the legal status of abortion in their country, and the accuracy of women's knowledge on specific legal grounds and restrictions outlined in a country's abortion law. A systematic search was carried out for articles published between 1980 and 2015. Studies with quantitative or mixed-method data collection and objectives related to women's awareness or knowledge of the abortion law were included. Full texts were assessed, and data extraction was done by a single reviewer. Final inclusion for analysis was assessed by two reviewers. The results were synthesised into tables using narrative synthesis. Of the original 3,126 articles and 16 hand-searched citations, 24 studies were included for analysis. Women's correct general awareness and knowledge of the legal status was less than 50% in nine studies. In six studies, knowledge of legalization/liberalisation ranged between 32.3% and 68.2%. Correct knowledge of abortion on the grounds of rape ranged from 12.8% to 98%, while in the case of incest it ranged from 9.8% to 64.5%. Knowledge of abortion on the grounds of fetal impairment and of gestational limits varied widely, from 7%-94% and 0%-89.5% respectively. This systematic review synthesizes literature on women's awareness and knowledge of the abortion law in their own context. The findings show that correct general awareness and knowledge of the abortion law and legal grounds and restrictions amongst women was limited, even in countries where the laws were liberal. Thus, interventions to disseminate accurate information on the legal context are necessary.

  11. Anti-Cancer Effects of Imperata cylindrica Leaf Extract on Human Oral Squamous Carcinoma Cell Line SCC-9 in Vitro.

    PubMed

    Keshava, Rohini; Muniyappa, Nagesh; Gope, Rajalakshmi; Ramaswamaiah, Ananthanarayana Saligrama

    2016-01-01

    Imperata cylindrica, a tall tufted grass with multiple pharmacological applications, is one of the key ingredients in various traditional medicinal formulations used in India. Previous reports have shown that I. cylindrica plant extract inhibited cell proliferation and induced apoptosis in various cancer cell lines. To our knowledge, no studies have been published on the effect of I. cylindrica leaf extract on human oral cancers. The present study was undertaken in order to evaluate the anticancer properties of the leaf extract of I. cylindrica using an oral squamous cell carcinoma cell line SCC-9 as an in vitro model system. A methanol extract from dried leaves of I. cylindrica (ICL) was prepared by standard procedures. Effects of the ICL extract on the morphology of SCC-9 cells were visualized by microscopy. Cytotoxicity was determined by MTT assay. Effects of the ICL extract on the colony forming ability of SCC-9 cells were evaluated using a clonogenic assay. Cell cycle analysis was performed by flow cytometry and induction of apoptosis was determined by DNA fragmentation assay. The ICL extract treatment caused cytotoxicity and induced cell death in vitro in SCC-9 cells in a dose-dependent manner. This treatment also significantly reduced the clonogenic potential and inhibited cell proliferation by arresting the cell cycle in the G2/M phase. Furthermore, DNA fragmentation assays showed that the observed cell death was caused by apoptosis. This is the first report showing the anticancer activity of methanol extracts from the leaves of I. cylindrica in a human oral cancer cell line. Our data indicate that the ICL extract could be considered one of the lead candidates for the formulation of anticancer therapeutic agents to treat/manage human oral cancers. The natural abundance of I. cylindrica and its wide geographic distribution could render it one of the primary resource materials for preparation of anticancer therapeutic agents.

  12. Do infants retain the statistics of a statistical learning experience? Insights from a developmental cognitive neuroscience perspective

    PubMed Central

    2017-01-01

    Statistical structure abounds in language. Human infants show a striking capacity for using statistical learning (SL) to extract regularities in their linguistic environments, a process thought to bootstrap their knowledge of language. Critically, studies of SL test infants in the minutes immediately following familiarization, but long-term retention unfolds over hours and days, with almost no work investigating retention of SL. This creates a critical gap in the literature given that we know little about how single or multiple SL experiences translate into permanent knowledge. Furthermore, different memory systems with vastly different encoding and retention profiles emerge at different points in development, with the underlying memory system dictating the fidelity of the memory trace hours later. I describe the scant literature on retention of SL, the learning and retention properties of memory systems as they apply to SL, and the development of these memory systems. I propose that different memory systems support retention of SL in infant and adult learners, suggesting an explanation for the slow pace of natural language acquisition in infancy. I discuss the implications of developing memory systems for SL and suggest that we exercise caution in extrapolating from adult to infant properties of SL. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’. PMID:27872372

  13. Do infants retain the statistics of a statistical learning experience? Insights from a developmental cognitive neuroscience perspective.

    PubMed

    Gómez, Rebecca L

    2017-01-05

    Statistical structure abounds in language. Human infants show a striking capacity for using statistical learning (SL) to extract regularities in their linguistic environments, a process thought to bootstrap their knowledge of language. Critically, studies of SL test infants in the minutes immediately following familiarization, but long-term retention unfolds over hours and days, with almost no work investigating retention of SL. This creates a critical gap in the literature given that we know little about how single or multiple SL experiences translate into permanent knowledge. Furthermore, different memory systems with vastly different encoding and retention profiles emerge at different points in development, with the underlying memory system dictating the fidelity of the memory trace hours later. I describe the scant literature on retention of SL, the learning and retention properties of memory systems as they apply to SL, and the development of these memory systems. I propose that different memory systems support retention of SL in infant and adult learners, suggesting an explanation for the slow pace of natural language acquisition in infancy. I discuss the implications of developing memory systems for SL and suggest that we exercise caution in extrapolating from adult to infant properties of SL. This article is part of the themed issue 'New frontiers for statistical learning in the cognitive sciences'. © 2016 The Author(s).

  14. Motion processing with two eyes in three dimensions.

    PubMed

    Rokers, Bas; Czuba, Thaddeus B; Cormack, Lawrence K; Huk, Alexander C

    2011-02-11

    The movement of an object toward or away from the head is perhaps the most critical piece of information an organism can extract from its environment. Such 3D motion produces horizontally opposite motions on the two retinae. Little is known about how or where the visual system combines these two retinal motion signals, relative to the wealth of knowledge about the neural hierarchies involved in 2D motion processing and binocular vision. Canonical conceptions of primate visual processing assert that neurons early in the visual system combine monocular inputs into a single cyclopean stream (lacking eye-of-origin information) and extract 1D ("component") motions; later stages then extract 2D pattern motion from the cyclopean output of the earlier stage. Here, however, we show that 3D motion perception is in fact affected by the comparison of opposite 2D pattern motions between the two eyes. Three-dimensional motion sensitivity depends systematically on pattern motion direction when dichoptically viewing gratings and plaids, and a novel "dichoptic pseudoplaid" stimulus provides strong support for use of interocular pattern motion differences by precluding potential contributions from conventional disparity-based mechanisms. These results imply the existence of eye-of-origin information in later stages of motion processing and therefore motivate the incorporation of such eye-specific pattern-motion signals in models of motion processing and binocular integration.

  15. Automatic cell identification and visualization using digital holographic microscopy with head mounted augmented reality devices.

    PubMed

    O'Connor, Timothy; Rawat, Siddharth; Markman, Adam; Javidi, Bahram

    2018-03-01

    We propose a compact imaging system that integrates an augmented reality head mounted device with digital holographic microscopy for automated cell identification and visualization. A shearing interferometer is used to produce holograms of biological cells, which are recorded using customized smart glasses containing an external camera. After image acquisition, segmentation is performed to isolate regions of interest containing biological cells in the field-of-view, followed by digital reconstruction of the cells, which is used to generate a three-dimensional (3D) pseudocolor optical path length profile. Morphological features are extracted from the cell's optical path length map, including mean optical path length, coefficient of variation, optical volume, projected area, projected area to optical volume ratio, cell skewness, and cell kurtosis. Classification is performed using the random forest classifier, support vector machines, and K-nearest neighbor, and the results are compared. Finally, the augmented reality device displays the cell's pseudocolor 3D rendering of its optical path length profile, extracted features, and the identified cell's type or class. The proposed system could allow a healthcare worker to quickly visualize cells using augmented reality smart glasses and extract the relevant information for rapid diagnosis. To the best of our knowledge, this is the first report on the integration of digital holographic microscopy with augmented reality devices for automated cell identification and visualization.
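
    The abstract names seven morphological features extracted from the optical path length map and three classifiers that are compared. As a minimal sketch of that classification stage only, the code below compares the same three classifier families with scikit-learn on a placeholder feature table; the data are random stand-ins, not the authors' cell measurements.

    ```python
    # Sketch of the classification stage: features per cell are assumed to be in a
    # table already (here random placeholders), and three classifiers are compared
    # as in the abstract. Not the authors' code or data.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    # 7 features per cell: mean OPL, coefficient of variation, optical volume,
    # projected area, area/volume ratio, skewness, kurtosis (placeholder values).
    X = rng.normal(size=(200, 7))
    y = rng.integers(0, 2, size=200)  # two hypothetical cell classes

    for name, clf in [("random forest", RandomForestClassifier(n_estimators=100)),
                      ("SVM", SVC(kernel="rbf")),
                      ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
        score = cross_val_score(clf, X, y, cv=5).mean()
        print(f"{name}: mean CV accuracy = {score:.2f}")
    ```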

  16. Computational methods to extract meaning from text and advance theories of human cognition.

    PubMed

    McNamara, Danielle S

    2011-01-01

    Over the past two decades, researchers have made great advances in the area of computational methods for extracting meaning from text. This research has to a large extent been spurred by the development of latent semantic analysis (LSA), a method for extracting and representing the meaning of words using statistical computations applied to large corpora of text. Since the advent of LSA, researchers have developed and tested alternative statistical methods designed to detect and analyze meaning in text corpora. This research exemplifies how statistical models of semantics play an important role in our understanding of cognition and contribute to the field of cognitive science. Importantly, these models afford large-scale representations of human knowledge and allow researchers to explore various questions regarding knowledge, discourse processing, text comprehension, and language. This topic includes the latest progress by the leading researchers in the endeavor to go beyond LSA. Copyright © 2010 Cognitive Science Society, Inc.
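
    LSA, as described here, represents word and document meaning through statistical computations over a corpus, conventionally a truncated singular value decomposition of a term-document matrix. The sketch below shows that core operation on a toy corpus; the corpus and the two-dimensional latent space are illustrative assumptions (real LSA uses large corpora and hundreds of dimensions).

    ```python
    # Minimal LSA-style sketch: build a term-document matrix from a toy corpus
    # and reduce it with truncated SVD.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import TruncatedSVD

    corpus = [
        "the cat sat on the mat",
        "the dog sat on the log",
        "cats and dogs are animals",
        "semantic analysis extracts meaning from text",
    ]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(corpus)           # documents x terms
    svd = TruncatedSVD(n_components=2, random_state=0)
    doc_vectors = svd.fit_transform(X)             # low-dimensional document space

    print("document vectors in 2-D latent space:\n", doc_vectors)
    ```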

  17. Software-assisted live visualization system for subjacent blood vessels in endonasal endoscopic approaches

    NASA Astrophysics Data System (ADS)

    Lempe, B.; Taudt, Ch.; Maschke, R.; Gruening, J.; Ernstberger, M.; Basan, F.; Baselt, T.; Grunert, R.; Hartmann, P.

    2013-02-01

    Minimally invasive surgical methods have received growing attention in recent years. In vitally important areas, it is crucial for the surgeon to have precise knowledge of the tissue structure. Visualization of arteries is especially desirable, as damage to them can be lethal for the patient. In order to meet this requirement, the study presents a novel assistance system for endoscopic surgery. While state-of-the-art systems rely on pre-operative data such as computed-tomography maps and require the use of radiation, the goal of the presented approach is to reveal subjacent blood vessels on live images from the endoscope camera system. Based on the transmission and reflection spectra of various human tissues, a prototype system with a NIR illumination unit working at 808 nm was established. Several image filtering, processing and enhancement techniques have been investigated and evaluated on the raw pictures in order to obtain high quality results. The most important were contrast enhancement and thresholding by the difference-of-Gaussians method. Based on that, it is possible to rectify a fragmented artery pattern and extract geometrical information about the structure in terms of position and orientation. By superposing the original image and the extracted segment, the surgeon is assisted with valuable live pictures of the region of interest. The whole system has been tested on a laboratory scale. An outlook on the integration of such a system in a clinical environment and its obvious benefits are discussed.
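
    The two processing steps the abstract singles out, contrast enhancement and difference-of-Gaussians thresholding, are sketched below on a placeholder NIR frame. The filter sigmas and the threshold are assumptions chosen for illustration, not the values used in the paper.

    ```python
    # Sketch: contrast stretching followed by a difference-of-Gaussians band-pass
    # and a simple threshold, applied to a stand-in NIR frame.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    frame = np.random.default_rng(0).random((256, 256)).astype(np.float32)  # stand-in image

    # Contrast stretching to the full [0, 1] range.
    stretched = (frame - frame.min()) / (frame.max() - frame.min() + 1e-9)

    # Difference of Gaussians: subtract a coarse blur from a fine blur to keep
    # vessel-scale structures while suppressing slowly varying illumination.
    dog = gaussian_filter(stretched, sigma=2) - gaussian_filter(stretched, sigma=8)

    # Threshold the band-pass response to obtain a candidate vessel mask.
    vessel_mask = dog > dog.mean() + 2 * dog.std()
    print("candidate vessel pixels:", int(vessel_mask.sum()))
    ```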

  18. ["Big data" - large data, a lot of knowledge?].

    PubMed

    Hothorn, Torsten

    2015-01-28

    For several years now, the term Big Data has described technologies for extracting knowledge from data. Applications of Big Data and their consequences are also increasingly discussed in the mass media. Because medicine is an empirical science, we discuss the meaning of Big Data and its potential for future medical research.

  19. Educational Data Mining Acceptance among Undergraduate Students

    ERIC Educational Resources Information Center

    Wook, Muslihah; Yusof, Zawiyah M.; Nazri, Mohd Zakree Ahmad

    2017-01-01

    The acceptance of Educational Data Mining (EDM) technology is on the rise due to its ability to extract new knowledge from large amounts of students' data. This knowledge is important for educational stakeholders, such as policy makers, educators, and students themselves, to enhance efficiency and achievements. However, previous studies on EDM…

  20. Meta-synthesis exploring barriers to health seeking behaviour among Malaysian breast cancer patients.

    PubMed

    Yu, Foo Qing; Murugiah, Muthu Kumar; Khan, Amer Hayat; Mehmood, Tahir

    2015-01-01

    Barriers to health seeking constitute a challenging issue in the treatment of breast cancer. The current meta-synthesis aimed to explore common barriers to health seeking among Malaysian breast cancer patients. From the systematic search, nine studies were found meeting the inclusion criteria. Data extraction revealed that health behavior towards breast cancer among Malaysian women was influenced by knowledge, psychological, sociocultural and medical system factors. In terms of knowledge, most Malaysian patients were observed to have only cursory information, and reliance on information provided by the media was limiting. Among psychological factors, stress and a sense of denial were some of the common factors leading to delay in treatment seeking. Family members' advice and cultural beliefs favouring traditional care were some of the common sociocultural factors hindering immediate access to advanced medical diagnosis and care. Lastly, delay in referral was one of the most common health system-related problems highlighted in most of the studies. In conclusion, there is an immediate need to improve the knowledge and understanding of Malaysian women towards breast cancer. Mass media should liaise with cancer specialists to disseminate accurate and up-to-date information for readers and audiences, helping to modify cultural beliefs that hinder timely health seeking. However, such interventions will not improve or rectify the health system-related barriers to treatment seeking. Therefore, there is an immediate need for resource adjustment and training programs among health professionals to improve the competency and professionalism required to develop an efficient health system.

  1. DBpedia and the Live Extraction of Structured Data from Wikipedia

    ERIC Educational Resources Information Center

    Morsey, Mohamed; Lehmann, Jens; Auer, Soren; Stadler, Claus; Hellmann, Sebastian

    2012-01-01

    Purpose: DBpedia extracts structured information from Wikipedia, interlinks it with other knowledge bases and freely publishes the results on the web using Linked Data and SPARQL. However, the DBpedia release process is heavyweight and releases are sometimes based on several months old data. DBpedia-Live solves this problem by providing a live…
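
    The abstract notes that DBpedia publishes its extracted knowledge via Linked Data and SPARQL. As one way to consume that output, the sketch below sends a small SPARQL query to DBpedia's public endpoint; the endpoint URL and JSON result format are standard for DBpedia, while the particular query is only an illustration.

    ```python
    # Query DBpedia's public SPARQL endpoint for an English abstract of one resource.
    import requests

    query = """
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?abstract WHERE {
      <http://dbpedia.org/resource/Knowledge_extraction> dbo:abstract ?abstract .
      FILTER (lang(?abstract) = "en")
    } LIMIT 1
    """

    resp = requests.get("https://dbpedia.org/sparql",
                        params={"query": query, "format": "application/sparql-results+json"},
                        timeout=30)
    resp.raise_for_status()
    for row in resp.json()["results"]["bindings"]:
        print(row["abstract"]["value"][:200], "...")
    ```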

  2. The effect of extraction, storage, and analysis techniques on the measurement of airborne endotoxin from a large dairy

    USDA-ARS?s Scientific Manuscript database

    The objective of this study was to fill in additional knowledge gaps with respect to the extraction, storage, and analysis of airborne endotoxin, with a specific focus on samples from a dairy production facility. We utilized polycarbonate filters to collect total airborne endotoxins, sonication as ...

  3. Point Cloud Classification of Tesserae from Terrestrial Laser Data Combined with Dense Image Matching for Archaeological Information Extraction

    NASA Astrophysics Data System (ADS)

    Poux, F.; Neuville, R.; Billen, R.

    2017-08-01

    Reasoning from information extraction given by point cloud data mining allows contextual adaptation and fast decision making. However, to achieve this perceptive level, a point cloud must be semantically rich, retaining relevant information for the end user. This paper presents an automatic knowledge-based method for pre-processing multi-sensory data and classifying a hybrid point cloud from both terrestrial laser scanning and dense image matching. Using 18 features, including sensor bias data, each tessera in the high-density point cloud from the 3D captured complex mosaics of Germigny-des-prés (France) is segmented via a colour-based multi-scale abstraction that extracts connectivity. A 2D surface and outline polygon of each tessera are generated by RANSAC plane extraction and convex hull fitting. Knowledge is then used to classify each tessera based on its size, surface, shape, material properties and its neighbours' classes. The detection and semantic enrichment method shows promising results of 94% correct semantization, a first step toward the creation of an archaeological smart point cloud.
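
    Two of the geometric steps named in the abstract, RANSAC plane extraction and convex hull fitting per tessera, are sketched below on synthetic points. The iteration count, inlier tolerance, and the projection of inliers onto their x-y footprint are simplifying assumptions for illustration, not the paper's parameters.

    ```python
    # Illustrative sketch: RANSAC plane fit for one tessera's points, then a
    # convex-hull outline of the inliers projected onto the plane footprint.
    import numpy as np
    from scipy.spatial import ConvexHull

    rng = np.random.default_rng(0)
    # Placeholder "tessera": noisy points near the z = 0 plane plus a few outliers.
    pts = np.column_stack([rng.uniform(0, 1, 300), rng.uniform(0, 1, 300),
                           rng.normal(0, 0.002, 300)])
    pts[:10, 2] += 0.3  # outliers

    def ransac_plane(points, n_iter=200, tol=0.01):
        """Return (normal, d, inlier_mask) of the best plane n.x + d = 0."""
        best_inliers, best_plane = None, None
        for _ in range(n_iter):
            sample = points[rng.choice(len(points), 3, replace=False)]
            normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
            norm = np.linalg.norm(normal)
            if norm < 1e-12:
                continue
            normal /= norm
            d = -normal @ sample[0]
            inliers = np.abs(points @ normal + d) < tol
            if best_inliers is None or inliers.sum() > best_inliers.sum():
                best_inliers, best_plane = inliers, (normal, d)
        return best_plane[0], best_plane[1], best_inliers

    normal, d, inliers = ransac_plane(pts)
    # 2-D outline of the tessera: hull of inliers in the plane's x-y footprint
    # (valid here because the fitted plane is nearly horizontal).
    hull = ConvexHull(pts[inliers][:, :2])
    print(f"{inliers.sum()} inliers, outline polygon with {len(hull.vertices)} vertices")
    ```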

  4. Towards an Age-Phenome Knowledge-base

    PubMed Central

    2011-01-01

    Background Currently, data about age-phenotype associations are not systematically organized and cannot be studied methodically. Searching for scientific articles describing phenotypic changes reported as occurring at a given age is not possible for most ages. Results Here we present the Age-Phenome Knowledge-base (APK), in which knowledge about age-related phenotypic patterns and events can be modeled and stored for retrieval. The APK contains evidence connecting specific ages or age groups with phenotypes, such as disease and clinical traits. Using a simple text mining tool developed for this purpose, we extracted instances of age-phenotype associations from journal abstracts related to non-insulin-dependent Diabetes Mellitus. In addition, links between age and phenotype were extracted from clinical data obtained from the NHANES III survey. The knowledge stored in the APK is made available to the relevant research community in the form of 'Age-Cards'; each card holds the collection of all the information stored in the APK about a particular age. These Age-Cards are presented in a wiki, allowing community review, amendment and contribution of additional information. In addition to the wiki interaction, complex searches can also be conducted, although these require the user to have some knowledge of database query construction. Conclusions The combination of a knowledge-model-based repository with community participation in the evolution and refinement of the knowledge-base makes the APK a useful and valuable environment for collecting and curating existing knowledge of the connections between age and phenotypes. PMID:21651792

  5. Biocuration at the Saccharomyces genome database.

    PubMed

    Skrzypek, Marek S; Nash, Robert S

    2015-08-01

    Saccharomyces Genome Database is an online resource dedicated to managing information about the biology and genetics of the model organism, yeast (Saccharomyces cerevisiae). This information is derived primarily from scientific publications through a process of human curation that involves manual extraction of data and their organization into a comprehensive system of knowledge. This system provides a foundation for further analysis of experimental data coming from research on yeast as well as other organisms. In this review we will demonstrate how biocuration and biocurators add a key component, the biological context, to our understanding of how genes, proteins, genomes and cells function and interact. We will explain the role biocurators play in sifting through the wealth of biological data to incorporate and connect key information. We will also discuss the many ways we assist researchers with their various research needs. We hope to convince the reader that manual curation is vital in converting the flood of data into organized and interconnected knowledge, and that biocurators play an essential role in the integration of scientific information into a coherent model of the cell. © 2015 Wiley Periodicals, Inc.

  6. Biocuration at the Saccharomyces Genome Database

    PubMed Central

    Skrzypek, Marek S.; Nash, Robert S.

    2015-01-01

    Saccharomyces Genome Database is an online resource dedicated to managing information about the biology and genetics of the model organism, yeast (Saccharomyces cerevisiae). This information is derived primarily from scientific publications through a process of human curation that involves manual extraction of data and their organization into a comprehensive system of knowledge. This system provides a foundation for further analysis of experimental data coming from research on yeast as well as other organisms. In this review we will demonstrate how biocuration and biocurators add a key component, the biological context, to our understanding of how genes, proteins, genomes and cells function and interact. We will explain the role biocurators play in sifting through the wealth of biological data to incorporate and connect key information. We will also discuss the many ways we assist researchers with their various research needs. We hope to convince the reader that manual curation is vital in converting the flood of data into organized and interconnected knowledge, and that biocurators play an essential role in the integration of scientific information into a coherent model of the cell. PMID:25997651

  7. Towards an Obesity-Cancer Knowledge Base: Biomedical Entity Identification and Relation Detection

    PubMed Central

    Lossio-Ventura, Juan Antonio; Hogan, William; Modave, François; Hicks, Amanda; Hanna, Josh; Guo, Yi; He, Zhe; Bian, Jiang

    2017-01-01

    Obesity is associated with increased risks of various types of cancer, as well as a wide range of other chronic diseases. On the other hand, access to health information activates patient participation and improves their health outcomes. However, existing online information on obesity and its relationship to cancer is heterogeneous, ranging from pre-clinical models and case studies to mere hypothesis-based scientific arguments. A formal knowledge representation (i.e., a semantic knowledge base) would help better organize and deliver the quality health information related to obesity and cancer that consumers need. Nevertheless, current ontologies describing obesity, cancer and related entities are not designed to guide automatic knowledge base construction from heterogeneous information sources. Thus, in this paper, we present methods for named-entity recognition (NER) to extract biomedical entities from scholarly articles and for detecting whether two biomedical entities are related, with the long-term goal of building an obesity-cancer knowledge base. We leverage both linguistic and statistical approaches in the NER task, which surpasses state-of-the-art results. Further, based on statistical features extracted from the sentences, our method for relation detection obtains an accuracy of 99.3% and an F-measure of 0.993. PMID:28503356
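
    The abstract states that relation detection is driven by statistical features extracted from sentences. As a toy illustration of that general idea only (the feature set and tiny training examples below are invented placeholders, not the authors' features or corpus), a classifier can be trained on simple sentence-level statistics:

    ```python
    # Toy relation-detection sketch from sentence-level statistical features:
    # token count, distance between the two entity mentions, and a count of
    # "relation-like" verbs. All examples and the verb list are assumptions.
    from sklearn.linear_model import LogisticRegression

    RELATION_VERBS = {"increases", "causes", "associated", "linked", "reduces"}

    def featurize(sentence, entity_a, entity_b):
        tokens = sentence.lower().split()
        dist = abs(tokens.index(entity_a) - tokens.index(entity_b))
        verb_hits = sum(tok in RELATION_VERBS for tok in tokens)
        return [len(tokens), dist, verb_hits]

    examples = [
        ("obesity increases the risk of colorectal cancer", "obesity", "cancer", 1),
        ("obesity and cancer were both discussed at the meeting", "obesity", "cancer", 0),
        ("adiposity is associated with endometrial cancer", "adiposity", "cancer", 1),
        ("the cancer registry also records obesity prevalence", "cancer", "obesity", 0),
    ]
    X = [featurize(s, a, b) for s, a, b, _ in examples]
    y = [label for *_, label in examples]

    clf = LogisticRegression().fit(X, y)
    print(clf.predict([featurize("obesity causes pancreatic cancer", "obesity", "cancer")]))
    ```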

  8. Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research.

    PubMed

    Bravo, Àlex; Piñero, Janet; Queralt-Rosinach, Núria; Rautschka, Michael; Furlong, Laura I

    2015-02-21

    Current biomedical research needs to leverage and exploit the large amount of information reported in scientific publications. Automated text mining approaches, in particular those aimed at finding relationships between entities, are key for identification of actionable knowledge from free text repositories. We present the BeFree system aimed at identifying relationships between biomedical entities with a special focus on genes and their associated diseases. By exploiting morpho-syntactic information of the text, BeFree is able to identify gene-disease, drug-disease and drug-target associations with state-of-the-art performance. The application of BeFree to real-case scenarios shows its effectiveness in extracting information relevant for translational research. We show the value of the gene-disease associations extracted by BeFree through a number of analyses and integration with other data sources. BeFree succeeds in identifying genes associated to a major cause of morbidity worldwide, depression, which are not present in other public resources. Moreover, large-scale extraction and analysis of gene-disease associations, and integration with current biomedical knowledge, provided interesting insights on the kind of information that can be found in the literature, and raised challenges regarding data prioritization and curation. We found that only a small proportion of the gene-disease associations discovered by using BeFree is collected in expert-curated databases. Thus, there is a pressing need to find alternative strategies to manual curation, in order to review, prioritize and curate text-mining data and incorporate it into domain-specific databases. We present our strategy for data prioritization and discuss its implications for supporting biomedical research and applications. BeFree is a novel text mining system that performs competitively for the identification of gene-disease, drug-disease and drug-target associations. Our analyses show that mining only a small fraction of MEDLINE results in a large dataset of gene-disease associations, and only a small proportion of this dataset is actually recorded in curated resources (2%), raising several issues on data prioritization and curation. We propose that joint analysis of text mined data with data curated by experts appears as a suitable approach to both assess data quality and highlight novel and interesting information.

  9. Identification of an urban fractured-rock aquifer dynamics using an evolutionary self-organizing modelling

    NASA Astrophysics Data System (ADS)

    Hong, Yoon-Seok; Rosen, Michael R.

    2002-03-01

    An urban fractured-rock aquifer system, where disposal of storm water is via 'soak holes' drilled directly into the top of fractured-rock basalt, has a highly dynamic nature, and the theories and knowledge needed to generate a model of it are still incomplete and insufficient. Therefore, formulating an accurate mechanistic model, usually based on first principles (physical and chemical laws, mass balance, and diffusion and transport, etc.), is a time-consuming and costly task. Instead of a human developing a mechanistic model, this paper presents an approach based on automatic model evolution with genetic programming (GP) to model the dynamic behaviour of groundwater level fluctuations affected by storm water infiltration. GP automatically evolves mathematical models with an understandable structure, represented as function trees, by methods of natural selection ('survival of the fittest') applied through genetic operators (reproduction, crossover, and mutation). The simulation results have shown that GP is not only capable of predicting the groundwater level fluctuation due to storm water infiltration but also provides insight into the dynamic behaviour of a partially known urban fractured-rock aquifer system by allowing knowledge extraction from the evolved models. Our results show that GP can work as a cost-effective modelling tool, enabling us to create prototype models quickly and inexpensively, and assisting us in developing accurate models in less time, even with limited experience and incomplete knowledge of an urban fractured-rock aquifer system affected by storm water infiltration.
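
    One way to reproduce the basic idea of evolving interpretable function-tree models with GP is the third-party gplearn library; the sketch below is only an illustration of that idea under assumed hyperparameters and a synthetic rainfall-to-level relation, not the authors' system or data.

    ```python
    # Sketch (not the authors' implementation): evolve an interpretable
    # function-tree model with genetic programming via gplearn.
    import numpy as np
    from gplearn.genetic import SymbolicRegressor

    rng = np.random.default_rng(0)
    rainfall = rng.uniform(0, 50, size=(300, 1))    # synthetic storm-event input
    level = 0.3 * np.sqrt(rainfall[:, 0]) + 0.05 * rainfall[:, 0] + rng.normal(0, 0.05, 300)

    gp = SymbolicRegressor(population_size=1000, generations=15,
                           function_set=("add", "sub", "mul", "div", "sqrt"),
                           parsimony_coefficient=0.001, random_state=0)
    gp.fit(rainfall, level)

    # The evolved expression tree is human-readable, which is the point of the
    # knowledge-extraction argument in the abstract.
    print(gp._program)
    ```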

  10. Model-based vision system for automatic recognition of structures in dental radiographs

    NASA Astrophysics Data System (ADS)

    Acharya, Raj S.; Samarabandu, Jagath K.; Hausmann, E.; Allen, K. A.

    1991-07-01

    X-ray diagnosis of destructive periodontal disease requires assessing serial radiographs by an expert to determine the change in the distance between the cemento-enamel junction (CEJ) and the bone crest. To achieve this without the subjectivity of a human expert, a knowledge-based system is proposed to automatically locate the two landmarks, which are the CEJ and the level of the alveolar crest at its junction with the periodontal ligament space. This work is part of an ongoing project to automatically measure the distance between the CEJ and the bone crest along a line parallel to the axis of the tooth. The approach presented in this paper is based on identifying a prominent feature such as the tooth boundary using local edge detection and edge thresholding to establish a reference, and then using model knowledge to process sub-regions in locating the landmarks. Segmentation techniques invoked around these regions consist of a neural-network-like hierarchical refinement scheme together with local gradient extraction, multilevel thresholding and ridge tracking. Recognition accuracy is further improved by first locating the easily identifiable parts of the bone surface and the interface between the enamel and the dentine, and then extending these boundaries towards the periodontal ligament space and the tooth boundary respectively. The system is realized as a collection of tools (or knowledge sources) for pre-processing, segmentation, primary and secondary feature detection, and a control structure based on the blackboard model to coordinate the activities of these tools.

  11. Task-evoked brain functional magnetic susceptibility mapping by independent component analysis (χICA).

    PubMed

    Chen, Zikuan; Calhoun, Vince D

    2016-03-01

    Conventionally, independent component analysis (ICA) is performed on an fMRI magnitude dataset to analyze brain functional mapping (AICA). By solving the inverse problem of fMRI, we can reconstruct the brain magnetic susceptibility (χ) functional states. Upon the reconstructed χ dataspace, we propose an ICA-based brain functional χ mapping method (χICA) to extract the task-evoked brain functional map. A complex division algorithm is applied to a time series of fMRI phase images to extract temporal phase changes (relative to an OFF-state snapshot). A computed inverse MRI (CIMRI) model is used to reconstruct a 4D brain χ response dataset. χICA is implemented by applying a spatial InfoMax ICA algorithm to the reconstructed 4D χ dataspace. With finger-tapping experiments on a 7T system, the χICA-extracted χ-depicted functional map is similar to the SPM-inferred functional χ map, with a spatial correlation of 0.67 ± 0.05. In comparison, the AICA-extracted magnitude-depicted map is correlated with the SPM magnitude map by 0.81 ± 0.05. Understanding the inferiority of χICA to AICA for task-evoked functional mapping is an ongoing research topic. For task-evoked brain functional mapping, we compare the data-driven ICA method with the task-correlated SPM method. In particular, we compare χICA with AICA for extracting task-correlated timecourses and functional maps. χICA can extract a χ-depicted task-evoked brain functional map from a reconstructed χ dataspace without knowledge of brain hemodynamic responses. The χICA-extracted brain functional χ map reveals a bidirectional BOLD response pattern that is unavailable from (or different than) AICA. Copyright © 2016 Elsevier B.V. All rights reserved.
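
    The paper applies spatial InfoMax ICA to the reconstructed 4-D χ data. The sketch below shows the shape of a spatial-ICA decomposition on a flattened voxels-by-time matrix using scikit-learn's FastICA as a stand-in for InfoMax; the data, component count, and shapes are placeholder assumptions.

    ```python
    # Minimal spatial-ICA sketch on synthetic data: voxels are samples and time
    # points are mixtures, so each independent component is a spatial map.
    # FastICA stands in for the InfoMax algorithm used in the paper.
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    n_voxels, n_timepoints, n_components = 5000, 120, 5

    # Placeholder 4-D reconstruction flattened to (voxels, time).
    chi_data = rng.normal(size=(n_voxels, n_timepoints))

    ica = FastICA(n_components=n_components, random_state=0, max_iter=500)
    spatial_maps = ica.fit_transform(chi_data)      # (voxels, components)
    timecourses = ica.mixing_                       # (time, components)

    print(spatial_maps.shape, timecourses.shape)
    ```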

  12. Line drawing extraction from gray level images by feature integration

    NASA Astrophysics Data System (ADS)

    Yoo, Hoi J.; Crevier, Daniel; Lepage, Richard; Myler, Harley R.

    1994-10-01

    We describe procedures that extract line drawings from digitized gray level images, without use of domain knowledge, by modeling preattentive and perceptual organization functions of the human visual system. First, edge points are identified by standard low-level processing, based on the Canny edge operator. Edge points are then linked into single-pixel thick straight- line segments and circular arcs: this operation serves to both filter out isolated and highly irregular segments, and to lump the remaining points into a smaller number of structures for manipulation by later stages of processing. The next stages consist in linking the segments into a set of closed boundaries, which is the system's definition of a line drawing. According to the principles of Gestalt psychology, closure allows us to organize the world by filling in the gaps in a visual stimulation so as to perceive whole objects instead of disjoint parts. To achieve such closure, the system selects particular features or combinations of features by methods akin to those of preattentive processing in humans: features include gaps, pairs of straight or curved parallel lines, L- and T-junctions, pairs of symmetrical lines, and the orientation and length of single lines. These preattentive features are grouped into higher-level structures according to the principles of proximity, similarity, closure, symmetry, and feature conjunction. Achieving closure may require supplying missing segments linking contour concavities. Choices are made between competing structures on the basis of their overall compliance with the principles of closure and symmetry. Results include clean line drawings of curvilinear manufactured objects. The procedures described are part of a system called VITREO (viewpoint-independent 3-D recognition and extraction of objects).
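
    The first two stages described here are Canny edge detection and the linking of edge points into straight-line segments. The sketch below uses OpenCV's Canny detector followed by the probabilistic Hough transform, one standard way to obtain straight segments; the thresholds and segment-length parameters are assumptions, not the VITREO system's values.

    ```python
    # Sketch of edge detection plus straight-segment extraction on a toy image.
    import cv2
    import numpy as np

    # Placeholder grayscale image with a simple rectangle as the "object".
    img = np.zeros((200, 200), dtype=np.uint8)
    cv2.rectangle(img, (40, 50), (160, 150), 255, 2)

    edges = cv2.Canny(img, 50, 150)
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, 40,
                               minLineLength=30, maxLineGap=5)

    print("straight segments found:", 0 if segments is None else len(segments))
    ```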

  13. The Registry of Knowledge Translation Methods and Tools: a resource to support evidence-informed public health.

    PubMed

    Peirson, Leslea; Catallo, Cristina; Chera, Sunita

    2013-08-01

    This paper examines the development of a globally accessible online Registry of Knowledge Translation Methods and Tools to support evidence-informed public health. A search strategy, screening and data extraction tools, and writing template were developed to find, assess, and summarize relevant methods and tools. An interactive website and searchable database were designed to house the registry. Formative evaluation was undertaken to inform refinements. Over 43,000 citations were screened; almost 700 were full-text reviewed, 140 of which were included. By November 2012, 133 summaries were available. Between January 1 and November 30, 2012 over 32,945 visitors from more than 190 countries accessed the registry. Results from 286 surveys and 19 interviews indicated the registry is valued and useful, but would benefit from a more intuitive indexing system and refinements to the summaries. User stories and promotional activities help expand the reach and uptake of knowledge translation methods and tools in public health contexts. The National Collaborating Centre for Methods and Tools' Registry of Methods and Tools is a unique and practical resource for public health decision makers worldwide.

  14. Unsupervised Medical Entity Recognition and Linking in Chinese Online Medical Text

    PubMed Central

    Gan, Liang; Cheng, Mian; Wu, Quanyuan

    2018-01-01

    Online medical text is full of references to medical entities (MEs), which are valuable in many applications, including medical knowledge base (KB) construction, decision support systems, and the treatment of diseases. However, the diverse and ambiguous nature of the surface forms gives rise to great difficulty in ME identification. Many existing solutions have focused on supervised approaches, which are often task-dependent. In other words, applying them to different kinds of corpora or identifying new entity categories requires major effort in data annotation and feature definition. In this paper, we propose unMERL, an unsupervised framework for recognizing and linking medical entities mentioned in Chinese online medical text. For ME recognition, unMERL first exploits a knowledge-driven approach to extract candidate entities from free text. Then, the categories of the candidate entities are determined using a distributed semantics-based approach. For ME linking, we propose a collaborative inference approach which takes full advantage of heterogeneous entity knowledge and unstructured information in the KB. Experimental results on real corpora demonstrate significant benefits compared to recent approaches with respect to both ME recognition and linking. PMID:29849994

  15. Knowledge Discovery from Vibration Measurements

    PubMed Central

    Li, Jian; Wang, Daoyao

    2014-01-01

    The framework, as well as the particular algorithms, of the pattern recognition process is widely adopted in structural health monitoring (SHM). However, as part of the overall process of knowledge discovery from databases (KDD), the results of pattern recognition are only changes and patterns of changes in data features. In this paper, based on the similarity between KDD and SHM and considering the particularity of SHM problems, a four-step framework for SHM is proposed which extends the final goal of SHM from detecting damage to extracting knowledge that facilitates decision making. The purposes and appropriate methods of each step of this framework are discussed. To demonstrate the proposed SHM framework, a specific SHM method composed of second-order structural parameter identification, statistical control chart analysis, and system reliability analysis is then presented. To examine the performance of this SHM method, real sensor data measured from a lab-size steel bridge model structure are used. The developed four-step framework for SHM has the potential to clarify the process of SHM and to facilitate the further development of SHM techniques. PMID:24574933
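
    Of the three components named in the demonstrated method, the control-chart step is the simplest to sketch: build Shewhart-style limits from a baseline period of an identified structural feature and flag excursions. The feature values, limits, and simulated stiffness drop below are synthetic placeholders, not the bridge-model data from the paper.

    ```python
    # Sketch of the control-chart step only: mean +/- 3 sigma limits from a
    # baseline period, applied to new measurements of an identified feature.
    import numpy as np

    rng = np.random.default_rng(0)
    baseline = rng.normal(loc=1.00, scale=0.02, size=200)        # healthy-state feature
    monitoring = np.concatenate([rng.normal(1.00, 0.02, 50),
                                 rng.normal(0.93, 0.02, 20)])    # simulated stiffness drop

    center = baseline.mean()
    ucl = center + 3 * baseline.std()
    lcl = center - 3 * baseline.std()

    out_of_control = (monitoring > ucl) | (monitoring < lcl)
    print("first out-of-control sample index:",
          int(np.argmax(out_of_control)) if out_of_control.any() else None)
    ```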

  16. The prediction of crystal structure by merging knowledge methods with first principles quantum mechanics

    NASA Astrophysics Data System (ADS)

    Ceder, Gerbrand

    2007-03-01

    The prediction of structure is a key problem in computational materials science that forms the platform on which rational materials design can be performed. Finding structure by traditional optimization methods on quantum mechanical energy models is not possible due to the complexity and high dimensionality of the coordinate space. An unusual but efficient solution to this problem can be obtained by merging ideas from heuristic and ab initio methods: in the same way that scientists build empirical rules by observation of experimental trends, we have developed machine learning approaches that extract knowledge from a large set of experimental information and a database of over 15,000 first principles computations, and used these to rapidly direct accurate quantum mechanical techniques to the lowest energy crystal structure of a material. Knowledge is captured in a Bayesian probability network that relates the probability of finding a particular crystal structure at a given composition to structure and energy information at other compositions. We show that this approach is highly efficient in finding the ground states of binary metallic alloys and can be easily generalized to more complex systems.

  17. Segmentation of kidney using C-V model and anatomy priors

    NASA Astrophysics Data System (ADS)

    Lu, Jinghua; Chen, Jie; Zhang, Juan; Yang, Wenjia

    2007-12-01

    This paper presents an approach for kidney segmentation on abdominal CT images as the first step of a virtual reality surgery system. Segmentation of medical images is often challenging because of the objects' complicated anatomical structures, various gray levels, and unclear edges. A coarse-to-fine approach has been applied to the kidney segmentation using the Chan-Vese model (C-V model) and anatomical prior knowledge. In the pre-processing stage, candidate kidney regions are located. Then the C-V model, formulated by the level set method, is applied to these smaller ROIs, which reduces the computational complexity to a certain extent. Finally, after some mathematical morphology procedures, the specified kidney structures are extracted interactively with prior knowledge. The satisfactory results on abdominal CT series show that the proposed approach keeps all the advantages of the C-V model while overcoming its disadvantages.
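
    A minimal sketch of only the Chan-Vese step, using scikit-image's morphological variant on a synthetic bright-blob "ROI"; the iteration count and the toy image are assumptions, and the real pipeline additionally localizes candidate regions first and applies morphological post-processing guided by anatomical priors.

    ```python
    # Illustration of the Chan-Vese level-set step on a synthetic region of interest.
    import numpy as np
    from skimage.segmentation import morphological_chan_vese

    # Synthetic ROI: a bright ellipse (the "kidney") on a darker background.
    yy, xx = np.mgrid[0:128, 0:128]
    roi = ((yy - 64) ** 2 / 40 ** 2 + (xx - 64) ** 2 / 25 ** 2 < 1).astype(float)
    roi += 0.1 * np.random.default_rng(0).normal(size=roi.shape)

    # Second positional argument is the iteration count (its keyword name differs
    # between scikit-image versions, so it is passed positionally here).
    mask = morphological_chan_vese(roi, 100, init_level_set="checkerboard", smoothing=2)
    print("segmented pixels:", int(mask.sum()))
    ```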

  18. The Knowledge Program: an Innovative, Comprehensive Electronic Data Capture System and Warehouse

    PubMed Central

    Katzan, Irene; Speck, Micheal; Dopler, Chris; Urchek, John; Bielawski, Kay; Dunphy, Cheryl; Jehi, Lara; Bae, Charles; Parchman, Alandra

    2011-01-01

    Data contained in the electronic health record (EHR) present a tremendous opportunity to improve quality-of-care and enhance research capabilities. However, the EHR is not structured to provide data for such purposes: most clinical information is entered as free text and content varies substantially between providers. Discrete information on patients’ functional status is typically not collected. Data extraction tools are often unavailable. We have developed the Knowledge Program (KP), a comprehensive initiative to improve the collection of discrete clinical information into the EHR and the retrievability of data for use in research, quality, and patient care. A distinct feature of the KP is the systematic collection of patient-reported outcomes, which is captured discretely, allowing more refined analyses of care outcomes. The KP capitalizes on features of the Epic EHR and utilizes an external IT infrastructure distinct from Epic for enhanced functionality. Here, we describe the development and implementation of the KP. PMID:22195124

  19. The Adam and Eve Robot Scientists for the Automated Discovery of Scientific Knowledge

    NASA Astrophysics Data System (ADS)

    King, Ross

    A Robot Scientist is a physically implemented robotic system that applies techniques from artificial intelligence to execute cycles of automated scientific experimentation. A Robot Scientist can automatically execute cycles of hypothesis formation, selection of efficient experiments to discriminate between hypotheses, execution of experiments using laboratory automation equipment, and analysis of results. The motivation for developing Robot Scientists is to better understand science, and to make scientific research more efficient. The Robot Scientist `Adam' was the first machine to autonomously discover scientific knowledge: that is, to both formulate and experimentally confirm novel hypotheses. Adam worked in the domain of yeast functional genomics. The Robot Scientist `Eve' was originally developed to automate early-stage drug development, with specific application to neglected tropical diseases such as malaria and African sleeping sickness. We are now adapting Eve to work on cancer. We are also teaching Eve to autonomously extract information from the scientific literature.

  20. Automated Data Cleansing in Data Harvesting and Data Migration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Mark; Vowell, Lance; King, Ian

    2011-03-16

    In the proposal for this project, we noted how the explosion of digitized information available through corporate databases, data stores and online search systems has resulted in the knowledge worker being bombarded by information. Knowledge workers typically spend more than 20-30% of their time seeking and sorting information, only finding the information 50-60% of the time. This information exists as unstructured, semi-structured and structured data. The problem of information overload is compounded by the production of duplicate or near-duplicate information. In addition, near-duplicate items frequently have different origins, creating a situation in which each item may have unique information of value, but their differences are not significant enough to justify maintaining them as separate entities. Effective tools can be provided to eliminate duplicate and near-duplicate information. The proposed approach was to extract unique information from data sets and consolidate that information into a single comprehensive file.
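
    One common way to flag near-duplicate documents, word-shingle Jaccard similarity, is sketched below as an illustration of the kind of test such cleansing relies on; the shingle size and similarity threshold are illustrative assumptions, not the project's actual algorithm.

    ```python
    # Near-duplicate check via word-shingle Jaccard similarity.
    def shingles(text, k=3):
        """Set of k-word shingles from a text."""
        words = text.lower().split()
        return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    doc_a = "Knowledge workers spend much of their time seeking and sorting information."
    doc_b = "Knowledge workers spend much of their time seeking and sorting relevant information."

    similarity = jaccard(shingles(doc_a), shingles(doc_b))
    print(f"shingle similarity = {similarity:.2f}",
          "-> near-duplicate" if similarity > 0.6 else "-> distinct")
    ```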

  1. Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life

    PubMed Central

    Thessen, Anne E.; Parr, Cynthia Sims

    2014-01-01

    Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats. In this paper, we describe two workflows for knowledge extraction and semantic annotation of text data objects featured in an online biodiversity aggregator, the Encyclopedia of Life. One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network. Both workflows work well: the annotation workflow has an F1 Score of 0.941 and the association algorithm has an F1 Score of 0.885. Existing text annotators such as Terminizer and DBpedia Spotlight performed well, but require some optimization to be useful in the ecology and evolution domain. Important future work includes scaling up and improving accuracy through the use of distributional semantics. PMID:24594988
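
    Both workflows are evaluated with F1 scores. As a reminder of how that score combines precision and recall, the snippet below computes F1 from hypothetical counts (the counts are placeholders, not the paper's evaluation data).

    ```python
    # F1 from hypothetical true/false positive and false negative counts.
    tp, fp, fn = 80, 5, 5

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
    ```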

  2. Neuroimaging Feature Terminology: A Controlled Terminology for the Annotation of Brain Imaging Features.

    PubMed

    Iyappan, Anandhi; Younesi, Erfan; Redolfi, Alberto; Vrooman, Henri; Khanna, Shashank; Frisoni, Giovanni B; Hofmann-Apitius, Martin

    2017-01-01

    Ontologies and terminologies are used for interoperability of knowledge and data in a standard manner among interdisciplinary research groups. Existing imaging ontologies capture general aspects of the imaging domain as a whole such as methodological concepts or calibrations of imaging instruments. However, none of the existing ontologies covers the diagnostic features measured by imaging technologies in the context of neurodegenerative diseases. Therefore, the Neuro-Imaging Feature Terminology (NIFT) was developed to organize the knowledge domain of measured brain features in association with neurodegenerative diseases by imaging technologies. The purpose is to identify quantitative imaging biomarkers that can be extracted from multi-modal brain imaging data. This terminology attempts to cover measured features and parameters in brain scans relevant to disease progression. In this paper, we demonstrate the systematic retrieval of measured indices from literature and how the extracted knowledge can be further used for disease modeling that integrates neuroimaging features with molecular processes.

  3. Green extraction of grape skin phenolics by using deep eutectic solvents.

    PubMed

    Cvjetko Bubalo, Marina; Ćurko, Natka; Tomašević, Marina; Kovačević Ganić, Karin; Radojčić Redovniković, Ivana

    2016-06-01

    Conventional extraction techniques for plant phenolics are usually associated with high organic solvent consumption and long extraction times. In order to establish an environmentally friendly extraction method for grape skin phenolics, deep eutectic solvents (DES) as a green alternative to conventional solvents coupled with highly efficient microwave-assisted and ultrasound-assisted extraction methods (MAE and UAE, respectively) have been considered. Initially, screening of five different DES for proposed extraction was performed and choline chloride-based DES containing oxalic acid as a hydrogen bond donor with 25% of water was selected as the most promising one, resulting in more effective extraction of grape skin phenolic compounds compared to conventional solvents. Additionally, in our study, UAE proved to be the best extraction method with extraction efficiency superior to both MAE and conventional extraction method. The knowledge acquired in this study will contribute to further DES implementation in extraction of biologically active compounds from various plant sources. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Producing More Actionable Science Isn't the Problem; It's Providing Decision-Makers with Access to Right Actionable Knowledge

    NASA Astrophysics Data System (ADS)

    Trexler, M.

    2017-12-01

    Policy-makers today have almost infinite climate-relevant scientific and other information available to them. The problem for climate change decision-making isn't missing science or inadequate knowledge of climate risks; the problem is that the "right" climate change actionable knowledge isn't getting to the right decision-maker, or is getting there too early or too late to effectively influence her decision-making. Actionable knowledge is not one-size-fits-all, and for a given decision-maker might involve scientific, economic, or risk-based information. Simply producing more and more information as we are today is not the solution, and actually makes it harder for individual decision-makers to access "their" actionable knowledge. The Climatographers began building the Climate Web five years ago to test the hypothesis that a knowledge management system could help navigate the gap between infinite information and individual actionable knowledge. Today the Climate Web's more than 1,500 index terms allow instant access to almost any climate change topic. It is a curated public-access knowledgebase of more than 1,000 books, 2,000 videos, 15,000 reports and articles, 25,000 news stories, and 3,000 websites. But it is also much more, linking together tens of thousands of individually extracted ideas and graphics, and providing Deep Dives into more than 100 key topics from changing probability distributions of extreme events to climate communications best practices to cognitive dissonance in climate change decision-making. The public-access Climate Web is uniquely able to support cross-silo learning, collaboration, and actionable knowledge dissemination. The presentation will use the Climate Web to demonstrate why knowledge management should be seen as a critical component of science and policy-making collaborations.

  5. INDUCTIVE SYSTEM HEALTH MONITORING WITH STATISTICAL METRICS

    NASA Technical Reports Server (NTRS)

    Iverson, David L.

    2005-01-01

    Model-based reasoning is a powerful method for performing system monitoring and diagnosis. Building models for model-based reasoning is often a difficult and time consuming process. The Inductive Monitoring System (IMS) software was developed to provide a technique to automatically produce health monitoring knowledge bases for systems that are either difficult to model (simulate) with a computer or which require computer models that are too complex to use for real time monitoring. IMS processes nominal data sets collected either directly from the system or from simulations to build a knowledge base that can be used to detect anomalous behavior in the system. Machine learning and data mining techniques are used to characterize typical system behavior by extracting general classes of nominal data from archived data sets. In particular, a clustering algorithm forms groups of nominal values for sets of related parameters. This establishes constraints on those parameter values that should hold during nominal operation. During monitoring, IMS provides a statistically weighted measure of the deviation of current system behavior from the established normal baseline. If the deviation increases beyond the expected level, an anomaly is suspected, prompting further investigation by an operator or automated system. IMS has shown potential to be an effective, low cost technique to produce system monitoring capability for a variety of applications. We describe the training and system health monitoring techniques of IMS. We also present the application of IMS to a data set from the Space Shuttle Columbia STS-107 flight. IMS was able to detect an anomaly in the launch telemetry shortly after a foam impact damaged Columbia's thermal protection system.
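    The core idea — characterize nominal behavior with clusters of related parameter values and flag samples that fall too far from every cluster — can be sketched as follows. This is a minimal illustration using scikit-learn's k-means, not the actual IMS algorithm; the synthetic telemetry, number of clusters, and threshold rule are assumptions chosen for the example.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Nominal training data: each row is a snapshot of related telemetry parameters.
    nominal = rng.normal(loc=[5.0, 20.0, 0.5], scale=[0.2, 1.0, 0.05], size=(500, 3))

    # Characterize typical behavior as a set of clusters of nominal values.
    model = KMeans(n_clusters=8, n_init=10, random_state=0).fit(nominal)

    # Expected deviation: distance from each nominal point to its nearest cluster center.
    train_dist = np.min(model.transform(nominal), axis=1)
    threshold = train_dist.mean() + 3 * train_dist.std()

    def deviation(sample):
        """Distance from a new telemetry sample to the nearest nominal cluster."""
        return float(np.min(model.transform(np.atleast_2d(sample)), axis=1)[0])

    sample = [5.1, 20.3, 0.52]   # close to nominal behavior
    anomaly = [6.5, 25.0, 0.90]  # far from any nominal cluster
    print(deviation(sample) > threshold, deviation(anomaly) > threshold)
    ```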

  6. Can big data transform electronic health records into learning health systems?

    PubMed

    Harper, Ellen

    2014-01-01

    In the United States and globally, healthcare delivery is in the midst of an acute transformation driven by the adoption and use of health information technology (health IT), which is generating increasing amounts of patient care data in computable form. Secure and trusted use of these data, beyond their original purpose, can change the way we think about business, health, education, and innovation in the years to come. "Big Data" refers to data whose scale, diversity, and complexity require new architectures, techniques, algorithms, and analytics to manage them and extract value and hidden knowledge from them.

  7. Why `false' colours are seen by butterflies

    NASA Astrophysics Data System (ADS)

    Kelber, Almut

    1999-11-01

    Light can be described by its intensity, spectral distribution and polarization, and normally a visual system analyses these independently to extract the maximum amount of information. Here I present behavioural evidence that this does not happen in butterflies, whose choice of oviposition substrate on the basis of its colour appears to be strongly influenced by the direction of polarization of the light reflected from the substrate. To my knowledge, this is the first record of `false' colours being perceived as a result of light polarization. This detection of false colours may help butterflies to find optimal oviposition sites.

  8. Updated United Nations Framework Classification for reserves and resources of extractive industries

    USGS Publications Warehouse

    Ahlbrandt, T.S.; Blaise, J.R.; Blystad, P.; Kelter, D.; Gabrielyants, G.; Heiberg, S.; Martinez, A.; Ross, J.G.; Slavov, S.; Subelj, A.; Young, E.D.

    2004-01-01

    The United Nations has studied how the oil and gas resource classification developed jointly by the SPE, the World Petroleum Congress (WPC) and the American Association of Petroleum Geologists (AAPG) could be harmonized with the United Nations Framework Classification (UNFC) for Solid Fuel and Mineral Resources (1). The United Nations has continued to build on this and other work, with support from many relevant international organizations, with the objective of updating the UNFC to apply to the extractive industries. The result is the United Nations Framework Classification for Energy and Mineral Resources (2), which this paper presents. Reserves and resources are categorized with respect to three sets of criteria: economic and commercial viability; field project status and feasibility; and the level of geologic knowledge. The field project status criteria are readily recognized as the ones highlighted in the SPE/WPC/AAPG classification system of 2000. The geologic criteria absorb the rich traditions that form the primary basis for the Russian classification system, and the ones used to delimit, in part, proved reserves. Economic and commercial criteria facilitate the use of the classification in general, and reflect the commercial considerations used to delimit proved reserves in particular. The classification system will help to develop a common understanding of reserves and resources for all the extractive industries and will assist: international and national resources management to secure supplies; industries' management of business processes to achieve efficiency in exploration and production; and an appropriate basis for documenting the value of reserves and resources in financial statements.

  9. Resolving anaphoras for the extraction of drug-drug interactions in pharmacological documents

    PubMed Central

    2010-01-01

    Background: Drug-drug interactions are frequently reported in the growing body of biomedical literature, and Information Extraction (IE) techniques have been devised as a useful instrument for managing this knowledge. Nevertheless, IE at the sentence level has limited effect because of the frequent references to previous entities in the discourse, a phenomenon known as 'anaphora'. DrugNerAR, a drug anaphora resolution system, is presented to address the problem of co-referring expressions in pharmacological literature. This development is part of a larger study on automatic drug-drug interaction extraction. Methods: The system applies a set of linguistic rules, derived from Centering Theory, over the analysis provided by a biomedical syntactic parser. Semantic information provided by the Unified Medical Language System (UMLS) is also integrated to improve the recognition and resolution of nominal drug anaphors. In addition, a corpus was developed to analyze the phenomena and evaluate the approach: each possible type of anaphoric expression was examined to determine the most effective way of resolving it. Results: An F-score of 0.76 in anaphora resolution was achieved, significantly outperforming the baseline by almost 73%. This ad-hoc baseline was developed to check the results, as there is no previous work on anaphora resolution in pharmacological documents. The results resemble those reported in semantically related domains. Conclusions: The approach shows very promising results in accounting for anaphoric expressions in pharmacological texts. DrugNerAR obtains results similar to other approaches dealing with anaphora resolution in the biomedical domain but, unlike those approaches, it focuses on documents describing drug interactions. Centering Theory proved effective for the selection of antecedents in anaphora resolution. A key component in the success of this framework is the analysis provided by the MMTx program and the DrugNer system, which allows the complexity of pharmacological language to be handled. The positive results of the resolver are expected to increase the performance of our future drug-drug interaction extraction system. PMID:20406499
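    A heavily simplified sketch of the antecedent-selection idea behind Centering-style resolution is shown below: candidate drug mentions are ranked by recency and grammatical salience. The Mention structure, ranking key, and example mentions are invented for illustration and do not reproduce DrugNerAR's actual rules or its UMLS-based semantic filtering.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Mention:
        text: str
        sentence_idx: int   # position in the discourse
        is_subject: bool    # grammatical role, a rough proxy for salience

    def resolve_anaphor(anaphor_sentence: int, candidates: list) -> Mention:
        """Pick an antecedent for a pronominal drug mention.

        Centering-style heuristic: prefer the most recent preceding sentence,
        and within it prefer subjects over other grammatical roles.
        """
        preceding = [m for m in candidates if m.sentence_idx < anaphor_sentence]
        return max(preceding, key=lambda m: (m.sentence_idx, m.is_subject))

    mentions = [
        Mention("warfarin", 0, True),
        Mention("aspirin", 1, False),
        Mention("ibuprofen", 1, True),
    ]
    # "It increases bleeding risk." in sentence 2 -> resolved to 'ibuprofen'
    print(resolve_anaphor(2, mentions).text)
    ```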

  10. Large-scale extraction of brain connectivity from the neuroscientific literature

    PubMed Central

    Richardet, Renaud; Chappelier, Jean-Cédric; Telefont, Martin; Hill, Sean

    2015-01-01

    Motivation: In neuroscience, as in many other scientific domains, the primary form of knowledge dissemination is through published articles. One challenge for modern neuroinformatics is finding methods to make the knowledge from the tremendous backlog of publications accessible for search, analysis and the integration of such data into computational models. A key example of this is metascale brain connectivity, where results are not reported in a normalized repository. Instead, these experimental results are published in natural language, scattered among individual scientific publications. This lack of normalization and centralization hinders the large-scale integration of brain connectivity results. In this article, we present text-mining models to extract and aggregate brain connectivity results from 13.2 million PubMed abstracts and 630 216 full-text publications related to neuroscience. The brain regions are identified with three different named entity recognizers (NERs) and then normalized against two atlases: the Allen Brain Atlas (ABA) and the atlas from the Brain Architecture Management System (BAMS). We then use three different extractors to assess inter-region connectivity. Results: NERs and connectivity extractors are evaluated against a manually annotated corpus. The complete in litero extraction models are also evaluated against in vivo connectivity data from ABA with an estimated precision of 78%. The resulting database contains over 4 million brain region mentions and over 100 000 (ABA) and 122 000 (BAMS) potential brain region connections. This database drastically accelerates connectivity literature review, by providing a centralized repository of connectivity data to neuroscientists. Availability and implementation: The resulting models are publicly available at github.com/BlueBrain/bluima. Contact: renaud.richardet@epfl.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25609795
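    A toy version of the extraction step — recognize region names, normalize them to canonical atlas labels, and count sentence-level co-mentions as candidate connections — might look like the sketch below. The alias dictionary, region names, and example text are invented; the actual pipeline uses trained NERs, the ABA and BAMS atlases, and dedicated connectivity extractors rather than simple co-occurrence.

    ```python
    import re
    from collections import Counter
    from itertools import combinations

    # Toy dictionary mapping surface forms to canonical atlas region names
    # (a stand-in for real NER plus normalization against ABA/BAMS).
    REGION_ALIASES = {
        "hippocampus": "Hippocampus",
        "prefrontal cortex": "Prefrontal cortex",
        "amygdala": "Amygdala",
        "thalamus": "Thalamus",
    }

    def extract_connections(abstract: str) -> Counter:
        """Count region pairs co-mentioned in the same sentence."""
        pairs = Counter()
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            found = {canon for alias, canon in REGION_ALIASES.items()
                     if alias in sentence.lower()}
            for a, b in combinations(sorted(found), 2):
                pairs[(a, b)] += 1
        return pairs

    text = ("The amygdala projects to the prefrontal cortex. "
            "Lesions of the hippocampus altered thalamus activity.")
    print(extract_connections(text))
    ```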

  11. Knowledge-Based Image Analysis.

    DTIC Science & Technology

    1981-04-01

    ETL-0258: Knowledge-Based Image Analysis. George C. Stockman, Barbara A. Lambird, David Lavine, Laveen N. Kanal. Keywords: extraction, verification, region classification, pattern recognition, image analysis. (Abstract not recoverable from this record; only report front matter is available.)

  12. Ginkgo biloba extracts: a review of the pharmacokinetics of the active ingredients.

    PubMed

    Ude, Christian; Schubert-Zsilavecz, Manfred; Wurglics, Mario

    2013-09-01

    Ginkgo biloba is among the most popular and best-explored herbal drugs. Standardized extracts of Ginkgo biloba represent the only herbal alternative to synthetic antidementia drugs in the therapy of cognitive decline and Alzheimer's disease. The clinical efficacy of such standardized Ginkgo biloba extracts (GBE) is still controversial, but the authors of numerous international clinical studies recommend the use of GBE in the described therapies. Extracts of Ginkgo biloba are a mixture of substances with a wide variety of physical and chemical properties and activities. Numerous pharmacological investigations have led to the conclusion that the terpene trilactones (TTL) and the flavonoids of GBE are responsible for the main pharmacological effects of the extract in the therapy of cognitive decline. Therefore, the quality of GBE products must be based on defined quantities of TTL and flavonoids; furthermore, because of its toxic potential, the amount of ginkgolic acid should be less than 5 ppm. However, data on pharmacokinetics and bioavailability, especially in relation to the central nervous system (CNS), which is the target tissue, are relatively rare. A few investigations characterize the TTL and flavonoids of Ginkgo biloba pharmacokinetically in plasma and in the brain. Recent investigations show that significant levels of TTL and Ginkgo biloba flavonoids cross the blood-brain barrier and enter the CNS of rats after oral application of GBE. Knowledge about the pharmacokinetic behaviour of these substances is necessary to discuss the pharmacological results on a more realistic basis.

  13. Effects of Mangifera indica L. aqueous extract (Vimang) on primary culture of rat hepatocytes.

    PubMed

    Rodeiro, I; Donato, M T; Jiménez, N; Garrido, G; Delgado, R; Gómez-Lechón, M J

    2007-12-01

    Vimang is an aqueous extract from the stem bark of Mangifera indica L. (mango) with pharmacological properties. It is a mixture of polyphenols (as main components), terpenoids, steroids, fatty acids and microelements. In the present work we studied the cytotoxic effects of Vimang on rat hepatocytes, possible interactions of the extract with drug-metabolizing enzymes, and its effects on GSH levels and lipid peroxidation. No cytotoxic effects were observed after 24 h of exposure to Vimang at concentrations of up to 1000 microg/mL, while moderate cytotoxicity was observed after 48 and 72 h of exposure at the higher concentrations (500 and 1000 microg/mL). The effect of the extract (50-400 microg/mL) on several P450 isozymes was evaluated. Exposure of hepatocytes to Vimang at concentrations of up to 100 microg/mL produced a significant reduction (60%) in 7-methoxyresorufin-O-demethylase (MROD; CYP1A2) activity and an increase (50%) in 7-pentoxyresorufin-O-depentylase (PROD; CYP2B1) activity, while no significant effect was observed on the other isozymes. To our knowledge, this is the first report regarding the modulation of the activity of the P450 system by an extract of Mangifera indica L. The antioxidant properties of Vimang were also evaluated in t-butyl-hydroperoxide-treated hepatocytes. A 36-h pre-treatment of cells with Vimang (25-200 microg/mL) strongly inhibited, in a dose- and time-dependent manner, the decrease in GSH levels and the lipid peroxidation induced by t-butyl-hydroperoxide.

  14. Three-dimensional tracking for efficient fire fighting in complex situations

    NASA Astrophysics Data System (ADS)

    Akhloufi, Moulay; Rossi, Lucile

    2009-05-01

    Each year, hundreds of millions of hectares of forest burn, causing human and economic losses. For efficient fire fighting, the personnel on the ground need tools for predicting fire front propagation. In this work, we present a new technique for automatically tracking fire spread in three-dimensional space. The proposed approach uses a stereo system to extract a 3D shape from fire images. A new segmentation technique is proposed that permits the extraction of fire regions in complex unstructured scenes. It works in the visible spectrum and combines information extracted from the YUV and RGB color spaces. Unlike other techniques, our algorithm does not require previous knowledge about the scene. The resulting fire regions are classified into homogeneous zones using clustering techniques. Contours are then extracted and a feature detection algorithm is used to detect interest points such as local maxima and corners. The points extracted from the stereo images are then used to compute the 3D shape of the fire front, from which the fire volume is built. The final model is used to compute important spatial and temporal fire characteristics such as spread dynamics, local orientation, and heading direction. Tests conducted on the ground show the efficiency of the proposed scheme, which is being integrated with a fire spread mathematical model in order to predict and anticipate fire behaviour during fire fighting. Also of interest to fire-fighters is the proposed automatic segmentation technique, which can be used for early detection of fire in complex scenes.
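    As a rough illustration of a visible-spectrum fire mask that combines RGB and YUV cues, the sketch below thresholds both color spaces with OpenCV. The specific thresholds and rule forms are assumptions chosen for the example and do not reproduce the paper's segmentation rules.

    ```python
    import cv2
    import numpy as np

    def segment_fire(bgr: np.ndarray) -> np.ndarray:
        """Rough fire-pixel mask combining RGB and YUV cues (illustrative only)."""
        b = bgr[:, :, 0].astype(np.int16)
        g = bgr[:, :, 1].astype(np.int16)
        r = bgr[:, :, 2].astype(np.int16)
        yuv = cv2.cvtColor(bgr, cv2.COLOR_BGR2YUV).astype(np.int16)
        y, u, v = yuv[:, :, 0], yuv[:, :, 1], yuv[:, :, 2]

        rgb_rule = (r > 150) & (r >= g) & (g > b)      # fire pixels are red-dominant
        yuv_rule = (y > 120) & (v > 140) & (u < 120)   # bright, with high red chroma
        return ((rgb_rule & yuv_rule) * 255).astype(np.uint8)

    # Example usage (assumes a test image on disk):
    # mask = segment_fire(cv2.imread("fire_frame.jpg"))
    # cv2.imwrite("fire_mask.png", mask)
    ```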

  15. A new adaptive control strategy for a class of nonlinear system using RBF neuro-sliding-mode technique: application to SEIG wind turbine control system

    NASA Astrophysics Data System (ADS)

    Kenné, Godpromesse; Fotso, Armel Simo; Lamnabhi-Lagarrigue, Françoise

    2017-04-01

    In this paper, a new hybrid method that combines a radial basis function (RBF) neural network with a sliding-mode technique, in order to take advantage of their complementary features, is used to control a class of nonlinear systems. A real-time dynamic nonlinear learning law for the weight vector is synthesized, and closed-loop stability is demonstrated using Lyapunov theory. The solution presented in this work requires neither knowledge of the perturbation bounds nor knowledge of the full state of the nonlinear system. In addition, the bounds of the nonlinear functions are assumed to be unknown, and the proposed RBF structure uses a reduced number of hidden units. This hybrid control strategy is applied to extract the maximum available energy from a stand-alone, self-excited, variable low-wind-speed energy conversion system. It is used to design the dc-voltage and rotor-flux controllers as well as the load-side frequency and voltage regulators, assuming that the measured outputs are the rotor speed, stator currents, and load-side currents and voltages, despite large variation of the rotor resistance and uncertainties in the inductances. Finally, simulation results, compared with those obtained using the well-known second-order sliding-mode controller, are given to show the effectiveness and feasibility of the proposed approach.
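    The general idea — an online-adapted RBF network approximates the unknown dynamics while a sliding-mode-style switching term handles the residual error — can be illustrated on a toy first-order plant, as in the sketch below. The plant, gains, RBF centers, and adaptation law are assumptions chosen for the example; this is not the SEIG controller designed in the paper.

    ```python
    import numpy as np

    # Toy first-order plant: x_dot = f(x) + u, with f(x) unknown to the controller.
    def f_true(x):
        return -0.5 * x + 0.8 * np.sin(x)

    centers = np.linspace(-3.0, 3.0, 9)   # RBF centers (an assumed design choice)
    width = 0.8
    w = np.zeros_like(centers)            # online-adapted RBF weights

    def phi(x):
        return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

    dt, lam, eta, gamma = 0.01, 2.0, 0.5, 5.0
    x, x_ref = 2.0, 0.0
    for _ in range(2000):
        e = x - x_ref                     # tracking error doubles as the sliding surface
        f_hat = w @ phi(x)                # RBF estimate of the unknown dynamics
        u = -f_hat - lam * e - eta * np.tanh(e / 0.05)  # smoothed switching term
        w += gamma * e * phi(x) * dt      # Lyapunov-motivated adaptation law
        x += (f_true(x) + u) * dt         # Euler integration of the plant

    print(f"final tracking error: {abs(x - x_ref):.4f}")
    ```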

  16. Forest fire autonomous decision system based on fuzzy logic

    NASA Astrophysics Data System (ADS)

    Lei, Z.; Lu, Jianhua

    2010-11-01

    The proposed system integrates GPS/pseudolite/IMU sensors and a thermal camera in order to autonomously process the graphs for the identification, extraction, and tracking of forest fires or hot spots. The airborne detection platform, the graph-based algorithms, and the signal processing framework are analyzed in detail; in particular, the rules of the decision function are expressed in terms of fuzzy logic, which is an appropriate method for expressing imprecise knowledge. The membership functions and the weights of the rules are fixed through a supervised learning process. The perception system in this paper is based on a network of sensorial stations and central stations. The sensorial stations collect data, including infrared and visual images and meteorological information, while the central stations exchange data to perform distributed analysis. The experimental results show that the working procedure of the detection system is reasonable and that it can accurately output detection alarms and compute infrared oscillations.
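    A minimal sketch of the weighted fuzzy-rule idea is given below: triangular membership functions grade the inputs, and a weighted combination of rule activations yields an alarm confidence. The membership functions, inputs, and weights here are invented for illustration; in the paper they are fixed through a supervised learning process.

    ```python
    def tri(x, a, b, c):
        """Triangular membership function with support [a, c] and peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def fire_alarm_degree(ir_intensity, humidity):
        """Weighted fuzzy rules producing a fire-alarm confidence in [0, 1]."""
        ir_high = tri(ir_intensity, 0.4, 0.8, 1.2)     # normalized IR intensity is high
        hum_low = tri(humidity, -20.0, 10.0, 40.0)     # relative humidity (%) is low
        rules = [
            (0.7, min(ir_high, hum_low)),   # strong rule: hot spot AND dry air
            (0.3, ir_high),                 # weaker rule: hot spot alone
        ]
        total_w = sum(w for w, _ in rules)
        return sum(w * deg for w, deg in rules) / total_w

    print(fire_alarm_degree(ir_intensity=0.85, humidity=15.0))
    ```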

  17. CAMUR: Knowledge extraction from RNA-seq cancer data through equivalent classification rules.

    PubMed

    Cestarelli, Valerio; Fiscon, Giulia; Felici, Giovanni; Bertolazzi, Paola; Weitschek, Emanuel

    2016-03-01

    Nowadays, knowledge extraction methods for Next Generation Sequencing data are in high demand. In this work, we focus on RNA-seq gene expression analysis, and specifically on case-control studies with rule-based supervised classification algorithms that build a model able to discriminate cases from controls. State-of-the-art algorithms compute a single classification model that contains few features (genes). On the contrary, our goal is to elicit a larger amount of knowledge by computing many classification models, and therefore to identify most of the genes related to the predicted class. We propose CAMUR, a new method that extracts multiple and equivalent classification models. CAMUR iteratively computes a rule-based classification model, calculates the power set of the genes present in the rules, iteratively eliminates those combinations from the data set, and performs the classification procedure again until a stopping criterion is verified. CAMUR includes an ad-hoc knowledge repository (database) and a querying tool. We analyze three different types of RNA-seq data sets (Breast, Head and Neck, and Stomach Cancer) from The Cancer Genome Atlas (TCGA) and we validate CAMUR and its models also on non-TCGA data. Our experimental results show the efficacy of CAMUR: we obtain several reliable equivalent classification models, from which the most frequent genes, their relationships, and their relation to a particular cancer are deduced. Availability: dmb.iasi.cnr.it/camur.php. Contact: emanuel@iasi.cnr.it. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
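    A simplified sketch of the iterate-classify-then-exclude loop is shown below: fit a small rule-like model, record the genes (features) it uses, remove them, and repeat until accuracy falls below a threshold. It uses a decision tree and synthetic data as stand-ins, removes whole features rather than the power-set combinations CAMUR eliminates, and omits the knowledge repository, so it only illustrates the general strategy.

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                               random_state=0)
    remaining = list(range(X.shape[1]))
    models, min_accuracy = [], 0.8

    while remaining:
        clf = DecisionTreeClassifier(max_depth=3, random_state=0)
        score = cross_val_score(clf, X[:, remaining], y, cv=5).mean()
        if score < min_accuracy:          # stopping criterion: models no longer reliable
            break
        clf.fit(X[:, remaining], y)
        used_local = np.unique(clf.tree_.feature[clf.tree_.feature >= 0])
        used = [remaining[i] for i in used_local]
        models.append((score, used))
        remaining = [f for f in remaining if f not in used]  # force an equivalent model

    for score, used in models:
        print(f"accuracy={score:.2f} features={used}")
    ```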

  18. Parallel implementation and evaluation of motion estimation system algorithms on a distributed memory multiprocessor using knowledge based mappings

    NASA Technical Reports Server (NTRS)

    Choudhary, Alok Nidhi; Leung, Mun K.; Huang, Thomas S.; Patel, Janak H.

    1989-01-01

    Several static and dynamic load balancing techniques for vision systems are presented. These techniques are novel in the sense that they capture the computational requirements of a task by examining the data as they are produced. Furthermore, they can be applied to many vision systems because many algorithms in different systems are either the same or have similar computational characteristics. The techniques are evaluated by applying them to a parallel implementation of the algorithms in a motion estimation system on a hypercube multiprocessor system. The motion estimation system consists of the following steps: (1) extraction of features; (2) stereo match of images at one time instant; (3) time match of images from different time instants; (4) stereo match to compute final unambiguous points; and (5) computation of motion parameters. It is shown that the performance gains are significant when these data decomposition and load balancing techniques are used, and that the overhead of using them is minimal.
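    One way to picture data-driven static load balancing is the greedy partitioner below: image regions are assigned to processors in decreasing order of their extracted-feature counts, always to the least-loaded processor. The region names, counts, and the use of feature counts as a cost proxy are assumptions for the example, not the paper's actual mapping algorithms.

    ```python
    import heapq

    def balance(feature_counts: dict, n_procs: int) -> list:
        """Greedy static load balancing.

        Assign image regions to processors so that the total number of
        extracted features (a proxy for matching cost) is roughly equal.
        """
        heap = [(0, p, []) for p in range(n_procs)]   # (load, processor id, regions)
        heapq.heapify(heap)
        for region, count in sorted(feature_counts.items(),
                                    key=lambda kv: kv[1], reverse=True):
            load, p, regions = heapq.heappop(heap)    # least-loaded processor
            heapq.heappush(heap, (load + count, p, regions + [region]))
        return sorted(heap, key=lambda t: t[1])

    counts = {"tile_0": 120, "tile_1": 40, "tile_2": 95, "tile_3": 60, "tile_4": 10}
    for load, p, regions in balance(counts, n_procs=2):
        print(f"processor {p}: regions={regions} load={load}")
    ```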

  19. Automatic Line Network Extraction from Aerial Imagery of Urban Areas through Knowledge-Based Image Analysis.

    DTIC Science & Technology

    1988-01-19

    approach for the analysis of aerial images. In this approach image analysis is performed at three levels of abstraction, namely iconic or low-level image analysis, symbolic or medium-level image analysis, and semantic or high-level image analysis. Domain-dependent knowledge about prototypical urban

  20. Extraction of Graph Information Based on Image Contents and the Use of Ontology

    ERIC Educational Resources Information Center

    Kanjanawattana, Sarunya; Kimura, Masaomi

    2016-01-01

    A graph is an effective form of data representation used to summarize complex information. Explicit information such as the relationship between the X- and Y-axes can be easily extracted from a graph by applying human intelligence. However, implicit knowledge such as information obtained from other related concepts in an ontology also resides in…
