Sample records for knowledge discovery methods

  1. Knowledge Discovery from Databases: An Introductory Review.

    ERIC Educational Resources Information Center

    Vickery, Brian

    1997-01-01

    Introduces new procedures being used to extract knowledge from databases and discusses rationales for developing knowledge discovery methods. Methods are described for such techniques as classification, clustering, and the detection of deviations from pre-established norms. Examines potential uses of knowledge discovery in the information field.…

  2. SemaTyP: a knowledge graph based literature mining method for drug discovery.

    PubMed

    Sang, Shengtian; Yang, Zhihao; Wang, Lei; Liu, Xiaoxia; Lin, Hongfei; Wang, Jian

    2018-05-30

    Drug discovery is the process through which potential new medicines are identified. High-throughput screening and computer-aided drug discovery/design are currently the two main drug discovery approaches, and both have successfully yielded a series of drugs. However, developing new drugs remains an extremely time-consuming and expensive process. The biomedical literature contains important clues for identifying potential treatments and can support biomedical experts on their way towards new discoveries. Here, we propose a biomedical knowledge graph-based drug discovery method called SemaTyP, which discovers candidate drugs for diseases by mining published biomedical literature. We first construct a biomedical knowledge graph from relations extracted from biomedical abstracts; a logistic regression model is then trained on the semantic types of the paths that known drug therapies trace through the knowledge graph; finally, the learned model is used to discover drug therapies for new diseases. The experimental results show that our method can not only effectively discover new drug therapies for new diseases but also suggest the potential mechanism of action of the candidate drugs. In this paper we propose a novel knowledge graph-based literature mining method for drug discovery that can complement current drug discovery methods.
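
    A minimal sketch of the path-typing idea described above (not the authors' released code): build a small relation graph with networkx, represent each drug-disease path by counts of the semantic types it traverses, and fit scikit-learn's LogisticRegression on paths of known therapies. All triples, semantic types and labels below are invented for illustration.

      from itertools import islice
      from collections import Counter
      import networkx as nx
      from sklearn.linear_model import LogisticRegression

      # Hypothetical relations "extracted from abstracts": (subject, object, predicate).
      triples = [
          ("aspirin", "COX2", "inhibits"),
          ("COX2", "inflammation", "causes"),
          ("inflammation", "arthritis", "associated_with"),
          ("geneY", "arthritis", "studied_in"),
          ("drugX", "geneY", "inhibits"),
          ("geneY", "diseaseZ", "associated_with"),
      ]
      # Hypothetical UMLS-style semantic types for each node.
      sem_type = {"aspirin": "Drug", "drugX": "Drug", "COX2": "Gene", "geneY": "Gene",
                  "inflammation": "Process", "arthritis": "Disease", "diseaseZ": "Disease"}

      G = nx.DiGraph()
      for s, o, p in triples:
          G.add_edge(s, o, predicate=p)

      TYPES = sorted(set(sem_type.values()))

      def path_features(path):
          # A path's "type signature": how many nodes of each semantic type it visits.
          counts = Counter(sem_type[n] for n in path)
          return [counts.get(t, 0) for t in TYPES]

      def candidate_paths(drug, disease, k=5, cutoff=4):
          return list(islice(nx.all_simple_paths(G, drug, disease, cutoff=cutoff), k))

      # One known therapy (label 1) and one non-therapy pair (label 0) -- labels invented.
      X, y = [], []
      for drug, disease, label in [("aspirin", "arthritis", 1), ("drugX", "arthritis", 0)]:
          for path in candidate_paths(drug, disease):
              X.append(path_features(path))
              y.append(label)

      clf = LogisticRegression().fit(X, y)

      # Score a new candidate pair by its best-scoring path, if any path exists.
      paths = candidate_paths("drugX", "diseaseZ")
      if paths:
          print(max(clf.predict_proba([path_features(p)])[0, 1] for p in paths))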

  3. Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.

    PubMed

    Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo

    Unsupervised object discovery and localization aims to discover the dominant object classes and localize all object instances in a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models such as latent Dirichlet allocation (LDA). However, those methods exploit no prior knowledge about the given image collection to facilitate object discovery, and the topic models they use suffer from the topic coherence issue: some inferred topics have no clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in the form of so-called must-links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy of visual words, the must-link is redefined so that one must-link constrains only one or some topics instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes, so the must-links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments on several data sets validate the efficiency of the proposed approach. Our method significantly improves topic coherence and outperforms unsupervised methods for object discovery and localization. In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.
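
    For context, the vanilla-LDA baseline that this work improves on can be sketched in a few lines: treat each image as a document of quantized visual words and let a topic model propose candidate object classes. The bag-of-visual-words counts below are random stand-ins for a real feature-extraction pipeline, and scikit-learn's LatentDirichletAllocation stands in for the paper's knowledge-based Dirichlet-tree variant.

      # Minimal vanilla-LDA baseline for topic-based object discovery (illustration only):
      # each row is one image described as counts over a visual-word vocabulary.
      import numpy as np
      from sklearn.decomposition import LatentDirichletAllocation

      rng = np.random.default_rng(0)
      n_images, vocab_size = 200, 500
      bow = rng.poisson(lam=1.0, size=(n_images, vocab_size))   # stand-in bag of visual words

      lda = LatentDirichletAllocation(n_components=10, random_state=0)
      doc_topics = lda.fit_transform(bow)                       # image-by-topic proportions

      # The dominant topic per image is taken as its discovered "object class";
      # the knowledge-based model in the paper additionally constrains topics with must-links.
      discovered = doc_topics.argmax(axis=1)
      print(np.bincount(discovered, minlength=10))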

  4. A collaborative filtering-based approach to biomedical knowledge discovery.

    PubMed

    Lever, Jake; Gakkhar, Sitanshu; Gottlieb, Michael; Rashnavadi, Tahereh; Lin, Santina; Siu, Celia; Smith, Maia; Jones, Martin R; Krzywinski, Martin; Jones, Steven J M; Wren, Jonathan

    2018-02-15

    The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that are likely to occur in new publications. These predictions are a valuable resource for researchers exploring a research topic. Current prediction methods are based on the local structure of the knowledge graph; a method that uses global knowledge from across the knowledge graph is needed to make knowledge discovery a routinely used tool for researchers. We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using co-occurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. Supplementary data are available at Bioinformatics online.
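
    A minimal sketch of the SVD scoring idea (the authors' full pipeline is at the GitHub link above): factor a sparse concept-concept co-occurrence matrix with a truncated SVD and score never-before-seen pairs by the low-rank reconstruction. The tiny matrix below is invented.

      # Toy SVD-based scoring of candidate concept-concept associations.
      # Rows/columns index biomedical concepts; entries are literature co-occurrence counts.
      import numpy as np
      from scipy.sparse import csr_matrix
      from scipy.sparse.linalg import svds

      cooc = csr_matrix(np.array([
          [0, 5, 0, 1],
          [5, 0, 3, 0],
          [0, 3, 0, 4],
          [1, 0, 4, 0],
      ], dtype=float))

      k = 2                                    # number of latent dimensions
      u, s, vt = svds(cooc, k=k)               # truncated SVD of the sparse matrix
      recon = u @ np.diag(s) @ vt              # low-rank reconstruction

      # Score a pair that has never co-occurred (here concepts 0 and 2):
      # a high reconstructed value suggests a plausible future association.
      print(recon[0, 2])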

  5. KNODWAT: A scientific framework application for testing knowledge discovery methods for the biomedical domain

    PubMed Central

    2013-01-01

    Background Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods that assist professional end users in the field of knowledge discovery to identify, extract, visualize, and understand useful information from these huge amounts of data is a major challenge. However, so many diverse methods and methodologies are available that biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods can find it very difficult to select the most appropriate method for their particular research problem. Results A web application called KNODWAT (KNOwledge Discovery With Advanced Techniques) has been developed using Java on the Spring framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, along with jQuery for interactive user interface operations. Conclusions The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers who are new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework. PMID:23763826

  6. KNODWAT: a scientific framework application for testing knowledge discovery methods for the biomedical domain.

    PubMed

    Holzinger, Andreas; Zupan, Mario

    2013-06-13

    Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods that assist professional end users in the field of knowledge discovery to identify, extract, visualize, and understand useful information from these huge amounts of data is a major challenge. However, so many diverse methods and methodologies are available that biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods can find it very difficult to select the most appropriate method for their particular research problem. A web application called KNODWAT (KNOwledge Discovery With Advanced Techniques) has been developed using Java on the Spring framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, along with jQuery for interactive user interface operations. The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers who are new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
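
    The test algorithms named above (CART and C4.5) were run through WEKA in Java; as a rough Python analogue, scikit-learn's DecisionTreeClassifier implements a CART-style learner, and cross-validation plays the role of the suitability/performance testing the framework supports. This is an illustrative stand-in, not KNODWAT itself.

      # Rough analogue of the CART test case: a decision tree evaluated by cross-validation.
      from sklearn.datasets import load_breast_cancer
      from sklearn.model_selection import cross_val_score
      from sklearn.tree import DecisionTreeClassifier

      X, y = load_breast_cancer(return_X_y=True)                   # stand-in biomedical dataset
      tree = DecisionTreeClassifier(max_depth=4, random_state=0)   # CART-style learner
      scores = cross_val_score(tree, X, y, cv=5)                   # suitability/performance estimate
      print(scores.mean().round(3))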

  7. Knowledge Discovery from Posts in Online Health Communities Using Unified Medical Language System.

    PubMed

    Chen, Donghua; Zhang, Runtong; Liu, Kecheng; Hou, Lei

    2018-06-19

    Patient-reported posts in Online Health Communities (OHCs) contain a wealth of valuable information that can help establish knowledge-based online support for patients. However, utilizing these posts to improve online patient services is difficult in the absence of appropriate medical and healthcare expert knowledge. Thus, we propose a comprehensive knowledge discovery method based on the Unified Medical Language System for the analysis of narrative posts in OHCs. First, we propose a domain-knowledge support framework for OHCs to provide a basis for post analysis. Second, we develop a Knowledge-Involved Topic Modeling (KI-TM) method to extract and expand explicit knowledge within the text. We propose four metrics, namely, explicit knowledge rate, latent knowledge rate, knowledge correlation rate, and perplexity, for the evaluation of the KI-TM method. Our experimental results indicate that the proposed method outperforms existing methods in terms of providing knowledge support. Our method enhances knowledge support for online patients and can help develop intelligent OHCs in the future.

  8. Progress in Biomedical Knowledge Discovery: A 25-year Retrospective

    PubMed Central

    Sacchi, L.

    2016-01-01

    Summary Objectives We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992 and as it exists now, 25 years later, with a main focus on supervised learning. Methods We performed a rigorous systematic search of PubMed and used latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery within and between the two time periods, and to compare these trends. We restricted each result set using a bracket of the five preceding years, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set to those published between 2011 and 2015. This was to reflect the literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. Results A total of 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992-2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Conclusions Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources and the evolution of new algorithmic approaches to knowledge discovery, and we consider possible legal, social, and political explanations for the growth of the field. Finally, we reflect on the achievements of the past 25 years and consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods now being developed for the discovery of new knowledge in biomedical data. PMID:27488403

  9. Knowledge discovery with classification rules in a cardiovascular dataset.

    PubMed

    Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan

    2005-12-01

    In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rule induction called AREX, using evolutionary induction of decision trees and automatic programming, is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes that may reveal the presence of specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of the induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of possible new medical knowledge in the field of pediatric cardiology.

  10. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    PubMed Central

    2010-01-01

    Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245

  11. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships.

    PubMed

    Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong

    2010-01-18

    The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.

  12. Predicting future discoveries from current scientific literature.

    PubMed

    Petrič, Ingrid; Cestnik, Bojan

    2014-01-01

    Knowledge discovery in biomedicine is a time-consuming process, starting from basic research, through preclinical testing, towards possible clinical applications. Crossing conceptual boundaries is often needed for groundbreaking biomedical research that generates highly inventive discoveries. We demonstrate the ability of a creative literature mining method to advance valuable new discoveries based on rare ideas from the existing literature. When emerging ideas from the scientific literature are put together as fragments of knowledge in a systematic way, they may lead to original, sometimes surprising, research findings. If enough scientific evidence has already been published for the association of such findings, they can be considered scientific hypotheses. In this chapter, we describe a method for the computer-aided generation of such hypotheses based on the existing scientific literature. Our literature-based discovery of NF-kappaB and its possible connections to autism was recently validated by the scientific community, which confirms the ability of our literature mining methodology to accelerate future discoveries based on rare ideas from existing literature.

  13. Students and Teacher Academic Evaluation Perceptions: Methodology to Construct a Representation Based on Actionable Knowledge Discovery Framework

    ERIC Educational Resources Information Center

    Molina, Otilia Alejandro; Ratté, Sylvie

    2017-01-01

    This research introduces a method to construct a unified representation of teachers' and students' perspectives based on the actionable knowledge discovery (AKD) and delivery framework. The representation is constructed using two models: one obtained from student evaluations and the other obtained from teachers' reflections about their teaching…

  14. Knowledge extraction from evolving spiking neural networks with rank order population coding.

    PubMed

    Soltic, Snjezana; Kasabov, Nikola

    2010-12-01

    This paper demonstrates how knowledge can be extracted from evolving spiking neural networks with rank order population coding. Knowledge discovery is a very important feature of intelligent systems. Yet, a disproportionately small amount of research has centered on the issue of knowledge extraction from spiking neural networks, which are considered to be the third generation of artificial neural networks. The lack of knowledge representation compatibility is becoming a major detriment to end users of these networks. We show that high-level knowledge can be obtained from evolving spiking neural networks. More specifically, we propose a method for fuzzy rule extraction from an evolving spiking network with rank order population coding. The proposed method was used for knowledge discovery on two benchmark taste recognition problems, where the knowledge learnt by an evolving spiking neural network was extracted in the form of zero-order Takagi-Sugeno fuzzy IF-THEN rules.
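
    Zero-order Takagi-Sugeno rules of the kind extracted here have a simple evaluation semantics: each rule's firing strength weights a constant consequent, and the output is the normalized weighted sum. The membership functions and rules below are invented, not those learned from the taste-recognition benchmarks.

      # Evaluating zero-order Takagi-Sugeno fuzzy IF-THEN rules (all rules invented).
      import numpy as np

      def gauss(x, centre, sigma):
          # Gaussian membership function.
          return np.exp(-0.5 * ((x - centre) / sigma) ** 2)

      # Rules over two taste-sensor inputs: IF x1 is A AND x2 is B THEN class-score = c.
      rules = [
          {"mf": [(0.2, 0.1), (0.8, 0.1)], "consequent": 1.0},   # "sweet" prototype
          {"mf": [(0.7, 0.1), (0.3, 0.1)], "consequent": 0.0},   # "bitter" prototype
      ]

      def infer(x):
          strengths = np.array([
              np.prod([gauss(xi, c, s) for xi, (c, s) in zip(x, r["mf"])])
              for r in rules
          ])
          # Weighted average of the constant consequents (zero-order TS inference).
          return float(strengths @ [r["consequent"] for r in rules] / strengths.sum())

      print(infer([0.25, 0.75]))   # close to 1.0: the first rule dominates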

  15. Translating three states of knowledge--discovery, invention, and innovation

    PubMed Central

    2010-01-01

    Background Knowledge Translation (KT) has historically focused on the proper use of knowledge in healthcare delivery. A knowledge base has been created through empirical research and resides in scholarly literature. Some knowledge is amenable to direct application by stakeholders who are engaged during or after the research process, as shown by the Knowledge to Action (KTA) model. Other knowledge requires multiple transformations before achieving utility for end users. For example, conceptual knowledge generated through science or engineering may become embodied as a technology-based invention through development methods. The invention may then be integrated within an innovative device or service through production methods. To what extent is KT relevant to these transformations? How might the KTA model accommodate these additional development and production activities while preserving the KT concepts? Discussion Stakeholders adopt and use knowledge that has perceived utility, such as a solution to a problem. Achieving a technology-based solution involves three methods that generate knowledge in three states, analogous to the three classic states of matter. Research activity generates discoveries that are intangible and highly malleable like a gas; development activity transforms discoveries into inventions that are moderately tangible yet still malleable like a liquid; and production activity transforms inventions into innovations that are tangible and immutable like a solid. The paper demonstrates how the KTA model can accommodate all three types of activity and address all three states of knowledge. Linking the three activities in one model also illustrates the importance of engaging the relevant stakeholders prior to initiating any knowledge-related activities. Summary Science and engineering focused on technology-based devices or services change the state of knowledge through three successive activities. Achieving knowledge implementation requires methods that accommodate these three activities and knowledge states. Accomplishing beneficial societal impacts from technology-based knowledge involves the successful progression through all three activities, and the effective communication of each successive knowledge state to the relevant stakeholders. The KTA model appears suitable for structuring and linking these processes. PMID:20205873

  16. Knowledge discovery about quality of life changes of spinal cord injury patients: clustering based on rules by states.

    PubMed

    Gibert, Karina; García-Rudolph, Alejandro; Curcoll, Lluïsa; Soler, Dolors; Pla, Laura; Tormos, José María

    2009-01-01

    In this paper, an integral knowledge discovery methodology named Clustering based on rules by States, which incorporates artificial intelligence (AI) and statistical methods as well as interpretation-oriented tools, is used to extract knowledge patterns about the evolution over time of the Quality of Life (QoL) of patients with spinal cord injury. The methodology incorporates interaction with experts as a crucial element alongside the clustering itself to guarantee the usefulness of the results. Four typical patterns are discovered by taking into account prior expert knowledge. Several hypotheses are elaborated about the reasons for psychological distress or decreases in patients' QoL over time. The knowledge discovery from data (KDD) approach turns out, once again, to be a suitable formal framework for handling the multidimensional complexity of health domains.

  17. 18 CFR 385.403 - Methods of discovery; general provisions (Rule 403).

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    Title 18, Conservation of Power and Water Resources, Part 385, Section 385.403 (2010-04-01): Methods of discovery; general provisions (Rule 403). … the response is true and accurate to the best of that person's knowledge, information, and belief…

  18. Progress in Biomedical Knowledge Discovery: A 25-year Retrospective.

    PubMed

    Sacchi, L; Holmes, J H

    2016-08-02

    We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992 and as it exists now, 25 years later, with a main focus on supervised learning. We performed a rigorous systematic search of PubMed and used latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery within and between the two time periods, and to compare these trends. We restricted each result set using a bracket of the five preceding years, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set to those published between 2011 and 2015. This was to reflect the literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. A total of 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992-2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources and the evolution of new algorithmic approaches to knowledge discovery, and we consider possible legal, social, and political explanations for the growth of the field. Finally, we reflect on the achievements of the past 25 years and consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods now being developed for the discovery of new knowledge in biomedical data.
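
    A hedged sketch of the date-bracketed retrieval step using Biopython's Entrez client (the exact phrase quoting and the placeholder e-mail are assumptions; NCBI requires a valid address, and the theme-modeling step would then fit LDA on the retrieved abstracts):

      # Sketch: date-bracketed PubMed counts like the 1987-1992 vs 2011-2015 comparison above.
      from Bio import Entrez

      Entrez.email = "you@example.org"   # placeholder; NCBI requires a real address
      QUERY = ('"Knowledge Discovery" OR "Data Mining" OR "Pattern Discovery" '
               'OR "Pattern Recognition, Automated"')

      def count_documents(mindate, maxdate):
          handle = Entrez.esearch(db="pubmed", term=QUERY, datetype="pdat",
                                  mindate=mindate, maxdate=maxdate, retmax=0)
          record = Entrez.read(handle)
          handle.close()
          return int(record["Count"])

      print(count_documents("1987", "1992"), count_documents("2011", "2015"))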

  19. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    ERIC Educational Resources Information Center

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  20. Building Better Decision-Support by Using Knowledge Discovery.

    ERIC Educational Resources Information Center

    Jurisica, Igor

    2000-01-01

    Discusses knowledge-based decision-support systems that use artificial intelligence approaches. Addresses the issue of how to create an effective case-based reasoning system for complex and evolving domains, focusing on automated methods for system optimization and domain knowledge evolution that can supplement knowledge acquired from domain…

  1. Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns

    PubMed Central

    Volrathongchia, Kanittha

    2003-01-01

    In applying Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate values. This study demonstrates how simple pre-processing methods can be used to improve the quality of the input data. PMID:14728545
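
    The pre-processing steps named above (missing data, redundant observations, inaccurate records) map onto a few pandas operations; the toy table, column names and validity rule below are invented for illustration.

      # Simple pre-processing of a public-health-style table before KDD modelling.
      import numpy as np
      import pandas as pd

      # Toy stand-in for a public health extract (column names and values invented).
      df = pd.DataFrame({
          "patient_id": [1, 1, 2, 3, 4],
          "age":        [34, 34, 51, np.nan, 430],      # 430 is an obvious data-entry error
          "bmi":        [22.1, 22.1, np.nan, 27.5, 31.0],
          "outcome":    ["good", "good", "poor", "good", "poor"],
      })

      df = df.drop_duplicates()                         # redundant observations
      df = df.dropna(subset=["age", "outcome"])         # records missing key fields
      df["bmi"] = df["bmi"].fillna(df["bmi"].median())  # impute a numeric field
      df = df[df["age"].between(0, 120)]                # drop clearly inaccurate values

      print(df)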

  2. Temporal data mining for the quality assessment of hemodialysis services.

    PubMed

    Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto

    2005-05-01

    This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for assessing the clinical performance of hemodialysis (HD) services on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge about the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. These methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters with the measured variables were examined by the domain experts, who were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians improve their understanding of the patients' behavior.
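
    The core association-rule quantities the method relies on, support and confidence, can be computed directly over temporally abstracted session variables; a hand-rolled sketch with invented abstraction labels:

      # Support/confidence of one candidate rule over abstracted dialysis sessions.
      # Each row is one session; columns are boolean temporal abstractions (labels invented).
      import pandas as pd

      sessions = pd.DataFrame(
          [
              {"low_blood_flow": True,  "short_session": False, "poor_outcome": True},
              {"low_blood_flow": True,  "short_session": True,  "poor_outcome": True},
              {"low_blood_flow": False, "short_session": False, "poor_outcome": False},
              {"low_blood_flow": False, "short_session": True,  "poor_outcome": False},
          ]
      )

      def rule_stats(df, antecedent, consequent):
          # Support and confidence of the rule antecedent -> consequent.
          both = (df[antecedent] & df[consequent]).mean()
          ante = df[antecedent].mean()
          return both, both / ante if ante else float("nan")

      support, confidence = rule_stats(sessions, "low_blood_flow", "poor_outcome")
      print(f"support={support:.2f} confidence={confidence:.2f}")
      # Rules passing the thresholds are then reviewed by clinicians, who separate
      # expected associations from unexpected but plausible ones.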

  3. Computational methods in drug discovery

    PubMed Central

    Leelananda, Sumudu P

    2016-01-01

    The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein–ligand docking, pharmacophore modeling and QSAR techniques are reviewed. PMID:28144341

  4. Computational methods in drug discovery.

    PubMed

    Leelananda, Sumudu P; Lindert, Steffen

    2016-01-01

    The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein-ligand docking, pharmacophore modeling and QSAR techniques are reviewed.
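
    As one concrete example of the ligand-based techniques surveyed here, a minimal QSAR-style sketch: circular (Morgan) fingerprints from RDKit feeding a scikit-learn random forest, then ranking an unseen molecule by predicted activity. The SMILES strings and activity labels are invented; RDKit and scikit-learn are assumed to be installed.

      # Ligand-based QSAR sketch: circular fingerprints + random forest (all data invented).
      import numpy as np
      from rdkit import Chem
      from rdkit.Chem import AllChem
      from sklearn.ensemble import RandomForestClassifier

      smiles = ["CCO", "CC(=O)Oc1ccccc1C(=O)O", "c1ccccc1", "CCN(CC)CC"]  # toy molecules
      labels = [0, 1, 0, 1]                                               # invented activity

      def fingerprint(smi, n_bits=1024):
          mol = Chem.MolFromSmiles(smi)
          fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits)
          return np.array(list(fp), dtype=np.uint8)

      X = np.vstack([fingerprint(s) for s in smiles])
      model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, labels)

      # Virtual-screening step: rank an unseen molecule by predicted activity probability.
      print(model.predict_proba(fingerprint("CCOC(=O)c1ccccc1")[None, :])[0, 1])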

  5. Ontology-guided data preparation for discovering genotype-phenotype relationships.

    PubMed

    Coulet, Adrien; Smaïl-Tabbone, Malika; Benlian, Pascale; Napoli, Amedeo; Devignes, Marie-Dominique

    2008-04-25

    The complexity and volume of post-genomic data are two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in the life sciences. Bio-ontologies can now play a key role in knowledge discovery in the life sciences by providing semantics for the data and for the extracted units, taking advantage of progress in Semantic Web technologies with respect to the understanding and availability of tools for knowledge representation, extraction, and reasoning. This paper presents a method that exploits bio-ontologies to guide data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, and class descriptions are taken into account for data selection before the data mining step. Each of these scenarios is illustrated within a case study on the search for genotype-phenotype relationships in a familial hypercholesterolemia dataset. Guiding data selection with domain knowledge is shown to have a direct influence on the volume and significance of the data mining results. The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.

  6. Eliciting and Representing High-Level Knowledge Requirements to Discover Ecological Knowledge in Flower-Visiting Data

    PubMed Central

    2016-01-01

    Observations of individual organisms (data) can be combined with expert ecological knowledge of species, especially causal knowledge, to model and extract from flower–visiting data useful information about behavioral interactions between insect and plant organisms, such as nectar foraging and pollen transfer. We describe and evaluate a method to elicit and represent such expert causal knowledge of behavioral ecology, and discuss the potential for wider application of this method to the design of knowledge-based systems for knowledge discovery in biodiversity and ecosystem informatics. PMID:27851814

  7. Computational functional genomics-based approaches in analgesic drug discovery and repurposing.

    PubMed

    Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred; Lötsch, Jörn

    2018-06-01

    Persistent pain is a major healthcare problem affecting a fifth of adults worldwide with still limited treatment options. The search for new analgesics increasingly includes the novel research area of functional genomics, which combines data derived from various processes related to DNA sequence, gene expression or protein function and uses advanced methods of data mining and knowledge discovery with the goal of understanding the relationship between the genome and the phenotype. Its use in drug discovery and repurposing for analgesic indications has so far been performed using knowledge discovery in gene function and drug target-related databases; next-generation sequencing; and functional proteomics-based approaches. Here, we discuss recent efforts in functional genomics-based approaches to analgesic drug discovery and repurposing and highlight the potential of computational functional genomics in this field including a demonstration of the workflow using a novel R library 'dbtORA'.

  8. Knowledge Discovery and Data Mining in Iran's Climatic Researches

    NASA Astrophysics Data System (ADS)

    Karimi, Mostafa

    2013-04-01

    Advances in measurement technology and data collection mean that databases keep getting larger, and large databases require powerful analysis tools. The iterative process of acquiring knowledge from information obtained through data processing takes various forms in all scientific fields, but when data volumes are large, traditional methods cannot cope with many of the resulting problems. In recent years the use of databases has expanded in various scientific fields, especially atmospheric databases in climatology; in addition, the growing amount of data generated by climate models poses a challenge for analyses aimed at extracting hidden patterns and knowledge. The approach taken to this problem in recent years uses the process of knowledge discovery and data mining techniques, drawing on concepts from machine learning, artificial intelligence, and expert systems. Data mining is an analytical process for mining massive volumes of data, and its ultimate goal is access to information and, finally, knowledge. Climatology is a science that works with large and varied data volumes, and the goal of climate data mining is to derive information from varied and massive atmospheric and non-atmospheric data; knowledge discovery performs these activities in a logical, predetermined, and largely automatic process. The goal of this research is to study the use of knowledge discovery and data mining techniques in Iranian climate research. To achieve this goal, a content (descriptive) analysis was performed, with studies classified by method and by issue. The results show that in Iranian climate research, clustering, k-means, and Ward's method are the techniques most commonly applied, and precipitation and atmospheric circulation patterns are the topics most frequently addressed. Although several studies address geography and climate issues with statistical techniques such as clustering and pattern extraction, given the nature of statistics and data mining one cannot yet say that data mining and knowledge discovery techniques are genuinely used in Iranian climate studies. It is therefore necessary to apply the KDD approach and data mining techniques in climatic studies, in particular for interpreting climate modeling results.
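
    The clustering techniques the survey found most common in Iranian climate research (k-means and Ward's method) can be sketched as follows; the synthetic station-by-variable matrix stands in for real climate records.

      # k-means and Ward clustering of a synthetic station-by-climate-variable matrix.
      import numpy as np
      from sklearn.cluster import AgglomerativeClustering, KMeans
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(1)
      # Rows: weather stations; columns: e.g. annual precipitation, mean temperature, ...
      climate = rng.normal(size=(60, 4))
      X = StandardScaler().fit_transform(climate)      # put variables on a common scale

      kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)
      ward_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X)

      # Cross-tabulate the two partitions to see how far the methods agree.
      print(np.histogram2d(kmeans_labels, ward_labels, bins=4)[0])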

  9. Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.

    2013-01-01

    The advent of high-throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities both for purely statistical and expert knowledge-based approaches and would benefit from improved integration of the two. Areas covered: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We will describe our recommendations of possible approaches to this problem including metrics for the evaluation of biomarkers.

  10. Knowledge discovery by accuracy maximization

    PubMed Central

    Cacciatore, Stefano; Luchinat, Claudio; Tenori, Leonardo

    2014-01-01

    Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold’s topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approach differs from previous methods in that it has an integrated procedure of validation of the results. In this way, the method ensures the highest robustness of the obtained solution. This robustness is demonstrated on experimental datasets of gene expression and metabolomics, where KODAMA compares favorably with other existing feature extraction methods. KODAMA is then applied to an astronomical dataset, revealing unexpected features. Interesting and not easily predictable features are also found in the analysis of the State of the Union speeches by American presidents: KODAMA reveals an abrupt linguistic transition sharply separating all post-Reagan from all pre-Reagan speeches. The transition occurs during Reagan’s presidency and not from its beginning. PMID:24706821

  11. k-neighborhood Decentralization: A Comprehensive Solution to Index the UMLS for Large Scale Knowledge Discovery

    PubMed Central

    Xiang, Yang; Lu, Kewei; James, Stephen L.; Borlawsky, Tara B.; Huang, Kun; Payne, Philip R.O.

    2011-01-01

    The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous works have shown that knowledge constructs comprised of transitively-associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS becomes a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, and the corresponding method to effectively evaluate the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large scale knowledge discovery. We demonstrated that it is highly effective to use kDLS paths to prioritize disease-gene relations across the whole genome, with extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large scale knowledge discovery on the UMLS as a whole. Our expectation is that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS for future medical informatics applications. PMID:22154838

  12. k-Neighborhood decentralization: a comprehensive solution to index the UMLS for large scale knowledge discovery.

    PubMed

    Xiang, Yang; Lu, Kewei; James, Stephen L; Borlawsky, Tara B; Huang, Kun; Payne, Philip R O

    2012-04-01

    The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous works have shown that knowledge constructs comprised of transitively-associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS becomes a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, and the corresponding method to effectively evaluate the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large scale knowledge discovery. We demonstrated that it is highly effective to use kDLS paths to prioritize disease-gene relations across the whole genome, with extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large scale knowledge discovery on the UMLS as a whole. Our expectation is that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS for future medical informatics applications.
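
    An illustration of the general k-neighborhood indexing idea on a toy concept graph (this is not the kDLS algorithm itself): precomputing each node's k-hop neighborhood turns a two-concept path query into a cheap label intersection rather than a full graph traversal.

      # k-neighborhood indexing on a toy concept graph (illustrative; not kDLS).
      import networkx as nx

      G = nx.Graph()
      G.add_edges_from([
          ("DiseaseA", "ProteinP"), ("ProteinP", "GeneG"),
          ("DiseaseA", "SymptomS"), ("GeneG", "PathwayW"),
      ])

      K = 2
      # Each node's label: every concept reachable within K hops (including itself).
      index = {n: set(nx.single_source_shortest_path_length(G, n, cutoff=K)) for n in G}

      def connected_within_2k(a, b):
          # True if a path of length <= 2K links the two concepts (their labels overlap).
          return bool(index[a] & index[b])

      print(connected_within_2k("DiseaseA", "PathwayW"))   # True via ProteinP/GeneG
      print(connected_within_2k("SymptomS", "PathwayW"))   # also True within 4 hops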

  13. A knowledgebase system to enhance scientific discovery: Telemakus

    PubMed Central

    Fuller, Sherrilynne S; Revere, Debra; Bugni, Paul F; Martin, George M

    2004-01-01

    Background With the rapid expansion of scientific research, the ability to effectively find or integrate new domain knowledge in the sciences is proving increasingly difficult. Efforts to improve and speed up scientific discovery are being explored on a number of fronts. However, much of this work is based on traditional search and retrieval approaches and the bibliographic citation presentation format remains unchanged. Methods Case study. Results The Telemakus KnowledgeBase System provides flexible new tools for creating knowledgebases to facilitate retrieval and review of scientific research reports. In formalizing the representation of the research methods and results of scientific reports, Telemakus offers a potential strategy to enhance the scientific discovery process. While other research has demonstrated that aggregating and analyzing research findings across domains augments knowledge discovery, the Telemakus system is unique in combining document surrogates with interactive concept maps of linked relationships across groups of research reports. Conclusion Based on how scientists conduct research and read the literature, the Telemakus KnowledgeBase System brings together three innovations in analyzing, displaying and summarizing research reports across a domain: (1) research report schema, a document surrogate of extracted research methods and findings presented in a consistent and structured schema format which mimics the research process itself and provides a high-level surrogate to facilitate searching and rapid review of retrieved documents; (2) research findings, used to index the documents, allowing searchers to request, for example, research studies which have studied the relationship between neoplasms and vitamin E; and (3) visual exploration interface of linked relationships for interactive querying of research findings across the knowledgebase and graphical displays of what is known as well as, through gaps in the map, what is yet to be tested. The rationale and system architecture are described and plans for the future are discussed. PMID:15507158

  14. Evaluation Techniques for the Sandy Point Discovery Center, Great Bay National Estuarine Research Reserve.

    ERIC Educational Resources Information Center

    Heffernan, Bernadette M.

    1998-01-01

    Describes work done to provide staff of the Sandy Point Discovery Center with methods for evaluating exhibits and interpretive programming. Quantitative and qualitative evaluation measures were designed to assess the program's objective of estuary education. Pretest-posttest questionnaires and interviews are used to measure subjects' knowledge and…

  15. Applying knowledge-anchored hypothesis discovery methods to advance clinical and translational research: the OAMiner project

    PubMed Central

    Jackson, Rebecca D; Best, Thomas M; Borlawsky, Tara B; Lai, Albert M; James, Stephen; Gurcan, Metin N

    2012-01-01

    The conduct of clinical and translational research regularly involves the use of a variety of heterogeneous and large-scale data resources. Scalable methods for the integrative analysis of such resources, particularly when attempting to leverage computable domain knowledge in order to generate actionable hypotheses in a high-throughput manner, remain an open area of research. In this report, we describe both a generalizable design pattern for such integrative knowledge-anchored hypothesis discovery operations and our experience in applying that design pattern in the experimental context of a set of driving research questions related to the publicly available Osteoarthritis Initiative data repository. We believe that this ‘test bed’ project and the lessons learned during its execution are both generalizable and representative of common clinical and translational research paradigms. PMID:22647689

  16. Integrative Sparse K-Means With Overlapping Group Lasso in Genomic Applications for Disease Subtype Discovery

    PubMed Central

    Huo, Zhiguang; Tseng, George

    2017-01-01

    Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso. An algorithm using an alternating direction method of multiplier (ADMM) will be applied for fast optimization. Simulation and three real applications in breast cancer and leukemia will be used to compare is-K means with existing methods and demonstrate its superior clustering accuracy, feature selection, functional annotation of detected molecular features and computing efficiency. PMID:28959370

  17. Integrative Sparse K-Means With Overlapping Group Lasso in Genomic Applications for Disease Subtype Discovery.

    PubMed

    Huo, Zhiguang; Tseng, George

    2017-06-01

    Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso. An algorithm using an alternating direction method of multiplier (ADMM) will be applied for fast optimization. Simulation and three real applications in breast cancer and leukemia will be used to compare is-K means with existing methods and demonstrate its superior clustering accuracy, feature selection, functional annotation of detected molecular features and computing efficiency.

  18. How can knowledge discovery methods uncover spatio-temporal patterns in environmental data?

    NASA Astrophysics Data System (ADS)

    Wachowicz, Monica

    2000-04-01

    This paper proposes the integration of KDD, GVis and STDB as a long-term strategy, which will allow users to apply knowledge discovery methods for uncovering spatio-temporal patterns in environmental data. The main goal is to combine innovative techniques and associated tools for exploring very large environmental data sets in order to arrive at valid, novel, potentially useful, and ultimately understandable spatio-temporal patterns. The GeoInsight approach is described using the principles and key developments in the research domains of KDD, GVis, and STDB. The GeoInsight approach aims at the integration of these research domains in order to provide tools for performing information retrieval, exploration, analysis, and visualization. The result is a knowledge-based design, which involves visual thinking (perceptual-cognitive process) and automated information processing (computer-analytical process).

  19. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

    PubMed Central

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S.; Sinha, Saurabh

    2011-01-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. PMID:21821659
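
    A compact illustration of the interpolated-Markov-model scoring at the heart of the method: the probability of each base given its context is a weighted mixture of fixed-order Markov estimates. The interpolation weights below are fixed by hand for brevity, whereas real implementations derive them from context counts; the training sequences are toys.

      # Minimal interpolated Markov model for DNA sequence scoring (illustration only).
      from collections import defaultdict
      from math import log

      K = 3
      WEIGHTS = [0.1, 0.2, 0.3, 0.4]          # one weight per order 0..K (assumed, sums to 1)

      def train(seqs):
          # counts[k][context][base] = how often `base` follows the length-k `context`.
          counts = [defaultdict(lambda: defaultdict(int)) for _ in range(K + 1)]
          for s in seqs:
              for i in range(len(s)):
                  for k in range(min(i, K) + 1):
                      counts[k][s[i - k:i]][s[i]] += 1
          return counts

      def prob(counts, context, base):
          # Interpolated P(base | context) with add-one smoothing at each order.
          p = 0.0
          for k, w in enumerate(WEIGHTS):
              ctx = context[-k:] if k else ""
              c = counts[k][ctx]
              p += w * (c[base] + 1) / (sum(c.values()) + 4)
          return p

      def log_score(counts, seq):
          return sum(log(prob(counts, seq[max(0, i - K):i], seq[i])) for i in range(len(seq)))

      crm_model = train(["ACGTACGTTTGACA", "TTGACATTGACAAC"])   # toy "known CRMs"
      print(log_score(crm_model, "TTGACATT"))                    # score for a candidate window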

  20. Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases

    PubMed Central

    Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand

    2010-01-01

    The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. PMID:20885778
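
    A minimal sketch of the ABC principle described above, assuming hypothetical concept-to-document mappings: A and C never co-occur directly but are connected through shared B-intermediates.

```python
# Find hidden A-C relationships: concepts that never co-occur in the same document but
# share at least min_shared_b co-occurring intermediate concepts B.
from itertools import combinations

def hidden_links(doc_sets, min_shared_b=2):
    """doc_sets: dict mapping concept name -> set of document IDs mentioning it."""
    concepts = list(doc_sets)
    links = []
    for a, c in combinations(concepts, 2):
        if doc_sets[a] & doc_sets[c]:
            continue                                   # A and C already co-occur: not hidden
        shared_b = [b for b in concepts
                    if b not in (a, c)
                    and doc_sets[a] & doc_sets[b]
                    and doc_sets[b] & doc_sets[c]]
        if len(shared_b) >= min_shared_b:
            links.append((a, c, shared_b))
    return sorted(links, key=lambda x: -len(x[2]))

example = {                                            # invented document IDs
    "drug_X":    {1, 2, 3},
    "gene_B1":   {3, 4, 7},
    "gene_B2":   {2, 8},
    "disease_Y": {4, 7, 8},
}
print(hidden_links(example))   # includes drug_X -- disease_Y via gene_B1 and gene_B2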

  1. Modeling & Informatics at Vertex Pharmaceuticals Incorporated: our philosophy for sustained impact

    NASA Astrophysics Data System (ADS)

    McGaughey, Georgia; Patrick Walters, W.

    2017-03-01

    Molecular modelers and informaticians have the unique opportunity to integrate cross-functional data using a myriad of tools, methods and visuals to generate information. Using their drug discovery expertise, information is transformed into knowledge that impacts drug discovery. These insights are oftentimes formulated locally and then applied more broadly, influencing the discovery of new medicines. This is particularly true in an organization whose members are exposed to projects throughout the organization, as in the case of the global Modeling & Informatics group at Vertex Pharmaceuticals. From its inception, Vertex has been a leader in the development and use of computational methods for drug discovery. In this paper, we describe the Modeling & Informatics group at Vertex and the underlying philosophy that has driven this team to sustain impact on the discovery of first-in-class transformative medicines.

  2. Method and system for knowledge discovery using non-linear statistical analysis and a 1st and 2nd tier computer program

    DOEpatents

    Hively, Lee M. [Philadelphia, TN]

    2011-07-12

    The invention relates to a method and apparatus for simultaneously processing different sources of test data into informational data and then processing different categories of informational data into knowledge-based data. The knowledge-based data can then be communicated between nodes in a system of multiple computers according to rules for a type of complex, hierarchical computer system modeled on a human brain.

  3. Beginning to manage drug discovery and development knowledge.

    PubMed

    Sumner-Smith, M

    2001-05-01

    Knowledge management approaches and technologies are beginning to be implemented by the pharmaceutical industry in support of new drug discovery and development processes aimed at greater efficiencies and effectiveness. This trend coincides with moves to reduce paper, coordinate larger teams with more diverse skills that are distributed around the globe, and to comply with regulatory requirements for electronic submissions and the associated maintenance of electronic records. Concurrently, the available technologies have implemented web-based architectures with a greater range of collaborative tools and personalization through portal approaches. However, successful application of knowledge management methods depends on effective cultural change management, as well as proper architectural design to match the organizational and work processes within a company.

  4. RHSEG and Subdue: Background and Preliminary Approach for Combining these Technologies for Enhanced Image Data Analysis, Mining and Knowledge Discovery

    NASA Technical Reports Server (NTRS)

    Tilton, James C.; Cook, Diane J.

    2008-01-01

    Under a project recently selected for funding by NASA's Science Mission Directorate under the Applied Information Systems Research (AISR) program, Tilton and Cook will design and implement the integration of the Subdue graph-based knowledge discovery system, developed at the University of Texas at Arlington and Washington State University, with image segmentation hierarchies produced by the RHSEG software, developed at NASA GSFC, and perform pilot demonstration studies of data analysis, mining and knowledge discovery on NASA data. Subdue is a method for discovering substructures in structural databases. It is devised for general-purpose automated discovery, concept learning, and hierarchical clustering, with or without domain knowledge, and was developed by Cook and her colleague, Lawrence B. Holder. For Subdue to be effective in finding patterns in imagery data, the data must be abstracted up from the pixel domain. An appropriate abstraction of imagery data is a segmentation hierarchy: a set of several segmentations of the same image at different levels of detail, in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. The RHSEG program, a recursive approximation to a Hierarchical Segmentation approach (HSEG), can produce segmentation hierarchies quickly and effectively for a wide variety of images. RHSEG and HSEG were developed at NASA GSFC by Tilton. In this presentation we provide background on the RHSEG and Subdue technologies and present a preliminary analysis of how RHSEG and Subdue may be combined to enhance image data analysis, mining and knowledge discovery.

  5. Using a computer-based simulation with an artificial intelligence component and discovery learning to formulate training needs for a new technology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hillis, D.R.

    A computer-based simulation with an artificial intelligence component and discovery learning was investigated as a method to formulate training needs for new or unfamiliar technologies. Specifically, the study examined whether this simulation method would provide for the recognition of applications and knowledge/skills which would be the basis for establishing training needs. The study also examined the effect of field-dependence/independence on recognition of applications and knowledge/skills. A pretest-posttest control group experimental design involving fifty-eight college students from an industrial technology program was used. The study concluded that the simulation was effective in developing recognition of applications and the knowledge/skills for a new or unfamiliar technology, and that the simulation's effectiveness in providing this recognition was not limited by an individual's field-dependence/independence.

  6. The relation between prior knowledge and students' collaborative discovery learning processes

    NASA Astrophysics Data System (ADS)

    Gijlers, Hannie; de Jong, Ton

    2005-03-01

    In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction with the environment was logged. Based on students' individual judgments of the truth-value and testability of a series of domain-specific propositions, a detailed description of the knowledge configuration for each dyad was created before they entered the learning environment. Qualitative analyses of two dialogues illustrated that prior knowledge influences the discovery learning processes, and knowledge development in a pair of students. Assessments of student and dyad definitional (domain-specific) knowledge, generic (mathematical and graph) knowledge, and generic (discovery) skills were related to the students' dialogue in different discovery learning processes. Results show that a high level of definitional prior knowledge is positively related to the proportion of communication regarding the interpretation of results. Heterogeneity with respect to generic prior knowledge was positively related to the number of utterances made in the discovery process categories hypotheses generation and experimentation. Results of the qualitative analyses indicated that collaboration between extremely heterogeneous dyads is difficult when the high achiever is not willing to scaffold information and work in the low achiever's zone of proximal development.

  7. Discovering and Articulating What Is Not yet Known: Using Action Learning and Grounded Theory as a Knowledge Management Strategy

    ERIC Educational Resources Information Center

    Pauleen, David J.; Corbitt, Brian; Yoong, Pak

    2007-01-01

    Purpose: To provide a conceptual model for the discovery and articulation of emergent organizational knowledge, particularly knowledge that develops when people work with new technologies. Design/methodology/approach: The model is based on two widely accepted research methods--action learning and grounded theory--and is illustrated using a case…

  8. Key Relation Extraction from Biomedical Publications.

    PubMed

    Huang, Lan; Wang, Ye; Gong, Leiguang; Kulikowski, Casimir; Bai, Tian

    2017-01-01

    Within the large body of biomedical knowledge, recent findings and discoveries are most often presented as research articles. Their number has been increasing sharply since the turn of the century, presenting ever-growing challenges for search and discovery of knowledge and information related to specific topics of interest, even with the help of advanced online search tools. This is especially true when the goal of a search is to find or discover key relations between important concepts or topic words. We have developed an innovative method for extracting key relations between concepts from abstracts of articles. The method focuses on relations between keywords or topic words in the articles. Early experiments with the method on PubMed publications have shown promising results in searching and discovering keywords and their relationships that are strongly related to the main topic of an article.

  9. From the EBM pyramid to the Greek temple: a new conceptual approach to Guidelines as implementation tools in mental health.

    PubMed

    Salvador-Carulla, L; Lukersmith, S; Sullivan, W

    2017-04-01

    Guideline methods for developing recommendations dedicate most of their effort to organising discovery and corroboration knowledge following the evidence-based medicine (EBM) framework. Guidelines typically use a single dimension of information, and generally discard contextual evidence, formal expert knowledge and consumers' experiences in the process. In recognition of the limitations of guidelines in complex cases, complex interventions and systems research, there has been significant effort to develop new tools, guides, resources and structures to use alongside EBM methods of guideline development. In addition to these advances, a new framework based on the philosophy of science is required. Guidelines should be defined as implementation decision support tools for improving the decision-making process in real-world practice and not only as a procedure to optimise the knowledge base of scientific discovery and corroboration. A shift from the EBM pyramid of corroboration of evidence to a broader multi-domain perspective, graphically depicted as a 'Greek temple', could be considered. This model takes into account the different stages of scientific knowledge (discovery, corroboration and implementation); the sources of knowledge relevant to guideline development (experimental, observational, contextual, expert-based and experiential); their underlying inference mechanisms (deduction, induction, abduction and means-end inferences); and a more precise definition of evidence and related terms. The applicability of this broader approach is presented for the development of the Canadian Consensus Guidelines for the Primary Care of People with Developmental Disabilities.

  10. A bioinformatics knowledge discovery in text application for grid computing

    PubMed Central

    Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco

    2009-01-01

    Background A fundamental activity in biomedical research is Knowledge Discovery, which has the ability to search through large amounts of biomedical information such as documents and data. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution that allows the many knowledge discovery applications to exploit scalable and distributed computing systems and thereby make intensive use of ICT resources. Methods The development of a grid application for Knowledge Discovery in Text using a middleware-based methodology is presented. The system must be able to model a user application and process jobs by creating many parallel jobs to distribute to the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operational requirements led to the design of a middleware that is specialized using user application modules. It includes a graphical user interface giving access to a node search system, a load-balancing system and a transfer optimizer that reduces communication costs. Results A prototype of the middleware solution and its performance evaluation in terms of the speed-up factor are shown. It was written in Java on Globus Toolkit 4 to build the grid infrastructure from GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. Conclusion In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Databases computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities. PMID:19534749

  11. CONSTRUCTING KNOWLEDGE FROM MULTIVARIATE SPATIOTEMPORAL DATA: INTEGRATING GEOGRAPHIC VISUALIZATION WITH KNOWLEDGE DISCOVERY IN DATABASE METHODS. (R825195)

    EPA Science Inventory

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  12. Automated Knowledge Discovery from Simulators

    NASA Technical Reports Server (NTRS)

    Burl, Michael C.; DeCoste, D.; Enke, B. L.; Mazzoni, D.; Merline, W. J.; Scharenbroich, L.

    2006-01-01

    In this paper, we explore one aspect of knowledge discovery from simulators, the landscape characterization problem, where the aim is to identify regions in the input/parameter/model space that lead to a particular output behavior. Large-scale numerical simulators are in widespread use by scientists and engineers across a range of government agencies, academia, and industry; in many cases, simulators provide the only means to examine processes that are infeasible or impossible to study otherwise. However, the cost of simulation studies can be quite high, both in terms of the time and computational resources required to conduct the trials and the manpower needed to sift through the resulting output. Thus, there is strong motivation to develop automated methods that enable more efficient knowledge extraction.
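
    A hedged sketch of the landscape characterization idea: sample the input/parameter space, label each run by whether the output exhibits the behavior of interest, and fit an interpretable classifier whose decision regions summarize where that behavior occurs. The toy simulator and threshold below are stand-ins, not the paper's simulators or learning machinery.

```python
# Characterize regions of a simulator's input space that produce a behavior of interest
# by training a shallow decision tree on labeled simulation runs.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

def toy_simulator(params):
    x, y = params
    return np.sin(3 * x) * np.cos(2 * y)             # placeholder for an expensive code

rng = np.random.default_rng(0)
samples = rng.uniform(-2, 2, size=(500, 2))          # a designed experiment in practice
outputs = np.array([toy_simulator(p) for p in samples])
labels = (outputs > 0.5).astype(int)                 # "behavior of interest" threshold (assumed)

tree = DecisionTreeClassifier(max_depth=3).fit(samples, labels)
print(export_text(tree, feature_names=["x", "y"]))   # human-readable region boundaries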

  13. Information Fusion for Natural and Man-Made Disasters

    DTIC Science & Technology

    2007-01-31

    ...comprehensively large, and metaphysically accurate model of situations, through which specific tasks such as situation assessment, knowledge discovery, or the... significance” is always context specific. Event discovery is a very important element of the HLF process, which can lead to knowledge discovery about... expected, given the current state of knowledge. Examples of such behavior may include discovery of a new aggregate or situation, a specific pattern of...

  14. Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data

    PubMed Central

    McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.

    2012-01-01

    Introduction The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946

  15. Integrated Computational Analysis of Genes Associated with Human Hereditary Insensitivity to Pain. A Drug Repurposing Perspective

    PubMed Central

    Lötsch, Jörn; Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred

    2017-01-01

    Genes causally involved in human insensitivity to pain provide a unique molecular source for studying the pathophysiology of pain and the development of novel analgesic drugs. The increasing availability of “big data” enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 20 genes causally involved in human hereditary insensitivity to pain with the knowledge about the functions of thousands of genes. An integrated computational analysis proposed that among the functions of this set of genes, the processes related to nervous system development and to ceramide and sphingosine signaling pathways are particularly important. This is in line with earlier suggestions to use these pathways as therapeutic targets in pain. Following identification of the biological processes characterizing hereditary insensitivity to pain, the biological processes were used for a similarity analysis with the functions of n = 4,834 database-queried drugs. Using emergent self-organizing maps, a cluster of n = 22 drugs was identified sharing important functional features with hereditary insensitivity to pain. Several members of this cluster had been implicated in pain in preclinical experiments. Thus, the present concept of machine-learned knowledge discovery for pain research provides biologically plausible results and seems to be suitable for drug discovery by identifying a narrow choice of repurposing candidates, demonstrating that contemporary machine-learned methods offer innovative approaches to knowledge discovery from available evidence. PMID:28848388

  16. Developing integrated crop knowledge networks to advance candidate gene discovery.

    PubMed

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpin traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer to having the basic information, at the gene level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species wheat and barley. We present the global characteristics of such knowledge networks and, with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
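
    A minimal sketch of the knowledge-network idea, assuming invented placeholder nodes and relations rather than the wheat/barley networks built with Ondex and KnetMiner: heterogeneous nodes (gene, trait, publication) are joined in one graph so that evidence paths from a trait to candidate genes can be ranked.

```python
# Build a tiny heterogeneous knowledge graph and rank candidate genes by the length of
# their evidence path to a trait of interest. All node names here are hypothetical.
import networkx as nx

G = nx.Graph()
G.add_node("seed_size", kind="trait")
G.add_node("HvWRKY_X", kind="gene")               # hypothetical barley gene
G.add_node("TTG2", kind="gene")                   # Arabidopsis ortholog
G.add_node("PMID:000000", kind="publication")     # placeholder literature evidence
G.add_edge("HvWRKY_X", "TTG2", relation="ortholog_of")
G.add_edge("TTG2", "PMID:000000", relation="mentioned_in")
G.add_edge("PMID:000000", "seed_size", relation="mentions_trait")

genes = [n for n, d in G.nodes(data=True) if d["kind"] == "gene"]
for gene in genes:
    path = nx.shortest_path(G, gene, "seed_size")
    print(gene, "->", " -> ".join(path[1:]), f"({len(path) - 1} hops)")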

  17. A New System To Support Knowledge Discovery: Telemakus.

    ERIC Educational Resources Information Center

    Revere, Debra; Fuller, Sherrilynne S.; Bugni, Paul F.; Martin, George M.

    2003-01-01

    The Telemakus System builds on the areas of concept representation, schema theory, and information visualization to enhance knowledge discovery from scientific literature. This article describes the underlying theories and an overview of a working implementation designed to enhance the knowledge discovery process through retrieval, visual and…

  18. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison.

    PubMed

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S; Sinha, Saurabh

    2011-12-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, 'enhancers'), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for 'motif-blind' CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to 'supervise' the search. We propose a new statistical method, based on 'Interpolated Markov Models', for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. © The Author(s) 2011. Published by Oxford University Press.

  19. Cosmic Discovery

    NASA Astrophysics Data System (ADS)

    Harwit, Martin

    1984-04-01

    In the remarkable opening section of this book, a well-known Cornell astronomer gives precise thumbnail histories of the 43 basic cosmic discoveries - stars, planets, novae, pulsars, comets, gamma-ray bursts, and the like - that form the core of our knowledge of the universe. Many of them, he points out, were made accidentally and outside the mainstream of astronomical research and funding. This observation leads him to speculate on how many more major phenomena there might be and how they might be most effectively sought out in a field now dominated by large instruments and complex investigative modes and observational conditions. The book also examines discovery in terms of its political, financial, and sociological context - the role of new technologies and of industry and the military in revealing new knowledge, and methods of funding, of peer review, and of allotting time on our largest telescopes. It concludes with specific recommendations for organizing astronomy in ways that will best lead to the discovery of the many - at least sixty - phenomena that Harwit estimates are still waiting to be found.

  20. Knowledge Discovery as an Aid to Organizational Creativity.

    ERIC Educational Resources Information Center

    Siau, Keng

    2000-01-01

    This article presents the concept of knowledge discovery, a process of searching for associations in large volumes of computer data, as an aid to creativity. It then discusses the various techniques in knowledge discovery. Mednick's associative theory of creative thought serves as the theoretical foundation for this research. (Contains…

  1. Advances in Knowledge Discovery and Data Mining 21st Pacific Asia Conference, PAKDD 2017 Held in Jeju, South Korea, May 23 26, 2017. Proceedings Part I, Part II.

    DTIC Science & Technology

    2017-06-27

    Report date: 05-27-2017 (final); reporting period: 17-03-2017 to 15-03-2018. Contract number: FA2386-17-1-0102. Publisher: Springer, Switzerland. Abstract: The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference... in the areas of knowledge discovery and data mining (KDD). We had three keynote speeches, delivered by Sang Cha from Seoul National University...

  2. Knowledge Discovery from Vibration Measurements

    PubMed Central

    Li, Jian; Wang, Daoyao

    2014-01-01

    The framework as well as the particular algorithms of the pattern recognition process are widely adopted in structural health monitoring (SHM). However, as a part of the overall process of knowledge discovery from databases (KDD), the results of pattern recognition are only changes and patterns of changes of data features. In this paper, based on the similarity between KDD and SHM and considering the particularities of SHM problems, a four-step framework of SHM is proposed which extends the final goal of SHM from detecting damage to extracting knowledge that facilitates decision making. The purposes and appropriate methods of each step of this framework are discussed. To demonstrate the proposed SHM framework, a specific SHM method composed of second-order structural parameter identification, statistical control chart analysis, and system reliability analysis is then presented. To examine the performance of this SHM method, real sensor data measured from a lab-scale steel bridge model structure are used. The developed four-step framework of SHM has the potential to clarify the process of SHM and to facilitate the further development of SHM techniques. PMID:24574933
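
    A minimal sketch of the statistical control chart step of such a framework, using synthetic data in place of identified structural parameters: baseline (healthy-state) values set the control limits, and later values falling outside them flag a possible change.

```python
# X-bar style control chart on an identified structural parameter (e.g., stiffness).
# Baseline runs define 3-sigma control limits; monitoring values outside them raise alarms.
import numpy as np

def xbar_limits(baseline, k=3.0):
    mu, sigma = baseline.mean(), baseline.std(ddof=1)
    return mu - k * sigma, mu + k * sigma

rng = np.random.default_rng(1)
baseline_stiffness = rng.normal(100.0, 1.5, size=50)      # healthy-state identification runs
lcl, ucl = xbar_limits(baseline_stiffness)

monitoring = rng.normal(100.0, 1.5, size=30)
monitoring[20:] -= 8.0                                     # simulated stiffness loss (assumed damage)
alarms = np.where((monitoring < lcl) | (monitoring > ucl))[0]
print(f"control limits: ({lcl:.1f}, {ucl:.1f}); out-of-control samples: {alarms}")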

  3. Recommendation Techniques for Drug-Target Interaction Prediction and Drug Repositioning.

    PubMed

    Alaimo, Salvatore; Giugno, Rosalba; Pulvirenti, Alfredo

    2016-01-01

    The use of computational methods in drug discovery is common practice. More recently, a novel approach called drug repositioning has arisen, exploiting the wealth of available biological knowledge bases. Several computational methods are available, and these try to make a high-level integration of all the knowledge in order to discover unknown mechanisms. In this chapter, we review drug-target interaction prediction methods based on recommendation systems. We also give some extensions which go beyond the bipartite network case.
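
    A hedged sketch of one simple recommendation scheme on a drug-target bipartite graph (two-step resource diffusion, in the spirit of network-based inference); the adjacency matrix below is a toy placeholder, not the chapter's datasets.

```python
# Predict new drug-target links by diffusing "resource" across the bipartite graph:
# known interactions propagate scores to unobserved drug-target pairs.
import numpy as np

A = np.array([  # rows: drugs, cols: targets; 1 = known interaction (toy data)
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])

def nbi_scores(A):
    k_drug = A.sum(axis=1, keepdims=True)        # drug degrees
    k_target = A.sum(axis=0, keepdims=True)      # target degrees
    W = (A / k_target) @ (A / k_drug).T          # drug-drug diffusion weights via shared targets
    return W @ A                                 # predicted drug-target scores

scores = nbi_scores(A)
scores[A == 1] = 0                               # keep only new (unknown) candidate links
drug, target = np.unravel_index(scores.argmax(), scores.shape)
print(f"top candidate: drug {drug} -> target {target} (score {scores[drug, target]:.2f})")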

  4. The Relation between Prior Knowledge and Students' Collaborative Discovery Learning Processes

    ERIC Educational Resources Information Center

    Gijlers, Hannie; de Jong, Ton

    2005-01-01

    In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction…

  5. Conceptual dissonance: evaluating the efficacy of natural language processing techniques for validating translational knowledge constructs.

    PubMed

    Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B

    2009-03-01

    The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.

  6. Virtual Observatories, Data Mining, and Astroinformatics

    NASA Astrophysics Data System (ADS)

    Borne, Kirk

    The historical, current, and future trends in knowledge discovery from data in astronomy are presented here. The story begins with a brief history of data gathering and data organization. A description of the development of new information science technologies for astronomical discovery is then presented. Among these are e-Science and the virtual observatory, with its data discovery, access, display, and integration protocols; astroinformatics and data mining for exploratory data analysis, information extraction, and knowledge discovery from distributed data collections; new sky surveys' databases, including rich multivariate observational parameter sets for large numbers of objects; and the emerging discipline of data-oriented astronomical research, called astroinformatics. Astroinformatics is described as the fourth paradigm of astronomical research, following the three traditional research methodologies: observation, theory, and computation/modeling. Astroinformatics research areas include machine learning, data mining, visualization, statistics, semantic science, and scientific data management. Each of these areas is now an active research discipline, with significant science-enabling applications in astronomy. Research challenges and sample research scenarios are presented in these areas, in addition to sample algorithms for data-oriented research. These information science technologies enable scientific knowledge discovery from the increasingly large and complex data collections in astronomy. The education and training of the modern astronomy student must consequently include skill development in these areas, whose practitioners have traditionally been limited to applied mathematicians, computer scientists, and statisticians. Modern astronomical researchers must cross these traditional discipline boundaries, thereby borrowing the best-of-breed methodologies from multiple disciplines. In the era of large sky surveys and numerous large telescopes, the potential for astronomical discovery is equally large, and so the data-oriented research methods, algorithms, and techniques that are presented here will enable the greatest discovery potential from the ever-growing data and information resources in astronomy.

  7. Systems and methods for knowledge discovery in spatial data

    DOEpatents

    Obradovic, Zoran; Fiez, Timothy E.; Vucetic, Slobodan; Lazarevic, Aleksandar; Pokrajac, Dragoljub; Hoskinson, Reed L.

    2005-03-08

    Systems and methods are provided for knowledge discovery in spatial data as well as to systems and methods for optimizing recipes used in spatial environments such as may be found in precision agriculture. A spatial data analysis and modeling module is provided which allows users to interactively and flexibly analyze and mine spatial data. The spatial data analysis and modeling module applies spatial data mining algorithms through a number of steps. The data loading and generation module obtains or generates spatial data and allows for basic partitioning. The inspection module provides basic statistical analysis. The preprocessing module smoothes and cleans the data and allows for basic manipulation of the data. The partitioning module provides for more advanced data partitioning. The prediction module applies regression and classification algorithms on the spatial data. The integration module enhances prediction methods by combining and integrating models. The recommendation module provides the user with site-specific recommendations as to how to optimize a recipe for a spatial environment such as a fertilizer recipe for an agricultural field.

  8. The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events

    PubMed Central

    Jin, Guangxu; Zhou, Xiaobo; Wang, Honghui; Zhao, Hong; Cui, Kemi; Zhang, Xiang-Sun; Chen, Luonan; Hazen, Stanley L.; Li, King; Wong, Stephen T. C.

    2010-01-01

    The mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of data noise in MS analysis, we proposed a new approach to knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built a cardiovascular-related network based on protein information coming from protein annotations in Uniprot, protein–protein interaction (PPI) data, and a signal transduction database. Distinct from the previous machine learning methods in MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. Through the tradeoff between known protein information and data noise in the mass spectrometry data, we could firmly identify high-confidence biomarkers. Most importantly, aided by the protein–protein interaction network (the cardiovascular-related network), we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single-molecule biomarkers that do not take biological molecular interactions into consideration. PMID:18665624
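
    A hedged sketch of how candidate network biomarkers might be scored, assuming synthetic case/control data and a toy PPI edge list: each edge is scored by the average case-versus-control separation (t-statistic) of its endpoint proteins, a simplified proxy for the statistical procedure described above.

```python
# Rank PPI edges as candidate network biomarkers by combining the discriminative power
# of their two endpoint proteins (synthetic abundances; names are placeholders).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
proteins = ["P1", "P2", "P3", "P4"]
cases = {p: rng.normal(1.0 if p in ("P1", "P2") else 0.0, 1.0, 30) for p in proteins}
controls = {p: rng.normal(0.0, 1.0, 30) for p in proteins}
ppi_edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

def edge_score(a, b):
    ta = abs(stats.ttest_ind(cases[a], controls[a]).statistic)
    tb = abs(stats.ttest_ind(cases[b], controls[b]).statistic)
    return (ta + tb) / 2

for a, b in sorted(ppi_edges, key=lambda e: -edge_score(*e)):
    print(f"{a}-{b}: {edge_score(a, b):.2f}")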

  9. Interoperability between biomedical ontologies through relation expansion, upper-level ontologies and automatic reasoning.

    PubMed

    Hoehndorf, Robert; Dumontier, Michel; Oellrich, Anika; Rebholz-Schuhmann, Dietrich; Schofield, Paul N; Gkoutos, Georgios V

    2011-01-01

    Researchers design ontologies as a means to accurately annotate and integrate experimental data across heterogeneous and disparate data- and knowledge bases. Formal ontologies make the semantics of terms and relations explicit such that automated reasoning can be used to verify the consistency of knowledge. However, many biomedical ontologies do not sufficiently formalize the semantics of their relations and are therefore limited with respect to automated reasoning for large scale data integration and knowledge discovery. We describe a method to improve automated reasoning over biomedical ontologies and identify several thousand contradictory class definitions. Our approach aligns terms in biomedical ontologies with foundational classes in a top-level ontology and formalizes composite relations as class expressions. We describe the semi-automated repair of contradictions and demonstrate expressive queries over interoperable ontologies. Our work forms an important cornerstone for data integration, automatic inference and knowledge discovery based on formal representations of knowledge. Our results and analysis software are available at http://bioonto.de/pmwiki.php/Main/ReasonableOntologies.

  10. Knowledge Representation and Data Mining of Neuronal Morphologies Using Neuroinformatics Tools and Formal Ontologies

    ERIC Educational Resources Information Center

    Polavaram, Sridevi

    2016-01-01

    Neuroscience can greatly benefit from using novel methods in computer science and informatics, which enable knowledge discovery in unexpected ways. Currently one of the biggest challenges in Neuroscience is to map the functional circuitry of the brain. The applications of this goal range from understanding structural reorganization of neurons to…

  11. 75 FR 5299 - Office of Special Education and Rehabilitative Services; Overview Information; National Institute...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-02-02

    ... planned outputs are expected to contribute to advances in knowledge, improvements in policy and practice... of accomplishments (e.g., new or improved tools, methods, discoveries, standards, interventions...

  12. PCM-SABRE: a platform for benchmarking and comparing outcome prediction methods in precision cancer medicine.

    PubMed

    Eyal-Altman, Noah; Last, Mark; Rubin, Eitan

    2017-01-17

    Numerous publications attempt to predict cancer survival outcome from gene expression data using machine-learning methods. A direct comparison of these works is challenging for the following reasons: (1) inconsistent measures used to evaluate the performance of different models, and (2) incomplete specification of critical stages in the process of knowledge discovery. There is a need for a platform that would allow researchers to replicate previous works and to test the impact of changes in the knowledge discovery process on the accuracy of the induced models. We developed the PCM-SABRE platform, which supports the entire knowledge discovery process for cancer outcome analysis. PCM-SABRE was developed using KNIME. By using PCM-SABRE to reproduce the results of previously published works on breast cancer survival, we define a baseline for evaluating future attempts to predict cancer outcome with machine learning. We used PCM-SABRE to replicate previous works that describe predictive models of breast cancer recurrence, and tested the performance of all possible combinations of the feature selection methods and data mining algorithms that were used in either of the works. We reconstructed the work of Chou et al., observing similar trends: superior performance of the Probabilistic Neural Network (PNN) and logistic regression (LR) algorithms, and an inconclusive impact of feature pre-selection with the decision tree algorithm on subsequent analysis. PCM-SABRE is a software tool that provides an intuitive environment for rapid development of predictive models in cancer precision medicine.
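
    A hedged sketch of the kind of benchmarking PCM-SABRE automates: cross-validated evaluation of every combination of a few feature selection methods and classifiers on synthetic outcome data. PCM-SABRE itself is a KNIME workflow; the scikit-learn loop below only mirrors the idea.

```python
# Evaluate all feature-selection x classifier combinations with cross-validated AUC.
from itertools import product
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, n_features=500, n_informative=10, random_state=0)

selectors = {"anova": SelectKBest(f_classif, k=20),
             "mutual_info": SelectKBest(mutual_info_classif, k=20)}
models = {"logistic_regression": LogisticRegression(max_iter=1000),
          "decision_tree": DecisionTreeClassifier(max_depth=3)}

for (s_name, sel), (m_name, clf) in product(selectors.items(), models.items()):
    pipe = Pipeline([("select", sel), ("clf", clf)])
    auc = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{s_name:12s} + {m_name:20s} AUC = {auc:.3f}")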

  13. Trends in Modern Drug Discovery.

    PubMed

    Eder, Jörg; Herrling, Paul L

    2016-01-01

    Drugs discovered by the pharmaceutical industry over the past 100 years have dramatically changed the practice of medicine and impacted on many aspects of our culture. For many years, drug discovery was a target- and mechanism-agnostic approach that was based on ethnobotanical knowledge often fueled by serendipity. With the advent of modern molecular biology methods and based on knowledge of the human genome, drug discovery has now largely changed into a hypothesis-driven target-based approach, a development which was paralleled by significant environmental changes in the pharmaceutical industry. Laboratories became increasingly computerized and automated, and geographically dispersed research sites are now more and more clustered into large centers to capture technological and biological synergies. Today, academia, the regulatory agencies, and the pharmaceutical industry all contribute to drug discovery, and, in order to translate the basic science into new medical treatments for unmet medical needs, pharmaceutical companies have to have a critical mass of excellent scientists working in many therapeutic fields, disciplines, and technologies. The imperative for the pharmaceutical industry to discover breakthrough medicines is matched by the increasing numbers of first-in-class drugs approved in recent years and reflects the impact of modern drug discovery approaches, technologies, and genomics.

  14. Network-based approaches to climate knowledge discovery

    NASA Astrophysics Data System (ADS)

    Budich, Reinhard; Nyberg, Per; Weigel, Tobias

    2011-11-01

    Climate Knowledge Discovery Workshop; Hamburg, Germany, 30 March to 1 April 2011 Do complex networks combined with semantic Web technologies offer the next generation of solutions in climate science? To address this question, a first Climate Knowledge Discovery (CKD) Workshop, hosted by the German Climate Computing Center (Deutsches Klimarechenzentrum (DKRZ)), brought together climate and computer scientists from major American and European laboratories, data centers, and universities, as well as representatives from industry, the broader academic community, and the semantic Web communities. The participants, representing six countries, were concerned with large-scale Earth system modeling and computational data analysis. The motivation for the meeting was the growing problem that climate scientists generate data faster than it can be interpreted and the need to prepare for further exponential data increases. Current analysis approaches are focused primarily on traditional methods, which are best suited for large-scale phenomena and coarse-resolution data sets. The workshop focused on the open discussion of ideas and technologies to provide the next generation of solutions to cope with the increasing data volumes in climate science.

  15. Computational biology for cardiovascular biomarker discovery.

    PubMed

    Azuaje, Francisco; Devaux, Yvan; Wagner, Daniel

    2009-07-01

    Computational biology is essential in the process of translating biological knowledge into clinical practice, as well as in the understanding of biological phenomena based on the resources and technologies originating from the clinical environment. One such key contribution of computational biology is the discovery of biomarkers for predicting clinical outcomes using 'omic' information. This process involves the predictive modelling and integration of different types of data and knowledge for screening, diagnostic or prognostic purposes. Moreover, this requires the design and combination of different methodologies based on statistical analysis and machine learning. This article introduces key computational approaches and applications to biomarker discovery based on different types of 'omic' data. Although we emphasize applications in cardiovascular research, the computational requirements and advances discussed here are also relevant to other domains. We will start by introducing some of the contributions of computational biology to translational research, followed by an overview of methods and technologies used for the identification of biomarkers with predictive or classification value. The main types of 'omic' approaches to biomarker discovery will be presented with specific examples from cardiovascular research. This will include a review of computational methodologies for single-source and integrative data applications. Major computational methods for model evaluation will be described together with recommendations for reporting models and results. We will present recent advances in cardiovascular biomarker discovery based on the combination of gene expression and functional network analyses. The review will conclude with a discussion of key challenges for computational biology, including perspectives from the biosciences and clinical areas.

  16. A New Student Performance Analysing System Using Knowledge Discovery in Higher Educational Databases

    ERIC Educational Resources Information Center

    Guruler, Huseyin; Istanbullu, Ayhan; Karahasan, Mehmet

    2010-01-01

    Knowledge discovery is a wide ranged process including data mining, which is used to find out meaningful and useful patterns in large amounts of data. In order to explore the factors having impact on the success of university students, knowledge discovery software, called MUSKUP, has been developed and tested on student data. In this system a…

  17. To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure

    NASA Astrophysics Data System (ADS)

    Stock, Kristin; Stojanovic, Tim; Reitsma, Femke; Ou, Yang; Bishr, Mohamed; Ortmann, Jens; Robertson, Anne

    2012-08-01

    A geospatial knowledge infrastructure consists of a set of interoperable components, including software, information, hardware, procedures and standards, that work together to support advanced discovery and creation of geoscientific resources, including publications, data sets and web services. The focus of the work presented is the development of such an infrastructure for resource discovery. Advanced resource discovery is intended to support scientists in finding resources that meet their needs, and focuses on representing the semantic details of the scientific resources, including the detailed aspects of the science that led to the resource being created. This paper describes an information model for a geospatial knowledge infrastructure that uses ontologies to represent these semantic details, including knowledge about domain concepts, the scientific elements of the resource (analysis methods, theories and scientific processes) and web services. This semantic information can be used to enable more intelligent search over scientific resources, and to support new ways to infer and visualise scientific knowledge. The work describes the requirements for semantic support of a knowledge infrastructure, and analyses the different options for information storage based on the twin goals of semantic richness and syntactic interoperability to allow communication between different infrastructures. Such interoperability is achieved by the use of open standards, and the architecture of the knowledge infrastructure adopts such standards, particularly from the geospatial community. The paper then describes an information model that uses a range of different types of ontologies, explaining those ontologies and their content. The information model was successfully implemented in a working geospatial knowledge infrastructure, but the evaluation identified some issues in creating the ontologies.

  18. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  19. Knowledge discovery from data and Monte-Carlo DEA to evaluate technical efficiency of mental health care in small health areas

    PubMed Central

    García-Alonso, Carlos; Pérez-Naranjo, Leonor

    2009-01-01

    Introduction Knowledge management, based on information transfer between experts and analysts, is crucial for the validity and usability of data envelopment analysis (DEA). Aim To design and develop a methodology: i) to assess the technical efficiency of small health areas (SHA) in an uncertain environment, and ii) to transfer information between experts and operational models, in both directions, to improve experts' knowledge. Method A procedure derived from knowledge discovery from data (KDD) is used to select, interpret and weigh DEA inputs and outputs. Based on the KDD results, an expert-driven Monte-Carlo DEA model has been designed to assess the technical efficiency of SHA in Andalusia. Results In terms of probability, SHA 29 is the most efficient, whereas SHA 22 is very inefficient. 73% of the analysed SHA have a probability of being efficient (Pe) >0.9 and 18% <0.5. Conclusions Expert knowledge is necessary to design and validate any operational model. KDD techniques make the transfer of information from experts to any operational model easy, and the results obtained from the latter improve experts' knowledge.
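
    A hedged sketch of the Monte-Carlo DEA idea, assuming synthetic inputs and outputs: each unit's input-oriented CCR efficiency is solved as a linear program, the data are perturbed over many draws, and the probability of being efficient (Pe) is estimated as the fraction of draws in which the unit is efficient. The KDD-derived expert weighting described above is not modeled here.

```python
# Input-oriented CCR DEA solved as an LP, repeated over Monte Carlo perturbations
# of inputs/outputs to estimate each unit's probability of being efficient.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """X: inputs (n_units x n_inputs), Y: outputs (n_units x n_outputs), o: unit index."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.r_[1.0, np.zeros(n)]                              # minimize theta
    A_in = np.c_[-X[o].reshape(m, 1), X.T]                   # sum(lmbda*x) <= theta * x_o
    A_out = np.c_[np.zeros((s, 1)), -Y.T]                    # sum(lmbda*y) >= y_o
    res = linprog(c, A_ub=np.r_[A_in, A_out],
                  b_ub=np.r_[np.zeros(m), -Y[o]],
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]

rng = np.random.default_rng(0)
X0 = rng.uniform(5, 10, size=(8, 2))                         # 8 hypothetical units, 2 inputs
Y0 = rng.uniform(1, 5, size=(8, 2))                          # 2 outputs
draws, efficient = 200, np.zeros(8)
for _ in range(draws):                                       # perturb data to model uncertainty
    Xp = X0 * rng.normal(1.0, 0.05, X0.shape)
    Yp = Y0 * rng.normal(1.0, 0.05, Y0.shape)
    efficient += np.array([ccr_efficiency(Xp, Yp, o) for o in range(8)]) > 0.999
print("probability of being efficient:", efficient / draws)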

  20. Radioactive Dating: A Method for Geochronology.

    ERIC Educational Resources Information Center

    Rowe, M. W.

    1985-01-01

    Gives historical background on the discovery of natural radiation and discusses various techniques for using knowledge of radiochemistry in geochronological studies. Indicates that of these radioactive techniques, Potassium-40/Argon-40 dating is used most often. (JN)

  1. Information Fusion - Methods and Aggregation Operators

    NASA Astrophysics Data System (ADS)

    Torra, Vicenç

    Information fusion techniques are commonly applied in Data Mining and Knowledge Discovery. In this chapter, we give an overview of such applications considering their three main uses. That is, we consider fusion methods for data preprocessing, model building and information extraction. Some aggregation operators (i.e. particular fusion methods) and their properties are briefly described as well.
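
    Two common aggregation operators can be sketched directly; the scores and weights below are illustrative.

```python
# Weighted mean (source-specific importance) versus OWA (ordered weighted averaging,
# where weights attach to ranks rather than sources).
import numpy as np

def weighted_mean(values, weights):
    w = np.asarray(weights, dtype=float)
    return float(np.dot(values, w / w.sum()))

def owa(values, weights):
    w = np.asarray(weights, dtype=float)
    ordered = np.sort(values)[::-1]                # largest value gets the first weight
    return float(np.dot(ordered, w / w.sum()))

scores = [0.9, 0.4, 0.7]
print(weighted_mean(scores, [0.5, 0.2, 0.3]))      # trust the first source more
print(owa(scores, [0.0, 1.0, 0.0]))                # these OWA weights return the median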

  2. Computational discovery of picomolar Q(o) site inhibitors of cytochrome bc1 complex.

    PubMed

    Hao, Ge-Fei; Wang, Fu; Li, Hui; Zhu, Xiao-Lei; Yang, Wen-Chao; Huang, Li-Shar; Wu, Jia-Wei; Berry, Edward A; Yang, Guang-Fu

    2012-07-11

    A critical challenge to fragment-based drug discovery (FBDD) is its low-throughput nature due to the necessity of biophysical method-based fragment screening. Herein, a method of pharmacophore-linked fragment virtual screening (PFVS) was successfully developed. Its application yielded the first picomolar-range Q(o) site inhibitors of the cytochrome bc(1) complex, an important membrane protein for drug and fungicide discovery. Compared with the original hit compound 4 (K(i) = 881.80 nM, porcine bc(1)), the most potent compound 4f displayed a 20,507-fold improvement in binding affinity (K(i) = 43.00 pM). Compound 4f was proved to be a noncompetitive inhibitor with respect to the substrate cytochrome c, but a competitive inhibitor with respect to the substrate ubiquinol. Additionally, we determined the crystal structure of compound 4e (K(i) = 83.00 pM) bound to the chicken bc(1) at 2.70 Å resolution, providing a molecular basis for understanding its ultrapotency. To our knowledge, this study is the first application of the FBDD method in the discovery of picomolar inhibitors of a membrane protein. This work demonstrates that the novel PFVS approach is a high-throughput drug discovery method, independent of biophysical screening techniques.

  3. Knowledge discovery for pancreatic cancer using inductive logic programming.

    PubMed

    Qiu, Yushan; Shimada, Kazuaki; Hiraoka, Nobuyoshi; Maeshiro, Kensei; Ching, Wai-Ki; Aoki-Kinoshita, Kiyoko F; Furuta, Koh

    2014-08-01

    Pancreatic cancer is a devastating disease, and predicting the status of patients has become an important and urgent issue. The authors explore the applicability of the inductive logic programming (ILP) method to this disease and show that the accumulated clinical laboratory data can be used to predict disease characteristics, which will contribute to the selection of therapeutic modalities for pancreatic cancer. The availability of a large amount of clinical laboratory data provides clues to aid in the knowledge discovery of diseases. In predicting the differentiation of tumours and the status of lymph node metastasis in pancreatic cancer using the ILP model, three rules are developed that are consistent with descriptions in the literature. The rules that are identified are useful for detecting the differentiation of tumours and the status of lymph node metastasis in pancreatic cancer and therefore contribute significantly to the decision on therapeutic strategies. In addition, the proposed method is compared with other typical classification techniques and the results further confirm the superiority and merit of the proposed method.

  4. Knowledge discovery in traditional Chinese medicine: state of the art and perspectives.

    PubMed

    Feng, Yi; Wu, Zhaohui; Zhou, Xuezhong; Zhou, Zhongmei; Fan, Weiyu

    2006-11-01

    As a complementary medical system to Western medicine, traditional Chinese medicine (TCM) has provided a unique theoretical and practical approach to the treatment of diseases over thousands of years. Confronted with the increasing popularity of TCM and the huge volume of TCM data, historically accumulated and recently obtained, there is an urgent need to explore these resources effectively using the techniques of knowledge discovery in databases (KDD). This paper aims at providing an overview of recent KDD studies in the TCM field. A literature search was conducted in both English and Chinese publications, and major studies of knowledge discovery in TCM (KDTCM) reported in these materials were identified. Based on an introduction to the state of the art of TCM data resources, a review of four subfields of KDTCM research is presented, including KDD for the research of Chinese medical formulae, KDD for the research of Chinese herbal medicine, KDD for TCM syndrome research, and KDD for TCM clinical diagnosis. Furthermore, the current state and main problems in each subfield are summarized based on a discussion of existing studies, and future directions for each subfield are also proposed accordingly. A range of KDD methods is used in existing KDTCM research, from conventional frequent itemset mining to state-of-the-art latent structure models. Many interesting discoveries have been obtained by these methods, such as novel TCM paired drugs discovered by frequent itemset analysis, functional communities of related genes discovered from a syndrome perspective by text mining, the high proportion of toxic plants in the botanical family Ranunculaceae disclosed by statistical analysis, and the association between M-cholinoceptor blocking drugs and Solanaceae revealed by association rule mining. It is particularly inspiring to see some studies connecting TCM with biomedicine, which provide a novel top-down view for functional genomics research. However, further development of KDD methods is still expected to better adapt to the features of TCM. Existing studies demonstrate that KDTCM is effective in obtaining medical discoveries. However, much more work needs to be done in order to discover real diamonds in the TCM domain. The use and development of KDTCM in the future will substantially contribute to the TCM community, as well as to modern life science.
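
    As an illustration of the conventional frequent itemset step mentioned above, the sketch below runs a single Apriori-style pass over a handful of invented prescriptions to surface frequently co-prescribed pairs.

```python
# Count co-prescribed herb pairs across prescriptions and keep those above a support
# threshold; the prescriptions listed are placeholders, not real TCM formulae.
from itertools import combinations
from collections import Counter

prescriptions = [
    {"ginseng", "licorice", "astragalus"},
    {"ginseng", "licorice", "angelica"},
    {"licorice", "angelica", "rehmannia"},
    {"ginseng", "licorice"},
]

def frequent_pairs(transactions, min_support=0.5):
    counts = Counter(pair for t in transactions
                     for pair in combinations(sorted(t), 2))
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

print(frequent_pairs(prescriptions))   # e.g. ('ginseng', 'licorice') appears in 3 of 4 prescriptions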

  5. Transfer Learning of Classification Rules for Biomarker Discovery and Verification from Molecular Profiling Studies

    PubMed Central

    Ganchev, Philip; Malehorn, David; Bigbee, William L.; Gopalakrishnan, Vanathi

    2013-01-01

    We present a novel framework for integrative biomarker discovery from related but separate data sets created in biomarker profiling studies. The framework takes prior knowledge in the form of interpretable, modular rules, and uses them during the learning of rules on a new data set. The framework consists of two methods of transfer of knowledge from source to target data: transfer of whole rules and transfer of rule structures. We evaluated the methods on three pairs of data sets: one genomic and two proteomic. We used standard measures of classification performance and three novel measures of amount of transfer. Preliminary evaluation shows that whole-rule transfer improves classification performance over using the target data alone, especially when there is more source data than target data. It also improves performance over using the union of the data sets. PMID:21571094

  6. A Knowledge Discovery framework for Planetary Defense

    NASA Astrophysics Data System (ADS)

    Jiang, Y.; Yang, C. P.; Li, Y.; Yu, M.; Bambacus, M.; Seery, B.; Barbee, B.

    2016-12-01

    Planetary Defense, a project funded by NASA Goddard and the NSF, is a multi-faceted effort focused on the mitigation of Near Earth Object (NEO) threats to our planet. Currently, information concerning NEOs is dispersed among different organizations and scientists, and there is no coherent system of information to support efficient NEO mitigation. In this paper, a planetary defense knowledge discovery engine is proposed to better assist the development and integration of a NEO response system. Specifically, we have implemented an organized information framework by two means: 1) the development of a semantic knowledge base, which provides a structure for relevant information; it has been developed with web crawling and natural language processing techniques, which allow us to collect and store the most relevant structured information on a regular basis; and 2) the development of a knowledge discovery engine, which allows for the efficient retrieval of information from our knowledge base. The knowledge discovery engine has been built on top of Elasticsearch, an open source full-text search engine, together with cutting-edge machine learning ranking and recommendation algorithms. The proposed framework is expected to advance knowledge discovery and innovation in the planetary science domain.
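
    To make the retrieval step concrete, here is a minimal sketch of indexing and querying documents with the Elasticsearch Python client (8.x keyword-argument style). The host, index name, fields, and document content are hypothetical placeholders and are not the project's actual schema.

        # Sketch: store one crawled NEO-related record, then run a full-text query.
        from elasticsearch import Elasticsearch

        es = Elasticsearch("http://localhost:9200")

        doc = {
            "title": "Kinetic impactor options for a 140 m NEO",
            "source": "crawled-report",
            "body": "Deflection via kinetic impactor requires years of lead time...",
        }
        es.index(index="neo-knowledge", document=doc)

        hits = es.search(
            index="neo-knowledge",
            query={"match": {"body": "kinetic impactor deflection"}},
            size=5,
        )
        for h in hits["hits"]["hits"]:
            print(h["_score"], h["_source"]["title"])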

  7. Translational Research 2.0: a framework for accelerating collaborative discovery.

    PubMed

    Asakiewicz, Chris

    2014-05-01

    The World Wide Web has revolutionized the conduct of global, cross-disciplinary research. In the life sciences, interdisciplinary approaches to problem solving and collaboration are becoming increasingly important in facilitating knowledge discovery and integration. Web 2.0 technologies promise to have a profound impact - enabling reproducibility, aiding in discovery, and accelerating and transforming medical and healthcare research across the healthcare ecosystem. However, knowledge integration and discovery require a consistent foundation upon which to operate. Such a foundation should be capable of addressing some of the critical issues associated with how research is conducted within the ecosystem today and how it should be conducted in the future. This article discusses a framework for enhancing collaborative knowledge discovery across the medical and healthcare research ecosystem - a framework that could serve as a foundation upon which ecosystem stakeholders can improve the way data, information, and knowledge are created, shared, and used to accelerate the translation of knowledge from one area of the ecosystem to another.

  8. Interdisciplinary Laboratory Course Facilitating Knowledge Integration, Mutualistic Teaming, and Original Discovery.

    PubMed

    Full, Robert J; Dudley, Robert; Koehl, M A R; Libby, Thomas; Schwab, Cheryl

    2015-11-01

    Experiencing the thrill of an original scientific discovery can be transformative to students unsure about becoming a scientist, yet few courses offer authentic research experiences. Increasingly, cutting-edge discoveries require an interdisciplinary approach not offered in current departmental-based courses. Here, we describe a one-semester, learning laboratory course on organismal biomechanics offered at our large research university that enables interdisciplinary teams of students from biology and engineering to grow intellectually, collaborate effectively, and make original discoveries. To attain this goal, we avoid traditional "cookbook" laboratories by training 20 students to use a dozen research stations. Teams of five students rotate to a new station each week where a professor, graduate student, and/or team member assists in the use of equipment, guides students through stages of critical thinking, encourages interdisciplinary collaboration, and moves them toward authentic discovery. Weekly discussion sections that involve the entire class offer exchange of discipline-specific knowledge, advice on experimental design, methods of collecting and analyzing data, a statistics primer, and best practices for writing and presenting scientific papers. The building of skills in concert with weekly guided inquiry facilitates original discovery via a final research project that can be presented at a national meeting or published in a scientific journal. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.

  9. A framework for interval-valued information system

    NASA Astrophysics Data System (ADS)

    Yin, Yunfei; Gong, Guanghong; Han, Liang

    2012-09-01

    An interval-valued information system is used to transform a conventional dataset into interval-valued form. To conduct interval-valued data mining, we carry out two investigations: (1) constructing the interval-valued information system, and (2) conducting interval-valued knowledge discovery. In constructing the interval-valued information system, we first discover the paired attributes in the database and then store them in neighbouring locations in a common database, regarding them as 'one' new field. In conducting the interval-valued knowledge discovery, we utilise related prior knowledge, regard it as the control objectives, and design an approximate closed-loop control mining system. On the implemented experimental platform (prototype), we conduct the corresponding experiments and compare the proposed algorithms with several typical algorithms, such as the Apriori algorithm, the FP-growth algorithm and the CLOSE+ algorithm. The experimental results show that the interval-valued information system method is more effective than the conventional algorithms in discovering interval-valued patterns.
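
    A minimal sketch of the construction step, assuming pandas and invented column names: two paired attributes (a min/max pair) are combined into a single interval-valued field stored alongside the originals, which is the kind of representation the downstream interval-valued mining would consume. This is not the authors' system, only an illustration.

        # Turn paired attributes into one interval-valued attribute.
        import pandas as pd

        df = pd.DataFrame({
            "temp_min": [12.1, 14.0, 11.5],
            "temp_max": [19.4, 21.2, 18.0],
        })

        df["temp_interval"] = list(zip(df["temp_min"], df["temp_max"]))

        # A simple interval predicate usable by interval-valued pattern mining.
        def contains(interval, value):
            lo, hi = interval
            return lo <= value <= hi

        print(df["temp_interval"].apply(lambda iv: contains(iv, 15.0)))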

  10. A renaissance of neural networks in drug discovery.

    PubMed

    Baskin, Igor I; Winkler, David; Tetko, Igor V

    2016-08-01

    Neural networks are becoming a very popular method for solving machine learning and artificial intelligence problems. The variety of neural network types and their application to drug discovery requires expert knowledge to choose the most appropriate approach. In this review, the authors discuss traditional and newly emerging neural network approaches to drug discovery. Their focus is on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning. The most important technical issues are discussed, including overfitting and its prevention through regularization, ensemble and multitask modeling, model interpretation, and estimation of applicability domain. Different aspects of using neural networks in drug discovery are considered: building structure-activity models with respect to various targets; predicting drug selectivity, toxicity profiles, ADMET and physicochemical properties; characteristics of drug-delivery systems and virtual screening. Neural networks continue to grow in importance for drug discovery. Recent developments in deep learning suggest that further improvements may be gained in the analysis of large chemical data sets. It is anticipated that neural networks will be more widely used in drug discovery in the future, and applied in non-traditional areas such as drug delivery systems, biologically compatible materials, and regenerative medicine.
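
    A minimal, hedged sketch of the kind of backpropagation-network structure-activity model discussed, using scikit-learn's MLPRegressor with L2 regularization; the binary "fingerprint" features and activity values are synthetic stand-ins, not real assay data.

        # Toy QSAR regression with a small feedforward network.
        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.integers(0, 2, size=(200, 64)).astype(float)   # mock molecular fingerprints
        y = X[:, :8].sum(axis=1) + rng.normal(0, 0.3, 200)      # mock activity signal

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = MLPRegressor(hidden_layer_sizes=(32, 16), alpha=1e-3,  # alpha = L2 penalty
                             max_iter=2000, random_state=0)
        model.fit(X_tr, y_tr)
        print("held-out R^2:", round(model.score(X_te, y_te), 3))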

  11. Knowledge Discovery from Biomedical Ontologies in Cross Domains.

    PubMed

    Shen, Feichen; Lee, Yugyung

    2016-01-01

    In recent years, there has been an increasing demand for sharing and integration of medical data in biomedical research. In order to improve a health care system, it is necessary to support the integration of data by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems as the conceptual models underlying datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve semantic interoperability between different ontologies. For this purpose, we focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies and the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross domain knowledge discovery from the multiple biological ontologies. Thus, the patterns make a greater contribution to knowledge discovery across multiple ontologies. We have demonstrated the cross domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross domain query processing approach, namely SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies.

  12. Knowledge Discovery from Biomedical Ontologies in Cross Domains

    PubMed Central

    Shen, Feichen; Lee, Yugyung

    2016-01-01

    In recent years, there has been an increasing demand for sharing and integration of medical data in biomedical research. In order to improve a health care system, it is necessary to support the integration of data by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems as the conceptual models underlying datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve semantic interoperability between different ontologies. For this purpose, we focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies and the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross domain knowledge discovery from the multiple biological ontologies. Thus, the patterns make a greater contribution to knowledge discovery across multiple ontologies. We have demonstrated the cross domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross domain query processing approach, namely SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies. PMID:27548262

  13. Communication in Collaborative Discovery Learning

    ERIC Educational Resources Information Center

    Saab, Nadira; van Joolingen, Wouter R.; van Hout-Wolters, Bernadette H. A. M.

    2005-01-01

    Background: Constructivist approaches to learning focus on learning environments in which students have the opportunity to construct knowledge themselves, and negotiate this knowledge with others. "Discovery learning" and "collaborative learning" are examples of learning contexts that cater for knowledge construction processes. We introduce a…

  14. Practice-Based Knowledge Discovery for Comparative Effectiveness Research: An Organizing Framework

    PubMed Central

    Lucero, Robert J.; Bakken, Suzanne

    2014-01-01

    Electronic health information systems can increase the ability of health-care organizations to investigate the effects of clinical interventions. The authors present an organizing framework that integrates outcomes and informatics research paradigms to guide knowledge discovery in electronic clinical databases. They illustrate its application using the example of hospital acquired pressure ulcers (HAPU). The Knowledge Discovery through Informatics for Comparative Effectiveness Research (KDI-CER) framework was conceived as a heuristic to conceptualize study designs and address potential methodological limitations imposed by using a single research perspective. Advances in informatics research can play a complementary role in advancing the field of outcomes research including CER. The KDI-CER framework can be used to facilitate knowledge discovery from routinely collected electronic clinical data. PMID:25278645

  15. Reflective Random Indexing and indirect inference: a scalable method for discovery of implicit connections.

    PubMed

    Cohen, Trevor; Schvaneveldt, Roger; Widdows, Dominic

    2010-04-01

    The discovery of implicit connections between terms that do not occur together in any scientific document underlies the model of literature-based knowledge discovery first proposed by Swanson. Corpus-derived statistical models of semantic distance such as Latent Semantic Analysis (LSA) have been evaluated previously as methods for the discovery of such implicit connections. However, LSA in particular is dependent on a computationally demanding method of dimension reduction as a means to obtain meaningful indirect inference, limiting its ability to scale to large text corpora. In this paper, we evaluate the ability of Random Indexing (RI), a scalable distributional model of word associations, to draw meaningful implicit relationships between terms in general and biomedical language. Proponents of this method have achieved comparable performance to LSA on several cognitive tasks while using a simpler and less computationally demanding method of dimension reduction than LSA employs. In this paper, we demonstrate that the original implementation of RI is ineffective at inferring meaningful indirect connections, and evaluate Reflective Random Indexing (RRI), an iterative variant of the method that is better able to perform indirect inference. RRI is shown to lead to more clearly related indirect connections and to outperform existing RI implementations in the prediction of future direct co-occurrence in the MEDLINE corpus. 2009 Elsevier Inc. All rights reserved.
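
    A toy numpy sketch of the idea, under stated assumptions: terms acquire vectors from sparse random document index vectors (standard random indexing), and one reflective cycle re-derives document vectors from term vectors and then term vectors again, so that terms that never co-occur directly can still become similar. The tiny corpus, dimensionality, and seed are invented; this is not the authors' implementation.

        # Toy Reflective Random Indexing over a three-document corpus.
        import numpy as np

        docs = ["fish oil blood viscosity",
                "blood viscosity raynaud disease",
                "fish oil platelet aggregation"]
        vocab = sorted({w for d in docs for w in d.split()})
        tdm = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

        rng = np.random.default_rng(1)
        dim = 8
        doc_index = rng.choice([-1.0, 0.0, 1.0], size=(len(docs), dim), p=[0.1, 0.8, 0.1])

        term_vecs = tdm @ doc_index                  # standard random indexing
        doc_vecs = tdm.T @ term_vecs                 # reflective step: docs from terms
        term_vecs = tdm @ doc_vecs                   # terms again, now carrying indirect links

        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

        i, j = vocab.index("fish"), vocab.index("raynaud")
        print("fish ~ raynaud:", round(cos(term_vecs[i], term_vecs[j]), 3))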

  16. Large scale analysis of the mutational landscape in HT-SELEX improves aptamer discovery

    PubMed Central

    Hoinka, Jan; Berezhnoy, Alexey; Dao, Phuong; Sauna, Zuben E.; Gilboa, Eli; Przytycka, Teresa M.

    2015-01-01

    High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis. To close this gap, we developed novel in-silico methods to analyze HT-SELEX data and utilized them to study the emergence of polymerase errors during HT-SELEX. Rather than considering these errors as a nuisance, we demonstrated their utility for guiding aptamer discovery. Our approach builds on two main advancements in aptamer analysis: AptaMut—a novel technique allowing for the identification of polymerase errors conferring an improved binding affinity relative to the ‘parent’ sequence and AptaCluster—an aptamer clustering algorithm which is to our best knowledge, the only currently available tool capable of efficiently clustering entire aptamer pools. We applied these methods to an HT-SELEX experiment developing aptamers against Interleukin 10 receptor alpha chain (IL-10RA) and experimentally confirmed our predictions thus validating our computational methods. PMID:25870409

  17. Discovery learning model with geogebra assisted for improvement mathematical visual thinking ability

    NASA Astrophysics Data System (ADS)

    Juandi, D.; Priatna, N.

    2018-05-01

    The main goal of this study is to improve the mathematical visual thinking ability of high school students through implementation of the Discovery Learning Model with GeoGebra assistance. The study used a quasi-experimental method with a non-random pretest-posttest control design. The sample consisted of 62 grade XI senior high school students in one school in Bandung district. Data were collected through documentation, observation, written tests, interviews, daily journals, and student worksheets. The results of this study are: 1) the improvement in mathematical visual thinking ability of students who learned with the GeoGebra-assisted Discovery Learning Model is significantly higher than that of students who received conventional instruction; 2) there is a difference in the improvement of students' mathematical visual thinking ability between treatment groups based on prior mathematical ability (high, medium, and low); 3) the improvement of the high prior-knowledge group is significantly higher than that of the medium and low groups; and 4) the quality of improvement for the high and low prior-knowledge groups falls in the moderate category, while the highest quality of improvement is achieved by students with medium prior knowledge.
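
    A hedged sketch of the kind of analysis behind comparisons like claims 1-3: compute a normalized gain per student and compare treatment and control groups with an independent-samples test. The scores are fabricated illustration data, not the study's measurements, and the study itself may have used different statistics.

        # Normalized gain and a Welch t-test on toy pretest/posttest scores.
        import numpy as np
        from scipy import stats

        def normalized_gain(pre, post, max_score=100):
            return (post - pre) / (max_score - pre)

        pre_treatment  = np.array([40, 35, 50, 45, 38])
        post_treatment = np.array([78, 70, 85, 80, 74])
        pre_control    = np.array([42, 37, 48, 44, 40])
        post_control   = np.array([60, 55, 66, 62, 58])

        g_treatment = normalized_gain(pre_treatment, post_treatment)
        g_control = normalized_gain(pre_control, post_control)

        t, p = stats.ttest_ind(g_treatment, g_control, equal_var=False)
        print("mean gains:", g_treatment.mean().round(2), g_control.mean().round(2))
        print("Welch t =", round(float(t), 2), " p =", round(float(p), 4))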

  18. Systematic identification of latent disease-gene associations from PubMed articles.

    PubMed

    Zhang, Yuji; Shen, Feichen; Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang

    2018-01-01

    Recent scientific advances have accumulated a tremendous amount of biomedical knowledge, providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated, noisy associations, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their respective capabilities of detecting latent associations and reducing noise in large-volume data. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.
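
    As an illustration of the LDA stage only, the sketch below fits a topic model to a tiny disease-by-gene count matrix and prints the top genes per topic and the disease-topic memberships. The matrix values, gene symbols, and number of topics are synthetic and chosen for the example; they do not come from the study's 146,245 associations.

        # Group diseases into topics from literature co-mention counts (synthetic data).
        import numpy as np
        from sklearn.decomposition import LatentDirichletAllocation

        genes = ["APP", "PSEN1", "BRCA1", "TP53", "SNCA", "LRRK2"]
        counts = np.array([            # rows: diseases, columns: co-mention counts
            [40, 25,  0,  2,  5,  1],  # Alzheimer-like profile
            [ 1,  0, 30, 45,  0,  0],  # cancer-like profile
            [ 3,  1,  0,  1, 35, 28],  # Parkinson-like profile
        ])

        lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
        for k, topic in enumerate(lda.components_):
            top = [genes[i] for i in topic.argsort()[::-1][:3]]
            print(f"topic {k}: {top}")
        print(lda.transform(counts).round(2))   # disease-to-topic memberships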

  19. Systematic identification of latent disease-gene associations from PubMed articles

    PubMed Central

    Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang

    2018-01-01

    Recent scientific advances have accumulated a tremendous amount of biomedical knowledge, providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated, noisy associations, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their respective capabilities of detecting latent associations and reducing noise in large-volume data. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research. PMID:29373609

  20. Object-graphs for context-aware visual category discovery.

    PubMed

    Lee, Yong Jae; Grauman, Kristen

    2012-02-01

    How can knowing about some categories help us to discover new ones in unlabeled images? Unsupervised visual category discovery is useful to mine for recurring objects without human supervision, but existing methods assume no prior information and thus tend to perform poorly for cluttered scenes with multiple objects. We propose to leverage knowledge about previously learned categories to enable more accurate discovery, and address challenges in estimating their familiarity in unsegmented, unlabeled images. We introduce two variants of a novel object-graph descriptor to encode the 2D and 3D spatial layout of object-level co-occurrence patterns relative to an unfamiliar region and show that by using them to model the interaction between an image’s known and unknown objects, we can better detect new visual categories. Rather than mine for all categories from scratch, our method identifies new objects while drawing on useful cues from familiar ones. We evaluate our approach on several benchmark data sets and demonstrate clear improvements in discovery over conventional purely appearance-based baselines.

  1. Mississippi State University Center for Air Sea Technology. FY93 and FY 94 Research Program in Navy Ocean Modeling and Prediction

    DTIC Science & Technology

    1994-09-30

    relational versus object-oriented DBMS, knowledge discovery, data models, metadata, data filtering, clustering techniques, and synthetic data. A secondary...The first was the investigation of AI/ES applications (knowledge discovery, data mining, and clustering). Here CAST collaborated with Dr. Fred Petry...knowledge discovery system based on clustering techniques; implemented an on-line data browser to the DBMS; completed preliminary efforts to apply object

  2. Modelling and enhanced molecular dynamics to steer structure-based drug discovery.

    PubMed

    Kalyaanamoorthy, Subha; Chen, Yi-Ping Phoebe

    2014-05-01

    The ever-increasing gap between the availabilities of the genome sequences and the crystal structures of proteins remains one of the significant challenges to the modern drug discovery efforts. The knowledge of structure-dynamics-functionalities of proteins is important in order to understand several key aspects of structure-based drug discovery, such as drug-protein interactions, drug binding and unbinding mechanisms and protein-protein interactions. This review presents a brief overview on the different state of the art computational approaches that are applied for protein structure modelling and molecular dynamics simulations of biological systems. We give an essence of how different enhanced sampling molecular dynamics approaches, together with regular molecular dynamics methods, assist in steering the structure based drug discovery processes. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Outside-In Systems Pharmacology Combines Innovative Computational Methods With High-Throughput Whole Vertebrate Studies.

    PubMed

    Schulthess, Pascal; van Wijk, Rob C; Krekels, Elke H J; Yates, James W T; Spaink, Herman P; van der Graaf, Piet H

    2018-04-25

    To advance the systems approach in pharmacology, experimental models and computational methods need to be integrated from early drug discovery onward. Here, we propose outside-in model development, a model identification technique to understand and predict the dynamics of a system without requiring prior biological and/or pharmacological knowledge. The advanced data required could be obtained by whole vertebrate, high-throughput, low-resource dose-exposure-effect experimentation with the zebrafish larva. Combinations of these innovative techniques could improve early drug discovery. © 2018 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  4. Interactive knowledge discovery with the doctor-in-the-loop: a practical example of cerebral aneurysms research.

    PubMed

    Girardi, Dominic; Küng, Josef; Kleiser, Raimund; Sonnberger, Michael; Csillag, Doris; Trenkler, Johannes; Holzinger, Andreas

    2016-09-01

    Established process models for knowledge discovery find the domain-expert in a customer-like and supervising role. In the field of biomedical research, it is necessary to move the domain-experts into the center of this process with far-reaching consequences for both their research output and the process itself. In this paper, we revise the established process models for knowledge discovery and propose a new process model for domain-expert-driven interactive knowledge discovery. Furthermore, we present a research infrastructure which is adapted to this new process model and demonstrate how the domain-expert can be deeply integrated even into the highly complex data-mining process and data-exploration tasks. We evaluated this approach in the medical domain for the case of cerebral aneurysms research.

  5. Influence of Previous Knowledge, Language Skills and Domain-Specific Interest on Observation Competency

    ERIC Educational Resources Information Center

    Kohlhauf, Lucia; Rutke, Ulrike; Neuhaus, Birgit

    2011-01-01

    Many epoch-making biological discoveries (e.g. Darwinian Theory) were based upon observations. Nevertheless, observation is often regarded as "just looking" rather than a basic scientific skill. As observation is one of the main research methods in biological sciences, it must be considered as an independent research method and systematic practice…

  6. Combining knowledge discovery from databases (KDD) and case-based reasoning (CBR) to support diagnosis of medical images

    NASA Astrophysics Data System (ADS)

    Stranieri, Andrew; Yearwood, John; Pham, Binh

    1999-07-01

    The development of data warehouses for the storage and analysis of very large corpora of medical image data represents a significant trend in health care and research. Amongst other benefits, the trend toward warehousing enables the use of techniques for automatically discovering knowledge from large and distributed databases. In this paper, we present an application design for knowledge discovery from databases (KDD) techniques that enhance the performance of the problem solving strategy known as case-based reasoning (CBR) for the diagnosis of radiological images. The problem of diagnosing abnormality of the cervical spine is used to illustrate the method. The design of a case-based medical image diagnostic support system has three essential characteristics. The first is a case representation that comprises textual descriptions of the image, visual features that are known to be useful for indexing images, and additional visual features to be discovered by data mining many existing images. The second characteristic of the approach presented here involves the development of a case base that comprises an optimal number and distribution of cases. The third characteristic involves the automatic discovery, using KDD techniques, of adaptation knowledge to enhance the performance of the case-based reasoner. Together, the three characteristics of our approach can overcome real-time efficiency obstacles that otherwise militate against the application of CBR to the domain of medical image analysis.
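
    To illustrate the retrieval step of CBR in this setting, here is a minimal sketch that finds the most similar prior case by a weighted distance over a few image-derived features. The feature names, weights, and case vectors are invented placeholders, not the paper's cervical-spine case base; in the paper's design, the weights would be informed by mined adaptation knowledge.

        # Nearest-neighbour case retrieval over toy image features.
        import numpy as np

        case_base = {            # feature vectors: [vertebral angle, disc height, density]
            "case_017": np.array([12.0, 4.5, 0.80]),
            "case_042": np.array([25.0, 3.1, 0.55]),
            "case_108": np.array([11.0, 4.8, 0.78]),
        }
        weights = np.array([0.5, 0.3, 0.2])

        def retrieve(query):
            scored = {cid: float(np.sqrt(((query - v) ** 2 * weights).sum()))
                      for cid, v in case_base.items()}
            return min(scored, key=scored.get)

        new_image_features = np.array([11.5, 4.6, 0.79])
        print("most similar case:", retrieve(new_image_features))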

  7. Integrating semantic web technologies and geospatial catalog services for geospatial information discovery and processing in cyberinfrastructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yue, Peng; Gong, Jianya; Di, Liping

    A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information and discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.

  8. 76 FR 79122 - Magnuson-Stevens Act Provisions; Fisheries Off West Coast States; Pacific Coast Groundfish...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-21

    ... management measures for the remainder of the biennial period that would take into account new knowledge... precautionary, in response to the discovery of an error in the methods that were used to estimate landings of...

  9. Knowledge Discovery in Textual Documentation: Qualitative and Quantitative Analyses.

    ERIC Educational Resources Information Center

    Loh, Stanley; De Oliveira, Jose Palazzo M.; Gastal, Fabio Leite

    2001-01-01

    Presents an application of knowledge discovery in texts (KDT) concerning medical records of a psychiatric hospital. The approach helps physicians to extract knowledge about patients and diseases that may be used for epidemiological studies, for training professionals, and to support physicians to diagnose and evaluate diseases. (Author/AEF)

  10. X-ray crystallography over the past decade for novel drug discovery - where are we heading next?

    PubMed

    Zheng, Heping; Handing, Katarzyna B; Zimmerman, Matthew D; Shabalin, Ivan G; Almo, Steven C; Minor, Wladek

    2015-01-01

    Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopy and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible.

  11. [The discovery of blood circulation: revolution or revision?].

    PubMed

    Crignon, Claire

    2011-01-01

    The discovery of the principle of blood circulation by William Harvey is generally considered one of the major events of the "scientific revolution" of the 17th century. This paper reconsiders the question by taking into account the way Harvey's discovery was discussed by some contemporary philosophers and physicians, in particular Fontenelle, who insisted on the necessity of redefining the methods and principles of medical knowledge on the basis of the revival of anatomy and physiology, and by considering its consequences for the way we think about human nature. This return allows us to consider the opportunity of substituting the Kuhnian scheme of the "structure of scientific revolutions" for the Bachelardian concept of "refonte".

  12. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target.

    PubMed

    Chiba, Shuntaro; Ikeda, Kazuyoshi; Ishida, Takashi; Gromiha, M Michael; Taguchi, Y-H; Iwadate, Mitsuo; Umeyama, Hideaki; Hsin, Kun-Yi; Kitano, Hiroaki; Yamamoto, Kazuki; Sugaya, Nobuyoshi; Kato, Koya; Okuno, Tatsuya; Chikenji, George; Mochizuki, Masahiro; Yasuo, Nobuaki; Yoshino, Ryunosuke; Yanagisawa, Keisuke; Ban, Tomohiro; Teramoto, Reiji; Ramakrishnan, Chandrasekaran; Thangakani, A Mary; Velmurugan, D; Prathipati, Philip; Ito, Junichi; Tsuchiya, Yuko; Mizuguchi, Kenji; Honma, Teruki; Hirokawa, Takatsugu; Akiyama, Yutaka; Sekijima, Masakazu

    2015-11-26

    A search of a broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective.

  13. Identification of potential inhibitors based on compound proposal contest: Tyrosine-protein kinase Yes as a target

    PubMed Central

    Chiba, Shuntaro; Ikeda, Kazuyoshi; Ishida, Takashi; Gromiha, M. Michael; Taguchi, Y-h.; Iwadate, Mitsuo; Umeyama, Hideaki; Hsin, Kun-Yi; Kitano, Hiroaki; Yamamoto, Kazuki; Sugaya, Nobuyoshi; Kato, Koya; Okuno, Tatsuya; Chikenji, George; Mochizuki, Masahiro; Yasuo, Nobuaki; Yoshino, Ryunosuke; Yanagisawa, Keisuke; Ban, Tomohiro; Teramoto, Reiji; Ramakrishnan, Chandrasekaran; Thangakani, A. Mary; Velmurugan, D.; Prathipati, Philip; Ito, Junichi; Tsuchiya, Yuko; Mizuguchi, Kenji; Honma, Teruki; Hirokawa, Takatsugu; Akiyama, Yutaka; Sekijima, Masakazu

    2015-01-01

    A search of a broader range of chemical space is important for drug discovery. Different methods of computer-aided drug discovery (CADD) are known to propose compounds in different chemical spaces as hit molecules for the same target protein. This study aimed at using multiple CADD methods through open innovation to achieve a level of hit molecule diversity that is not achievable with any particular single method. We held a compound proposal contest, in which multiple research groups participated and predicted inhibitors of tyrosine-protein kinase Yes. This showed whether collective knowledge based on individual approaches helped to obtain hit compounds from a broad range of chemical space and whether the contest-based approach was effective. PMID:26607293

  14. Streptomyces species: Ideal chassis for natural product discovery and overproduction.

    PubMed

    Liu, Ran; Deng, Zixin; Liu, Tiangang

    2018-05-28

    There is considerable interest in mining organisms for new natural products (NPs) and in improving methods to overproduce valuable NPs. Because of the rapid development of tools and strategies for metabolic engineering and the markedly increased knowledge of the biosynthetic pathways and genetics of NP-producing organisms, genome mining and overproduction of NPs can be dramatically accelerated. In particular, Streptomyces species have been proposed as suitable chassis organisms for NP discovery and overproduction because of their many unique characteristics not shared with yeast, Escherichia coli, or other microorganisms. In this review, we summarize the methods for genome sequencing, gene cluster prediction, and gene editing in Streptomyces, as well as metabolic engineering strategies for NP overproduction and approaches for generating new products. Finally, two strategies for utilizing Streptomyces as the chassis for NP discovery and overproduction are emphasized. Copyright © 2018 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.

  15. Tactical Decision Making under Categorical Uncertainty with Applications to Modeling and Simulation

    DTIC Science & Technology

    2008-12-01

    Method, Rene Descartes (1637) addressed the importance of discovery and truth through science. To accomplish this, he asked man to "reject all...previous knowledge, opinion, and customs" (Descartes, 1637, p. 21). He writes: The first was never to accept anything as true which I did not clearly know...and distinctly as to exclude all possibility of doubt. Descartes was arguing two points. First, knowledge, and therefore truth, cannot, and must

  16. Building Faculty Capacity through the Learning Sciences

    ERIC Educational Resources Information Center

    Moy, Elizabeth; O'Sullivan, Gerard; Terlecki, Melissa; Jernstedt, Christian

    2014-01-01

    Discoveries in the learning sciences (especially in neuroscience) have yielded a rich and growing body of knowledge about how students learn, yet this knowledge is only half of the story. The other half is "know how," i.e. the application of this knowledge. For faculty members, that means applying the discoveries of the learning sciences…

  17. Towards building a disease-phenotype knowledge base: extracting disease-manifestation relationship from literature

    PubMed Central

    Xu, Rong; Li, Li; Wang, QuanQiu

    2013-01-01

    Motivation: Systems approaches to studying phenotypic relationships among diseases are emerging as an active area of research for both novel disease gene discovery and drug repurposing. Currently, systematic study of disease phenotypic relationships on a phenome-wide scale is limited because large-scale machine-understandable disease–phenotype relationship knowledge bases are often unavailable. Here, we present an automatic approach to extract disease–manifestation (D-M) pairs (one specific type of disease–phenotype relationship) from the wide body of published biomedical literature. Data and Methods: Our method leverages external knowledge and limits the amount of human effort required. For the text corpus, we used 119 085 682 MEDLINE sentences (21 354 075 citations). First, we used D-M pairs from existing biomedical ontologies as prior knowledge to automatically discover D-M–specific syntactic patterns. We then extracted additional pairs from MEDLINE using the learned patterns. Finally, we analysed correlations between disease manifestations and disease-associated genes and drugs to demonstrate the potential of this newly created knowledge base in disease gene discovery and drug repurposing. Results: In total, we extracted 121 359 unique D-M pairs with a high precision of 0.924. Among the extracted pairs, 120 419 (99.2%) have not been captured in existing structured knowledge sources. We have shown that disease manifestations correlate positively with both disease-associated genes and drug treatments. Conclusions: The main contribution of our study is the creation of a large-scale and accurate D-M phenotype relationship knowledge base. This unique knowledge base, when combined with existing phenotypic, genetic and proteomic datasets, can have profound implications in our deeper understanding of disease etiology and in rapid drug repurposing. Availability: http://nlp.case.edu/public/data/DMPatternUMLS/ Contact: rxx@case.edu PMID:23828786
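
    A toy, hedged version of pattern-based relation extraction in the spirit described above: a lexical template is derived from one seed disease-manifestation pair and then applied to new sentences. The seed, template, and corpus sentences are invented for illustration and are far simpler than the syntactic patterns learned in the study.

        # Learn a lexical template from a seed pair and reuse it on other sentences.
        import re

        seed_disease, seed_manifestation = "marfan syndrome", "aortic dilatation"
        seed_sentence = "marfan syndrome is characterized by aortic dilatation"

        template = seed_sentence.replace(seed_disease, "{d}").replace(seed_manifestation, "{m}")
        pattern = re.compile(template.format(d=r"(?P<d>[a-z ]+?)", m=r"(?P<m>[a-z ,]+)"))

        corpus = [
            "cushing disease is characterized by central obesity and skin thinning.",
            "aspirin is used for the treatment of fever.",
        ]
        for sentence in corpus:
            m = pattern.search(sentence)
            if m:
                print((m.group("d").strip(), m.group("m").strip()))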

  18. The Montessori Method and the Kindergarten. Bulletin, 1914, No. 28. Whole Number 602

    ERIC Educational Resources Information Center

    Harrison, Elizabeth

    1914-01-01

    Recently an earnest, brilliant, and learned Italian woman, Dr. Maria Montessori, has become famous, probably beyond her desire, for her contribution to the knowledge of little children and for the embodiment of her own and the discoveries of others in what she likes to call "a method of a new science of education." Her scientific investigations as…

  19. The center for causal discovery of biomedical knowledge from big data

    PubMed Central

    Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard

    2015-01-01

    The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
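
    For intuition about the constraint-based side of causal discovery, here is a minimal sketch of the conditional-independence primitive such algorithms repeatedly apply, implemented as a partial-correlation test on synthetic Gaussian data where a common cause Z drives both X and Y. The data, coefficients, and test are invented for illustration and are not the Center's released tools.

        # X and Y correlate marginally but become independent given Z (no direct edge).
        import numpy as np

        rng = np.random.default_rng(0)
        n = 5000
        Z = rng.normal(size=n)
        X = 0.8 * Z + rng.normal(scale=0.5, size=n)
        Y = 0.6 * Z + rng.normal(scale=0.5, size=n)

        def partial_corr(x, y, z):
            # Correlate the residuals of x and y after regressing each on z.
            rx = x - np.polyval(np.polyfit(z, x, 1), z)
            ry = y - np.polyval(np.polyfit(z, y, 1), z)
            return float(np.corrcoef(rx, ry)[0, 1])

        print("corr(X, Y)     =", round(float(np.corrcoef(X, Y)[0, 1]), 3))  # clearly nonzero
        print("corr(X, Y | Z) =", round(partial_corr(X, Y, Z), 3))           # near zero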

  20. The Effect of Rules and Discovery in the Retention and Retrieval of Braille Inkprint Letter Pairs.

    ERIC Educational Resources Information Center

    Nagengast, Daniel L.; And Others

    The effects of rule knowledge were investigated using Braille inkprint pairs. Both recognition and recall were studied in three groups of subjects: rule knowledge, rule discovery, and no rule. Two hypotheses were tested: (1) that the group exposed to the rule would score better than would a discovery group and a control group; and (2) that all…

  1. Big data to smart data in Alzheimer's disease: The brain health modeling initiative to foster actionable knowledge.

    PubMed

    Geerts, Hugo; Dacks, Penny A; Devanarayan, Viswanath; Haas, Magali; Khachaturian, Zaven S; Gordon, Mark Forrest; Maudsley, Stuart; Romero, Klaus; Stephenson, Diane

    2016-09-01

    Massive investment and technological advances in the collection of extensive and longitudinal information on thousands of Alzheimer patients result in large amounts of data. These "big-data" databases can potentially advance CNS research and drug development. However, although necessary, they are not sufficient, and we posit that they must be matched with analytical methods that go beyond retrospective data-driven associations with various clinical phenotypes. Although these empirically derived associations can generate novel and useful hypotheses, they need to be organically integrated into a quantitative understanding of the pathology that can be actionable for drug discovery and development. We argue that mechanism-based modeling and simulation approaches, in which existing domain knowledge is formally integrated using complexity science and quantitative systems pharmacology, can be combined with data-driven analytics to generate predictive actionable knowledge for drug discovery programs, target validation, and optimization of clinical development. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
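
    A minimal sketch of what "mechanism-based modeling" can look like in code, assuming an invented one-compartment drug level that inhibits turnover of a biomarker; all parameters and the model structure are arbitrary illustration values, not a validated quantitative systems pharmacology model.

        # Toy turnover (indirect-response) model driven by a decaying drug level.
        import numpy as np
        from scipy.integrate import solve_ivp

        ke, kin, kout, emax, ec50 = 0.1, 1.0, 0.2, 0.8, 2.0

        def rhs(t, y):
            drug, biomarker = y
            inhibition = emax * drug / (ec50 + drug)          # drug slows biomarker production
            return [-ke * drug, kin * (1 - inhibition) - kout * biomarker]

        sol = solve_ivp(rhs, (0, 72), y0=[10.0, kin / kout], t_eval=np.linspace(0, 72, 7))
        for t, b in zip(sol.t, sol.y[1]):
            print(f"t={t:5.1f} h  biomarker={b:4.2f}")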

  2. Video Game Learning Dynamics: Actionable Measures of Multidimensional Learning Trajectories

    ERIC Educational Resources Information Center

    Reese, Debbie Denise; Tabachnick, Barbara G.; Kosko, Robert E.

    2015-01-01

    Valid, accessible, reusable methods for instructional video game design and embedded assessment can provide actionable information enhancing individual and collective achievement. Cyberlearning through game-based, metaphor-enhanced learning objects (CyGaMEs) design and embedded assessment quantify player behavior to study knowledge discovery and…

  3. Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruebel, Oliver

    2009-11-20

    Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The MATLAB analysis framework and the visualization have been integrated, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.
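
    The index/query coupling described above ultimately serves compound range selections over very large particle arrays. FastBit itself uses bitmap indexes; the numpy boolean mask below only illustrates the kind of query involved, with synthetic particle data and invented thresholds.

        # Select candidate "beam" particles with a compound range query (illustrative).
        import numpy as np

        rng = np.random.default_rng(3)
        n = 1_000_000
        px = rng.gamma(shape=2.0, scale=5.0, size=n)       # longitudinal momentum (arb. units)
        x = rng.uniform(0.0, 100.0, size=n)                # longitudinal position

        beam_mask = (px > 60.0) & (x > 80.0)               # crude beam selection criteria
        beam = np.flatnonzero(beam_mask)
        print(f"{beam.size} candidate beam particles "
              f"({100.0 * beam.size / n:.3f}% of the dataset)")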

  4. Scientific discovery as a combinatorial optimisation problem: How best to navigate the landscape of possible experiments?

    PubMed Central

    Kell, Douglas B

    2012-01-01

    A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes. PMID:22252984

  5. Scientific discovery as a combinatorial optimisation problem: how best to navigate the landscape of possible experiments?

    PubMed

    Kell, Douglas B

    2012-03-01

    A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a 'landscape' representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems 'hard', but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the 'best' experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes. Copyright © 2012 WILEY Periodicals, Inc.
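
    In the spirit of the heuristic search strategies discussed in this record, here is a toy evolutionary search over a landscape of possible experiments encoded as bit strings; the hidden fitness function stands in for running a real experiment, and the encoding, population size, and operators are arbitrary illustration choices rather than any specific published algorithm.

        # Minimal evolutionary search over a binary "experiment design" landscape.
        import random

        random.seed(0)
        LENGTH, POP, GENERATIONS = 20, 30, 40

        def fitness(design):                     # unknown landscape being explored
            return sum(design) - 3 * (design[0] ^ design[1])

        population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
        for _ in range(GENERATIONS):
            ranked = sorted(population, key=fitness, reverse=True)
            parents = ranked[: POP // 2]
            children = []
            for _ in range(POP - len(parents)):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, LENGTH)
                child = a[:cut] + b[cut:]                       # crossover
                i = random.randrange(LENGTH)
                child[i] ^= 1                                   # mutation
                children.append(child)
            population = parents + children

        best = max(population, key=fitness)
        print("best design:", best, "fitness:", fitness(best))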

  6. Concept Formation in Scientific Knowledge Discovery from a Constructivist View

    NASA Astrophysics Data System (ADS)

    Peng, Wei; Gero, John S.

    The central goal of scientific knowledge discovery is to learn cause-effect relationships among natural phenomena presented as variables and the consequences of their interactions. Scientific knowledge is normally expressed as scientific taxonomies and qualitative and quantitative laws [1]. This type of knowledge represents intrinsic regularities of the observed phenomena that can be used to explain and predict behaviors of the phenomena. It is a generalization that is abstracted and externalized from a set of contexts and applicable to a broader scope. Scientific knowledge is a type of third-person knowledge, i.e., knowledge that is independent of a specific enquirer. Artificial intelligence approaches, particularly data mining algorithms that are used to identify meaningful patterns from large data sets, are approaches that aim to facilitate the knowledge discovery process [2]. A broad spectrum of algorithms has been developed to address classification, associative learning, and clustering problems. However, their linkages to the people who use them have not been adequately explored. Issues in relation to supporting the interpretation of the patterns, applying prior knowledge to the data mining process, and addressing user interactions remain challenges for building knowledge discovery tools [3]. As a consequence, scientists rely on their experience to formulate problems, evaluate hypotheses, reason about untraceable factors, and derive new problems. This type of knowledge, which they have developed during their career, is called "first-person" knowledge. The formation of scientific knowledge (third-person knowledge) is highly influenced by the enquirer's first-person knowledge construct, which is a result of his or her interactions with the environment. There have been attempts to craft automatic knowledge discovery tools, but these systems are limited in their capability to handle the dynamics of personal experience. There are now trends in developing approaches that assist scientists in applying their expertise to model formation, simulation, and prediction in various domains [4], [5]. On the other hand, first-person knowledge becomes third-person theory only if it proves to be general through evidence and is acknowledged by a scientific community. Researchers have started to focus on building interactive cooperation platforms [1] to accommodate different views in the knowledge discovery process. There are some fundamental questions in relation to scientific knowledge development. What are the major components of knowledge construction, and how do people construct their knowledge? How is this personal construct assimilated and accommodated into a scientific paradigm? How can one design a computational system to facilitate these processes? This chapter does not attempt to answer all these questions but serves as a basis to foster thinking along this line. A brief literature review about how people develop their knowledge is carried out through a constructivist view. A hydrological modeling scenario is presented to elucidate the approach.

  7. Concept Formation in Scientific Knowledge Discovery from a Constructivist View

    NASA Astrophysics Data System (ADS)

    Peng, Wei; Gero, John S.

    The central goal of scientific knowledge discovery is to learn cause-effect relationships among natural phenomena presented as variables and the consequences of their interactions. Scientific knowledge is normally expressed as scientific taxonomies and qualitative and quantitative laws [1]. This type of knowledge represents intrinsic regularities of the observed phenomena that can be used to explain and predict behaviors of the phenomena. It is a generalization that is abstracted and externalized from a set of contexts and applicable to a broader scope. Scientific knowledge is a type of third-person knowledge, i.e., knowledge that is independent of a specific enquirer. Artificial intelligence approaches, particularly data mining algorithms that are used to identify meaningful patterns from large data sets, are approaches that aim to facilitate the knowledge discovery process [2]. A broad spectrum of algorithms has been developed to address classification, associative learning, and clustering problems. However, their linkages to the people who use them have not been adequately explored. Issues in relation to supporting the interpretation of the patterns, applying prior knowledge to the data mining process, and addressing user interactions remain challenges for building knowledge discovery tools [3]. As a consequence, scientists rely on their experience to formulate problems, evaluate hypotheses, reason about untraceable factors, and derive new problems. This type of knowledge, which they have developed during their career, is called "first-person" knowledge. The formation of scientific knowledge (third-person knowledge) is highly influenced by the enquirer's first-person knowledge construct, which is a result of his or her interactions with the environment. There have been attempts to craft automatic knowledge discovery tools, but these systems are limited in their capability to handle the dynamics of personal experience. There are now trends in developing approaches that assist scientists in applying their expertise to model formation, simulation, and prediction in various domains [4], [5]. On the other hand, first-person knowledge becomes third-person theory only if it proves to be general through evidence and is acknowledged by a scientific community. Researchers have started to focus on building interactive cooperation platforms [1] to accommodate different views in the knowledge discovery process. There are some fundamental questions in relation to scientific knowledge development. What are the major components of knowledge construction, and how do people construct their knowledge? How is this personal construct assimilated and accommodated into a scientific paradigm? How can one design a computational system to facilitate these processes? This chapter does not attempt to answer all these questions but serves as a basis to foster thinking along this line. A brief literature review about how people develop their knowledge is carried out through a constructivist view. A hydrological modeling scenario is presented to elucidate the approach.

  8. X-ray crystallography over the past decade for novel drug discovery – where are we heading next?

    PubMed Central

    Zheng, Heping; Handing, Katarzyna B; Zimmerman, Matthew D; Shabalin, Ivan G; Almo, Steven C; Minor, Wladek

    2015-01-01

    Introduction: Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. Areas covered: This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. Expert opinion: X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopic and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible. PMID:26177814

  9. Using JournalMap to improve ecological knowledge discovery and visualization

    USDA-ARS?s Scientific Manuscript database

    Background/Question/Methods Most of the ecological research conducted around the world is tied to specific places, however, that location information is locked up in the text and figures of scientific articles in myriad forms that are not easily searchable. While access to ecological literature ha...

  10. Knowledge Discovery and Data Mining: An Overview

    NASA Technical Reports Server (NTRS)

    Fayyad, U.

    1995-01-01

    The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.

  11. 12 CFR 263.53 - Discovery depositions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 12 Banks and Banking 4 2014-01-01 2014-01-01 false Discovery depositions. 263.53 Section 263.53... Discovery depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...

  12. 12 CFR 263.53 - Discovery depositions.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 12 Banks and Banking 4 2012-01-01 2012-01-01 false Discovery depositions. 263.53 Section 263.53... Discovery depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...

  13. An integrative model for in-silico clinical-genomics discovery science.

    PubMed

    Lussier, Yves A; Sarkar, Indra Nell; Cantor, Michael

    2002-01-01

    Human Genome discovery research has set the pace for Post-Genomic Discovery Research. While post-genomic fields focused at the molecular level are intensively pursued, little effort is being deployed in the later stages of molecular medicine discovery research, such as clinical-genomics. The objective of this study is to demonstrate the relevance and significance of integrating mainstream clinical informatics decision support systems with current bioinformatics genomic discovery science. This paper presents an original model enabling novel "in-silico" clinical-genomic discovery science and demonstrates its feasibility. This model is designed to mediate queries among clinical and genomic knowledge bases with relevant bioinformatic analytic tools (e.g. gene clustering). Briefly, trait-disease-gene relationships were successfully illustrated using QMR, OMIM, SNOMED-RT, GeneCluster and TreeView. The analyses were visualized as two-dimensional dendrograms of clinical observations clustered around genes. To our knowledge, this is the first study using knowledge bases of clinical decision support systems for genomic discovery. Although this study is a proof of principle, it provides a framework for the development of clinical decision-support-system driven, high-throughput clinical-genomic technologies which could potentially unveil significant high-level functions of genes.
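
    The clustering step described above can be illustrated with a minimal sketch: a hypothetical binary matrix of clinical observations (rows) by genes (columns) is clustered hierarchically, echoing the two-dimensional dendrograms reported in the study. The gene labels and trait names below are illustrative assumptions, and SciPy stands in for the GeneCluster/TreeView tools named above.

        # Minimal sketch (illustrative data): cluster clinical observations by their
        # gene-association profiles and read off the dendrogram leaf order.
        import numpy as np
        from scipy.cluster.hierarchy import dendrogram, linkage

        observations = ["tall stature", "lens dislocation", "aortic dilation", "arachnodactyly"]
        genes = ["GENE_A", "GENE_B", "GENE_C"]          # hypothetical gene columns
        matrix = np.array([[1, 1, 0],                   # 1 = observation associated with gene
                           [1, 0, 0],
                           [1, 1, 1],
                           [1, 0, 0]])

        links = linkage(matrix, method="average", metric="jaccard")
        tree = dendrogram(links, labels=observations, no_plot=True)
        print(tree["ivl"])                              # clustered order of the observations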

  14. Data Mining.

    ERIC Educational Resources Information Center

    Benoit, Gerald

    2002-01-01

    Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…

  15. Competitive-Cooperative Automated Reasoning from Distributed and Multiple Source of Data

    NASA Astrophysics Data System (ADS)

    Fard, Amin Milani

    Knowledge extraction from distributed database systems has been investigated during the past decade in order to analyze billions of information records. In this work, a competitive deduction approach in a heterogeneous data grid environment is proposed using classic data mining and statistical methods. By applying a game theory concept in a multi-agent model, we tried to design a policy for hierarchical knowledge discovery and inference fusion. To demonstrate the system in operation, a sample multi-expert system has also been developed.

  16. The emergence of translational epidemiology: from scientific discovery to population health impact.

    PubMed

    Khoury, Muin J; Gwinn, Marta; Ioannidis, John P A

    2010-09-01

    Recent emphasis on translational research (TR) is highlighting the role of epidemiology in translating scientific discoveries into population health impact. The authors present applications of epidemiology in TR through 4 phases designated T1-T4, illustrated by examples from human genomics. In T1, epidemiology explores the role of a basic scientific discovery (e.g., a disease risk factor or biomarker) in developing a "candidate application" for use in practice (e.g., a test used to guide interventions). In T2, epidemiology can help to evaluate the efficacy of a candidate application by using observational studies and randomized controlled trials. In T3, epidemiology can help to assess facilitators and barriers for uptake and implementation of candidate applications in practice. In T4, epidemiology can help to assess the impact of using candidate applications on population health outcomes. Epidemiology also has a leading role in knowledge synthesis, especially using quantitative methods (e.g., meta-analysis). To explore the emergence of TR in epidemiology, the authors compared articles published in selected issues of the Journal in 1999 and 2009. The proportion of articles identified as translational doubled from 16% (11/69) in 1999 to 33% (22/66) in 2009 (P = 0.02). Epidemiology is increasingly recognized as an important component of TR. By quantifying and integrating knowledge across disciplines, epidemiology provides crucial methods and tools for TR.
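
    The reported doubling of translational articles (11/69 in 1999 versus 22/66 in 2009, P = 0.02) can be checked approximately with a two-proportion comparison; the exact test used by the authors is not stated, so the chi-squared test below is only an illustrative stand-in.

        # Illustrative two-proportion comparison of the counts reported above.
        from scipy.stats import chi2_contingency

        table = [[11, 69 - 11],   # 1999: translational vs. non-translational articles
                 [22, 66 - 22]]   # 2009
        chi2, p, dof, expected = chi2_contingency(table, correction=False)
        print(f"1999: {11/69:.0%}, 2009: {22/66:.0%}, p = {p:.3f}")   # p is roughly 0.02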

  17. The Emergence of Translational Epidemiology: From Scientific Discovery to Population Health Impact

    PubMed Central

    Khoury, Muin J.; Gwinn, Marta; Ioannidis, John P. A.

    2010-01-01

    Recent emphasis on translational research (TR) is highlighting the role of epidemiology in translating scientific discoveries into population health impact. The authors present applications of epidemiology in TR through 4 phases designated T1–T4, illustrated by examples from human genomics. In T1, epidemiology explores the role of a basic scientific discovery (e.g., a disease risk factor or biomarker) in developing a “candidate application” for use in practice (e.g., a test used to guide interventions). In T2, epidemiology can help to evaluate the efficacy of a candidate application by using observational studies and randomized controlled trials. In T3, epidemiology can help to assess facilitators and barriers for uptake and implementation of candidate applications in practice. In T4, epidemiology can help to assess the impact of using candidate applications on population health outcomes. Epidemiology also has a leading role in knowledge synthesis, especially using quantitative methods (e.g., meta-analysis). To explore the emergence of TR in epidemiology, the authors compared articles published in selected issues of the Journal in 1999 and 2009. The proportion of articles identified as translational doubled from 16% (11/69) in 1999 to 33% (22/66) in 2009 (P = 0.02). Epidemiology is increasingly recognized as an important component of TR. By quantifying and integrating knowledge across disciplines, epidemiology provides crucial methods and tools for TR. PMID:20688899

  18. From Residency to Lifelong Learning.

    PubMed

    Brandt, Keith

    2015-11-01

    The residency training experience is the perfect environment for learning. The university/institution patient population provides a never-ending supply of patients with unique management challenges. Resources abound that allow the discovery of knowledge about similar situations. Senior teachers provide counseling and help direct appropriate care. Periodic testing and evaluations identify deficiencies, which can be corrected with future study. What happens, however, when the resident graduates? Do they possess all the knowledge they'll need for the rest of their career? Will medical discovery stand still, limiting the need for future study? If initial certification establishes that the physician has the skills and knowledge to function as an independent physician and surgeon, how do we assure the public that plastic surgeons will practice lifelong learning and remain safe throughout their career? Enter Maintenance of Certification (MOC). In an ideal world, MOC would provide many of the same tools as residency training: identification of gaps in knowledge, resources to correct those deficiencies, overall assessment of knowledge, feedback about communication skills and professionalism, and methods to evaluate and improve one's practice. This article discusses the need for education and self-assessment that extends beyond residency training, and for a commitment to lifelong learning. The American Board of Plastic Surgery MOC program is described to demonstrate how it helps the diplomate reach the goal of continuous practice improvement.

  19. Resource Discovery within the Networked "Hybrid" Library.

    ERIC Educational Resources Information Center

    Leigh, Sally-Anne

    This paper focuses on the development, adoption, and integration of resource discovery, knowledge management, and/or knowledge sharing interfaces such as interactive portals, and the use of the library's World Wide Web presence to increase the availability and usability of information services. The introduction addresses changes in library…

  20. A biological compression model and its applications.

    PubMed

    Cao, Minh Duc; Dix, Trevor I; Allison, Lloyd

    2011-01-01

    A biological compression model, the expert model, is presented, which is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence alignment and phylogenetic analysis. We demonstrate that the model can handle statistically biased sequences and distantly related sequences where conventional knowledge discovery tools often fail.
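
    The expert model itself is not reproduced here, but the general idea of compression-driven sequence comparison can be sketched with a generic compressor (zlib) and the normalized compression distance; this is a simplified stand-in, not the authors' algorithm, and the sequences are hypothetical.

        # Compression-based similarity via normalized compression distance (NCD),
        # with zlib as a generic stand-in compressor.
        import zlib

        def clen(s: bytes) -> int:
            return len(zlib.compress(s, 9))             # compressed length in bytes

        def ncd(x: bytes, y: bytes) -> float:
            cx, cy, cxy = clen(x), clen(y), clen(x + y)
            return (cxy - min(cx, cy)) / max(cx, cy)

        seq_a = b"ACGTACGTACGTACGTTGCA" * 20            # hypothetical sequences
        seq_b = b"ACGTACGAACGTACGTTGCA" * 20            # close relative of seq_a
        seq_c = b"TTGGCCAATTGGCCAATTAA" * 20            # unrelated sequence
        print(ncd(seq_a, seq_b), ncd(seq_a, seq_c))     # smaller value = more similar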

  1. Basics of Antibody Phage Display Technology.

    PubMed

    Ledsgaard, Line; Kilstrup, Mogens; Karatt-Vellatt, Aneesh; McCafferty, John; Laustsen, Andreas H

    2018-06-09

    Antibody discovery has become increasingly important in almost all areas of modern medicine. Different antibody discovery approaches exist, but one that has gained increasing interest in the field of toxinology and antivenom research is phage display technology. In this review, the lifecycle of the M13 phage and the basics of phage display technology are presented together with important factors influencing the success rates of phage display experiments. Moreover, the pros and cons of different antigen display methods and the use of naïve versus immunized phage display antibody libraries are discussed, and selected examples from the field of antivenom research are highlighted. This review thus provides in-depth knowledge on the principles and use of phage display technology with a special focus on discovery of antibodies that target animal toxins.

  2. Prediction of intracellular exposure bridges the gap between target- and cell-based drug discovery

    PubMed Central

    Gordon, Laurie J.; Wayne, Gareth J.; Almqvist, Helena; Axelsson, Hanna; Seashore-Ludlow, Brinton; Treyer, Andrea; Lundbäck, Thomas; West, Andy; Hann, Michael M.; Artursson, Per

    2017-01-01

    Inadequate target exposure is a major cause of high attrition in drug discovery. Here, we show that a label-free method for quantifying the intracellular bioavailability (Fic) of drug molecules predicts drug access to intracellular targets and hence, pharmacological effect. We determined Fic in multiple cellular assays and cell types representing different targets from a number of therapeutic areas, including cancer, inflammation, and dementia. Both cytosolic targets and targets localized in subcellular compartments were investigated. Fic gives insights on membrane-permeable compounds in terms of cellular potency and intracellular target engagement, compared with biochemical potency measurements alone. Knowledge of the amount of drug that is locally available to bind intracellular targets provides a powerful tool for compound selection in early drug discovery. PMID:28701380

  3. Label-assisted mass spectrometry for the acceleration of reaction discovery and optimization

    NASA Astrophysics Data System (ADS)

    Cabrera-Pardo, Jaime R.; Chai, David I.; Liu, Song; Mrksich, Milan; Kozmin, Sergey A.

    2013-05-01

    The identification of new reactions expands our knowledge of chemical reactivity and enables new synthetic applications. Accelerating the pace of this discovery process remains challenging. We describe a highly effective and simple platform for screening a large number of potential chemical reactions in order to discover and optimize previously unknown catalytic transformations, thereby revealing new chemical reactivity. Our strategy is based on labelling one of the reactants with a polyaromatic chemical tag, which selectively undergoes a photoionization/desorption process upon laser irradiation, without the assistance of an external matrix, and enables rapid mass spectrometric detection of any products originating from such labelled reactants in complex reaction mixtures without any chromatographic separation. This method was successfully used for high-throughput discovery and subsequent optimization of two previously unknown benzannulation reactions.

  4. Advanced Computing Methods for Knowledge Discovery and Prognosis in Acoustic Emission Monitoring

    ERIC Educational Resources Information Center

    Mejia, Felipe

    2012-01-01

    Structural health monitoring (SHM) has gained significant popularity in the last decade. This growing interest, coupled with new sensing technologies, has resulted in an overwhelming amount of data in need of management and useful interpretation. Acoustic emission (AE) testing has been particularly fraught by the problem of growing data and is…

  5. Design and Implementation of a Prototype Ontology Aided Knowledge Discovery Assistant (OAKDA) Application

    DTIC Science & Technology

    2006-12-01

    speed of search engines improves the efficiency of such methods, effectiveness is not improved. The objective of this thesis is to construct and test...interest, users are assisted in finding a relevant set of key terms that will aid the search engines in narrowing, widening, or refocusing a Web search

  6. Analytical considerations for mass spectrometry profiling in serum biomarker discovery.

    PubMed

    Whiteley, Gordon R; Colantonio, Simona; Sacconi, Andrea; Saul, Richard G

    2009-03-01

    The potential of using mass spectrometry profiling as a diagnostic tool has been demonstrated for a wide variety of diseases. Various cancers and cancer-related diseases have been the focus of much of this work because of both the paucity of good diagnostic markers and the knowledge that early diagnosis is the most powerful weapon in treating cancer. The implementation of mass spectrometry as a routine diagnostic tool has proved to be difficult, however, primarily because of the stringent controls that are required for the method to be reproducible. The method is evolving as a powerful guide to the discovery of biomarkers that could, in turn, be used either individually or in an array or panel of tests for early disease detection. Using proteomic patterns to guide biomarker discovery and the possibility of deployment in the clinical laboratory environment on current instrumentation or in a hybrid technology has the possibility of being the early diagnosis tool that is needed.

  7. Ontology-based structured cosine similarity in document summarization: with applications to mobile audio-based knowledge management.

    PubMed

    Yuan, Soe-Tsyr; Sun, Jerry

    2005-10-01

    Development of algorithms for automated text categorization in massive text document sets is an important research area of data mining and knowledge discovery. Most text-clustering methods are grounded in term-based measurements of distance or similarity, ignoring the structure of the documents. In this paper, we present a novel method named structured cosine similarity (SCS) that furnishes document clustering with a new way of modeling based on document summarization, considering the structure of the documents so as to improve the performance of document clustering in terms of quality, stability, and efficiency. This study was motivated by the problem of clustering speech documents (which lack rich document features) obtained from the wireless oral sharing of experiences conducted by the mobile workforce of enterprises, thus fulfilling audio-based knowledge management. In other words, this problem aims to facilitate knowledge acquisition and sharing by speech. The evaluations also show fairly promising results for our method of structured cosine similarity.
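
    A minimal sketch of the idea follows, under the simplifying assumption that a "structured" similarity averages per-section cosine scores over the sections two documents share; the paper's exact SCS formulation may differ, and the example documents are hypothetical.

        # Per-section cosine similarities averaged over shared sections.
        import math
        from collections import Counter

        def cosine(a: Counter, b: Counter) -> float:
            dot = sum(a[t] * b[t] for t in a)
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0

        def structured_cosine(doc1: dict, doc2: dict) -> float:
            shared = doc1.keys() & doc2.keys()          # doc = {section_name: text}
            if not shared:
                return 0.0
            scores = [cosine(Counter(doc1[s].lower().split()),
                             Counter(doc2[s].lower().split())) for s in shared]
            return sum(scores) / len(scores)

        d1 = {"problem": "pump pressure drops at startup", "fix": "replace worn seal"}
        d2 = {"problem": "pressure drops when the pump starts", "fix": "seal replacement"}
        print(structured_cosine(d1, d2))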

  8. Form-Focused Discovery Activities in English Classes

    ERIC Educational Resources Information Center

    Ogeyik, Muhlise Cosgun

    2011-01-01

    Form-focused discovery activities allow language learners to grasp various aspects of a target language by contributing implicit knowledge by using discovered explicit knowledge. Moreover, such activities can assist learners to perceive and discover the features of their language input. In foreign language teaching environments, they can be used…

  9. Placental Proteomics: A Shortcut to Biological Insight

    PubMed Central

    Robinson, John M.; Vandré, Dale D.; Ackerman, William E.

    2012-01-01

    Proteomics analysis of biological samples has the potential to identify novel protein expression patterns and/or changes in protein expression patterns in different developmental or disease states. An important component of successful proteomics research, at least in its present form, is to reduce the complexity of the sample if it is derived from cells or tissues. One method to simplify complex tissues is to focus on a specific, highly purified sub-proteome. Using this approach we have developed methods to prepare highly enriched fractions of the apical plasma membrane of the syncytiotrophoblast. Through proteomics analysis of this fraction we have identified over five hundred proteins, several of which were previously not known to reside in the syncytiotrophoblast. Herein, we focus on two of these, dysferlin and myoferlin. These proteins, largely known from studies of skeletal muscle, may not have been found in the human placenta were it not for discovery-based proteomics analysis. This new knowledge, acquired through a discovery-driven approach, can now be applied for the generation of hypothesis-based experimentation. Thus discovery-based and hypothesis-based research are complementary approaches that, when coupled together, can hasten scientific discoveries. PMID:19070895

  10. Order priors for Bayesian network discovery with an application to malware phylogeny

    DOE PAGES

    Oyen, Diane; Anderson, Blake; Sentz, Kari; ...

    2017-09-15

    Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges) in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.
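
    A minimal sketch of the order-prior idea, not the authors' algorithm: a structure score combines a data-fit term with a log-prior that penalizes candidate edges contradicting a believed partial ordering of variables. The variable names and penalty strength are illustrative assumptions.

        # A structure score = data fit + log-prior over orderings (hypothetical values).
        def order_prior_log(edges, partial_order, strength=2.0):
            # partial_order holds pairs (u, v) meaning "u is believed to precede v";
            # an edge v -> u then counts as a violation.
            violations = sum(1 for u, v in edges if (v, u) in partial_order)
            return -strength * violations

        def structure_score(log_likelihood, edges, partial_order):
            return log_likelihood + order_prior_log(edges, partial_order)

        partial_order = {("variant", "family")}          # prior: variant precedes family
        consistent = [("variant", "family")]
        violating = [("family", "variant")]
        print(structure_score(-120.0, consistent, partial_order),
              structure_score(-119.5, violating, partial_order))   # the prior flips the ranking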

  11. Order priors for Bayesian network discovery with an application to malware phylogeny

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oyen, Diane; Anderson, Blake; Sentz, Kari

    Here, Bayesian networks have been used extensively to model and discover dependency relationships among sets of random variables. We learn Bayesian network structure with a combination of human knowledge about the partial ordering of variables and statistical inference of conditional dependencies from observed data. Our approach leverages complementary information from human knowledge and inference from observed data to produce networks that reflect human beliefs about the system as well as to fit the observed data. Applying prior beliefs about partial orderings of variables is an approach distinctly different from existing methods that incorporate prior beliefs about direct dependencies (or edges) in a Bayesian network. We provide an efficient implementation of the partial-order prior in a Bayesian structure discovery learning algorithm, as well as an edge prior, showing that both priors meet the local modularity requirement necessary for an efficient Bayesian discovery algorithm. In benchmark studies, the partial-order prior improves the accuracy of Bayesian network structure learning as well as the edge prior, even though order priors are more general. Our primary motivation is in characterizing the evolution of families of malware to aid cyber security analysts. For the problem of malware phylogeny discovery, we find that our algorithm, compared to existing malware phylogeny algorithms, more accurately discovers true dependencies that are missed by other algorithms.

  12. 75 FR 66766 - NIAID Blue Ribbon Panel Meeting on Adjuvant Discovery and Development

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-10-29

    ..., identifies gaps in knowledge and capabilities, and defines NIAID's goals for the continued discovery... DEPARTMENT OF HEALTH AND HUMAN SERVICES NIAID Blue Ribbon Panel Meeting on Adjuvant Discovery and... agenda for the discovery, development and clinical evaluation of adjuvants for use with preventive...

  13. 12 CFR 263.53 - Discovery depositions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 12 Banks and Banking 3 2011-01-01 2011-01-01 false Discovery depositions. 263.53 Section 263.53... depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...

  14. 12 CFR 19.170 - Discovery depositions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 12 Banks and Banking 1 2010-01-01 2010-01-01 false Discovery depositions. 19.170 Section 19.170... PROCEDURE Discovery Depositions and Subpoenas § 19.170 Discovery depositions. (a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...

  15. 12 CFR 19.170 - Discovery depositions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 12 Banks and Banking 1 2011-01-01 2011-01-01 false Discovery depositions. 19.170 Section 19.170... PROCEDURE Discovery Depositions and Subpoenas § 19.170 Discovery depositions. (a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...

  16. 12 CFR 263.53 - Discovery depositions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 12 Banks and Banking 3 2010-01-01 2010-01-01 false Discovery depositions. 263.53 Section 263.53... depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...

  17. BioGraph: unsupervised biomedical knowledge discovery via automated hypothesis generation

    PubMed Central

    2011-01-01

    We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be. PMID:21696594

  18. Medical knowledge discovery and management.

    PubMed

    Prior, Fred

    2009-05-01

    Although the volume of medical information is growing rapidly, the ability to convert these data quickly into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated, it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally, knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.

  19. Taking stock of current societal, political and academic stakeholders in the Canadian healthcare knowledge translation agenda

    PubMed Central

    Newton, Mandi S; Scott-Findlay, Shannon

    2007-01-01

    Background In the past 15 years, knowledge translation in healthcare has emerged as a multifaceted and complex agenda. Theoretical and polemical discussions, the development of a science to study and measure the effects of translating research evidence into healthcare, and the role of key stakeholders including academe, healthcare decision-makers, the public, and government funding bodies have brought scholarly, organizational, social, and political dimensions to the agenda. Objective This paper discusses the current knowledge translation agenda in Canadian healthcare and how elements in this agenda shape the discovery and translation of health knowledge. Discussion The current knowledge translation agenda in Canadian healthcare involves the influence of values, priorities, and people; stakes which greatly shape the discovery of research knowledge and how it is or is not instituted in healthcare delivery. As this agenda continues to take shape and direction, ensuring that it is accountable for its influences is essential and should be at the forefront of concern to the Canadian public and healthcare community. This transparency will allow for scrutiny, debate, and improvements in health knowledge discovery and health services delivery. PMID:17916256

  20. Concept of operations for knowledge discovery from Big Data across enterprise data warehouses

    NASA Astrophysics Data System (ADS)

    Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.

    2013-05-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, the next generation of analysts is facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering are newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.

  1. A Thematic Analysis of Theoretical Models for Translational Science in Nursing: Mapping the Field

    PubMed Central

    Mitchell, Sandra A.; Fisher, Cheryl A.; Hastings, Clare E.; Silverman, Leanne B.; Wallen, Gwenyth R.

    2010-01-01

    Background: The quantity and diversity of conceptual models in translational science may complicate rather than advance the use of theory. Purpose: This paper offers a comparative thematic analysis of the models available to inform knowledge development, transfer, and utilization. Method: Literature searches identified 47 models for knowledge translation. Four thematic areas emerged: (1) evidence-based practice and knowledge transformation processes; (2) strategic change to promote adoption of new knowledge; (3) knowledge exchange and synthesis for application and inquiry; (4) designing and interpreting dissemination research. Discussion: This analysis distinguishes the contributions made by leaders and researchers at each phase in the process of discovery, development, and service delivery. It also informs the selection of models to guide activities in knowledge translation. Conclusions: A flexible theoretical stance is essential to simultaneously develop new knowledge and accelerate the translation of that knowledge into practice behaviors and programs of care that support optimal patient outcomes. PMID:21074646

  2. The center for causal discovery of biomedical knowledge from big data.

    PubMed

    Cooper, Gregory F; Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard

    2015-11-01

    The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  3. How does non-formal marine education affect student attitude and knowledge? A case study using SCDNR's Discovery program

    NASA Astrophysics Data System (ADS)

    McGovern, Mary Francis

    Non-formal environmental education provides students the opportunity to learn in ways that would not be possible in a traditional classroom setting. Outdoor learning allows students to make connections to their environment and helps to foster an appreciation for nature. This type of education can be interdisciplinary: students not only develop skills in science, but also in mathematics, social studies, technology, and critical thinking. This case study focuses on a non-formal marine education program, the South Carolina Department of Natural Resources' (SCDNR) Discovery vessel-based program. The Discovery curriculum was evaluated to determine its impact on student knowledge about, and attitude toward, the estuary. Students from two South Carolina coastal counties who attended the boat program during fall 2014 were asked to complete a brief survey before, immediately after, and two weeks following the program. The results of this study indicate that both student knowledge and attitude significantly improved after completion of the Discovery vessel-based program. Knowledge and attitude scores demonstrated a positive correlation.

  4. Diagnostic games: from adequate formalization of clinical experience to structure discovery.

    PubMed

    Shifrin, Michael A; Kasparova, Eva I

    2008-01-01

    A method of obtaining well-founded and reproducible results in clinical decision making is presented. It is based on "diagnostic games", a procedure of elicitation and formalization of experts' knowledge and experience. The use of this procedure allows formulating decision rules in the terms of an adequate language, that are both unambiguous and clinically clear.

  5. Bioinformatics in protein kinases regulatory network and drug discovery.

    PubMed

    Chen, Qingfeng; Luo, Haiqiong; Zhang, Chengqi; Chen, Yi-Ping Phoebe

    2015-04-01

    Protein kinases have been implicated in a number of diseases, as kinases participate in many aspects that control cell growth, movement and death. Deregulated kinase activities and the knowledge of these disorders are of great clinical interest for drug discovery. The most critical issue is the development of safe and efficient disease diagnosis and treatment at lower cost and in less time. It is critical to develop innovative approaches that aim at the root cause of a disease, not just its symptoms. Bioinformatics, including genetic, genomic, mathematical and computational technologies, has become the most promising option for effective drug discovery, and has shown its potential in the early stages of drug-target identification and target validation. It is essential that these aspects are understood and integrated into new methods used in drug discovery for diseases arising from deregulated kinase activity. This article reviews bioinformatics techniques for protein kinase data management and analysis, kinase pathways and drug targets, and describes their potential application in the pharmaceutical industry. Copyright © 2015 Elsevier Inc. All rights reserved.

  6. Cross-organism learning method to discover new gene functionalities.

    PubMed

    Domeniconi, Giacomo; Masseroli, Marco; Moro, Gianluca; Pinoli, Pietro

    2016-04-01

    Knowledge of gene and protein functions is paramount for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. Analyses for biomedical knowledge discovery greatly benefit from the availability of gene and protein functional feature descriptions expressed through controlled terminologies and ontologies, i.e., of gene and protein biomedical controlled annotations. In the last years, several databases of such annotations have become available; yet, these valuable annotations are incomplete, include errors and only some of them represent highly reliable human curated information. Computational techniques able to reliably predict new gene or protein annotations with an associated likelihood value are thus paramount. Here, we propose a novel cross-organisms learning approach to reliably predict new functionalities for the genes of an organism based on the known controlled annotations of the genes of another, evolutionarily related and better studied, organism. We leverage a new representation of the annotation discovery problem and a random perturbation of the available controlled annotations to allow the application of supervised algorithms to predict with good accuracy unknown gene annotations. Taking advantage of the numerous gene annotations available for a well-studied organism, our cross-organisms learning method creates and trains better prediction models, which can then be applied to predict new gene annotations of a target organism. We tested and compared our method with the equivalent single organism approach on different gene annotation datasets of five evolutionarily related organisms (Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum). Results show both the usefulness of the perturbation method of available annotations for better prediction model training and a great improvement of the cross-organism models with respect to the single-organism ones, without influence of the evolutionary distance between the considered organisms. The generated ranked lists of reliably predicted annotations, which describe novel gene functionalities and have an associated likelihood value, are very valuable both to complement available annotations, for better coverage in biomedical knowledge discovery analyses, and to quicken the annotation curation process, by focusing it on the prioritized novel annotations predicted. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
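
    A minimal sketch of the supervised step, under strong simplifications (hypothetical random data, scikit-learn logistic regression standing in for the paper's models): genes of a well-studied source organism, described by their annotation profiles, train a per-term classifier whose probability outputs rank candidate annotations for target-organism genes.

        # Hypothetical data: a binary gene x annotation-term matrix for a source
        # organism; one term is held out and predicted from the remaining terms.
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(0)
        source = rng.integers(0, 2, size=(200, 30))      # 200 genes x 30 annotation terms
        features, held_out_term = source[:, :-1], source[:, -1]

        model = LogisticRegression(max_iter=1000).fit(features, held_out_term)

        target_genes = rng.integers(0, 2, size=(50, 29)) # target-organism annotation profiles
        likelihood = model.predict_proba(target_genes)[:, 1]
        print(np.argsort(likelihood)[::-1][:5])          # top-ranked candidate genes for the term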

  7. Ontology-Based Search of Genomic Metadata.

    PubMed

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.
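
    A minimal sketch of ontology-aided metadata search, assuming a toy synonym map in place of the UMLS-backed Semantic Knowledge Base described above; the dataset identifiers and terms are illustrative only.

        # Query expansion with a toy ontology before matching dataset metadata.
        toy_ontology = {
            "muscle cell": {"myocyte", "cardiomyocyte"},
            "k562": {"chronic myelogenous leukemia cell line"},
        }
        datasets = [
            {"id": "DS-001", "metadata": "ChIP-seq on cardiomyocyte, H3K27ac"},
            {"id": "DS-002", "metadata": "RNA-seq on K562 chronic myelogenous leukemia cell line"},
            {"id": "DS-003", "metadata": "DNase-seq on liver tissue"},
        ]

        def semantic_search(query: str):
            terms = {query.lower()} | toy_ontology.get(query.lower(), set())
            return [d["id"] for d in datasets
                    if any(t in d["metadata"].lower() for t in terms)]

        print(semantic_search("muscle cell"))   # DS-001 is found only via the expansion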

  8. A bioinformatics knowledge discovery in text application for grid computing.

    PubMed

    Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco

    2009-06-16

    A fundamental activity in biomedical research is Knowledge Discovery, which involves searching through large amounts of biomedical information such as documents and data. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to run the many knowledge discovery applications on scalable and distributed computing systems and thus achieve intensive use of ICT resources. The development of a grid application for Knowledge Discovery in Text using a middleware-based methodology is presented. The system must be able to execute a user application model and process the jobs, creating many parallel jobs to distribute across the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operational requirements led to the design of a middleware that is specialized using user application modules. It includes a graphical user interface for accessing a node search system, a load balancing system, and a transfer optimizer to reduce communication costs. A prototype middleware solution and its performance evaluation in terms of the speed-up factor are shown. It was written in Java on Globus Toolkit 4 to build the grid infrastructure on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Databases computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities.
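
    A minimal sketch of the job-splitting idea, with Python multiprocessing standing in for the Globus-based grid nodes and a trivial keyword matcher standing in for the named-entity recognition user module; the documents and keywords are hypothetical.

        # Split a document collection into parallel jobs; multiprocessing stands in
        # for grid nodes, keyword matching stands in for the NER user module.
        from multiprocessing import Pool

        def find_entities(doc: str):
            keywords = ("fever", "cough", "pneumonia")   # hypothetical entity list
            return [k for k in keywords if k in doc.lower()]

        if __name__ == "__main__":
            documents = ["Patient with fever and cough.", "No findings.",
                         "Radiology suggests pneumonia."] * 1000
            with Pool(8) as pool:                        # 8 "nodes"
                results = pool.map(find_entities, documents, chunksize=100)
            print(sum(1 for r in results if r), "documents mention a symptom or pathology")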

  9. Enhancing Learning Environments through Solution-based Knowledge Discovery Tools: Forecasting for Self-Perpetuating Systemic Reform.

    ERIC Educational Resources Information Center

    Tsantis, Linda; Castellani, John

    2001-01-01

    This article explores how knowledge-discovery applications can empower educators with the information they need to provide anticipatory guidance for teaching and learning, forecast school and district needs, and find critical markers for making the best program decisions for children and youth with disabilities. Data mining for schools is…

  10. Application of Knowledge Discovery in Databases Methodologies for Predictive Models for Pregnancy Adverse Events

    ERIC Educational Resources Information Center

    Taft, Laritza M.

    2010-01-01

    In its report "To Err is Human", The Institute of Medicine recommended the implementation of internal and external voluntary and mandatory automatic reporting systems to increase detection of adverse events. Knowledge Discovery in Databases (KDD) allows the detection of patterns and trends that would be hidden or less detectable if analyzed by…

  11. Knowledge Discovery Process: Case Study of RNAV Adherence of Radar Track Data

    NASA Technical Reports Server (NTRS)

    Matthews, Bryan

    2018-01-01

    This talk is an introduction to the knowledge discovery process, beginning with: identifying the problem, choosing data sources, matching the appropriate machine learning tools, and reviewing the results. The overview will be given in the context of an ongoing study that is assessing RNAV adherence of commercial aircraft in the national airspace.

  12. A Virtual Bioinformatics Knowledge Environment for Early Cancer Detection

    NASA Technical Reports Server (NTRS)

    Crichton, Daniel; Srivastava, Sudhir; Johnsey, Donald

    2003-01-01

    Discovery of disease biomarkers for cancer is a leading focus of early detection. The National Cancer Institute created a network of collaborating institutions focused on the discovery and validation of cancer biomarkers called the Early Detection Research Network (EDRN). Informatics plays a key role in enabling a virtual knowledge environment that provides scientists real time access to distributed data sets located at research institutions across the nation. The distributed and heterogeneous nature of the collaboration makes data sharing across institutions very difficult. EDRN has developed a comprehensive informatics effort focused on developing a national infrastructure enabling seamless access, sharing and discovery of science data resources across all EDRN sites. This paper will discuss the EDRN knowledge system architecture, its objectives and its accomplishments.

  13. Empowering Accelerated Personal, Professional and Scholarly Discovery among Information Seekers: An Educational Vision

    ERIC Educational Resources Information Center

    Harmon, Glynn

    2013-01-01

    The term discovery applies herein to the successful outcome of inquiry in which a significant personal, professional or scholarly breakthrough or insight occurs, and which is individually or socially acknowledged as a key contribution to knowledge. Since discoveries culminate at fixed points in time, discoveries can serve as an outcome metric for…

  14. A Framework of Knowledge Integration and Discovery for Supporting Pharmacogenomics Target Predication of Adverse Drug Events: A Case Study of Drug-Induced Long QT Syndrome.

    PubMed

    Jiang, Guoqian; Wang, Chen; Zhu, Qian; Chute, Christopher G

    2013-01-01

    Knowledge-driven text mining is becoming an important research area for identifying pharmacogenomics target genes. However, few of such studies have been focused on the pharmacogenomics targets of adverse drug events (ADEs). The objective of the present study is to build a framework of knowledge integration and discovery that aims to support pharmacogenomics target predication of ADEs. We integrate a semantically annotated literature corpus Semantic MEDLINE with a semantically coded ADE knowledgebase known as ADEpedia using a semantic web based framework. We developed a knowledge discovery approach combining a network analysis of a protein-protein interaction (PPI) network and a gene functional classification approach. We performed a case study of drug-induced long QT syndrome for demonstrating the usefulness of the framework in predicting potential pharmacogenomics targets of ADEs.
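
    A minimal sketch of the network-analysis component, not the authors' framework: degree centrality over a small hypothetical protein-protein interaction network is used to prioritize candidate targets; the network itself is illustrative.

        # Degree centrality over a hypothetical PPI network (requires networkx).
        import networkx as nx

        ppi_edges = [("KCNH2", "KCNE2"), ("KCNH2", "SCN5A"), ("KCNQ1", "KCNE1"),
                     ("KCNQ1", "KCNH2"), ("SCN5A", "CAV3")]
        g = nx.Graph(ppi_edges)

        centrality = nx.degree_centrality(g)
        ranked = sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)
        print(ranked[:3])                                # most connected candidate targets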

  15. Building Knowledge Graphs for NASA's Earth Science Enterprise

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Lee, T. J.; Ramachandran, R.; Shi, R.; Bao, Q.; Gatlin, P. N.; Weigel, A. M.; Maskey, M.; Miller, J. J.

    2016-12-01

    Inspired by Google Knowledge Graph, we have been building a prototype Knowledge Graph for Earth scientists, connecting information and data in NASA's Earth science enterprise. Our primary goal is to advance the state-of-the-art NASA knowledge extraction capability by going beyond traditional catalog search and linking different distributed information (such as data, publications, services, tools and people). This will enable a more efficient pathway to knowledge discovery. While Google Knowledge Graph provides impressive semantic-search and aggregation capabilities, it is limited to search topics for the general public. We use a similar knowledge graph approach to semantically link information gathered from a wide variety of sources within the NASA Earth Science enterprise. Our prototype serves as a proof of concept on the viability of building an operational "knowledge base" system for NASA Earth science. Information is pulled from structured sources (such as the NASA CMR catalog, GCMD, and Climate and Forecast Conventions) and unstructured sources (such as research papers). Leveraging modern techniques of machine learning, information retrieval, and deep learning, we provide an integrated data mining and information discovery environment to help Earth scientists use the best data, tools, methodologies, and models available to answer a hypothesis. Our knowledge graph would be able to answer questions like: Which articles discuss topics investigating similar hypotheses? How have these methods been tested for accuracy? Which approaches have been highly cited within the scientific community? What variables were used for this method and what datasets were used to represent them? What processing was necessary to use this data? These questions then lead researchers and citizen scientists to investigate the sources where data can be found, available user guides, information on how the data was acquired, and available tools and models to use with this data. As a proof of concept, we focus on a well-defined domain, Hurricane Science, linking research articles and their findings, data, people and tools/services. Modern information retrieval, natural language processing, machine learning and deep learning techniques are applied to build the knowledge network.
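
    A minimal sketch of the underlying idea, assuming a toy in-memory triple store rather than the prototype described above: heterogeneous entities (datasets, articles, people, tools) are linked as subject-predicate-object triples and queried by pattern. The entity names are illustrative.

        # A toy triple store linking datasets, articles, people and tools.
        triples = [
            ("DATASET_X", "is_a", "dataset"),
            ("Article_1", "uses_dataset", "DATASET_X"),
            ("Article_1", "studies", "hurricane intensification"),
            ("Article_1", "authored_by", "J. Doe"),
            ("Tool_Y", "processes", "DATASET_X"),
        ]

        def query(s=None, p=None, o=None):
            return [(a, b, c) for a, b, c in triples
                    if s in (None, a) and p in (None, b) and o in (None, c)]

        # "Which articles use DATASET_X, and what do they study?"
        for article, _, _ in query(p="uses_dataset", o="DATASET_X"):
            print(article, query(s=article, p="studies"))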

  16. Relational Network for Knowledge Discovery through Heterogeneous Biomedical and Clinical Features

    PubMed Central

    Chen, Huaidong; Chen, Wei; Liu, Chenglin; Zhang, Le; Su, Jing; Zhou, Xiaobo

    2016-01-01

    Biomedical big data, as a whole, covers numerous features, while each dataset specifically delineates part of them. “Full feature spectrum” knowledge discovery across heterogeneous data sources remains a major challenge. We developed a method called bootstrapping for unified feature association measurement (BUFAM) for pairwise association analysis, and relational dependency network (RDN) modeling for global module detection on features across breast cancer cohorts. Discovered knowledge was cross-validated using data from Wake Forest Baptist Medical Center’s electronic medical records and annotated with BioCarta signaling signatures. The clinical potential of the discovered modules was exhibited by stratifying patients for drug responses. A series of discovered associations provided new insights into breast cancer, such as the effects of patient’s cultural background on preferences for surgical procedure. We also discovered two groups of highly associated features, the HER2 and the ER modules, each of which described how phenotypes were associated with molecular signatures, diagnostic features, and clinical decisions. The discovered “ER module”, which was dominated by cancer immunity, was used as an example for patient stratification and prediction of drug responses to tamoxifen and chemotherapy. BUFAM-derived RDN modeling demonstrated unique ability to discover clinically meaningful and actionable knowledge across highly heterogeneous biomedical big data sets. PMID:27427091
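
    A minimal sketch of bootstrapped association measurement, not the BUFAM implementation: resampling patients yields a confidence interval for the association (here a simple correlation) between two hypothetical, synthetically generated features.

        # Bootstrap confidence interval for the association between two features.
        import numpy as np

        rng = np.random.default_rng(1)
        n = 300
        feature_a = rng.normal(size=n)                             # hypothetical feature
        feature_b = 0.4 * feature_a + rng.normal(scale=0.9, size=n)

        boot = []
        for _ in range(2000):
            idx = rng.integers(0, n, n)                            # resample patients
            boot.append(np.corrcoef(feature_a[idx], feature_b[idx])[0, 1])

        lo, hi = np.percentile(boot, [2.5, 97.5])
        print(f"bootstrap 95% CI for the correlation: [{lo:.2f}, {hi:.2f}]")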

  17. Relational Network for Knowledge Discovery through Heterogeneous Biomedical and Clinical Features

    NASA Astrophysics Data System (ADS)

    Chen, Huaidong; Chen, Wei; Liu, Chenglin; Zhang, Le; Su, Jing; Zhou, Xiaobo

    2016-07-01

    Biomedical big data, as a whole, covers numerous features, while each dataset specifically delineates part of them. “Full feature spectrum” knowledge discovery across heterogeneous data sources remains a major challenge. We developed a method called bootstrapping for unified feature association measurement (BUFAM) for pairwise association analysis, and relational dependency network (RDN) modeling for global module detection on features across breast cancer cohorts. Discovered knowledge was cross-validated using data from Wake Forest Baptist Medical Center’s electronic medical records and annotated with BioCarta signaling signatures. The clinical potential of the discovered modules was exhibited by stratifying patients for drug responses. A series of discovered associations provided new insights into breast cancer, such as the effects of patient’s cultural background on preferences for surgical procedure. We also discovered two groups of highly associated features, the HER2 and the ER modules, each of which described how phenotypes were associated with molecular signatures, diagnostic features, and clinical decisions. The discovered “ER module”, which was dominated by cancer immunity, was used as an example for patient stratification and prediction of drug responses to tamoxifen and chemotherapy. BUFAM-derived RDN modeling demonstrated unique ability to discover clinically meaningful and actionable knowledge across highly heterogeneous biomedical big data sets.

  18. External Dependencies-Driven Architecture Discovery and Analysis of Implemented Systems

    NASA Technical Reports Server (NTRS)

    Ganesan, Dharmalingam; Lindvall, Mikael; Ron, Monica

    2014-01-01

    A method for architecture discovery and analysis of implemented systems (AIS) is disclosed. The premise of the method is that architecture decisions are inspired and influenced by the external entities that the software system makes use of. Examples of such external entities are COTS components, frameworks, and ultimately even the programming language itself and its libraries. Traces of these architecture decisions can thus be found in the implemented software and is manifested in the way software systems use such external entities. While this fact is often ignored in contemporary reverse engineering methods, the AIS method actively leverages and makes use of the dependencies to external entities as a starting point for the architecture discovery. The AIS method is demonstrated using the NASA's Space Network Access System (SNAS). The results show that, with abundant evidence, the method offers reusable and repeatable guidelines for discovering the architecture and locating potential risks (e.g. low testability, decreased performance) that are hidden deep in the implementation. The analysis is conducted by using external dependencies to identify, classify and review a minimal set of key source code files. Given the benefits of analyzing external dependencies as a way to discover architectures, it is argued that external dependencies deserve to be treated as first-class citizens during reverse engineering. The current structure of a knowledge base of external entities and analysis questions with strategies for getting answers is also discussed.
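
    A minimal sketch of the kind of starting point the method advocates, not the AIS method itself: scanning a (Python) source tree and tallying imported modules per file approximates the external-dependency profile used to select key files for review.

        # Tally imported top-level modules per Python file under the current directory.
        import ast
        import pathlib
        from collections import Counter

        def imported_modules(path: pathlib.Path) -> Counter:
            names = Counter()
            try:
                tree = ast.parse(path.read_text(encoding="utf-8", errors="ignore"))
            except SyntaxError:
                return names
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    names.update(alias.name.split(".")[0] for alias in node.names)
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names[node.module.split(".")[0]] += 1
            return names

        totals = Counter()
        for f in pathlib.Path(".").rglob("*.py"):
            totals += imported_modules(f)
        print(totals.most_common(10))    # most heavily used modules (stdlib included)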

  19. Data Mining and Knowledge Discovery tools for exploiting big Earth-Observation data

    NASA Astrophysics Data System (ADS)

    Espinoza Molina, D.; Datcu, M.

    2015-04-01

    The continuous increase in the size of the archives and in the variety and complexity of Earth-Observation (EO) sensors requires new methodologies and tools that allow the end-user to access a large image repository, to extract and infer knowledge about the patterns hidden in the images, to retrieve dynamically a collection of relevant images, and to support the creation of emerging applications (e.g., change detection, global monitoring, disaster and risk management, image time series). In this context, we are concerned with providing a platform for data mining and knowledge discovery from the content of EO archives. The platform's goal is to implement a communication channel between Payload Ground Segments and the end-user, who receives the content of the data coded in an understandable format, associated with semantics, and ready for immediate exploitation. It will provide the user with automated tools to explore and understand the content of highly complex image archives. The challenge lies in the extraction of meaningful information and the understanding of observations of large extended areas, over long periods of time, with a broad variety of EO imaging sensors in synergy with other related measurements and data. The platform is composed of several components: 1) ingestion of EO images and related data, providing basic features for image analysis; 2) a query engine based on metadata, semantics, and image content; 3) data mining and knowledge discovery tools supporting the interpretation and understanding of image content; and 4) semantic definition of the image content via machine learning methods. All these components are integrated with and supported by a relational database management system, ensuring the integrity and consistency of terabytes of Earth-Observation data.
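
    A minimal, hypothetical sketch of how the query-engine component might sit on a relational backend: one table of image metadata and one of machine-learned semantic labels, queried together. The table names, columns, and values are invented for illustration and are not the platform's actual schema.

        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.executescript("""
        CREATE TABLE eo_image (
            image_id  INTEGER PRIMARY KEY,
            sensor    TEXT,
            acquired  TEXT,   -- ISO date
            footprint TEXT    -- e.g. a WKT polygon
        );
        CREATE TABLE image_feature (
            image_id  INTEGER REFERENCES eo_image(image_id),
            label     TEXT,   -- semantic label assigned by machine learning
            score     REAL    -- classifier confidence
        );
        """)
        conn.execute("INSERT INTO eo_image VALUES (1, 'TerraSAR-X', '2014-06-01', 'POLYGON(...)')")
        conn.execute("INSERT INTO image_feature VALUES (1, 'flooded area', 0.91)")

        # A query combining metadata (sensor, date) with semantic content (label).
        rows = conn.execute("""
            SELECT i.image_id, i.sensor, f.label, f.score
            FROM eo_image i JOIN image_feature f USING (image_id)
            WHERE f.label = 'flooded area' AND f.score > 0.8
            ORDER BY i.acquired
        """).fetchall()
        print(rows)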

  20. 'Big Data' Collaboration: Exploring, Recording and Sharing Enterprise Knowledge

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas R; Ferrell, Regina Kay

    2013-01-01

    As data sources and data size proliferate, knowledge discovery from "Big Data" is starting to pose several challenges. In this paper, we address a specific challenge in the practice of enterprise knowledge management while extracting actionable nuggets from diverse data sources of seemingly related information. In particular, we address the challenge of archiving knowledge gained through collaboration, dissemination and visualization as part of the data analysis, inference and decision-making lifecycle. We motivate the implementation of an enterprise data-discovery and knowledge recorder tool, called SEEKER, based on a real-world case study. We demonstrate SEEKER capturing schema and data-element relationships, and tracking the data elements of value based on the queries and the analytical artifacts that are being created by analysts as they use the data. We show how the tool serves as a digital record of institutional domain knowledge and as documentation of the evolution of data elements, queries and schemas over time. As a knowledge management service, a tool like SEEKER saves enterprise resources and time by avoiding analytic silos, expediting the process of multi-source data integration and intelligently documenting discoveries from fellow analysts.

  1. Using Learning Analytics to Identify Medical Student Misconceptions in an Online Virtual Patient Environment

    ERIC Educational Resources Information Center

    Poitras, Eric G.; Naismith, Laura M.; Doleck, Tenzin; Lajoie, Susanne P.

    2016-01-01

    This study aimed to identify misconceptions in medical student knowledge by mining user interactions in the MedU online learning environment. Data from 13000 attempts at a single virtual patient case were extracted from the MedU MySQL database. A subgroup discovery method was applied to identify patterns in learner-generated annotations and…

  2. Environmental Visualization and Horizontal Fusion

    DTIC Science & Technology

    2005-10-01

    the section on EVIS Rules. Federated Search – Discovering Content Another method of discovering services and their content has been implemented...in HF through a next-generation knowledge discovery framework called Federated Search. A virtual information space, called Collateral Space, was...environmental mission effects products, is presented later in the paper. Federated Search allows users to search through Collateral Space data that is

  3. The Expanding Diversity of Mycobacterium tuberculosis Drug Targets.

    PubMed

    Wellington, Samantha; Hung, Deborah T

    2018-05-11

    After decades of relative inactivity, a large increase in efforts to discover antitubercular therapeutics has brought insights into the biology of Mycobacterium tuberculosis (Mtb) and promising new drugs such as bedaquiline, which inhibits ATP synthase, and the nitroimidazoles delamanid and pretomanid, which inhibit both mycolic acid synthesis and energy production. Despite these advances, the drug discovery pipeline remains underpopulated. The field desperately needs compounds with novel mechanisms of action capable of inhibiting multi- and extensively drug-resistant Mtb (M/XDR-TB) and, potentially, nonreplicating Mtb with the hope of shortening the duration of required therapy. New knowledge about Mtb, along with new methods and technologies, has driven exploration into novel target areas, such as energy production and central metabolism, that diverge from the classical targets in macromolecular synthesis. Here, we review new small molecule drug candidates that act on these novel targets to highlight the methods and perspectives advancing the field. These new targets bring with them the aspiration of shortening treatment duration as well as a pipeline of effective regimens against XDR-TB, positioning Mtb drug discovery to become a model for anti-infective discovery.

  4. Bioenergy Knowledge Discovery Framework Fact Sheet

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    The Bioenergy Knowledge Discovery Framework (KDF) supports the development of a sustainable bioenergy industry by providing access to a variety of data sets, publications, and collaboration and mapping tools that support bioenergy research, analysis, and decision making. In the KDF, users can search for information, contribute data, and use the tools and map interface to synthesize, analyze, and visualize information in a spatially integrated manner.

  5. Teachers' Journal Club: Bridging between the Dynamics of Biological Discoveries and Biology Teachers

    ERIC Educational Resources Information Center

    Brill, Gilat; Falk, Hedda; Yarden, Anat

    2003-01-01

    Since biology is one of the most dynamic research fields within the natural sciences, the gap between the accumulated knowledge in biology and the knowledge that is taught in schools increases rapidly with time. Our long-term objective is to develop means to bridge between the dynamics of biological discoveries and the biology teachers and…

  6. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework.

    PubMed

    Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.

  7. An integrative data analysis platform for gene set analysis and knowledge discovery in a data warehouse framework

    PubMed Central

    Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji

    2016-01-01

    Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145

  8. A Knowledge Discovery Approach to Diagnosing Intracranial Hematomas on Brain CT: Recognition, Measurement and Classification

    NASA Astrophysics Data System (ADS)

    Liao, Chun-Chih; Xiao, Furen; Wong, Jau-Min; Chiang, I.-Jen

    Computed tomography (CT) of the brain is the preferred study in neurological emergencies. Physicians use CT to diagnose various types of intracranial hematomas, including epidural, subdural and intracerebral hematomas, according to their locations and shapes. We propose a novel method that can automatically diagnose intracranial hematomas by combining machine vision and knowledge discovery techniques. The skull on the CT slice is located and the depth of each intracranial pixel is labeled. After normalization of the pixel intensities by their depth, the hyperdense area of the intracranial hematoma is segmented with multi-resolution thresholding and region-growing. We then apply the C4.5 algorithm to construct a decision tree using the features of the segmented hematoma and the diagnoses made by physicians. The algorithm was evaluated on 48 pathological images from a single institution. The two discovered rules closely resemble those used by human experts and are able to make correct diagnoses in all cases.
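
    A minimal sketch of the final classification step only, assuming segmentation has already produced per-hematoma features: scikit-learn's entropy-based decision tree stands in for C4.5, and the feature names, values, and labels are hypothetical rather than the paper's actual feature set.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(0)
        # Toy features per segmented hematoma: [area_mm2, eccentricity, dist_to_skull_mm]
        X = rng.random((48, 3)) * [5000, 1.0, 30.0]
        y = rng.choice(["epidural", "subdural", "intracerebral"], size=48)

        tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
        tree.fit(X, y)

        # The induced rules can be inspected and compared with expert criteria,
        # e.g. lens-shaped collections near the skull suggesting epidural hematoma.
        print(export_text(tree, feature_names=["area_mm2", "eccentricity", "dist_to_skull_mm"]))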

  9. Valuing the scholarship of integration and the scholarship of application in the academy for health sciences scholars: recommended methods

    PubMed Central

    Hofmeyer, Anne; Newton, Mandi; Scott, Cathie

    2007-01-01

    In the landmark 1990 publication Scholarship Reconsidered, Boyer challenged the 'teaching versus research' debates by advocating for the scholarship of discovery, teaching, integration, and application. The scholarship of discovery considers publications and research as the yardstick in the merit, promotion and tenure system the world over. But this narrow view of scholarship does not fully support the obligations of universities to serve global societies and to improve health and health equity. Mechanisms to report the scholarship of teaching have been developed and adopted by some universities. In this article, we contribute to the less developed areas of scholarship, i.e. integration and application. We first situate the scholarship of discovery, teaching, integration and application within the interprofessional and knowledge exchange debates. Second, we propose a means for health science scholars to report the process and outcomes of the scholarship of integration and application with other disciplines, decision-makers and communities. We conclude with recommendations for structural and process change in faculty merit, tenure, and promotion systems so that health science scholars with varied academic portfolios are valued and many forms of academic scholarship are sustained. It is vital that academic institutions remain relevant in an era when the production of knowledge is increasingly recognized as a social, collaborative activity. PMID:17535436

  10. [Research of Odo Bujwid (1857-1942) concerning the vaccine against rabies-historical characterisation].

    PubMed

    Wasiewicz, Barbara

    2016-01-01

    The present article offers a historical characterisation of Odo Bujwid's (1857-1942) research concerning the vaccine against rabies. The introduction covers the treatment methods applied before Ludwik Pasteur's discovery. The following part covers Odo Bujwid's own research, including diagnostics, characterisation of the symptoms of the disease, modification of Ludwik Pasteur's original method, and statistical information. The summary emphasizes that Odo Bujwid's scientific research introduced and generalised worldwide microbiological knowledge in the Polish lands.

  11. Development of Scientific Approach Based on Discovery Learning Module

    NASA Astrophysics Data System (ADS)

    Ellizar, E.; Hardeli, H.; Beltris, S.; Suharni, R.

    2018-04-01

    The scientific approach is a learning process designed to make students actively construct their own knowledge through the stages of the scientific method. The scientific approach can be implemented in the learning process through learning modules. One suitable learning model is discovery-based learning, in which students learn through activities such as observation, experience, and reasoning. In practice, however, students' activity in constructing their own knowledge was not optimal, because the available learning modules were not in line with the scientific approach. The purpose of this study was to develop a scientific-approach, discovery-based learning module on acids and bases and on electrolyte and non-electrolyte solutions. The development of these chemistry modules followed the Plomp model with three main stages: preliminary research, prototyping, and assessment. The subjects of this research were 10th and 11th grade senior high school students (SMAN 2 Padang). Validity was assessed by expert chemistry lecturers and teachers, practicality was tested through questionnaires, and effectiveness was tested experimentally by comparing student achievement between experimental and control groups. Based on the findings, the developed scientific-approach, discovery-based learning module significantly improved students' learning of acid-base chemistry and electrolyte solutions. The data analysis indicated that the chemistry module was valid in content, construct, and presentation, had a good level of practicality within the available time, and was effective in helping students understand the learning material, as shown by the students' learning results. It can be concluded that the discovery-learning, scientific-approach chemistry modules on electrolyte and non-electrolyte solutions and on acids and bases were valid, practical, and effective for 10th and 11th grade senior high school students.

  12. Nursing Routine Data as a Basis for Association Analysis in the Domain of Nursing Knowledge

    PubMed Central

    Sellemann, Björn; Stausberg, Jürgen; Hübner, Ursula

    2012-01-01

    This paper describes the data mining method of association analysis within the framework of Knowledge Discovery in Databases (KDD), with the aim of identifying standard patterns of nursing care. The approach is application-oriented and is applied to nursing routine data recorded with the LEP Nursing 2 method. The increasing use of information technology in hospitals, especially of nursing information systems, requires the storage of large data sets, which hitherto have not always been analyzed adequately. Three association analyses, for the days of admission, surgery and discharge, were performed. The results of almost 1.5 million generated association rules indicate that it is valid to apply association analysis to nursing routine data. All rules are semantically trivial, since they reflect existing knowledge from the domain of nursing. This may be due either to the LEP Nursing 2 method or to the nursing activities themselves. Nonetheless, association analysis may in future become a useful analytical tool on the basis of structured nursing routine data. PMID:24199122
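
    A toy sketch of the kind of association analysis described, restricted to single-activity antecedents for brevity (a full Apriori run would also enumerate larger itemsets): compute support and confidence for A -> B rules over one-hot encoded care activities. The activity columns and rows are invented and do not come from LEP Nursing 2 data.

        import itertools
        import pandas as pd

        # One row per patient-day; True means the activity was documented that day.
        care_days = pd.DataFrame(
            [[True, True, False, True],
             [True, True, True, False],
             [False, True, True, True],
             [True, True, False, True]],
            columns=["vital_signs", "mobilisation", "wound_care", "medication"],
        )

        def pairwise_rules(df, min_support=0.5, min_confidence=0.8):
            # Enumerate simple A -> B rules between single activities.
            n = len(df)
            rules = []
            for a, b in itertools.permutations(df.columns, 2):
                support_ab = (df[a] & df[b]).sum() / n
                support_a = df[a].sum() / n
                if support_a == 0:
                    continue
                confidence = support_ab / support_a
                if support_ab >= min_support and confidence >= min_confidence:
                    rules.append((a, b, support_ab, confidence))
            return pd.DataFrame(rules, columns=["antecedent", "consequent",
                                                "support", "confidence"])

        print(pairwise_rules(care_days))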

  13. Learning in the context of distribution drift

    DTIC Science & Technology

    2017-05-09

    published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative...Figure 7: Architecture for learning from streaming data in the context of variable or unknown...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD

  14. A Bioinformatic Approach to Inter Functional Interactions within Protein Sequences

    DTIC Science & Technology

    2009-02-23

    AFOSR/AOARD Reference Number: USAFAOGA07: FA4869-07-1-4050 AFOSR/AOARD Program Manager: Hiroshi Motoda, Ph.D. Period of...Conference on Knowledge Discovery and Data Mining.) In a separate study we have applied our approaches to the problem of whole genome alignment. We have...SIGKDD Conference on Knowledge Discovery and Data Mining. Attached. Interactions: Please list: (a) Participation/presentations at meetings

  15. Modeling technology innovation: How science, engineering, and industry methods can combine to generate beneficial socioeconomic impacts

    PubMed Central

    2012-01-01

    Background Government-sponsored science, technology, and innovation (STI) programs support the socioeconomic aspects of public policies, in addition to expanding the knowledge base. For example, beneficial healthcare services and devices are expected to result from investments in research and development (R&D) programs, which assume a causal link to commercial innovation. Such programs are increasingly held accountable for evidence of impact—that is, innovative goods and services resulting from R&D activity. However, the absence of comprehensive models and metrics skews evidence gathering toward bibliometrics about research outputs (published discoveries), with less focus on transfer metrics about development outputs (patented prototypes) and almost none on econometrics related to production outputs (commercial innovations). This disparity is particularly problematic for the expressed intent of such programs, as most measurable socioeconomic benefits result from the last category of outputs. Methods This paper proposes a conceptual framework integrating all three knowledge-generating methods into a logic model, useful for planning, obtaining, and measuring the intended beneficial impacts through the implementation of knowledge in practice. Additionally, the integration of the Context-Input-Process-Product (CIPP) model of evaluation proactively builds relevance into STI policies and programs while sustaining rigor. Results The resulting logic model framework explicitly traces the progress of knowledge from inputs, following it through the three knowledge-generating processes and their respective knowledge outputs (discovery, invention, innovation), as it generates the intended socio-beneficial impacts. It is a hybrid model for generating technology-based innovations, where best practices in new product development merge with a widely accepted knowledge-translation approach. Given the emphasis on evidence-based practice in the medical and health fields and “bench to bedside” expectations for knowledge transfer, sponsors and grantees alike should find the model useful for planning, implementing, and evaluating innovation processes. Conclusions High-cost/high-risk industries like healthcare require the market deployment of technology-based innovations to improve domestic society in a global economy. An appropriate balance of relevance and rigor in research, development, and production is crucial to optimize the return on public investment in such programs. The technology-innovation process needs a comprehensive operational model to effectively allocate public funds and thereby deliberately and systematically accomplish socioeconomic benefits. PMID:22591638

  16. Concept of Operations for Collaboration and Discovery from Big Data Across Enterprise Data Warehouses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R

    2013-01-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to the physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, the next generation of analysts is facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.

  17. Knowledge Retrieval Solutions.

    ERIC Educational Resources Information Center

    Khan, Kamran

    1998-01-01

    Excalibur RetrievalWare offers true knowledge retrieval solutions. Its fundamental technologies, Adaptive Pattern Recognition Processing and Semantic Networks, have capabilities for knowledge discovery and knowledge management of full-text, structured and visual information. The software delivers a combination of accuracy, extensibility,…

  18. Flood AI: An Intelligent Systems for Discovery and Communication of Disaster Knowledge

    NASA Astrophysics Data System (ADS)

    Demir, I.; Sermet, M. Y.

    2017-12-01

    Communities are not immune from extreme events or natural disasters that can lead to large-scale consequences for the nation and the public. Improving resilience to better prepare for, plan for, recover from, and adapt to disasters is critical to reduce the impacts of extreme events. The National Research Council (NRC) report discusses how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data and information, identifies gaps and knowledge challenges that need to be addressed, and suggests that every individual should have access to risk and vulnerability information to make their communities more resilient. This project presents an intelligent system for flooding, Flood AI, to improve societal preparedness by providing a knowledge engine using voice recognition, artificial intelligence, and natural language processing, based on a generalized ontology for disasters with a primary focus on flooding. The knowledge engine utilizes the flood ontology and concepts to connect user input to relevant knowledge discovery channels on flooding through a data acquisition and processing framework utilizing environmental observations, forecast models, and knowledge bases. Communication channels of the framework include web-based systems, agent-based chat bots, smartphone applications, automated web workflows, and smart home devices, opening knowledge discovery for flooding to many unique use cases.

  19. Informing child welfare policy and practice: using knowledge discovery and data mining technology via a dynamic Web site.

    PubMed

    Duncan, Dean F; Kum, Hye-Chung; Weigensberg, Elizabeth Caplick; Flair, Kimberly A; Stewart, C Joy

    2008-11-01

    Proper management and implementation of an effective child welfare agency requires the constant use of information about the experiences and outcomes of children involved in the system, emphasizing the need for comprehensive, timely, and accurate data. In the past 20 years, there have been many advances in technology that can maximize the potential of administrative data to promote better evaluation and management in the field of child welfare. Specifically, this article discusses the use of knowledge discovery and data mining (KDD), which makes it possible to create longitudinal data files from administrative data sources, extract valuable knowledge, and make the information available via a user-friendly public Web site. This article demonstrates a successful project in North Carolina where knowledge discovery and data mining technology was used to develop a comprehensive set of child welfare outcomes available through a public Web site to facilitate information sharing of child welfare data to improve policy and practice.

  20. Research to knowledge: promoting the training of physician-scientists in the biology of pregnancy.

    PubMed

    Sadovsky, Yoel; Caughey, Aaron B; DiVito, Michelle; D'Alton, Mary E; Murtha, Amy P

    2018-01-01

    Common disorders of pregnancy, such as preeclampsia, preterm birth, and fetal growth abnormalities, continue to challenge perinatal biologists seeking insights into disease pathogenesis that will result in better diagnosis, therapy, and disease prevention. These challenges have recently been intensified with discoveries that associate gestational diseases with long-term maternal and neonatal outcomes. Whereas modern high-throughput investigative tools enable scientists and clinicians to noninvasively probe the maternal-fetal genome, epigenome, and other analytes, their implications for clinical medicine remain uncertain. Bridging these knowledge gaps depends on strengthening the existing pool of scientists with expertise in basic, translational, and clinical tools to address pertinent questions in the biology of pregnancy. Although PhD researchers are critical in this quest, physician-scientists would facilitate the inquiry by bringing together clinical challenges and investigative tools, promoting a culture of intellectual curiosity among clinical providers, and helping transform discoveries into relevant knowledge and clinical solutions. Uncertainties related to future administration of health care, federal support for research, attrition of physician-scientists, and an inadequate supply of new scholars may jeopardize our ability to address these challenges. New initiatives are necessary to attract current scholars and future generations of researchers seeking expertise in the scientific method and to support them, through mentorship and guidance, in pursuing a career that combines scientific investigation with clinical medicine. These efforts will promote breadth and depth of inquiry into the biology of pregnancy and enhance the pace of translation of scientific discoveries into better medicine and disease prevention. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Biomedical discovery acceleration, with applications to craniofacial development.

    PubMed

    Leach, Sonia M; Tipney, Hannah; Feng, Weiguo; Baumgartner, William A; Kasliwal, Priyanka; Schuyler, Ronald P; Williams, Trevor; Spritz, Richard A; Hunter, Lawrence

    2009-03-01

    The profusion of high-throughput instruments and the explosion of new results in the scientific literature, particularly in molecular biomedicine, is both a blessing and a curse to the bench researcher. Even knowledgeable and experienced scientists can benefit from computational tools that help navigate this vast and rapidly evolving terrain. In this paper, we describe a novel computational approach to this challenge, a knowledge-based system that combines reading, reasoning, and reporting methods to facilitate analysis of experimental data. Reading methods extract information from external resources, either by parsing structured data or using biomedical language processing to extract information from unstructured data, and track knowledge provenance. Reasoning methods enrich the knowledge that results from reading by, for example, noting two genes that are annotated to the same ontology term or database entry. Reasoning is also used to combine all sources into a knowledge network that represents the integration of all sorts of relationships between a pair of genes, and to calculate a combined reliability score. Reporting methods combine the knowledge network with a congruent network constructed from experimental data and visualize the combined network in a tool that facilitates the knowledge-based analysis of that data. An implementation of this approach, called the Hanalyzer, is demonstrated on a large-scale gene expression array dataset relevant to craniofacial development. The use of the tool was critical in the creation of hypotheses regarding the roles of four genes never previously characterized as involved in craniofacial development; each of these hypotheses was validated by further experimental work.
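
    A minimal sketch of the evidence-combination idea (not the Hanalyzer implementation): evidence produced by several reading and reasoning sources is attached to a gene-gene edge and collapsed into a combined reliability score with a noisy-OR rule. The source names, their reliabilities, and the gene pairs are invented for illustration.

        import networkx as nx

        # Hypothetical per-source reliabilities (probability the asserted link is real).
        source_reliability = {"GO_coannotation": 0.6,
                              "literature_cooccurrence": 0.4,
                              "pathway_db": 0.7}

        evidence = [
            ("TFAP2A", "FGFR1", "GO_coannotation"),
            ("TFAP2A", "FGFR1", "literature_cooccurrence"),
            ("TFAP2A", "MSX1", "pathway_db"),
        ]

        G = nx.Graph()
        for gene_a, gene_b, source in evidence:
            if G.has_edge(gene_a, gene_b):
                G[gene_a][gene_b]["sources"].append(source)
            else:
                G.add_edge(gene_a, gene_b, sources=[source])

        for a, b, data in G.edges(data=True):
            # Noisy-OR: the edge is considered wrong only if every source is wrong.
            p_wrong = 1.0
            for source in data["sources"]:
                p_wrong *= 1.0 - source_reliability[source]
            data["reliability"] = 1.0 - p_wrong
            print(a, b, data["sources"], round(data["reliability"], 3))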

  2. Design of Automatic Extraction Algorithm of Knowledge Points for MOOCs

    PubMed Central

    Chen, Haijian; Han, Dongmei; Zhao, Lina

    2015-01-01

    In recent years, Massive Open Online Courses (MOOCs) have become very popular among college students and have had a powerful impact on academic institutions. In the MOOC environment, knowledge discovery and knowledge sharing are very important and are currently often achieved by ontology techniques. In building ontologies, automatic extraction technology is crucial. Because general text mining algorithms are not effective for online courses, we designed an automatic extraction of course knowledge points (AECKP) algorithm for online courses. It includes document classification, Chinese word segmentation, and POS tagging for each document. The Vector Space Model (VSM) is used to calculate similarity, and weights are designed to optimize the TF-IDF output values; the terms with the highest scores are selected as knowledge points. Course documents for "C programming language" were selected for the experiment in this study. The results show that the proposed approach achieves satisfactory accuracy and recall rates. PMID:26448738
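
    A minimal sketch of the term-scoring idea, assuming English text and skipping the paper's document classification, Chinese word segmentation, and POS-tagging steps: weight terms by TF-IDF across course documents and propose the highest-weighted terms as candidate knowledge points. The documents below are invented.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer

        docs = [
            "pointers store the address of a variable and are declared with an asterisk",
            "an array is a contiguous block of elements accessed by index",
            "a for loop repeats a block of statements a fixed number of times",
        ]

        vectorizer = TfidfVectorizer(stop_words="english")
        tfidf = vectorizer.fit_transform(docs)            # shape: (n_docs, n_terms)
        terms = np.array(vectorizer.get_feature_names_out())

        # Score each term by its maximum TF-IDF weight across documents; keep the top 5.
        scores = tfidf.max(axis=0).toarray().ravel()
        top = terms[np.argsort(scores)[::-1][:5]]
        print("candidate knowledge points:", list(top))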

  3. The self-organizing fractal theory as a universal discovery method: the phenomenon of life.

    PubMed

    Kurakin, Alexei

    2011-03-29

    A universal discovery method potentially applicable to all disciplines studying organizational phenomena has been developed. This method takes advantage of a new form of global symmetry, namely, scale-invariance of self-organizational dynamics of energy/matter at all levels of organizational hierarchy, from elementary particles through cells and organisms to the Universe as a whole. The method is based on an alternative conceptualization of physical reality postulating that the energy/matter comprising the Universe is far from equilibrium, that it exists as a flow, and that it develops via self-organization in accordance with the empirical laws of nonequilibrium thermodynamics. It is postulated that the energy/matter flowing through and comprising the Universe evolves as a multiscale, self-similar structure-process, i.e., as a self-organizing fractal. This means that certain organizational structures and processes are scale-invariant and are reproduced at all levels of the organizational hierarchy. Being a form of symmetry, scale-invariance naturally lends itself to a new discovery method that allows for the deduction of missing information by comparing scale-invariant organizational patterns across different levels of the organizational hierarchy. An application of the new discovery method to life sciences reveals that moving electrons represent a keystone physical force (flux) that powers, animates, informs, and binds all living structures-processes into a planetary-wide, multiscale system of electron flow/circulation, and that all living organisms and their larger-scale organizations emerge to function as electron transport networks that are supported by and, at the same time, support the flow of electrons down the Earth's redox gradient maintained along the core-mantle-crust-ocean-atmosphere axis of the planet. The presented findings lead to a radically new perspective on the nature and origin of life, suggesting that living matter is an organizational state/phase of nonliving matter and a natural consequence of the evolution and self-organization of nonliving matter. The presented paradigm opens doors for explosive advances in many disciplines, by uniting them within a single conceptual framework and providing a discovery method that allows for the systematic generation of knowledge through comparison and complementation of empirical data across different sciences and disciplines.

  4. The self-organizing fractal theory as a universal discovery method: the phenomenon of life

    PubMed Central

    2011-01-01

    A universal discovery method potentially applicable to all disciplines studying organizational phenomena has been developed. This method takes advantage of a new form of global symmetry, namely, scale-invariance of self-organizational dynamics of energy/matter at all levels of organizational hierarchy, from elementary particles through cells and organisms to the Universe as a whole. The method is based on an alternative conceptualization of physical reality postulating that the energy/matter comprising the Universe is far from equilibrium, that it exists as a flow, and that it develops via self-organization in accordance with the empirical laws of nonequilibrium thermodynamics. It is postulated that the energy/matter flowing through and comprising the Universe evolves as a multiscale, self-similar structure-process, i.e., as a self-organizing fractal. This means that certain organizational structures and processes are scale-invariant and are reproduced at all levels of the organizational hierarchy. Being a form of symmetry, scale-invariance naturally lends itself to a new discovery method that allows for the deduction of missing information by comparing scale-invariant organizational patterns across different levels of the organizational hierarchy. An application of the new discovery method to life sciences reveals that moving electrons represent a keystone physical force (flux) that powers, animates, informs, and binds all living structures-processes into a planetary-wide, multiscale system of electron flow/circulation, and that all living organisms and their larger-scale organizations emerge to function as electron transport networks that are supported by and, at the same time, support the flow of electrons down the Earth's redox gradient maintained along the core-mantle-crust-ocean-atmosphere axis of the planet. The presented findings lead to a radically new perspective on the nature and origin of life, suggesting that living matter is an organizational state/phase of nonliving matter and a natural consequence of the evolution and self-organization of nonliving matter. The presented paradigm opens doors for explosive advances in many disciplines, by uniting them within a single conceptual framework and providing a discovery method that allows for the systematic generation of knowledge through comparison and complementation of empirical data across different sciences and disciplines. PMID:21447162

  5. Cognitive methodology for forecasting oil and gas industry using pattern-based neural information technologies

    NASA Astrophysics Data System (ADS)

    Gafurov, O.; Gafurov, D.; Syryamkin, V.

    2018-05-01

    The paper analyses a field of computer science formed at the intersection of such areas of natural science as artificial intelligence, mathematical statistics, and database theory, which is referred to as "Data Mining" (the discovery of knowledge in data). The theory of neural networks is applied along with classical methods of mathematical analysis and numerical simulation. The paper describes the technique protected by a patent of the Russian Federation for the invention "A Method for Determining Location of Production Wells during the Development of Hydrocarbon Fields" [1–3] and implemented using the geoinformation system NeuroInformGeo; there are no analogues in domestic or international practice. The paper compares a forecast of oil reservoir quality made by a geophysicist interpreter using standard methods with a forecast made using this technology. The technical result shows increased efficiency, effectiveness, and ecological compatibility in the development of mineral deposits, as well as the discovery of a new oil deposit.

  6. Covalent docking of large libraries for the discovery of chemical probes.

    PubMed

    London, Nir; Miller, Rand M; Krishnan, Shyam; Uchida, Kenji; Irwin, John J; Eidam, Oliv; Gibold, Lucie; Cimermančič, Peter; Bonnet, Richard; Shoichet, Brian K; Taunton, Jack

    2014-12-01

    Chemical probes that form a covalent bond with a protein target often show enhanced selectivity, potency and utility for biological studies. Despite these advantages, protein-reactive compounds are usually avoided in high-throughput screening campaigns. Here we describe a general method (DOCKovalent) for screening large virtual libraries of electrophilic small molecules. We apply this method prospectively to discover reversible covalent fragments that target distinct protein nucleophiles, including the catalytic serine of AmpC β-lactamase and noncatalytic cysteines in RSK2, MSK1 and JAK3 kinases. We identify submicromolar to low-nanomolar hits with high ligand efficiency, cellular activity and selectivity, including what are to our knowledge the first reported reversible covalent inhibitors of JAK3. Crystal structures of inhibitor complexes with AmpC and RSK2 confirm the docking predictions and guide further optimization. As covalent virtual screening may have broad utility for the rapid discovery of chemical probes, we have made the method freely available through an automated web server (http://covalent.docking.org/).
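
    As one small illustration of working with an electrophile library (a pre-filtering step, not the DOCKovalent docking procedure itself): keep only molecules bearing a chosen covalent warhead, here an acrylamide-type Michael acceptor matched by SMARTS using RDKit. The SMILES strings are invented examples.

        from rdkit import Chem

        warhead = Chem.MolFromSmarts("C=CC(=O)N")      # acrylamide-type Michael acceptor

        library = {
            "frag_1": "C=CC(=O)Nc1ccccc1",             # has the warhead
            "frag_2": "Oc1ccccc1",                     # phenol, no warhead
            "frag_3": "C=CC(=O)N1CCOCC1",              # has the warhead
        }

        electrophiles = {
            name: smi for name, smi in library.items()
            if Chem.MolFromSmiles(smi).HasSubstructMatch(warhead)
        }
        print(electrophiles)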

  7. On the Growth of Scientific Knowledge: Yeast Biology as a Case Study

    PubMed Central

    He, Xionglei; Zhang, Jianzhi

    2009-01-01

    The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration. PMID:19300476

  8. On the growth of scientific knowledge: yeast biology as a case study.

    PubMed

    He, Xionglei; Zhang, Jianzhi

    2009-03-01

    The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration.
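
    A minimal sketch of how the "exponential over time" claim can be checked on cumulative discovery counts: regress the log of the cumulative total on year and read off a growth rate and doubling time. The yearly counts below are simulated, not the yeast interaction data analyzed in the paper.

        import numpy as np

        years = np.arange(1980, 2008)
        new_interactions = np.round(50 * 1.15 ** (years - 1980)).astype(int)  # toy data
        cumulative = np.cumsum(new_interactions)

        # Linear fit on the log scale: log(cumulative) = slope * year + intercept.
        slope, intercept = np.polyfit(years, np.log(cumulative), 1)
        doubling_time = np.log(2) / slope
        print(f"estimated growth rate {slope:.3f}/yr, doubling time {doubling_time:.1f} yr")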

  9. Big, Deep, and Smart Data in Scanning Probe Microscopy

    DOE PAGES

    Kalinin, Sergei V.; Strelcov, Evgheni; Belianinov, Alex; ...

    2016-09-27

    Scanning probe microscopy (SPM) techniques open the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter on nanometer and atomic scales. We analyze the discovery process in SPM in terms of the information flow from the tip-surface junction to knowledge adoption by the scientific community. Furthermore, we discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.

  10. Exploiting Early Intent Recognition for Competitive Advantage

    DTIC Science & Technology

    2009-01-01

    basketball [Bhandari et al., 1997; Jug et al., 2003], and Robocup soccer simulations [Riley and Veloso, 2000; 2002; Kuhlmann et al., 2006] and non...actions (e.g. before, after, around). Jug et al. [2003] used a similar framework for offline basketball game analysis. More recently, Hess et al...and K. Ramanujam. Advanced Scout: Data mining and knowledge discovery in NBA data. Data Mining and Knowledge Discovery, 1(1):121–125, 1997. [Chang

  11. An Alternative Time for Telling: When Conceptual Instruction Prior to Exploration Improves Mathematical Knowledge

    ERIC Educational Resources Information Center

    Fyfe, Emily R.; DeCaro, Marci S.; Rittle-Johnson, Bethany

    2013-01-01

    An emerging consensus suggests that guided discovery, which combines discovery and instruction, is a more effective educational approach than either one in isolation. The goal of this study was to examine two specific forms of guided discovery, testing whether conceptual instruction should precede or follow exploratory problem solving. In both…

  12. Contributing, Exchanging and Linking for Learning: Supporting Web Co-Discovery in One-to-One Environments

    ERIC Educational Resources Information Center

    Liu, Chen-Chung; Don, Ping-Hsing; Chung, Chen-Wei; Lin, Shao-Jun; Chen, Gwo-Dong; Liu, Baw-Jhiune

    2010-01-01

    While Web discovery is usually undertaken as a solitary activity, Web co-discovery may transform Web learning activities from the isolated individual search process into interactive and collaborative knowledge exploration. Recent studies have proposed Web co-search environments on a single computer, supported by multiple one-to-one technologies.…

  13. Personal discovery in diabetes self-management: Discovering cause and effect using self-monitoring data

    PubMed Central

    Mamykina, Lena; Heitkemper, Elizabeth M.; Smaldone, Arlene M.; Kukafka, Rita; Cole-Lewis, Heather J.; Davidson, Patricia G.; Mynatt, Elizabeth D.; Cassells, Andrea; Tobin, Jonathan N.; Hripcsak, George

    2017-01-01

    Objective To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. Materials and methods We conducted an observational qualitative study of discovery with personal data among adults attending a diabetes self-management education (DSME) program that utilized a discovery-based curriculum. The study included observations of class sessions, and interviews and focus groups with the educator and attendees of the program (n = 14). Results The main discovery in diabetes self-management revolved around discovering patterns of association between characteristics of individuals’ activities and changes in their blood glucose levels that the participants referred to as “cause and effect”. This discovery empowered individuals to actively engage in self-management and provided a desired flexibility in selection of personalized self-management strategies. We show that discovery of cause and effect involves four essential phases: (1) feature selection, (2) hypothesis generation, (3) feature evaluation, and (4) goal specification. Further, we identify opportunities to support discovery at each stage with informatics and data visualization solutions by providing assistance with: (1) active manipulation of collected data (e.g., grouping, filtering and side-by-side inspection), (2) hypotheses formulation (e.g., using natural language statements or constructing visual queries), (3) inference evaluation (e.g., through aggregation and visual comparison, and statistical analysis of associations), and (4) translation of discoveries into actionable goals (e.g., tailored selection from computable knowledge sources of effective diabetes self-management behaviors). Discussion The study suggests that discovery of cause and effect in diabetes can be a powerful approach to helping individuals to improve their self-management strategies, and that self-monitoring data can serve as a driving engine for personal discovery that may lead to sustainable behavior changes. Conclusions Enabling personal discovery is a promising new approach to enhancing chronic disease self-management with informatics interventions. PMID:28974460
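
    A minimal sketch of the feature-evaluation phase for one hypothesized "cause and effect" relation (a walk after dinner lowers the post-meal glucose rise), using a two-sample t-test on invented self-monitoring records; the column names and values are illustrative only.

        import pandas as pd
        from scipy.stats import ttest_ind

        records = pd.DataFrame({
            "walked_after_dinner": [True, False, True, False, True, False, True, False],
            "glucose_rise_mgdl":   [22,   48,    30,   55,    18,   41,    27,   50],
        })

        walked = records.loc[records.walked_after_dinner, "glucose_rise_mgdl"]
        sat = records.loc[~records.walked_after_dinner, "glucose_rise_mgdl"]

        # Welch's t-test comparing the post-meal rise with and without a walk.
        stat, p = ttest_ind(walked, sat, equal_var=False)
        print(f"mean rise with walk {walked.mean():.1f}, without {sat.mean():.1f}, p={p:.3f}")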

  14. A method for exploring implicit concept relatedness in biomedical knowledge network.

    PubMed

    Bai, Tian; Gong, Leiguang; Wang, Ye; Wang, Yan; Kulikowski, Casimir A; Huang, Lan

    2016-07-19

    Biomedical information and knowledge, structural and non-structural, stored in different repositories can be semantically connected to form a hybrid knowledge network. How to compute relatedness between concepts and discover valuable but implicit information or knowledge from it effectively and efficiently is of paramount importance for precision medicine, and a major challenge facing the biomedical research community. In this study, a hybrid biomedical knowledge network is constructed by linking concepts across multiple biomedical ontologies as well as non-structural biomedical knowledge sources. To discover implicit relatedness between concepts in ontologies for which potentially valuable relationships (implicit knowledge) may exist, we developed a Multi-Ontology Relatedness Model (MORM) within the knowledge network, for which a relatedness network (RN) is defined and computed across multiple ontologies using a formal inference mechanism of set-theoretic operations. Semantic constraints are designed and implemented to prune the search space of the relatedness network. Experiments to test examples of several biomedical applications have been carried out, and the evaluation of the results showed an encouraging potential of the proposed approach to biomedical knowledge discovery.
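
    A toy sketch of set-theoretic relatedness in a concept network (not the paper's MORM formulation, and with its semantic constraints omitted): score two concepts by the Jaccard overlap of their neighbor sets. The concepts and edges below are invented.

        import networkx as nx

        G = nx.Graph()
        G.add_edges_from([
            ("type 2 diabetes", "insulin resistance"),
            ("type 2 diabetes", "obesity"),
            ("metformin", "insulin resistance"),
            ("metformin", "AMPK signaling"),
            ("obesity", "insulin resistance"),
        ])

        def set_relatedness(graph, c1, c2):
            # Jaccard overlap of the two concepts' neighbor sets.
            n1, n2 = set(graph[c1]), set(graph[c2])
            union = n1 | n2
            return len(n1 & n2) / len(union) if union else 0.0

        print(set_relatedness(G, "type 2 diabetes", "metformin"))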

  15. Knowledge Management in Higher Education: A Knowledge Repository Approach

    ERIC Educational Resources Information Center

    Wedman, John; Wang, Feng-Kwei

    2005-01-01

    One might expect higher education, where the discovery and dissemination of new and useful knowledge is vital, to be among the first to implement knowledge management practices. Surprisingly, higher education has been slow to implement knowledge management practices (Townley, 2003). This article describes an ongoing research and development effort…

  16. Crowdsourcing Knowledge Discovery and Innovations in Medicine

    PubMed Central

    2014-01-01

    Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. “Health hackathons” and “data marathons”, in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled. PMID:25239002

  17. Crowdsourcing knowledge discovery and innovations in medicine.

    PubMed

    Celi, Leo Anthony; Ippolito, Andrea; Montgomery, Robert A; Moses, Christopher; Stone, David J

    2014-09-19

    Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. "Health hackathons" and "data marathons", in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled.

  18. Empirical study using network of semantically related associations in bridging the knowledge gap.

    PubMed

    Abedi, Vida; Yeasin, Mohammed; Zand, Ramin

    2014-11-27

    The data overload has created a new set of challenges in finding meaningful and relevant information with minimal cognitive effort. However, designing robust and scalable knowledge discovery systems remains a challenge. Recent innovations in (biological) literature mining tools have opened new avenues to understand the confluence of various diseases, genes, risk factors, and biological processes, bridging the gaps between the massive amounts of scientific data and the harvesting of useful knowledge. In this paper, we highlight some of the findings obtained using a text analytics tool called ARIANA (Adaptive Robust and Integrative Analysis for finding Novel Associations). An empirical study using ARIANA reveals knowledge discovery instances that illustrate the efficacy of such a tool. For example, ARIANA can capture the connection between the drug hexamethonium and pulmonary inflammation and fibrosis that caused the tragic death of a healthy volunteer in a 2001 Johns Hopkins asthma study, even though the abstract of the study was not part of the semantic model. An integrated system such as ARIANA could assist the human expert in exploratory literature search by bringing forward hidden associations, promoting data reuse and knowledge discovery, and stimulating interdisciplinary projects by connecting information across disciplines.

  19. Impact of Network Activity Levels on the Performance of Passive Network Service Dependency Discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carroll, Thomas E.; Chikkagoudar, Satish; Arthur-Durett, Kristine M.

    Network services often do not operate alone, but instead depend on other services distributed throughout a network to function correctly. If a service fails, is disrupted, or is degraded, it is likely to impair other services. The web of dependencies can be surprisingly complex---especially within a large enterprise network---and evolves with time. Acquiring, maintaining, and understanding dependency knowledge is critical for many network management and cyber defense activities. While automation can improve situation awareness for network operators and cyber practitioners, poor detection accuracy reduces their confidence and can complicate their roles. In this paper we rigorously study the effects of network activity levels on the detection accuracy of passive network-based service dependency discovery methods. The accuracy of all methods except one was inversely proportional to network activity levels. Our proposed cross-correlation method was particularly robust to the influence of network activity. The proposed experimental treatment will further advance a more scientific evaluation of methods and provide the ability to determine their operational boundaries.
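
    A minimal sketch of the cross-correlation idea on simulated traffic: if request activity at service B consistently follows activity at service A at a short positive lag, flag a candidate dependency. The series, lag window, and threshold are illustrative, not the paper's implementation or its tuned parameters.

        import numpy as np

        rng = np.random.default_rng(1)
        n = 600                                   # one sample per second, 10 minutes
        a = rng.poisson(2.0, n).astype(float)     # requests/s observed at service A
        b = 0.8 * np.roll(a, 3) + rng.poisson(1.0, n)   # B echoes A ~3 s later, plus noise

        def peak_lagged_correlation(x, y, max_lag=10):
            # Return (best_lag, correlation) over positive lags x -> y.
            x = (x - x.mean()) / x.std()
            y = (y - y.mean()) / y.std()
            return max(
                ((lag, np.mean(x[:-lag] * y[lag:])) for lag in range(1, max_lag + 1)),
                key=lambda item: item[1],
            )

        lag, corr = peak_lagged_correlation(a, b)
        print(f"peak correlation {corr:.2f} at lag {lag}s -> candidate dependency"
              if corr > 0.5 else "no dependency detected")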

  20. New approaches to antimicrobial discovery.

    PubMed

    Lewis, Kim

    2017-06-15

    The spread of resistant organisms is producing a human health crisis, as we are witnessing the emergence of pathogens resistant to all available antibiotics. An increase in chronic infections presents an additional challenge - these diseases are difficult to treat due to antibiotic-tolerant persister cells. Overmining of soil Actinomycetes ended the golden era of antibiotic discovery in the 1960s, and efforts to replace this source by screening synthetic compound libraries were not successful. Bacteria have an efficient permeability barrier that prevents penetration of most synthetic compounds. Empirically establishing rules of penetration for antimicrobials will form the knowledge base to produce libraries tailored to antibiotic discovery, and will revive rational drug design. Two untapped sources of natural products hold the promise of reviving natural product discovery. Most bacterial species, over 99%, are uncultured; methods to grow these organisms have been developed, and the first promising compounds are in development. Genome sequencing shows that known producers harbor many more operons coding for secondary metabolites than we can account for, providing an additional rich source of antibiotics. Revival of natural product discovery will require high-throughput identification of novel compounds within a large background of known substances. This could be achieved by rapid acquisition of transcription profiles from active extracts that will point to potentially novel compounds. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Text-based discovery in biomedicine: the architecture of the DAD-system.

    PubMed

    Weeber, M; Klein, H; Aronson, A R; Mork, J G; de Jong-van den Berg, L T; Vos, R

    2000-01-01

    Current scientific research takes place in highly specialized contexts with poor communication between disciplines as a likely consequence. Knowledge from one discipline may be useful for the other without researchers knowing it. As scientific publications are a condensation of this knowledge, literature-based discovery tools may help the individual scientist to explore new useful domains. We report on the development of the DAD-system, a concept-based Natural Language Processing system for PubMed citations that provides the biomedical researcher such a tool. We describe the general architecture and illustrate its operation by a simulation of a well-known text-based discovery: The favorable effects of fish oil on patients suffering from Raynaud's disease [1].
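
    To make the open-discovery idea behind such systems concrete, here is a toy sketch, with invented tokenization and no real NLP or UMLS concept mapping, of Swanson-style A-B-C discovery: starting from a source concept A, it proposes target concepts C that share intermediate B-terms with A but never co-occur with A directly.

```python
"""Toy, hypothetical sketch of open literature-based discovery of the kind
the DAD-system automates with concept-based NLP; the co-occurrence model
and limits below are illustrative assumptions only."""
from collections import defaultdict

def cooccurrence_index(doc_terms: list) -> dict:
    """Map each term to the set of terms it co-occurs with in any document."""
    index = defaultdict(set)
    for terms in doc_terms:
        for t in terms:
            index[t] |= (terms - {t})
    return index

def open_discovery(doc_terms: list, a_term: str, max_b: int = 20) -> set:
    """Propose C-terms linked to A only indirectly, via intermediate B-terms."""
    index = cooccurrence_index(doc_terms)
    b_terms = list(index.get(a_term, set()))[:max_b]
    direct = index.get(a_term, set()) | {a_term}
    c_terms = set()
    for b in b_terms:
        c_terms |= (index.get(b, set()) - direct)   # via B, never directly with A
    return c_terms
```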

  2. Summary of the BioLINK SIG 2013 meeting at ISMB/ECCB 2013.

    PubMed

    Verspoor, Karin; Shatkay, Hagit; Hirschman, Lynette; Blaschke, Christian; Valencia, Alfonso

    2015-01-15

    The ISMB Special Interest Group on Linking Literature, Information and Knowledge for Biology (BioLINK) organized a one-day workshop at ISMB/ECCB 2013 in Berlin, Germany. The theme of the workshop was 'Roles for text mining in biomedical knowledge discovery and translational medicine'. This summary reviews the outcomes of the workshop. Meeting themes included concept annotation methods and applications, extraction of biological relationships and the use of text-mined data for biological data analysis. All articles are available at http://biolinksig.org/proceedings-online/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  3. Which are the greatest recent discoveries and the greatest future challenges in nutrition?

    PubMed

    Katan, M B; Boekschoten, M V; Connor, W E; Mensink, R P; Seidell, J; Vessby, B; Willett, W

    2009-01-01

    Nutrition science aims to create new knowledge, but scientists rarely sit back to reflect on what nutrition research has achieved in recent decades. We report the outcome of a 1-day symposium at which the audience was asked to vote on the greatest discoveries in nutrition since 1976 and on the greatest challenges for the coming 30 years. Most of the 128 participants were Dutch scientists working in nutrition or related biomedical and public health fields. Candidate discoveries and challenges were nominated by five invited speakers and by members of the audience. Ballot forms were then prepared on which participants selected one discovery and one challenge. A total of 15 discoveries and 14 challenges were nominated. The audience elected "Folic acid prevents birth defects" as the greatest discovery in nutrition science since 1976. "Controlling obesity and insulin resistance through activity and diet" was elected as the greatest challenge for the coming 30 years. This selection was probably biased by the interests and knowledge of the speakers and the audience. For the present review, we therefore added 12 discoveries from the period 1976 to 2006 that we judged worthy of consideration, but that had not been nominated at the meeting. The meeting did not represent an objective selection process, but it did demonstrate that the past 30 years have yielded major new discoveries in nutrition and health.

  4. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System.

    PubMed

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-02-20

    Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is quite necessary for further composition, collaboration, and decision-making processes in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), they still cannot represent the complex relationships between devices when dynamic composition and collaboration occur, and they depend entirely on manual construction of a knowledge base with low scalability. In this paper, to address these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution with both ontological engineering methods, by extending SSN, and machine learning methods based on an entity linking (EL) model. To assess its feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia achieves relatively high accuracy and that the time complexity is at a tolerable level. Advantages and disadvantages of SWoT4CPS, along with future work, are also discussed.

  5. Current Applications of Chromatographic Methods in the Study of Human Body Fluids for Diagnosing Disorders.

    PubMed

    Jóźwik, Jagoda; Kałużna-Czaplińska, Joanna

    2016-01-01

    Currently, analysis of various human body fluids is one of the most essential and promising approaches to enable the discovery of biomarkers or pathophysiological mechanisms for disorders and diseases. Analysis of these fluids is challenging due to their complex composition and unique characteristics. Development of new analytical methods in this field has made it possible to analyze body fluids with higher selectivity, sensitivity, and precision. The composition and concentration of analytes in body fluids are most often determined by chromatography-based techniques. There is no doubt that proper use of knowledge that comes from a better understanding of the role of body fluids requires the cooperation of scientists of diverse specializations, including analytical chemists, biologists, and physicians. This article summarizes current knowledge about the application of different chromatographic methods in analyses of a wide range of compounds in human body fluids in order to diagnose certain diseases and disorders.

  6. Pathway-based analyses.

    PubMed

    Kent, Jack W

    2016-02-03

    New technologies for acquisition of genomic data, while offering unprecedented opportunities for genetic discovery, also impose severe burdens of interpretation and penalties for multiple testing. The Pathway-based Analyses Group of the Genetic Analysis Workshop 19 (GAW19) sought to reduce the multiple-testing burden through various approaches to aggregation of high-dimensional data in pathways informed by prior biological knowledge. Experimental methods tested included the use of "synthetic pathways" (random sets of genes) to estimate power and false-positive error rate of methods applied to simulated data; data reduction via independent components analysis, single-nucleotide polymorphism (SNP)-SNP interaction, and use of gene sets to estimate genetic similarity; and general assessment of the efficacy of prior biological knowledge to reduce the dimensionality of complex genomic data. The work of this group explored several promising approaches to managing high-dimensional data, with the caveat that these methods are necessarily constrained by the quality of external bioinformatic annotation.

  7. Distributed data mining on grids: services, tools, and applications.

    PubMed

    Cannataro, Mario; Congiusta, Antonio; Pugliese, Andrea; Talia, Domenico; Trunfio, Paolo

    2004-12-01

    Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called Knowledge Grid. This paper describes the Knowledge Grid framework and presents the toolset provided by the Knowledge Grid for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the Knowledge Grid tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.

  8. Knowledge Discovery/A Collaborative Approach, an Innovative Solution

    NASA Technical Reports Server (NTRS)

    Fitts, Mary A.

    2009-01-01

    Collaboration between Medical Informatics and Healthcare Systems (MIHCS) at NASA/Johnson Space Center (JSC) and the Texas Medical Center (TMC) Library was established to investigate technologies for facilitating knowledge discovery across multiple life sciences research disciplines in multiple repositories. After reviewing 14 potential Enterprise Search System (ESS) solutions, Collexis was determined to best meet the expressed needs. A three month pilot evaluation of Collexis produced positive reports from multiple scientists across 12 research disciplines. The joint venture and a pilot-phased approach achieved the desired results without the high cost of purchasing software, hardware or additional resources to conduct the task. Medical research is highly compartmentalized by discipline, e.g. cardiology, immunology, neurology. The medical research community at large, as well as at JSC, recognizes the need for cross-referencing relevant information to generate best evidence. Cross-discipline collaboration at JSC is specifically required to close knowledge gaps affecting space exploration. To facilitate knowledge discovery across these communities, MIHCS combined expertise with the TMC library and found Collexis to best fit the needs of our researchers including:

  9. Big, Deep, and Smart Data in Scanning Probe Microscopy.

    PubMed

    Kalinin, Sergei V; Strelcov, Evgheni; Belianinov, Alex; Somnath, Suhas; Vasudevan, Rama K; Lingerfelt, Eric J; Archibald, Richard K; Chen, Chaomei; Proksch, Roger; Laanait, Nouamane; Jesse, Stephen

    2016-09-27

    Scanning probe microscopy (SPM) techniques have opened the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter at nanometer and atomic scales. Here, we analyze the scientific discovery process in SPM by following the information flow from the tip-surface junction, to knowledge adoption by the wider scientific community. We further discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.

  10. Building Scalable Knowledge Graphs for Earth Science

    NASA Technical Reports Server (NTRS)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.
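
    As a minimal illustration of the triple-based representation such a knowledge graph uses (the entities, relation names, and example triples below are invented for illustration and assume the third-party package networkx), one might store extracted statements as typed directed edges and answer simple relationship queries:

```python
"""Hypothetical sketch of a tiny knowledge graph built from
(subject, relation, object) triples; not the project's actual pipeline."""
import networkx as nx

def build_graph(triples: list) -> nx.MultiDiGraph:
    g = nx.MultiDiGraph()
    for subj, rel, obj in triples:
        g.add_edge(subj, obj, relation=rel)   # typed, directed edge
    return g

def related(g: nx.MultiDiGraph, entity: str, relation: str) -> list:
    """Entities reachable from `entity` via edges labeled `relation`."""
    return sorted({v for _, v, data in g.out_edges(entity, data=True)
                   if data.get("relation") == relation})

if __name__ == "__main__":
    triples = [
        ("GPM dataset", "measures", "precipitation"),     # illustrative triples
        ("precipitation", "relatedTo", "flooding"),
        ("Paper-123", "uses", "GPM dataset"),
    ]
    g = build_graph(triples)
    print(related(g, "GPM dataset", "measures"))          # ['precipitation']
```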

  11. Genetic discoveries and nursing implications for complex disease prevention and management.

    PubMed

    Frazier, Lorraine; Meininger, Janet; Halsey Lea, Dale; Boerwinkle, Eric

    2004-01-01

    The purpose of this article is to examine the management of patients with complex diseases, in light of recent genetic discoveries, and to explore how these genetic discoveries will impact nursing practice and nursing research. The nursing science processes discussed are not comprehensive of all nursing practice but, instead, are concentrated in areas where genetics will have the greatest influence. Advances in genetic science will revolutionize our approach to patients and to health care in the prevention, diagnosis, and treatment of disease, raising many issues for nursing research and practice. As the scope of genetics expands to encompass multifactorial disease processes, a continuing reexamination of the knowledge base is required for nursing practice, with incorporation of genetic knowledge into the repertoire of every nurse, and with advanced knowledge for nurses who select specialty roles in the genetics area. This article explores the impact of this revolution on nursing science and practice as well as the opportunities for nursing science and practice to participate fully in this revolution. Because of the high proportion of the population at risk for complex diseases and because nurses are occupied every day in the prevention, assessment, treatment, and therapeutic intervention of patients with such diseases in practice and research, there is great opportunity for nurses to improve health care through the application (nursing practice) and discovery (nursing research) of genetic knowledge.

  12. Nature's Medicines: Traditional Knowledge and Intellectual Property Management. Case Studies from the National Institutes of Health (NIH), USA

    PubMed Central

    Gupta, Ranjan; Gabrielsen, Bjarne; Ferguson, Steven M.

    2009-01-01

    With the emergence and re-emergence of infectious diseases and development of multi-drug resistance, there is a dire need to find newer cures and to produce more drugs and vaccines in the pipeline. To meet these increasing demands biomedical researchers and pharmaceutical companies are combining advanced methods of drug discovery, such as combinatorial chemistry, high-throughput screening and genomics, with conventional approaches using natural products and traditional knowledge. However, such approaches require much international cooperation and understanding of international laws and conventions as well as local customs and traditions. This article reviews the forty years of cumulative experience at the National Institutes of Health (initiated by the National Cancer Institute) in natural products drug discovery. It presents (1) three major cooperative programs (2) the legal mechanisms for cooperation and (3) illustrative case studies from these programs. We hope that these discussions and our lessons learned would be helpful to others seeking to develop their own models of cooperation for the benefit of global health. PMID:16475917

  13. Questioning supports effective transmission of knowledge and increased exploratory learning in pre-kindergarten children.

    PubMed

    Yu, Yue; Landrum, Asheley R; Bonawitz, Elizabeth; Shafto, Patrick

    2018-06-19

    How can education optimize transmission of knowledge while also fostering further learning? Focusing on children at the cusp of formal schooling (N = 180, age = 4.0-6.0 y), we investigate learning after direct instruction by a knowledgeable teacher, after questioning by a knowledgeable teacher, and after questioning by a naïve informant. Consistent with previous findings, instruction by a knowledgeable teacher allows effective information transmission but at the cost of exploration and further learning. Critically, we find a dual benefit for questioning by a knowledgeable teacher: Such pedagogical questioning both effectively transmits knowledge and fosters exploration and further learning, regardless of whether the question was directed to the child or directed to a third party and overheard by the child. These effects are not observed when the same question is asked by a naïve informant. We conclude that a teacher's choice of pedagogical method may differentially influence learning through their choices of how, and how not, to present evidence, with implications for transmission of knowledge and self-directed discovery. © 2018 John Wiley & Sons Ltd.

  14. Introducing the Big Knowledge to Use (BK2U) challenge.

    PubMed

    Perl, Yehoshua; Geller, James; Halper, Michael; Ochs, Christopher; Zheng, Ling; Kapusnik-Uner, Joan

    2017-01-01

    The purpose of the Big Data to Knowledge initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is so large that it resists comprehension, referred to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge, Big Knowledge to Use. Without a high-level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of the knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule-based BK (rule BK) and demonstrate the usefulness of summarization and visualization techniques of assertion BK for clinical phenotyping. As an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping, compared to the traditional approach. We also demonstrate the usefulness of summarization and visualization techniques of rule BK for drug-drug interaction discovery. © 2016 New York Academy of Sciences.

  15. Introducing the Big Knowledge to Use (BK2U) challenge

    PubMed Central

    Perl, Yehoshua; Geller, James; Halper, Michael; Ochs, Christopher; Zheng, Ling; Kapusnik-Uner, Joan

    2016-01-01

    The purpose of the Big Data to Knowledge (BD2K) initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is so large that it resists comprehension, referred to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge, Big Knowledge to Use (BK2U). Without a high-level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of the knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule-based BK and demonstrate the usefulness of summarization and visualization techniques of assertion BK for clinical phenotyping. As an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping, compared to the traditional approach. We also demonstrate the usefulness of summarization and visualization techniques of rule-based BK for drug–drug interaction discovery. PMID:27750400

  16. Medical data mining: knowledge discovery in a clinical data warehouse.

    PubMed Central

    Prather, J. C.; Lobach, D. F.; Goodwin, L. K.; Hales, J. W.; Hage, M. L.; Hammond, W. E.

    1997-01-01

    Clinical databases have accumulated large quantities of information about patients and their medical conditions. Relationships and patterns within this data could provide new medical knowledge. Unfortunately, few methodologies have been developed and applied to discover this hidden knowledge. In this study, the techniques of data mining (also known as Knowledge Discovery in Databases) were used to search for relationships in a large clinical database. Specifically, data accumulated on 3,902 obstetrical patients were evaluated for factors potentially contributing to preterm birth using exploratory factor analysis. Three factors were identified by the investigators for further exploration. This paper describes the processes involved in mining a clinical database including data warehousing, data query and cleaning, and data analysis. PMID:9357597
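
    A rough sketch of the exploratory-factor-analysis step described above, using scikit-learn and hypothetical variable names rather than the study's actual data pipeline, might look like this:

```python
"""Illustrative sketch (not the study's code): standardize clinical variables
and extract a few latent factors whose loadings investigators can review.
Feature names and the number of factors are assumptions."""
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

def explore_factors(x: np.ndarray, feature_names: list, n_factors: int = 3):
    z = StandardScaler().fit_transform(x)           # put variables on a common scale
    fa = FactorAnalysis(n_components=n_factors, random_state=0).fit(z)
    loadings = fa.components_                       # shape: (n_factors, n_features)
    for k, row in enumerate(loadings):
        top = sorted(zip(feature_names, row), key=lambda p: abs(p[1]), reverse=True)[:5]
        print(f"Factor {k + 1}:", ", ".join(f"{name} ({w:+.2f})" for name, w in top))
    return fa
```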

  17. Discovery informatics in biological and biomedical sciences: research challenges and opportunities.

    PubMed

    Honavar, Vasant

    2015-01-01

    New discoveries in biological, biomedical and health sciences are increasingly being driven by our ability to acquire, share, integrate and analyze data, and to construct and simulate predictive models of biological systems. While much attention has focused on automating routine aspects of management and analysis of "big data", realizing the full potential of "big data" to accelerate discovery calls for automating many other aspects of the scientific process that have so far largely resisted automation: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Against this background, the PSB workshop on Discovery Informatics in Biological and Biomedical Sciences explores the opportunities and challenges of automating discovery, or assisting humans in discovery, through advances in (i) understanding, formalizing, and building information-processing accounts of the entire scientific process; (ii) designing, developing, and evaluating computational artifacts (representations, processes) that embody such understanding; and (iii) applying the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science).

  18. 10 CFR 2.705 - Discovery-additional methods.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 1 2010-01-01 2010-01-01 false Discovery-additional methods. 2.705 Section 2.705 Energy... Rules for Formal Adjudications § 2.705 Discovery-additional methods. (a) Discovery methods. Parties may obtain discovery by one or more of the following methods: depositions upon oral examination or written...

  19. Automated discovery systems and the inductivist controversy

    NASA Astrophysics Data System (ADS)

    Giza, Piotr

    2017-09-01

    The paper explores possible influences that developments in two branches of AI, called automated discovery and machine learning systems, might have upon some aspects of the old debate between Francis Bacon's inductivism and Karl Popper's falsificationism. Donald Gillies facetiously calls this controversy 'the duel of two English knights' and claims, after some analysis of historical cases of discovery, that Baconian induction had been used in science very rarely, or not at all, although he argues that the situation has changed with the advent of machine learning systems. (Some clarification of the terms machine learning and automated discovery is required here. The key idea of machine learning is that, given data with associated outcomes, software can be trained to make those associations in future cases, which typically amounts to inducing some rules from individual cases classified by experts. Automated discovery (also called machine discovery) deals with uncovering new knowledge that is valuable for human beings, and its key idea is that discovery is like other intellectual tasks and that the general idea of heuristic search in problem spaces applies also to discovery tasks. However, since machine learning systems discover (very low-level) regularities in data, throughout this paper I use the generic term automated discovery for both kinds of systems. I will elaborate on this later on.) Gillies's line of argument can be generalised: thanks to automated discovery systems, philosophers of science have at their disposal a new tool for empirically testing their philosophical hypotheses. Accordingly, in the paper, I will address the question of which of the two philosophical conceptions of scientific method is better vindicated in view of the successes and failures of systems developed within three major research programmes in the field: machine learning systems in the Turing tradition, the normative theory of scientific discovery formulated by Herbert Simon's group, and the programme called HHNT, proposed by J. Holland, K. Holyoak, R. Nisbett and P. Thagard.

  20. Cooperative knowledge evolution: a construction-integration approach to knowledge discovery in medicine.

    PubMed

    Schmalhofer, F J; Tschaitschian, B

    1998-11-01

    In this paper, we perform a cognitive analysis of knowledge discovery processes. As a result of this analysis, the construction-integration theory is proposed as a general framework for developing cooperative knowledge evolution systems. We thus suggest that for the acquisition of new domain knowledge in medicine, one should first construct pluralistic views on a given topic, which may contain inconsistencies as well as redundancies. Only thereafter is this knowledge consolidated into a situation-specific circumscription and the early inconsistencies eliminated. As proof of the viability of such knowledge acquisition processes in medicine, we present the IDEAS system, which can be used for the intelligent documentation of adverse events in clinical studies. This system provides better documentation of the side-effects of medical drugs. Knowledge evolution thereby occurs by achieving consistent explanations in increasingly larger contexts (i.e., more cases and more pharmaceutical substrates). Finally, it is shown how prototypes, model-based approaches and cooperative knowledge evolution systems can be distinguished as different classes of knowledge-based systems.

  1. Tire Changes, Fresh Air, and Yellow Flags: Challenges in Predictive Analytics for Professional Racing.

    PubMed

    Tulabandhula, Theja; Rudin, Cynthia

    2014-06-01

    Our goal is to design a prediction and decision system for real-time use during a professional car race. In designing a knowledge discovery process for racing, we faced several challenges that were overcome only when domain knowledge of racing was carefully infused within statistical modeling techniques. In this article, we describe how we leveraged expert knowledge of the domain to produce a real-time decision system for tire changes within a race. Our forecasts have the potential to impact how racing teams can optimize strategy by making tire-change decisions to benefit their rank position. Our work significantly expands previous research on sports analytics, as it is the only work on analytical methods for within-race prediction and decision making for professional car racing.

  2. Attractor Signaling Models for Discovery of Combinatorial Therapies

    DTIC Science & Technology

    2013-09-01

    ... year survival rate for this disease less than 15%. Over the years, many specific mechanisms associated with drug resistance in lung cancer have been ... reprogramming of pluripotent stem cells [4]. Moreover, it has been suggested that a biological system in a chronic or therapy-resistant disease state can ... designing new therapeutic methods for complex diseases such as cancer. Even if our knowledge of biological networks is incomplete, fast progress ...

  3. Attractor Signaling Models for Discovery of Combinatorial Therapies

    DTIC Science & Technology

    2014-11-01

    ... acquired drug resistance still makes the 5-year survival rate for this disease less than 15%. Over the years, many specific mechanisms associated with drug ... Moreover, it has been suggested that a biological system in a chronic or therapy-resistant disease state can be seen as a network that has become ... therapeutic methods for complex diseases such as cancer. Even if our knowledge of biological networks is incomplete, rapid progress is currently being ...

  4. Teaching Glycoproteins with a Classical Paper: Knowledge and Methods in the Course of an Exciting Discovery--The story of Discovering HK-ATPase [Beta]-Subunit

    ERIC Educational Resources Information Center

    Zhu, Lixin

    2008-01-01

    To integrate research into the teaching of glycoproteins, the story of discovering hydrogen-potassium ATPase (HK-ATPase) [beta] subunit is presented in a way covering all the important teaching points. The interaction between the HK-ATPase [alpha] subunit and a glycoprotein of 60-80 kDa was demonstrated to support the existence of the [beta]…

  5. Active and Interactive Discovery of Goal Selection Knowledge

    DTIC Science & Technology

    2011-01-01

    Generator retrieves the goal ct.g of the most similar case ct and outputs it to the Goal Manager. 5.3 Retention and Maintenance: Active Learning Figure ... pp. 202-206). Seattle, WA: AAAI Press. Hu, R., Delaney, S.J., & Mac Namee, B. (2010). EGAL: Exploration guided active learning for TCBR. Proceedings ... Sculley, D. (2007). Online active learning methods for fast label-efficient spam filtering. In Proceedings of the Fourth Conference on Email and Anti ...

  6. Knowledge discovery from data as a framework to decision support in medical domains

    PubMed Central

    Gibert, Karina

    2009-01-01

    Introduction Knowledge discovery from data (KDD) is a multidisciplinary field which emerged in 1996 for the "non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data". Pre-treatment of data and post-processing are as important as the data exploitation (Data Mining) itself. Different analysis techniques can be properly combined to produce explicit knowledge from data. Methods Hybrid KDD methodologies combining Artificial Intelligence with Statistics and visualization have been used to identify patterns in complex medical phenomena: experts provide prior knowledge (pK); it biases the search for distinguishable groups of homogeneous objects; support-interpretation tools (CPG) assisted experts in the conceptualization and labelling of discovered patterns, consistently with pK. Results Patterns of dependency in mental disabilities supported decision-making on legislation of the Spanish Dependency Law in Catalonia. Relationships between type of neurorehabilitation treatment and patterns of response for brain damage are assessed. Patterns of perceived QOL over time are used in spinal cord lesion to improve social inclusion. Conclusion Reality is more and more complex, and classical data analyses are not powerful enough to model it. New methodologies are required that embrace multidisciplinarity and stress the production of understandable models. Interaction with the experts is critical to generate meaningful results that can really support decision-making; it is particularly valuable to transfer the pK to the system, as well as to interpret results in close interaction with experts. KDD is a valuable paradigm, particularly when facing very complex domains that are not yet well understood, like many medical phenomena.

  7. Pseudotargeted MS Method for the Sensitive Analysis of Protein Phosphorylation in Protein Complexes.

    PubMed

    Lyu, Jiawen; Wang, Yan; Mao, Jiawei; Yao, Yating; Wang, Shujuan; Zheng, Yong; Ye, Mingliang

    2018-05-15

    In this study, we present an enrichment-free approach for the sensitive analysis of protein phosphorylation in minute amounts of sample, such as purified protein complexes. This method takes advantage of the high sensitivity of parallel reaction monitoring (PRM). Specifically, low-confidence phosphopeptides identified from the data-dependent acquisition (DDA) data set were used to build a pseudotargeted list for PRM analysis, allowing additional phosphopeptides to be identified with high confidence. The development of this targeted approach is straightforward, as the same sample and the same LC system were used for the discovery and the targeted analysis phases. No sample fractionation or enrichment was required for the discovery phase, which allows this method to analyze minute amounts of sample. We applied this pseudotargeted MS method to quantitatively examine phosphopeptides in affinity-purified endogenous Shc1 protein complexes at four temporal stages of EGF signaling and identified 82 phospho-sites. To our knowledge, this is the highest number of phospho-sites identified from the protein complexes. This pseudotargeted MS method is highly sensitive in the identification of low-abundance phosphopeptides and could be a powerful tool to study the phosphorylation-regulated assembly of protein complexes.
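
    The computational core of the pseudotargeted idea, turning low-confidence DDA phosphopeptide identifications into a PRM inclusion list, can be sketched as follows; the column names, probability band, and retention-time window are illustrative assumptions, not the authors' settings:

```python
"""Hypothetical sketch: select low-confidence phosphopeptide IDs from a DDA
search-result table and write a PRM inclusion list. All column names and
cut-offs are invented for illustration."""
import csv

LOW, HIGH = 0.2, 0.9      # assumed probability band for "low confidence"
RT_WINDOW_MIN = 2.0       # assumed scheduling window (minutes) around the DDA RT

def build_inclusion_list(dda_results_csv: str, out_csv: str) -> int:
    kept = 0
    with open(dda_results_csv, newline="") as fin, open(out_csv, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.writer(fout)
        writer.writerow(["precursor_mz", "charge", "rt_start_min", "rt_end_min", "peptide"])
        for row in reader:
            is_phospho = "Phospho" in row["modifications"]
            score = float(row["probability"])
            if is_phospho and LOW <= score < HIGH:   # low-confidence phospho IDs only
                rt = float(row["retention_time_min"])
                writer.writerow([row["precursor_mz"], row["charge"],
                                 round(rt - RT_WINDOW_MIN / 2, 2),
                                 round(rt + RT_WINDOW_MIN / 2, 2),
                                 row["peptide"]])
                kept += 1
    return kept
```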

  8. Discovery of new candidate genes related to brain development using protein interaction information.

    PubMed

    Chen, Lei; Chu, Chen; Kong, Xiangyin; Huang, Tao; Cai, Yu-Dong

    2015-01-01

    Human brain development is a dramatic process composed of a series of complex and fine-tuned spatiotemporal gene expressions. A good comprehension of this process can assist us in developing the potential of our brain. However, we have only limited knowledge about the genes and gene functions that are involved in this biological process. Therefore, a substantial demand remains to discover new brain development-related genes and identify their biological functions. In this study, we aimed to discover new brain-development related genes by building a computational method. We referred to a series of computational methods used to discover new disease-related genes and developed a similar method. In this method, the shortest path algorithm was executed on a weighted graph that was constructed using protein-protein interactions. New candidate genes fell on at least one of the shortest paths connecting two known genes that are related to brain development. A randomization test was then adopted to filter positive discoveries. Of the final identified genes, several have been reported to be associated with brain development, indicating the effectiveness of the method, whereas several of the others may have potential roles in brain development.
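
    A simplified sketch of this shortest-path-plus-randomization strategy, assuming a weighted protein-protein interaction graph in networkx and illustrative parameters rather than the paper's exact scoring, could be written as:

```python
"""Illustrative sketch (assumptions, not the paper's code): collect candidate
genes lying on weighted shortest paths between known brain-development genes,
then keep candidates that survive a simple permutation test."""
import itertools
import random
import networkx as nx

def candidates_on_shortest_paths(g: nx.Graph, seeds: set) -> dict:
    counts = {}
    for a, b in itertools.combinations(sorted(seeds), 2):
        try:
            path = nx.shortest_path(g, a, b, weight="weight")
        except nx.NetworkXNoPath:
            continue
        for node in path[1:-1]:                 # interior nodes are candidates
            if node not in seeds:
                counts[node] = counts.get(node, 0) + 1
    return counts

def permutation_filter(g: nx.Graph, seeds: set, n_perm: int = 100,
                       alpha: float = 0.05) -> set:
    observed = candidates_on_shortest_paths(g, seeds)
    exceed = {c: 0 for c in observed}
    nodes = list(g.nodes)
    for _ in range(n_perm):
        random_seeds = set(random.sample(nodes, len(seeds)))
        null = candidates_on_shortest_paths(g, random_seeds)
        for c, obs in observed.items():
            if null.get(c, 0) >= obs:
                exceed[c] += 1
    # keep candidates rarely reached this often by random seed sets
    return {c for c, k in exceed.items() if (k + 1) / (n_perm + 1) < alpha}
```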

  9. Developing computer-based training programs for basic mammalian histology: Didactic versus discovery-based design

    NASA Astrophysics Data System (ADS)

    Fabian, Henry Joel

    Educators have long tried to understand what stimulates students to learn. The Swiss psychologist and zoologist Jean Piaget suggested that students are stimulated to learn when they attempt to resolve confusion. He reasoned that students try to explain the world with the knowledge they have acquired in life. When they find their own explanations to be inadequate to explain phenomena, students find themselves in a temporary state of confusion. This prompts students to seek more plausible explanations. At this point, students are primed for learning (Piaget 1964). The Piagetian approach described above is called learning by discovery. To promote discovery learning, a teacher must first allow the student to recognize his misconception and then provide a plausible explanation to replace that misconception (Chinn and Brewer 1993). One application of this method is found in the various learning cycles, which have been demonstrated to be effective means for teaching science (Renner and Lawson 1973, Lawson 1986, Marek and Methven 1991, and Glasson & Lalik 1993). In contrast to the learning cycle, tutorial computer programs are generally not designed to correct student misconceptions, but rather follow a passive, didactic method of teaching. In the didactic or expositional method, the student is told about a phenomenon, but is neither encouraged to explore it nor to explain it in his own terms (Schneider and Renner 1980).

  10. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy.

    PubMed

    Bekhuis, Tanja

    2006-04-03

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), is presented, followed by a brief review of Swanson's ideas and a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is also mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.
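
    To illustrate the 'testing' side of literature-based discovery mentioned above, the closed-discovery variant, here is a toy sketch that, given abstracts about a suspected A-C pair, ranks shared B-terms; the tokenization and ranking are simplistic assumptions, not Swanson's or any production system's method:

```python
"""Minimal, hypothetical sketch of 'closed' literature-based discovery:
given a suspected A-C relationship, look for intermediate B-terms that
co-occur with A in one literature and with C in another."""
from collections import Counter

STOPWORDS = {"the", "of", "and", "in", "with", "a", "is", "to"}

def terms(abstract: str) -> set:
    return {w.strip(".,;()").lower() for w in abstract.split()} - STOPWORDS

def linking_b_terms(a_abstracts: list, c_abstracts: list, top_k: int = 10) -> list:
    """Rank candidate B-terms by how often they appear in both literatures."""
    a_counts = Counter(t for abs_ in a_abstracts for t in terms(abs_))
    c_counts = Counter(t for abs_ in c_abstracts for t in terms(abs_))
    shared = set(a_counts) & set(c_counts)
    return sorted(shared, key=lambda t: a_counts[t] * c_counts[t], reverse=True)[:top_k]

# e.g. linking_b_terms(raynaud_abstracts, fish_oil_abstracts) might surface
# single-word B-candidates such as "viscosity" or "platelet".
```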

  11. Intelligent Systems: Terrestrial Observation and Prediction Using Remote Sensing Data

    NASA Technical Reports Server (NTRS)

    Coughlan, Joseph C.

    2005-01-01

    NASA has made science and technology investments to better utilize its large space-borne remote sensing data holdings of the Earth. With the launch of Terra, NASA created a data-rich environment in which the challenge is to fully utilize the data collected from EOS. However, despite unprecedented amounts of observed data, there is a need for increasing the frequency, resolution, and diversity of observations. Current terrestrial models that use remote sensing data were constructed in a relatively data- and compute-limited era and do not take full advantage of online learning methods and assimilation techniques that can exploit these data. NASA has invested in visualization, data mining, and knowledge discovery methods, which have facilitated data exploitation, but these methods are insufficient for improving Earth science models that have extensive background knowledge, nor do they refine understanding of complex processes. Investing in interdisciplinary teams that include computational scientists can lead to new models and systems for online operation and analysis of data that can autonomously improve in prediction skill over time.

  12. The OceanLink Project

    NASA Astrophysics Data System (ADS)

    Narock, T.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Finin, T.; Hitzler, P.; Krisnadhi, A.; Raymond, L. M.; Shepherd, A.; Wiebe, P. H.

    2014-12-01

    A wide spectrum of maturing methods and tools, collectively characterized as the Semantic Web, is helping to vastly improve the dissemination of scientific research. Creating semantic integration requires input from both domain and cyberinfrastructure scientists. OceanLink, an NSF EarthCube Building Block, is demonstrating semantic technologies through the integration of geoscience data repositories, library holdings, conference abstracts, and funded research awards. Meeting project objectives involves applying semantic technologies to support data representation, discovery, sharing and integration. Our semantic cyberinfrastructure components include ontology design patterns, Linked Data collections, semantic provenance, and associated services to enhance data and knowledge discovery, interoperation, and integration. We discuss how these components are integrated, the continued automated and semi-automated creation of semantic metadata, and techniques we have developed to integrate ontologies, link resources, and preserve provenance and attribution.

  13. Mentor-mentee Relationship: A Win-Win Contract In Graduate Medical Education.

    PubMed

    Toklu, Hale Z; Fuller, Jacklyn C

    2017-12-05

    Scholarly activities (i.e., the discovery of new knowledge; the development of new technologies, methods, materials, or uses; and the integration of knowledge leading to new understanding) are intended to measure the quality and quantity of the dissemination of knowledge. A successful mentorship program is necessary during residency to help residents achieve the six core competencies (patient care, medical knowledge, practice-based learning and improvement, systems-based practice, professionalism, and interpersonal and communication skills) required by the Accreditation Council for Graduate Medical Education (ACGME). The role of the mentor in this process is pivotal in advancing the residents' knowledge of evidence-based medicine. Through this process, mentees become more self-regulated, exhibit confidence in their performance, and demonstrate more insight and aptitude in their jobs, while mentors achieve elevated self-esteem, enhanced leadership skills, and personal gratification. As such, we may conclude that mentoring is a two-sided relationship, a 'win-win' style of commitment between the mentor and mentee. Hence, both parties will eventually advance academically as well as professionally.

  14. Modeling technology innovation: how science, engineering, and industry methods can combine to generate beneficial socioeconomic impacts.

    PubMed

    Stone, Vathsala I; Lane, Joseph P

    2012-05-16

    Government-sponsored science, technology, and innovation (STI) programs support the socioeconomic aspects of public policies, in addition to expanding the knowledge base. For example, beneficial healthcare services and devices are expected to result from investments in research and development (R&D) programs, which assume a causal link to commercial innovation. Such programs are increasingly held accountable for evidence of impact-that is, innovative goods and services resulting from R&D activity. However, the absence of comprehensive models and metrics skews evidence gathering toward bibliometrics about research outputs (published discoveries), with less focus on transfer metrics about development outputs (patented prototypes) and almost none on econometrics related to production outputs (commercial innovations). This disparity is particularly problematic for the expressed intent of such programs, as most measurable socioeconomic benefits result from the last category of outputs. This paper proposes a conceptual framework integrating all three knowledge-generating methods into a logic model, useful for planning, obtaining, and measuring the intended beneficial impacts through the implementation of knowledge in practice. Additionally, the integration of the Context-Input-Process-Product (CIPP) model of evaluation proactively builds relevance into STI policies and programs while sustaining rigor. The resulting logic model framework explicitly traces the progress of knowledge from inputs, following it through the three knowledge-generating processes and their respective knowledge outputs (discovery, invention, innovation), as it generates the intended socio-beneficial impacts. It is a hybrid model for generating technology-based innovations, where best practices in new product development merge with a widely accepted knowledge-translation approach. Given the emphasis on evidence-based practice in the medical and health fields and "bench to bedside" expectations for knowledge transfer, sponsors and grantees alike should find the model useful for planning, implementing, and evaluating innovation processes. High-cost/high-risk industries like healthcare require the market deployment of technology-based innovations to improve domestic society in a global economy. An appropriate balance of relevance and rigor in research, development, and production is crucial to optimize the return on public investment in such programs. The technology-innovation process needs a comprehensive operational model to effectively allocate public funds and thereby deliberately and systematically accomplish socioeconomic benefits.

  15. Knowledge Discovery in Biological Databases for Revealing Candidate Genes Linked to Complex Phenotypes.

    PubMed

    Hassani-Pak, Keywan; Rawlings, Christopher

    2017-06-13

    Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.

  16. Company Profile: Selventa, Inc.

    PubMed

    Fryburg, David A; Latino, Louis J; Tagliamonte, John; Kenney, Renee D; Song, Diane H; Levine, Arnold J; de Graaf, David

    2012-08-01

    Selventa, Inc. (MA, USA) is a biomarker discovery company that enables personalized healthcare. Originally founded as Genstruct, Inc., Selventa has undergone significant evolution from a technology-based service provider to an active partner in the development of diagnostic tests, functioning as a molecular dashboard of disease activity using a unique platform. As part of that evolution, approximately 2 years ago the company was rebranded as Selventa to reflect its new identity and mission. The contributions to biomedical research by Selventa are based on in silico, reverse-engineering methods to determine biological causality. That is, given a set of in vitro or in vivo biological observations, which biological mechanisms can explain the measured results? Facilitated by a large and carefully curated knowledge base, these in silico methods generated new insights into the mechanisms driving a disease. As Selventa's methods would enable biomarker discovery and be directly applicable to generating novel diagnostics, the scientists at Selventa have focused on the development of predictive biomarkers of response in autoimmune and oncologic diseases. Selventa is presently building a portfolio of independent, as well as partnered, biomarker projects with the intention to create diagnostic tests that predict response to therapy.

  17. [Analysis on traditional Chinese medicine prescriptions treating cancer based on traditional Chinese medicine inheritance assistance system and discovery of new prescriptions].

    PubMed

    Yu, Ming; Cao, Qi-chen; Su, Yu-xi; Sui, Xin; Yang, Hong-jun; Huang, Lu-qi; Wang, Wen-ping

    2015-08-01

    Malignant tumor is one of the main causes of death in the world at present, as well as a major disease that seriously harms human health and life and restricts social and economic development. There are many kinds of reports about traditional Chinese medicine patent prescriptions, empirical prescriptions and self-made prescriptions treating cancer, and prescription rules have often been analyzed based on medication frequency. Such methods are suitable for discovering dominant experience but make it hard to arrive at innovative discoveries and knowledge. In this paper, based on the traditional Chinese medicine inheritance assistance system, software integrating the mutual information improvement method, complex system entropy clustering and unsupervised entropy-level clustering data mining methods was adopted to analyze the rules of traditional Chinese medicine prescriptions for cancer. A total of 114 prescriptions were selected, the frequency of herbs in the prescriptions was determined, and 85 core combinations and 13 new prescriptions were identified. The traditional Chinese medicine inheritance assistance system, as a valuable traditional Chinese medicine research-supporting tool, can be used to record, manage, inquire and analyze prescription data.
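
    A hypothetical sketch of the frequency and co-occurrence analysis underlying such prescription mining (the actual system's mutual-information improvement and entropy-clustering methods are not reproduced here) is shown below; thresholds are illustrative:

```python
"""Toy sketch: count herb frequencies across prescriptions and score herb
pairs by pointwise mutual information (PMI) to suggest core combinations."""
import math
from collections import Counter
from itertools import combinations

def core_combinations(prescriptions: list, min_pair_count: int = 5) -> list:
    n = len(prescriptions)
    herb_counts = Counter(h for p in prescriptions for h in p)
    pair_counts = Counter(frozenset(pair) for p in prescriptions
                          for pair in combinations(sorted(p), 2))
    scored = []
    for pair, c in pair_counts.items():
        if c < min_pair_count:
            continue
        a, b = tuple(pair)
        # PMI: how much more often the pair co-occurs than expected by chance
        pmi = math.log((c / n) / ((herb_counts[a] / n) * (herb_counts[b] / n)))
        scored.append((a, b, c, round(pmi, 3)))
    return sorted(scored, key=lambda t: t[3], reverse=True)
```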

  18. Great Originals of Modern Physics

    ERIC Educational Resources Information Center

    Decker, Fred W.

    1972-01-01

    European travel can provide an intimate view of the implements and locales of great discoveries in physics for the knowledgeable traveler. The four museums at Cambridge, London, Remscheid-Lennep, and Munich display a full range of discovery apparatus in modern physics as outlined here. (Author/TS)

  19. Dulse on the Distaff Side.

    ERIC Educational Resources Information Center

    MacKenzie, Marion

    1983-01-01

    Scientific research leading to the discovery of female plants of the red alga Palmaria plamata (dulse) is described. This discovery has not only advanced knowledge of marine organisms and taxonomic relationships but also has practical implications. The complete life cycle of this organism is included. (JN)

  20. 43 CFR 4.1132 - Scope of discovery.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...

  1. 43 CFR 4.1132 - Scope of discovery.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...

  2. 43 CFR 4.1132 - Scope of discovery.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...

  3. 43 CFR 4.1132 - Scope of discovery.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...

  4. Scientific Knowledge Discovery in Complex Semantic Networks of Geophysical Systems

    NASA Astrophysics Data System (ADS)

    Fox, P.

    2012-04-01

    The vast majority of explorations of the Earth's systems are limited in their ability to effectively explore the most important (often most difficult) problems because they are forced to interconnect at the data-element, or syntactic, level rather than at a higher scientific, or semantic, level. Recent successes in the application of complex network theory and algorithms to climate data, raise expectations that more general graph-based approaches offer the opportunity for new discoveries. In the past ~ 5 years in the natural sciences there has substantial progress in providing both specialists and non-specialists the ability to describe in machine readable form, geophysical quantities and relations among them in meaningful and natural ways, effectively breaking the prior syntax barrier. The corresponding open-world semantics and reasoning provide higher-level interconnections. That is, semantics provided around the data structures, using semantically-equipped tools, and semantically aware interfaces between science application components allowing for discovery at the knowledge level. More recently, formal semantic approaches to continuous and aggregate physical processes are beginning to show promise and are soon likely to be ready to apply to geoscientific systems. To illustrate these opportunities, this presentation presents two application examples featuring domain vocabulary (ontology) and property relations (named and typed edges in the graphs). First, a climate knowledge discovery pilot encoding and exploration of CMIP5 catalog information with the eventual goal to encode and explore CMIP5 data. Second, a multi-stakeholder knowledge network for integrated assessments in marine ecosystems, where the data is highly inter-disciplinary.

  5. Causal discovery in the geosciences-Using synthetic data to learn how to interpret results

    NASA Astrophysics Data System (ADS)

    Ebert-Uphoff, Imme; Deng, Yi

    2017-02-01

    Causal discovery algorithms based on probabilistic graphical models have recently emerged in geoscience applications for the identification and visualization of dynamical processes. The key idea is to learn the structure of a graphical model from observed spatio-temporal data, thus finding pathways of interactions in the observed physical system. Studying those pathways allows geoscientists to learn subtle details about the underlying dynamical mechanisms governing our planet. Initial studies using this approach on real-world atmospheric data have shown great potential for scientific discovery. However, in these initial studies no ground truth was available, so that the resulting graphs have been evaluated only by whether a domain expert thinks they seemed physically plausible. The lack of ground truth is a typical problem when using causal discovery in the geosciences. Furthermore, while most of the connections found by this method match domain knowledge, we encountered one type of connection for which no explanation was found. To address both of these issues we developed a simulation framework that generates synthetic data of typical atmospheric processes (advection and diffusion). Applying the causal discovery algorithm to the synthetic data allowed us (1) to develop a better understanding of how these physical processes appear in the resulting connectivity graphs, and thus how to better interpret such connectivity graphs when obtained from real-world data; (2) to solve the mystery of the previously unexplained connections.
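
    The flavor of this synthetic-data approach can be sketched as follows; note that this toy example uses plain lagged correlation on a simulated advection-diffusion field, whereas the study applies proper constraint-based causal discovery, and all parameters are arbitrary assumptions:

```python
"""Illustrative sketch, not the authors' framework: simulate a periodic 1-D
advection-diffusion field and link grid points whose lagged correlation is
high, yielding a crude directed 'connectivity' graph."""
import numpy as np

def simulate_advection_diffusion(nx_pts=40, nt=400, u=1.0, d=0.05, dt=0.1, dx=1.0, seed=0):
    rng = np.random.default_rng(seed)
    field = np.zeros((nt, nx_pts))
    state = rng.normal(size=nx_pts)
    for t in range(nt):
        adv = -u * (state - np.roll(state, 1)) / dx                        # upwind advection
        dif = d * (np.roll(state, 1) - 2 * state + np.roll(state, -1)) / dx**2
        state = state + dt * (adv + dif) + 0.1 * rng.normal(size=nx_pts)   # forcing noise
        field[t] = state
    return field

def lagged_connectivity(field: np.ndarray, lag: int = 1, threshold: float = 0.5):
    nt, nx_pts = field.shape
    past, future = field[:-lag], field[lag:]
    edges = []
    for i in range(nx_pts):
        for j in range(nx_pts):
            if i == j:
                continue
            r = np.corrcoef(past[:, i], future[:, j])[0, 1]
            if r > threshold:
                edges.append((i, j, round(float(r), 2)))   # i at t "precedes" j at t+lag
    return edges

if __name__ == "__main__":
    print(lagged_connectivity(simulate_advection_diffusion())[:10])
```

    Because the advection velocity is known in the simulation, one can check whether the recovered edges point downstream, which is exactly the kind of ground-truth comparison the study uses to learn how to interpret graphs obtained from real data.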

  6. Binding Free Energy Calculations for Lead Optimization: Assessment of Their Accuracy in an Industrial Drug Design Context.

    PubMed

    Homeyer, Nadine; Stoll, Friederike; Hillisch, Alexander; Gohlke, Holger

    2014-08-12

    Correctly ranking compounds according to their computed relative binding affinities will be of great value for decision making in the lead optimization phase of industrial drug discovery. However, the performance of existing computationally demanding binding free energy calculation methods in this context is largely unknown. We analyzed the performance of the molecular mechanics continuum solvent, the linear interaction energy (LIE), and the thermodynamic integration (TI) approach for three sets of compounds from industrial lead optimization projects. The data sets pose challenges typical for this early stage of drug discovery. None of the methods was sufficiently predictive when applied out of the box without considering these challenges. Detailed investigations of failures revealed critical points that are essential for good binding free energy predictions. When data set-specific features were considered accordingly, predictions valuable for lead optimization could be obtained for all approaches but LIE. Our findings lead to clear recommendations for when to use which of the above approaches. Our findings also stress the important role of expert knowledge in this process, not least for estimating the accuracy of prediction results by TI, using indicators such as the size and chemical structure of exchanged groups and the statistical error in the predictions. Such knowledge will be invaluable when it comes to the question which of the TI results can be trusted for decision making.
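
    For reference, the thermodynamic integration (TI) estimate mentioned above is obtained by integrating the ensemble-averaged derivative of the coupled Hamiltonian over the alchemical parameter; a standard textbook form (not specific to this study), together with its discrete quadrature approximation over sampled lambda windows, is:

```latex
\Delta G_{0 \rightarrow 1}
  = \int_{0}^{1} \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda} \, d\lambda
  \approx \sum_{k} w_{k} \left\langle \frac{\partial H(\lambda)}{\partial \lambda} \right\rangle_{\lambda_{k}}
```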

  7. 14 CFR 406.143 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 14 Aeronautics and Space 4 2014-01-01 2014-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  8. 14 CFR 406.143 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 14 Aeronautics and Space 4 2011-01-01 2011-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  9. 14 CFR 406.143 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 4 2012-01-01 2012-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  10. 14 CFR 406.143 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 4 2013-01-01 2013-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  11. 14 CFR 406.143 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 14 Aeronautics and Space 4 2010-01-01 2010-01-01 false Discovery. 406.143 Section 406.143... Transportation Adjudications § 406.143 Discovery. (a) Initiation of discovery. Any party may initiate discovery... after a complaint has been filed. (b) Methods of discovery. The following methods of discovery are...

  12. Lung tumor diagnosis and subtype discovery by gene expression profiling.

    PubMed

    Wang, Lu-yong; Tu, Zhuowen

    2006-01-01

    The optimal treatment of patients with complex diseases, such as cancers, depends on accurate diagnosis using a combination of clinical and histopathological data. In many scenarios, this becomes tremendously difficult because of limitations in clinical presentation and histopathology. To accurately diagnose complex diseases, molecular classification based on gene or protein expression profiles is indispensable for modern medicine. Moreover, many heterogeneous diseases consist of various potential subtypes at the molecular level and differ remarkably in their response to therapies. It is therefore critical to accurately predict subtypes from disease gene expression profiles. More fundamental knowledge of the molecular basis and classification of disease could aid in the prediction of patient outcome, the informed selection of therapies, and identification of novel molecular targets for therapy. In this paper, we propose a new disease diagnostic method, the probabilistic boosting tree (PB tree) method, applied to gene expression profiles of lung tumors. It enables accurate disease classification and subtype discovery. It automatically constructs a tree in which each node combines a number of weak classifiers into a strong classifier. Also, subtype discovery is naturally embedded in the learning process. Our algorithm achieves excellent diagnostic performance and is simultaneously capable of detecting disease subtypes from the gene expression profile.
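
    A compressed, hypothetical sketch of the probabilistic-boosting-tree idea described above: a boosted strong classifier at every node routes samples to child nodes, and posteriors are mixed back along the routing probabilities. The class name, depth limits, and use of scikit-learn's AdaBoost are illustrative assumptions, not the authors' implementation.

        # Illustrative probabilistic boosting tree: each node trains a boosted
        # strong classifier from weak learners and routes samples to children
        # by its predicted probability; prediction mixes the children's posteriors.
        # X, y are NumPy arrays; y holds binary labels in {0, 1}.
        import numpy as np
        from sklearn.ensemble import AdaBoostClassifier

        class PBTNode:
            def __init__(self, depth=0, max_depth=3, min_samples=20):
                self.depth, self.max_depth, self.min_samples = depth, max_depth, min_samples
                self.clf = self.left = self.right = None

            def fit(self, X, y):
                self.prior = y.mean() if len(y) else 0.5      # leaf estimate of P(class 1)
                if self.depth >= self.max_depth or len(y) < self.min_samples or len(set(y)) < 2:
                    return self
                clf = AdaBoostClassifier(n_estimators=25).fit(X, y)
                p = clf.predict_proba(X)[:, 1]
                lo, hi = p < 0.5, p >= 0.5                    # route by boosted posterior
                if lo.sum() == 0 or hi.sum() == 0:
                    return self                               # cannot split further
                self.clf = clf
                self.left = PBTNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[lo], y[lo])
                self.right = PBTNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[hi], y[hi])
                return self

            def predict_proba(self, x):
                if self.clf is None:
                    return self.prior
                p = self.clf.predict_proba(x.reshape(1, -1))[0, 1]
                return p * self.right.predict_proba(x) + (1 - p) * self.left.predict_proba(x)

    Fitting such a tree on an expression matrix X (samples by genes) with binary labels y groups samples by boosted posteriors along the branches, which is where subtype structure would surface.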

  13. Visual-servoing optical microscopy

    DOEpatents

    Callahan, Daniel E.; Parvin, Bahram

    2009-06-09

    The present invention provides methods and devices for the knowledge-based discovery and optimization of differences between cell types. In particular, the present invention provides visual servoing optical microscopy, as well as analysis methods. The present invention provides means for the close monitoring of hundreds of individual, living cells over time: quantification of dynamic physiological responses in multiple channels; real-time digital image segmentation and analysis; intelligent, repetitive computer-applied cell stress and cell stimulation; and the ability to return to the same field of cells for long-term studies and observation. The present invention further provides means to optimize culture conditions for specific subpopulations of cells.

  14. Visual-servoing optical microscopy

    DOEpatents

    Callahan, Daniel E. [Martinez, CA]; Parvin, Bahram [Mill Valley, CA]

    2011-05-24

    The present invention provides methods and devices for the knowledge-based discovery and optimization of differences between cell types. In particular, the present invention provides visual servoing optical microscopy, as well as analysis methods. The present invention provides means for the close monitoring of hundreds of individual, living cells over time; quantification of dynamic physiological responses in multiple channels; real-time digital image segmentation and analysis; intelligent, repetitive computer-applied cell stress and cell stimulation; and the ability to return to the same field of cells for long-term studies and observation. The present invention further provides means to optimize culture conditions for specific subpopulations of cells.

  15. Visual-servoing optical microscopy

    DOEpatents

    Callahan, Daniel E; Parvin, Bahram

    2013-10-01

    The present invention provides methods and devices for the knowledge-based discovery and optimization of differences between cell types. In particular, the present invention provides visual servoing optical microscopy, as well as analysis methods. The present invention provides means for the close monitoring of hundreds of individual, living cells over time; quantification of dynamic physiological responses in multiple channels; real-time digital image segmentation and analysis; intelligent, repetitive computer-applied cell stress and cell stimulation; and the ability to return to the same field of cells for long-term studies and observation. The present invention further provides means to optimize culture conditions for specific subpopulations of cells.

  16. An Ensemble Approach to Building Mercer Kernels with Prior Information

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd

    2005-01-01

    This paper presents a new methodology for automatic knowledge driven data mining based on the theory of Mercer Kernels, which are highly nonlinear, symmetric, positive definite mappings from the original image space to a very high, possibly infinite, dimensional feature space. We describe a new method called Mixture Density Mercer Kernels to learn the kernel function directly from data, rather than using pre-defined kernels. These data adaptive kernels can encode prior knowledge in the kernel using a Bayesian formulation, thus allowing for physical information to be encoded in the model. Specifically, we demonstrate the use of the algorithm in situations with extremely small samples of data. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS) and demonstrate the method's superior performance against standard methods. The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains templates for learning and knowledge discovery algorithms like different versions of EM, or numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic-algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code.
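
    A rough sketch of the underlying idea, under the assumption that the kernel is built from the posterior memberships of an ensemble of bootstrapped mixture-density models (any sum of such inner products is positive semi-definite and hence a valid Mercer kernel). This is not the AUTOBAYES-generated code, and the exact formulation in the paper may differ.

        # Data-adaptive "mixture density" kernel sketch: fit an ensemble of Gaussian
        # mixture models on bootstrap samples and let K(x, y) be the averaged inner
        # product of posterior membership vectors.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        def mixture_density_kernel(X, n_models=5, n_components=3, seed=0):
            rng = np.random.default_rng(seed)
            K = np.zeros((len(X), len(X)))
            for m in range(n_models):
                idx = rng.integers(0, len(X), size=len(X))   # bootstrap resample
                gmm = GaussianMixture(n_components=n_components, random_state=m).fit(X[idx])
                R = gmm.predict_proba(X)       # posterior memberships (n_samples, n_components)
                K += R @ R.T                   # inner products of membership vectors
            return K / n_models

        X = np.random.default_rng(1).normal(size=(100, 4))
        K = mixture_density_kernel(X)
        print(K.shape, np.allclose(K, K.T))    # symmetric Gram matrix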

  17. Towards a Semantic Web of Things: A Hybrid Semantic Annotation, Extraction, and Reasoning Framework for Cyber-Physical System

    PubMed Central

    Wu, Zhenyu; Xu, Yuan; Yang, Yunong; Zhang, Chunhong; Zhu, Xinning; Ji, Yang

    2017-01-01

    Web of Things (WoT) facilitates the discovery and interoperability of Internet of Things (IoT) devices in a cyber-physical system (CPS). Moreover, a uniform knowledge representation of physical resources is necessary for further composition, collaboration, and decision-making processes in CPS. Though several efforts have integrated semantics with WoT, such as knowledge engineering methods based on semantic sensor networks (SSN), these approaches still cannot represent the complex relationships between devices when dynamic composition and collaboration occur, and they depend entirely on manual construction of a knowledge base, which limits scalability. In this paper, to address these limitations, we propose the semantic Web of Things (SWoT) framework for CPS (SWoT4CPS). SWoT4CPS provides a hybrid solution combining ontological engineering methods that extend SSN with machine learning methods based on an entity linking (EL) model. To verify its feasibility and performance, we demonstrate the framework by implementing a temperature anomaly diagnosis and automatic control use case in a building automation system. Evaluation results on the EL method show that linking domain knowledge to DBpedia achieves relatively high accuracy and that the time complexity is at a tolerable level. Advantages and disadvantages of SWoT4CPS, together with future work, are also discussed. PMID:28230725

  18. From Information Center to Discovery System: Next Step for Libraries?

    ERIC Educational Resources Information Center

    Marcum, James W.

    2001-01-01

    Proposes a discovery system model to guide technology integration in academic libraries that fuses organizational learning, systems learning, and knowledge creation techniques with constructivist learning practices to suggest possible future directions for digital libraries. Topics include accessing visual and continuous media; information…

  19. Foreword to "The Secret of Childhood."

    ERIC Educational Resources Information Center

    Stephenson, Margaret E.

    2000-01-01

    Discusses the basic discoveries of Montessori's Casa dei Bambini. Considers principles of Montessori's organizing theory: the absorbent mind, the unfolding nature of life, the spiritual embryo, self-construction, acquisition of culture, creativity of life, repetition of exercise, freedom within limits, children's discovery of knowledge, the secret…

  20. Daily life activity routine discovery in hemiparetic rehabilitation patients using topic models.

    PubMed

    Seiter, J; Derungs, A; Schuster-Amft, C; Amft, O; Tröster, G

    2015-01-01

    Monitoring natural behavior and activity routines of hemiparetic rehabilitation patients across the day can provide valuable progress information for therapists and patients and contribute to an optimized rehabilitation process. In particular, continuous patient monitoring could add type, frequency and duration of daily life activity routines and hence complement standard clinical scores that are assessed for particular tasks only. Machine learning methods have been applied to infer activity routines from sensor data. However, supervised methods require activity annotations to build recognition models and thus require extensive patient supervision. Discovery methods, including topic models, could provide patient routine information and deal with variability in activity and movement performance across patients. Topic models have been used to discover characteristic activity routine patterns of healthy individuals using activity primitives recognized from supervised sensor data. Yet, the applicability of topic models for hemiparetic rehabilitation patients and techniques to derive activity primitives without supervision need to be addressed. We investigate 1) whether a topic model-based activity routine discovery framework can infer activity routines of rehabilitation patients from wearable motion sensor data, and 2) how our topic model-based activity routine discovery performs with rule-based versus clustering-based activity vocabularies. We analyze the activity routine discovery in a dataset recorded with 11 hemiparetic rehabilitation patients during up to ten full recording days per individual in an ambulatory daycare rehabilitation center using wearable motion sensors attached to both wrists and the non-affected thigh. We introduce and compare rule-based and clustering-based activity vocabularies that map statistical and frequency-domain acceleration features to activity words. Activity words were used for activity routine pattern discovery using topic models based on Latent Dirichlet Allocation. Discovered activity routine patterns were then mapped to six categorized activity routines. Using the rule-based approach, activity routines could be discovered with an average accuracy of 76% across all patients. The rule-based approach outperformed clustering by 10% and showed fewer confusions among predicted activity routines. Topic models are suitable for discovering daily life activity routines in hemiparetic rehabilitation patients without trained classifiers and activity annotations. Activity routines show characteristic patterns regarding activity primitives including body and extremity postures and movement. A patient-independent rule set can be derived. Including expert knowledge supports successful activity routine discovery over completely data-driven clustering.
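
    A minimal sketch of the final discovery step only (activity-word "documents" per monitoring window, topic inference with LDA); the sensor feature extraction, vocabulary construction and routine mapping are omitted, and all word strings are placeholders.

        # Topic-modelling step: each monitoring window becomes a bag of activity
        # words; LDA then discovers routine "topics". Words are placeholders.
        from sklearn.decomposition import LatentDirichletAllocation
        from sklearn.feature_extraction.text import CountVectorizer

        windows = [
            "sit sit armrest reach reach walk",
            "walk walk stand reach lift lift",
            "sit sit sit rest rest armrest",
        ]
        counts = CountVectorizer().fit_transform(windows)

        lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
        print(lda.transform(counts))   # per-window routine (topic) proportions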

  1. Optical methods in nano-biotechnology

    NASA Astrophysics Data System (ADS)

    Bruno, Luigi; Gentile, Francesco

    2016-01-01

    A scientific theory is not a mathematical paradigm. It is a framework that explains natural facts and may predict future observations. A scientific theory may be modified, improved, or rejected. Science is less a collection of theories and more the process by which we reject some hypotheses, maintain or accept more or less universal beliefs (or disbeliefs), and create new models that may improve or replace preceding theories. This process cannot rest on common sense, personal experience or anecdote (many precepts in physics are indeed counterintuitive), but on the rigorous design, observation and rational, statistical analysis of new experiments. Scientific results are always provisional: scientists rarely proclaim an absolute truth or absolute certainty. Uncertainty is inevitable at the frontiers of knowledge. Notably, this is the definition of the scientific method, and what we have written above echoes the opinion of Marcia McNutt, Editor of Science: 'Science is a method for deciding whether what we choose to believe has a basis in the laws of nature or not'. A new discovery, a new theory that explains that discovery, and the scientific method itself all require observation and verification, and are susceptible to falsification.

  2. The application of knowledge discovery in databases to post-marketing drug safety: example of the WHO database.

    PubMed

    Bate, A; Lindquist, M; Edwards, I R

    2008-04-01

    After market launch, new information on adverse effects of medicinal products is almost exclusively first highlighted by spontaneous reporting. As data sets of spontaneous reports have become larger, and computational capability has increased, quantitative methods have been increasingly applied to such data sets. The screening of such data sets is an application of knowledge discovery in databases (KDD). Effective KDD is an iterative and interactive process made up of the following steps: developing an understanding of an application domain, creating a target data set, data cleaning and pre-processing, data reduction and projection, choosing the data mining task, choosing the data mining algorithm, data mining, interpretation of results and consolidating and using acquired knowledge. The process of KDD as it applies to the analysis of spontaneous reports can be exemplified by its routine use on the 3.5 million suspected adverse drug reaction (ADR) reports in the WHO ADR database. Examples of new adverse effects first highlighted by the KDD process on WHO data include topiramate glaucoma, infliximab vasculitis and the association of selective serotonin reuptake inhibitors (SSRIs) and neonatal convulsions. The KDD process has already improved our ability to highlight previously unsuspected ADRs for clinical review in spontaneous reporting, and we anticipate that such techniques will be increasingly used in the successful screening of other healthcare data sets such as patient records in the future.
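
    The abstract does not spell out the quantitative screening statistic used. Purely for illustration, one widely used disproportionality measure for spontaneous-report data, the proportional reporting ratio (PRR), can be computed from a 2x2 table of report counts; the counts below are hypothetical.

        # Illustration only: the proportional reporting ratio (PRR) from a 2x2
        # table of spontaneous reports. (The WHO screening described above uses
        # its own quantitative measure, not specified in this abstract.)
        def prr(a, b, c, d):
            """a: reports with drug & reaction, b: drug & other reactions,
               c: other drugs & reaction, d: other drugs & other reactions."""
            return (a / (a + b)) / (c / (c + d))

        # hypothetical counts
        print(round(prr(a=40, b=960, c=200, d=98800), 2))  # PRR >> 1 flags a potential signal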

  3. The discovery of medicines for rare diseases

    PubMed Central

    Swinney, David C; Xia, Shuangluo

    2015-01-01

    There is a pressing need for new medicines (new molecular entities; NMEs) for rare diseases as few of the 6800 rare diseases (according to the NIH) have approved treatments. Drug discovery strategies for the 102 orphan NMEs approved by the US FDA between 1999 and 2012 were analyzed to learn from past success: 46 NMEs were first in class; 51 were followers; and five were imaging agents. First-in-class medicines were discovered with phenotypic assays (15), target-based approaches (12) and biologic strategies (18). Identification of genetic causes in areas with more basic and translational research such as cancer and in-born errors in metabolism contributed to success regardless of discovery strategy. In conclusion, greater knowledge increases the chance of success and empirical solutions can be effective when knowledge is incomplete. PMID:25068983

  4. The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery

    PubMed Central

    2014-01-01

    The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions in the domains of chemistry, biology, biochemistry, and bioinformatics. SIO provides an ontological foundation for the Bio2RDF linked data for the life sciences project and is used for semantic integration and discovery for SADI-based semantic web services. SIO is freely available to all users under a creative commons by attribution license. See website for further information: http://sio.semanticscience.org. PMID:24602174

  5. 43 CFR 4.1130 - Discovery methods.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 43 Public Lands: Interior 1 2013-10-01 2013-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  6. 43 CFR 4.1130 - Discovery methods.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 43 Public Lands: Interior 1 2010-10-01 2010-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  7. 43 CFR 4.1130 - Discovery methods.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 43 Public Lands: Interior 1 2011-10-01 2011-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  8. 43 CFR 4.1130 - Discovery methods.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 43 Public Lands: Interior 1 2014-10-01 2014-10-01 false Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  9. 43 CFR 4.1130 - Discovery methods.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 43 Public Lands: Interior 1 2012-10-01 2011-10-01 true Discovery methods. 4.1130 Section 4.1130... Special Rules Applicable to Surface Coal Mining Hearings and Appeals Discovery § 4.1130 Discovery methods. Parties may obtain discovery by one or more of the following methods— (a) Depositions upon oral...

  10. Knowledge discovery from structured mammography reports using inductive logic programming.

    PubMed

    Burnside, Elizabeth S; Davis, Jesse; Costa, Victor Santos; Dutra, Inês de Castro; Kahn, Charles E; Fine, Jason; Page, David

    2005-01-01

    The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subsequently be tested and validated. The ILP algorithm discovered two hypotheses from the data that were 1) judged as interesting by a subspecialty trained mammographer and 2) validated by analysis of the data itself.

  11. A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.

    PubMed

    Kothari, Cartik R; Payne, Philip R O

    2015-01-01

    In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.

  12. ASIS 2000: Knowledge Innovations: Celebrating Our Heritage, Designing Our Future. Proceedings of the ASIS Annual Meeting (63rd, Chicago, Illinois, November 12-16, 2000). Volume 37.

    ERIC Educational Resources Information Center

    Kraft, Donald H., Ed.

    The 2000 ASIS (American Society for Information Science) conference explored knowledge innovation. The tracks in the conference program included knowledge discovery, capture, and creation; classification and representation; information retrieval; knowledge dissemination; and social, behavioral, ethical, and legal aspects. This proceedings is…

  13. Evaluating the Science of Discovery in Complex Health Systems

    ERIC Educational Resources Information Center

    Norman, Cameron D.; Best, Allan; Mortimer, Sharon; Huerta, Timothy; Buchan, Alison

    2011-01-01

    Complex health problems such as chronic disease or pandemics require knowledge that transcends disciplinary boundaries to generate solutions. Such transdisciplinary discovery requires researchers to work and collaborate across boundaries, combining elements of basic and applied science. At the same time, calls for more interdisciplinary health…

  14. 29 CFR 18.14 - Scope of discovery.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...

  15. 49 CFR 386.38 - Scope of discovery.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...

  16. 49 CFR 386.38 - Scope of discovery.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...

  17. 29 CFR 18.14 - Scope of discovery.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...

  18. 49 CFR 386.38 - Scope of discovery.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...

  19. 29 CFR 18.14 - Scope of discovery.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...

  20. 29 CFR 18.14 - Scope of discovery.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...

  1. 49 CFR 386.38 - Scope of discovery.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...

  2. ESIP's Earth Science Knowledge Graph (ESKG) Testbed Project: An Automatic Approach to Building Interdisciplinary Earth Science Knowledge Graphs to Improve Data Discovery

    NASA Astrophysics Data System (ADS)

    McGibbney, L. J.; Jiang, Y.; Burgess, A. B.

    2017-12-01

    Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent knowledge present within and across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources, e.g. Web pages, Web Services, etc., via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason over and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project advances ESIP collaboration areas, including both Discovery and Semantic Technologies, by putting graph information right at our fingertips in an interactive, modern manner and by reducing the effort of constructing ontologies. To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.

  3. A data-driven, knowledge-based approach to biomarker discovery: application to circulating microRNA markers of colorectal cancer prognosis.

    PubMed

    Vafaee, Fatemeh; Diakos, Connie; Kirschner, Michaela B; Reid, Glen; Michael, Michael Z; Horvath, Lisa G; Alinejad-Rokny, Hamid; Cheng, Zhangkai Jason; Kuncic, Zdenka; Clarke, Stephen

    2018-01-01

    Recent advances in high-throughput technologies have provided an unprecedented opportunity to identify molecular markers of disease processes. This plethora of complex-omics data has simultaneously complicated the problem of extracting meaningful molecular signatures and opened up new opportunities for more sophisticated integrative and holistic approaches. In this era, effective integration of data-driven and knowledge-based approaches for biomarker identification has been recognised as key to improving the identification of high-performance biomarkers, and necessary for translational applications. Here, we have evaluated the role of circulating microRNA as a means of predicting the prognosis of patients with colorectal cancer, which is the second leading cause of cancer-related death worldwide. We have developed a multi-objective optimisation method that effectively integrates a data-driven approach with the knowledge obtained from the microRNA-mediated regulatory network to identify robust plasma microRNA signatures which are reliable in terms of predictive power as well as functional relevance. The proposed multi-objective framework has the capacity to adjust for conflicting biomarker objectives and to incorporate heterogeneous information facilitating systems approaches to biomarker discovery. We have found a prognostic signature of colorectal cancer comprising 11 circulating microRNAs. The identified signature predicts the patients' survival outcome and targets pathways underlying colorectal cancer progression. The altered expression of the identified microRNAs was confirmed in an independent public data set of plasma samples of patients in early stage vs advanced colorectal cancer. Furthermore, the generality of the proposed method was demonstrated across three publicly available miRNA data sets associated with biomarker studies in other diseases.

  4. Evidence-based medicine: is it a bridge too far?

    PubMed

    Fernandez, Ana; Sturmberg, Joachim; Lukersmith, Sue; Madden, Rosamond; Torkfar, Ghazal; Colagiuri, Ruth; Salvador-Carulla, Luis

    2015-11-06

    This paper aims to describe the contextual factors that gave rise to evidence-based medicine (EBM), as well as its controversies and limitations in the current health context. Our analysis utilizes two frameworks: (1) a complex adaptive view of health that sees both health and healthcare as non-linear phenomena emerging from their different components; and (2) the unified approach to the philosophy of science that provides a new background for understanding the differences between the phases of discovery, corroboration, and implementation in science. The need for standardization, the development of clinical epidemiology, concerns about the economic sustainability of health systems and increasing numbers of clinical trials, together with the increase in the computer's ability to handle large amounts of data, have paved the way for the development of the EBM movement. It was quickly adopted on the basis of authoritative knowledge rather than evidence of its own capacity to improve the efficiency and equity of health systems. The main problem with the EBM approach is the restricted and simplistic approach to scientific knowledge, which prioritizes internal validity as the major quality of the studies to be included in clinical guidelines. As a corollary, the preferred method for generating evidence is the explanatory randomized controlled trial. This method can be useful in the phase of discovery but is inadequate in the field of implementation, which needs to incorporate additional information including expert knowledge, patients' values and the context. EBM needs to move forward and perceive health and healthcare as a complex interaction, i.e. an interconnected, non-linear phenomenon that may be better analysed using a variety of complexity science techniques.

  5. When fragments link: a bibliometric perspective on the development of fragment-based drug discovery.

    PubMed

    Romasanta, Angelo K S; van der Sijde, Peter; Hellsten, Iina; Hubbard, Roderick E; Keseru, Gyorgy M; van Muijlwijk-Koezen, Jacqueline; de Esch, Iwan J P

    2018-05-05

    Fragment-based drug discovery (FBDD) is a highly interdisciplinary field, rich in ideas integrated from pharmaceutical sciences, chemistry, biology, and physics, among others. To enrich our understanding of the development of the field, we used bibliometric techniques to analyze 3642 publications in FBDD, complementing accounts by key practitioners. Mapping its core papers, we found the transfer of knowledge from academia to industry. Co-authorship analysis showed that university-industry collaboration has grown over time. Moreover, we show how ideas from other scientific disciplines have been integrated into the FBDD paradigm. Keyword analysis showed that the field is organized into four interconnected practices: library design, fragment screening, computational methods, and optimization. This study highlights the importance of interactions among various individuals and institutions from diverse disciplines in newly emerging scientific fields. Copyright © 2018. Published by Elsevier Ltd.

  6. Ethnopharmacological survey of Samburu district, Kenya

    PubMed Central

    Nanyingi, Mark O; Mbaria, James M; Lanyasunya, Adamson L; Wagate, Cyrus G; Koros, Kipsengeret B; Kaburia, Humphrey F; Munenge, Rahab W; Ogara, William O

    2008-01-01

    Background Ethnobotanical pharmacopoeia is confidently used in disease intervention and there is need for documentation and preservation of traditional medical knowledge to bolster the discovery of novel drugs. The objective of the present study was to document the indigenous medicinal plant utilization, management and their extinction threats in Samburu District, Kenya. Methods Field research was conducted in six divisions of Samburu District in Kenya. We randomly sampled 100 consented interviewees stratified by age, gender, occupation and level of education. We collected plant use data through semi-structured questionnaires; transect walks, oral interviews and focus groups discussions. Voucher specimens of all cited botanic species were collected and deposited at University of Nairobi's botany herbarium. Results Data on plant use from the informants yielded 990 citations on 56 medicinal plant species, which are used to treat 54 different animal and human diseases including; malaria, digestive disorders, respiratory syndromes and ectoparasites. Conclusion The ethnomedicinal use of plant species was documented in the study area for treatment of both human and veterinary diseases. The local population has high ethnobotanical knowledge and has adopted sound management conservation practices. The major threatening factors reported were anthropogenic and natural. Ethnomedical documentation and sustainable plant utilization can support drug discovery efforts in developing countries. PMID:18498665

  7. Using an improved association rules mining optimization algorithm in web-based mobile-learning system

    NASA Astrophysics Data System (ADS)

    Huang, Yin; Chen, Jianhua; Xiong, Shaojun

    2009-07-01

    Mobile-Learning (M-learning) makes many learners get the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rules explosion is a serious problem which causes great concerns, as conventional mining algorithms often produce too many rules for decision makers to digest. Since Web-based Mobile-Learning Systems collect vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners and so on. Therefore, this paper focuses on a new data-mining algorithm, called ARGSA (Association Rules based on an improved Genetic Simulated Annealing Algorithm), which combines the advantages of the genetic algorithm and the simulated annealing algorithm to mine association rules. The paper first takes advantage of a parallel genetic algorithm and simulated annealing algorithm designed specifically for discovering association rules. Moreover, analysis and experiments show that the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.
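
    The abstract does not give ARGSA's internals. Below is a deliberately compressed sketch of the general idea: candidate rules scored by support times confidence and explored with a simulated-annealing acceptance step standing in for the genetic operators. The toy learner-profile transactions and parameters are placeholders.

        # Compressed sketch (not ARGSA itself): search the rule space with an
        # SA-style acceptance criterion, scoring rules by support x confidence.
        import math, random

        transactions = [{"video", "quiz", "forum"}, {"video", "quiz"},
                        {"quiz", "forum"}, {"video", "forum"}, {"video", "quiz"}]
        items = sorted({i for t in transactions for i in t})

        def support(itemset):
            return sum(itemset <= t for t in transactions) / len(transactions)

        def fitness(rule):                      # rule: (antecedent_item, consequent_item)
            a, c = rule
            s = support({a, c})
            conf = s / support({a}) if support({a}) else 0.0
            return s * conf

        random.seed(0)
        rule = tuple(random.sample(items, 2))
        temp = 1.0
        for step in range(200):                 # annealed search over the rule space
            cand = tuple(random.sample(items, 2))          # "mutation"
            delta = fitness(cand) - fitness(rule)
            if delta > 0 or random.random() < math.exp(delta / temp):
                rule = cand
            temp *= 0.98                        # cooling schedule
        print(rule, round(fitness(rule), 2))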

  8. Computational approaches to predict bacteriophage–host relationships

    PubMed Central

    Edwards, Robert A.; McNair, Katelyn; Faust, Karoline; Raes, Jeroen; Dutilh, Bas E.

    2015-01-01

    Metagenomics has changed the face of virus discovery by enabling the accurate identification of viral genome sequences without requiring isolation of the viruses. As a result, metagenomic virus discovery leaves the first and most fundamental question about any novel virus unanswered: What host does the virus infect? The diversity of the global virosphere and the volumes of data obtained in metagenomic sequencing projects demand computational tools for virus–host prediction. We focus on bacteriophages (phages, viruses that infect bacteria), the most abundant and diverse group of viruses found in environmental metagenomes. By analyzing 820 phages with annotated hosts, we review and assess the predictive power of in silico phage–host signals. Sequence homology approaches are the most effective at identifying known phage–host pairs. Compositional and abundance-based methods contain significant signal for phage–host classification, providing opportunities for analyzing the unknowns in viral metagenomes. Together, these computational approaches further our knowledge of the interactions between phages and their hosts. Importantly, we find that all reviewed signals significantly link phages to their hosts, illustrating how current knowledge and insights about the interaction mechanisms and ecology of coevolving phages and bacteria can be exploited to predict phage–host relationships, with potential relevance for medical and industrial applications. PMID:26657537
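
    A toy illustration of the compositional signal mentioned above: oligonucleotide-usage profiles of a phage genome compared against candidate host genomes by cosine similarity. The sequences, k = 3, and the similarity measure are placeholders; real analyses use full genomes and longer k-mers.

        # Toy compositional phage-host signal: compare k-mer frequency profiles
        # of a phage against candidate hosts. Sequences are placeholders.
        from collections import Counter
        import math

        def kmer_profile(seq, k=3):
            counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
            total = sum(counts.values())
            return {kmer: c / total for kmer, c in counts.items()}

        def cosine(p, q):
            keys = set(p) | set(q)
            dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
            return dot / (math.sqrt(sum(v * v for v in p.values())) *
                          math.sqrt(sum(v * v for v in q.values())))

        phage = "ATGCGCGATATCGCGATATTGCGC" * 10
        hosts = {"hostA": "ATGCGCGATATCGCGATAT" * 20, "hostB": "TTTTAAAATTTTAAAACCCC" * 20}
        pp = kmer_profile(phage)
        for name, genome in hosts.items():
            print(name, round(cosine(pp, kmer_profile(genome)), 3))  # higher = more similar usage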

  9. Ontology-based content analysis of US patent applications from 2001-2010.

    PubMed

    Weber, Lutz; Böhme, Timo; Irmer, Matthias

    2013-01-01

    Ontology-based semantic text analysis methods allow knowledge relationships and data to be extracted automatically from text documents. In this review, we have applied these technologies to the systematic analysis of pharmaceutical patents. Hierarchical concepts from the knowledge domains of chemical compounds, diseases and proteins were used to annotate full-text US patent applications that deal with pharmacological activities of chemical compounds and were filed in the years 2001-2010. Compounds claimed in these applications have been classified into their respective compound classes to review the distribution of scaffold types or general compound classes, such as natural products, in a time-dependent manner. Similarly, the target proteins and claimed utility of the compounds have been classified and the most relevant were extracted. The method presented allows the discovery of the main areas of innovation as well as emerging fields of patenting activity, providing a broad statistical basis for competitor analysis and decision-making efforts.

  10. Identifying UMLS concepts from ECG Impressions using KnowledgeMap

    PubMed Central

    Denny, Joshua C.; Spickard, Anderson; Miller, Randolph A; Schildcrout, Jonathan; Darbar, Dawood; Rosenbloom, S. Trent; Peterson, Josh F.

    2005-01-01

    Electrocardiogram (ECG) impressions represent a wealth of medical information for potential decision support and drug-effect discovery. Much of this information is inaccessible to automated methods in the free-text portion of the ECG report. We studied the application of the KnowledgeMap concept identifier (KMCI) to map Unified Medical Language System (UMLS) concepts from ECG impressions. ECGs were processed by KMCI and the results scored for accuracy by multiple raters. Reviewers also recorded unidentified concepts through the scoring interface. Overall, KMCI correctly identified 1059 out of 1171 concepts for a recall of 0.90. Precision, indicating the proportion of identified concepts that were correct, was 0.94. KMCI was particularly effective at identifying ECG rhythms (330/333), perfusion changes (65/66), and noncardiac medical concepts (11/11). In conclusion, KMCI is an effective method for mapping ECG impressions to UMLS concepts. PMID:16779029
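
    For reference, the two figures follow the usual definitions (TP = correctly identified concepts, FN = concepts missed, FP = spurious identifications); the quoted recall is the stated 1059/1171:

        \mathrm{recall} = \frac{TP}{TP + FN} = \frac{1059}{1171} \approx 0.90, \qquad \mathrm{precision} = \frac{TP}{TP + FP} = 0.94.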

  11. Reuniting Virtue and Knowledge

    ERIC Educational Resources Information Center

    Culham, Tom

    2015-01-01

    Einstein held that intuition is more important than rational inquiry as a source of discovery. Further, he explicitly and implicitly linked the heart, the sacred, devotion and intuitive knowledge. The raison d'être of universities is the advance of knowledge; however, they have primarily focused on developing student's skills in working with…

  12. 29 CFR 18.13 - Discovery methods.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 29 Labor 1 2013-07-01 2013-07-01 false Discovery methods. 18.13 Section 18.13 Labor Office of the... ADMINISTRATIVE LAW JUDGES General § 18.13 Discovery methods. Parties may obtain discovery by one or more of the following methods: Depositions upon oral examination or written questions; written interrogatories...

  13. 29 CFR 18.13 - Discovery methods.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 29 Labor 1 2014-07-01 2013-07-01 true Discovery methods. 18.13 Section 18.13 Labor Office of the... ADMINISTRATIVE LAW JUDGES General § 18.13 Discovery methods. Parties may obtain discovery by one or more of the following methods: Depositions upon oral examination or written questions; written interrogatories...

  14. 29 CFR 18.13 - Discovery methods.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 29 Labor 1 2011-07-01 2011-07-01 false Discovery methods. 18.13 Section 18.13 Labor Office of the... ADMINISTRATIVE LAW JUDGES General § 18.13 Discovery methods. Parties may obtain discovery by one or more of the following methods: Depositions upon oral examination or written questions; written interrogatories...

  15. 29 CFR 18.13 - Discovery methods.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 29 Labor 1 2012-07-01 2012-07-01 false Discovery methods. 18.13 Section 18.13 Labor Office of the... ADMINISTRATIVE LAW JUDGES General § 18.13 Discovery methods. Parties may obtain discovery by one or more of the following methods: Depositions upon oral examination or written questions; written interrogatories...

  16. 29 CFR 18.13 - Discovery methods.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 29 Labor 1 2010-07-01 2010-07-01 true Discovery methods. 18.13 Section 18.13 Labor Office of the... ADMINISTRATIVE LAW JUDGES General § 18.13 Discovery methods. Parties may obtain discovery by one or more of the following methods: Depositions upon oral examination or written questions; written interrogatories...

  17. The Discovery Method in Training.

    ERIC Educational Resources Information Center

    Belbin, R. M.

    In the form of a discussion between faceless people, this booklet concerns discovery learning and its advantages. Subjects covered in the discussions are: Introducing the Discovery Method; An Experiment with British Railways; The OECD Research Projects in U.S.A., Austria, and Sweden; How the Discovery Method Differs from Other Methods; Discovery…

  18. The discovery of HTLV-1, the first pathogenic human retrovirus.

    PubMed

    Coffin, John M

    2015-12-22

    After the discovery of retroviral reverse transcriptase in 1970, there was a flurry of activity, sparked by the "War on Cancer," to identify human cancer retroviruses. After many false claims resulting from various artifacts, most scientists abandoned the search, but the Gallo laboratory carried on, developing both specific assays and new cell culture methods that enabled them to report, in the accompanying 1980 PNAS paper, identification and partial characterization of human T-cell leukemia virus (HTLV; now known as HTLV-1) produced by a T-cell line from a lymphoma patient. Follow-up studies, including collaboration with the group that first identified a cluster of adult T-cell leukemia (ATL) cases in Japan, provided conclusive evidence that HTLV was the cause of this disease. HTLV-1 is now known to infect at least 4-10 million people worldwide, about 5% of whom will develop ATL. Despite intensive research, knowledge of the viral etiology has not led to improvement in treatment or outcome of ATL. However, the technology for discovery of HTLV and acknowledgment of the existence of pathogenic human retroviruses laid the technical and intellectual foundation for the discovery of the cause of AIDS soon afterward. Without this advance, our ability to diagnose and treat HIV infection most likely would have been long delayed.

  19. Interestingness measures and strategies for mining multi-ontology multi-level association rules from gene ontology annotations for the discovery of new GO relationships.

    PubMed

    Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M

    2013-10-01

    The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
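
    The abstract does not reproduce the formulas for MOSupport and MOConfidence; for orientation, the classical single-ontology measures they presumably generalize are, for a rule A => B over a set of annotation transactions D:

        \mathrm{support}(A \Rightarrow B) = \frac{\lvert \{\, T \in D : A \cup B \subseteq T \,\} \rvert}{\lvert D \rvert}, \qquad \mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}.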

  20. Mapping Quantitative Field Resistance Against Apple Scab in a 'Fiesta' x 'Discovery' Progeny.

    PubMed

    Liebhard, R; Koller, B; Patocchi, A; Kellerhals, M; Pfammatter, W; Jermini, M; Gessler, C

    2003-04-01

    ABSTRACT Breeding of resistant apple cultivars (Malus x domestica) as a disease management strategy relies on the knowledge and understanding of the underlying genetics. The availability of molecular markers and genetic linkage maps enables the detection and the analysis of major resistance genes as well as of quantitative trait loci (QTL) contributing to the resistance of a genotype. Such a genetic linkage map was constructed, based on a segregating population of the cross between apple cvs. Fiesta (syn. Red Pippin) and Discovery. The progeny was observed for 3 years at three different sites in Switzerland and field resistance against apple scab (Venturia inaequalis) was assessed. Only a weak correlation was detected between leaf scab and fruit scab. A QTL analysis was performed, based on the genetic linkage map consisting of 804 molecular markers and covering all 17 chromosomes of apple. With the maximum likelihood-based interval mapping method, eight genomic regions were identified, six conferring resistance against leaf scab and two conferring fruit scab resistance. Although cv. Discovery showed a much stronger resistance against scab in the field, most QTL identified were attributed to the more susceptible parent 'Fiesta'. This indicated a high degree of homozygosity at the scab resistance loci in 'Discovery', preventing their detection in the progeny due to the lack of segregation.

  1. Join the Center for Applied Scientific Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gamblin, Todd; Bremer, Timo; Van Essen, Brian

    The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.

  2. The Physics of Thermoelectric Energy Conversion

    NASA Astrophysics Data System (ADS)

    Goldsmid, H. Julian

    2017-04-01

    This book outlines the principles of thermoelectric generation and refrigeration from the discovery of the Seebeck and Peltier effects in the 19th century through the introduction of semiconductor thermoelements in the mid-20th century to the more recent development of nanostructured materials. The conditions for favourable electronic properties are discussed. The methods for selecting materials with a low lattice thermal conductivity are outlined and the ways in which the scattering of phonons can be enhanced are described. The book is aimed at readers without specialised knowledge.
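
    The trade-off sketched above (favourable electronic properties versus low lattice thermal conductivity) is conventionally summarized by the dimensionless thermoelectric figure of merit, which a text such as this one presumably develops in detail:

        ZT = \frac{S^2 \sigma}{\kappa_e + \kappa_L}\, T,

    where S is the Seebeck coefficient, sigma the electrical conductivity, kappa_e and kappa_L the electronic and lattice contributions to the thermal conductivity, and T the absolute temperature.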

  3. Integrative Systems Biology for Data Driven Knowledge Discovery

    PubMed Central

    Greene, Casey S.; Troyanskaya, Olga G.

    2015-01-01

    Integrative systems biology is an approach that brings together diverse high throughput experiments and databases to gain new insights into biological processes or systems at molecular through physiological levels. These approaches rely on diverse high-throughput experimental techniques that generate heterogeneous data by assaying varying aspects of complex biological processes. Computational approaches are necessary to provide an integrative view of these experimental results and enable data-driven knowledge discovery. Hypotheses generated from these approaches can direct definitive molecular experiments in a cost effective manner. Using integrative systems biology approaches, we can leverage existing biological knowledge and large-scale data to improve our understanding of yet unknown components of a system of interest and how its malfunction leads to disease. PMID:21044756

  4. 18 CFR 385.402 - Scope of discovery (Rule 402).

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Scope of discovery (Rule 402). 385.402 Section 385.402 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... persons having any knowledge of any discoverable matter. It is not ground for objection that the...

  5. Doors to Discovery[TM]. What Works Clearinghouse Intervention Report

    ERIC Educational Resources Information Center

    What Works Clearinghouse, 2013

    2013-01-01

    "Doors to Discovery"]TM] is a preschool literacy curriculum that uses eight thematic units of activities to help children build fundamental early literacy skills in oral language, phonological awareness, concepts of print, alphabet knowledge, writing, and comprehension. The eight thematic units cover topics such as nature, friendship,…

  6. 78 FR 12933 - Proceedings Before the Commodity Futures Trading Commission

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-26

    ... proceedings. These new amendments also provide that Judgment Officers may conduct sua sponte discovery in... discovery; (4) sound risk management practices; and (5) other public interest considerations. The amendments... representative capacity, it was done with full power and authority to do so; (C) To the best of his knowledge...

  7. 76 FR 64803 - Rules of Adjudication and Enforcement

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-19

    ...) is also amended to clarify the limits on discovery when the Commission orders the ALJ to consider the... that the complainant identify, to the best of its knowledge, the ``like or directly competitive... the taking of discovery by the parties shall be at the discretion of the presiding ALJ. The ITCTLA...

  8. 78 FR 63253 - Davidson Kempner Capital Management LLC; Notice of Application

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-10-23

    ... employees of the Adviser other than the Contributor have any knowledge of the Contribution prior to its discovery by the Adviser on November 2, 2011. The Contribution was discovered by the Adviser's compliance... names of employees. After discovery of the Contribution, the Adviser and Contributor obtained the...

  9. Gene Patents and Personalized Cancer Care: Impact of the Myriad Case on Clinical Oncology

    PubMed Central

    Offit, Kenneth; Bradbury, Angela; Storm, Courtney; Merz, Jon F.; Noonan, Kevin E.; Spence, Rebecca

    2013-01-01

    Genomic discoveries have transformed the practice of oncology and cancer prevention. Diagnostic and therapeutic advances based on cancer genomics developed during a time when it was possible to patent genes. A case before the Supreme Court, Association for Molecular Pathology v Myriad Genetics, Inc seeks to overturn patents on isolated genes. Although the outcomes are uncertain, it is suggested here that the Supreme Court decision will have few immediate effects on oncology practice or research but may have more significant long-term impact. The Federal Circuit court has already rejected Myriad's broad diagnostic methods claims, and this is not affected by the Supreme Court decision. Isolated DNA patents were already becoming obsolete on scientific grounds, in an era when human DNA sequence is public knowledge and because modern methods of next-generation sequencing need not involve isolated DNA. The Association for Molecular Pathology v Myriad Supreme Court decision will have limited impact on new drug development, as new drug patents usually involve cellular methods. A nuanced Supreme Court decision acknowledging the scientific distinction between synthetic cDNA and genomic DNA will further mitigate any adverse impact. A Supreme Court decision to include or exclude all types of DNA from patent eligibility could impact future incentives for genomic discovery as well as the future delivery of medical care. Whatever the outcome of this important case, it is important that judicial and legislative actions in this area maximize genomic discovery while also ensuring patients' access to personalized cancer care. PMID:23766521

  10. Discovery, identification and mitigation of isobaric sulfate metabolite interference to a phosphate prodrug in LC-MS/MS bioanalysis: Critical role of method development in ensuring assay quality.

    PubMed

    Yuan, Long; Ji, Qin C

    2018-06-05

    Metabolite interferences represent a major risk of inaccurate quantification when using LC-MS/MS bioanalytical assays. During LC-MS/MS bioanalysis of BMS-919194, a phosphate ester prodrug, in plasma samples from rat and monkey GLP toxicology studies, an unknown peak was detected in the MRM channel of the prodrug. This peak was not observed in previous discovery toxicology studies, in which a fast gradient LC-MS/MS method was used. We found that this unknown peak co-eluted with the prodrug peak when the discovery method was used, thereby causing significant overestimation of the exposure of the prodrug in the discovery toxicology studies. To understand the nature of this interfering peak and its impact on the bioanalytical assay, we further investigated its formation and identification. The interfering compound and the prodrug were found to be isobaric and to have the same major product ions in electrospray ionization positive mode, and thus could not be differentiated using a triple quadrupole mass spectrometer. By using high-resolution mass spectrometry (HRMS), the interfering metabolite was successfully identified as an isobaric sulfate metabolite of BMS-919194. To the best of our knowledge, this is the first report of a phosphate prodrug being metabolized in vivo to an isobaric sulfate metabolite that caused significant interference with the analysis of the prodrug. This work demonstrated the risk of interference from isobaric sulfate metabolites in the bioanalysis of phosphate prodrugs in real samples. It is critical to evaluate and mitigate potential metabolite interferences during method development and thereby minimize the related bioanalytical risks and ensure assay quality. Our work also showed the unique advantages of HRMS in identifying potential metabolite interference during LC-MS/MS bioanalysis. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Exploring Dance Movement Data Using Sequence Alignment Methods

    PubMed Central

    Chavoshi, Seyed Hossein; De Baets, Bernard; Neutens, Tijs; De Tré, Guy; Van de Weghe, Nico

    2015-01-01

    Despite the abundance of research on knowledge discovery from moving object databases, only a limited number of studies have examined the interaction between moving point objects in space over time. This paper describes a novel approach for measuring similarity in the interaction between moving objects. The proposed approach consists of three steps. First, we transform movement data into sequences of successive qualitative relations based on the Qualitative Trajectory Calculus (QTC). Second, sequence alignment methods are applied to measure the similarity between movement sequences. Finally, movement sequences are grouped based on similarity by means of an agglomerative hierarchical clustering method. The applicability of this approach is tested using movement data from samba and tango dancers. PMID:26181435
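    The three-step pipeline above (qualitative sequence encoding, alignment-based similarity, agglomerative clustering) can be sketched with standard scientific-Python tools. The sketch below is illustrative only: the toy sequences are hypothetical, and a normalized edit-style similarity from difflib stands in for the QTC-specific alignment scoring used by the authors.

```python
# Illustrative sketch: similarity of movement sequences via alignment + clustering.
# The QTC encoding and alignment scheme of the paper are replaced here by plain
# strings and a normalized edit-style dissimilarity; the sequences are hypothetical.
from difflib import SequenceMatcher
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

sequences = ["ABBCADD", "ABBCAAD", "DCCABBA", "DCCABBB"]  # toy QTC-like sequences

def alignment_distance(a, b):
    # 1 - ratio() is a crude stand-in for an alignment-based dissimilarity.
    return 1.0 - SequenceMatcher(None, a, b).ratio()

n = len(sequences)
# Condensed pairwise distance vector in the order expected by scipy's linkage().
dist = [alignment_distance(sequences[i], sequences[j])
        for i in range(n) for j in range(i + 1, n)]

tree = linkage(np.asarray(dist), method="average")   # agglomerative clustering
labels = fcluster(tree, t=2, criterion="maxclust")   # cut the tree into two groups
print(labels)  # e.g. [1 1 2 2]: similar movement sequences share a cluster
```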

  12. Firefly Algorithm for Structural Search.

    PubMed

    Avendaño-Franco, Guillermo; Romero, Aldo H

    2016-07-12

    The problem of computational structure prediction of materials is approached using the firefly (FF) algorithm. Starting from the chemical composition and optionally using prior knowledge of similar structures, the FF method is able to predict not only known stable structures but also a variety of novel competitive metastable structures. This article focuses on the strengths and limitations of the algorithm as a multimodal global searcher. The algorithm has been implemented in the software package PyChemia ( https://github.com/MaterialsDiscovery/PyChemia ), an open-source Python library for materials analysis. We present applications of the method to van der Waals clusters and crystal structures. The FF method is shown to be competitive when compared to other population-based global searchers.
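    As a rough illustration of how a firefly search works (this is a generic textbook-style sketch, not the PyChemia implementation, and the objective function and parameters are arbitrary):

```python
# Minimal firefly-algorithm sketch for continuous minimization (illustrative only;
# PyChemia's structure-prediction implementation is far more elaborate).
import numpy as np

rng = np.random.default_rng(0)

def objective(x):                      # toy "energy": sphere function
    return np.sum(x**2)

n_fireflies, dim, n_iter = 15, 3, 100
beta0, gamma, alpha = 1.0, 1.0, 0.2    # attractiveness, absorption, random step
X = rng.uniform(-5, 5, size=(n_fireflies, dim))
F = np.array([objective(x) for x in X])

for _ in range(n_iter):
    for i in range(n_fireflies):
        for j in range(n_fireflies):
            if F[j] < F[i]:            # j is "brighter" (lower energy): i moves toward j
                r2 = np.sum((X[i] - X[j])**2)
                beta = beta0 * np.exp(-gamma * r2)
                X[i] += beta * (X[j] - X[i]) + alpha * (rng.random(dim) - 0.5)
                F[i] = objective(X[i])
    alpha *= 0.97                      # gradually damp the random walk

best = X[np.argmin(F)]
print("best point:", best, "value:", F.min())
```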

  13. Robust and Accurate Anomaly Detection in ECG Artifacts Using Time Series Motif Discovery

    PubMed Central

    Sivaraks, Haemwaan

    2015-01-01

    Electrocardiogram (ECG) anomaly detection is an important technique for detecting dissimilar heartbeats which helps identify abnormal ECGs before the diagnosis process. Currently available ECG anomaly detection methods, ranging from academic research to commercial ECG machines, still suffer from a high false alarm rate because these methods are not able to differentiate ECG artifacts from real ECG signals, especially ECG artifacts that are similar to ECG signals in terms of shape and/or frequency. The problem leads to high vigilance for physicians and misinterpretation risk for nonspecialists. Therefore, this work proposes a novel anomaly detection technique that is highly robust and accurate in the presence of ECG artifacts and can effectively reduce the false alarm rate. Expert knowledge from cardiologists and a motif discovery technique are utilized in our design. In addition, every step of the algorithm conforms to the interpretation of cardiologists. Our method can be applied to both single-lead and multilead ECGs. Our experimental results on real ECG datasets are interpreted and evaluated by cardiologists. Our proposed algorithm can mostly achieve 100% accuracy on detection (AoD), sensitivity, specificity, and positive predictive value with a 0% false alarm rate. The results demonstrate that our proposed method is highly accurate and robust to artifacts, compared with competitive anomaly detection methods. PMID:25688284

  14. Bridging the Gap in Neurotherapeutic Discovery and Development: The Role of the National Institute of Neurological Disorders and Stroke in Translational Neuroscience.

    PubMed

    Mott, Meghan; Koroshetz, Walter

    2015-07-01

    The mission of the National Institute of Neurological Disorders and Stroke (NINDS) is to seek fundamental knowledge about the brain and nervous system and to use that knowledge to reduce the burden of neurological disease. NINDS supports early- and late-stage therapy development funding programs to accelerate preclinical discovery and the development of new therapeutic interventions for neurological disorders. The NINDS Office of Translational Research facilitates and funds the movement of discoveries from the laboratory to patients. Its grantees include academics, often with partnerships with the private sector, as well as small businesses, which, by Congressional mandate, receive > 3% of the NINDS budget for small business innovation research. This article provides an overview of NINDS-funded therapy development programs offered by the NINDS Office of Translational Research.

  15. Asymmetric threat data mining and knowledge discovery

    NASA Astrophysics Data System (ADS)

    Gilmore, John F.; Pagels, Michael A.; Palk, Justin

    2001-03-01

    Asymmetric threats differ from the conventional force-on-force military encounters that the Defense Department has historically been trained to engage. Terrorism by its nature is now an operational activity that is neither easily detected nor countered, as its very existence depends on small covert attacks exploiting the element of surprise. But terrorism does have defined forms, motivations, tactics and organizational structure. Exploiting a terrorism taxonomy provides the opportunity to discover and assess knowledge of terrorist operations. This paper describes the Asymmetric Threat Terrorist Assessment, Countering, and Knowledge (ATTACK) system. ATTACK has been developed to (a) data mine open source intelligence (OSINT) information from web-based newspaper sources, video news web casts, and actual terrorist web sites, (b) evaluate this information against a terrorism taxonomy, (c) exploit country/region specific social, economic, political, and religious knowledge, and (d) discover and predict potential terrorist activities and association links. Details of the asymmetric threat structure and the ATTACK system architecture are presented with results of an actual terrorist data mining and knowledge discovery test case shown.

  16. Oak Ridge Graph Analytics for Medical Innovation (ORiGAMI)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberts, Larry W.; Lee, Sangkeun

    2016-01-01

    In this era of data-driven decisions and discovery where Big Data is producing Bigger Data, data scientists at the Oak Ridge National Laboratory are leveraging unique leadership infrastructure (e.g., Urika XA and Urika GD appliances) to develop scalable algorithms for semantic, logical and statistical reasoning with Big Data (i.e., data stored in databases as well as unstructured data in documents). ORiGAMI is a next-generation knowledge-discovery framework that is: (a) knowledge nurturing (i.e., evolves seamlessly with newer knowledge and data), (b) smart and curious (i.e., using information-foraging and reasoning algorithms to digest content) and (c) synergistic (i.e., interfaces computers with what they do best to help subject-matter-experts do their best). ORiGAMI has been demonstrated using the National Library of Medicine's SEMANTIC MEDLINE (archive of medical knowledge since 1994).

  17. The Cure: Design and Evaluation of a Crowdsourcing Game for Gene Selection for Breast Cancer Survival Prediction

    PubMed Central

    Loguercio, Salvatore; Griffith, Obi L; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2014-01-01

    Background: Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before. Objective: The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player’s prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. Methods: We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach, and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10 year survival. Results: Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet. Conclusions: The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge. PMID:25654473
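    The voting-based aggregation described in the Methods can be illustrated in a few lines; the player selections below are hypothetical and the ranking rule (a simple count of selections) is only a stand-in for the study's actual aggregation procedure.

```python
# Illustrative vote aggregation: rank genes by how often players selected them.
# The player selections are hypothetical; the study's real aggregation is richer.
from collections import Counter

game_selections = [
    ["TP53", "BRCA1", "ESR1"],      # genes chosen by a player in one game
    ["TP53", "ERBB2"],
    ["BRCA1", "TP53", "AURKA"],
    ["ESR1", "TP53"],
]

votes = Counter(gene for game in game_selections for gene in game)
ranking = [gene for gene, _ in votes.most_common()]
print(ranking)        # e.g. ['TP53', 'BRCA1', 'ESR1', 'ERBB2', 'AURKA']
```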

  18. Mentor-mentee Relationship: A Win-Win Contract In Graduate Medical Education

    PubMed Central

    Fuller, Jacklyn C

    2017-01-01

    Scholarly activities (i.e., the discovery of new knowledge; development of new technologies, methods, materials, or uses; integration of knowledge leading to new understanding) are intended to measure the quality and quantity of dissemination of knowledge. A successful mentorship program is necessary during residency to help residents achieve the six core competencies (patient care, medical knowledge, practice-based learning and improvement, systems-based practice, professionalism, interpersonal and communication skills) required by the Accreditation Council for Graduate Medical Education (ACGME). The role of the mentor in this process is pivotal in the advancement of the residents’ knowledge about evidence-based medicine. In this process, while mentees become more self-regulated, exhibit confidence in their performance, and demonstrate more insight and aptitude in their jobs, mentors also achieve higher self-esteem, enhanced leadership skills, and personal gratification. As such, we may conclude that mentoring is a two-sided relationship; i.e., a 'win-win' style of commitment between the mentor and mentee. Hence, both parties will eventually advance academically, as well as professionally. PMID:29435394

  19. The Prehistory of Discovery: Precursors of Representational Change in Solving Gear System Problems.

    ERIC Educational Resources Information Center

    Dixon, James A.; Bangert, Ashley S.

    2002-01-01

    This study investigated whether the process of representational change undergoes developmental change or different processes occupy different niches in the course of knowledge acquisition. Subjects--college, third-, and sixth-grade students--solved gear system problems over two sessions. Findings indicated that for all grades, discovery of the…

  20. 40 CFR 300.300 - Phase I-Discovery or notification.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 27 2010-07-01 2010-07-01 false Phase I-Discovery or notification. 300.300 Section 300.300 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) SUPERFUND... person in charge of a vessel or a facility shall, as soon as he or she has knowledge of any discharge...

  1. A Projection and Density Estimation Method for Knowledge Discovery

    PubMed Central

    Stanski, Adam; Hellwich, Olaf

    2012-01-01

    A key ingredient of modern data analysis is probability density estimation. However, it is well known that the curse of dimensionality prevents a proper estimation of densities in high dimensions. The problem is typically circumvented by using a fixed set of assumptions about the data, e.g., by assuming partial independence of features, data on a manifold or a customized kernel. These fixed assumptions limit the applicability of a method. In this paper we propose a framework that uses a flexible set of assumptions instead. It allows a model to be tailored to various problems by means of 1d-decompositions. The approach achieves a fast runtime and is not limited by the curse of dimensionality as all estimations are performed in 1d-space. The wide range of applications is demonstrated on two very different real-world examples. The first is a data mining software that allows the fully automatic discovery of patterns. The software is publicly available for evaluation. As a second example, an image segmentation method is realized. It achieves state of the art performance on a benchmark dataset although it uses only a fraction of the training data and very simple features. PMID:23049675
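    The paper's specific 1d-decomposition framework is not reproduced here, but the core idea, that estimating densities in projected 1-D spaces sidesteps the curse of dimensionality, can be sketched under a deliberately simple independence assumption along principal axes:

```python
# Illustrative sketch of density estimation via 1-D projections: project the data
# onto principal axes and estimate a separate 1-D KDE per axis. The independence
# assumption here is a simplification; the paper's 1d-decompositions are more flexible.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
X = rng.multivariate_normal([0, 0, 0], np.diag([1.0, 0.5, 2.0]), size=500)

# Project onto principal axes (PCA via SVD of the centered data).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt.T                                   # projected coordinates

kdes = [gaussian_kde(Z[:, d]) for d in range(Z.shape[1])]   # one 1-D KDE per axis

def log_density(x_new):
    z = (x_new - X.mean(axis=0)) @ Vt.T
    # Product of 1-D densities = sum of log densities (independence assumption).
    return sum(float(np.log(kdes[d](z[d]))) for d in range(len(kdes)))

print(log_density(np.array([0.0, 0.0, 0.0])))
```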

  2. Comparative analysis of machine learning methods in ligand-based virtual screening of large compound libraries.

    PubMed

    Ma, Xiao H; Jia, Jia; Zhu, Feng; Xue, Ying; Li, Ze R; Chen, Yu Z

    2009-05-01

    Machine learning methods have been explored as ligand-based virtual screening tools for facilitating drug lead discovery. These methods predict compounds of specific pharmacodynamic, pharmacokinetic or toxicological properties based on their structure-derived structural and physicochemical properties. Increasing attention has been directed at these methods because of their capability in predicting compounds of diverse structures and complex structure-activity relationships without requiring the knowledge of target 3D structure. This article reviews current progresses in using machine learning methods for virtual screening of pharmacodynamically active compounds from large compound libraries, and analyzes and compares the reported performances of machine learning tools with those of structure-based and other ligand-based (such as pharmacophore and clustering) virtual screening methods. The feasibility to improve the performance of machine learning methods in screening large libraries is discussed.

  3. Homology modeling a fast tool for drug discovery: current perspectives.

    PubMed

    Vyas, V K; Ukawala, R D; Ghate, M; Chintha, C

    2012-01-01

    A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. An understanding of protein-ligand interactions is therefore very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With improvements in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining the 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, handling distant homologues, modeling loops and side chains, and detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at the different stages of drug design and discovery.

  4. Homology Modeling a Fast Tool for Drug Discovery: Current Perspectives

    PubMed Central

    Vyas, V. K.; Ukawala, R. D.; Ghate, M.; Chintha, C.

    2012-01-01

    A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. An understanding of protein-ligand interactions is therefore very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With improvements in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining the 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, handling distant homologues, modeling loops and side chains, and detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at the different stages of drug design and discovery. PMID:23204616

  5. Serendipity: Accidental Discoveries in Science

    NASA Astrophysics Data System (ADS)

    Roberts, Royston M.

    1989-06-01

    Many of the things discovered by accident are important in our everyday lives: Teflon, Velcro, nylon, x-rays, penicillin, safety glass, sugar substitutes, and polyethylene and other plastics. And we owe a debt to accident for some of our deepest scientific knowledge, including Newton's theory of gravitation, the Big Bang theory of Creation, and the discovery of DNA. Even the Rosetta Stone, the Dead Sea Scrolls, and the ruins of Pompeii came to light through chance. This book tells the fascinating stories of these and other discoveries and reveals how the inquisitive human mind turns accident into discovery. Written for the layman, yet scientifically accurate, this illuminating collection of anecdotes portrays invention and discovery as quintessentially human acts, due in part to curiosity, perseverance, and luck.

  6. Closed-Loop Multitarget Optimization for Discovery of New Emulsion Polymerization Recipes

    PubMed Central

    2015-01-01

    Self-optimization of chemical reactions enables faster optimization of reaction conditions or discovery of molecules with required target properties. The technology of self-optimization has been expanded to the discovery of new process recipes for the manufacture of complex functional products. A new machine-learning algorithm, specifically designed for multiobjective target optimization with an explicit aim to minimize the number of “expensive” experiments, guides the discovery process. This “black-box” approach assumes no a priori knowledge of the chemical system and is hence particularly suited to the rapid development of processes to manufacture specialist low-volume, high-value products. The approach was demonstrated in the discovery of process recipes for a semibatch emulsion copolymerization, targeting a specific particle size and full conversion. PMID:26435638

  7. Discovery of novel bacterial toxins by genomics and computational biology.

    PubMed

    Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare

    2018-06-01

    Hundreds and hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.

  8. Applying data mining techniques to medical time series: an empirical case study in electroencephalography and stabilometry.

    PubMed

    Anguera, A; Barreiro, J M; Lara, J A; Lizcano, D

    2016-01-01

    One of the major challenges in the medical domain today is how to exploit the huge amount of data that this field generates. To do this, approaches are required that are capable of discovering knowledge that is useful for decision making in the medical field. Time series are data types that are common in the medical domain and require specialized analysis techniques and tools, especially if the information of interest to specialists is concentrated within particular time series regions, known as events. This research followed the steps specified by the so-called knowledge discovery in databases (KDD) process to discover knowledge from medical time series derived from stabilometric (396 series) and electroencephalographic (200 series) patient electronic health records (EHR). The view offered in the paper is based on the experience gathered as part of the VIIP project. Knowledge discovery in medical time series has a number of difficulties and implications that are highlighted by illustrating the application of several techniques that cover the entire KDD process through two case studies. This paper illustrates the application of different knowledge discovery techniques for the purposes of classification within the above domains. The accuracy of this application for the two classes considered in each case is 99.86% and 98.11% for epilepsy diagnosis in the electroencephalography (EEG) domain and 99.4% and 99.1% for early-age sports talent classification in the stabilometry domain. The KDD techniques achieve better results than other traditional neural network-based classification techniques.

  9. The cure: design and evaluation of a crowdsourcing game for gene selection for breast cancer survival prediction.

    PubMed

    Good, Benjamin M; Loguercio, Salvatore; Griffith, Obi L; Nanis, Max; Wu, Chunlei; Su, Andrew I

    2014-07-29

    Molecular signatures for predicting breast cancer prognosis could greatly improve care through personalization of treatment. Computational analyses of genome-wide expression datasets have identified such signatures, but these signatures leave much to be desired in terms of accuracy, reproducibility, and biological interpretability. Methods that take advantage of structured prior knowledge (eg, protein interaction networks) show promise in helping to define better signatures, but most knowledge remains unstructured. Crowdsourcing via scientific discovery games is an emerging methodology that has the potential to tap into human intelligence at scales and in modes unheard of before. The main objective of this study was to test the hypothesis that knowledge linking expression patterns of specific genes to breast cancer outcomes could be captured from players of an open, Web-based game. We envisioned capturing knowledge both from the player's prior experience and from their ability to interpret text related to candidate genes presented to them in the context of the game. We developed and evaluated an online game called The Cure that captured information from players regarding genes for use as predictors of breast cancer survival. Information gathered from game play was aggregated using a voting approach, and used to create rankings of genes. The top genes from these rankings were evaluated using annotation enrichment analysis, comparison to prior predictor gene sets, and by using them to train and test machine learning systems for predicting 10 year survival. Between its launch in September 2012 and September 2013, The Cure attracted more than 1000 registered players, who collectively played nearly 10,000 games. Gene sets assembled through aggregation of the collected data showed significant enrichment for genes known to be related to key concepts such as cancer, disease progression, and recurrence. In terms of the predictive accuracy of models trained using this information, these gene sets provided comparable performance to gene sets generated using other methods, including those used in commercial tests. The Cure is available on the Internet. The principal contribution of this work is to show that crowdsourcing games can be developed as a means to address problems involving domain knowledge. While most prior work on scientific discovery games and crowdsourcing in general takes as a premise that contributors have little or no expertise, here we demonstrated a crowdsourcing system that succeeded in capturing expert knowledge.

  10. Equation Discovery for Model Identification in Respiratory Mechanics of the Mechanically Ventilated Human Lung

    NASA Astrophysics Data System (ADS)

    Ganzert, Steven; Guttmann, Josef; Steinmann, Daniel; Kramer, Stefan

    Lung protective ventilation strategies reduce the risk of ventilator-associated lung injury. To develop such strategies, knowledge about the mechanical properties of the mechanically ventilated human lung is essential. This study was designed to develop an equation discovery system to identify mathematical models of the respiratory system in time-series data obtained from mechanically ventilated patients. Two techniques were combined: (i) the use of declarative bias to reduce search space complexity, which inherently provides for the processing of background knowledge; and (ii) a newly developed heuristic for traversing the hypothesis space with a greedy, randomized strategy analogous to the GSAT algorithm. In 96.8% of all runs the equation discovery system was able to detect the well-established equation of motion model of the respiratory system in the provided data. We see the potential of this semi-automatic approach to detect more complex mathematical descriptions of the respiratory system from respiratory data.
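    For context, the well-established equation of motion of the single-compartment respiratory model relates airway pressure to volume and flow, P(t) = E·V(t) + R·V̇(t) + P0, with elastance E, resistance R and end-expiratory pressure P0. The sketch below fits these parameters to synthetic ventilation data by least squares; the equation discovery system in the paper searches for this functional form rather than assuming it, and all numbers here are made up.

```python
# Least-squares fit of the single-compartment equation of motion
#   P(t) = E*V(t) + R*Vdot(t) + P0
# to synthetic ventilation data; the paper's system *discovers* this form,
# here we only illustrate the model itself. All values are made up.
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0, 3, 300)                 # one 3-second breath
Vdot = np.where(t < 1.0, 0.5, -0.25)       # L/s: inspiration, then expiration
V = np.cumsum(Vdot) * (t[1] - t[0])        # L: volume as integrated flow
E_true, R_true, P0_true = 25.0, 8.0, 5.0   # cmH2O/L, cmH2O/(L/s), cmH2O
P = E_true * V + R_true * Vdot + P0_true + rng.normal(0, 0.2, t.size)

A = np.column_stack([V, Vdot, np.ones_like(t)])
(E_hat, R_hat, P0_hat), *_ = np.linalg.lstsq(A, P, rcond=None)
print(f"E={E_hat:.1f} cmH2O/L, R={R_hat:.1f} cmH2O/(L/s), PEEP={P0_hat:.1f} cmH2O")
```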

  11. Antisense oligonucleotide technologies in drug discovery.

    PubMed

    Aboul-Fadl, Tarek

    2006-09-01

    The principle of antisense oligonucleotide (AS-OD) technologies is based on the specific inhibition of unwanted gene expression by blocking mRNA activity. It has long appeared to be an ideal strategy to leverage new genomic knowledge for drug discovery and development. In recent years, AS-OD technologies have been widely used as potent and promising tools for this purpose. There is a rapid increase in the number of antisense molecules progressing in clinical trials. AS-OD technologies provide a simple and efficient approach for drug discovery and development and are expected to become a reality in the near future. This editorial describes the established and emerging AS-OD technologies in drug discovery.

  12. 100 years of elementary particles [Beam Line, vol. 27, issue 1, Spring 1997]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pais, Abraham; Weinberg, Steven; Quigg, Chris

    1997-04-01

    This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.

  13. 100 years of Elementary Particles [Beam Line, vol. 27, issue 1, Spring 1997]

    DOE R&D Accomplishments Database

    Pais, Abraham; Weinberg, Steven; Quigg, Chris; Riordan, Michael; Panofsky, Wolfgang K. H.; Trimble, Virginia

    1997-04-01

    This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.

  14. Invention Versus Direct Instruction: For Some Content, It's a Tie

    NASA Astrophysics Data System (ADS)

    Chase, Catherine C.; Klahr, David

    2017-12-01

    An important, but as yet unresolved pedagogical question is whether discovery-oriented or direct instruction methods lead to greater learning and transfer. We address this issue in a study with 101 fourth and fifth grade students that contrasts two distinct instructional methods. One is a blend of discovery and direct instruction called Invent-then-Tell (IT), and the other is a version of direct instruction called Tell-then-Practice (TP). The relative effectiveness of these methods is compared in the context of learning a critical inquiry skill—the control-of-variables strategy. Previous research has demonstrated the success of IT over TP for teaching deep domain structures, while other research has demonstrated the superiority of direct instruction for teaching simple experimental design, a domain-general inquiry skill. In the present study, students in both conditions made equally large gains on an immediate assessment of their application and conceptual understanding of experimental design, and they also performed similarly on a test of far transfer. These results were fairly consistent across school populations with various levels of prior achievement and socioeconomic status. Findings suggest that broad claims about the relative effectiveness of these two distinct methods should be conditionalized by particular instructional contexts, such as the type of knowledge being taught.

  15. User needs analysis and usability assessment of DataMed - a biomedical data discovery index.

    PubMed

    Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R

    2017-11-30

    To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  16. A network model of knowledge accumulation through diffusion and upgrade

    NASA Astrophysics Data System (ADS)

    Zhuang, Enyu; Chen, Guanrong; Feng, Gang

    2011-07-01

    In this paper, we introduce a model to describe knowledge accumulation through knowledge diffusion and knowledge upgrade in a multi-agent network. Here, knowledge diffusion refers to the distribution of existing knowledge in the network, while knowledge upgrade means the discovery of new knowledge. It is found that the population of the network and the number of each agent’s neighbors affect the speed of knowledge accumulation. Four different policies for updating the neighboring agents are thus proposed, and their influence on the speed of knowledge accumulation and the topology evolution of the network is also studied.
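    A toy simulation can make the diffusion/upgrade distinction concrete. The sketch below is not the authors' model: agents on a random neighbour structure copy the highest knowledge level in their neighbourhood (diffusion) and occasionally discover a higher level themselves (upgrade); all parameters are arbitrary.

```python
# Toy agent-based sketch of knowledge diffusion vs. knowledge upgrade.
# This is only an illustration of the two mechanisms, not the paper's model.
import random

random.seed(3)
N, K, P_UPGRADE, STEPS = 100, 4, 0.01, 50   # agents, neighbours each, discovery prob., steps

neighbours = {i: random.sample([j for j in range(N) if j != i], K) for i in range(N)}
knowledge = [0] * N                          # knowledge level per agent

for _ in range(STEPS):
    new = knowledge[:]
    for i in range(N):
        # Diffusion: learn the best level available in the neighbourhood.
        new[i] = max([knowledge[i]] + [knowledge[j] for j in neighbours[i]])
        # Upgrade: with small probability, discover knowledge one level higher.
        if random.random() < P_UPGRADE:
            new[i] += 1
    knowledge = new

print("mean knowledge level:", sum(knowledge) / N)
```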

  17. Interfaith Education: An Islamic Perspective

    ERIC Educational Resources Information Center

    Pallavicini, Yahya Sergio Yahe

    2016-01-01

    According to a teaching of the Prophet Muhammad, "the quest for knowledge is the duty of each Muslim, male or female", where knowledge is meant as the discovery of the real value of things and of oneself in relationship with the world in which God has placed us. This universal dimension of knowledge is in fact a wealth of wisdom of the…

  18. KnowEnG: a knowledge engine for genomics.

    PubMed

    Sinha, Saurabh; Song, Jun; Weinshilboum, Richard; Jongeneel, Victor; Han, Jiawei

    2015-11-01

    We describe here the vision, motivations, and research plans of the National Institutes of Health Center for Excellence in Big Data Computing at the University of Illinois, Urbana-Champaign. The Center is organized around the construction of "Knowledge Engine for Genomics" (KnowEnG), an E-science framework for genomics where biomedical scientists will have access to powerful methods of data mining, network mining, and machine learning to extract knowledge out of genomics data. The scientist will come to KnowEnG with their own data sets in the form of spreadsheets and ask KnowEnG to analyze those data sets in the light of a massive knowledge base of community data sets called the "Knowledge Network" that will be at the heart of the system. The Center is undertaking discovery projects aimed at testing the utility of KnowEnG for transforming big data to knowledge. These projects span a broad range of biological enquiry, from pharmacogenomics (in collaboration with Mayo Clinic) to transcriptomics of human behavior. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  19. Assessment of composite motif discovery methods.

    PubMed

    Klepper, Kjetil; Sandve, Geir K; Abul, Osman; Johansen, Jostein; Drablos, Finn

    2008-02-26

    Computational discovery of regulatory elements is an important area of bioinformatics research and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery - discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery. We have developed a benchmarking framework for composite motif discovery and used it to evaluate the performance of eight published module discovery tools. Benchmark datasets were constructed based on real genomic sequences containing experimentally verified regulatory modules, and the module discovery programs were asked both to predict the locations of these modules and to specify the single motifs involved. To aid the programs in their search, we provided position weight matrices corresponding to the binding motifs of the transcription factors involved. In addition, selections of decoy matrices were mixed with the genuine matrices on one dataset to test the response of programs to varying levels of noise. Although some of the methods tested tended to score somewhat better than others overall, there were still large variations between individual datasets and no single method performed consistently better than the rest in all situations. The variation in performance on individual datasets also shows that the new benchmark datasets represent a suitable variety of challenges to most methods for module discovery.

  20. Multiple reaction monitoring (MRM)-profiling for biomarker discovery applied to human polycystic ovarian syndrome.

    PubMed

    Cordeiro, Fernanda B; Ferreira, Christina R; Sobreira, Tiago Jose P; Yannell, Karen E; Jarmusch, Alan K; Cedenho, Agnaldo P; Lo Turco, Edson G; Cooks, R Graham

    2017-09-15

    We describe multiple reaction monitoring (MRM)-profiling, which provides accelerated discovery of discriminating molecular features, and its application to human polycystic ovary syndrome (PCOS) diagnosis. The discovery phase of the MRM-profiling seeks molecular features based on some prior knowledge of the chemical functional groups likely to be present in the sample. It does this through use of a limited number of pre-chosen and chemically specific neutral loss and/or precursor ion MS/MS scans. The output of the discovery phase is a set of precursor/product transitions. In the screening phase these MRM transitions are used to interrogate multiple samples (hence the name MRM-profiling). MRM-profiling was applied to follicular fluid samples of 22 controls and 29 clinically diagnosed PCOS patients. Representative samples were delivered by flow injection to a triple quadrupole mass spectrometer set to perform a number of pre-chosen and chemically specific neutral loss and/or precursor ion MS/MS scans. The output of this discovery phase was a set of 1012 precursor/product transitions. In the screening phase each individual sample was interrogated for these MRM transitions. Principal component analysis (PCA) and receiver operating characteristic (ROC) curves were used for statistical analysis. To evaluate the method's performance, half the samples were used to build a classification model (testing set) and half were blinded (validation set). Twenty transitions were used for the classification of the blind samples, most of them (N = 19) showed lower abundances in the PCOS group and corresponded to phosphatidylethanolamine (PE) and phosphatidylserine (PS) lipids. Agreement of 73% with clinical diagnosis was found when classifying the 26 blind samples. MRM-profiling is a supervised method characterized by its simplicity, speed and the absence of chromatographic separation. It can be used to rapidly isolate discriminating molecules in healthy/disease conditions by tailored screening of signals associated with hundreds of molecules in complex samples. Copyright © 2017 John Wiley & Sons, Ltd.

  1. 77 FR 75459 - Self-Regulatory Organizations; BATS Exchange, Inc.; Notice of Filing of a Proposed Rule Change To...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-20

    ... both sides would participate in an Exchange Auction, this proposed change would aid in price discovery... auction price. This proposed change would aid in price discovery and help to reduce the likelihood of... Sell Shares and, therefore, a User would never have complete knowledge of liquidity available on both...

  2. Essential Skills and Knowledge for Troubleshooting E-Resources Access Issues in a Web-Scale Discovery Environment

    ERIC Educational Resources Information Center

    Carter, Sunshine; Traill, Stacie

    2017-01-01

    Electronic resource access troubleshooting is familiar work in most libraries. The added complexity introduced when a library implements a web-scale discovery service, however, creates a strong need for well-organized, rigorous training to enable troubleshooting staff to provide the best service possible. This article outlines strategies, tools,…

  3. Revealing Significant Relations between Chemical/Biological Features and Activity: Associative Classification Mining for Drug Discovery

    ERIC Educational Resources Information Center

    Yu, Pulan

    2012-01-01

    Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…

  4. Mothers' Initial Discovery of Childhood Disability: Exploring Maternal Identification of Developmental Issues in Young Children

    ERIC Educational Resources Information Center

    Silbersack, Elionora W.

    2014-01-01

    The purpose of this qualitative study was to expand the scarce information available on how mothers first observe their children's early development, assess potential problems, and then come to recognize their concerns. In-depth knowledge about mothers' perspectives on the discovery process can help social workers to promote identification of…

  5. Augmented Reality-Based Simulators as Discovery Learning Tools: An Empirical Study

    ERIC Educational Resources Information Center

    Ibáñez, María-Blanca; Di-Serio, Ángela; Villarán-Molina, Diego; Delgado-Kloos, Carlos

    2015-01-01

    This paper reports empirical evidence on having students use AR-SaBEr, a simulation tool based on augmented reality (AR), to discover the basic principles of electricity through a series of experiments. AR-SaBEr was enhanced with knowledge-based support and inquiry-based scaffolding mechanisms, which proved useful for discovery learning in…

  6. 76 FR 36320 - Rules of Practice in Proceedings Relative to False Representation and Lottery Orders

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-06-22

    ... officers. 952.18 Evidence. 952.19 Subpoenas. 952.20 Witness fees. 952.21 Discovery. 952.22 Transcript. 952..., motions, proposed orders, and other documents for the record. Discovery need not be filed except as may be... witnesses, that the statement correctly states the witness's opinion or knowledge concerning the matters in...

  7. Making the Long Tail Visible: Social Networking Sites and Independent Music Discovery

    ERIC Educational Resources Information Center

    Gaffney, Michael; Rafferty, Pauline

    2009-01-01

    Purpose: The purpose of this paper is to investigate users' knowledge and use of social networking sites and folksonomies to discover if social tagging and folksonomies, within the area of independent music, aid in its information retrieval and discovery. The sites examined in this project are MySpace, Lastfm, Pandora and Allmusic. In addition,…

  8. Enrichment assessment of multiple virtual screening strategies for Toll-like receptor 8 agonists based on a maximal unbiased benchmarking data set.

    PubMed

    Pei, Fen; Jin, Hongwei; Zhou, Xin; Xia, Jie; Sun, Lidan; Liu, Zhenming; Zhang, Liangren

    2015-11-01

    Toll-like receptor 8 agonists, which activate adaptive immune responses by inducing robust production of T-helper 1-polarizing cytokines, are promising candidates for vaccine adjuvants. As the binding site of toll-like receptor 8 is large and highly flexible, virtual screening by any individual method has inevitable limitations; thus, a comprehensive comparison of different methods may provide insights into seeking an effective strategy for the discovery of novel toll-like receptor 8 agonists. In this study, the performance of knowledge-based pharmacophore, shape-based 3D screening, and combined strategies was assessed against a maximal unbiased benchmarking data set containing 13 actives and 1302 decoys specialized for toll-like receptor 8 agonists. Prior structure-activity relationship knowledge was involved in knowledge-based pharmacophore generation, and a set of antagonists was innovatively used to verify the selectivity of the selected knowledge-based pharmacophore. The benchmarking data set was generated from our recently developed 'mubd-decoymaker' protocol. The enrichment assessment demonstrated considerable performance for our selected three-layer virtual screening strategy: knowledge-based pharmacophore (Phar1) screening, shape-based 3D similarity search (Q4_combo), and then a Gold docking screening. This virtual screening strategy could be further employed to perform large-scale database screening and to discover novel toll-like receptor 8 agonists. © 2015 John Wiley & Sons A/S.
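    Enrichment at a fixed fraction of the ranked database is the usual figure of merit in assessments of this kind. A minimal sketch with made-up scores, using the same 13-active/1302-decoy proportions as the benchmarking set, is:

```python
# Minimal enrichment-factor sketch for a ranked virtual screen (illustrative data only).
# EF(x%) = (fraction of actives in the top x%) / (fraction of actives overall).
import numpy as np

rng = np.random.default_rng(7)
labels = np.array([1] * 13 + [0] * 1302)              # 13 actives, 1302 decoys
# Hypothetical screening scores: actives score slightly higher on average.
scores = np.where(labels == 1,
                  rng.normal(1.0, 1.0, labels.size),
                  rng.normal(0.0, 1.0, labels.size))

def enrichment_factor(scores, labels, fraction=0.01):
    order = np.argsort(scores)[::-1]                  # best-scored compounds first
    n_top = max(1, int(round(fraction * labels.size)))
    hits_top = labels[order][:n_top].sum()
    return (hits_top / n_top) / (labels.sum() / labels.size)

print("EF(1%) =", round(enrichment_factor(scores, labels, 0.01), 1))
```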

  9. The interdependence between screening methods and screening libraries.

    PubMed

    Shelat, Anang A; Guy, R Kiplin

    2007-06-01

    The most common methods for the discovery of chemical compounds capable of manipulating biological function involve some form of screening. The success of such screens is highly dependent on the chemical materials - commonly referred to as libraries - that are assayed. Classic methods for the design of screening libraries have depended on knowledge of target structure and relevant pharmacophores for target focus, and on simple count-based measures to assess other properties. The recent proliferation of two novel screening paradigms, structure-based screening and high-content screening, prompts a profound rethink about the ideal composition of small-molecule screening libraries. We suggest that currently utilized libraries are not optimal for addressing new targets by high-throughput screening, or complex phenotypes by high-content screening.

  10. Sports Stars: Analyzing the Performance of Astronomers at Visualization-based Discovery

    NASA Astrophysics Data System (ADS)

    Fluke, C. J.; Parrington, L.; Hegarty, S.; MacMahon, C.; Morgan, S.; Hassan, A. H.; Kilborn, V. A.

    2017-05-01

    In this data-rich era of astronomy, there is a growing reliance on automated techniques to discover new knowledge. The role of the astronomer may change from being a discoverer to being a confirmer. But what do astronomers actually look at when they distinguish between “sources” and “noise?” What are the differences between novice and expert astronomers when it comes to visual-based discovery? Can we identify elite talent or coach astronomers to maximize their potential for discovery? By looking to the field of sports performance analysis, we consider an established, domain-wide approach, where the expertise of the viewer (i.e., a member of the coaching team) plays a crucial role in identifying and determining the subtle features of gameplay that provide a winning advantage. As an initial case study, we investigate whether the SportsCode performance analysis software can be used to understand and document how an experienced H I astronomer makes discoveries in spectral data cubes. We find that the process of timeline-based coding can be applied to spectral cube data by mapping spectral channels to frames within a movie. SportsCode provides a range of easy to use methods for annotation, including feature-based codes and labels, text annotations associated with codes, and image-based drawing. The outputs, including instance movies that are uniquely associated with coded events, provide the basis for a training program or team-based analysis that could be used in unison with discipline specific analysis software. In this coordinated approach to visualization and analysis, SportsCode can act as a visual notebook, recording the insight and decisions in partnership with established analysis methods. Alternatively, in situ annotation and coding of features would be a valuable addition to existing and future visualization and analysis packages.

  11. University of Washington's eScience Institute Promotes New Training and Career Pathways in Data Science

    NASA Astrophysics Data System (ADS)

    Stone, S.; Parker, M. S.; Howe, B.; Lazowska, E.

    2015-12-01

    Rapid advances in technology are transforming nearly every field from "data-poor" to "data-rich." The ability to extract knowledge from this abundance of data is the cornerstone of 21st century discovery. At the University of Washington eScience Institute, our mission is to engage researchers across disciplines in developing and applying advanced computational methods and tools to real world problems in data-intensive discovery. Our research team consists of individuals with diverse backgrounds in domain sciences such as astronomy, oceanography and geology, with complementary expertise in advanced statistical and computational techniques such as data management, visualization, and machine learning. Two key elements are necessary to foster careers in data science: individuals with cross-disciplinary training in both method and domain sciences, and career paths emphasizing alternative metrics for advancement. We see persistent and deep-rooted challenges for the career paths of people whose skills, activities and work patterns don't fit neatly into the traditional roles and success metrics of academia. To address these challenges the eScience Institute has developed training programs and established new career opportunities for data-intensive research in academia. Our graduate students and post-docs have mentors in both a methodology and an application field. They also participate in coursework and tutorials to advance technical skill and foster community. Professional Data Scientist positions were created to support research independence while encouraging the development and adoption of domain-specific tools and techniques. The eScience Institute also supports the appointment of faculty who are innovators in developing and applying data science methodologies to advance their field of discovery. Our ultimate goal is to create a supportive environment for data science in academia and to establish global recognition for data-intensive discovery across all fields.

  12. No wisdom in the crowd: genome annotation in the era of big data - current status and future prospects.

    PubMed

    Danchin, Antoine; Ouzounis, Christos; Tokuyasu, Taku; Zucker, Jean-Daniel

    2018-07-01

    Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from 'the sequence tells the structure tells the function' fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  13. Protein crystallography and drug discovery: recollections of knowledge exchange between academia and industry

    PubMed Central

    2017-01-01

    The development of structure-guided drug discovery is a story of knowledge exchange where new ideas originate from all parts of the research ecosystem. Dorothy Crowfoot Hodgkin obtained insulin from Boots Pure Drug Company in the 1930s and insulin crystallization was optimized in the company Novo in the 1950s, allowing the structure to be determined at Oxford University. The structure of renin was determined in academia, on this occasion in London, in response to a need to develop antihypertensives in pharma. The idea of a dimeric aspartic protease came from an international academic team and was discovered in HIV; it eventually led to new HIV antivirals being developed in industry. Structure-guided fragment-based discovery was developed in large pharma and biotechs, but has been exploited in academia for the development of new inhibitors targeting protein–protein interactions and also antimicrobials to combat mycobacterial infections such as tuberculosis. These observations provide a strong argument against the so-called ‘linear model’, where ideas flow only in one direction from academic institutions to industry. Structure-guided drug discovery is a story of applications of protein crystallography and knowledge exchange between academia and industry that has led to new drug approvals for cancer and other common medical conditions by the Food and Drug Administration in the USA, as well as hope for the treatment of rare genetic diseases and infectious diseases that are a particular challenge in the developing world. PMID:28875019

  14. Choosing experiments to accelerate collective discovery

    PubMed Central

    Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.

    2015-01-01

    A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies. PMID:26554009
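
    To make the two network properties concrete, here is a minimal sketch (not the authors' model) in which an entity's importance is read off as degree centrality and a candidate problem's difficulty as the network distance between the two entities it would link; the entities and relations are hypothetical.

```python
# A toy knowledge network: nodes are biomedical entities, edges are relations
# already reported in the literature (all hypothetical).
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("aspirin", "COX-1"), ("aspirin", "COX-2"),
    ("COX-2", "inflammation"), ("inflammation", "IL-6"),
    ("IL-6", "rheumatoid arthritis"),
])

importance = nx.degree_centrality(G)                 # proxy for an entity's importance
candidate = ("aspirin", "rheumatoid arthritis")      # an as-yet-unpublished link
difficulty = nx.shortest_path_length(G, *candidate)  # proxy for the problem's difficulty

print("importance:", importance)
print("difficulty of", candidate, "=", difficulty, "hops")
```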

  15. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies.

    PubMed

    Rahmani, Elior; Zaitlen, Noah; Baran, Yael; Eng, Celeste; Hu, Donglei; Galanter, Joshua; Oh, Sam; Burchard, Esteban G; Eskin, Eleazar; Zou, James; Halperin, Eran

    2016-05-01

    In epigenome-wide association studies (EWAS), different methylation profiles of distinct cell types may lead to false discoveries. We introduce ReFACTor, a method based on principal component analysis (PCA) and designed for the correction of cell type heterogeneity in EWAS. ReFACTor does not require knowledge of cell counts, and it provides improved estimates of cell type composition, resulting in improved power and control for false positives in EWAS. Corresponding software is available at http://www.cs.tau.ac.il/~heran/cozygene/software/refactor.html.
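
    As a rough illustration of the general idea behind reference-free correction (not the ReFACTor algorithm itself, which selects informative sites via sparse PCA), the sketch below estimates latent composition axes with ordinary PCA and includes them as covariates in a per-site association test; all data and dimensions are simulated.

```python
# Simulated EWAS data: per-sample methylation levels at CpG sites plus a binary
# phenotype. Latent axes from PCA stand in for unknown cell-type composition.
import numpy as np
import statsmodels.api as sm
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_samples, n_sites, k = 100, 500, 5
methylation = rng.random((n_samples, n_sites))   # rows: samples, columns: CpG sites
phenotype = rng.integers(0, 2, n_samples)        # case/control status

components = PCA(n_components=k).fit_transform(methylation)

# Association test at one CpG site, adjusting for the inferred components.
X = sm.add_constant(np.column_stack([phenotype, components]))
fit = sm.OLS(methylation[:, 0], X).fit()
print("adjusted p-value for the phenotype at site 0:", fit.pvalues[1])
```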

  16. Knowledge Discovery, Integration and Communication for Extreme Weather and Flood Resilience Using Artificial Intelligence: Flood AI Alpha

    NASA Astrophysics Data System (ADS)

    Demir, I.; Sermet, M. Y.

    2016-12-01

    Nobody is immune from extreme events or natural hazards that can lead to large-scale consequences for the nation and the public. One of the solutions to reduce the impacts of extreme events is to invest in improving resilience, with the ability to better prepare for, plan for, recover from, and adapt to disasters. The National Research Council (NRC) report discusses how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data and information, identifies gaps and knowledge challenges that need to be addressed, and suggests that every individual have access to risk and vulnerability information to make their communities more resilient. This abstract presents our project on developing a resilience framework for flooding to improve societal preparedness, with the following objectives: (a) develop a generalized ontology for extreme events with a primary focus on flooding; (b) develop a knowledge engine with voice recognition, artificial intelligence, natural language processing, and an inference engine; the knowledge engine will utilize the flood ontology and concepts to connect user input to relevant knowledge discovery outputs on flooding; (c) develop a data acquisition and processing framework drawing on existing environmental observations, forecast models, and social networks; the system will utilize the framework, capabilities and user base of the Iowa Flood Information System (IFIS) to populate and test the system; (d) develop a communication framework to support user interaction and delivery of information to users. The interaction and delivery channels will include voice and text input via a web-based system (e.g. IFIS), agent-based bots (e.g. Microsoft Skype, Facebook Messenger), smartphone and augmented reality applications (e.g. smart assistant), and automated web workflows (e.g. IFTTT, CloudWork), opening knowledge discovery for flooding to thousands of community-extensible web workflows.

  17. Functional Abstraction as a Method to Discover Knowledge in Gene Ontologies

    PubMed Central

    Ultsch, Alfred; Lötsch, Jörn

    2014-01-01

    Computational analyses of functions of gene sets obtained in microarray analyses or by topical database searches are increasingly important in biology. To understand their functions, the sets are usually mapped to Gene Ontology knowledge bases by means of over-representation analysis (ORA). Its result represents the specific knowledge of the functionality of the gene set. However, the specific ontology typically consists of many terms and relationships, hindering the understanding of the ‘main story’. We developed a methodology to identify a comprehensibly small number of GO terms as “headlines” of the specific ontology, allowing all central aspects of the roles of the involved genes to be understood. The Functional Abstraction method finds a set of headlines that is specific enough to cover all details of a specific ontology and is abstract enough for human comprehension. This method exceeds classical approaches to ORA abstraction and, by focusing on information rather than decorrelation of GO terms, directly targets human comprehension. Functional abstraction provides, with a maximum of certainty, information value, coverage and conciseness, a representation of the biological functions in which a gene set plays a role. This is the necessary means to interpret complex Gene Ontology results, thus strengthening the role of functional genomics in biomarker and drug discovery. PMID:24587272
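
    The ORA input step the method builds on can be illustrated with a single hypergeometric test per GO term; the counts below are hypothetical.

```python
# One-sided hypergeometric test: is the GO term over-represented in the gene set?
from scipy.stats import hypergeom

N = 20000   # background genes
K = 300     # background genes annotated with the GO term
n = 150     # genes in the submitted set
k = 12      # submitted genes annotated with the GO term

p_value = hypergeom.sf(k - 1, N, K, n)   # P(X >= k) under random sampling
print(f"ORA p-value for this GO term: {p_value:.3g}")
```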

  18. Building Community-Engaged Health Research and Discovery Infrastructure on the South Side of Chicago: Science in Service to Community Priorities

    PubMed Central

    Lindau, Stacy Tessler; Makelarski, Jennifer A.; Chin, Marshall H.; Desautels, Shane; Johnson, Daniel; Johnson, Waldo E.; Miller, Doriane; Peters, Susan; Robinson, Connie; Schneider, John; Thicklin, Florence; Watson, Natalie P.; Wolfe, Marcus; Whitaker, Eric

    2011-01-01

    Objective To describe the roles community members can and should play in, and an asset-based strategy used by Chicago’s South Side Health and Vitality Studies for, building sustainable, large-scale community health research infrastructure. The Studies are a family of research efforts aiming to produce actionable knowledge to inform health policy, programming, and investments for the region. Methods Community and university collaborators, using a consensus-based approach, developed shared theoretical perspectives, guiding principles, and a model for collaboration in 2008, which were used to inform an asset-based operational strategy. Ongoing community engagement and relationship-building support the infrastructure and research activities of the Studies. Results Key steps in the asset-based strategy include: 1) continuous community engagement and relationship building, 2) identifying community priorities, 3) identifying community assets, 4) leveraging assets, 5) conducting research, 6) sharing knowledge and 7) informing action. Examples of community member roles, and how these are informed by the Studies’ guiding principles, are provided. Conclusions Community and university collaborators, with shared vision and principles, can effectively work together to plan innovative, large-scale community-based research that serves community needs and priorities. Sustainable, effective models are needed to realize NIH’s mandate for meaningful translation of biomedical discovery into improved population health. PMID:21236295

  19. Pollen--tiny and ephemeral but not forgotten: New ideas on their ecology and evolution.

    PubMed

    Williams, Joseph H; Mazer, Susan J

    2016-03-01

    Ecologists and evolutionary biologists have been interested in the functional biology of pollen since the discovery in the 1800s that pollen grains encompass tiny plants (male gametophytes) that develop and produce sperm cells. After the discovery of double fertilization in flowering plants, botanists in the early 1900s were quick to explore the effects of temperature and maternal nutrients on pollen performance, while evolutionary biologists began studying the nature of haploid selection and pollen competition. A series of technical and theoretic developments have subsequently, but usually separately, expanded our knowledge of the nature of pollen performance and how it evolves. Today, there is a tremendous diversity of interests that touch on pollen performance, ranging from the ecological setting on the stigma, structural and physiological aspects of pollen germination and tube growth, the form of pollen competition and its role in sexual selection in plants, virus transmission, mating system evolution, and inbreeding depression. Given the explosion of technical knowledge of pollen cell biology, computer modeling, and new methods to deal with diversity in a phylogenetic context, we are now more than ever poised for a new era of research that includes complex functional traits that limit or enhance the evolution of these deceptively simple organisms. © 2016 Botanical Society of America.

  20. A Semantic Lexicon-Based Approach for Sense Disambiguation and Its WWW Application

    NASA Astrophysics Data System (ADS)

    di Lecce, Vincenzo; Calabrese, Marco; Soldo, Domenico

    This work proposes a basic framework for resolving sense disambiguation through the use of Semantic Lexicon, a machine readable dictionary managing both word senses and lexico-semantic relations. More specifically, polysemous ambiguity characterizing Web documents is discussed. The adopted Semantic Lexicon is WordNet, a lexical knowledge-base of English words widely adopted in many research studies referring to knowledge discovery. The proposed approach extends recent works on knowledge discovery by focusing on the sense disambiguation aspect. By exploiting the structure of WordNet database, lexico-semantic features are used to resolve the inherent sense ambiguity of written text with particular reference to HTML resources. The obtained results may be extended to generic hypertextual repositories as well. Experiments show that polysemy reduction can be used to hint about the meaning of specific senses in given contexts.
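
    The polysemy problem the paper targets is easy to see directly in WordNet; a minimal sketch using NLTK's WordNet interface lists several senses of one word together with the lexico-semantic (hypernym) relations that can serve as disambiguation cues.

```python
# List a few senses of a polysemous word and their hypernyms from WordNet.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)   # fetch the WordNet corpus if missing

for synset in wn.synsets("bank")[:3]:
    hypernyms = [h.name() for h in synset.hypernyms()]
    print(synset.name(), "-", synset.definition(), "| hypernyms:", hypernyms)
```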

  1. State of the Art in Tumor Antigen and Biomarker Discovery

    PubMed Central

    Even-Desrumeaux, Klervi; Baty, Daniel; Chames, Patrick

    2011-01-01

    Our knowledge of tumor immunology has resulted in multiple approaches for the treatment of cancer. However, a gap has opened between research on new tumor markers and the development of immunotherapy, and very few markers exist that can be used for treatment. The challenge is now to discover new targets for active and passive immunotherapy. This review aims at describing recent advances in biomarker and tumor antigen discovery in terms of antigen nature and localization, and highlights the most recent approaches used for their discovery, including “omics” technology. PMID:24212823

  2. Toward better drug repositioning: prioritizing and integrating existing methods into efficient pipelines.

    PubMed

    Jin, Guangxu; Wong, Stephen T C

    2014-05-01

    Recycling old drugs, rescuing shelved drugs and extending patents' lives make drug repositioning an attractive form of drug discovery. Drug repositioning accounts for approximately 30% of the newly US Food and Drug Administration (FDA)-approved drugs and vaccines in recent years. The prevalence of drug-repositioning studies has resulted in a variety of innovative computational methods for the identification of new opportunities for the use of old drugs. Questions often arise from customizing or optimizing these methods into efficient drug-repositioning pipelines for alternative applications. It requires a comprehensive understanding of the available methods gained by evaluating both biological and pharmaceutical knowledge and the elucidated mechanism-of-action of drugs. Here, we provide guidance for prioritizing and integrating drug-repositioning methods for specific drug-repositioning pipelines. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Application of theoretical methods to increase succinate production in engineered strains.

    PubMed

    Valderrama-Gomez, M A; Kreitmayer, D; Wolf, S; Marin-Sanguino, A; Kremling, A

    2017-04-01

    Computational methods have enabled the discovery of non-intuitive strategies to enhance the production of a variety of target molecules. In the case of succinate production, reviews covering the topic have not yet analyzed the impact and future potential that such methods may have. In this work, we review the application of computational methods to the production of succinic acid. We found that while a total of 26 theoretical studies were published between 2002 and 2016, only 10 studies reported the successful experimental implementation of any kind of theoretical knowledge. None of the experimental studies reported an exact application of the computational predictions. However, the combination of computational analysis with complementary strategies, such as directed evolution and comparative genome analysis, serves as a proof of concept and demonstrates that successful metabolic engineering can be guided by rational computational methods.

  4. Cache-Cache Comparison for Supporting Meaningful Learning

    ERIC Educational Resources Information Center

    Wang, Jingyun; Fujino, Seiji

    2015-01-01

    The paper presents a meaningful discovery learning environment called "cache-cache comparison" for a personalized learning support system. The processing of seeking hidden relations or concepts in "cache-cache comparison" is intended to encourage learners to actively locate new knowledge in their knowledge framework and check…

  5. From Wisdom to Innocence: Passing on the Knowledge of the Night Sky

    NASA Technical Reports Server (NTRS)

    Shope, R.

    1996-01-01

    Memorable learning can happen when the whole family shares the thrill of discovery together. The fascination of the night sky presents a perfect opportunity for gifted parents and children to experience the tradition of passing on knowledge from generation to generation.

  6. Knowledge Discovery in Variant Databases Using Inductive Logic Programming

    PubMed Central

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D.

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/. PMID:23589683

  7. Knowledge discovery in variant databases using inductive logic programming.

    PubMed

    Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D

    2013-01-01

    Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.

  8. Mixed Reality Meets Pharmaceutical Development.

    PubMed

    Forrest, William P; Mackey, Megan A; Shah, Vivek M; Hassell, Kerry M; Shah, Prashant; Wylie, Jennifer L; Gopinath, Janakiraman; Balderhaar, Henning; Li, Li; Wuelfing, W Peter; Helmy, Roy

    2017-12-01

    As science evolves, the need for more efficient and innovative knowledge transfer capabilities becomes evident. Advances in drug discovery and delivery sciences have directly impacted the pharmaceutical industry, though the added complexities have not shortened the development process. These added complexities also make it difficult for scientists to rapidly and effectively transfer knowledge to offset the lengthened drug development timelines. While webcams, camera phones, and iPads have been explored as potential new methods of real-time information sharing, the non-"hands-free" nature and lack of viewer and observer point-of-view render them unsuitable for the R&D laboratory or manufacturing setting. As an alternative solution, the Microsoft HoloLens mixed-reality headset was evaluated as a more efficient, hands-free method of knowledge transfer and information sharing. After completing a traditional method transfer between 3 R&D sites (Rahway, NJ; West Point, PA and Schnachen, Switzerland), a retrospective analysis of efficiency gain was performed through the comparison of a mock method transfer between NJ and PA sites using the HoloLens. The results demonstrated a minimum 10-fold gain in efficiency, weighing in from a savings in time, cost, and the ability to have real-time data analysis and discussion. In addition, other use cases were evaluated involving vendor and contract research/manufacturing organizations. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.

  9. Translational systems biology using an agent-based approach for dynamic knowledge representation: An evolutionary paradigm for biomedical research.

    PubMed

    An, Gary C

    2010-01-01

    The greatest challenge facing the biomedical research community is the effective translation of basic mechanistic knowledge into clinically effective therapeutics. This challenge is most evident in attempts to understand and modulate "systems" processes/disorders, such as sepsis, cancer, and wound healing. Formulating an investigatory strategy for these issues requires the recognition that these are dynamic processes. Representation of the dynamic behavior of biological systems can aid in the investigation of complex pathophysiological processes by augmenting existing discovery procedures by integrating disparate information sources and knowledge. This approach is termed Translational Systems Biology. Focusing on the development of computational models capturing the behavior of mechanistic hypotheses provides a tool that bridges gaps in the understanding of a disease process by visualizing "thought experiments" to fill those gaps. Agent-based modeling is a computational method particularly well suited to the translation of mechanistic knowledge into a computational framework. Utilizing agent-based models as a means of dynamic hypothesis representation will be a vital means of describing, communicating, and integrating community-wide knowledge. The transparent representation of hypotheses in this dynamic fashion can form the basis of "knowledge ecologies," where selection between competing hypotheses will apply an evolutionary paradigm to the development of community knowledge.
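
    A minimal sketch of agent-based dynamic knowledge representation (a toy model, not one of the author's published models): each agent encodes a mechanistic rule, and running the population forward turns the hypothesis into observable dynamics.

```python
# Toy agent-based model: macrophage-like agents react to a shared damage signal.
import random

random.seed(1)

class Macrophage:
    """Mechanistic rule as an agent: activate above a threshold, amplify when active."""
    def __init__(self):
        self.activated = False

    def step(self, damage_signal):
        if damage_signal > 0.5:
            self.activated = True
        elif random.random() < 0.1:       # stochastic return to the resting state
            self.activated = False
        # Activated cells amplify the signal; resting cells clear a little of it.
        return 0.05 if self.activated else -0.02

agents = [Macrophage() for _ in range(100)]
damage = 0.6                               # hypothetical initial insult
for t in range(20):
    damage = max(0.0, damage + sum(a.step(damage) for a in agents) / len(agents))
    active = sum(a.activated for a in agents)
    print(f"t={t:2d}  damage={damage:.2f}  active agents={active}")
```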

  10. Learning and Relevance in Information Retrieval: A Study in the Application of Exploration and User Knowledge to Enhance Performance

    ERIC Educational Resources Information Center

    Hyman, Harvey

    2012-01-01

    This dissertation examines the impact of exploration and learning upon eDiscovery information retrieval; it is written in three parts. Part I contains foundational concepts and background on the topics of information retrieval and eDiscovery. This part informs the reader about the research frameworks, methodologies, data collection, and…

  11. Using Discovery Maps as a Free-Choice Learning Process Can Enhance the Effectiveness of Environmental Education in a Botanical Garden

    ERIC Educational Resources Information Center

    Yang, Xi; Chen, Jin

    2017-01-01

    Botanical gardens (BGs) are important agencies that enhance human knowledge and attitude towards flora conservation. By following free-choice learning model, we developed a "Discovery map" and distributed the map to visitors at the Xishuangbanna Tropical Botanical Garden in Yunnan, China. Visitors, who did and did not receive discovery…

  12. Discovery and Observations of a Stem-Boring Weevil (Myrmex sp.) a Potentially Useful Biocontrol of Mistletoe

    Treesearch

    J. D. Solomon; L. Newsome; T. H. Filer

    1984-01-01

    A stem-boring weevil obtained from infested clusters of mistletoe was subsequently reared and identified as Myrmex sp. To our knowledge its discovery in Mississippi is the easternmost record of mistletoe-feeding Myrmex, previously recorded only from the West and Southwest. Based on current studies, the weevil overwinters as larvae in tunnels within mistletoe stems....

  13. The importance of Leonhard Euler's discoveries in the field of shipbuilding for the scientific evolution of academician A. N. Krylov

    NASA Astrophysics Data System (ADS)

    Sharkov, N. A.; Sharkova, O. A.

    2018-05-01

    The paper identifies the importance of Leonhard Euler's discoveries in the field of shipbuilding for the scientific evolution of academician A. N. Krylov and for modern knowledge of the survivability and safety of ships. Leonhard Euler's works "Marine Science" and "The Moon Motion New Theory" are discussed.

  14. The semantic connectivity map: an adapting self-organising knowledge discovery method in data bases. Experience in gastro-oesophageal reflux disease.

    PubMed

    Buscema, Massimo; Grossi, Enzo

    2008-01-01

    We describe here a new mapping method able to uncover connectivity traces among variables using an artificial adaptive system, the Auto Contractive Map (AutoCM), which defines the strength of the association of each variable with all the others in a dataset. After the training phase, the weights matrix of the AutoCM represents the map of the main connections between the variables. The example of a gastro-oesophageal reflux disease database is extremely useful for showing how this new approach can help to re-design the overall structure of factors involved in the description of complex, specific diseases.

  15. Determining significant material properties: A discovery approach

    NASA Technical Reports Server (NTRS)

    Karplus, Alan K.

    1992-01-01

    The following is a laboratory experiment designed to further understanding of materials science. The experiment itself can be informative for persons of any age past elementary school, and even for some in elementary school. The preparation of the plastic samples is readily accomplished by persons with reasonable dexterity in the cutting of paper designs. The completion of the statistical Design of Experiments, which uses Yates' Method, requires basic math (addition and subtraction). Interpretive work requires plotting of data and making observations. Knowledge of statistical methods would be helpful. The purpose of this experiment is to acquaint students with the seven classes of recyclable plastics, and provide hands-on learning about the response of these plastics to mechanical tensile loading.
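
    The arithmetic behind Yates' Method is simple enough to script; the sketch below runs the standard sum-and-difference passes for a two-level factorial design, with hypothetical tensile-strength responses.

```python
# Yates' algorithm for a 2^k factorial design with responses in standard order.
def yates(responses):
    n = len(responses)
    col = list(responses)
    k = n.bit_length() - 1                 # number of factors
    for _ in range(k):
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    # First entry / n is the grand mean; the others / (n/2) are the effects.
    return [col[0] / n] + [c / (n / 2) for c in col[1:]]

# 2^2 example: hypothetical tensile strengths for runs (1), a, b, ab.
print(yates([28.0, 36.0, 18.0, 31.0]))     # [mean, A effect, B effect, AB interaction]
```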

  16. Psychological and Physiological Mechanisms by Which Discovery and Didactic Methods Work.

    ERIC Educational Resources Information Center

    Keegan, Mark

    1995-01-01

    Describes physiological, affective, and cognitive mechanisms by which didactic and discovery methods appear to work, as revealed by research literature. The optimal instructional method depends on the instructional objective, the educator's skills, and the nature of the students. Suggests more use of discovery methods. (69 references) (Author/MKR)

  17. QA-driven Guidelines Generation for Bacteriotherapy

    PubMed Central

    Pasche, Emilie; Teodoro, Douglas; Gobeill, Julien; Ruch, Patrick; Lovis, Christian

    2009-01-01

    PURPOSE We propose a question-answering (QA) driven generation approach for automatic acquisition of structured rules that can be used in a knowledge authoring tool for antibiotic prescription guidelines management. METHODS: The rule generation is seen as a question-answering problem, where the parameters of the questions are known items of the rule (e.g. an infectious disease, caused by a given bacterium) and answers (e.g. some antibiotics) are obtained by a question-answering engine. RESULTS: When looking for a drug given a pathogen and a disease, top-precision of 0.55 is obtained by the combination of the Boolean engine (PubMed) and the relevance-driven engine (easyIR), which means that for more than half of our evaluation benchmark at least one of the recommended antibiotics was automatically acquired by the rule generation method. CONCLUSION: These results suggest that such an automatic text mining approach could provide a useful tool for guidelines management, by improving knowledge update and discovery. PMID:20351908

  18. Supervised extensions of chemography approaches: case studies of chemical liabilities assessment

    PubMed Central

    2014-01-01

    Chemical liabilities, such as adverse effects and toxicity, play a significant role in the modern drug discovery process. In silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Herein, we propose an approach combining several classification and chemography methods to predict chemical liabilities and to interpret the obtained results in the context of the impact of structural changes of compounds on their pharmacological profile. To our knowledge for the first time, the supervised extension of Generative Topographic Mapping is proposed as an effective new chemography method. A new approach for mapping new data using supervised Isomap without re-building models from scratch is also proposed. Two approaches for estimating a model's applicability domain are used in our study, to our knowledge for the first time in chemoinformatics. Structural alerts responsible for negative characteristics of the pharmacological profile of chemical compounds have been found as a result of model interpretation. PMID:24868246

  19. Cost-Benefit Analysis of Confidentiality Policies for Advanced Knowledge Management Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    May, D

    Knowledge Discovery (KD) processes can create new information within a Knowledge Management (KM) system. In many domains, including government, this new information must be secured against unauthorized disclosure. Applying an appropriate confidentiality policy achieves this. However, it is not evident which confidentiality policy to apply, especially when the goals of sharing and disseminating knowledge have to be balanced with the requirements to secure knowledge. This work proposes to solve this problem by developing a cost-benefit analysis technique for examining the tradeoffs between securing and sharing discovered knowledge.

  20. Computational approaches for drug discovery.

    PubMed

    Hung, Che-Lun; Chen, Chi-Chun

    2014-09-01

    Cellular proteins are the mediators of multiple organism functions, being involved in physiological mechanisms and disease. By discovering lead compounds that affect the function of target proteins, the target diseases or physiological mechanisms can be modulated. Based on knowledge of the ligand-receptor interaction, the chemical structures of leads can be modified to improve efficacy, selectivity and reduce side effects. One rational drug design technology, which enables drug discovery based on knowledge of target structures, functional properties and mechanisms, is computer-aided drug design (CADD). The application of CADD can be cost-effective using experiments to compare predicted and actual drug activity, the results from which can be used iteratively to improve compound properties. The two major CADD-based approaches are structure-based drug design, where protein structures are required, and ligand-based drug design, where ligands and ligand activities can be used to design compounds interacting with the protein structure. Approaches in structure-based drug design include docking, de novo design, fragment-based drug discovery and structure-based pharmacophore modeling. Approaches in ligand-based drug design include quantitative structure-affinity relationship and pharmacophore modeling based on ligand properties. Based on whether the structure of the receptor and its interaction with the ligand are known, different design strategies can be used. After lead compounds are generated, the rule of five can be used to assess whether these have drug-like properties. Several quality validation methods, such as cost function analysis, Fisher's cross-validation analysis and goodness of hit test, can be used to estimate the metrics of different drug design strategies. To further improve CADD performance, multi-computers and graphics processing units may be applied to reduce costs. © 2014 Wiley Periodicals, Inc.
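
    The rule-of-five screen mentioned above can be expressed in a few lines; a hedged sketch using RDKit is shown below (the SMILES string, aspirin, is just an example).

```python
# Count Lipinski rule-of-five violations for one molecule (aspirin as an example).
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")   # aspirin

violations = sum([
    Descriptors.MolWt(mol) > 500,
    Descriptors.MolLogP(mol) > 5,
    Lipinski.NumHDonors(mol) > 5,
    Lipinski.NumHAcceptors(mol) > 10,
])
print("rule-of-five violations:", violations)        # 0-1 is usually considered drug-like
```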

  1. Discovery of the leinamycin family of natural products by mining actinobacterial genomes

    PubMed Central

    Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen

    2017-01-01

    Nature’s ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF–SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF–SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature’s rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature’s biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity. PMID:29229819

  2. Discovery of the leinamycin family of natural products by mining actinobacterial genomes.

    PubMed

    Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben

    2017-12-26

    Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm -type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.

  3. 40 CFR 22.19 - Prehearing information exchange; prehearing conference; other discovery.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... method of discovery sought, provide the proposed discovery instruments, and describe in detail the nature... finding that: (i) The information sought cannot reasonably be obtained by alternative methods of discovery... promptly supplement or correct the exchange when the party learns that the information exchanged or...

  4. A Sequence-Independent Strategy for Detection and Cloning of Circular DNA Virus Genomes by Using Multiply Primed Rolling-Circle Amplification

    PubMed Central

    Rector, Annabel; Tachezy, Ruth; Van Ranst, Marc

    2004-01-01

    The discovery of novel viruses has often been accomplished by using hybridization-based methods that necessitate the availability of a previously characterized virus genome probe or knowledge of the viral nucleotide sequence to construct consensus or degenerate PCR primers. In their natural replication cycle, certain viruses employ a rolling-circle mechanism to propagate their circular genomes, and multiply primed rolling-circle amplification (RCA) with φ29 DNA polymerase has recently been applied in the amplification of circular plasmid vectors used in cloning. We employed an isothermal RCA protocol that uses random hexamer primers to amplify the complete genomes of papillomaviruses without the need for prior knowledge of their DNA sequences. We optimized this RCA technique with extracted human papillomavirus type 16 (HPV-16) DNA from W12 cells, using a real-time quantitative PCR assay to determine amplification efficiency, and obtained a 2.4 × 10^4-fold increase in HPV-16 DNA concentration. We were able to clone the complete HPV-16 genome from this multiply primed RCA product. The optimized protocol was subsequently applied to a bovine fibropapillomatous wart tissue sample. Whereas no papillomavirus DNA could be detected by restriction enzyme digestion of the original sample, multiply primed RCA enabled us to obtain a sufficient amount of papillomavirus DNA for restriction enzyme analysis, cloning, and subsequent sequencing of a novel variant of bovine papillomavirus type 1. The multiply primed RCA method allows the discovery of previously unknown papillomaviruses, and possibly also other circular DNA viruses, without a priori sequence information. PMID:15113879

  5. Computational Evolutionary Methodology for Knowledge Discovery and Forecasting in Epidemiology and Medicine

    NASA Astrophysics Data System (ADS)

    Rao, Dhananjai M.; Chernyakhovsky, Alexander; Rao, Victoria

    2008-05-01

    Humanity is facing an increasing number of highly virulent and communicable diseases such as avian influenza. Researchers believe that avian influenza has potential to evolve into one of the deadliest pandemics. Combating these diseases requires in-depth knowledge of their epidemiology. An effective methodology for discovering epidemiological knowledge is to utilize a descriptive, evolutionary, ecological model and use bio-simulations to study and analyze it. These types of bio-simulations fall under the category of computational evolutionary methods because the individual entities participating in the simulation are permitted to evolve in a natural manner by reacting to changes in the simulated ecosystem. This work describes the application of the aforementioned methodology to discover epidemiological knowledge about avian influenza using a novel eco-modeling and bio-simulation environment called SEARUMS. The mathematical principles underlying SEARUMS, its design, and the procedure for using SEARUMS are discussed. The bio-simulations and multi-faceted case studies conducted using SEARUMS elucidate its ability to pinpoint timelines, epicenters, and socio-economic impacts of avian influenza. This knowledge is invaluable for proactive deployment of countermeasures in order to minimize negative socioeconomic impacts, combat the disease, and avert a pandemic.

  6. Computational Evolutionary Methodology for Knowledge Discovery and Forecasting in Epidemiology and Medicine

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rao, Dhananjai M.; Chernyakhovsky, Alexander; Rao, Victoria

    2008-05-08

    Humanity is facing an increasing number of highly virulent and communicable diseases such as avian influenza. Researchers believe that avian influenza has potential to evolve into one of the deadliest pandemics. Combating these diseases requires in-depth knowledge of their epidemiology. An effective methodology for discovering epidemiological knowledge is to utilize a descriptive, evolutionary, ecological model and use bio-simulations to study and analyze it. These types of bio-simulations fall under the category of computational evolutionary methods because the individual entities participating in the simulation are permitted to evolve in a natural manner by reacting to changes in the simulated ecosystem. This work describes the application of the aforementioned methodology to discover epidemiological knowledge about avian influenza using a novel eco-modeling and bio-simulation environment called SEARUMS. The mathematical principles underlying SEARUMS, its design, and the procedure for using SEARUMS are discussed. The bio-simulations and multi-faceted case studies conducted using SEARUMS elucidate its ability to pinpoint timelines, epicenters, and socio-economic impacts of avian influenza. This knowledge is invaluable for proactive deployment of countermeasures in order to minimize negative socioeconomic impacts, combat the disease, and avert a pandemic.

  7. Automated Knowledge Discovery From Simulators

    NASA Technical Reports Server (NTRS)

    Burl, Michael; DeCoste, Dennis; Mazzoni, Dominic; Scharenbroich, Lucas; Enke, Brian; Merline, William

    2007-01-01

    A computational method, SimLearn, has been devised to facilitate efficient knowledge discovery from simulators. Simulators are complex computer programs used in science and engineering to model diverse phenomena such as fluid flow, gravitational interactions, coupled mechanical systems, and nuclear, chemical, and biological processes. SimLearn uses active-learning techniques to efficiently address the "landscape characterization problem." In particular, SimLearn tries to determine which regions in "input space" lead to a given output from the simulator, where "input space" refers to an abstraction of all the variables going into the simulator, e.g., initial conditions, parameters, and interaction equations. Landscape characterization can be viewed as an attempt to invert the forward mapping of the simulator and recover the inputs that produce a particular output. Given that a single simulation run can take days or weeks to complete even on a large computing cluster, SimLearn attempts to reduce costs by reducing the number of simulations needed to effect discoveries. Unlike conventional data-mining methods that are applied to static predefined datasets, SimLearn involves an iterative process in which a most informative dataset is constructed dynamically by using the simulator as an oracle. On each iteration, the algorithm models the knowledge it has gained through previous simulation trials and then chooses which simulation trials to run next. Running these trials through the simulator produces new data in the form of input-output pairs. The overall process is embodied in an algorithm that combines support vector machines (SVMs) with active learning. SVMs use learning from examples (the examples are the input-output pairs generated by running the simulator) and a principle called maximum margin to derive predictors that generalize well to new inputs. In SimLearn, the SVM plays the role of modeling the knowledge that has been gained through previous simulation trials. Active learning is used to determine which new input points would be most informative if their output were known. The selected input points are run through the simulator to generate new information that can be used to refine the SVM. The process is then repeated. SimLearn carefully balances exploration (semi-randomly searching around the input space) versus exploitation (using the current state of knowledge to conduct a tightly focused search). During each iteration, SimLearn uses not one, but an ensemble of SVMs. Each SVM in the ensemble is characterized by different hyper-parameters that control various aspects of the learned predictor - for example, whether the predictor is constrained to be very smooth (nearby points in input space lead to similar output predictions) or whether the predictor is allowed to be "bumpy." The various SVMs will have different preferences about which input points they would like to run through the simulator next. SimLearn includes a formal mechanism for balancing the ensemble SVM preferences so that a single choice can be made for the next set of trials.
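
    A minimal sketch of the query-by-committee flavour of active learning described above (not the SimLearn code): an ensemble of SVMs with different hyperparameters votes on unsimulated inputs, and the most contested inputs are the ones proposed for the next simulation trials. The toy "simulator" and all settings are hypothetical.

```python
# Query-by-committee sketch: pick the unsimulated inputs the SVM ensemble
# disagrees on most, and send those to the (expensive) simulator next.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def simulator(x):
    """Stand-in for an expensive simulator: 1 if the run 'succeeds'."""
    return int(x[0] ** 2 + x[1] ** 2 < 1.0)

# A handful of already-completed trials (two fixed points guarantee both classes).
X_run = np.vstack([rng.uniform(-2, 2, size=(28, 2)), [[0.0, 0.0], [1.5, 1.5]]])
y_run = np.array([simulator(x) for x in X_run])

# Committee of SVMs with different smoothness assumptions.
committee = [SVC(C=c, gamma=g).fit(X_run, y_run)
             for c in (0.1, 1.0, 10.0) for g in (0.1, 1.0)]

# Candidate inputs not yet simulated; rank them by committee disagreement.
X_candidates = rng.uniform(-2, 2, size=(200, 2))
votes = np.stack([m.predict(X_candidates) for m in committee])
share = votes.mean(axis=0)
disagreement = share * (1 - share)                    # largest at a 50/50 split
next_trials = X_candidates[np.argsort(disagreement)[-5:]]
print("inputs to simulate next:\n", next_trials)
```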

  8. The Proximal Lilly Collection: Mapping, Exploring and Exploiting Feasible Chemical Space.

    PubMed

    Nicolaou, Christos A; Watson, Ian A; Hu, Hong; Wang, Jibo

    2016-07-25

    Venturing into the immensity of the small molecule universe to identify novel chemical structure is a much discussed objective of many methods proposed by the chemoinformatics community. To this end, numerous approaches using techniques from the fields of computational de novo design, virtual screening and reaction informatics, among others, have been proposed. Although in principle this objective is commendable, in practice there are several obstacles to useful exploitation of the chemical space. Prime among them are the sheer number of theoretically feasible compounds and the practical concern regarding the synthesizability of the chemical structures conceived using in silico methods. We present the Proximal Lilly Collection initiative implemented at Eli Lilly and Co. with the aims to (i) define the chemical space of small, drug-like compounds that could be synthesized using in-house resources and (ii) facilitate access to compounds in this large space for the purposes of ongoing drug discovery efforts. The implementation of PLC relies on coupling access to available synthetic knowledge and resources with chemo/reaction informatics techniques and tools developed for this purpose. We describe in detail the computational framework supporting this initiative and elaborate on the characteristics of the PLC virtual collection of compounds. As an example of the opportunities provided to drug discovery researchers by easy access to a large, realistically feasible virtual collection such as the PLC, we describe a recent application of the technology that led to the discovery of selective kinase inhibitors.

  9. Ensuring the Quality of Outreach: The Critical Role of Evaluating Individual and Collective Initiatives and Performance

    ERIC Educational Resources Information Center

    Lynton, Ernest A.

    2016-01-01

    New knowledge is created in the course of the application of outreach. Each complex problem in the real world is likely to have unique aspects and thus it requires some modification of standard approaches. Hence, each engagement in outreach is likely to have an element of inquiry and discovery, leading to new knowledge. The flow of knowledge is in…

  10. Concepts of formal concept analysis

    NASA Astrophysics Data System (ADS)

    Žáček, Martin; Homola, Dan; Miarka, Rostislav

    2017-07-01

    The aim of this article is to apply Formal Concept Analysis to the concept of the world. Formal concept analysis (FCA), as a methodology of data analysis, information management and knowledge representation, has the potential to be applied to a variety of linguistic problems. FCA is a mathematical theory of concepts and concept hierarchies that reflects an understanding of what a concept is. Formal concept analysis explicitly formalizes the extension and intension of a concept and their mutual relationships. A distinguishing feature of FCA is its inherent integration of three components of the conceptual processing of data and knowledge, namely discovery and reasoning with concepts in data, discovery and reasoning with dependencies in data, and visualization of data, concepts, and dependencies with folding/unfolding capabilities.
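
    The derivation operators at the core of FCA can be written down directly; in the sketch below, a formal concept of a small hypothetical context is obtained as a fixed point of the two prime operators.

```python
# Formal context (objects x attributes) and the two derivation ("prime") operators.
context = {
    "duck":    {"swims", "flies", "lays eggs"},
    "swan":    {"swims", "flies", "lays eggs"},
    "penguin": {"swims", "lays eggs"},
    "bat":     {"flies"},
}

def common_attributes(objects):
    """A': the attributes shared by every object in the set."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set.union(*context.values())

def matching_objects(attributes):
    """B': the objects possessing every attribute in the set."""
    return {o for o, attrs in context.items() if attributes <= attrs}

# Closing {duck, penguin} under the two operators yields a formal concept.
extent = matching_objects(common_attributes({"duck", "penguin"}))
intent = common_attributes(extent)
print("formal concept:", sorted(extent), sorted(intent))
# ['duck', 'penguin', 'swan'] ['lays eggs', 'swims']
```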

  11. Ethnobotanical approaches of traditional medicine studies in Southwest China: A literature review.

    PubMed

    Liu, Bo; Guo, Zhi-Yong; Bussmann, Rainer; Li, Fei-Fei; Li, Jian-Qin; Hong, Li-Ya; Long, Chun-Lin

    2016-06-20

    The ethnopharmacology of Southwest China is extremely interesting because of the region's high level of cultural and medicinal plant diversity. Little work has been done to document the traditional medicinal practices in this area. This review aims to provide an overview of the current knowledge of how medicinal plants in this area are utilized and conserved, in order to better understand the medicinal flora, identify research gaps, and suggest directions for further research. A literature review was conducted that included peer-reviewed journals, websites, books, theses and scientific reports from 1979 to 2014. The distribution and characteristics of medicinal plant knowledge in each province, the methods applied in research, and the fluctuations of the literature in 5-year intervals were analyzed. Research on the distribution of different plant groups, including fungi, ferns, mosses, and vascular plants, was also analyzed. A total of 436 publications from 1979 to 2014 were selected for analysis. References were classified into three stages: a discovery stage, a utilization stage and a conservation stage. Detailed results are discussed concerning the focus of the references, the methods applied, the development of and relationships among folk medicines in Southwest China, Daodi ethnomedicinal resources, pharmacological studies and toxicological studies. However, compared to the region's rich medicinal flora, complex demographics and cultural diversity, a large gap still exists in fully understanding and documenting the medicinal flora. Based on the review results, most research efforts in Southwest China focused on the first step: discovery of traditional usage, geographical distribution, and taxonomy of medicinal species. Only a small percentage of traditional uses or treatments have been tested by modern ethnobotanical approaches. Further research needs to put more emphasis on identifying adulterations, evaluating Daodi medicine, and elucidating effective compounds from traditional drugs, using molecular and phytochemical approaches. Knowledge of the ethnic and cultural aspects of medicinal plant species, needed to develop effective conservation and sustainable-use protocols, is lacking. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  12. Semi-automated knowledge discovery: identifying and profiling human trafficking

    NASA Astrophysics Data System (ADS)

    Poelmans, Jonas; Elzinga, Paul; Ignatov, Dmitry I.; Kuznetsov, Sergei O.

    2012-11-01

    We propose an iterative and human-centred knowledge discovery methodology based on formal concept analysis. The proposed approach recognizes the important role of the domain expert in mining real-world enterprise applications and makes use of specific domain knowledge, including human intelligence and domain-specific constraints. Our approach was empirically validated at the Amsterdam-Amstelland police to identify suspects and victims of human trafficking in 266,157 suspicious activity reports. Based on guidelines of the Attorney Generals of the Netherlands, we first defined multiple early warning indicators that were used to index the police reports. Using concept lattices, we revealed numerous unknown human trafficking and loverboy suspects. In-depth investigation by the police resulted in confirmation of their involvement in illegal activities, with actual arrests being made. Our human-centred approach was embedded into operational policing practice and is now successfully used on a daily basis to cope with the vastly growing amount of unstructured information.

  13. Database systems for knowledge-based discovery.

    PubMed

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information in a structured format to users ranging from the bench chemist to the biologist and from the medical practitioner to the pharmaceutical scientist. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.

  14. The role of indirect evidence and traditional ecological knowledge in the discovery and description of new ape and monkey species since 1980.

    PubMed

    Rossi, Lorenzo; Gippoliti, Spartaco; Angelici, Francesco Maria

    2018-06-04

    Although empirical data are necessary to describe new species, their discoveries can be guided by surveys of so-called circumstantial evidence (evidence that indirectly establishes the existence or nonexistence of a fact). Yet this type of evidence, generally linked to traditional ecological knowledge (TEK), is often disputed by field biologists because of its uncertain nature and, on that account, generally goes untapped by them. To verify this behavior and the utility of circumstantial evidence, we reviewed the existing literature about the species of apes and monkeys described or rediscovered since January 1, 1980 and submitted a poll to the authors. The results show that circumstantial evidence has proved to be useful in 40.5% of the examined cases and point to the possibility that its use could speed up the process at the heart of the discovery and description of new species, an essential step for conservation purposes.

  15. Dewey: How to Make It Work for You

    ERIC Educational Resources Information Center

    Panzer, Michael

    2013-01-01

    As knowledge brokers, librarians are living in interesting times for themselves and libraries. It causes them to wonder sometimes if the traditional tools like the Dewey Decimal Classification (DDC) system can cope with the onslaught of information. The categories provided do not always seem adequate for the knowledge-discovery habits of…

  16. 78 FR 29071 - Assessment of Mediation and Arbitration Procedures

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-17

    ... proceeding. Program participants in the new arbitration program will have prior knowledge of the issues to be... final rules, all parties opting into the arbitration program will have full prior knowledge that these... including discovery, the submission of evidence, and the treatment of confidential information, and the...

  17. Streamlining the Discovery, Evaluation, and Integration of Data, Models, and Decision Support Systems: a Big Picture View

    EPA Science Inventory

    21st century environmental problems are wicked and require holistic systems thinking and solutions that integrate social and economic knowledge with knowledge of the environment. Computer-based technologies are fundamental to our ability to research and understand the relevant sy...

  18. Teaching Practice: A Perspective on Inter-Text and Prior Knowledge

    ERIC Educational Resources Information Center

    Costley, Kevin C.; West, Howard G.

    2012-01-01

    The use of teaching practices that involve intertextual relationship discovery in today's elementary classrooms is increasingly essential to the success of young learners of reading. Teachers must constantly strive to expand their perspective of how to incorporate the dialogue included in prior knowledge assessment. Teachers must also consider how…

  19. Globalization of Knowledge Discovery and Information Retrieval in Teaching and Learning

    ERIC Educational Resources Information Center

    Zaidel, Mark; Guerrero, Osiris

    2008-01-01

    Developments in communication and information technologies in the last decade have had a significant impact on instructional and learning activities. For many students and educators, the Internet became the significant medium for sharing instruction, learning and communication. Access to knowledge beyond boundaries and cultures has an impact on…

  20. An Evaluation of Text Mining Tools as Applied to Selected Scientific and Engineering Literature.

    ERIC Educational Resources Information Center

    Trybula, Walter J.; Wyllys, Ronald E.

    2000-01-01

    Addresses an approach to the discovery of scientific knowledge through an examination of data mining and text mining techniques. Presents the results of experiments that investigated knowledge acquisition from a selected set of technical documents by domain experts. (Contains 15 references.) (Author/LRW)

  1. Vocational Education Institutions' Role in National Innovation

    ERIC Educational Resources Information Center

    Moodie, Gavin

    2006-01-01

    This article distinguishes research--the discovery of new knowledge--from innovation, which is understood to be the transformation of practice in a community or the incorporation of existing knowledge into economic activity. From a survey of roles served by vocational education institutions in a number of OECD countries the paper argues that…

  2. Exploring relation types for literature-based discovery.

    PubMed

    Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert

    2015-09-01

    Literature-based discovery (LBD) aims to identify "hidden knowledge" in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD. A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the "time slicing" approach (1). Results: Previous LBD discoveries could be replicated using relations based either on document co-occurrence or linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge. The use of linguistic analysis-based relations improves accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to manually review. Improving their accuracy has the potential to make these systems significantly more usable. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
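
    The open-discovery pattern described above (explicit A-B and B-C relations suggesting an implicit A-C link) can be sketched with plain document co-occurrence as the relation type. The toy documents and concept names below are illustrative assumptions (the fish-oil/Raynaud's pairing merely echoes Swanson's classic result); this is not the authors' LBD system.

```python
# Open-discovery (A-B-C) sketch using document co-occurrence as the relation.
# Documents and concepts are toy illustrations of the pattern described above.
from collections import defaultdict
from itertools import combinations

documents = [
    {"fish_oil", "blood_viscosity"},
    {"blood_viscosity", "raynauds_disease"},
    {"fish_oil", "platelet_aggregation"},
    {"platelet_aggregation", "raynauds_disease"},
]

# Explicit relations: concept pairs that co-occur in at least one document.
related = defaultdict(set)
for doc in documents:
    for a, b in combinations(sorted(doc), 2):
        related[a].add(b)
        related[b].add(a)

def open_discovery(a_term):
    """Hypothesise C concepts linked to a_term only indirectly via a shared B."""
    hypotheses = set()
    for b in related[a_term]:
        for c in related[b]:
            if c != a_term and c not in related[a_term]:
                hypotheses.add((a_term, b, c))
    return hypotheses

for a, b, c in sorted(open_discovery("fish_oil")):
    print(f"hypothesis: {a} -- via {b} --> {c}")
```

    Swapping the co-occurrence step for relations extracted by linguistic analysis changes only how the `related` map is built; the hypothesis-generation loop stays the same, which is the trade-off the study examines.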

  3. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    DOE PAGES

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; ...

    2013-01-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification.

  4. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    PubMed Central

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Varnum, Susan M.; Brown, Joseph N.; Riensche, Roderick M.; Adkins, Joshua N.; Jacobs, Jon M.; Hoidal, John R.; Scholand, Mary Beth; Pounds, Joel G.; Blackburn, Michael R.; Rodland, Karin D.; McDermott, Jason E.

    2013-01-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification. PMID:24223463

  5. A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.

    2013-10-01

    Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using the combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification.
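
    A hedged sketch of how the two approaches named in the abstract (distance-based clustering and expert knowledge-driven selection) might be combined is shown below. It is a generic illustration under assumed data and an assumed expert list, not the actual five-component ISIC pipeline.

```python
# Generic sketch: cluster candidate markers by their abundance profiles, then
# use an expert-supplied relevance list to pick representatives per cluster.
# Data and the expert list are hypothetical; this is not the ISIC pipeline.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
proteins = [f"prot_{i}" for i in range(12)]
abundance = rng.normal(size=(12, 20))             # proteins x samples (hypothetical)
expert_relevant = {"prot_1", "prot_4", "prot_7"}  # expert knowledge-driven selection

# Distance-based clustering of proteins on their abundance profiles.
clusters = fcluster(linkage(pdist(abundance, metric="correlation"), method="average"),
                    t=4, criterion="maxclust")

# Within each cluster, prefer a protein backed by expert knowledge; otherwise
# fall back to the cluster member with the highest variance.
candidates = []
for c in np.unique(clusters):
    members = [p for p, lab in zip(proteins, clusters) if lab == c]
    expert_hits = [p for p in members if p in expert_relevant]
    if expert_hits:
        candidates.extend(expert_hits)
    else:
        variances = {p: abundance[proteins.index(p)].var() for p in members}
        candidates.append(max(variances, key=variances.get))

print("candidate markers:", candidates)
```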

  6. An Analysis Pipeline with Statistical and Visualization-Guided Knowledge Discovery for Michigan-Style Learning Classifier Systems

    PubMed Central

    Urbanowicz, Ryan J.; Granizo-Mackenzie, Ambrose; Moore, Jason H.

    2014-01-01

    Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms which distribute the learned solution over a sizable population of rules. However their application to complex real world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes calls for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes, and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data. PMID:25431544

  7. DataHub: Knowledge-based data management for data discovery

    NASA Astrophysics Data System (ADS)

    Handley, Thomas H.; Li, Y. Philip

    1993-08-01

    Currently available database technology is largely designed for business data-processing applications and seems inadequate for scientific applications. The research described in this paper, the DataHub, will address the issues associated with this shortfall in technology utilization and development. The DataHub development addresses the key scientific data management issues of scientific database models and resource sharing in a geographically distributed, multi-disciplinary science research environment. Thus, the DataHub will be a server between the data suppliers and data consumers to facilitate data exchanges, to assist science data analysis, and to provide a systematic approach for science data management. More specifically, the DataHub's objectives are to provide support for (1) exploratory data analysis (i.e., data driven analysis); (2) data transformations; (3) data semantics capture and usage; (4) analysis-related knowledge capture and usage; and (5) data discovery, ingestion, and extraction. Applying technologies ranging from deductive databases, semantic data models, data discovery, knowledge representation and inferencing, and exploratory data analysis techniques to modern man-machine interfaces, DataHub will provide a prototype, integrated environment to support research scientists' needs in multiple disciplines (i.e., oceanography, geology, and atmospheric science) while addressing the more general science data management issues. Additionally, the DataHub will provide data management services to exploratory data analysis applications such as LinkWinds and NCSA's XIMAGE.

  8. A Fast Projection-Based Algorithm for Clustering Big Data.

    PubMed

    Wu, Yun; He, Zhiquan; Lin, Hao; Zheng, Yufei; Zhang, Jingfen; Xu, Dong

    2018-06-07

    With the fast development of various techniques, more and more data have been accumulated with the unique properties of large size (tall) and high dimension (wide). The era of big data is coming. How to understand and discover new knowledge from these data has attracted more and more scholars' attention and has become the most important task in data mining. As one of the most important techniques in data mining, clustering analysis, a kind of unsupervised learning, can group a set of data objects into clusters that are meaningful, useful, or both. Thus, the technique has played a very important role in knowledge discovery in big data. However, when facing large-sized and high-dimensional data, most current clustering methods exhibit poor computational efficiency and high demands on computational resources, which prevents us from clarifying the intrinsic properties of, and discovering the new knowledge behind, the data. Based on this consideration, we developed a powerful clustering method, called MUFOLD-CL. The principle of the method is to project the data points onto the centroid, and then to measure the similarity between any two points by comparing their projections on the centroid. The proposed method achieves linear time complexity with respect to the sample size. Comparison with the K-Means method on very large data showed that our method produces better accuracy and requires less computational time, demonstrating that MUFOLD-CL can serve as a valuable tool, or at least play a complementary role to other existing methods, for big data clustering. Further comparisons with state-of-the-art clustering methods on smaller datasets showed that our method was fastest and achieved comparable accuracy. For the convenience of most scholars, a free software package was constructed.
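
    The projection idea can be illustrated in a few lines of NumPy: score each point by its projection onto the centroid direction and group points by their one-dimensional projections in a single linear pass. The toy data, bin count, and binning rule are assumptions for illustration, not the published MUFOLD-CL implementation.

```python
# Sketch of projection-onto-centroid clustering: compare points via their
# scalar projections on the centroid direction, then bin the 1-D scores.
# The binning rule is an assumption, not the published MUFOLD-CL algorithm.
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (500, 50)),
               rng.normal(5, 1, (500, 50))])   # tall-and-wide toy data

centroid = X.mean(axis=0)
unit = centroid / np.linalg.norm(centroid)

# O(n) pass: each point is reduced to its projection on the centroid direction.
proj = X @ unit

# Group points whose projections fall in the same bin (linear time overall).
n_bins = 2
edges = np.linspace(proj.min(), proj.max(), n_bins + 1)
labels = np.digitize(proj, edges[1:-1])

for k in range(n_bins):
    print(f"cluster {k}: {np.sum(labels == k)} points")
```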

  9. Constructing a Graph Database for Semantic Literature-Based Discovery.

    PubMed

    Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C

    2015-01-01

    Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
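
    The kind of query this enables can be sketched with the official Neo4j Python driver and a Cypher pattern expressing the basic open-discovery step (A related to B, B related to C, no direct A-C relation). The node label, relationship type, property names, and connection credentials below are assumptions, not the schema the authors derived from SemMedDB.

```python
# Sketch of an open-discovery style query over a SemMedDB-like graph in Neo4j.
# Label, relationship type, properties and credentials are hypothetical.
from neo4j import GraphDatabase  # pip install neo4j

# A-B-C pattern: start concept related to B, B related to C, no direct A-C edge.
CYPHER = """
MATCH (a:Concept {name: $start})-[:RELATED_TO]->(b:Concept)-[:RELATED_TO]->(c:Concept)
WHERE NOT (a)-[:RELATED_TO]->(c) AND a <> c
RETURN DISTINCT c.name AS hypothesis, b.name AS via
LIMIT 25
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    for record in session.run(CYPHER, start="Fish Oils"):
        print(record["hypothesis"], "via", record["via"])
driver.close()
```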

  10. Computational tools for comparative phenomics; the role and promise of ontologies

    PubMed Central

    Gkoutos, Georgios V.; Schofield, Paul N.; Hoehndorf, Robert

    2012-01-01

    A major aim of the biological sciences is to gain an understanding of human physiology and disease. One important step towards such a goal is the discovery of the function of genes that will lead to better understanding of the physiology and pathophysiology of organisms ultimately providing better understanding, diagnosis, and therapy. Our increasing ability to phenotypically characterise genetic variants of model organisms coupled with systematic and hypothesis-driven mutagenesis is resulting in a wealth of information that could potentially provide insight to the functions of all genes in an organism. The challenge we are now facing is to develop computational methods that can integrate and analyse such data. The introduction of formal ontologies that make their semantics explicit and accessible to automated reasoning promises the tantalizing possibility of standardizing biomedical knowledge allowing for novel, powerful queries that bridge multiple domains, disciplines, species and levels of granularity. We review recent computational approaches that facilitate the integration of experimental data from model organisms with clinical observations in humans. These methods foster novel cross species analysis approaches, thereby enabling comparative phenomics and leading to the potential of translating basic discoveries from the model systems into diagnostic and therapeutic advances at the clinical level. PMID:22814867

  11. Faults Discovery By Using Mined Data

    NASA Technical Reports Server (NTRS)

    Lee, Charles

    2005-01-01

    Fault discovery in complex systems consists of model-based reasoning, fault tree analysis, rule-based inference methods, and other approaches. Model-based reasoning builds models for the systems either from mathematical formulations or from experimental models. Fault tree analysis shows the possible causes of a system malfunction by enumerating the suspect components and their respective failure modes that may have induced the problem. Rule-based inference builds the model from expert knowledge. These models and methods have one thing in common: they presume some prior conditions. Complex systems often use fault trees to analyze faults. Fault diagnosis, when an error occurs, is performed by engineers and analysts through extensive examination of all data gathered during the mission. The International Space Station (ISS) control center operates on the data fed back from the system, and decisions are made based on threshold values by using fault trees. Since those decision-making tasks are safety critical and must be done promptly, the engineers who manually analyze the data face a time challenge. To automate this process, this paper presents an approach that uses decision trees to discover faults from data in real time and captures the contents of fault trees as the initial state of the trees.
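
    A minimal sketch of the decision-tree idea follows, assuming scikit-learn and synthetic telemetry in place of real ISS data; the feature names and labelling rule are hypothetical.

```python
# Generic sketch: learn a decision tree from labelled telemetry so that fault
# conditions can be flagged from incoming data. Features, thresholds and data
# are hypothetical stand-ins, not ISS telemetry.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(42)
n = 1000
pressure = rng.normal(100, 5, n)
temperature = rng.normal(20, 2, n)
# Hypothetical labelling rule standing in for historical fault annotations.
fault = ((pressure > 108) | (temperature > 24)).astype(int)

X = np.column_stack([pressure, temperature])
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, fault)

# The learned tree plays the role of an automatically derived fault tree.
print(export_text(tree, feature_names=["pressure", "temperature"]))
print("flag new sample:", tree.predict([[110.0, 21.0]]))
```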

  12. Arthropods as a source of new RNA viruses.

    PubMed

    Bichaud, L; de Lamballerie, X; Alkan, C; Izri, A; Gould, E A; Charrel, R N

    2014-12-01

    The discovery and development of methods for the isolation, characterisation and taxonomy of viruses represents an important milestone in the study, treatment and control of virus diseases during the 20th century. Indeed, by the late 1950s it was becoming common belief that most human and veterinary pathogenic viruses had been discovered. However, at that time, knowledge of the impact of improved commercial transportation, urbanisation and deforestation on disease emergence was in its infancy. From the late 1960s onwards, viruses such as the hepatitis viruses (A, B and C), hantavirus, HIV, Marburg virus, Ebola virus and many others began to emerge, and it became apparent that the world was changing, at least in terms of virus epidemiology, largely due to the influence of anthropological activities. Subsequently, with the improvement of molecular biotechnologies for amplification of viral RNA, genome sequencing and proteomic analysis, the arsenal of available tools for virus discovery and genetic characterization opened up new and exciting possibilities for virological discovery. Many recently identified but "unclassified" viruses are now being allocated to existing genera or families based on whole genome sequencing, bioinformatic and phylogenetic analysis. New species, genera and families are also being created following the guidelines of the International Committee on Taxonomy of Viruses. Many of these newly discovered viruses are vectored by arthropods (arboviruses) and possess an RNA genome. This brief review will focus largely on the discovery of new arthropod-borne viruses. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Aflatoxin control--how a regulatory agency managed risk from an unavoidable natural toxicant in food and feed.

    PubMed

    Park, D L; Stoloff, L

    1989-04-01

    The control by the Food and Drug Administration (FDA) of aflatoxin, a relatively recently discovered, unavoidable natural contaminant produced by specific molds that invade a number of basic food and feedstuffs, provides an example of the varying forces that affect risk assessment and management by a regulatory Agency. This is the story of how the FDA responded to the initial discovery of a potential carcinogenic hazard to humans in a domestic commodity, to the developing information concerning the nature of the hazard, to the economic and political pressures that are created by the impact of natural forces on regulatory controls, and to the restraints of laws within which the Agency must work. This story covers four periods: the years of discovery and action decisions on the basis of meager knowledge and the fear of cancer; the years of tinkering on paper with the regulatory process; the years of digestion of the accumulating knowledge and the application of that knowledge to actions forced by natural events; and an audit of the current status of knowledge about the hazard from aflatoxin, with proposals for regulatory control based on that knowledge.

  14. Postgenomic strategies in antibacterial drug discovery.

    PubMed

    Brötz-Oesterhelt, Heike; Sass, Peter

    2010-10-01

    During the last decade the field of antibacterial drug discovery has changed in many aspects including bacterial organisms of primary interest, discovery strategies applied and pharmaceutical companies involved. Target-based high-throughput screening had been disappointingly unsuccessful for antibiotic research. Understanding of this lack of success has increased substantially and the lessons learned refer to characteristics of targets, screening libraries and screening strategies. The 'genomics' approach was replaced by a diverse array of discovery strategies, for example, searching for new natural product leads among previously abandoned compounds or new microbial sources, screening for synthetic inhibitors by targeted approaches including structure-based design and analyses of focused libraries and designing resistance-breaking properties into antibiotics of established classes. Furthermore, alternative treatment options are being pursued including anti-virulence strategies and immunotherapeutic approaches. This article summarizes the lessons learned from the genomics era and describes discovery strategies resulting from that knowledge.

  15. Priority of discovery in the life sciences

    PubMed Central

    Vale, Ronald D; Hyman, Anthony A

    2016-01-01

    The job of a scientist is to make a discovery and then communicate this new knowledge to others. For a scientist to be successful, he or she needs to be able to claim credit or priority for discoveries throughout their career. However, despite being fundamental to the reward system of science, the principles for establishing the "priority of discovery" are rarely discussed. Here we break down priority into two steps: disclosure, in which the discovery is released to the world-wide community; and validation, in which other scientists assess the accuracy, quality and importance of the work. Currently, in biology, disclosure and an initial validation are combined in a journal publication. Here, we discuss the advantages of separating these steps into disclosure via a preprint, and validation via a combination of peer review at a journal and additional evaluation by the wider scientific community. PMID:27310529

  16. The Application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to Support Drug Discovery Research

    PubMed Central

    Ratnam, Joseline; Zdrazil, Barbara; Digles, Daniela; Cuadrado-Rodriguez, Emiliano; Neefs, Jean-Marc; Tipney, Hannah; Siebes, Ronald; Waagmeester, Andra; Bradley, Glyn; Chau, Chau Han; Richter, Lars; Brea, Jose; Evelo, Chris T.; Jacoby, Edgar; Senger, Stefan; Loza, Maria Isabel; Ecker, Gerhard F.; Chichester, Christine

    2014-01-01

    Integration of open access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly-available pharmacological, physicochemical and biomolecular data, represents it in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover 1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program 2) the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease and 3) the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery. PMID:25522365

  17. Use of Computational Functional Genomics in Drug Discovery and Repurposing for Analgesic Indications.

    PubMed

    Lötsch, Jörn; Kringel, Dario

    2018-06-01

    The novel research area of functional genomics investigates biochemical, cellular, or physiological properties of gene products with the goal of understanding the relationship between the genome and the phenotype. These developments have made analgesic drug research a data-rich discipline that can be mastered only by making use of parallel developments in computer science, including the establishment of knowledge bases, mining methods for big data, machine learning, and artificial intelligence (Table), which are introduced by way of example in the following. © 2018 The Authors Clinical Pharmacology & Therapeutics published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.

  18. Joint principal trend analysis for longitudinal high-dimensional data.

    PubMed

    Zhang, Yuping; Ouyang, Zhengqing

    2018-06-01

    We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.

  19. Worms--a "license to kill".

    PubMed

    Kaminsky, Ronald; Rufener, Lucien; Bouvier, Jacques; Lizundia, Regina; Schorderet Weber, Sandra; Sager, Heinz

    2013-08-01

    Worm infections can cause severe harm and death to both humans and numerous domestic and wild animals. Despite the fact that there are many beneficial worm species, veterinarians, physicians and parasitologists have multiple reasons to combat parasitic worms. The pros and cons of various approaches for the discovery of new control methods are discussed, including novel anthelmintics, vaccines and genetic approaches to identify novel drug and vaccine targets. Currently, the mainstay of worm control remains chemotherapy and prophylaxis. The importance of knowledgeable and wise use of the available anthelmintics is highlighted. Copyright © 2013 Elsevier B.V. All rights reserved.

  20. Privacy Preserving Technique for Euclidean Distance Based Mining Algorithms Using a Wavelet Related Transform

    NASA Astrophysics Data System (ADS)

    Kadampur, Mohammad Ali; D. v. L. N., Somayajulu

    Privacy preserving data mining is the art of knowledge discovery without revealing the sensitive data of the data set. In this paper, a data transformation technique using wavelets is presented for privacy preserving data mining. Wavelets use a well-known energy compaction approach during data transformation, and only the high-energy coefficients are published to the public domain instead of the actual data. It is found that the transformed data preserve Euclidean distances and that the method can be used in privacy preserving clustering. Wavelets also offer inherently improved time complexity.
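
    A minimal sketch of the energy-compaction idea follows, assuming a one-level orthonormal Haar transform and a simple keep-the-top-k rule. Because the transform is orthonormal, Euclidean distances are preserved exactly before truncation and approximately after it; the data and retention rule are illustrative assumptions, not the paper's exact transform.

```python
# Sketch: apply an orthonormal Haar transform, keep only high-energy
# coefficients, and check that Euclidean distances are roughly preserved.
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar transform (length must be even)."""
    x = x.reshape(-1, 2)
    approx = (x[:, 0] + x[:, 1]) / np.sqrt(2)
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2)
    return np.concatenate([approx, detail])

rng = np.random.default_rng(0)
data = rng.normal(size=(5, 8))                    # 5 records, 8 attributes
coeffs = np.array([haar_step(row) for row in data])

# Publish only the k highest-energy coefficient positions (by total energy).
k = 4
keep = np.argsort((coeffs ** 2).sum(axis=0))[-k:]
published = coeffs[:, keep]

orig_d = np.linalg.norm(data[0] - data[1])
pub_d = np.linalg.norm(published[0] - published[1])
print(f"original distance {orig_d:.3f}  vs  published distance {pub_d:.3f}")
```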

  1. Exploiting Recurring Structure in a Semantic Network

    NASA Technical Reports Server (NTRS)

    Wolfe, Shawn R.; Keller, Richard M.

    2004-01-01

    With the growing popularity of the Semantic Web, an increasing amount of information is becoming available in machine interpretable, semantically structured networks. Within these semantic networks are recurring structures that could be mined by existing or novel knowledge discovery methods. The mining of these semantic structures represents an interesting area that focuses on mining both for and from the Semantic Web, with surprising applicability to problems confronting the developers of Semantic Web applications. In this paper, we present representative examples of recurring structures and show how these structures could be used to increase the utility of a semantic repository deployed at NASA.

  2. [The methods of Western medicine in On Ancient Medicine].

    PubMed

    Ban, Deokjin

    2010-06-30

    The treatise On Ancient Medicine attests that questions of method were being debated both in medicine and in philosophy, and it is important evidence of cross-discipline methodological controversy. The treatise is the first attempt in the history of Greek thought to provide a detailed account of the development of a science from a starting point in observation and experience. Its author criticizes philosophical physicians who attempt to systematize medicine by reducing it to the interaction of one or more of the opposites: hot, cold, wet, and dry. He regards the theory of his opponents as a hypothesis (hypothesis). Medicine has long been in possession of both an archē and a hodos, a principle and a method, which have enabled it to make discoveries over a long period of time. As far as method is concerned, the traditional science of medicine attained knowledge of the visible by starting from observation and experience, but it recommended the use of reasoning and analogies with familiar objects as a means of learning about the invisible. It also utilized inference from the visible to the visible (epilogismos) and inference from the visible to the invisible (analogismos). The use of analogy as a means of learning about the obscure was also part of the common heritage of early philosophy and medicine. But the author's use of the analogical method distinguishes it from Empedocles' well-known comparisons of the eye to a lantern and of the process of respiration to the operations of a clepsydra. According to the author, the traditional science of medicine used functional analogies, such as the wine and cheese examples, to know the function of humors within the body, and it used structural analogies, such as the tube and cupping-instrument examples, to understand an organ or structure within the body. But the author did not distinguish between the claim that medicine has a systematic method of making discoveries and the very different claim that it has a systematic method of treatment. The reason for this is that he thought that discoveries are the end point of the method of investigation and the starting point of the procedures used in treatment.

  3. Forecasting petroleum discoveries in sparsely drilled areas: Nigeria and the North Sea

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Attanasi, E.D.; Root, D.H.

    1988-10-01

    Decline function methods for projecting future discoveries generally capture the crowding effects of wildcat wells on the discovery rate. However, these methods do not accommodate easily situations where exploration areas and horizons are expanding. In this paper, a method is presented that uses a mapping algorithm for separating these often countervailing influences. The method is applied to Nigeria and the North Sea. For an amount of future drilling equivalent to past drilling (825 wildcat wells), future discoveries (in resources found) for Nigeria are expected to decline by 68% per well but still amount to 8.5 billion barrels of oil equivalent (BOE). Similarly, for the total North Sea for an equivalent amount and mix among areas of past drilling (1322 wildcat wells), future discoveries are expected to amount to 17.9 billion BOE, whereas the average discovery rate per well is expected to decline by 71%.

  4. Forecasting petroleum discoveries in sparsely drilled areas: Nigeria and the North Sea

    USGS Publications Warehouse

    Attanasi, E.D.; Root, D.H.

    1988-01-01

    Decline function methods for projecting future discoveries generally capture the crowding effects of wildcat wells on the discovery rate. However, these methods do not accommodate easily situations where exploration areas and horizons are expanding. In this paper, a method is presented that uses a mapping algorithm for separating these often countervailing influences. The method is applied to Nigeria and the North Sea. For an amount of future drilling equivalent to past drilling (825 wildcat wells), future discoveries (in resources found) for Nigeria are expected to decline by 68% per well but still amount to 8.5 billion barrels of oil equivalent (BOE). Similarly, for the total North Sea for an equivalent amount and mix among areas of past drilling (1322 wildcat wells), future discoveries are expected to amount to 17.9 billion BOE, whereas the average discovery rate per well is expected to decline by 71%. © 1988 International Association for Mathematical Geology.
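
    As a generic illustration of what a decline-function projection does (not the mapping-algorithm method of this paper), the snippet below sums a geometrically declining per-well discovery rate into a cumulative future-discovery estimate; all numbers are hypothetical.

```python
# Generic decline-function illustration (hypothetical numbers): cumulative
# future discoveries when each successive wildcat well finds a fixed fraction
# less than the previous one. Not the paper's mapping-algorithm method.
first_find = 0.03          # billion BOE found by the next wildcat well (assumed)
per_well_decline = 0.002   # fractional decline in the discovery rate per well (assumed)
wells = 500                # additional wildcat wells to be drilled (assumed)

ratio = 1.0 - per_well_decline
cumulative = first_find * (1.0 - ratio ** wells) / (1.0 - ratio)
rate_drop = 1.0 - ratio ** wells

print(f"cumulative future discoveries: {cumulative:.1f} billion BOE")
print(f"discovery-rate decline over {wells} wells: {rate_drop:.0%}")
```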

  5. Repositioning the substrate activity screening (SAS) approach as a fragment-based method for identification of weak binders.

    PubMed

    Gladysz, Rafaela; Cleenewerck, Matthias; Joossens, Jurgen; Lambeir, Anne-Marie; Augustyns, Koen; Van der Veken, Pieter

    2014-10-13

    Fragment-based drug discovery (FBDD) has evolved into an established approach for "hit" identification. Typically, most applications of FBDD depend on specialised cost- and time-intensive biophysical techniques. The substrate activity screening (SAS) approach has been proposed as a relatively cheap and straightforward alternative for identification of fragments for enzyme inhibitors. We have investigated SAS for the discovery of inhibitors of oncology target urokinase (uPA). Although our results support the key hypotheses of SAS, we also encountered a number of unreported limitations. In response, we propose an efficient modified methodology: "MSAS" (modified substrate activity screening). MSAS circumvents the limitations of SAS and broadens its scope by providing additional fragments and more coherent SAR data. As well as presenting and validating MSAS, this study expands existing SAR knowledge for the S1 pocket of uPA and reports new reversible and irreversible uPA inhibitor scaffolds. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  6. Materials Informatics: The Materials ``Gene'' and Big Data

    NASA Astrophysics Data System (ADS)

    Rajan, Krishna

    2015-07-01

    Materials informatics provides the foundations for a new paradigm of materials discovery. It shifts our emphasis from one of solely searching among large volumes of data that may be generated by experiment or computation to one of targeted materials discovery via high-throughput identification of the key factors (i.e., “genes”) and via showing how these factors can be quantitatively integrated by statistical learning methods into design rules (i.e., “gene sequencing”) governing targeted materials functionality. However, a critical challenge in discovering these materials genes is the difficulty in unraveling the complexity of the data associated with numerous factors including noise, uncertainty, and the complex diversity of data that one needs to consider (i.e., Big Data). In this article, we explore one aspect of materials informatics, namely how one can efficiently explore for new knowledge in regimes of structure-property space, especially when no reasonable selection pathways based on theory or clear trends in observations exist among an almost infinite set of possibilities.

  7. Was Muller's 1946 Nobel Prize research for radiation-induced gene mutations peer-reviewed?

    PubMed

    Calabrese, Edward J

    2018-06-06

    This historical analysis indicates that it is highly unlikely that the Nobel Prize-winning research of Hermann J. Muller was peer-reviewed. The published paper of Muller lacked a research methods section, cited no references, and failed to acknowledge and discuss the work of Gager and Blakeslee (PNAS 13:75-79, 1927), who claimed to have induced gene mutation via ionizing radiation six months prior to Muller's non-data Science paper (Muller, Science 66(1699):84-87, 1927a). Despite being well acclimated to the scientific world of peer review, Muller chose to avoid the peer-review process for his most significant publication. It appears that Muller's actions were strongly influenced by his desire to claim primacy for the discovery of gene mutation. The actions of Muller hold important ethical lessons and implications today, when self-interest trumps one's obligations to society and the scientific culture that supports the quest for new knowledge and discovery.

  8. Autonomy enables new science missions

    NASA Astrophysics Data System (ADS)

    Doyle, Richard J.; Gor, Victoria; Man, Guy K.; Stolorz, Paul E.; Chapman, Clark; Merline, William J.; Stern, Alan

    1997-01-01

    The challenge of space flight in NASA's future is to enable smaller, more frequent and intensive space exploration at much lower total cost without substantially decreasing mission reliability, capability, or the scientific return on investment. The most effective way to achieve this goal is to build intelligent capabilities into the spacecraft themselves. Our technological vision for meeting the challenge of returning quality science through limited communication bandwidth will actually put scientists in a more direct link with the spacecraft than they have enjoyed to date. Technologies such as pattern recognition and machine learning can place a part of the scientist's awareness onboard the spacecraft to prioritize downlink or to autonomously trigger time-critical follow-up observations-particularly important in flyby missions-without ground interaction. Onboard knowledge discovery methods can be used to include candidate discoveries in each downlink for scientists' scrutiny. Such capabilities will allow scientists to quickly reprioritize missions in a much more intimate and efficient manner than is possible today. Ultimately, new classes of exploration missions will be enabled.

  9. Oomycete Interactions with Plants: Infection Strategies and Resistance Principles

    PubMed Central

    Doumane, Mehdi

    2015-01-01

    SUMMARY The Oomycota include many economically significant microbial pathogens of crop species. Understanding the mechanisms by which oomycetes infect plants and identifying methods to provide durable resistance are major research goals. Over the last few years, many elicitors that trigger plant immunity have been identified, as well as host genes that mediate susceptibility to oomycete pathogens. The mechanisms behind these processes have subsequently been investigated and many new discoveries made, marking a period of exciting research in the oomycete pathology field. This review provides an introduction to our current knowledge of the pathogenic mechanisms used by oomycetes, including elicitors and effectors, plus an overview of the major principles of host resistance: the established R gene hypothesis and the more recently defined susceptibility (S) gene model. Future directions for development of oomycete-resistant plants are discussed, along with ways that recent discoveries in the field of oomycete-plant interactions are generating novel means of studying how pathogen and symbiont colonizations overlap. PMID:26041933

  10. Generation of a novel next-generation sequencing-based method for the isolation of new human papillomavirus types.

    PubMed

    Brancaccio, Rosario N; Robitaille, Alexis; Dutta, Sankhadeep; Cuenin, Cyrille; Santare, Daiga; Skenders, Girts; Leja, Marcis; Fischer, Nicole; Giuliano, Anna R; Rollison, Dana E; Grundhoff, Adam; Tommasino, Massimo; Gheit, Tarik

    2018-05-07

    With the advent of new molecular tools, the discovery of new papillomaviruses (PVs) has accelerated during the past decade, enabling the expansion of knowledge about the viral populations that inhabit the human body. Human PVs (HPVs) are etiologically linked to benign or malignant lesions of the skin and mucosa. The detection of HPV types can vary widely, depending mainly on the methodology and the quality of the biological sample. Next-generation sequencing is one of the most powerful tools, enabling the discovery of novel viruses in a wide range of biological material. Here, we report a novel protocol for the detection of known and unknown HPV types in human skin and oral gargle samples using improved PCR protocols combined with next-generation sequencing. We identified 105 putative new PV types in addition to 296 known types, thus providing important information about the viral distribution in the oral cavity and skin. Copyright © 2018. Published by Elsevier Inc.

  11. Structure-based discovery of fiber-binding compounds that reduce the cytotoxicity of amyloid beta

    DOE PAGES

    Jiang, Lin; Liu, Cong; Leibly, David; ...

    2013-07-16

    Amyloid protein aggregates are associated with dozens of devastating diseases including Alzheimer’s, Parkinson’s, ALS, and type 2 diabetes. While structure-based discovery of compounds has been effective in combating numerous infectious and metabolic diseases, ignorance of amyloid structure has hindered similar approaches to amyloid disease. Here we show that knowledge of the atomic structure of one of the adhesive, steric-zipper segments of the amyloid-beta (Aβ) protein of Alzheimer’s disease, when coupled with computational methods, identifies eight diverse but mainly flat compounds and three compound derivatives that reduce Aβ cytotoxicity against mammalian cells by up to 90%. Although these compounds bind to Aβ fibers, they do not reduce fiber formation of Aβ. Structure-activity relationship studies of the fiber-binding compounds and their derivatives suggest that compound binding increases fiber stability and decreases fiber toxicity, perhaps by shifting the equilibrium of Aβ from oligomers to fibers.

  12. The top quark (20 years after the discovery)

    DOE PAGES

    Boos, Eduard; Brandt, Oleg; Denisov, Dmitri; ...

    2015-09-10

    On the twentieth anniversary of the observation of the top quark, we trace our understanding of this heaviest of all known particles from the prediction of its existence, through the searches and discovery, to the current knowledge of its production mechanisms and properties. We also discuss the central role of the top quark in the Standard Model and the windows that it opens for seeking new physics beyond the Standard Model.

  13. Facilitating knowledge discovery and visualization through mining contextual data from published studies: lessons from JournalMap

    USDA-ARS?s Scientific Manuscript database

    Valuable information on the location and context of ecological studies are locked up in publications in myriad formats that are not easily machine readable. This presents significant challenges to building geographic-based tools to search for and visualize sources of ecological knowledge. JournalMap...

  14. 77 FR 11345 - Harmonization of Compliance Obligations for Registered Investment Companies Required To Register...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-24

    ... his or her knowledge and belief, the information contained in the document is accurate and complete. The first item in the certification required by SEC Form N-CSR is: ``Based on my knowledge, this..., competitiveness and financial integrity of futures markets; (3) price discovery; (4) sound risk management...

  15. Incremental Knowledge Discovery in Social Media

    ERIC Educational Resources Information Center

    Tang, Xuning

    2013-01-01

    In light of the prosperity of online social media, Web users are shifting from data consumers to data producers. To catch the pulse of this rapidly changing world, it is critical to transform online social media data to information and to knowledge. This dissertation centers on the issue of modeling the dynamics of user communities, trending…

  16. Effects of Students' Prior Knowledge on Scientific Reasoning in Density.

    ERIC Educational Resources Information Center

    Yang, Il-Ho; Kwon, Yong-Ju; Kim, Young-Shin; Jang, Myoung-Duk; Jeong, Jin-Woo; Park, Kuk-Tae

    2002-01-01

    Investigates the effects of students' prior knowledge on the scientific reasoning processes of performing the task of controlling variables with computer simulation and identifies a number of problems that students encounter in scientific discovery. Involves (n=27) 5th grade students and (n=33) 7th grade students. Indicates that students' prior…

  17. Strain Prioritization for Natural Product Discovery by a High-Throughput Real-Time PCR Method

    PubMed Central

    2015-01-01

    Natural products offer unmatched chemical and structural diversity compared to other small-molecule libraries, but traditional natural product discovery programs are not sustainable, demanding too much time, effort, and resources. Here we report a strain prioritization method for natural product discovery. Central to the method is the application of real-time PCR, targeting genes characteristic to the biosynthetic machinery of natural products with distinct scaffolds in a high-throughput format. The practicality and effectiveness of the method were showcased by prioritizing 1911 actinomycete strains for diterpenoid discovery. A total of 488 potential diterpenoid producers were identified, among which six were confirmed as platensimycin and platencin dual producers and one as a viguiepinol and oxaloterpin producer. While the method as described is most appropriate to prioritize strains for discovering specific natural products, variations of this method should be applicable to the discovery of other classes of natural products. Applications of genome sequencing and genome mining to the high-priority strains could essentially eliminate the chance elements from traditional discovery programs and fundamentally change how natural products are discovered. PMID:25238028

  18. Putting Priors in Mixture Density Mercer Kernels

    NASA Technical Reports Server (NTRS)

    Srivastava, Ashok N.; Schumann, Johann; Fischer, Bernd

    2004-01-01

    This paper presents a new methodology for automatic knowledge-driven data mining based on the theory of Mercer Kernels, which are highly nonlinear symmetric positive definite mappings from the original image space to a very high, possibly infinite dimensional feature space. We describe a new method called Mixture Density Mercer Kernels to learn kernel functions directly from data, rather than using predefined kernels. These data-adaptive kernels can encode prior knowledge in the kernel using a Bayesian formulation, thus allowing physical information to be encoded in the model. We compare the results with existing algorithms on data from the Sloan Digital Sky Survey (SDSS). The code for these experiments has been generated with the AUTOBAYES tool, which automatically generates efficient and documented C/C++ code from abstract statistical model specifications. The core of the system is a schema library which contains templates for learning and knowledge discovery algorithms like different versions of EM, and numeric optimization methods like conjugate gradient methods. The template instantiation is supported by symbolic-algebraic computations, which allows AUTOBAYES to find closed-form solutions and, where possible, to integrate them into the code. The results show that the Mixture Density Mercer Kernel described here outperforms tree-based classification in distinguishing high-redshift galaxies from low-redshift galaxies by approximately 16% on test data, bagged trees by approximately 7%, and bagged trees built on a much larger sample of data by approximately 2%.
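
    One simple way to realize a data-adaptive Mercer kernel from a mixture density is sketched below: fit a Gaussian mixture and define the kernel as the inner product of component-responsibility vectors, which is symmetric and positive semi-definite by construction. This is an illustration of the general idea under assumed data, not the AUTOBAYES-generated implementation evaluated on SDSS.

```python
# Sketch of a data-adaptive Mercer kernel built from a mixture density:
# K(a, b) = <p(.|a), p(.|b)>, the responsibility-vector inner product.
# Data are synthetic; this is not the AUTOBAYES-generated implementation.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (100, 3)), rng.normal(2, 1, (100, 3))])
y = np.array([0] * 100 + [1] * 100)

gmm = GaussianMixture(n_components=4, random_state=0).fit(X)

def mixture_density_kernel(A, B):
    """Gram matrix of responsibility-vector inner products (PSD by construction)."""
    return gmm.predict_proba(A) @ gmm.predict_proba(B).T

clf = SVC(kernel=mixture_density_kernel).fit(X, y)
print("training accuracy:", clf.score(X, y))
```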

  19. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    PubMed Central

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  20. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    PubMed

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
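
    The combination of an SVM ranker with user relevance feedback can be sketched as follows, assuming precomputed content feature vectors for WMS layers and a simulated user; the feature vectors, labels, and feedback oracle are hypothetical, not the authors' system.

```python
# Sketch: SVM ranking of WMS layers by content features, refined with
# (simulated) user relevance feedback. All data and the feedback oracle are
# hypothetical; this is not the authors' system.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(7)
layer_features = rng.normal(size=(200, 32))      # content features per WMS layer
X_train = layer_features[:20]
y_train = np.array([0, 1] * 10)                  # initial relevance judgements
candidates = layer_features[20:]

for round_ in range(3):
    clf = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
    scores = clf.predict_proba(candidates)[:, 1]
    top = np.argsort(scores)[::-1][:5]           # layers shown to the user

    # Simulated user feedback on the shown layers (hypothetical oracle).
    feedback = (candidates[top, 0] > 0).astype(int)
    X_train = np.vstack([X_train, candidates[top]])
    y_train = np.concatenate([y_train, feedback])
    print(f"round {round_}: mean score of shown layers = {scores[top].mean():.2f}")
    candidates = np.delete(candidates, top, axis=0)
```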

  1. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 29 Labor 9 2011-07-01 2011-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  2. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 29 Labor 9 2012-07-01 2012-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  3. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 29 Labor 9 2014-07-01 2014-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  4. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 29 Labor 9 2010-07-01 2010-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  5. Feature selection for examining behavior by pathology laboratories.

    PubMed

    Hawkins, S; Williams, G; Baxter, R

    2001-08-01

    Australia has a universal health insurance scheme called Medicare, which is managed by Australia's Health Insurance Commission. Medicare payments for pathology services generate voluminous transaction data on patients, doctors and pathology laboratories. The Health Insurance Commission (HIC) currently uses predictive models to monitor compliance with regulatory requirements. The HIC commissioned a project to investigate the generation of new features from the data. Feature generation has not appeared as an important step in the knowledge discovery in databases (KDD) literature. New interesting features for use in predictive modeling are generated. These features were summarized, visualized and used as inputs for clustering and outlier detection methods. Data organization and data transformation methods are described for the efficient access and manipulation of these new features.

  6. Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field.

    PubMed

    Wójcikowski, Maciej; Zielenkiewicz, Piotr; Siedlecki, Pawel

    2015-01-01

    There has been huge progress in the open cheminformatics field in both methods and software development. Unfortunately, there has been little effort to unite those methods and software into one package. We here describe the Open Drug Discovery Toolkit (ODDT), which aims to fulfill the need for comprehensive and open source drug discovery software. The Open Drug Discovery Toolkit was developed as a free and open source tool for both computer aided drug discovery (CADD) developers and researchers. ODDT reimplements many state-of-the-art methods, such as machine learning scoring functions (RF-Score and NNScore) and wraps other external software to ease the process of developing CADD pipelines. ODDT is an out-of-the-box solution designed to be easily customizable and extensible. Therefore, users are strongly encouraged to extend it and develop new methods. We here present three use cases for ODDT in common tasks in computer-aided drug discovery. Open Drug Discovery Toolkit is released on a permissive 3-clause BSD license for both academic and industrial use. ODDT's source code, additional examples and documentation are available on GitHub (https://github.com/oddt/oddt).

  7. Data mining in pharma sector: benefits.

    PubMed

    Ranjan, Jayanthi

    2009-01-01

    The amount of data getting generated in any sector at present is enormous. The information flow in the pharma industry is huge. Pharma firms are progressing into increased technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms to discover patterns in improving the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making using pharmaceutical data. This conceptual paper is written based on secondary study, research and observations from magazines, reports and notes. The author has listed the types of patterns that can be discovered using data mining in pharma data. The paper shows how data mining is useful in the pharma industry and how its techniques can yield good results in the pharma sector. Although much work can be produced for discovering knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data based on primary research using available, well-known data mining tools. Research papers and conceptual papers related to data mining in the pharma industry are rare; this is the motivation for the paper.

  8. An approach to the analysis of SDSS spectroscopic outliers based on self-organizing maps. Designing the outlier analysis software package for the next Gaia survey

    NASA Astrophysics Data System (ADS)

    Fustes, D.; Manteiga, M.; Dafonte, C.; Arcay, B.; Ulla, A.; Smith, K.; Borrachero, R.; Sordo, R.

    2013-11-01

    Aims: A new method applied to the segmentation and further analysis of the outliers resulting from the classification of astronomical objects in large databases is discussed. The method is being used in the framework of the Gaia satellite Data Processing and Analysis Consortium (DPAC) activities to prepare automated software tools that will be used to derive basic astrophysical information that is to be included in the final Gaia archive. Methods: Our algorithm has been tested by means of simulated Gaia spectrophotometry, which is based on SDSS observations and theoretical spectral libraries covering a wide sample of astronomical objects. Self-organizing map networks are used to organize the information in clusters of objects, as homogeneously as possible according to their spectral energy distributions, and to project them onto a 2D grid where the data structure can be visualized. Results: We demonstrate the usefulness of the method by analyzing the spectra that were rejected by the SDSS spectroscopic classification pipeline and thus classified as "UNKNOWN". First, our method can help distinguish between astrophysical objects and instrumental artifacts. Additionally, the application of our algorithm to SDSS objects of unknown nature has allowed us to identify classes of objects with similar astrophysical natures. In addition, the method allows for the potential discovery of hundreds of new objects, such as white dwarfs and quasars. Therefore, the proposed method is shown to be very promising for data exploration and knowledge discovery in very large astronomical databases, such as the archive from the upcoming Gaia mission.
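
    The following sketch shows the general idea of organizing spectra on a self-organizing map, using the third-party minisom package and random vectors in place of Gaia/SDSS spectrophotometry; it illustrates the technique only and is not the DPAC software.

    ```python
    # Illustrative sketch: organize "spectra" on a self-organizing map so that
    # similar spectral energy distributions land on nearby grid cells.
    # Uses the third-party "minisom" package; random data stands in for spectra.
    import numpy as np
    from minisom import MiniSom

    rng = np.random.default_rng(0)
    spectra = rng.random((500, 40))          # 500 "spectra", 40 flux bins each

    som = MiniSom(10, 10, 40, sigma=1.0, learning_rate=0.5, random_seed=0)
    som.random_weights_init(spectra)
    som.train_random(spectra, 5000)          # unsupervised training

    # Project each spectrum onto its best-matching 2-D grid cell; cells with few,
    # poorly matched spectra are candidate outlier groups for visual inspection.
    cells = np.array([som.winner(s) for s in spectra])
    errors = np.array([np.linalg.norm(s - som.get_weights()[som.winner(s)]) for s in spectra])
    print(cells[:5], errors[:5])
    ```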

  9. Advances in the understanding and use of the genomic base of microbial secondary metabolite biosynthesis for the discovery of new natural products.

    PubMed

    McAlpine, James B

    2009-03-27

    Over the past decade major changes have occurred in the access to genome sequences that encode the enzymes responsible for the biosynthesis of secondary metabolites, knowledge of how those sequences translate into the final structure of the metabolite, and the ability to alter the sequence to obtain predicted products via both homologous and heterologous expression. Novel genera have been discovered leading to new chemotypes, but more surprisingly several instances have been uncovered where the apparently general rules of modular translation have not applied. Several new biosynthetic pathways have been unearthed, and our general knowledge grows rapidly. This review aims to highlight some of the more striking discoveries and advances of the decade.

  10. Text mining patents for biomedical knowledge.

    PubMed

    Rodriguez-Esteban, Raul; Bundschus, Markus

    2016-06-01

    Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Structure-Based Virtual Screening for Drug Discovery: Principles, Applications and Recent Advances

    PubMed Central

    Lionta, Evanthia; Spyrou, George; Vassilatis, Demetrios K.; Cournia, Zoe

    2014-01-01

    Structure-based drug discovery (SBDD) is becoming an essential tool in assisting fast and cost-efficient lead discovery and optimization. The application of rational, structure-based drug design is proven to be more efficient than the traditional way of drug discovery since it aims to understand the molecular basis of a disease and utilizes the knowledge of the three-dimensional structure of the biological target in the process. In this review, we focus on the principles and applications of Virtual Screening (VS) within the context of SBDD and examine different procedures ranging from the initial stages of the process that include receptor and library pre-processing, to docking, scoring and post-processing of top-scoring hits. Recent improvements in structure-based virtual screening (SBVS) efficiency through ensemble docking, induced fit and consensus docking are also discussed. The review highlights advances in the field in the context of several success stories that have led to nM inhibition directly from VS, presents recent trends in library design, and discusses limitations of the method. Applications of SBVS in the design of substrates for engineered proteins that enable the discovery of new metabolic and signal transduction pathways and the design of inhibitors of multifunctional proteins are also reviewed. Finally, we contribute two promising VS protocols recently developed by us that aim to increase inhibitor selectivity. In the first protocol, we describe the discovery of micromolar inhibitors through SBVS designed to inhibit the mutant H1047R PI3Kα kinase. Second, we discuss a strategy for the identification of selective binders for the RXRα nuclear receptor. In this protocol, a set of target structures is constructed for ensemble docking based on binding site shape characterization and clustering, aiming to enhance the hit rate of selective inhibitors for the desired protein target through the SBVS process. PMID:25262799
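
    One of the post-processing ideas discussed above, consensus scoring, can be illustrated with a toy rank-averaging example; the compound names and scores below are invented, and real protocols differ in how individual scores are normalized and combined.

    ```python
    # Toy illustration of consensus scoring in structure-based virtual screening:
    # ranks from several (hypothetical) docking/scoring runs are averaged so that
    # compounds scored well by multiple methods rise to the top.
    import pandas as pd

    scores = pd.DataFrame({
        "compound": ["cpd1", "cpd2", "cpd3", "cpd4"],
        "dock_score_A": [-9.1, -7.4, -8.8, -6.0],   # more negative = better
        "dock_score_B": [-10.2, -8.9, -7.1, -6.5],
        "ml_score":     [0.91, 0.55, 0.78, 0.40],   # higher = better
    }).set_index("compound")

    ranks = pd.DataFrame({
        "A": scores["dock_score_A"].rank(),             # ascending: best (lowest) = 1
        "B": scores["dock_score_B"].rank(),
        "ML": scores["ml_score"].rank(ascending=False), # descending: best (highest) = 1
    })
    consensus = ranks.mean(axis=1).sort_values()
    print(consensus)   # compounds ordered by consensus rank
    ```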

  12. 19 CFR 207.109 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 19 Customs Duties 3 2014-04-01 2014-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  13. 19 CFR 207.109 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 19 Customs Duties 3 2011-04-01 2011-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  14. 19 CFR 207.109 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 19 Customs Duties 3 2013-04-01 2013-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  15. 19 CFR 207.109 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 19 Customs Duties 3 2012-04-01 2012-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  16. 19 CFR 207.109 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 19 Customs Duties 3 2010-04-01 2010-04-01 false Discovery. 207.109 Section 207.109 Customs Duties... and Committee Proceedings § 207.109 Discovery. (a) Discovery methods. All parties may obtain discovery under such terms and limitations as the administrative law judge may order. Discovery may be by one or...

  17. Computational drug discovery

    PubMed Central

    Ou-Yang, Si-sheng; Lu, Jun-yan; Kong, Xiang-qian; Liang, Zhong-jie; Luo, Cheng; Jiang, Hualiang

    2012-01-01

    Computational drug discovery is an effective strategy for accelerating and economizing the drug discovery and development process. Because of the dramatic increase in the availability of biological macromolecule and small molecule information, the applicability of computational drug discovery has been extended and broadly applied to nearly every stage in the drug discovery and development workflow, including target identification and validation, lead discovery and optimization, and preclinical tests. Over the past decades, computational drug discovery methods such as molecular docking, pharmacophore modeling and mapping, de novo design, molecular similarity calculation and sequence-based virtual screening have been greatly improved. In this review, we present an overview of these important computational methods, platforms and successful applications in this field. PMID:22922346
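
    As a concrete example of one of the listed methods, molecular similarity calculation, the sketch below computes Tanimoto similarity between Morgan fingerprints with RDKit; the SMILES strings are arbitrary, and the fingerprint parameters (radius 2, 2048 bits) are common defaults rather than values from the review.

    ```python
    # Minimal molecular-similarity example: Morgan fingerprints plus Tanimoto
    # similarity with RDKit. The SMILES strings are arbitrary examples.
    from rdkit import Chem
    from rdkit.Chem import AllChem, DataStructs

    query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")      # aspirin
    candidates = {
        "ibuprofen":   "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
        "paracetamol": "CC(=O)Nc1ccc(O)cc1",
    }

    query_fp = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)
    for name, smiles in candidates.items():
        fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=2048)
        sim = DataStructs.TanimotoSimilarity(query_fp, fp)
        print(f"{name}: Tanimoto = {sim:.2f}")
    ```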

  18. Prior knowledge driven Granger causality analysis on gene regulatory network discovery

    DOE PAGES

    Yao, Shun; Yoo, Shinjae; Yu, Dantong

    2015-08-28

    Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) is usually much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n>>T. In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization), to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed new methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. In our research, we noticed a "1+1>2" effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (whose functions were previously unknown) might be related to the yeast's responses to different levels of glucose. In conclusion, our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances, which provide statistical meaning to the discovered causality networks. All of our data and source codes will be available under the link https://bitbucket.org/dtyu/granger-causality/wiki/Home.
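
    To make the underlying idea concrete, the sketch below scores a simple pairwise Granger-causality relation with ridge-regularized lag regressions on synthetic data; it illustrates the generic GC-with-Ridge setup only and is not the authors' CGC-2SPR method, which additionally incorporates prior biological knowledge in a two-step regularization.

    ```python
    # Generic sketch of pairwise Granger-causality scoring with ridge-regularized
    # lag regression: gene x "Granger-causes" gene y if adding x's lagged values
    # improves prediction of y over y's own past.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    T, lag = 30, 2
    x = rng.standard_normal(T)                    # candidate regulator
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()   # y depends on lagged x

    def lagged(series, lag):
        """Stack the lag most recent past values for each prediction time point."""
        return np.column_stack([series[k:len(series) - lag + k] for k in range(lag)])

    Y = y[lag:]                                   # target values
    X_self = lagged(y, lag)                       # restricted model: y's own past
    X_full = np.hstack([X_self, lagged(x, lag)])  # full model: plus x's past

    rss_r = np.sum((Y - Ridge(alpha=1.0).fit(X_self, Y).predict(X_self)) ** 2)
    rss_f = np.sum((Y - Ridge(alpha=1.0).fit(X_full, Y).predict(X_full)) ** 2)
    gc_score = np.log(rss_r / rss_f)              # > 0 suggests x helps predict y
    print(f"Granger-causality score: {gc_score:.3f}")
    ```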

  19. A bilateral integrative health-care knowledge service mechanism based on 'MedGrid'.

    PubMed

    Liu, Chao; Jiang, Zuhua; Zhen, Lu; Su, Hai

    2008-04-01

    Current health-care organizations are facing a perceived paucity of medical knowledge. This paper classifies medical knowledge with new scopes. The discovery of health-care 'knowledge flow' motivates a bilateral integrative health-care knowledge service, in which medical knowledge is made to 'flow' around and gain comprehensive effectiveness through six operations (such as knowledge refreshing...). Addressing the active demands of the Chinese health-care revolution, this paper presents 'MedGrid', a platform with a medical ontology and knowledge content services. Each level and its detailed contents are described in the MedGrid info-structure. Moreover, a new diagnosis and treatment mechanism is formed by technically connecting with electronic health-care records (EHRs).

  20. Review of high-throughput techniques for detecting solid phase Transformation from material libraries produced by combinatorial methods

    NASA Technical Reports Server (NTRS)

    Lee, Jonathan A.

    2005-01-01

    High-throughput measurement techniques are reviewed for solid phase transformation from materials produced by combinatorial methods, which are highly efficient concepts for fabricating a large variety of material libraries with different compositional gradients on a single wafer. Combinatorial methods hold high potential for reducing the time and costs associated with the development of new materials, as compared to time-consuming and labor-intensive conventional methods that test large batches of material, one composition at a time. These high-throughput techniques can be automated to rapidly capture and analyze data, using the entire material library on a single wafer, thereby accelerating the pace of materials discovery and knowledge generation for solid phase transformations. The review covers experimental techniques that are applicable to inorganic materials such as shape memory alloys, graded materials, metal hydrides, ferric materials, semiconductors and industrial alloys.

  1. Harnessing the potential of natural products in drug discovery from a cheminformatics vantage point.

    PubMed

    Rodrigues, Tiago

    2017-11-15

    Natural products (NPs) present a privileged source of inspiration for chemical probe and drug design. Despite the biological pre-validation of the underlying molecular architectures and their relevance in drug discovery, the poor accessibility to NPs, complexity of the synthetic routes and scarce knowledge of their macromolecular counterparts in phenotypic screens still hinder their broader exploration. Cheminformatics algorithms now provide a powerful means of circumventing the abovementioned challenges and unlocking the full potential of NPs in a drug discovery context. Herein, I discuss recent advances in the computer-assisted design of NP mimics and how artificial intelligence may accelerate future NP-inspired molecular medicine.

  2. Integrated Bio-Entity Network: A System for Biological Knowledge Discovery

    PubMed Central

    Bell, Lindsey; Chowdhary, Rajesh; Liu, Jun S.; Niu, Xufeng; Zhang, Jinfeng

    2011-01-01

    A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses—the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs. PMID:21738677
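
    The hypothesis-generation step can be illustrated on a toy graph: the sketch below enumerates indirect paths between two bio-entities with a depth-limited search, loosely analogous to the BFSP idea described above (the real system also scores paths by probability). The entities, relations, and the networkx-based implementation are illustrative assumptions.

    ```python
    # Toy sketch of hypothesis generation on a bio-entity graph: a depth-limited
    # search proposes indirect links between entities that are not directly
    # connected. The tiny graph and edge labels are made up.
    import networkx as nx

    G = nx.Graph()
    G.add_edge("drug_X", "protein_A", relation="inhibits")
    G.add_edge("protein_A", "pathway_P", relation="participates_in")
    G.add_edge("pathway_P", "disease_D", relation="dysregulated_in")
    G.add_edge("protein_A", "GO_term_T", relation="annotated_with")

    def hypotheses(graph, source, target, max_len=3):
        """Return indirect source->target paths no longer than max_len edges."""
        return [
            path for path in nx.all_simple_paths(graph, source, target, cutoff=max_len)
            if len(path) > 2                  # skip direct edges: keep only indirect links
        ]

    for path in hypotheses(G, "drug_X", "disease_D"):
        print(" -> ".join(path))              # drug_X -> protein_A -> pathway_P -> disease_D
    ```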

  3. 32 CFR 34.2 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... increasing knowledge or understanding in science and engineering. Applied research is defined as efforts that attempt to determine and exploit the potential of scientific discoveries or improvements in technology...

  4. 76 FR 4452 - Privacy Act of 1974; Report of Modified or Altered System of Records

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-25

    ... Disease Control and Prevention (CDC) for more complete knowledge of the disease/condition in the following... the light of future discoveries and proven associations so that relevant data collected at the time of... professional staff at the Centers for Disease Control and Prevention (CDC) for more complete knowledge of the...

  5. Trying to Teach Well: A Story of Small Discoveries

    ERIC Educational Resources Information Center

    Lewis, P. J.

    2004-01-01

    "Stories do not simply contain knowledge, they are themselves the knowledge" (Jackson (In: K. Eagan, H. McEwan (Eds.), Narrative in Teaching, Learning and Research, Teacher College Press, New York, 1995, p. 5)). How can we teach well? Perhaps we can find answers through our stories from the classroom. It is through our stories that we make sense…

  6. Knowledge Translation versus Knowledge Integration: A "Funder's" Perspective

    ERIC Educational Resources Information Center

    Kerner, Jon F.

    2006-01-01

    Each year, billions of US tax dollars are spent on basic discovery, intervention development, and efficacy research, while hundreds of billions of US tax dollars are also spent on health service delivery programs. However, little is spent on or known about how best to ensure that the lessons learned from science inform and improve the quality of…

  7. The Assessment of Self-Directed Learning Readiness in Medical Education

    ERIC Educational Resources Information Center

    Monroe, Katherine Swint

    2014-01-01

    The rapid pace of scientific discovery has catalyzed the need for medical students to be able to find and assess new information. The knowledge required for physicians' skillful practice will change over the course of their careers, and, to keep up, they must be able to recognize their deficiencies, search for new knowledge, and critically evaluate…

  8. EPA Web Taxonomy

    EPA Pesticide Factsheets

    EPA's Web Taxonomy is a faceted hierarchical vocabulary used to tag web pages with terms from a controlled vocabulary. Tagging enables search and discovery of EPA's Web-based information assets. EPA's Web Taxonomy is being provided in Simple Knowledge Organization System (SKOS) format. SKOS is a standard for sharing and linking knowledge organization systems that promises to make Federal terminology resources more interoperable.
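
    For readers unfamiliar with SKOS, the hedged sketch below shows how such a vocabulary could be read with the rdflib package; the file name is a placeholder for the published SKOS export, and the serialization format may differ from the 'xml' assumed here.

    ```python
    # Minimal sketch of reading a SKOS vocabulary with rdflib; the file name
    # "epa_web_taxonomy.rdf" and the "xml" format are assumptions.
    from rdflib import Graph
    from rdflib.namespace import RDF, SKOS

    g = Graph()
    g.parse("epa_web_taxonomy.rdf", format="xml")

    # List each concept with its preferred label and its broader concepts.
    for concept in g.subjects(RDF.type, SKOS.Concept):
        label = g.value(concept, SKOS.prefLabel)
        broader = [g.value(b, SKOS.prefLabel) for b in g.objects(concept, SKOS.broader)]
        print(label, "->", broader)
    ```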

  9. The Emergence of Organizing Structure in Conceptual Representation.

    PubMed

    Lake, Brenden M; Lawrence, Neil D; Tenenbaum, Joshua B

    2018-06-01

    Both scientists and children make important structural discoveries, yet their computational underpinnings are not well understood. Structure discovery has previously been formalized as probabilistic inference about the right structural form, where the form could be a tree, ring, chain, grid, etc. (Kemp & Tenenbaum, 2008). Although this approach can learn intuitive organizations, including a tree for animals and a ring for the color circle, it assumes a strong inductive bias that considers only these particular forms, and each form is explicitly provided as initial knowledge. Here we introduce a new computational model of how organizing structure can be discovered, utilizing a broad hypothesis space with a preference for sparse connectivity. Given that the inductive bias is more general, the model's initial knowledge shows little qualitative resemblance to some of the discoveries it supports. As a consequence, the model can also learn complex structures for domains that lack intuitive description, as well as predict human property induction judgments without explicit structural forms. By allowing form to emerge from sparsity, our approach clarifies how both the richness and flexibility of human conceptual organization can coexist. Copyright © 2018 Cognitive Science Society, Inc.

  10. OWL reasoning framework over big biological knowledge network.

    PubMed

    Chen, Huajun; Chen, Xi; Gu, Peiqin; Wu, Zhaohui; Yu, Tong

    2014-01-01

    Recently, huge amounts of data have been generated in the domain of biology. Embedded with domain knowledge from different disciplines, the isolated biological resources are implicitly connected. Thus they have shaped a big network of versatile biological knowledge. Faced with such massive, disparate, and interlinked biological data, providing an efficient way to model, integrate, and analyze the big biological network becomes a challenge. In this paper, we present a general OWL (web ontology language) reasoning framework to study the implicit relationships among biological entities. A comprehensive biological ontology across traditional Chinese medicine (TCM) and western medicine (WM) is used to create a conceptual model for the biological network. Then corresponding biological data is integrated into a biological knowledge network as the data model. Based on the conceptual model and data model, a scalable OWL reasoning method is utilized to infer the potential associations between biological entities from the biological network. In our experiment, we focus on the association discovery between TCM and WM. The derived associations are quite useful for biologists to promote the development of novel drugs and TCM modernization. The experimental results show that the system achieves high efficiency, accuracy, scalability, and effectiveness.
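
    As a small illustration of OWL reasoning in general (not the paper's own scalable framework), the sketch below uses the owlready2 package to load an ontology, run a reasoner, and read back inferred class memberships; the ontology path is a placeholder, and owlready2's bundled HermiT reasoner requires a Java runtime.

    ```python
    # Hedged sketch of OWL reasoning with owlready2: load an ontology, classify
    # it, and inspect inferred class memberships. The file path is a placeholder.
    from owlready2 import get_ontology, sync_reasoner

    onto = get_ontology("file:///data/biomedical_knowledge.owl").load()

    with onto:
        sync_reasoner()           # run the reasoner; inferences are added in place

    # After reasoning, each individual's .is_a list includes inferred classes,
    # which can surface implicit associations between biological entities.
    for individual in onto.individuals():
        print(individual, "->", individual.is_a)
    ```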

  11. Information extraction and knowledge graph construction from geoscience literature

    NASA Astrophysics Data System (ADS)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

    2018-03-01

    Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
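
    The graph-building step of such a workflow can be sketched as follows: given already segmented, stop-word-filtered content words (here a tiny hand-made example instead of CRF-segmented Chinese text), bigram co-occurrences are counted and stored as a weighted graph whose high-degree nodes correspond to key terms.

    ```python
    # Simplified sketch of the graph-building step: count bigram co-occurrences of
    # content words and store them as a weighted knowledge graph. Real input would
    # come from CRF-based word segmentation of geoscience documents.
    from collections import Counter
    import networkx as nx

    documents = [
        ["granite", "intrusion", "copper", "deposit"],
        ["copper", "deposit", "fault", "zone"],
        ["granite", "fault", "zone", "alteration"],
    ]

    bigrams = Counter()
    for words in documents:
        for a, b in zip(words, words[1:]):        # adjacent content-word pairs
            bigrams[(a, b)] += 1

    G = nx.Graph()
    for (a, b), weight in bigrams.items():
        G.add_edge(a, b, weight=weight)           # nodes = content words, edges = co-occurrence

    # Nodes with high weighted degree are the key terms of the document collection.
    print(sorted(G.degree(weight="weight"), key=lambda kv: kv[1], reverse=True))
    ```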

  12. OWL Reasoning Framework over Big Biological Knowledge Network

    PubMed Central

    Chen, Huajun; Chen, Xi; Gu, Peiqin; Wu, Zhaohui; Yu, Tong

    2014-01-01

    Recently, huge amounts of data have been generated in the domain of biology. Embedded with domain knowledge from different disciplines, the isolated biological resources are implicitly connected. Thus they have shaped a big network of versatile biological knowledge. Faced with such massive, disparate, and interlinked biological data, providing an efficient way to model, integrate, and analyze the big biological network becomes a challenge. In this paper, we present a general OWL (web ontology language) reasoning framework to study the implicit relationships among biological entities. A comprehensive biological ontology across traditional Chinese medicine (TCM) and western medicine (WM) is used to create a conceptual model for the biological network. Then corresponding biological data is integrated into a biological knowledge network as the data model. Based on the conceptual model and data model, a scalable OWL reasoning method is utilized to infer the potential associations between biological entities from the biological network. In our experiment, we focus on the association discovery between TCM and WM. The derived associations are quite useful for biologists to promote the development of novel drugs and TCM modernization. The experimental results show that the system achieves high efficiency, accuracy, scalability, and effectiveness. PMID:24877076

  13. Combined use of computational chemistry and chemoinformatics methods for chemical discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sugimoto, Manabu, E-mail: sugimoto@kumamoto-u.ac.jp; Institute for Molecular Science, 38 Nishigo-Naka, Myodaiji, Okazaki 444-8585; CREST, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012

    2015-12-31

    Data analysis of the numerical data produced by computational chemistry calculations is carried out to obtain knowledge information about molecules. A molecular database is developed to systematically store chemical, electronic-structure, and knowledge-based information. The database is used to find molecules related to the keyword “cancer”. Then electronic-structure calculations are performed to quantitatively evaluate quantum chemical similarity of the molecules. Among the 377 compounds registered in the database, 24 molecules are found to be “cancer”-related. This set of molecules includes both carcinogens and anticancer drugs. The quantum chemical similarity analysis, which is carried out by using numerical results of the density-functional theory calculations, shows that, when some energy spectra are referred to, carcinogens are reasonably distinguished from the anticancer drugs. Therefore these spectral properties are considered as important measures for classification.
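
    The similarity analysis can be pictured with a simple stand-in: represent each molecule by a binned energy spectrum and compare the vectors, as in the sketch below. The random vectors and the cosine measure are illustrative assumptions; the study's actual similarity measure is defined over density-functional theory results.

    ```python
    # Illustrative sketch of spectrum-based molecular similarity: compare
    # molecules through a vector representation of a computed energy spectrum
    # (random stand-ins here) using cosine similarity.
    import numpy as np

    rng = np.random.default_rng(1)
    spectra = {                                     # name -> binned spectrum (arbitrary data)
        "mol_A": rng.random(50),
        "mol_B": rng.random(50),
        "mol_C": rng.random(50),
    }

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    names = list(spectra)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            print(f"{a} vs {b}: similarity = {cosine(spectra[a], spectra[b]):.3f}")
    ```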

  14. The modern search for the Holy Grail: is neuroscience a solution?

    PubMed Central

    Naor, Navot; Ben-Ze'ev, Aaron; Okon-Singer, Hadas

    2014-01-01

    Neuroscience has become prevalent in recent years; nevertheless, its value in the examination of psychological and philosophical phenomena is still a matter of debate. The examples reviewed here suggest that neuroscientific tools can be significant in the investigation of such complex phenomena. In this article, we argue that it is important to study concepts that do not have a clear characterization and emphasize the role of neuroscience in this quest for knowledge. The data reviewed here suggest that neuroscience may (1) enrich our knowledge; (2) outline the nature of an explanation; and (3) lead to substantial empirical and theoretical discoveries. To that end, we review work on hedonia and eudaimonia in the fields of neuroscience, psychology, and philosophy. These studies demonstrate the importance of neuroscientific tools in the investigation of phenomena that are difficult to define using other methods. PMID:24926246

  15. Biomedical Ontologies in Action: Role in Knowledge Management, Data Integration and Decision Support

    PubMed Central

    Bodenreider, O.

    2008-01-01

    Summary Objectives To provide typical examples of biomedical ontologies in action, emphasizing the role played by biomedical ontologies in knowledge management, data integration and decision support. Methods Biomedical ontologies selected for their practical impact are examined from a functional perspective. Examples of applications are taken from operational systems and the biomedical literature, with a bias towards recent journal articles. Results The ontologies under investigation in this survey include SNOMED CT, the Logical Observation Identifiers, Names, and Codes (LOINC), the Foundational Model of Anatomy, the Gene Ontology, RxNorm, the National Cancer Institute Thesaurus, the International Classification of Diseases, the Medical Subject Headings (MeSH) and the Unified Medical Language System (UMLS). The roles played by biomedical ontologies are classified into three major categories: knowledge management (indexing and retrieval of data and information, access to information, mapping among ontologies); data integration, exchange and semantic interoperability; and decision support and reasoning (data selection and aggregation, decision support, natural language processing applications, knowledge discovery). Conclusions Ontologies play an important role in biomedical research through a variety of applications. While ontologies are used primarily as a source of vocabulary for standardization and integration purposes, many applications also use them as a source of computable knowledge. Barriers to the use of ontologies in biomedical applications are discussed. PMID:18660879

  16. A knowledge discovery object model API for Java

    PubMed Central

    Zuyderduyn, Scott D; Jones, Steven JM

    2003-01-01

    Background Biological data resources have become heterogeneous and derive from multiple sources. This introduces challenges in the management and utilization of this data in software development. Although efforts are underway to create a standard format for the transmission and storage of biological data, this objective has yet to be fully realized. Results This work describes an application programming interface (API) that provides a framework for developing an effective biological knowledge ontology for Java-based software projects. The API provides a robust framework for the data acquisition and management needs of an ontology implementation. In addition, the API contains classes to assist in creating GUIs to represent this data visually. Conclusions The Knowledge Discovery Object Model (KDOM) API is particularly useful for medium to large applications, or for a number of smaller software projects with common characteristics or objectives. KDOM can be coupled effectively with other biologically relevant APIs and classes. Source code, libraries, documentation and examples are available at . PMID:14583100

  17. Integrative Convergence in Neuroscience: Trajectories, Problems, and the Need for a Progressive Neurobioethics

    NASA Astrophysics Data System (ADS)

    Giordano, J.

    The advanced integrative scientific convergence (AISC) model represents a viable approach to neuroscience. Beyond simple multi-disciplinarity, the AISC model unifies constituent scientific and technological fields to foster innovation, invention and new ways of addressing seemingly intractable questions. In this way, AISC can yield novel methods, foster new trajectories of knowledge and discovery, and produce new epistemologies. As stand-alone disciplines, each and all of the constituent fields generate practical and ethical issues, and their convergence may establish a unique set of both potential benefits and problems. To effectively attend to these contingencies requires pragmatic assessment of the actual capabilities and limits of neurofocal AISC, and an openness to what new knowledge and scientific/technological achievements may be produced, and how such outcomes can affect humanity, the human condition, society and the global environment. It is proposed that a progressive neurobioethics may be needed to establish both a meta-ethical framework upon which to structure ethical decisions, and a system and method of ethics that is inclusive, convergent and innovative, and is thus aligned with and meaningful to the use of an AISC model in neuroscience.

  18. Graph theory enables drug repurposing--how a mathematical model can drive the discovery of hidden mechanisms of action.

    PubMed

    Gramatica, Ruggero; Di Matteo, T; Giorgetti, Stefano; Barbiani, Massimo; Bevec, Dorian; Aste, Tomaso

    2014-01-01

    We introduce a methodology to efficiently exploit natural-language expressed biomedical knowledge for repurposing existing drugs towards diseases for which they were not initially intended. Leveraging developments in Computational Linguistics and Graph Theory, a methodology is defined to build a graph representation of knowledge, which is automatically analysed to discover hidden relations between any drug and any disease: these relations are specific paths among the biomedical entities of the graph, representing possible Modes of Action for any given pharmacological compound. We propose a measure for the likeliness of these paths based on a stochastic process on the graph. This measure depends on the abundance of indirect paths between a peptide and a disease, rather than solely on the strength of the shortest path connecting them. We provide real-world examples, showing how the method successfully retrieves known pathophysiological Modes of Action and finds new ones by meaningfully selecting and aggregating contributions from known bio-molecular interactions. Applications of this methodology are presented, and prove the efficacy of the method for selecting drugs as treatment options for rare diseases.
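
    In the same spirit (though not the authors' exact likeliness measure), a random-walk score that rewards an abundance of indirect drug-disease paths can be sketched with personalized PageRank on a toy graph; the entities and edges below are made up.

    ```python
    # Toy sketch: score drug-disease relatedness by a random-walk measure that
    # rewards many indirect connecting paths, using personalized PageRank on a
    # small made-up biomedical graph.
    import networkx as nx

    G = nx.Graph()
    G.add_edges_from([
        ("drug_X", "protein_A"), ("drug_X", "protein_B"),
        ("protein_A", "pathway_P"), ("protein_B", "pathway_P"),
        ("pathway_P", "disease_D"), ("protein_A", "disease_E"),
    ])

    # Restart the walk at the drug node; the probability mass reaching a disease
    # node grows with the number and strength of indirect paths leading to it.
    personalization = {n: (1.0 if n == "drug_X" else 0.0) for n in G}
    ppr = nx.pagerank(G, personalization=personalization)
    scores = {node: score for node, score in ppr.items() if node.startswith("disease")}
    print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
    ```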

  19. Data Science Priorities for a University Hospital-Based Institute of Infectious Diseases: A Viewpoint.

    PubMed

    Valleron, Alain-Jacques

    2017-08-15

    Automation of laboratory tests, bioinformatic analysis of biological sequences, and professional data management are used routinely in a modern university hospital-based infectious diseases institute. This dates back to at least the 1980s. However, the scientific methods of this 21st century are changing with the increased power and speed of computers, with the "big data" revolution having already happened in genomics and environment, and eventually arriving in medical informatics. The research will be increasingly "data driven," and the powerful machine learning methods whose efficiency is demonstrated in daily life will also revolutionize medical research. A university-based institute of infectious diseases must therefore not only gather excellent computer scientists and statisticians (as in the past, and as in any medical discipline), but also fully integrate the biologists and clinicians with these computer scientists, statisticians, and mathematical modelers having a broad culture in machine learning, knowledge representation, and knowledge discovery. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.

  20. Emergence of Chinese drug discovery research: impact of hit and lead identification.

    PubMed

    Zhou, Caihong; Zhou, Yan; Wang, Jia; Zhu, Yue; Deng, Jiejie; Wang, Ming-Wei

    2015-03-01

    The identification of hits and the generation of viable leads is an early and yet crucial step in drug discovery. In the West, the main players of drug discovery are pharmaceutical and biotechnology companies, while in China, academic institutions remain central in the field of drug discovery. There has been a tremendous amount of investment from the public as well as private sectors to support infrastructure buildup and expertise consolidation relative to drug discovery and development in the past two decades. A large-scale compound library has been established in China, and a series of high-impact discoveries of lead compounds have been made by integrating information obtained from different technology-based strategies. Natural products are a major source in China's drug discovery efforts. Knowledge has been enhanced via disruptive breakthroughs such as the discovery of Boc5 as a nonpeptidic agonist of glucagon-like peptide 1 receptor (GLP-1R), one of the class B G protein-coupled receptors (GPCRs). Most of the original hit identification and lead generation were carried out by academic institutions, including universities and specialized research institutes. The Chinese pharmaceutical industry is gradually transforming itself from manufacturing low-end generics and active pharmaceutical ingredients to inventing new drugs. © 2014 Society for Laboratory Automation and Screening.

Top