Knowledge Discovery from Databases: An Introductory Review.
ERIC Educational Resources Information Center
Vickery, Brian
1997-01-01
Introduces new procedures being used to extract knowledge from databases and discusses rationales for developing knowledge discovery methods. Methods are described for such techniques as classification, clustering, and the detection of deviations from pre-established norms. Examines potential uses of knowledge discovery in the information field.…
1994-09-30
relational versus object-oriented DBMS, knowledge discovery, data models, metadata, data filtering, clustering techniques, and synthetic data. A secondary...The first was the investigation of AI/ES applications (knowledge discovery, data mining, and clustering). Here CAST collaborated with Dr. Fred Petry...knowledge discovery system based on clustering techniques; implemented an on-line data browser to the DBMS; completed preliminary efforts to apply object
Knowledge Discovery and Data Mining: An Overview
NASA Technical Reports Server (NTRS)
Fayyad, U.
1995-01-01
Knowledge discovery and data mining is the process of extracting information from very large databases. Its importance is described, along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.
Knowledge Discovery as an Aid to Organizational Creativity.
ERIC Educational Resources Information Center
Siau, Keng
2000-01-01
This article presents the concept of knowledge discovery, a process of searching for associations in large volumes of computer data, as an aid to creativity. It then discusses the various techniques in knowledge discovery. Mednick's associative theory of creative thought serves as the theoretical foundation for this research. (Contains…
Knowledge Discovery and Data Mining in Iran's Climatic Researches
NASA Astrophysics Data System (ADS)
Karimi, Mostafa
2013-04-01
Advances in measurement technology and data collection have made databases ever larger, and large databases require powerful analysis tools. The iterative process of acquiring knowledge from information obtained through data processing takes various forms in all scientific fields. However, when the data volume is large, traditional methods cannot cope with many of the problems. In recent years, the use of databases in various scientific fields, especially atmospheric databases in climatology, has expanded. In addition, the growing amount of data generated by climate models makes it challenging to analyze them and extract hidden patterns and knowledge. The approach adopted for this problem in recent years is the process of knowledge discovery and data mining techniques, which draws on concepts from machine learning, artificial intelligence, and expert (professional) systems. Data mining is an analytical process for mining massive volumes of data. The ultimate goal of data mining is access to information and, finally, knowledge. Climatology is a branch of science that uses varied and massive volumes of data. The goal of climate data mining is to obtain information from varied and massive atmospheric and non-atmospheric data. In fact, knowledge discovery performs these activities in a logical, predetermined, and almost automatic process. The goal of this research is to study the use of knowledge discovery and data mining techniques in Iranian climate research. To achieve this goal, the study applied content (descriptive) analysis and classified the research by method and issue. The results show that in Iranian climatic research, clustering, k-means, and Ward's method were applied most often, and that in terms of issues, precipitation and atmospheric circulation patterns were addressed most often.
Although several studies of geography and climate issues have used statistical techniques such as clustering and pattern extraction, given the nature of statistics and data mining, it cannot be said that data mining and knowledge discovery techniques are used in domestic climate studies. However, it is necessary to use the KDD approach and DM techniques in climatic studies, specifically for interpreting climate modeling results.
Anguera, A; Barreiro, J M; Lara, J A; Lizcano, D
2016-01-01
One of the major challenges in the medical domain today is how to exploit the huge amount of data that this field generates. To do this, approaches are required that are capable of discovering knowledge that is useful for decision making in the medical field. Time series are data types that are common in the medical domain and require specialized analysis techniques and tools, especially if the information of interest to specialists is concentrated within particular time series regions, known as events. This research followed the steps specified by the so-called knowledge discovery in databases (KDD) process to discover knowledge from medical time series derived from stabilometric (396 series) and electroencephalographic (200) patient electronic health records (EHR). The view offered in the paper is based on the experience gathered as part of the VIIP project. Knowledge discovery in medical time series has a number of difficulties and implications that are highlighted by illustrating the application of several techniques that cover the entire KDD process through two case studies. This paper illustrates the application of different knowledge discovery techniques for the purposes of classification within the above domains. The accuracy of this application for the two classes considered in each case is 99.86% and 98.11% for epilepsy diagnosis in the electroencephalography (EEG) domain and 99.4% and 99.1% for early-age sports talent classification in the stabilometry domain. The KDD techniques achieve better results than other traditional neural network-based classification techniques.
A Knowledge Discovery framework for Planetary Defense
NASA Astrophysics Data System (ADS)
Jiang, Y.; Yang, C. P.; Li, Y.; Yu, M.; Bambacus, M.; Seery, B.; Barbee, B.
2016-12-01
Planetary Defense, a project funded by NASA Goddard and the NSF, is a multi-faceted effort focused on the mitigation of Near Earth Object (NEO) threats to our planet. Currently, information concerning NEOs is dispersed amongst different organizations and scientists, leading to a lack of a coherent system of information to be used for efficient NEO mitigation. In this paper, a planetary defense knowledge discovery engine is proposed to better assist the development and integration of a NEO responding system. Specifically, we have implemented an organized information framework by two means: 1) the development of a semantic knowledge base, which provides a structure for relevant information. It has been developed by the implementation of web crawling and natural language processing techniques, which allow us to collect and store the most relevant structured information on a regular basis. 2) the development of a knowledge discovery engine, which allows for the efficient retrieval of information from our knowledge base. The knowledge discovery engine has been built on top of Elasticsearch, an open source full-text search engine, as well as cutting-edge machine learning ranking and recommendation algorithms. This proposed framework is expected to advance knowledge discovery and innovation in the planetary science domain.
Big, Deep, and Smart Data in Scanning Probe Microscopy
Kalinin, Sergei V.; Strelcov, Evgheni; Belianinov, Alex; ...
2016-09-27
Scanning probe microscopy techniques open the door to nanoscience and nanotechnology by enabling imaging and manipulation of structure and functionality of matter on nanometer and atomic scales. We analyze the discovery process by SPM in terms of information flow from tip-surface junction to the knowledge adoption by scientific community. Furthermore, we discuss the challenges and opportunities offered by merging of SPM and advanced data mining, visual analytics, and knowledge discovery technologies.
From Information Center to Discovery System: Next Step for Libraries?
ERIC Educational Resources Information Center
Marcum, James W.
2001-01-01
Proposes a discovery system model to guide technology integration in academic libraries that fuses organizational learning, systems learning, and knowledge creation techniques with constructivist learning practices to suggest possible future directions for digital libraries. Topics include accessing visual and continuous media; information…
Exploring relation types for literature-based discovery.
Preiss, Judita; Stevenson, Mark; Gaizauskas, Robert
2015-09-01
Literature-based discovery (LBD) aims to identify "hidden knowledge" in the medical literature by: (1) analyzing documents to identify pairs of explicitly related concepts (terms), then (2) hypothesizing novel relations between pairs of unrelated concepts that are implicitly related via a shared concept to which both are explicitly related. Many LBD approaches use simple techniques to identify semantically weak relations between concepts, for example, document co-occurrence. These generate huge numbers of hypotheses, difficult for humans to assess. More complex techniques rely on linguistic analysis, for example, shallow parsing, to identify semantically stronger relations. Such approaches generate fewer hypotheses, but may miss hidden knowledge. The authors investigate this trade-off in detail, comparing techniques for identifying related concepts to discover which are most suitable for LBD. A generic LBD system that can utilize a range of relation types was developed. Experiments were carried out comparing a number of techniques for identifying relations. Two approaches were used for evaluation: replication of existing discoveries and the "time slicing" approach. Results: Previous LBD discoveries could be replicated using relations based either on document co-occurrence or linguistic analysis. Using relations based on linguistic analysis generated many fewer hypotheses, but a significantly greater proportion of them were candidates for hidden knowledge. The use of linguistic analysis-based relations improves accuracy of LBD without overly damaging coverage. LBD systems often generate huge numbers of hypotheses, which are infeasible to manually review. Improving their accuracy has the potential to make these systems significantly more usable. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
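The two-step procedure described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' system: it uses the simplest relation type mentioned (document co-occurrence), and the function names `cooccurrence_relations` and `hypothesize` as well as the toy corpus are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_relations(documents):
    """Step 1: treat two concepts as explicitly related if they share a document."""
    related = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            related[a].add(b)
            related[b].add(a)
    return related


def hypothesize(related, source):
    """Step 2: propose concepts that are not explicitly related to `source`
    but share at least one linking concept with it.

    Returns a dict mapping each candidate concept to its linking concepts.
    """
    direct = related.get(source, set())
    hypotheses = defaultdict(set)
    for link in direct:
        for candidate in related.get(link, set()):
            if candidate != source and candidate not in direct:
                hypotheses[candidate].add(link)
    return dict(hypotheses)


# Toy corpus invented for illustration; each document is a set of concept terms.
docs = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "raynaud disease"},
    {"fish oil", "platelet aggregation"},
    {"platelet aggregation", "raynaud disease"},
    {"fish oil", "omega-3"},
]

relations = cooccurrence_relations(docs)
candidates = hypothesize(relations, "fish oil")
print(candidates)
```

With co-occurrence relations, every concept sharing any linking term becomes a candidate, which is why this kind of relation tends to produce very many hypotheses on a real corpus; a stricter relation extractor would simply replace `cooccurrence_relations`.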
Radioactive Dating: A Method for Geochronology.
ERIC Educational Resources Information Center
Rowe, M. W.
1985-01-01
Gives historical background on the discovery of natural radiation and discusses various techniques for using knowledge of radiochemistry in geochronological studies. Indicates that of these radioactive techniques, Potassium-40/Argon-40 dating is used most often. (JN)
Big, Deep, and Smart Data in Scanning Probe Microscopy.
Kalinin, Sergei V; Strelcov, Evgheni; Belianinov, Alex; Somnath, Suhas; Vasudevan, Rama K; Lingerfelt, Eric J; Archibald, Richard K; Chen, Chaomei; Proksch, Roger; Laanait, Nouamane; Jesse, Stephen
2016-09-27
Scanning probe microscopy (SPM) techniques have opened the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter at nanometer and atomic scales. Here, we analyze the scientific discovery process in SPM by following the information flow from the tip-surface junction, to knowledge adoption by the wider scientific community. We further discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.
NASA Astrophysics Data System (ADS)
Stranieri, Andrew; Yearwood, John; Pham, Binh
1999-07-01
The development of data warehouses for the storage and analysis of very large corpora of medical image data represents a significant trend in health care and research. Amongst other benefits, the trend toward warehousing enables the use of techniques for automatically discovering knowledge from large and distributed databases. In this paper, we present an application design for knowledge discovery from databases (KDD) techniques that enhance the performance of the problem solving strategy known as case-based reasoning (CBR) for the diagnosis of radiological images. The problem of diagnosing the abnormality of the cervical spine is used to illustrate the method. The design of a case-based medical image diagnostic support system has three essential characteristics. The first is a case representation that comprises textual descriptions of the image, visual features that are known to be useful for indexing images, and additional visual features to be discovered by data mining many existing images. The second characteristic of the approach presented here involves the development of a case base that comprises an optimal number and distribution of cases. The third characteristic involves the automatic discovery, using KDD techniques, of adaptation knowledge to enhance the performance of the case-based reasoner. Together, the three characteristics of our approach can overcome real-time efficiency obstacles that otherwise militate against the use of CBR in the domain of medical image analysis.
ERIC Educational Resources Information Center
Heffernan, Bernadette M.
1998-01-01
Describes work done to provide staff of the Sandy Point Discovery Center with methods for evaluating exhibits and interpretive programming. Quantitative and qualitative evaluation measures were designed to assess the program's objective of estuary education. Pretest-posttest questionnaires and interviews are used to measure subjects' knowledge and…
2013-01-01
Background Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a huge challenge. However, there are so many diverse methods and methodologies available, that for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. Results A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed, using Java on Spring Framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as the MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used as well as jQuery for interactive user interface operations. Conclusions The framework presented is user-centric, highly extensible and flexible. Since it enables methods for testing using existing data to assess suitability and performance, it is especially suitable for inexperienced biomedical researchers, new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework. PMID:23763826
Holzinger, Andreas; Zupan, Mario
2013-06-13
Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a huge challenge. However, there are so many diverse methods and methodologies available, that for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed, using Java on Spring Framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as the MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used as well as jQuery for interactive user interface operations. The framework presented is user-centric, highly extensible and flexible. Since it enables methods for testing using existing data to assess suitability and performance, it is especially suitable for inexperienced biomedical researchers, new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
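For readers new to the two tree learners named in this abstract, the sketch below shows the split criteria they are commonly associated with: the decrease in Gini impurity (CART) and the information gain ratio (C4.5). It is plain Python written for illustration, not the WEKA or KNODWAT code; the helper names and the four-row toy data set are assumptions, and only categorical features are handled.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(rows, labels, feature):
    """Group the class labels by the value each row takes for `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    return list(groups.values())


def gain_ratio(rows, labels, feature):
    """C4.5-style criterion: information gain divided by split information."""
    n = len(labels)
    groups = partition(rows, labels, feature)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    info_gain = entropy(labels) - remainder
    split_info = -sum(len(g) / n * log2(len(g) / n) for g in groups)
    return info_gain / split_info if split_info > 0 else 0.0


def gini_decrease(rows, labels, feature):
    """CART-style criterion: reduction in weighted Gini impurity."""
    n = len(labels)
    groups = partition(rows, labels, feature)
    return gini(labels) - sum(len(g) / n * gini(g) for g in groups)


# Toy data invented for illustration (binary features, two classes).
rows = [
    {"smoker": "yes", "fever": "yes"},
    {"smoker": "yes", "fever": "no"},
    {"smoker": "no", "fever": "yes"},
    {"smoker": "no", "fever": "no"},
]
labels = ["sick", "sick", "healthy", "healthy"]

for feature in ("smoker", "fever"):
    print(feature, gain_ratio(rows, labels, feature), gini_decrease(rows, labels, feature))
```

In this toy set the `smoker` feature separates the classes perfectly while `fever` carries no information, so both criteria prefer `smoker`; on real data the two criteria can rank features differently.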
Knowledge discovery from structured mammography reports using inductive logic programming.
Burnside, Elizabeth S; Davis, Jesse; Costa, Victor Santos; Dutra, Inês de Castro; Kahn, Charles E; Fine, Jason; Page, David
2005-01-01
The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subsequently be tested and validated. The ILP algorithm discovered two hypotheses from the data that were 1) judged as interesting by a subspecialty trained mammographer and 2) validated by analysis of the data itself.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pharhizgar, K.D.; Lunce, S.E.
1994-12-31
Development of knowledge-based technological acquisition techniques and customers' information profiles are known as assimilative integrated discovery systems (AIDS) in modern organizations. These systems have access through processing to both deep and broad domains of information in modern societies. Through these systems organizations and individuals can predict future trend probabilities and events concerning their customers. AIDSs are new techniques which produce new information which informants can use without the help of the knowledge sources because of the existence of highly sophisticated computerized networks. This paper has analyzed the danger and side effects of misuse of information through the illegal, unethical and immoral access to the data-base in an integrated and assimilative information system as described above. Cognivistic mapping, pragmatistic informational design gathering, and holistic classifiable and distributive techniques are potentially abusive systems whose outputs can be easily misused by businesses when researching the firm's customers.
Medical data mining: knowledge discovery in a clinical data warehouse.
Prather, J. C.; Lobach, D. F.; Goodwin, L. K.; Hales, J. W.; Hage, M. L.; Hammond, W. E.
1997-01-01
Clinical databases have accumulated large quantities of information about patients and their medical conditions. Relationships and patterns within this data could provide new medical knowledge. Unfortunately, few methodologies have been developed and applied to discover this hidden knowledge. In this study, the techniques of data mining (also known as Knowledge Discovery in Databases) were used to search for relationships in a large clinical database. Specifically, data accumulated on 3,902 obstetrical patients were evaluated for factors potentially contributing to preterm birth using exploratory factor analysis. Three factors were identified by the investigators for further exploration. This paper describes the processes involved in mining a clinical database including data warehousing, data query and cleaning, and data analysis. PMID:9357597
Temporal data mining for the quality assessment of hemodialysis services.
Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto
2005-05-01
This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for the assessment of the clinical performance of hemodialysis (HD) services, on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge on the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. Such methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters and the measured variables were examined by the domain experts, which were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians to improve their understanding of the patients' behavior.
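As a rough illustration of the pre-processing and rule-evaluation ideas mentioned in this abstract, the sketch below performs a simple state-based temporal abstraction (turning a numeric series into labeled intervals) and then computes the support and confidence of one candidate rule across sessions. It is not the authors' method: the thresholds, the variable name `pressure`, the `outcome_ok` flag, and all data values are hypothetical.

```python
from itertools import groupby


def state_abstraction(series, low, high):
    """Label each sample as low/normal/high, then merge consecutive equal
    labels into (label, start_index, end_index) intervals."""
    labels = ["low" if x < low else "high" if x > high else "normal" for x in series]
    intervals, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        intervals.append((label, start, start + length - 1))
        start += length
    return intervals


def rule_support_confidence(sessions, antecedent, consequent):
    """Support and confidence of the rule `antecedent -> consequent`,
    where both are predicates evaluated on a whole session."""
    n_antecedent = sum(1 for s in sessions if antecedent(s))
    n_both = sum(1 for s in sessions if antecedent(s) and consequent(s))
    support = n_both / len(sessions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical thresholds and sessions, invented for illustration only.
LOW, HIGH = 90, 140
sessions = [
    {"pressure": [100, 85, 80, 82, 95], "outcome_ok": False},
    {"pressure": [100, 98, 96, 99, 101], "outcome_ok": True},
    {"pressure": [100, 88, 84, 86, 97], "outcome_ok": False},
    {"pressure": [100, 85, 83, 81, 99], "outcome_ok": True},
]


def has_long_low_episode(session):
    """Antecedent: a 'low' interval lasting at least three samples."""
    return any(
        label == "low" and end - start + 1 >= 3
        for label, start, end in state_abstraction(session["pressure"], LOW, HIGH)
    )


def unsatisfactory(session):
    """Consequent: the session outcome was flagged as not satisfactory."""
    return not session["outcome_ok"]


support, confidence = rule_support_confidence(sessions, has_long_low_episode, unsatisfactory)
print(state_abstraction(sessions[0]["pressure"], LOW, HIGH))
print(support, confidence)
```

Working on intervals rather than raw samples is what lets a rule refer to qualitative episodes ("a sustained low period") instead of individual measurements, which is also the form in which a domain expert can judge whether a rule is plausible.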
An Evaluation of Text Mining Tools as Applied to Selected Scientific and Engineering Literature.
ERIC Educational Resources Information Center
Trybula, Walter J.; Wyllys, Ronald E.
2000-01-01
Addresses an approach to the discovery of scientific knowledge through an examination of data mining and text mining techniques. Presents the results of experiments that investigated knowledge acquisition from a selected set of technical documents by domain experts. (Contains 15 references.) (Author/LRW)
Integrative Systems Biology for Data Driven Knowledge Discovery
Greene, Casey S.; Troyanskaya, Olga G.
2015-01-01
Integrative systems biology is an approach that brings together diverse high throughput experiments and databases to gain new insights into biological processes or systems at molecular through physiological levels. These approaches rely on diverse high-throughput experimental techniques that generate heterogeneous data by assaying varying aspects of complex biological processes. Computational approaches are necessary to provide an integrative view of these experimental results and enable data-driven knowledge discovery. Hypotheses generated from these approaches can direct definitive molecular experiments in a cost effective manner. Using integrative systems biology approaches, we can leverage existing biological knowledge and large-scale data to improve our understanding of yet unknown components of a system of interest and how its malfunction leads to disease. PMID:21044756
Cost-Benefit Analysis of Confidentiality Policies for Advanced Knowledge Management Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
May, D
Knowledge Discovery (KD) processes can create new information within a Knowledge Management (KM) system. In many domains, including government, this new information must be secured against unauthorized disclosure. Applying an appropriate confidentiality policy achieves this. However, it is not evident which confidentiality policy to apply, especially when the goals of sharing and disseminating knowledge have to be balanced with the requirements to secure knowledge. This work proposes to solve this problem by developing a cost-benefit analysis technique for examining the tradeoffs between securing and sharing discovered knowledge.
Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B
2009-03-01
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
ERIC Educational Resources Information Center
Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.
2000-01-01
These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)
Modalities, Relations, and Learning
NASA Astrophysics Data System (ADS)
Müller, Martin Eric
While the popularity of statistical, probabilistic and exhaustive machine learning techniques still increases, relational and logic approaches are still a niche market in research. While the former approaches focus on predictive accuracy, the latter ones prove to be indispensable in knowledge discovery.
Schulthess, Pascal; van Wijk, Rob C; Krekels, Elke H J; Yates, James W T; Spaink, Herman P; van der Graaf, Piet H
2018-04-25
To advance the systems approach in pharmacology, experimental models and computational methods need to be integrated from early drug discovery onward. Here, we propose outside-in model development, a model identification technique to understand and predict the dynamics of a system without requiring prior biological and/or pharmacological knowledge. The advanced data required could be obtained by whole vertebrate, high-throughput, low-resource dose-exposure-effect experimentation with the zebrafish larva. Combinations of these innovative techniques could improve early drug discovery. © 2018 The Authors CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals, Inc. on behalf of American Society for Clinical Pharmacology and Therapeutics.
Lötsch, Jörn; Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred
2017-01-01
Genes causally involved in human insensitivity to pain provide a unique molecular source of studying the pathophysiology of pain and the development of novel analgesic drugs. The increasing availability of “big data” enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 20 genes causally involved in human hereditary insensitivity to pain with the knowledge about the functions of thousands of genes. An integrated computational analysis proposed that among the functions of this set of genes, the processes related to nervous system development and to ceramide and sphingosine signaling pathways are particularly important. This is in line with earlier suggestions to use these pathways as therapeutic target in pain. Following identification of the biological processes characterizing hereditary insensitivity to pain, the biological processes were used for a similarity analysis with the functions of n = 4,834 database-queried drugs. Using emergent self-organizing maps, a cluster of n = 22 drugs was identified sharing important functional features with hereditary insensitivity to pain. Several members of this cluster had been implicated in pain in preclinical experiments. Thus, the present concept of machine-learned knowledge discovery for pain research provides biologically plausible results and seems to be suitable for drug discovery by identifying a narrow choice of repurposing candidates, demonstrating that contemporary machine-learned methods offer innovative approaches to knowledge discovery from available evidence. PMID:28848388
NASA Astrophysics Data System (ADS)
Ganzert, Steven; Guttmann, Josef; Steinmann, Daniel; Kramer, Stefan
Lung protective ventilation strategies reduce the risk of ventilator associated lung injury. To develop such strategies, knowledge about mechanical properties of the mechanically ventilated human lung is essential. This study was designed to develop an equation discovery system to identify mathematical models of the respiratory system in time-series data obtained from mechanically ventilated patients. Two techniques were combined: (i) the use of declarative bias to reduce search space complexity while inherently providing for the processing of background knowledge, and (ii) a newly developed heuristic for traversing the hypothesis space with a greedy, randomized strategy analogous to the GSAT algorithm. In 96.8% of all runs, the applied equation discovery system was able to detect the well-established equation of motion model of the respiratory system in the provided data. We see the potential of this semi-automatic approach to detect more complex mathematical descriptions of the respiratory system from respiratory data.
ERIC Educational Resources Information Center
Alfonseca, Enrique; Rodriguez, Pilar; Perez, Diana
2007-01-01
This work describes a framework that combines techniques from Adaptive Hypermedia and Natural Language processing in order to create, in a fully automated way, on-line information systems from linear texts in electronic format, such as textbooks. The process is divided into two steps: an "off-line" processing step, which analyses the source text,…
Healthcare applications of knowledge discovery in databases.
DeGruy, K B
2000-01-01
Many healthcare leaders find themselves overwhelmed with data, but lack the information they need to make informed decisions. Knowledge discovery in databases (KDD) can help organizations turn their data into information. KDD is the process of finding complex patterns and relationships in data. The tools and techniques of KDD have achieved impressive results in other industries, and healthcare needs to take advantage of advances in this exciting field. Recent advances in the KDD field have brought it from the realm of research institutions and large corporations to many smaller companies. Software and hardware advances enable small organizations to tap the power of KDD using desktop PCs. KDD has been used extensively for fraud detection and focused marketing. There is a wealth of data available within the healthcare industry that would benefit from the application of KDD tools and techniques. Providers and payers have a vast quantity of data (such as charges and claims), but no effective way to analyze the data to accurately determine relationships and trends. Organizations that take advantage of KDD techniques will find that they offer valuable assistance in the quest to lower healthcare costs while improving healthcare quality.
Introducing the Big Knowledge to Use (BK2U) challenge.
Perl, Yehoshua; Geller, James; Halper, Michael; Ochs, Christopher; Zheng, Ling; Kapusnik-Uner, Joan
2017-01-01
The purpose of the Big Data to Knowledge initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is so large that it resists comprehension, referred to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge, Big Knowledge to Use. Without a high-level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of the knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule-based BK (rule BK) and demonstrate the usefulness of summarization and visualization techniques of assertion BK for clinical phenotyping. As an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping, compared to the traditional approach. We also demonstrate the usefulness of summarization and visualization techniques of rule BK for drug-drug interaction discovery. © 2016 New York Academy of Sciences.
Introducing the Big Knowledge to Use (BK2U) challenge
Perl, Yehoshua; Geller, James; Halper, Michael; Ochs, Christopher; Zheng, Ling; Kapusnik-Uner, Joan
2016-01-01
The purpose of the Big Data to Knowledge (BD2K) initiative is to develop methods for discovering new knowledge from large amounts of data. However, if the resulting knowledge is so large that it resists comprehension, referred to here as Big Knowledge (BK), how can it be used properly and creatively? We call this secondary challenge, Big Knowledge to Use (BK2U). Without a high-level mental representation of the kinds of knowledge in a BK knowledgebase, effective or innovative use of the knowledge may be limited. We describe summarization and visualization techniques that capture the big picture of a BK knowledgebase, possibly created from Big Data. In this research, we distinguish between assertion BK and rule-based BK and demonstrate the usefulness of summarization and visualization techniques of assertion BK for clinical phenotyping. As an example, we illustrate how a summary of many intracranial bleeding concepts can improve phenotyping, compared to the traditional approach. We also demonstrate the usefulness of summarization and visualization techniques of rule-based BK for drug–drug interaction discovery. PMID:27750400
Li, Yan; Thomas, Manoj; Osei-Bryson, Kweku-Muata; Levy, Jason
2016-01-01
With the growing popularity of data analytics and data science in the field of environmental risk management, a formalized Knowledge Discovery via Data Analytics (KDDA) process that incorporates all applicable analytical techniques for a specific environmental risk management problem is essential. In this emerging field, there is limited research dealing with the use of decision support to elicit environmental risk management (ERM) objectives and identify analytical goals from ERM decision makers. In this paper, we address problem formulation in the ERM understanding phase of the KDDA process. We build a DM3 ontology to capture ERM objectives and to inference analytical goals and associated analytical techniques. A framework to assist decision making in the problem formulation process is developed. It is shown how the ontology-based knowledge system can provide structured guidance to retrieve relevant knowledge during problem formulation. The importance of not only operationalizing the KDDA approach in a real-world environment but also evaluating the effectiveness of the proposed procedure is emphasized. We demonstrate how ontology inferencing may be used to discover analytical goals and techniques by conceptualizing Hazardous Air Pollutants (HAPs) exposure shifts based on a multilevel analysis of the level of urbanization (and related economic activity) and the degree of Socio-Economic Deprivation (SED) at the local neighborhood level. The HAPs case highlights not only the role of complexity in problem formulation but also the need for integrating data from multiple sources and the importance of employing appropriate KDDA modeling techniques. Challenges and opportunities for KDDA are summarized with an emphasis on environmental risk management and HAPs. PMID:27983713
Li, Yan; Thomas, Manoj; Osei-Bryson, Kweku-Muata; Levy, Jason
2016-12-15
With the growing popularity of data analytics and data science in the field of environmental risk management, a formalized Knowledge Discovery via Data Analytics (KDDA) process that incorporates all applicable analytical techniques for a specific environmental risk management problem is essential. In this emerging field, there is limited research dealing with the use of decision support to elicit environmental risk management (ERM) objectives and identify analytical goals from ERM decision makers. In this paper, we address problem formulation in the ERM understanding phase of the KDDA process. We build a DM³ ontology to capture ERM objectives and to inference analytical goals and associated analytical techniques. A framework to assist decision making in the problem formulation process is developed. It is shown how the ontology-based knowledge system can provide structured guidance to retrieve relevant knowledge during problem formulation. The importance of not only operationalizing the KDDA approach in a real-world environment but also evaluating the effectiveness of the proposed procedure is emphasized. We demonstrate how ontology inferencing may be used to discover analytical goals and techniques by conceptualizing Hazardous Air Pollutants (HAPs) exposure shifts based on a multilevel analysis of the level of urbanization (and related economic activity) and the degree of Socio-Economic Deprivation (SED) at the local neighborhood level. The HAPs case highlights not only the role of complexity in problem formulation but also the need for integrating data from multiple sources and the importance of employing appropriate KDDA modeling techniques. Challenges and opportunities for KDDA are summarized with an emphasis on environmental risk management and HAPs.
How can knowledge discovery methods uncover spatio-temporal patterns in environmental data?
NASA Astrophysics Data System (ADS)
Wachowicz, Monica
2000-04-01
This paper proposes the integration of KDD, GVis and STDB as a long-term strategy, which will allow users to apply knowledge discovery methods for uncovering spatio-temporal patterns in environmental data. The main goal is to combine innovative techniques and associated tools for exploring very large environmental data sets in order to arrive at valid, novel, potentially useful, and ultimately understandable spatio-temporal patterns. The GeoInsight approach is described using the principles and key developments in the research domains of KDD, GVis, and STDB. The GeoInsight approach aims at the integration of these research domains in order to provide tools for performing information retrieval, exploration, analysis, and visualization. The result is a knowledge-based design, which involves visual thinking (perceptual-cognitive process) and automated information processing (computer-analytical process).
Virtual Observatories, Data Mining, and Astroinformatics
NASA Astrophysics Data System (ADS)
Borne, Kirk
The historical, current, and future trends in knowledge discovery from data in astronomy are presented here. The story begins with a brief history of data gathering and data organization. A description of the development of new information science technologies for astronomical discovery is then presented. Among these are e-Science and the virtual observatory, with its data discovery, access, display, and integration protocols; astroinformatics and data mining for exploratory data analysis, information extraction, and knowledge discovery from distributed data collections; new sky surveys' databases, including rich multivariate observational parameter sets for large numbers of objects; and the emerging discipline of data-oriented astronomical research, called astroinformatics. Astroinformatics is described as the fourth paradigm of astronomical research, following the three traditional research methodologies: observation, theory, and computation/modeling. Astroinformatics research areas include machine learning, data mining, visualization, statistics, semantic science, and scientific data management. Each of these areas is now an active research discipline, with significant science-enabling applications in astronomy. Research challenges and sample research scenarios are presented in these areas, in addition to sample algorithms for data-oriented research. These information science technologies enable scientific knowledge discovery from the increasingly large and complex data collections in astronomy. The education and training of the modern astronomy student must consequently include skill development in these areas, whose practitioners have traditionally been limited to applied mathematicians, computer scientists, and statisticians. Modern astronomical researchers must cross these traditional discipline boundaries, thereby borrowing the best-of-breed methodologies from multiple disciplines. 
In the era of large sky surveys and numerous large telescopes, the potential for astronomical discovery is equally large, and so the data-oriented research methods, algorithms, and techniques that are presented here will enable the greatest discovery potential from the ever-growing data and information resources in astronomy.
García-Alonso, Carlos; Pérez-Naranjo, Leonor
2009-01-01
Introduction: Knowledge management, based on information transfer between experts and analysts, is crucial for the validity and usability of data envelopment analysis (DEA). Aim: To design and develop a methodology: i) to assess the technical efficiency of small health areas (SHA) under uncertainty, and ii) to transfer information between experts and operational models, in both directions, to improve experts' knowledge. Method: A procedure derived from knowledge discovery from data (KDD) is used to select, interpret and weigh DEA inputs and outputs. Based on KDD results, an expert-driven Monte-Carlo DEA model has been designed to assess the technical efficiency of SHA in Andalusia. Results: In terms of probability, SHA 29 is the most efficient, whereas SHA 22 is very inefficient. 73% of analysed SHA have a probability of being efficient (Pe) >0.9 and 18% <0.5. Conclusions: Expert knowledge is necessary to design and validate any operational model. KDD techniques make the transfer of information from experts to any operational model easy, and results obtained from the latter improve experts' knowledge.
Artificial intelligence techniques for monitoring dangerous infections.
Lamma, Evelina; Mello, Paola; Nanetti, Anna; Riguzzi, Fabrizio; Storari, Sergio; Valastro, Gianfranco
2006-01-01
The monitoring and detection of nosocomial infections is a very important problem arising in hospitals. A hospital-acquired or nosocomial infection is a disease that develops after admission into the hospital and it is the consequence of a treatment, not necessarily a surgical one, performed by the medical staff. Nosocomial infections are dangerous because they are caused by bacteria which have dangerous (critical) resistance to antibiotics. This problem is very serious all over the world. In Italy, almost 5-8% of the patients admitted into hospitals develop this kind of infection. In order to reduce this figure, policies for controlling infections should be adopted by medical practitioners. In order to support them in this complex task, we have developed a system, called MERCURIO, capable of managing different aspects of the problem. The objectives of this system are the validation of microbiological data and the creation of a real time epidemiological information system. The system is useful for laboratory physicians, because it supports them in the execution of the microbiological analyses; for clinicians, because it supports them in the definition of the prophylaxis, of the most suitable antibiotic therapy and in monitoring patients' infections; and for epidemiologists, because it allows them to identify outbreaks and to study infection dynamics. In order to achieve these objectives, we have adopted expert system and data mining techniques. We have also integrated a statistical module that monitors the diffusion of nosocomial infections over time in the hospital, and that strictly interacts with the knowledge based module. Data mining techniques have been used for improving the system knowledge base. The knowledge discovery process is not antithetic, but complementary to the one based on manual knowledge elicitation. 
In order to verify the reliability of the tasks performed by MERCURIO and the usefulness of the knowledge discovery approach, we performed a test based on a dataset of real infection events. In the validation task MERCURIO achieved an accuracy of 98.5%, a sensitivity of 98.5% and a specificity of 99%. In the therapy suggestion task, MERCURIO achieved very high accuracy and specificity as well. The executed test provided many insights to experts, too (we discovered some of their mistakes). The knowledge discovery approach was very effective in validating part of the MERCURIO knowledge base, and also in extending it with new validation rules, confirmed by interviewed microbiologists and specific to the hospital laboratory under consideration.
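The accuracy, sensitivity and specificity figures quoted for the validation task are standard confusion-matrix ratios. A minimal sketch of how such figures are derived is given here; the function name `validation_metrics` and the counts in the usage line are hypothetical and are not taken from the MERCURIO evaluation.

```python
def validation_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positive cases correctly/incorrectly classified,
    tn/fp: negative cases correctly/incorrectly classified.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of true positives that were detected
        "specificity": tn / (tn + fp),      # share of true negatives that were rejected
    }

# Illustrative counts only (hypothetical, not the study's data).
metrics = validation_metrics(tp=90, fp=5, tn=95, fn=10)
```

Reporting sensitivity and specificity alongside accuracy matters when positive and negative cases are unbalanced, since accuracy alone can hide poor detection of the rarer class.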
NATURAL PRODUCTS: A CONTINUING SOURCE OF NOVEL DRUG LEADS
Cragg, Gordon M.; Newman, David J.
2013-01-01
1. Background: Nature has been a source of medicinal products for millennia, with many useful drugs developed from plant sources. Following discovery of the penicillins, drug discovery from microbial sources occurred and diving techniques in the 1970s opened the seas. Combinatorial chemistry (late 1980s) shifted the focus of drug discovery efforts from Nature to the laboratory bench. 2. Scope of Review: This review traces natural products drug discovery, outlining important drugs from natural sources that revolutionized treatment of serious diseases. It is clear Nature will continue to be a major source of new structural leads, and effective drug development depends on multidisciplinary collaborations. 3. Major Conclusions: The explosion of genetic information led not only to novel screens, but the genetic techniques permitted the implementation of combinatorial biosynthetic technology and genome mining. The knowledge gained has allowed unknown molecules to be identified. These novel bioactive structures can be optimized by using combinatorial chemistry generating new drug candidates for many diseases. 4. General Significance: The advent of genetic techniques that permitted the isolation / expression of biosynthetic cassettes from microbes may well be the new frontier for natural products lead discovery. It is now apparent that biodiversity may be much greater in those organisms. The numbers of potential species involved in the microbial world are many orders of magnitude greater than those of plants and multi-celled animals. Coupling these numbers to the number of currently unexpressed biosynthetic clusters now identified (>10 per species) the potential of microbial diversity remains essentially untapped. PMID:23428572
Computational methods in drug discovery
Leelananda, Sumudu P
2016-01-01
The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein–ligand docking, pharmacophore modeling and QSAR techniques are reviewed. PMID:28144341
Computational methods in drug discovery.
Leelananda, Sumudu P; Lindert, Steffen
2016-01-01
The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein-ligand docking, pharmacophore modeling and QSAR techniques are reviewed.
Introduction to fragment-based drug discovery.
Erlanson, Daniel A
2012-01-01
Fragment-based drug discovery (FBDD) has emerged in the past decade as a powerful tool for discovering drug leads. The approach first identifies starting points: very small molecules (fragments) that are about half the size of typical drugs. These fragments are then expanded or linked together to generate drug leads. Although the origins of the technique date back some 30 years, it was only in the mid-1990s that experimental techniques became sufficiently sensitive and rapid for the concept to become practical. Since that time, the field has exploded: FBDD has played a role in discovery of at least 18 drugs that have entered the clinic, and practitioners of FBDD can be found throughout the world in both academia and industry. Literally dozens of reviews have been published on various aspects of FBDD or on the field as a whole, as have three books (Jahnke and Erlanson, Fragment-based approaches in drug discovery, 2006; Zartler and Shapiro, Fragment-based drug discovery: a practical approach, 2008; Kuo, Fragment based drug design: tools, practical approaches, and examples, 2011). However, this chapter will assume that the reader is approaching the field with little prior knowledge. It will introduce some of the key concepts, set the stage for the chapters to follow, and demonstrate how X-ray crystallography plays a central role in fragment identification and advancement.
Collaborative filtering to improve navigation of large radiology knowledge resources.
Kahn, Charles E
2005-06-01
Collaborative filtering is a knowledge-discovery technique that can help guide readers to items of potential interest based on the experience of prior users. This study sought to determine the impact of collaborative filtering on navigation of a large, Web-based radiology knowledge resource. Collaborative filtering was applied to a collection of 1,168 radiology hypertext documents available via the Internet. An item-based collaborative filtering algorithm identified each document's six most closely related documents based on 248,304 page views in an 18-day period. Documents were amended to include links to their related documents, and use was analyzed over the next 5 days. The mean number of documents viewed per visit increased from 1.57 to 1.74 (P < 0.0001). Collaborative filtering can increase a radiology information resource's utilization and can improve its usefulness and ease of navigation. The technique holds promise for improving navigation of large Internet-based radiology knowledge resources.
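As a rough illustration of the item-based approach described above, the sketch below ranks, for each document, the other documents most often viewed in the same visit. The abstract does not state which similarity measure was used; cosine similarity over binary visit vectors is an assumption here, and the function name `related_documents` and the sample visits are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def related_documents(visits, k=6):
    """Return, for each document, its k most similar documents.

    visits: iterable of sets, each holding the documents viewed in one visit.
    Similarity (an assumed choice): co-views / sqrt(views_a * views_b).
    """
    views = defaultdict(int)      # document -> number of visits that include it
    co_views = defaultdict(int)   # (doc_a, doc_b) -> number of visits with both
    for visit in visits:
        docs = sorted(set(visit))
        for doc in docs:
            views[doc] += 1
        for a, b in combinations(docs, 2):
            co_views[(a, b)] += 1

    scores = defaultdict(dict)
    for (a, b), n in co_views.items():
        sim = n / sqrt(views[a] * views[b])
        scores[a][b] = sim
        scores[b][a] = sim

    # Rank by descending similarity; ties are broken by document name.
    return {
        doc: [other for other, _ in
              sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for doc, sims in scores.items()
    }

# Hypothetical page-view log: three visits to a small document collection.
visits = [{"chest", "lung"}, {"chest", "lung", "rib"}, {"rib", "knee"}]
related = related_documents(visits, k=2)
```

With `k=6`, the returned lists correspond to the six related-document links per page that the study added; the similarity table is computed once from the log rather than per request, which is the usual reason for preferring item-based over user-based filtering on a read-mostly collection.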
PKDE4J: Entity and relation extraction for public knowledge discovery.
Song, Min; Kim, Won Chul; Lee, Dahee; Heo, Go Eun; Kang, Keun Young
2015-10-01
Due to an enormous number of scientific publications that cannot be handled manually, there is a rising interest in text-mining techniques for automated information extraction, especially in the biomedical field. Such techniques provide effective means of information search, knowledge discovery, and hypothesis generation. Most previous studies have primarily focused on the design and performance improvement of either named entity recognition or relation extraction. In this paper, we present PKDE4J, a comprehensive text-mining system that integrates dictionary-based entity extraction and rule-based relation extraction in a highly flexible and extensible framework. Starting with the Stanford CoreNLP, we developed the system to cope with multiple types of entities and relations. The system also has fairly good performance in terms of accuracy as well as the ability to configure text-processing components. We demonstrate its competitive performance by evaluating it on many corpora and found that it surpasses existing systems with average F-measures of 85% for entity extraction and 81% for relation extraction. Copyright © 2015 Elsevier Inc. All rights reserved.
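The F-measure used to report extraction performance combines precision and recall into a single score. A minimal sketch follows, assuming the common balanced form (beta = 1); the abstract does not say which weighting was used, and the input values in the usage line are illustrative rather than PKDE4J results.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta score).

    beta = 1 weighs precision and recall equally; beta > 1 favours recall.
    Returns 0.0 when both inputs are zero to avoid division by zero.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values only (hypothetical).
f1 = f_measure(0.8, 0.9)
```

Because it is a harmonic mean, the score stays low unless both precision and recall are high, which is why it is preferred over a simple average for extraction tasks.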
Recommendation Techniques for Drug-Target Interaction Prediction and Drug Repositioning.
Alaimo, Salvatore; Giugno, Rosalba; Pulvirenti, Alfredo
2016-01-01
The use of computational methods in drug discovery is common practice. More recently, by exploiting the wealth of biological knowledge bases, a novel approach called drug repositioning has emerged. Several computational methods are available, and these try to integrate all the available knowledge at a high level in order to discover unknown mechanisms. In this chapter, we review drug-target interaction prediction methods based on a recommendation system. We also give some extensions which go beyond the bipartite network case.
NASA Astrophysics Data System (ADS)
McGibbney, L. J.; Jiang, Y.; Burgess, A. B.
2017-12-01
Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information from across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent existing knowledge present across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources, e.g. Web pages, Web Services, etc., via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project enables the advancement of ESIP collaboration areas including both Discovery and Semantic Technologies by putting graph information right at our fingertips in an interactive, modern manner and reducing the effort of constructing ontologies. 
To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.
The relation between prior knowledge and students' collaborative discovery learning processes
NASA Astrophysics Data System (ADS)
Gijlers, Hannie; de Jong, Ton
2005-03-01
In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction with the environment was logged. Based on students' individual judgments of the truth-value and testability of a series of domain-specific propositions, a detailed description of the knowledge configuration for each dyad was created before they entered the learning environment. Qualitative analyses of two dialogues illustrated that prior knowledge influences the discovery learning processes, and knowledge development in a pair of students. Assessments of student and dyad definitional (domain-specific) knowledge, generic (mathematical and graph) knowledge, and generic (discovery) skills were related to the students' dialogue in different discovery learning processes. Results show that a high level of definitional prior knowledge is positively related to the proportion of communication regarding the interpretation of results. Heterogeneity with respect to generic prior knowledge was positively related to the number of utterances made in the discovery process categories hypotheses generation and experimentation. Results of the qualitative analyses indicated that collaboration between extremely heterogeneous dyads is difficult when the high achiever is not willing to scaffold information and work in the low achiever's zone of proximal development.
DataHub: Knowledge-based data management for data discovery
NASA Astrophysics Data System (ADS)
Handley, Thomas H.; Li, Y. Philip
1993-08-01
Currently available database technology is largely designed for business data-processing applications, and seems inadequate for scientific applications. The research described in this paper, the DataHub, will address the issues associated with this shortfall in technology utilization and development. The DataHub development is addressing the key issues in scientific data management of scientific database models and resource sharing in a geographically distributed, multi-disciplinary, science research environment. Thus, the DataHub will be a server between the data suppliers and data consumers to facilitate data exchanges, to assist science data analysis, and to provide a systematic approach for science data management. More specifically, the DataHub's objectives are to provide support for (1) exploratory data analysis (i.e., data driven analysis); (2) data transformations; (3) data semantics capture and usage; (4) analysis-related knowledge capture and usage; and (5) data discovery, ingestion, and extraction. Applying technologies ranging from deductive databases, semantic data models, data discovery, knowledge representation and inferencing, and exploratory data analysis techniques to modern man-machine interfaces, DataHub will provide a prototype, integrated environment to support research scientists' needs in multiple disciplines (i.e. oceanography, geology, and atmospheric science) while addressing the more general science data management issues. Additionally, the DataHub will provide data management services to exploratory data analysis applications such as LinkWinds and NCSA's XIMAGE.
ERIC Educational Resources Information Center
Mohamed, Fahim; Abdeslam, Jakimi; Lahcen, El Bermi
2017-01-01
Virtual Environments for Training (VET) are useful tools for visualization and discovery as well as for training. VETs are based on virtual reality techniques that put learners in training situations that emulate genuine situations. VETs have proven to be advantageous in putting learners into varied training situations to acquire knowledge and…
Analytics for Cyber Network Defense
DOE Office of Scientific and Technical Information (OSTI.GOV)
Plantenga, Todd; Kolda, Tamara Gibson
2011-06-01
This report provides a brief survey of analytics tools considered relevant to cyber network defense (CND). Ideas and tools come from fields such as statistics, data mining, and knowledge discovery. Some analytics are considered standard mathematical or statistical techniques, while others reflect current research directions. In all cases the report attempts to explain the relevance to CND with brief examples.
Information Fusion - Methods and Aggregation Operators
NASA Astrophysics Data System (ADS)
Torra, Vicenç
Information fusion techniques are commonly applied in Data Mining and Knowledge Discovery. In this chapter, we give an overview of such applications considering their three main uses. That is, we consider fusion methods for data preprocessing, model building and information extraction. Some aggregation operators (i.e. particular fusion methods) and their properties are briefly described as well.
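To make the notion of an aggregation operator concrete, the sketch below implements two widely used ones, the weighted mean and the ordered weighted averaging (OWA) operator. The abstract does not name the operators the chapter covers, so treating these two as representative is an assumption; the function names and sample values are hypothetical.

```python
import math

def _check(values, weights):
    """Validate that weights match the values and form a convex combination."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")

def weighted_mean(values, weights):
    """Weights are attached to sources: weights[i] is the importance of values[i]."""
    _check(values, weights)
    return sum(w * v for w, v in zip(weights, values))

def owa(values, weights):
    """Weights are attached to ranks: weights[0] applies to the largest value."""
    _check(values, weights)
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))

# Hypothetical scores reported by three sources for the same item.
scores = [0.2, 0.9, 0.5]
wm = weighted_mean(scores, [0.5, 0.3, 0.2])
ow = owa(scores, [0.5, 0.3, 0.2])
```

The same weight vector gives different results because the weighted mean weights the source while OWA weights the position after sorting; with weights `[1, 0, 0]` OWA reduces to the maximum and with `[0, 0, 1]` to the minimum.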
ERIC Educational Resources Information Center
National Academies Press, 2013
2013-01-01
Spurred on by new discoveries and rapid technological advances, the capacity for life science research is expanding across the globe, and with it come concerns about the unintended impacts of research on the physical and biological environment, human well-being, or the deliberate misuse of knowledge, tools, and techniques to cause harm. This…
Recent advances in inkjet dispensing technologies: applications in drug discovery.
Zhu, Xiangcheng; Zheng, Qiang; Yang, Hu; Cai, Jin; Huang, Lei; Duan, Yanwen; Xu, Zhinan; Cen, Peilin
2012-09-01
Inkjet dispensing technology is a promising fabrication methodology widely applied in drug discovery. The automated programmable characteristics and high-throughput efficiency make this approach potentially very useful in miniaturizing the design patterns for assays and drug screening. Various custom-made inkjet dispensing systems as well as specialized bio-ink and substrates have been developed and applied to fulfill the increasing demands of basic drug discovery studies. The incorporation of other modern technologies has further exploited the potential of inkjet dispensing technology in drug discovery and development. This paper reviews and discusses the recent developments and practical applications of inkjet dispensing technology in several areas of drug discovery and development including fundamental assays of cells and proteins, microarrays, biosensors, tissue engineering, and basic biological and pharmaceutical studies. Progress in a number of areas of research including biomaterials, inkjet mechanical systems and modern analytical techniques as well as the exploration and accumulation of profound biological knowledge has enabled different inkjet dispensing technologies to be developed and adapted for high-throughput pattern fabrication and miniaturization. This in turn presents a great opportunity to propel inkjet dispensing technology into drug discovery.
Information Fusion for Natural and Man-Made Disasters
2007-01-31
comprehensively large, and metaphysically accurate model of situations, through which specific tasks such as situation assessment, knowledge discovery, or the...significance” is always context specific. Event discovery is a very important element of the HLF process, which can lead to knowledge discovery about...expected, given the current state of knowledge. Examples of such behavior may include discovery of a new aggregate or situation, a specific pattern of
A New System To Support Knowledge Discovery: Telemakus.
ERIC Educational Resources Information Center
Revere, Debra; Fuller, Sherrilynne S.; Bugni, Paul F.; Martin, George M.
2003-01-01
The Telemakus System builds on the areas of concept representation, schema theory, and information visualization to enhance knowledge discovery from scientific literature. This article describes the underlying theories and an overview of a working implementation designed to enhance the knowledge discovery process through retrieval, visual and…
First centenary of Röntgen's discovery of X-rays
NASA Astrophysics Data System (ADS)
Valkovic, V.
1996-04-01
Usually it takes a decade, or even several decades, to go from a discovery to its practical applications. This was not the case with X-rays; they were widely applied in medical and industrial radiography within a year of their discovery in 1895 by W.C. Röntgen. Today, X-ray analysis covers a wide range of techniques and fields of application: from deduction of atomic arrangements by observation of diffraction phenomena to measurements of trace element concentration levels, distributions and maps by measuring fluorescence, X-ray attenuation or scattering. Although the contribution of analytical applications of X-rays to present knowledge is difficult to surpass, modern applications cover a wide range of activities, from three-dimensional microfabrication using synchrotron radiation to collecting information from deep space by X-ray astronomy.
2017-06-27
Report date: 05-27-2017. Report type: Final. Dates covered: 17-03-2017 - 15-03-2018. Contract number: FA2386-17-1-0102. Title: Advances in Knowledge Discovery and...Springer; Switzerland. Abstract: The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference...in the areas of knowledge discovery and data mining (KDD). We had three keynote speeches, delivered by Sang Cha from Seoul National University
Visualising nursing data using correspondence analysis.
Kokol, Peter; Blažun Vošner, Helena; Železnik, Danica
2016-09-01
Digitally stored, large healthcare datasets enable nurses to use 'big data' techniques and tools in nursing research. Big data is complex and multi-dimensional, so visualisation may be a preferable approach to analyse and understand it. The aim is to demonstrate the visualisation of big data using a technique called correspondence analysis. In the authors' study, relations among data in a nursing dataset were shown visually in graphs using correspondence analysis. The case presented demonstrates that correspondence analysis is easy to use, shows relations between data visually in a form that is simple to interpret, and can reveal hidden associations between data. Correspondence analysis supports the discovery of new knowledge. Implications for practice: Knowledge obtained using correspondence analysis can be transferred immediately into practice or used to foster further research.
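The first numerical step of correspondence analysis can be sketched as follows. This is a minimal illustration, not the authors' procedure: the contingency table is invented, the function name is hypothetical, and only the standardized residuals and total inertia (chi-square statistic divided by the grand total) are computed. A full correspondence analysis would go on to take a singular value decomposition of the residual matrix to obtain the plotted coordinates.

```python
import math

def standardized_residuals(table):
    """Return (residuals, total_inertia) for a contingency table.

    Assumes every row and column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    residuals = []
    inertia = 0.0
    for i, row in enumerate(table):
        res_row = []
        for j, count in enumerate(row):
            expected = row_mass[i] * col_mass[j]  # proportion expected under independence
            s = (count / n - expected) / math.sqrt(expected)
            res_row.append(s)
            inertia += s * s
        residuals.append(res_row)
    return residuals, inertia

# Hypothetical counts: rows are two nurse groups, columns are two response categories.
associated = [[20, 5], [5, 20]]
independent = [[10, 20], [20, 40]]

res, inertia_associated = standardized_residuals(associated)
_, inertia_independent = standardized_residuals(independent)
print(inertia_associated, inertia_independent)
```

A total inertia near zero indicates that rows and columns are close to independent, so there would be little structure for a correspondence plot to reveal.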
The Relation between Prior Knowledge and Students' Collaborative Discovery Learning Processes
ERIC Educational Resources Information Center
Gijlers, Hannie; de Jong, Ton
2005-01-01
In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction…
Data mining in pharma sector: benefits.
Ranjan, Jayanthi
2009-01-01
The amount of data generated in every sector at present is enormous, and the information flow in the pharma industry is huge. Pharma firms are moving toward increasingly technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms to discover patterns that improve the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making using pharmaceutical data. This conceptual paper is based on secondary study, research and observations from magazines, reports and notes. The author lists the types of patterns that can be discovered using data mining in pharma data. The paper shows how data mining is useful in the pharma industry and how its techniques can yield good results in the pharma sector. Although much work can be produced for discovering knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data based on primary research using the available, well-known data mining tools. Research papers and conceptual papers related to data mining in the pharma industry are rare; this is the motivation for the paper.
Knowledge Discovery from Vibration Measurements
Li, Jian; Wang, Daoyao
2014-01-01
The framework, as well as the particular algorithms, of the pattern recognition process is widely adopted in structural health monitoring (SHM). However, as a part of the overall process of knowledge discovery from databases (KDD), the results of pattern recognition are only changes and patterns of changes of data features. In this paper, based on the similarity between KDD and SHM and considering the particularity of SHM problems, a four-step framework of SHM is proposed which extends the final goal of SHM from detecting damage to extracting knowledge to facilitate decision making. The purposes and proper methods of each step of this framework are discussed. To demonstrate the proposed SHM framework, a specific SHM method composed of second-order structural parameter identification, statistical control chart analysis, and system reliability analysis is then presented. To examine the performance of this SHM method, real sensor data measured from a lab-size steel bridge model structure are used. The developed four-step framework of SHM has the potential to clarify the process of SHM and to facilitate the further development of SHM techniques. PMID:24574933
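The statistical control chart step mentioned above can be illustrated with a minimal sketch. It assumes a basic Shewhart-style chart with limits at the baseline mean plus or minus three standard deviations; the abstract does not state which chart the authors used, and the stiffness values and function names below are hypothetical.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) limits estimated from baseline samples."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - k * spread, center, center + k * spread

def out_of_control(samples, limits):
    """Return indices of samples that fall outside the control limits."""
    lower, _, upper = limits
    return [i for i, x in enumerate(samples) if x < lower or x > upper]

# Hypothetical identified structural parameter (arbitrary units), healthy state.
baseline = [100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 99.7, 100.0]
# Hypothetical values from later monitoring; the last two mimic a stiffness loss.
monitored = [100.1, 99.9, 100.2, 97.5, 96.8]

limits = control_limits(baseline)
flagged = out_of_control(monitored, limits)
print(limits, flagged)
```

Flagged indices mark the measurements that would be passed on to later steps, such as a reliability analysis, rather than being a damage diagnosis in themselves.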
Realising the knowledge spiral in healthcare: the role of data mining and knowledge management.
Wickramasinghe, Nilmini; Bali, Rajeev K; Gibbons, M Chris; Schaffer, Jonathan
2008-01-01
Knowledge Management (KM) is an emerging business approach aimed at solving current problems such as competitiveness and the need to innovate which are faced by businesses today. The premise for the need for KM is based on a paradigm shift in the business environment where knowledge is central to organizational performance. Organizations trying to embrace KM have many tools, techniques and strategies at their disposal. A vital technique in KM is data mining, which enables critical knowledge to be gained from the analysis of large amounts of data and information. The healthcare industry is a very information-rich industry. The collecting of data and information permeates most, if not all, areas of this industry; however, the healthcare industry has yet to fully embrace KM, let alone the new evolving techniques of data mining. In this paper, we demonstrate the ubiquitous benefits of data mining and KM to healthcare by highlighting their potential to enable and facilitate superior clinical practice and administrative management. Specifically, we show how data mining can realize the knowledge spiral by effecting the four key transformations identified by Nonaka of turning: (1) existing explicit knowledge to new explicit knowledge, (2) existing explicit knowledge to new tacit knowledge, (3) existing tacit knowledge to new explicit knowledge and (4) existing tacit knowledge to new tacit knowledge. This is done through the establishment of theoretical models that respectively identify the function of the knowledge spiral and the powers of data mining, both exploratory and predictive, in the knowledge discovery process. Our models are then applied to a healthcare data set to demonstrate the potential of this approach as well as the implications of such an approach to the clinical and administrative aspects of healthcare. 
Further, we demonstrate how these techniques can facilitate hospitals to address the six healthcare quality dimensions identified by the Committee for Quality Healthcare.
Markatos, Konstantinos; Androutsos, Georgios; Karamanou, Marianna; Tzagkarakis, Georgios; Kaseta, Maria; Mavrogenis, Andreas
2018-05-11
The purpose of this review is to summarize the life and work of Jean-Louis Petit, his inventions, his discoveries, and his impact on the evolution of surgery of his era. A thorough search of the literature was undertaken in PubMed and Google Scholar as well as in physical books in libraries to summarize current and classic literature on Petit. Jean-Louis Petit (1674-1750) was an eminent anatomist and surgeon of his era with an invaluable contribution to clinical knowledge, surgical technique, and instrumentation as well as innovative therapeutic modalities and basic scientific discoveries. Jean-Louis Petit was an innovative anatomist and surgeon as well as an excellent clinician of his era. He revolutionized the surgical technique of his era with a significant contribution to what would later become orthopaedic surgery.
SemaTyP: a knowledge graph based literature mining method for drug discovery.
Sang, Shengtian; Yang, Zhihao; Wang, Lei; Liu, Xiaoxia; Lin, Hongfei; Wang, Jian
2018-05-30
Drug discovery is the process through which potential new medicines are identified. High-throughput screening and computer-aided drug discovery/design are currently the two main drug discovery methods, and they have successfully discovered a series of drugs. However, development of new drugs is still an extremely time-consuming and expensive process. Biomedical literature contains important clues for the identification of potential treatments. It could support experts in biomedicine on their way towards new discoveries. Here, we propose a biomedical knowledge graph-based drug discovery method called SemaTyP, which discovers candidate drugs for diseases by mining published biomedical literature. We first construct a biomedical knowledge graph with the relations extracted from biomedical abstracts. A logistic regression model is then trained by learning the semantic types of paths of known drug therapies existing in the biomedical knowledge graph. Finally, the learned model is used to discover drug therapies for new diseases. The experimental results show that our method could not only effectively discover new drug therapies for new diseases, but also could provide the potential mechanism of action of the candidate drugs. In this paper we propose a novel knowledge graph based literature mining method for drug discovery. It could be a supplementary method for current drug discovery methods.
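The idea of training a logistic regression model on path-derived features can be sketched as below. This is an illustration under stated assumptions, not the SemaTyP implementation: each drug-disease pair is represented by invented counts of two hypothetical path patterns in a knowledge graph, the labels are made up, and the model is fitted with plain batch gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Probability that the drug-disease pair is a therapy under the toy model."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features per drug-disease pair:
#   [count of "drug -> gene -> disease" paths, count of "drug -> symptom -> disease" paths]
X = [[3, 0], [2, 1], [0, 2], [0, 3]]
y = [1, 1, 0, 0]  # 1 = known therapy, 0 = not a therapy (invented labels)

w, b = train_logistic(X, y)
p_candidate = predict(w, b, [3, 1])
p_negative = predict(w, b, [0, 3])
print(p_candidate, p_negative)
```

In this toy setting a candidate pair scoring above 0.5 would be proposed for expert review; the paths contributing most to the score hint at a possible mechanism.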
A renaissance of neural networks in drug discovery.
Baskin, Igor I; Winkler, David; Tetko, Igor V
2016-08-01
Neural networks are becoming a very popular method for solving machine learning and artificial intelligence problems. The variety of neural network types and their application to drug discovery requires expert knowledge to choose the most appropriate approach. In this review, the authors discuss traditional and newly emerging neural network approaches to drug discovery. Their focus is on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning. The most important technical issues are discussed, including overfitting and its prevention through regularization, ensemble and multitask modeling, model interpretation, and estimation of applicability domain. Different aspects of using neural networks in drug discovery are considered: building structure-activity models with respect to various targets; predicting drug selectivity, toxicity profiles, ADMET and physicochemical properties; characteristics of drug-delivery systems and virtual screening. Neural networks continue to grow in importance for drug discovery. Recent developments in deep learning suggest further improvements may be gained in the analysis of large chemical data sets. It is anticipated that neural networks will be more widely used in drug discovery in the future, and applied in non-traditional areas such as drug delivery systems, biologically compatible materials, and regenerative medicine.
Bioinformatics in protein kinases regulatory network and drug discovery.
Chen, Qingfeng; Luo, Haiqiong; Zhang, Chengqi; Chen, Yi-Ping Phoebe
2015-04-01
Protein kinases have been implicated in a number of diseases, where kinases participate in many aspects that control cell growth, movement and death. Deregulated kinase activities and the knowledge of these disorders are of great clinical interest for drug discovery. The most critical issue is the development of safe and efficient disease diagnosis and treatment for less cost and in less time. It is critical to develop innovative approaches that aim at the root cause of a disease, not just its symptoms. Bioinformatics, including genetic, genomic, mathematical and computational technologies, has become the most promising option for effective drug discovery, and has shown its potential in the early stages of drug-target identification and target validation. It is essential that these aspects are understood and integrated into new methods used in drug discovery for diseases arising from deregulated kinase activity. This article reviews bioinformatics techniques for protein kinase data management and analysis, kinase pathways and drug targets, and describes their potential application in the pharmaceutical industry. Copyright © 2015 Elsevier Inc. All rights reserved.
A New Student Performance Analysing System Using Knowledge Discovery in Higher Educational Databases
ERIC Educational Resources Information Center
Guruler, Huseyin; Istanbullu, Ayhan; Karahasan, Mehmet
2010-01-01
Knowledge discovery is a wide ranged process including data mining, which is used to find out meaningful and useful patterns in large amounts of data. In order to explore the factors having impact on the success of university students, knowledge discovery software, called MUSKUP, has been developed and tested on student data. In this system a…
Building Knowledge Graphs for NASA's Earth Science Enterprise
NASA Astrophysics Data System (ADS)
Zhang, J.; Lee, T. J.; Ramachandran, R.; Shi, R.; Bao, Q.; Gatlin, P. N.; Weigel, A. M.; Maskey, M.; Miller, J. J.
2016-12-01
Inspired by Google Knowledge Graph, we have been building a prototype Knowledge Graph for Earth scientists, connecting information and data in NASA's Earth science enterprise. Our primary goal is to advance the state-of-the-art NASA knowledge extraction capability by going beyond traditional catalog search and linking different distributed information (such as data, publications, services, tools and people). This will enable a more efficient pathway to knowledge discovery. While Google Knowledge Graph provides impressive semantic-search and aggregation capabilities, it is limited to search topics for the general public. We use a similar knowledge graph approach to semantically link information gathered from a wide variety of sources within the NASA Earth Science enterprise. Our prototype serves as a proof of concept on the viability of building an operational "knowledge base" system for NASA Earth science. Information is pulled from structured sources (such as NASA CMR catalog, GCMD, and Climate and Forecast Conventions) and unstructured sources (such as research papers). Leveraging modern techniques of machine learning, information retrieval, and deep learning, we provide an integrated data mining and information discovery environment to help Earth scientists to use the best data, tools, methodologies, and models available to answer a hypothesis. Our knowledge graph would be able to answer questions like: Which articles discuss topics investigating similar hypotheses? How have these methods been tested for accuracy? Which approaches have been highly cited within the scientific community? What variables were used for this method and what datasets were used to represent them? What processing was necessary to use this data? These questions then lead researchers and citizen scientists to investigate the sources where data can be found, available user guides, information on how the data was acquired, and available tools and models to use with this data. 
As a proof of concept, we focus on a well-defined domain, Hurricane Science, linking research articles and their findings, data, people and tools/services. Modern information retrieval, natural language processing, machine learning and deep learning techniques are applied to build the knowledge network.
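The kind of linking described above can be sketched as a small in-memory store of subject-predicate-object triples. This is only an illustration: the class, the predicate names, and the article, dataset and person identifiers are all hypothetical and are not taken from the NASA prototype.

```python
from collections import defaultdict

class TripleStore:
    """Tiny in-memory graph of (subject, predicate, object) triples."""

    def __init__(self):
        self._by_subject = defaultdict(list)
        self._by_object = defaultdict(list)

    def add(self, subject, predicate, obj):
        self._by_subject[subject].append((predicate, obj))
        self._by_object[obj].append((subject, predicate))

    def objects(self, subject, predicate):
        """All objects linked from subject by predicate."""
        return [o for p, o in self._by_subject.get(subject, []) if p == predicate]

    def subjects(self, predicate, obj):
        """All subjects linked to obj by predicate."""
        return [s for s, p in self._by_object.get(obj, []) if p == predicate]

def articles_sharing_dataset(graph, article):
    """Other articles that use at least one dataset also used by `article`."""
    related = set()
    for dataset in graph.objects(article, "uses_dataset"):
        related.update(graph.subjects("uses_dataset", dataset))
    related.discard(article)
    return sorted(related)

# Hypothetical hurricane-science entities.
kg = TripleStore()
kg.add("article:A", "uses_dataset", "dataset:wind_speed")
kg.add("article:B", "uses_dataset", "dataset:wind_speed")
kg.add("article:C", "uses_dataset", "dataset:rainfall")
kg.add("article:A", "written_by", "person:P")

shared = articles_sharing_dataset(kg, "article:A")
print(shared)
```

Questions such as which datasets were used for a method reduce to short traversals of this kind once articles, data, people and tools are connected in one graph.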
Knowledge Discovery in Databases.
ERIC Educational Resources Information Center
Norton, M. Jay
1999-01-01
Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…
The role of collaborative ontology development in the knowledge negotiation process
NASA Astrophysics Data System (ADS)
Rivera, Norma
Interdisciplinary research (IDR) collaboration can be defined as the process of integrating experts' knowledge, perspectives, and resources to advance scientific discovery. The flourishing of more complex research problems, together with the growth of scientific and technical knowledge has resulted in the need for researchers from diverse fields to provide different expertise and points of view to tackle these problems. These collaborations, however, introduce a new set of "culture" barriers as participating experts are trained to communicate in discipline-specific languages, theories, and research practices. We propose that building a common knowledge base for research using ontology development techniques can provide a starting point for interdisciplinary knowledge exchange, negotiation, and integration. The goal of this work is to extend ontology development techniques to support the knowledge negotiation process in IDR groups. Towards this goal, this work presents a methodology that extends previous work in collaborative ontology development and integrates learning strategies and tools to enhance interdisciplinary research practices. We evaluate the effectiveness of applying such methodology in three different scenarios that cover educational and research settings. The results of this evaluation confirm that integrating learning strategies can, in fact, be advantageous to overall collaborative practices in IDR groups.
Searching for human oncoviruses: Histories, challenges, and opportunities.
Cao, Jian; Li, Dawei
2018-06-01
Oncoviruses contribute significantly to cancer burden. A century of tumor virological studies has led to the discovery of seven well-accepted human oncoviruses, cumulatively responsible for approximately 15% of human cancer cases. Virus-caused cancers are largely preventable through vaccination. Identifying additional oncoviruses and virus-caused tumors will advance cancer prevention and precision medicine, benefiting affected individuals and society as a whole. The historic success of finding human oncoviruses has provided a unique lesson for directing new research efforts in the post-sequencing era. Combining the experiences from these pioneer studies with emerging high-throughput techniques will certainly accelerate new discovery and advance our knowledge of the remaining human oncoviruses. © 2018 Wiley Periodicals, Inc.
Translational Research 2.0: a framework for accelerating collaborative discovery.
Asakiewicz, Chris
2014-05-01
The world wide web has revolutionized the conduct of global, cross-disciplinary research. In the life sciences, interdisciplinary approaches to problem solving and collaboration are becoming increasingly important in facilitating knowledge discovery and integration. Web 2.0 technologies promise to have a profound impact, enabling reproducibility, aiding in discovery, and accelerating and transforming medical and healthcare research across the healthcare ecosystem. However, knowledge integration and discovery require a consistent foundation upon which to operate. A foundation should be capable of addressing some of the critical issues associated with how research is conducted within the ecosystem today and how it should be conducted in the future. This article discusses a framework for enhancing collaborative knowledge discovery across the medical and healthcare research ecosystem. The framework could serve as a foundation upon which ecosystem stakeholders can enhance the way data, information and knowledge are created, shared and used to accelerate the translation of knowledge from one area of the ecosystem to another.
NASA Astrophysics Data System (ADS)
Berres, A.; Karthik, R.; Nugent, P.; Sorokine, A.; Myers, A.; Pang, H.
2017-12-01
Building an integrated data infrastructure that can meet the needs of sustainable energy-water resource management requires a robust data management and geovisual analytics platform, capable of cross-domain scientific discovery and knowledge generation. Such a platform can facilitate the investigation of diverse complex research and policy questions for emerging priorities in Energy-Water Nexus (EWN) science areas. Using advanced data analytics, machine learning techniques, multi-dimensional statistical tools, and interactive geovisualization components, such a multi-layered federated platform is being developed: the Energy-Water Nexus Knowledge Discovery Framework (EWN-KDF). This platform utilizes several enterprise-grade software design concepts and standards such as extensible service-oriented architecture, open standard protocols, event-driven programming model, enterprise service bus, and adaptive user interfaces to provide strategic value to the integrative computational and data infrastructure. EWN-KDF is built on the Compute and Data Environment for Science (CADES) environment at Oak Ridge National Laboratory (ORNL).
Knowledge discovery for pancreatic cancer using inductive logic programming.
Qiu, Yushan; Shimada, Kazuaki; Hiraoka, Nobuyoshi; Maeshiro, Kensei; Ching, Wai-Ki; Aoki-Kinoshita, Kiyoko F; Furuta, Koh
2014-08-01
Pancreatic cancer is a devastating disease, and predicting the status of patients has become an important and urgent issue. The authors explore the applicability of the inductive logic programming (ILP) method to the disease and show that accumulated clinical laboratory data can be used to predict disease characteristics, which will contribute to the selection of therapeutic modalities for pancreatic cancer. The availability of a large amount of clinical laboratory data provides clues to aid in the knowledge discovery of diseases. In predicting the differentiation of tumour and the status of lymph node metastasis in pancreatic cancer using the ILP model, three rules are developed that are consistent with descriptions in the literature. The rules that are identified are useful for detecting the differentiation of tumour and the status of lymph node metastasis in pancreatic cancer, and therefore contribute significantly to the decision of therapeutic strategies. In addition, the proposed method is compared with other typical classification techniques, and the results further confirm the superiority and merit of the proposed method.
The Analysis of Image Segmentation Hierarchies with a Graph-based Knowledge Discovery System
NASA Technical Reports Server (NTRS)
Tilton, James C.; Cooke, Diane J.; Ketkar, Nikhil; Aksoy, Selim
2008-01-01
Currently available pixel-based analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. A general consensus is that object-based image analysis (OBIA) is required to effectively analyze this type of data. OBIA is usually a two-stage process: image segmentation followed by an analysis of the segmented objects. We are exploring an approach to OBIA in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software developed at NASA GSFC are analyzed by the Subdue graph-based knowledge discovery system developed by a team at Washington State University. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and provide results on real and simulated data. We also discuss planned improvements designed to more effectively and completely convey the hierarchical segmentation information to Subdue and to improve processing efficiency.
NASA Astrophysics Data System (ADS)
Liao, Chun-Chih; Xiao, Furen; Wong, Jau-Min; Chiang, I.-Jen
Computed tomography (CT) of the brain is the preferred study in neurological emergencies. Physicians use CT to diagnose various types of intracranial hematomas, including epidural, subdural and intracerebral hematomas, according to their locations and shapes. We propose a novel method that can automatically diagnose intracranial hematomas by combining machine vision and knowledge discovery techniques. The skull on the CT slice is located and the depth of each intracranial pixel is labeled. After normalization of the pixel intensities by their depth, the hyperdense area of intracranial hematoma is segmented with multi-resolution thresholding and region-growing. We then apply the C4.5 algorithm to construct a decision tree using the features of the segmented hematoma and the diagnoses made by physicians. The algorithm was evaluated on 48 pathological images treated in a single institute. The two discovered rules closely resemble those used by human experts, and are able to make correct diagnoses in all cases.
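The thresholding and region-growing step can be illustrated with a minimal sketch. It assumes a single fixed intensity threshold and 4-connectivity on a small invented grid; the multi-resolution scheme, the depth normalization, and the actual thresholds used in the study are not reproduced here, and the function name is hypothetical.

```python
from collections import deque

def region_grow(image, seed, threshold):
    """Return the 4-connected set of pixels reachable from `seed`
    whose intensity is at least `threshold`."""
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] < threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Hypothetical normalized intensities; the bright 2x2 block mimics a hyperdense area.
image = [
    [10, 12, 11, 10, 10],
    [11, 80, 85, 12, 10],
    [10, 82, 90, 11, 75],
    [12, 11, 10, 10, 10],
]

hematoma = region_grow(image, seed=(1, 1), threshold=60)
print(sorted(hematoma))
```

The isolated bright pixel at row 2, column 4 is excluded because it is not connected to the seed; shape and location features of the grown region are the sort of inputs a decision tree learner could then use.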
Knowledge Discovery from Biomedical Ontologies in Cross Domains
Shen, Feichen; Lee, Yugyung
2016-01-01
In recent years, there has been an increasing demand for sharing and integration of medical data in biomedical research. In order to improve a health care system, it is necessary to support the integration of data by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems as the conceptual models underlying datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve the semantic interoperability between different ontologies. For this purpose, we focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies, and on the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross domain knowledge discovery from the multiple biological ontologies. Thus, patterns made a greater contribution to knowledge discovery across multiple ontologies. We have demonstrated cross domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross domain query processing approach, namely SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies. PMID:27548262
Knowledge discovery with classification rules in a cardiovascular dataset.
Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan
2005-12-01
In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rule induction called AREX, using evolutionary induction of decision trees and automatic programming, is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes which may reveal the presence of some specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of possible new medical knowledge in the field of pediatric cardiology.
Progress in Biomedical Knowledge Discovery: A 25-year Retrospective
Sacchi, L.
2016-01-01
Summary. Objectives: We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992, and then now, 25 years later, mainly focused on supervised learning. Methods: We performed a rigorous systematic search of PubMed and used latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery in and between time periods, and to compare these trends. We restricted the result set using a bracket of five years previous, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set between 2011 and 2015. This was to reflect the current literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. Results: A total of 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992-2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Conclusions: Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources, the evolution of new algorithmic approaches to knowledge discovery, and we consider from legal, social, and political perspectives possible explanations of the growth of the field. 
Finally, we reflect on the achievements of the past 25 years to consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods that could be, and are being now developed for the discovery of new knowledge in biomedical data. PMID:27488403
Progress in Biomedical Knowledge Discovery: A 25-year Retrospective.
Sacchi, L; Holmes, J H
2016-08-02
We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992, and then now, 25 years later, mainly focused on supervised learning. We performed a rigorous systematic search of PubMed and used latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery in and between time periods, and to compare these trends. We restricted the result set using a bracket of five years previous, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set between 2011 and 2015. This was to reflect the current literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. A total of 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992-2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources, the evolution of new algorithmic approaches to knowledge discovery, and we consider from legal, social, and political perspectives possible explanations of the growth of the field. 
Finally, we reflect on the achievements of the past 25 years to consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods that could be, and are being now developed for the discovery of new knowledge in biomedical data.
Berler, Alexander; Pavlopoulos, Sotiris; Koutsouris, Dimitris
2005-06-01
The advantages of introducing information and communication technologies in the complex health-care sector are well known and have been well documented. It is, nevertheless, paradoxical that although the medical community has embraced with satisfaction most of the technological discoveries that improve patient care, this has not happened with health-care informatics. Addressing this concern, our work proposes an information model for knowledge management (KM) based upon the use of key performance indicators (KPIs) in health-care systems. Based upon the balanced scorecard (BSC) framework (Kaplan/Norton) and quality assurance techniques in health care (Donabedian), this paper proposes a patient-journey-centered approach that drives information flow at all levels of the day-to-day process of delivering effective and managed care, toward information assessment and knowledge discovery. In order to persuade health-care decision-makers to assess the added value of KM tools, those tools should be used to propose new performance measurement and performance management techniques at all levels of a health-care system. The proposed KPIs form a complete set of metrics that enable the performance management of a regional health-care system. In addition, the performance framework established is technically applied by the use of state-of-the-art KM tools such as data warehouses and business intelligence information systems. In that sense, the proposed infrastructure is, technologically speaking, an important KM tool that enables knowledge sharing amongst various health-care stakeholders and between different health-care groups. The use of BSC is an enabling framework toward a KM strategy in health care.
Communication in Collaborative Discovery Learning
ERIC Educational Resources Information Center
Saab, Nadira; van Joolingen, Wouter R.; van Hout-Wolters, Bernadette H. A. M.
2005-01-01
Background: Constructivist approaches to learning focus on learning environments in which students have the opportunity to construct knowledge themselves, and negotiate this knowledge with others. "Discovery learning" and "collaborative learning" are examples of learning contexts that cater for knowledge construction processes. We introduce a…
Practice-Based Knowledge Discovery for Comparative Effectiveness Research: An Organizing Framework
Lucero, Robert J.; Bakken, Suzanne
2014-01-01
Electronic health information systems can increase the ability of health-care organizations to investigate the effects of clinical interventions. The authors present an organizing framework that integrates outcomes and informatics research paradigms to guide knowledge discovery in electronic clinical databases. They illustrate its application using the example of hospital-acquired pressure ulcers (HAPU). The Knowledge Discovery through Informatics for Comparative Effectiveness Research (KDI-CER) framework was conceived as a heuristic to conceptualize study designs and address potential methodological limitations imposed by using a single research perspective. Advances in informatics research can play a complementary role in advancing the field of outcomes research, including CER. The KDI-CER framework can be used to facilitate knowledge discovery from routinely collected electronic clinical data. PMID:25278645
NASA Astrophysics Data System (ADS)
Kadampur, Mohammad Ali; D. v. L. N., Somayajulu
Privacy-preserving data mining is the art of knowledge discovery without revealing the sensitive data in the data set. In this paper, a data transformation technique using wavelets is presented for privacy-preserving data mining. Wavelets use the well-known energy compaction approach during data transformation, and only the high-energy coefficients are published to the public domain instead of the actual data. It is found that the transformed data preserve Euclidean distances, so the method can be used in privacy-preserving clustering. Wavelets also offer inherently improved time complexity.
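The distance-preservation claim in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single level of the orthonormal Haar wavelet, two made-up records, and the simplification that only the approximation (low-pass) coefficients are published. The function names are hypothetical.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform on an even-length list.

    Returns (approximation, detail) coefficient lists.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Two hypothetical records (illustrative values only).
r1 = [4.0, 6.0, 10.0, 12.0]
r2 = [5.0, 7.0, 9.0, 15.0]

a1, d1 = haar_step(r1)
a2, d2 = haar_step(r2)

original_distance = euclidean(r1, r2)
# All coefficients kept: the transform is orthonormal, so the distance is unchanged.
full_distance = euclidean(a1 + d1, a2 + d2)
# Only the approximation coefficients published: the distance can only shrink.
published_distance = euclidean(a1, a2)
```

Because the transform is orthonormal, dropping coefficients never increases the distance between two records; the published distance stays close to the original whenever the dropped coefficients carry little energy.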
Design of Automatic Extraction Algorithm of Knowledge Points for MOOCs
Chen, Haijian; Han, Dongmei; Zhao, Lina
2015-01-01
In recent years, Massive Open Online Courses (MOOCs) have become very popular among college students and have had a powerful impact on academic institutions. In the MOOC environment, knowledge discovery and knowledge sharing are very important, and are currently often achieved by ontology techniques. In building an ontology, automatic extraction technology is crucial. Because general text mining algorithms are not very effective on online course material, we designed the automatic extracting course knowledge points (AECKP) algorithm for online courses. It includes document classification, Chinese word segmentation, and POS tagging for each document. The Vector Space Model (VSM) is used to calculate similarity and to design the weights that optimize the TF-IDF algorithm output values, and the terms with the higher scores are selected as knowledge points. Course documents of “C programming language” are selected for the experiment in this study. The results show that the proposed approach can achieve a satisfactory accuracy rate and recall rate. PMID:26448738
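The TF-IDF and vector-space steps mentioned in the abstract above can be sketched as follows. This is plain TF-IDF with cosine similarity, not the AECKP weighting itself, which the abstract does not specify; the documents, tokens, and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical, already-segmented course documents (illustrative tokens only).
docs = {
    "pointers": ["pointer", "address", "variable", "pointer", "array"],
    "arrays": ["array", "index", "array", "array", "variable"],
    "loops": ["loop", "condition", "variable", "loop", "counter"],
}

def tf_idf(documents):
    """Return {doc_name: {term: weight}} using tf * log(N / df)."""
    n = len(documents)
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in documents.values() for t in set(tokens))
    weights = {}
    for name, tokens in documents.items():
        counts = Counter(tokens)
        weights[name] = {
            t: (c / len(tokens)) * math.log(n / df[t]) for t, c in counts.items()
        }
    return weights

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

weights = tf_idf(docs)
# The highest-weighted term per document is a candidate "knowledge point".
top_terms = {name: max(w, key=w.get) for name, w in weights.items()}
similar = cosine(weights["pointers"], weights["arrays"])
unrelated = cosine(weights["pointers"], weights["loops"])
```

A term such as "variable" that appears in every document gets weight zero, so it never becomes a candidate and does not contribute to similarity.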
Novel drug discovery for Chagas disease.
Moraes, Carolina B; Franco, Caio H
2016-01-01
Chagas disease is a chronic infection associated with long-term morbidity. Increased funding and advocacy for drug discovery for neglected diseases have prompted the introduction of several important technological advances, and Chagas disease is among the neglected conditions that have benefited most from technological developments. A number of screening campaigns, and the development of new and improved in vitro and in vivo assays, have led to advances in the field of drug discovery. This review highlights the major advances in Chagas disease drug screening, and how these are being used not only to discover novel chemical entities and drug candidates, but also to increase our knowledge about the disease and the parasite. Different methodologies used for compound screening and prioritization are discussed, as well as novel techniques for the investigation of these targets. The molecular mechanism of action is also discussed. Technological advances have been executed with scientific rigour for the development of new in vitro cell-based assays and in vivo animal models, to bring about novel and better drugs for Chagas disease, as well as to increase our understanding of the properties a compound needs to be successful in the clinic. The gained knowledge, combined with new exciting approaches toward target deconvolution, will help identify new targets for Chagas disease chemotherapy in the future.
Girardi, Dominic; Küng, Josef; Kleiser, Raimund; Sonnberger, Michael; Csillag, Doris; Trenkler, Johannes; Holzinger, Andreas
2016-09-01
Established process models for knowledge discovery find the domain-expert in a customer-like and supervising role. In the field of biomedical research, it is necessary to move the domain-experts into the center of this process with far-reaching consequences for both their research output and the process itself. In this paper, we revise the established process models for knowledge discovery and propose a new process model for domain-expert-driven interactive knowledge discovery. Furthermore, we present a research infrastructure which is adapted to this new process model and demonstrate how the domain-expert can be deeply integrated even into the highly complex data-mining process and data-exploration tasks. We evaluated this approach in the medical domain for the case of cerebral aneurysms research.
Living With Earthquakes in the Pacific Northwest: A Survivor's Guide, 2nd edition
NASA Astrophysics Data System (ADS)
Hutton, Kate
In 1995, Robert S. Yeats found himself teaching a core curriculum class at Oregon State University for undergraduate nonscience majors, linking recent discoveries on the earthquake hazard in the Pacific Northwest to societal response to those hazards. The notes for that course evolved into the first edition of this book, published in 1998. In 2001, he published a similar book, Living With Earthquakes in California: A Survivor's Guide (Oregon State University Press). Recent earthquakes, such as the 2001 Nisqually Mw 6.8 event, together with discoveries and new techniques in paleoseismology and changes in public policy decisions, quickly outdated the first Pacific Northwest edition. This is especially true of the Cascadia Subduction Zone and crustal faults, where our knowledge expands with every scientific meeting.
Knowledge Discovery in Medical Mining by using Genetic Algorithms and Artificial Neural Networks
NASA Astrophysics Data System (ADS)
Srivathsa, P. K.
2011-12-01
Medical data mining can be thought of as the search for relationships and patterns within medical data, which facilitates the acquisition of useful knowledge for effective medical diagnosis. Consequently, the prediction of disease becomes more effective, and early detection of disease facilitates earlier access to the required patient care, with focused treatment, economic feasibility and improved cure rates. The present investigation is therefore carried out on medical data (PIMA) using a data mining (DM) and genetic algorithm (GA) based neural network technique, and the results indicate that the methodology is not only reliable but also helps in furthering the scope of the subject.
Knowledge Discovery in Textual Documentation: Qualitative and Quantitative Analyses.
ERIC Educational Resources Information Center
Loh, Stanley; De Oliveira, Jose Palazzo M.; Gastal, Fabio Leite
2001-01-01
Presents an application of knowledge discovery in texts (KDT) concerning medical records of a psychiatric hospital. The approach helps physicians to extract knowledge about patients and diseases that may be used for epidemiological studies, for training professionals, and to support physicians to diagnose and evaluate diseases. (Author/AEF)
NASA Astrophysics Data System (ADS)
Dabiru, L.; O'Hara, C. G.; Shaw, D.; Katragadda, S.; Anderson, D.; Kim, S.; Shrestha, B.; Aanstoos, J.; Frisbie, T.; Policelli, F.; Keblawi, N.
2006-12-01
The Research Project Knowledge Base (RPKB) is currently being designed and will be implemented in a manner that is fully compatible and interoperable with enterprise architecture tools developed to support NASA's Applied Sciences Program. Through user needs assessment and collaboration with Stennis Space Center, Goddard Space Flight Center, and NASA's DEVELOP staff personnel, insight into information needs for the RPKB was gathered from across NASA scientific communities of practice. To enable efficient, consistent, standard, structured, and managed data entry and research results compilation, a prototype RPKB has been designed and fully integrated with the existing NASA Earth Science Systems Components database. The RPKB will compile research project and keyword information of relevance to the six major science focus areas, 12 national applications, and the Global Change Master Directory (GCMD). The RPKB will include information about projects awarded from NASA research solicitations, project investigator information, research publications, NASA data products employed, and model or decision support tools used or developed, as well as new data product information. The RPKB will be developed in a multi-tier architecture that will include a SQL Server relational database backend, middleware, and front-end client interfaces for data entry. The purpose of this project is to intelligently harvest the results of research sponsored by the NASA Applied Sciences Program and related research program results. We present various approaches for a wide spectrum of knowledge discovery of research results, publications, projects, etc. from the NASA Systems Components database and global information systems, and show how this is implemented in a SQL Server database. The application of knowledge discovery is useful for intelligent query answering and multiple-layered database construction. Using advanced EA tools such as the Earth Science Architecture Tool (ESAT), the RPKB will enable NASA and partner agencies to efficiently identify the significant results for new experiment directions, and principal investigators to formulate experiment directions for new proposals.
A collaborative filtering-based approach to biomedical knowledge discovery.
Lever, Jake; Gakkhar, Sitanshu; Gottlieb, Michael; Rashnavadi, Tahereh; Lin, Santina; Siu, Celia; Smith, Maia; Jones, Martin R; Krzywinski, Martin; Jones, Steven J M; Wren, Jonathan
2018-02-15
The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
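The idea of scoring unseen associations through a reduced representation can be sketched on a toy cooccurrence matrix. This is not the authors' code (which is at the repository linked above): it assumes a rank-1 truncation computed by power iteration in pure Python, four hypothetical concepts A to D, and made-up counts in which the diagonal holds a self-count for each concept. In practice a library SVD routine would be used instead.

```python
import math

def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def rank1_svd(m, iters=100):
    """Leading singular triplet (u, sigma, v) of m via power iteration on m^T m."""
    mt = [list(col) for col in zip(*m)]
    v = [1.0 + 0.1 * j for j in range(len(m[0]))]  # arbitrary start vector
    for _ in range(iters):
        v = mat_vec(mt, mat_vec(m, v))
        n = norm(v)
        v = [x / n for x in v]
    u = mat_vec(m, v)
    sigma = norm(u)
    u = [x / sigma for x in u]
    return u, sigma, v

# Hypothetical cooccurrence counts: A and D never co-occur,
# but both co-occur with B and C.
concepts = ["A", "B", "C", "D"]
cooccurrence = [
    [2.0, 1.0, 1.0, 0.0],
    [1.0, 2.0, 0.0, 1.0],
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 2.0],
]

u, sigma, v = rank1_svd(cooccurrence)
# Rank-1 reconstruction: entry (i, j) is the predicted association score.
scores = [[sigma * ui * vj for vj in v] for ui in u]
a_d_score = scores[0][3]
```

The pair (A, D) has no observed cooccurrence, yet the reduced representation gives it a positive score because the two concepts share neighbours across the whole matrix, which is the "global" signal the abstract contrasts with local-structure methods.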
Ethnobotany and Medicinal Plant Biotechnology: From Tradition to Modern Aspects of Drug Development.
Kayser, Oliver
2018-05-24
Secondary natural products from plants are important leads for the development of new drug candidates for rational clinical therapy; they exhibit a variety of biological activities in experimental pharmacology and serve as structural templates in medicinal chemistry. The exploration of plants and the discovery of natural compounds based on ethnopharmacology, in combination with highly sophisticated analytics, is still an important drug discovery approach for characterizing and validating potential leads. Due to structural complexity, low abundance in biological material, and the high cost of chemical synthesis, alternative production routes such as plant cell cultures, heterologous biosynthesis, and synthetic biotechnology are applied. The basis for any biotechnological process is deep knowledge of the genetic regulation of pathways and of protein expression, with regard to today's "omics" technologies. The large number of genetic techniques has allowed the implementation of combinatorial biosynthesis and wide genome sequencing. Consequently, genetics has allowed the functional expression of biosynthetic cascades from plants and the reconstitution of low-performing pathways in more productive heterologous microorganisms. Thus, de novo biosynthesis in heterologous hosts requires a fundamental understanding of pathway reconstruction and of the multitude of genes in a foreign organism. Here, current concepts and strategies are discussed for pathway reconstruction, genome sequencing techniques, and cloning tools to bridge the gap between ethnopharmaceutical drug discovery and industrial biotechnology. Georg Thieme Verlag KG Stuttgart · New York.
Tulabandhula, Theja; Rudin, Cynthia
2014-06-01
Our goal is to design a prediction and decision system for real-time use during a professional car race. In designing a knowledge discovery process for racing, we faced several challenges that were overcome only when domain knowledge of racing was carefully infused within statistical modeling techniques. In this article, we describe how we leveraged expert knowledge of the domain to produce a real-time decision system for tire changes within a race. Our forecasts have the potential to impact how racing teams can optimize strategy by making tire-change decisions to benefit their rank position. Our work significantly expands previous research on sports analytics, as it is the only work on analytical methods for within-race prediction and decision making for professional car racing.
Building Faculty Capacity through the Learning Sciences
ERIC Educational Resources Information Center
Moy, Elizabeth; O'Sullivan, Gerard; Terlecki, Melissa; Jernstedt, Christian
2014-01-01
Discoveries in the learning sciences (especially in neuroscience) have yielded a rich and growing body of knowledge about how students learn, yet this knowledge is only half of the story. The other half is "know how," i.e. the application of this knowledge. For faculty members, that means applying the discoveries of the learning sciences…
Enamel paint techniques in archaeology and their identification using XRF and micro-XRF
NASA Astrophysics Data System (ADS)
Hložek, M.; Trojek, T.; Komoróczy, B.; Prokeš, R.
2017-08-01
This investigation focuses in detail on the analysis of discoveries in South Moravia - important sites from the Roman period in Pasohlávky and Mušov. Using X-ray fluorescence analysis and micro-analysis we help identify the techniques of enamel paint and give a thorough chemical analysis in details which would not be possible to determine by means of macroscopic examination. We thus address the influence of elemental composition on the final colour of the enamel paint and describe the less known technique of combining enamel with millefiori. The material analyses of the metal artefacts decorated with enamel paint significantly contribute to our knowledge of the technology being used during the Roman period.
[Icterus of the newborn caused by indirect bilirubin--recent progress].
Hervei, Sarolta
2004-06-13
Recently a big shift has taken place in the assessment and treatment of jaundice in newborns caused by an increased unconjugated bilirubin level. New techniques have evolved for assessing the prognosis of developing jaundice. An important discovery is the antioxidant effect of bilirubin. We have a broader range of knowledge concerning the mechanism of bilirubin toxicity and for judging the chance of developing kernicterus. Prevention techniques do not stop at preventing anti-D immunisation but go on to preventing hydrops foetalis, the life-threatening form of haemolytic disease. There are data about the complications of phototherapy and EPO treatment for prolonged anaemia.
Knowledge discovery in traditional Chinese medicine: state of the art and perspectives.
Feng, Yi; Wu, Zhaohui; Zhou, Xuezhong; Zhou, Zhongmei; Fan, Weiyu
2006-11-01
As a complementary medical system to Western medicine, traditional Chinese medicine (TCM) has provided a unique theoretical and practical approach to the treatment of diseases over thousands of years. Confronted with the increasing popularity of TCM and the huge volume of TCM data, historically accumulated and recently obtained, there is an urgent need to explore these resources effectively by the techniques of knowledge discovery in databases (KDD). This paper aims at providing an overview of recent KDD studies in the TCM field. A literature search was conducted in both English and Chinese publications, and major studies of knowledge discovery in TCM (KDTCM) reported in these materials were identified. Based on an introduction to the state of the art of TCM data resources, a review of four subfields of KDTCM research is presented, including KDD for the research of Chinese medical formulae, KDD for the research of Chinese herbal medicine, KDD for TCM syndrome research, and KDD for TCM clinical diagnosis. Furthermore, the current state and main problems in each subfield are summarized based on a discussion of existing studies, and future directions for each subfield are proposed accordingly. A series of KDD methods are used in existing KDTCM research, ranging from conventional frequent itemset mining to state-of-the-art latent structure models. Many interesting discoveries have been obtained by these methods, such as novel TCM paired drugs discovered by frequent itemset analysis, functional communities of related genes discovered from a syndrome perspective by text mining, the high proportion of toxic plants in the botanical family Ranunculaceae disclosed by statistical analysis, and the association between M-cholinoceptor blocking drugs and Solanaceae revealed by association rule mining. It is particularly inspiring to see some studies connecting TCM with biomedicine, which provide a novel top-down view for functional genomics research. However, further developments of KDD methods are still expected to better adapt to the features of TCM. Existing studies demonstrate that KDTCM is effective in obtaining medical discoveries. However, much more work needs to be done in order to discover real diamonds in the TCM domain. The usage and development of KDTCM in the future will substantially contribute to the TCM community, as well as to modern life science.
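The frequent itemset analysis mentioned in the abstract above, used to find paired drugs, can be illustrated with a minimal sketch. It assumes brute-force counting of itemsets of size two only, a made-up set of four formulae, and placeholder herb names; none of this is taken from the reviewed studies.

```python
from collections import Counter
from itertools import combinations

# Hypothetical formulae, each a set of placeholder herb names.
formulas = [
    {"herb_a", "herb_b", "herb_c"},
    {"herb_a", "herb_b"},
    {"herb_a", "herb_b", "herb_d"},
    {"herb_c", "herb_d"},
]

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for all item pairs with support >= min_support.

    Support is the fraction of transactions that contain both items.
    """
    counts = Counter()
    for items in transactions:
        # Sorting makes each pair a canonical (alphabetical) tuple.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

pairs = frequent_pairs(formulas, min_support=0.5)
```

With a minimum support of 0.5, only the pair that appears together in three of the four formulae survives. Larger itemsets would normally be mined with an algorithm such as Apriori or FP-growth rather than by enumerating all combinations.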
The center for causal discovery of biomedical knowledge from big data
Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-01-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
Computational discovery of picomolar Q(o) site inhibitors of cytochrome bc1 complex.
Hao, Ge-Fei; Wang, Fu; Li, Hui; Zhu, Xiao-Lei; Yang, Wen-Chao; Huang, Li-Shar; Wu, Jia-Wei; Berry, Edward A; Yang, Guang-Fu
2012-07-11
A critical challenge to the fragment-based drug discovery (FBDD) is its low-throughput nature due to the necessity of biophysical method-based fragment screening. Herein, a method of pharmacophore-linked fragment virtual screening (PFVS) was successfully developed. Its application yielded the first picomolar-range Q(o) site inhibitors of the cytochrome bc(1) complex, an important membrane protein for drug and fungicide discovery. Compared with the original hit compound 4 (K(i) = 881.80 nM, porcine bc(1)), the most potent compound 4f displayed 20 507-fold improved binding affinity (K(i) = 43.00 pM). Compound 4f was proved to be a noncompetitive inhibitor with respect to the substrate cytochrome c, but a competitive inhibitor with respect to the substrate ubiquinol. Additionally, we determined the crystal structure of compound 4e (K(i) = 83.00 pM) bound to the chicken bc(1) at 2.70 Å resolution, providing a molecular basis for understanding its ultrapotency. To our knowledge, this study is the first application of the FBDD method in the discovery of picomolar inhibitors of a membrane protein. This work demonstrates that the novel PFVS approach is a high-throughput drug discovery method, independent of biophysical screening techniques.
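The fold improvement quoted in the abstract above follows directly from the two reported K(i) values once they are put in the same unit. A small check, with a hypothetical helper name:

```python
def fold_improvement(ki_reference_molar, ki_improved_molar):
    """Ratio of two inhibition constants expressed in the same unit (mol/L)."""
    return ki_reference_molar / ki_improved_molar

ki_hit_4 = 881.80e-9  # hit compound 4: 881.80 nM, from the abstract
ki_4f = 43.00e-12     # compound 4f: 43.00 pM, from the abstract

fold = fold_improvement(ki_hit_4, ki_4f)
```

Rounded to the nearest integer, the ratio matches the 20 507-fold figure reported in the abstract.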
The Effect of Rules and Discovery in the Retention and Retrieval of Braille Inkprint Letter Pairs.
ERIC Educational Resources Information Center
Nagengast, Daniel L.; And Others
The effects of rule knowledge were investigated using Braille inkprint pairs. Both recognition and recall were studied in three groups of subjects: rule knowledge, rule discovery, and no rule. Two hypotheses were tested: (1) that the group exposed to the rule would score better than would a discovery group and a control group; and (2) that all…
Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.
Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo
Unsupervised object discovery and localization aims to discover dominant object classes and localize all object instances in a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. In addition, the topic models used in those methods suffer from the topic coherence issue: some inferred topics do not have a clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in the form of so-called must-links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the must-link is redefined so that one must-link constrains only one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes; thus the must-links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. 
In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.
NASA Astrophysics Data System (ADS)
Narock, T.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Finin, T.; Hitzler, P.; Krisnadhi, A.; Raymond, L. M.; Shepherd, A.; Wiebe, P. H.
2014-12-01
A wide spectrum of maturing methods and tools, collectively characterized as the Semantic Web, is helping to vastly improve the dissemination of scientific research. Creating semantic integration requires input from both domain and cyberinfrastructure scientists. OceanLink, an NSF EarthCube Building Block, is demonstrating semantic technologies through the integration of geoscience data repositories, library holdings, conference abstracts, and funded research awards. Meeting project objectives involves applying semantic technologies to support data representation, discovery, sharing and integration. Our semantic cyberinfrastructure components include ontology design patterns, Linked Data collections, semantic provenance, and associated services to enhance data and knowledge discovery, interoperation, and integration. We discuss how these components are integrated, the continued automated and semi-automated creation of semantic metadata, and techniques we have developed to integrate ontologies, link resources, and preserve provenance and attribution.
Concept Formation in Scientific Knowledge Discovery from a Constructivist View
NASA Astrophysics Data System (ADS)
Peng, Wei; Gero, John S.
The central goal of scientific knowledge discovery is to learn cause-effect relationships among natural phenomena presented as variables and the consequences of their interactions. Scientific knowledge is normally expressed as scientific taxonomies and qualitative and quantitative laws [1]. This type of knowledge represents intrinsic regularities of the observed phenomena that can be used to explain and predict behaviors of the phenomena. It is a generalization that is abstracted and externalized from a set of contexts and applicable to a broader scope. Scientific knowledge is a type of third-person knowledge, i.e., knowledge that is independent of a specific enquirer. Artificial intelligence approaches, particularly data mining algorithms that are used to identify meaningful patterns from large data sets, are approaches that aim to facilitate the knowledge discovery process [2]. A broad spectrum of algorithms has been developed in addressing classification, associative learning, and clustering problems. However, their linkages to people who use them have not been adequately explored. Issues in relation to supporting the interpretation of the patterns, the application of prior knowledge to the data mining process and addressing user interactions remain challenges for building knowledge discovery tools [3]. As a consequence, scientists rely on their experience to formulate problems, evaluate hypotheses, reason about untraceable factors and derive new problems. This type of knowledge which they have developed during their career is called “first-person” knowledge. The formation of scientific knowledge (third-person knowledge) is highly influenced by the enquirer’s first-person knowledge construct, which is a result of his or her interactions with the environment. There have been attempts to craft automatic knowledge discovery tools but these systems are limited in their capabilities to handle the dynamics of personal experience. There are now trends in developing approaches to assist scientists applying their expertise to model formation, simulation, and prediction in various domains [4], [5]. On the other hand, first-person knowledge becomes third-person theory only if it proves general by evidence and is acknowledged by a scientific community. Researchers start to focus on building interactive cooperation platforms [1] to accommodate different views into the knowledge discovery process. There are some fundamental questions in relation to scientific knowledge development. What are the major components for knowledge construction and how do people construct their knowledge? How is this personal construct assimilated and accommodated into a scientific paradigm? How can one design a computational system to facilitate these processes? This chapter does not attempt to answer all these questions but serves as a basis to foster thinking along this line. A brief literature review about how people develop their knowledge is carried out through a constructivist view. A hydrological modeling scenario is presented to elucidate the approach.
2004-11-01
affords exciting opportunities in target detection. The input signal may be a sum of sine waves, it could be an auditory signal, or possibly a visual...rendering of a scene. Since image processing is an area in which the original data are stationary in some sense ( auditory signals suffer from...11 Example 1 of SR - Identification of a Subliminal Signal below a Threshold .......................... 13 Example 2 of SR
Evaluation of Malware Target Recognition Deployed in a Cloud-Based Fileserver Environment
2012-03-01
many of these detection techniques could be evaded with simple obfuscation. Kolter and Maloof extend Schultz’s research in [KM04] and [KM06]. Their...69 [KM04] Jeremy Z. Kolter and Marcus A. Maloof. Learning to detect malicious executables in the wild. In Proceedings of the tenth ACM SIGKDD...international conference on Knowledge discovery and data mining, KDD ’04, pages 470–478, New York, NY, USA, 2004. ACM. [KM06] J.Z. Kolter and M.A. Maloof
Large scale analysis of the mutational landscape in HT-SELEX improves aptamer discovery
Hoinka, Jan; Berezhnoy, Alexey; Dao, Phuong; Sauna, Zuben E.; Gilboa, Eli; Przytycka, Teresa M.
2015-01-01
High-Throughput (HT) SELEX combines SELEX (Systematic Evolution of Ligands by EXponential Enrichment), a method for aptamer discovery, with massively parallel sequencing technologies. This emerging technology provides data for a global analysis of the selection process and for simultaneous discovery of a large number of candidates but currently lacks dedicated computational approaches for their analysis. To close this gap, we developed novel in-silico methods to analyze HT-SELEX data and utilized them to study the emergence of polymerase errors during HT-SELEX. Rather than considering these errors as a nuisance, we demonstrated their utility for guiding aptamer discovery. Our approach builds on two main advancements in aptamer analysis: AptaMut, a novel technique allowing for the identification of polymerase errors conferring an improved binding affinity relative to the ‘parent’ sequence, and AptaCluster, an aptamer clustering algorithm which is, to the best of our knowledge, the only currently available tool capable of efficiently clustering entire aptamer pools. We applied these methods to an HT-SELEX experiment developing aptamers against Interleukin 10 receptor alpha chain (IL-10RA) and experimentally confirmed our predictions, thus validating our computational methods. PMID:25870409
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 12 Banks and Banking 4 2014-01-01 2014-01-01 false Discovery depositions. 263.53 Section 263.53... Discovery depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 12 Banks and Banking 4 2012-01-01 2012-01-01 false Discovery depositions. 263.53 Section 263.53... Discovery depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...
An integrative model for in-silico clinical-genomics discovery science.
Lussier, Yves A; Sarkar, Indra Nell; Cantor, Michael
2002-01-01
Human Genome discovery research has set the pace for Post-Genomic Discovery Research. While post-genomic fields focused at the molecular level are intensively pursued, little effort is being deployed in the later stages of molecular medicine discovery research, such as clinical-genomics. The objective of this study is to demonstrate the relevance and significance of integrating mainstream clinical informatics decision support systems with current bioinformatics genomic discovery science. This paper is a feasibility study of an original model that enables novel "in-silico" clinical-genomic discovery science. This model is designed to mediate queries among clinical and genomic knowledge bases with relevant bioinformatic analytic tools (e.g. gene clustering). Briefly, trait-disease-gene relationships were successfully illustrated using QMR, OMIM, SNOMED-RT, GeneCluster and TreeView. The analyses were visualized as two-dimensional dendrograms of clinical observations clustered around genes. To our knowledge, this is the first study using knowledge bases of clinical decision support systems for genomic discovery. Although this study is a proof of principle, it provides a framework for the development of clinical decision-support-system driven, high-throughput clinical-genomic technologies which could potentially unveil significant high-level functions of genes.
Semantic Data Integration and Knowledge Management to Represent Biological Network Associations.
Losko, Sascha; Heumann, Klaus
2017-01-01
The vast quantities of information generated by academic and industrial research groups are reflected in a rapidly growing body of scientific literature and exponentially expanding resources of formalized data, including experimental data, originating from a multitude of "-omics" platforms, phenotype information, and clinical data. For bioinformatics, the challenge remains to structure this information so that scientists can identify relevant information, to integrate this information as specific "knowledge bases," and to formalize this knowledge across multiple scientific domains to facilitate hypothesis generation and validation. Here we report on progress made in building a generic knowledge management environment capable of representing and mining both explicit and implicit knowledge and, thus, generating new knowledge. Risk management in drug discovery and clinical research is used as a typical example to illustrate this approach. In this chapter we introduce techniques and concepts (such as ontologies, semantic objects, typed relationships, contexts, graphs, and information layers) that are used to represent complex biomedical networks. The BioXM™ Knowledge Management Environment is used as an example to demonstrate how a domain such as oncology is represented and how this representation is utilized for research.
Imaging and examination strategies of normal male and female sex development and anatomy.
Wünsch, Lutz; Schober, Justine M
2007-09-01
Over recent years a variety of new details on the developmental biology of sexual differentiation has been discovered. Moreover, important advances have been made in imaging and examination strategies for urogenital organs, and these have added new knowledge to our understanding of the 'normal' anatomy of the sexes. Both aspects contribute to the comprehension of phenotypic sex development, but they are not commonly presented in the same context. This will be attempted in this chapter, which aims to link discoveries in developmental biology to anatomical details shown by modern examination techniques. A review of the literature concerning the link between sexual development and imaging of urogenital organs was performed. Genes, proteins and pathways related to sexual differentiation were related to some organotypic features revealed by clinical examination techniques. Early 'organotypic' patterns can be identified in prostatic, urethral and genital development and followed into postnatal life. New imaging and endoscopy techniques allow for detailed descriptive anatomical studies, hopefully resulting in a broader understanding of sex development and a better genotype-phenotype correlation in defined disorders. Clinical description relying on imaging techniques should be related to knowledge of the genetic and endocrine factors influencing sex development in a specific and stepwise manner.
Self organising hypothesis networks: a new approach for representing and structuring SAR knowledge
2014-01-01
Background Combining different sources of knowledge to build improved structure activity relationship models is not easy owing to the variety of knowledge formats and the absence of a common framework to interoperate between learning techniques. Most of the current approaches address this problem by using consensus models that operate at the prediction level. We explore the possibility of directly combining these sources at the knowledge level, with the aim of harvesting potentially increased synergy at an earlier stage. Our goal is to design a general methodology to facilitate knowledge discovery and produce accurate and interpretable models. Results To combine models at the knowledge level, we propose to decouple the learning phase from the knowledge application phase using a pivot representation (lingua franca) based on the concept of hypothesis. A hypothesis is a simple and interpretable knowledge unit. Regardless of its origin, knowledge is broken down into a collection of hypotheses. These hypotheses are subsequently organised into a hierarchical network. This unification makes it possible to combine different sources of knowledge into a common formalised framework. The approach allows us to create a synergistic system between different forms of knowledge, and new algorithms can be applied to leverage this unified model. This first article focuses on the general principle of the Self Organising Hypothesis Network (SOHN) approach in the context of binary classification problems along with an illustrative application to the prediction of mutagenicity. Conclusion It is possible to represent knowledge in the unified form of a hypothesis network allowing interpretable predictions with performances comparable to mainstream machine learning techniques. This new approach offers the potential to combine knowledge from different sources into a common framework in which high level reasoning and meta-learning can be applied; these latter perspectives will be explored in future work.
PMID:24959206
ERIC Educational Resources Information Center
Benoit, Gerald
2002-01-01
Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD is the larger view of the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…
Gibert, Karina; García-Rudolph, Alejandro; Curcoll, Lluïsa; Soler, Dolors; Pla, Laura; Tormos, José María
2009-01-01
In this paper, an integral Knowledge Discovery Methodology, named Clustering based on rules by States, which incorporates artificial intelligence (AI) and statistical methods as well as interpretation-oriented tools, is used for extracting knowledge patterns about the evolution over time of the Quality of Life (QoL) of patients with Spinal Cord Injury. The methodology incorporates the interaction with experts as a crucial element with the clustering methodology to guarantee usefulness of the results. Four typical patterns are discovered by taking into account prior expert knowledge. Several hypotheses are elaborated about the reasons for psychological distress or decreases in QoL of patients over time. The knowledge discovery from data (KDD) approach turns out, once again, to be a suitable formal framework for handling multidimensional complexity of the health domains.
Text mining for traditional Chinese medical knowledge discovery: a survey.
Zhou, Xuezhong; Peng, Yonghong; Liu, Baoyan
2010-08-01
Extracting meaningful information and knowledge from free text is the subject of considerable research interest in the machine learning and data mining fields. Text data mining (or text mining) has become one of the most active research sub-fields in data mining. Significant developments in the area of biomedical text mining during the past years have demonstrated its great promise for supporting scientists in developing novel hypotheses and new knowledge from the biomedical literature. Traditional Chinese medicine (TCM) provides a distinct methodology with which to view human life. It is one of the most complete and distinguished traditional medicines with a history of several thousand years of studying and practicing the diagnosis and treatment of human disease. It has been shown that the TCM knowledge obtained from clinical practice has become a significant complementary source of information for modern biomedical sciences. TCM literature obtained from the historical period and from modern clinical studies has recently been transformed into digital data in the form of relational databases or text documents, which provide an effective platform for information sharing and retrieval. This motivates and facilitates research and development into knowledge discovery approaches and to modernize TCM. In order to contribute to this still growing field, this paper presents (1) a comparative introduction to TCM and modern biomedicine, (2) a survey of the related information sources of TCM, (3) a review and discussion of the state of the art and the development of text mining techniques with applications to TCM, (4) a discussion of the research issues around TCM text mining and its future directions. Copyright 2010 Elsevier Inc. All rights reserved.
An intelligent content discovery technique for health portal content management.
De Silva, Daswin; Burstein, Frada
2014-04-23
Continuous content management of health information portals is vital for their sustainability and widespread acceptance. The knowledge and experience of a domain expert are essential for content management in the health domain. The rate of generation of online health resources is exponential, and manual examination for relevance to a specific topic and audience is therefore a formidable challenge for domain experts. Intelligent content discovery for effective content management is a less researched topic. An existing expert-endorsed content repository can provide the necessary leverage to automatically identify relevant resources and evaluate qualitative metrics. This paper reports on the design research towards an intelligent technique for automated content discovery and ranking for health information portals. The proposed technique aims to improve the efficiency of the current, mostly manual process of portal content management by utilising an existing expert-endorsed content repository as a supporting base and a benchmark to evaluate the suitability of new content. A model for content management was established based on a field study of potential users. The proposed technique is integral to this content management model and executes in several phases (ie, query construction, content search, text analytics and fuzzy multi-criteria ranking). The construction of multi-dimensional search queries with input from Wordnet, the use of multi-word and single-word terms as representative semantics for text analytics and the use of fuzzy multi-criteria ranking for subjective evaluation of quality metrics are original contributions reported in this paper. The feasibility of the proposed technique was examined with experiments conducted on an actual health information portal, the BCKOnline portal.
Both intermediary and final results generated by the technique are presented in the paper and these help to establish benefits of the technique and its contribution towards effective content management. The prevalence of large numbers of online health resources is a key obstacle for domain experts involved in content management of health information portals and websites. The proposed technique has proven successful at search and identification of resources and the measurement of their relevance. It can be used to support the domain expert in content management and thereby ensure the health portal is up-to-date and current.
Knowledge discovery from data as a framework to decision support in medical domains
Gibert, Karina
2009-01-01
Introduction Knowledge discovery from data (KDD) is a multidisciplinary field which appeared in 1996 for “non trivial identifying of valid, novel, potentially useful, ultimately understandable patterns in data”. Pre-treatment of data and post-processing are as important as the data exploitation (Data Mining) itself. Different analysis techniques can be properly combined to produce explicit knowledge from data. Methods Hybrid KDD methodologies combining Artificial Intelligence with Statistics and visualization have been used to identify patterns in complex medical phenomena: experts provide prior knowledge (pK); it biases the search for distinguishable groups of homogeneous objects; support-interpretation tools (CPG) assisted experts in the conceptualization and labelling of discovered patterns, consistently with pK. Results Patterns of dependency in mental disabilities supported decision-making on legislation of the Spanish Dependency Law in Catalonia. Relationships between type of neurorehabilitation treatment and patterns of response for brain damage are assessed. Patterns of the perceived QOL over time are used in spinal cord lesion to improve social inclusion. Conclusion Reality is more and more complex, and classical data analyses are not powerful enough to model it. New methodologies are required that include multidisciplinarity and stress the production of understandable models. Interaction with the experts is critical to generate meaningful results which can really support decision-making; in particular, it is convenient to transfer the pK to the system and to interpret results in close interaction with experts. KDD is a valuable paradigm, particularly when facing very complex domains that are not yet well understood, like many medical phenomena.
When fragments link: a bibliometric perspective on the development of fragment-based drug discovery.
Romasanta, Angelo K S; van der Sijde, Peter; Hellsten, Iina; Hubbard, Roderick E; Keseru, Gyorgy M; van Muijlwijk-Koezen, Jacqueline; de Esch, Iwan J P
2018-05-05
Fragment-based drug discovery (FBDD) is a highly interdisciplinary field, rich in ideas integrated from pharmaceutical sciences, chemistry, biology, and physics, among others. To enrich our understanding of the development of the field, we used bibliometric techniques to analyze 3642 publications in FBDD, complementing accounts by key practitioners. Mapping its core papers, we found the transfer of knowledge from academia to industry. Co-authorship analysis showed that university-industry collaboration has grown over time. Moreover, we show how ideas from other scientific disciplines have been integrated into the FBDD paradigm. Keyword analysis showed that the field is organized into four interconnected practices: library design, fragment screening, computational methods, and optimization. This study highlights the importance of interactions among various individuals and institutions from diverse disciplines in newly emerging scientific fields. Copyright © 2018. Published by Elsevier Ltd.
Resource Discovery within the Networked "Hybrid" Library.
ERIC Educational Resources Information Center
Leigh, Sally-Anne
This paper focuses on the development, adoption, and integration of resource discovery, knowledge management, and/or knowledge sharing interfaces such as interactive portals, and the use of the library's World Wide Web presence to increase the availability and usability of information services. The introduction addresses changes in library…
A biological compression model and its applications.
Cao, Minh Duc; Dix, Trevor I; Allison, Lloyd
2011-01-01
A biological compression model, expert model, is presented which is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence alignment and phylogenetic analysis. We demonstrate that the model can handle statistically biased sequences and distantly related sequences where conventional knowledge discovery tools often fail.
NASA Technical Reports Server (NTRS)
Lee, Jonathan A.
2005-01-01
High-throughput measurement techniques are reviewed for solid phase transformation in materials produced by combinatorial methods, which are highly efficient concepts for fabricating a large variety of material libraries with different compositional gradients on a single wafer. Combinatorial methods hold high potential for reducing the time and costs associated with the development of new materials, as compared to time-consuming and labor-intensive conventional methods that test large batches of material, one composition at a time. These high-throughput techniques can be automated to rapidly capture and analyze data, using the entire material library on a single wafer, thereby accelerating the pace of materials discovery and knowledge generation for solid phase transformations. The review covers experimental techniques that are applicable to inorganic materials such as shape memory alloys, graded materials, metal hydrides, ferric materials, semiconductors and industrial alloys.
Robust and Accurate Anomaly Detection in ECG Artifacts Using Time Series Motif Discovery
Sivaraks, Haemwaan
2015-01-01
Electrocardiogram (ECG) anomaly detection is an important technique for detecting dissimilar heartbeats, which helps identify abnormal ECGs before the diagnosis process. Currently available ECG anomaly detection methods, ranging from academic research to commercial ECG machines, still suffer from a high false alarm rate because these methods are not able to differentiate ECG artifacts from real ECG signals, especially ECG artifacts that are similar to ECG signals in terms of shape and/or frequency. The problem leads to high vigilance for physicians and misinterpretation risk for nonspecialists. Therefore, this work proposes a novel anomaly detection technique that is highly robust and accurate in the presence of ECG artifacts and can effectively reduce the false alarm rate. Expert knowledge from cardiologists and a motif discovery technique are utilized in our design. In addition, every step of the algorithm conforms to the interpretation of cardiologists. Our method can be applied to both single-lead ECGs and multilead ECGs. Our experimental results on real ECG datasets are interpreted and evaluated by cardiologists. Our proposed algorithm can mostly achieve 100% accuracy on detection (AoD), sensitivity, specificity, and positive predictive value with a 0% false alarm rate. The results demonstrate that our proposed method is highly accurate and robust to artifacts, compared with competitive anomaly detection methods. PMID:25688284
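The abstract above reports sensitivity, specificity, positive predictive value and false alarm rate without giving their formulas. As a rough illustration only, the sketch below computes these quantities from confusion-matrix counts using the textbook definitions. It is not taken from the paper: the function name `detection_metrics`, the example counts, and the choice of false alarm rate as FP / (FP + TN) are assumptions, and the paper's own definitions (including AoD, which is omitted here) may differ.

```python
import math


def detection_metrics(tp, fp, tn, fn):
    """Textbook detection metrics from confusion-matrix counts.

    Hypothetical helper for illustration; the definitions used in the
    cited ECG paper are not stated in the abstract and may differ.
    """
    def ratio(num, den):
        # Return NaN rather than raising when a denominator is zero.
        return num / den if den else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),       # true positive rate
        "specificity": ratio(tn, tn + fp),       # true negative rate
        "ppv": ratio(tp, tp + fp),               # positive predictive value
        "false_alarm_rate": ratio(fp, fp + tn),  # assumed: FP / (FP + TN)
    }


# Made-up counts: a detector with no errors, and one with some errors.
perfect = detection_metrics(tp=12, fp=0, tn=88, fn=0)
noisy = detection_metrics(tp=8, fp=5, tn=85, fn=2)
```

With these definitions, a detector that makes no errors yields 1.0 for the first three metrics and 0.0 for the false alarm rate, which matches the pattern of figures quoted in the abstract.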
2010-01-01
Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245
Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong
2010-01-18
The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.
Form-Focused Discovery Activities in English Classes
ERIC Educational Resources Information Center
Ogeyik, Muhlise Cosgun
2011-01-01
Form-focused discovery activities allow language learners to grasp various aspects of a target language by contributing implicit knowledge by using discovered explicit knowledge. Moreover, such activities can assist learners to perceive and discover the features of their language input. In foreign language teaching environments, they can be used…
The confluence of ancient wisdom and future technology in our profession
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, D.P.
1997-10-01
The theme of this year's Annual Meeting is "Ancient Wisdom-Future Technology." The panel assembled for this session has been asked to think metaphorically about the theme and how it relates to their profession of human factors and ergonomics. Originally conceived as a debate centering around the older technologies and research techniques versus the newer ways of finding answers, it was soon realized that there was no dichotomy, but more of a synergy between the old and the new. If human factors is truly a philosophy of design rather than simply a body of knowledge, then one would expect consistency in approach regardless of field of application or new discoveries of human performance. Just as when two or more rivers combine to become a force mightier than the simple summation, the synergistic power of established techniques or knowledge and recent innovation is available to everyone in the profession. The invited panelists represent diverse perspectives in human factors and ergonomics, and this made for a stimulating discussion.
75 FR 66766 - NIAID Blue Ribbon Panel Meeting on Adjuvant Discovery and Development
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-29
..., identifies gaps in knowledge and capabilities, and defines NIAID's goals for the continued discovery... DEPARTMENT OF HEALTH AND HUMAN SERVICES NIAID Blue Ribbon Panel Meeting on Adjuvant Discovery and... agenda for the discovery, development and clinical evaluation of adjuvants for use with preventive...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 12 Banks and Banking 3 2011-01-01 2011-01-01 false Discovery depositions. 263.53 Section 263.53... depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...
12 CFR 19.170 - Discovery depositions.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 1 2010-01-01 2010-01-01 false Discovery depositions. 19.170 Section 19.170... PROCEDURE Discovery Depositions and Subpoenas § 19.170 Discovery depositions. (a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...
12 CFR 19.170 - Discovery depositions.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 12 Banks and Banking 1 2011-01-01 2011-01-01 false Discovery depositions. 19.170 Section 19.170... PROCEDURE Discovery Depositions and Subpoenas § 19.170 Discovery depositions. (a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 12 Banks and Banking 3 2010-01-01 2010-01-01 false Discovery depositions. 263.53 Section 263.53... depositions. (a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...
BioGraph: unsupervised biomedical knowledge discovery via automated hypothesis generation
2011-01-01
We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be. PMID:21696594
Bate, A; Lindquist, M; Edwards, I R
2008-04-01
After market launch, new information on adverse effects of medicinal products is almost exclusively first highlighted by spontaneous reporting. As data sets of spontaneous reports have become larger, and computational capability has increased, quantitative methods have been increasingly applied to such data sets. The screening of such data sets is an application of knowledge discovery in databases (KDD). Effective KDD is an iterative and interactive process made up of the following steps: developing an understanding of an application domain, creating a target data set, data cleaning and pre-processing, data reduction and projection, choosing the data mining task, choosing the data mining algorithm, data mining, interpretation of results and consolidating and using acquired knowledge. The process of KDD as it applies to the analysis of spontaneous reports can be exemplified by its routine use on the 3.5 million suspected adverse drug reaction (ADR) reports in the WHO ADR database. Examples of new adverse effects first highlighted by the KDD process on WHO data include topiramate glaucoma, infliximab vasculitis and the association of selective serotonin reuptake inhibitors (SSRIs) and neonatal convulsions. The KDD process has already improved our ability to highlight previously unsuspected ADRs for clinical review in spontaneous reporting, and we anticipate that such techniques will be increasingly used in the successful screening of other healthcare data sets such as patient records in the future.
Medical knowledge discovery and management.
Prior, Fred
2009-05-01
Although the volume of medical information is growing rapidly, the ability to rapidly convert this data into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.
Newton, Mandi S; Scott-Findlay, Shannon
2007-01-01
Background: In the past 15 years, knowledge translation in healthcare has emerged as a multifaceted and complex agenda. Theoretical and polemical discussions, the development of a science to study and measure the effects of translating research evidence into healthcare, and the role of key stakeholders including academe, healthcare decision-makers, the public, and government funding bodies have brought scholarly, organizational, social, and political dimensions to the agenda. Objective: This paper discusses the current knowledge translation agenda in Canadian healthcare and how elements in this agenda shape the discovery and translation of health knowledge. Discussion: The current knowledge translation agenda in Canadian healthcare involves the influence of values, priorities, and people; stakes which greatly shape the discovery of research knowledge and how it is or is not instituted in healthcare delivery. As this agenda continues to take shape and direction, ensuring that it is accountable for its influences is essential and should be at the forefront of concern to the Canadian public and healthcare community. This transparency will allow for scrutiny, debate, and improvements in health knowledge discovery and health services delivery. PMID:17916256
Concept of operations for knowledge discovery from Big Data across enterprise data warehouses
NASA Astrophysics Data System (ADS)
Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.
2013-05-01
The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts are facing several challenges in the knowledge extraction process. To address these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, among many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage in making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.
NASA Astrophysics Data System (ADS)
Huang, Yin; Chen, Jianhua; Xiong, Shaojun
2009-07-01
Mobile-Learning (M-learning) gives many learners the advantages of both traditional learning and E-learning. Currently, Web-based Mobile-Learning Systems have created many new ways and defined new relationships between educators and learners. Association rule mining is one of the most important fields in data mining and knowledge discovery in databases. Rule explosion is a serious concern, as conventional mining algorithms often produce too many rules for decision makers to digest. Since a Web-based Mobile-Learning System collects vast amounts of student profile data, data mining and knowledge discovery techniques can be applied to find interesting relationships between attributes of learners, assessments, the solution strategies adopted by learners and so on. Therefore, this paper focuses on a new data-mining algorithm called ARGSA (Association Rules based on an improved Genetic Simulated Annealing Algorithm), which combines the advantages of the genetic algorithm and the simulated annealing algorithm, to mine association rules. The paper first takes advantage of the Parallel Genetic Algorithm and Simulated Annealing Algorithm designed specifically for discovering association rules. Analysis and experiments are then presented to show that the proposed method is superior to the Apriori algorithm in this Mobile-Learning system.
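The two standard measures that association rule miners filter on, support and confidence, can be sketched as follows. This is a minimal illustration only: it is a brute-force enumeration of one-to-one rules, not ARGSA and not Apriori, and the learner records, item names and thresholds are hypothetical and do not come from the paper.

```python
from itertools import combinations


def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the combined itemset divided by support of the antecedent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


def pair_rules(transactions, min_support, min_confidence):
    """Enumerate one-item -> one-item rules that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            s = support(transactions, {lhs, rhs})
            c = confidence(transactions, {lhs}, {rhs})
            if s >= min_support and c >= min_confidence:
                rules.append((lhs, rhs, s, c))
    return rules


# Hypothetical learner activity records (assumed for illustration only).
TRANSACTIONS = [
    frozenset({"video", "quiz_pass"}),
    frozenset({"video", "forum", "quiz_pass"}),
    frozenset({"forum"}),
    frozenset({"video", "quiz_pass"}),
]

if __name__ == "__main__":
    for lhs, rhs, s, c in pair_rules(TRANSACTIONS, 0.5, 0.9):
        print(f"{lhs} -> {rhs}: support={s:.2f}, confidence={c:.2f}")
```

Raising the two thresholds is the simplest way to limit the number of rules returned, which is the rule-explosion concern the abstract raises; heuristic search methods such as the one the paper proposes explore the rule space differently rather than enumerating it.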
The center for causal discovery of biomedical knowledge from big data.
Cooper, Gregory F; Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-11-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
McGovern, Mary Francis
Non-formal environmental education provides students the opportunity to learn in ways that would not be possible in a traditional classroom setting. Outdoor learning allows students to make connections to their environment and helps to foster an appreciation for nature. This type of education can be interdisciplinary: students develop skills not only in science, but also in mathematics, social studies, technology, and critical thinking. This case study focuses on a non-formal marine education program, the South Carolina Department of Natural Resources' (SCDNR) Discovery vessel-based program. The Discovery curriculum was evaluated to determine its impact on student knowledge about and attitude toward the estuary. Students from two South Carolina coastal counties who attended the boat program during fall 2014 were asked to complete a brief survey before, immediately after, and two weeks following the program. The results of this study indicate that both student knowledge about and attitude toward the estuary improved significantly after completion of the Discovery vessel-based program. Knowledge and attitude scores demonstrated a positive correlation.
Manda, Prashanti; McCarthy, Fiona; Bridges, Susan M
2013-10-01
The Gene Ontology (GO), a set of three sub-ontologies, is one of the most popular bio-ontologies used for describing gene product characteristics. GO annotation data containing terms from multiple sub-ontologies and at different levels in the ontologies is an important source of implicit relationships between terms from the three sub-ontologies. Data mining techniques such as association rule mining that are tailored to mine from multiple ontologies at multiple levels of abstraction are required for effective knowledge discovery from GO annotation data. We present a data mining approach, Multi-ontology data mining at All Levels (MOAL) that uses the structure and relationships of the GO to mine multi-ontology multi-level association rules. We introduce two interestingness measures: Multi-ontology Support (MOSupport) and Multi-ontology Confidence (MOConfidence) customized to evaluate multi-ontology multi-level association rules. We also describe a variety of post-processing strategies for pruning uninteresting rules. We use publicly available GO annotation data to demonstrate our methods with respect to two applications (1) the discovery of co-annotation suggestions and (2) the discovery of new cross-ontology relationships. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Stone, S.; Parker, M. S.; Howe, B.; Lazowska, E.
2015-12-01
Rapid advances in technology are transforming nearly every field from "data-poor" to "data-rich." The ability to extract knowledge from this abundance of data is the cornerstone of 21st century discovery. At the University of Washington eScience Institute, our mission is to engage researchers across disciplines in developing and applying advanced computational methods and tools to real world problems in data-intensive discovery. Our research team consists of individuals with diverse backgrounds in domain sciences such as astronomy, oceanography and geology, with complementary expertise in advanced statistical and computational techniques such as data management, visualization, and machine learning. Two key elements are necessary to foster careers in data science: individuals with cross-disciplinary training in both method and domain sciences, and career paths emphasizing alternative metrics for advancement. We see persistent and deep-rooted challenges for the career paths of people whose skills, activities and work patterns don't fit neatly into the traditional roles and success metrics of academia. To address these challenges the eScience Institute has developed training programs and established new career opportunities for data-intensive research in academia. Our graduate students and post-docs have mentors in both a methodology and an application field. They also participate in coursework and tutorials to advance technical skill and foster community. Professional Data Scientist positions were created to support research independence while encouraging the development and adoption of domain-specific tools and techniques. The eScience Institute also supports the appointment of faculty who are innovators in developing and applying data science methodologies to advance their field of discovery. Our ultimate goal is to create a supportive environment for data science in academia and to establish global recognition for data-intensive discovery across all fields.
Search techniques for near-earth asteroids
NASA Technical Reports Server (NTRS)
Helin, E. F.; Dunbar, R. S.
1990-01-01
Knowledge of the near-earth asteroids (Apollo, Amor, and Aten groups) has increased enormously over the last 10 to 15 years. This has been due in large part to the success of programs that have systematically searched for these objects. These programs have been motivated by the apparent relationships of the near-earth asteroids to terrestrial impact cratering, meteorites, and comets, and their relative accessibility for asteroid missions. Discovery of new near-earth asteroids is fundamental to all other studies, from theoretical modeling of their populations to the determination of their physical characteristics by various remote-sensing techniques. The methods that have been used to find these objects are reviewed, and ways in which the search for near-earth asteroids can be expanded are discussed.
Defining Malaysian Knowledge Society: Results from the Delphi Technique
NASA Astrophysics Data System (ADS)
Hamid, Norsiah Abdul; Zaman, Halimah Badioze
This paper outlines the findings of research whose central idea is to define the term Knowledge Society (KS) in the Malaysian context. The research focuses on three important dimensions, namely knowledge, ICT and human capital. This study adopts a modified Delphi technique to seek the important dimensions that can contribute to the development of Malaysia's KS. The Delphi technique involved ten experts in a five-round iterative and controlled feedback procedure to obtain consensus on the important dimensions and to verify the proposed definition of KS. The findings show that all three dimensions proposed initially scored high or moderate consensus. Round One (R1) proposed an initial definition of KS and required comments and inputs from the panel. These inputs were then used to develop items for an R2 questionnaire. In R2, 56 out of 73 items scored high consensus and in R3, 63 out of 90 items scored high. R4 was conducted to re-rate the new items, of which 8 out of 17 scored high. Other items scored moderate consensus and no item scored low or no consensus in any round. The final round (R5) was employed to verify the final definition of KS. The findings of this study are significant to the definition of KS and the development of a framework in the Malaysian context.
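The round-by-round counts reported above can be turned into shares with simple arithmetic. The sketch below only divides the item counts given in the abstract; how the study scored an individual item as high, moderate or low consensus is not stated there, so no scoring rule is assumed, and the function and variable names are illustrative.

```python
# Item counts per Delphi round as reported in the abstract:
# (items scoring high consensus, total items rated in that round).
ROUND_COUNTS = {
    "R2": (56, 73),
    "R3": (63, 90),
    "R4": (8, 17),
}


def high_consensus_share(high, total):
    """Proportion of rated items that scored high consensus."""
    return high / total


if __name__ == "__main__":
    for name, (high, total) in ROUND_COUNTS.items():
        share = high_consensus_share(high, total)
        print(f"{name}: {high}/{total} = {share:.1%}")
```

Because the abstract says the remaining items scored moderate consensus, the complement of each share is the moderate-consensus proportion for that round.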
Knowledge discovery in cardiology: A systematic literature review.
Kadi, I; Idri, A; Fernandez-Aleman, J L
2017-01-01
Data mining (DM) provides the methodology and technology needed to transform huge amounts of data into useful information for decision making. It is a powerful process employed to extract knowledge and discover new patterns embedded in large data sets. Data mining has been increasingly used in medicine, particularly in cardiology. In fact, DM applications can greatly benefit all those involved in cardiology, such as patients, cardiologists and nurses. The purpose of this paper is to review papers concerning the application of DM techniques in cardiology so as to summarize and analyze evidence regarding: (1) the DM techniques most frequently used in cardiology; (2) the performance of DM models in cardiology; (3) comparisons of the performance of different DM models in cardiology. We performed a systematic literature review of empirical studies on the application of DM techniques in cardiology published in the period between 1 January 2000 and 31 December 2015. A total of 149 articles published between 2000 and 2015 were selected, studied and analyzed according to the following criteria: DM techniques and performance of the approaches developed. The results obtained showed that a significant number of the studies selected used classification and prediction techniques when developing DM models. Neural networks, decision trees and support vector machines were identified as being the techniques most frequently employed when developing DM models in cardiology. Moreover, neural networks and support vector machines achieved the highest accuracy rates and were proved to be more efficient than other techniques. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Providing Effective Professional Development for Teachers through the Lunar Workshops for Educators
NASA Astrophysics Data System (ADS)
Canipe, Marti; Buxner, Sanlyn; Jones, Andrea; Hsu, Brooke; Shaner, Andy; Bleacher, Lora
2014-11-01
In order to integrate current scientific discoveries in the classroom, K-12 teachers benefit from professional development and support. The Lunar Workshops for Educators is a series of weeklong workshops for grade 6-9 science teachers focused on lunar science and exploration, sponsored by the Lunar Reconnaissance Orbiter (LRO) and conducted by the LRO Education and Public Outreach (E/PO) Team. The Lunar Workshops for Educators have provided this professional development for teachers for the last five years. Program evaluation includes pre- and post-content tests and surveys related to classroom practice, daily surveys, and follow-up surveys conducted during the academic year following the summer workshops to assess how the knowledge and skills learned at the workshop are being used in the classroom. The evaluation of the workshop shows that the participants increased their overall knowledge of lunar science and exploration. Additionally, they gained knowledge about student misconceptions related to the Moon and ways to address those misconceptions. The workshops impacted the ways teachers taught about the Moon by providing them with resources to teach about the Moon and increased confidence in teaching about these topics. Participants reported ways that the workshop impacted their teaching practices beyond teaching about the Moon, encouraging them to include more inquiry and other teaching techniques demonstrated in the workshops in their science classes. Overall, the program evaluation has shown the Lunar Workshops for Educators are effective at increasing teachers’ knowledge about the Moon and their use of inquiry-based teaching in their classrooms. Additionally, the program supports participant teachers in integrating current scientific discoveries into their classrooms.
Fattore, Matteo; Arrigo, Patrizio
2005-01-01
The possibility of studying an organism in terms of system theory has been proposed in the past, but only the advancement of molecular biology techniques allows us to investigate the dynamical properties of a biological system in a more quantitative and rational way than before. These new techniques can give only a basic-level view of an organism's functionality. The comprehension of its dynamical behaviour depends on the possibility of performing a multiple-level analysis. Functional genomics has stimulated interest in investigating the dynamical behaviour of an organism as a whole. These activities are commonly known as System Biology, and its interests range from molecules to organs. One of the more promising applications is 'disease modeling'. The use of experimental models is a common procedure in pharmacological and clinical research; today this approach is supported by 'in silico' predictive methods. This investigation can be improved by a combination of experimental and computational tools. Machine Learning (ML) tools are able to process different heterogeneous data sources; given this capability, they could be fruitfully applied to support the multilevel data processing (molecular, cellular and morphological) that is the prerequisite for formal model design. These techniques can allow us to extract the knowledge needed for mathematical model development. The aim of our work is the development and implementation of a system that combines ML and dynamical model simulations. The program is addressed to the virtual analysis of the pathways involved in neurodegenerative diseases. These pathologies are multifactorial diseases and the relevance of the different factors has not yet been well elucidated. 
This is a very complex task; in order to test the integrative approach, our program has been limited to the analysis of the effects of a specific protein, the Cyclin dependent kinase 5 (CDK5), which relies on the induction of neuronal apoptosis. The system has a modular structure centred on a textual knowledge discovery approach. Text mining is the only way to enhance the capability to extract, from multiple data sources, the information required for the dynamical simulator. The user may access the publicly available modules through the following site: http://biocomp.ge.ismac.cnr.it.
Advances in studying phasic dopamine signaling in brain reward mechanisms
Wickham, Robert J.; Solecki, Wojciech; Rathbun, Liza R.; Neugebauer, Nichole M.; Wightman, R. Mark; Addy, Nii A.
2013-01-01
The last sixty years of research have provided extraordinary advances in our knowledge of the reward system. Since its initial discovery as a neurotransmitter by Carlsson and colleagues (Carlsson et al., 1957), dopamine (DA) has emerged as an important mediator of reward processing. As a result, a number of electrochemical techniques have been developed to directly measure DA levels in the brain using various preparations. Many of these techniques and preparations differ in the types of questions that they can address. Together, these techniques have begun to elucidate the complex roles of tonic and phasic DA signaling in reward processing and in addiction. In this review, we will first provide a guide to the most commonly used electrochemical methods for DA detection and describe their utility in furthering our knowledge about DA's role in reward and addiction. Second, we will review the value of common in vitro and in vivo preparations and describe their ability to address different types of questions. Last, we will review recent data that have provided new insight into the mechanisms of in vivo phasic DA signaling and its role in reward processing and reward-mediated behavior. PMID:23747914
A bioinformatics knowledge discovery in text application for grid computing
Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco
2009-01-01
Background: A fundamental activity in biomedical research is Knowledge Discovery, which has the ability to search through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems to achieve intensive use of ICT resources. Methods: The development of a grid application for Knowledge Discovery in Text using a middleware solution based methodology is presented. The system must be able to: perform a user application model, process the jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operative requirements led to the design of a middleware to be specialized using user application modules. It included a graphical user interface to access a node search system, a load balancing system and a transfer optimizer to reduce communication costs. Results: A middleware solution prototype and its performance evaluation in terms of the speed-up factor are shown. It was written in JAVA on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. Conclusion: In this paper we discuss the development of a grid application based on a middleware solution. 
It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Databases computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities. PMID:19534749
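The abstract evaluates the prototype by its speed-up factor without defining it. A minimal sketch, assuming the conventional definition (serial run time divided by parallel run time) and the derived parallel efficiency (speed-up divided by node count), is given below; the timings and node count are hypothetical and are not results from the paper.

```python
def speedup(serial_seconds, parallel_seconds):
    """Conventional speed-up factor: serial run time over parallel run time."""
    if parallel_seconds <= 0:
        raise ValueError("parallel_seconds must be positive")
    return serial_seconds / parallel_seconds


def efficiency(serial_seconds, parallel_seconds, nodes):
    """Parallel efficiency: speed-up divided by the number of nodes used."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup(serial_seconds, parallel_seconds) / nodes


if __name__ == "__main__":
    # Hypothetical timings for one batch of documents (assumed values).
    serial, parallel, nodes = 1200.0, 300.0, 8
    print(f"speed-up   = {speedup(serial, parallel):.2f}")
    print(f"efficiency = {efficiency(serial, parallel, nodes):.2f}")
```

An efficiency well below 1 usually points to overheads such as job distribution and data transfer, which is the kind of cost the load balancing system and transfer optimizer described in the abstract are meant to reduce.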
A bioinformatics knowledge discovery in text application for grid computing.
Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco
2009-06-16
A fundamental activity in biomedical research is Knowledge Discovery, which has the ability to search through large amounts of biomedical information such as documents and data. High performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems to achieve intensive use of ICT resources. The development of a grid application for Knowledge Discovery in Text using a middleware solution based methodology is presented. The system must be able to: perform a user application model, process the jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operative requirements led to the design of a middleware to be specialized using user application modules. It included a graphical user interface to access a node search system, a load balancing system and a transfer optimizer to reduce communication costs. A middleware solution prototype and its performance evaluation in terms of the speed-up factor are shown. It was written in JAVA on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. In this paper we discuss the development of a grid application based on a middleware solution. 
It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Databases computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities.
ERIC Educational Resources Information Center
Tsantis, Linda; Castellani, John
2001-01-01
This article explores how knowledge-discovery applications can empower educators with the information they need to provide anticipatory guidance for teaching and learning, forecast school and district needs, and find critical markers for making the best program decisions for children and youth with disabilities. Data mining for schools is…
ERIC Educational Resources Information Center
Molina, Otilia Alejandro; Ratté, Sylvie
2017-01-01
This research introduces a method to construct a unified representation of teachers' and students' perspectives based on the actionable knowledge discovery (AKD) and delivery framework. The representation is constructed using two models: one obtained from student evaluations and the other obtained from teachers' reflections about their teaching…
ERIC Educational Resources Information Center
Taft, Laritza M.
2010-01-01
In its report "To Err is Human", The Institute of Medicine recommended the implementation of internal and external voluntary and mandatory automatic reporting systems to increase detection of adverse events. Knowledge Discovery in Databases (KDD) allows the detection of patterns and trends that would be hidden or less detectable if analyzed by…
Knowledge Discovery Process: Case Study of RNAV Adherence of Radar Track Data
NASA Technical Reports Server (NTRS)
Matthews, Bryan
2018-01-01
This talk is an introduction to the knowledge discovery process, beginning with: identifying the problem, choosing data sources, matching the appropriate machine learning tools, and reviewing the results. The overview will be given in the context of an ongoing study that is assessing RNAV adherence of commercial aircraft in the national airspace.
A Virtual Bioinformatics Knowledge Environment for Early Cancer Detection
NASA Technical Reports Server (NTRS)
Crichton, Daniel; Srivastava, Sudhir; Johnsey, Donald
2003-01-01
Discovery of disease biomarkers for cancer is a leading focus of early detection. The National Cancer Institute created a network of collaborating institutions focused on the discovery and validation of cancer biomarkers called the Early Detection Research Network (EDRN). Informatics plays a key role in enabling a virtual knowledge environment that provides scientists real time access to distributed data sets located at research institutions across the nation. The distributed and heterogeneous nature of the collaboration makes data sharing across institutions very difficult. EDRN has developed a comprehensive informatics effort focused on developing a national infrastructure enabling seamless access, sharing and discovery of science data resources across all EDRN sites. This paper will discuss the EDRN knowledge system architecture, its objectives and its accomplishments.
ERIC Educational Resources Information Center
Harmon, Glynn
2013-01-01
The term discovery applies herein to the successful outcome of inquiry in which a significant personal, professional or scholarly breakthrough or insight occurs, and which is individually or socially acknowledged as a key contribution to knowledge. Since discoveries culminate at fixed points in time, discoveries can serve as an outcome metric for…
Jiang, Guoqian; Wang, Chen; Zhu, Qian; Chute, Christopher G
2013-01-01
Knowledge-driven text mining is becoming an important research area for identifying pharmacogenomics target genes. However, few such studies have focused on the pharmacogenomics targets of adverse drug events (ADEs). The objective of the present study is to build a framework of knowledge integration and discovery that aims to support pharmacogenomics target prediction of ADEs. We integrate a semantically annotated literature corpus, Semantic MEDLINE, with a semantically coded ADE knowledgebase known as ADEpedia using a semantic web based framework. We developed a knowledge discovery approach combining a network analysis of a protein-protein interaction (PPI) network and a gene functional classification approach. We performed a case study of drug-induced long QT syndrome to demonstrate the usefulness of the framework in predicting potential pharmacogenomics targets of ADEs.
The Proximal Lilly Collection: Mapping, Exploring and Exploiting Feasible Chemical Space.
Nicolaou, Christos A; Watson, Ian A; Hu, Hong; Wang, Jibo
2016-07-25
Venturing into the immensity of the small molecule universe to identify novel chemical structure is a much discussed objective of many methods proposed by the chemoinformatics community. To this end, numerous approaches using techniques from the fields of computational de novo design, virtual screening and reaction informatics, among others, have been proposed. Although in principle this objective is commendable, in practice there are several obstacles to useful exploitation of the chemical space. Prime among them are the sheer number of theoretically feasible compounds and the practical concern regarding the synthesizability of the chemical structures conceived using in silico methods. We present the Proximal Lilly Collection initiative implemented at Eli Lilly and Co. with the aims to (i) define the chemical space of small, drug-like compounds that could be synthesized using in-house resources and (ii) facilitate access to compounds in this large space for the purposes of ongoing drug discovery efforts. The implementation of PLC relies on coupling access to available synthetic knowledge and resources with chemo/reaction informatics techniques and tools developed for this purpose. We describe in detail the computational framework supporting this initiative and elaborate on the characteristics of the PLC virtual collection of compounds. As an example of the opportunities provided to drug discovery researchers by easy access to a large, realistically feasible virtual collection such as the PLC, we describe a recent application of the technology that led to the discovery of selective kinase inhibitors.
'Big Data' Collaboration: Exploring, Recording and Sharing Enterprise Knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sukumar, Sreenivas R; Ferrell, Regina Kay
2013-01-01
As data sources and data size proliferate, knowledge discovery from "Big Data" is starting to pose several challenges. In this paper, we address a specific challenge in the practice of enterprise knowledge management while extracting actionable nuggets from diverse data sources of seemingly related information. In particular, we address the challenge of archiving knowledge gained through collaboration, dissemination and visualization as part of the data analysis, inference and decision-making lifecycle. We motivate the implementation of an enterprise data-discovery and knowledge recorder tool, called SEEKER, based on a real-world case study. We demonstrate SEEKER capturing schema and data-element relationships, tracking the data elements of value based on the queries and the analytical artifacts that are being created by analysts as they use the data. We show how the tool serves as a digital record of institutional domain knowledge and as documentation for the evolution of data elements, queries and schemas over time. As a knowledge management service, a tool like SEEKER saves enterprise resources and time by avoiding analytic silos, expediting the process of multi-source data integration and intelligently documenting discoveries from fellow analysts.
Analysis student self efficacy in terms of using Discovery Learning model with SAVI approach
NASA Astrophysics Data System (ADS)
Sahara, Rifki; Mardiyana, S., Dewi Retno Sari
2017-12-01
Students are often unable to demonstrate academic achievement that matches their abilities. One reason is that they often feel unsure that they are capable of completing the tasks assigned to them. For students, such beliefs are necessary. This belief is called self-efficacy. Self-efficacy is not something one is born with, nor a permanent quality of an individual; it is the result of cognitive processes, meaning that a person's self-efficacy can be stimulated through learning activities. Self-efficacy can be developed and enhanced by a learning model that stimulates students to foster confidence in their capabilities. One such model is the Discovery Learning model with the SAVI approach, a learning model that involves the active participation of students in exploring and discovering their own knowledge and using it in problem solving by drawing on all of their senses. This naturalistic qualitative research aims to analyze student self-efficacy in terms of the use of the Discovery Learning model with the SAVI approach. The subjects of this study were 30 students, with a focus on eight students with high, medium, and low self-efficacy selected through purposive sampling. The data analysis used three stages: reducing the data, displaying the data, and drawing conclusions. Based on the results of the data analysis, it was concluded that the dimension of self-efficacy that appeared most dominantly in learning with the Discovery Learning model with the SAVI approach is the magnitude dimension.
X-ray crystallography over the past decade for novel drug discovery - where are we heading next?
Zheng, Heping; Handing, Katarzyna B; Zimmerman, Matthew D; Shabalin, Ivan G; Almo, Steven C; Minor, Wladek
2015-01-01
Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopy and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible.
Bouwknecht, J Adriaan
2015-04-15
The review describes a personal journey through 25 years of animal research with a focus on the contribution of rodent models for anxiety and depression to the development of new medicines in a drug discovery environment. Several classic acute models for mood disorders are briefly described as well as chronic stress and disease-induction models. The paper highlights a variety of factors that influence the quality and consistency of behavioral data in a laboratory setting. The importance of meta-analysis techniques for study validation (tolerance interval) and assay sensitivity (Monte Carlo modeling) is demonstrated by examples that use historic data. It is essential for successful discovery of new potential drugs to maintain a high level of control in animal research and to bridge knowledge across in silico modeling, and in vitro and in vivo assays. Today, drug discovery is a highly dynamic environment in search of new types of treatments and new animal models, which should be guided by enhanced two-way translation between bench and bed. Although productivity has been disappointing in the search for new and better medicines in psychiatry over the past decades, there has been and will always be an important role for in vivo models between preclinical discovery and clinical development. The right balance between good science and proper judgment versus a decent level of innovation, assay development and two-way translation will open the doors to a very bright future. Copyright © 2014 Elsevier B.V. All rights reserved.
An Intelligent Content Discovery Technique for Health Portal Content Management
2014-01-01
Background: Continuous content management of health information portals is a feature vital for their sustainability and widespread acceptance. Knowledge and experience of a domain expert is essential for content management in the health domain. The rate of generation of online health resources is exponential, and thereby manual examination for relevance to a specific topic and audience is a formidable challenge for domain experts. Intelligent content discovery for effective content management is a less researched topic. An existing expert-endorsed content repository can provide the necessary leverage to automatically identify relevant resources and evaluate qualitative metrics. Objective: This paper reports on the design research towards an intelligent technique for automated content discovery and ranking for health information portals. The proposed technique aims to improve the efficiency of the current, mostly manual process of portal content management by utilising an existing expert-endorsed content repository as a supporting base and a benchmark to evaluate the suitability of new content. Methods: A model for content management was established based on a field study of potential users. The proposed technique is integral to this content management model and executes in several phases (i.e., query construction, content search, text analytics and fuzzy multi-criteria ranking). The construction of multi-dimensional search queries with input from WordNet, the use of multi-word and single-word terms as representative semantics for text analytics and the use of fuzzy multi-criteria ranking for subjective evaluation of quality metrics are original contributions reported in this paper. Results: The feasibility of the proposed technique was examined with experiments conducted on an actual health information portal, the BCKOnline portal. Both intermediary and final results generated by the technique are presented in the paper, and these help to establish the benefits of the technique and its contribution towards effective content management. Conclusions: The prevalence of large numbers of online health resources is a key obstacle for domain experts involved in content management of health information portals and websites. The proposed technique has proven successful at search and identification of resources and the measurement of their relevance. It can be used to support the domain expert in content management and thereby ensure the health portal is up-to-date and current. PMID:25654440
NASA Technical Reports Server (NTRS)
Tilton, James C.; Cook, Diane J.
2008-01-01
Under a project recently selected for funding by NASA's Science Mission Directorate under the Applied Information Systems Research (AISR) program, Tilton and Cook will design and implement the integration of the Subdue graph based knowledge discovery system, developed at the University of Texas Arlington and Washington State University, with image segmentation hierarchies produced by the RHSEG software, developed at NASA GSFC, and perform pilot demonstration studies of data analysis, mining and knowledge discovery on NASA data. Subdue represents a method for discovering substructures in structural databases. Subdue is devised for general-purpose automated discovery, concept learning, and hierarchical clustering, with or without domain knowledge. Subdue was developed by Cook and her colleague, Lawrence B. Holder. For Subdue to be effective in finding patterns in imagery data, the data must be abstracted up from the pixel domain. An appropriate abstraction of imagery data is a segmentation hierarchy: a set of several segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. The RHSEG program, a recursive approximation to a Hierarchical Segmentation approach (HSEG), can produce segmentation hierarchies quickly and effectively for a wide variety of images. RHSEG and HSEG were developed at NASA GSFC by Tilton. In this presentation we provide background on the RHSEG and Subdue technologies and present a preliminary analysis on how RHSEG and Subdue may be combined to enhance image data analysis, mining and knowledge discovery.
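The abstract above defines a segmentation hierarchy as a set of segmentations in which each coarser level is obtained from a finer one by simple region merges. A minimal sketch of that nesting property follows. It is not the RHSEG or HSEG algorithm: the toy label image, the merge list and the function names are all hypothetical, and the sketch does not decide which regions to merge or check that merged regions are adjacent.

```python
from typing import List, Tuple

Labels = List[List[int]]  # one region label per pixel


def apply_merge(labels: Labels, keep: int, absorb: int) -> Labels:
    """Return a new label image in which region `absorb` is relabelled as `keep`."""
    return [[keep if value == absorb else value for value in row] for row in labels]


def build_hierarchy(finest: Labels, merges: List[Tuple[int, int]]) -> List[Labels]:
    """Apply the merges one by one; every intermediate result is one hierarchy level."""
    levels = [finest]
    for keep, absorb in merges:
        levels.append(apply_merge(levels[-1], keep, absorb))
    return levels


def region_count(labels: Labels) -> int:
    """Number of distinct region labels in a label image."""
    return len({value for row in labels for value in row})


# Hypothetical 3x3 label image with four regions (finest level of detail).
finest = [
    [1, 1, 2],
    [3, 3, 2],
    [3, 4, 4],
]
# Hypothetical merge order: region 2 joins region 1, then region 4 joins region 3.
levels = build_hierarchy(finest, [(1, 2), (3, 4)])
for level in levels:
    print(region_count(level), level)
```

Because each level is derived from the previous one by relabelling alone, every region at a coarse level is a union of regions at the finer levels, which is what allows a graph-based tool to reason over regions instead of pixels.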
Predicting future discoveries from current scientific literature.
Petrič, Ingrid; Cestnik, Bojan
2014-01-01
Knowledge discovery in biomedicine is a time-consuming process starting from basic research, through preclinical testing, towards possible clinical applications. Crossing of conceptual boundaries is often needed for groundbreaking biomedical research that generates highly inventive discoveries. We demonstrate the ability of a creative literature mining method to advance valuable new discoveries based on rare ideas from existing literature. When emerging ideas from scientific literature are put together as fragments of knowledge in a systematic way, they may lead to original, sometimes surprising, research findings. If enough scientific evidence is already published for the association of such findings, they can be considered as scientific hypotheses. In this chapter, we describe a method for the computer-aided generation of such hypotheses based on the existing scientific literature. Our literature-based discovery of NF-kappaB with its possible connections to autism was recently approved by the scientific community, which confirms the ability of our literature mining methodology to accelerate future discoveries based on rare ideas from existing literature.
Bioenergy Knowledge Discovery Framework Fact Sheet
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
The Bioenergy Knowledge Discovery Framework (KDF) supports the development of a sustainable bioenergy industry by providing access to a variety of data sets, publications, and collaboration and mapping tools that support bioenergy research, analysis, and decision making. In the KDF, users can search for information, contribute data, and use the tools and map interface to synthesize, analyze, and visualize information in a spatially integrated manner.
Teachers' Journal Club: Bridging between the Dynamics of Biological Discoveries and Biology Teachers
ERIC Educational Resources Information Center
Brill, Gilat; Falk, Hedda; Yarden, Anat
2003-01-01
Since biology is one of the most dynamic research fields within the natural sciences, the gap between the accumulated knowledge in biology and the knowledge that is taught in schools, increases rapidly with time. Our long-term objective is to develop means to bridge between the dynamics of biological discoveries and the biology teachers and…
Security Services Discovery by ATM Endsystems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sholander, Peter; Tarman, Thomas
This contribution proposes strawman techniques for Security Service Discovery by ATM endsystems in ATM networks. Candidate techniques include ILMI extensions, ANS extensions and new ATM anycast addresses. Another option is a new protocol based on an IETF service discovery protocol, such as Service Location Protocol (SLP). Finally, this contribution provides strawman requirements for Security-Based Routing in ATM networks.
Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji
2016-01-01
Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press. PMID:26989145
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.
2013-01-01
The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities both for purely statistical and expert knowledge-based approaches and would benefit from improved integration of the two. Areas covered: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We will describe our recommendations of possible approaches to this problem including metrics for the evaluation of biomarkers.
Computational functional genomics-based approaches in analgesic drug discovery and repurposing.
Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred; Lötsch, Jörn
2018-06-01
Persistent pain is a major healthcare problem affecting a fifth of adults worldwide with still limited treatment options. The search for new analgesics increasingly includes the novel research area of functional genomics, which combines data derived from various processes related to DNA sequence, gene expression or protein function and uses advanced methods of data mining and knowledge discovery with the goal of understanding the relationship between the genome and the phenotype. Its use in drug discovery and repurposing for analgesic indications has so far been performed using knowledge discovery in gene function and drug target-related databases; next-generation sequencing; and functional proteomics-based approaches. Here, we discuss recent efforts in functional genomics-based approaches to analgesic drug discovery and repurposing and highlight the potential of computational functional genomics in this field including a demonstration of the workflow using a novel R library 'dbtORA'.
Ahmed, Wamiq M; Lenz, Dominik; Liu, Jia; Paul Robinson, J; Ghafoor, Arif
2008-03-01
High-throughput biological imaging uses automated imaging devices to collect a large number of microscopic images for analysis of biological systems and validation of scientific hypotheses. Efficient manipulation of these datasets for knowledge discovery requires high-performance computational resources, efficient storage, and automated tools for extracting and sharing such knowledge among different research sites. Newly emerging grid technologies provide powerful means for exploiting the full potential of these imaging techniques. Efficient utilization of grid resources requires the development of knowledge-based tools and services that combine domain knowledge with analysis algorithms. In this paper, we first investigate how grid infrastructure can facilitate high-throughput biological imaging research, and present an architecture for providing knowledge-based grid services for this field. We identify two levels of knowledge-based services. The first level provides tools for extracting spatiotemporal knowledge from image sets and the second level provides high-level knowledge management and reasoning services. We then present cellular imaging markup language, an extensible markup language-based language for modeling of biological images and representation of spatiotemporal knowledge. This scheme can be used for spatiotemporal event composition, matching, and automated knowledge extraction and representation for large biological imaging datasets. We demonstrate the expressive power of this formalism by means of different examples and extensive experimental results.
The Genetic Basis of Mendelian Phenotypes: Discoveries, Challenges, and Opportunities
Chong, Jessica X.; Buckingham, Kati J.; Jhangiani, Shalini N.; Boehm, Corinne; Sobreira, Nara; Smith, Joshua D.; Harrell, Tanya M.; McMillin, Margaret J.; Wiszniewski, Wojciech; Gambin, Tomasz; Coban Akdemir, Zeynep H.; Doheny, Kimberly; Scott, Alan F.; Avramopoulos, Dimitri; Chakravarti, Aravinda; Hoover-Fong, Julie; Mathews, Debra; Witmer, P. Dane; Ling, Hua; Hetrick, Kurt; Watkins, Lee; Patterson, Karynne E.; Reinier, Frederic; Blue, Elizabeth; Muzny, Donna; Kircher, Martin; Bilguvar, Kaya; López-Giráldez, Francesc; Sutton, V. Reid; Tabor, Holly K.; Leal, Suzanne M.; Gunel, Murat; Mane, Shrikant; Gibbs, Richard A.; Boerwinkle, Eric; Hamosh, Ada; Shendure, Jay; Lupski, James R.; Lifton, Richard P.; Valle, David; Nickerson, Deborah A.; Bamshad, Michael J.
2015-01-01
Discovering the genetic basis of a Mendelian phenotype establishes a causal link between genotype and phenotype, making possible carrier and population screening and direct diagnosis. Such discoveries also contribute to our knowledge of gene function, gene regulation, development, and biological mechanisms that can be used for developing new therapeutics. As of February 2015, 2,937 genes underlying 4,163 Mendelian phenotypes have been discovered, but the genes underlying ∼50% (i.e., 3,152) of all known Mendelian phenotypes are still unknown, and many more Mendelian conditions have yet to be recognized. This is a formidable gap in biomedical knowledge. Accordingly, in December 2011, the NIH established the Centers for Mendelian Genomics (CMGs) to provide the collaborative framework and infrastructure necessary for undertaking large-scale whole-exome sequencing and discovery of the genetic variants responsible for Mendelian phenotypes. In partnership with 529 investigators from 261 institutions in 36 countries, the CMGs assessed 18,863 samples from 8,838 families representing 579 known and 470 novel Mendelian phenotypes as of January 2015. This collaborative effort has identified 956 genes, including 375 not previously associated with human health, that underlie a Mendelian phenotype. These results provide insight into study design and analytical strategies, identify novel mechanisms of disease, and reveal the extensive clinical variability of Mendelian phenotypes. Discovering the gene underlying every Mendelian phenotype will require tackling challenges such as worldwide ascertainment and phenotypic characterization of families affected by Mendelian conditions, improvement in sequencing and analytical techniques, and pervasive sharing of phenotypic and genomic data among researchers, clinicians, and families. PMID:26166479
Williams, Kevin; Bilsland, Elizabeth; Sparkes, Andrew; Aubrey, Wayne; Young, Michael; Soldatova, Larisa N; De Grave, Kurt; Ramon, Jan; de Clare, Michaela; Sirawaraporn, Worachart; Oliver, Stephen G; King, Ross D
2015-03-06
There is an urgent need to make drug discovery cheaper and faster. This will enable the development of treatments for diseases currently neglected for economic reasons, such as tropical and orphan diseases, and generally increase the supply of new drugs. Here, we report the Robot Scientist 'Eve' designed to make drug discovery more economical. A Robot Scientist is a laboratory automation system that uses artificial intelligence (AI) techniques to discover scientific knowledge through cycles of experimentation. Eve integrates and automates library-screening, hit-confirmation, and lead generation through cycles of quantitative structure activity relationship learning and testing. Using econometric modelling we demonstrate that the use of AI to select compounds economically outperforms standard drug screening. For further efficiency Eve uses a standardized form of assay to compute Boolean functions of compound properties. These assays can be quickly and cheaply engineered using synthetic biology, enabling more targets to be assayed for a given budget. Eve has repositioned several drugs against specific targets in parasites that cause tropical diseases. One validated discovery is that the anti-cancer compound TNP-470 is a potent inhibitor of dihydrofolate reductase from the malaria-causing parasite Plasmodium vivax. PMID:25652463
Current Developments in Machine Learning Techniques in Biological Data Mining.
Dumancas, Gerard G; Adrianto, Indra; Bello, Ghalib; Dozmorov, Mikhail
2017-01-01
This supplement is intended to focus on the use of machine learning techniques to generate meaningful information on biological data. This supplement under Bioinformatics and Biology Insights aims to provide scientists and researchers working in this rapid and evolving field with online, open-access articles authored by leading international experts in this field. Advances in the field of biology have generated massive opportunities to allow the implementation of modern computational and statistical techniques. Machine learning methods in particular, a subfield of computer science, have evolved as an indispensable tool applied to a wide spectrum of bioinformatics applications. Thus, it is broadly used to investigate the underlying mechanisms leading to a specific disease, as well as the biomarker discovery process. With a growth in this specific area of science comes the need to access up-to-date, high-quality scholarly articles that will leverage the knowledge of scientists and researchers in the various applications of machine learning techniques in mining biological data.
Knowledge Discovery from Posts in Online Health Communities Using Unified Medical Language System.
Chen, Donghua; Zhang, Runtong; Liu, Kecheng; Hou, Lei
2018-06-19
Patient-reported posts in Online Health Communities (OHCs) contain various valuable information that can help establish knowledge-based online support for online patients. However, utilizing these reports to improve online patient services in the absence of appropriate medical and healthcare expert knowledge is difficult. Thus, we propose a comprehensive knowledge discovery method that is based on the Unified Medical Language System for the analysis of narrative posts in OHCs. First, we propose a domain-knowledge support framework for OHCs to provide a basis for post analysis. Second, we develop a Knowledge-Involved Topic Modeling (KI-TM) method to extract and expand explicit knowledge within the text. We propose four metrics, namely, explicit knowledge rate, latent knowledge rate, knowledge correlation rate, and perplexity, for the evaluation of the KI-TM method. Our experimental results indicate that our proposed method outperforms existing methods in terms of providing knowledge support. Our method enhances knowledge support for online patients and can help develop intelligent OHCs in the future.
Distributed Noise Generation for Density Estimation Based Clustering without Trusted Third Party
NASA Astrophysics Data System (ADS)
Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi
The rapid growth of the Internet provides people with tremendous opportunities for data collection, knowledge discovery and cooperative computation. However, it also brings the problem of sensitive information leakage. Both individuals and enterprises may suffer from the massive data collection and the information retrieval by distrusted parties. In this paper, we propose a privacy-preserving protocol for the distributed kernel density estimation-based clustering. Our scheme applies random data perturbation (RDP) technique and the verifiable secret sharing to solve the security problem of distributed kernel density estimation in [4] which assumed a mediate party to help in the computation.
Learning in the context of distribution drift
2017-05-09
published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative...Figure 7: Architecture for learning from streaming data in the context of variable or unknown...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD
A Bioinformatic Approach to Inter Functional Interactions within Protein Sequences
2009-02-23
AFOSR/AOARD Reference Number: USAFAOGA07: FA4869-07-1-4050 AFOSR/AOARD Program Manager: Hiroshi Motoda, Ph.D. Period of...Conference on Knowledge Discovery and Data Mining.) In a separate study we have applied our approaches to the problem of whole genome alignment. We have...SIGKDD Conference on Knowledge Discovery and Data Mining Attached. Interactions: Please list: (a) Participation/presentations at meetings
Xiang, Yang; Lu, Kewei; James, Stephen L.; Borlawsky, Tara B.; Huang, Kun; Payne, Philip R.O.
2011-01-01
The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous works have shown that knowledge constructs comprised of transitively-associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS becomes a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, and the corresponding method to effectively evaluate the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large scale knowledge discovery. We demonstrated that it is highly effective to use kDLS paths to prioritize disease-gene relations across the whole genome, with extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large scale knowledge discovery on the UMLS as a whole. Our expectation is that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS for future medical informatics applications. PMID:22154838
Xiang, Yang; Lu, Kewei; James, Stephen L; Borlawsky, Tara B; Huang, Kun; Payne, Philip R O
2012-04-01
The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous works have shown that knowledge constructs comprised of transitively-associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS becomes a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, and the corresponding method to effectively evaluate the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large scale knowledge discovery. We demonstrated that it is highly effective to use kDLS paths to prioritize disease-gene relations across the whole genome, with extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large scale knowledge discovery on the UMLS as a whole. Our expectation is that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS for future medical informatics applications. Copyright © 2011 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R
2013-01-01
The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as volume, velocity, variety and complexity of enterprise data keeps increasing, the next generation analysts are facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering are newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.
User needs analysis and usability assessment of DataMed - a biomedical data discovery index.
Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R
2017-11-30
To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Knowledge Retrieval Solutions.
ERIC Educational Resources Information Center
Khan, Kamran
1998-01-01
Excalibur RetrievalWare offers true knowledge retrieval solutions. Its fundamental technologies, Adaptive Pattern Recognition Processing and Semantic Networks, have capabilities for knowledge discovery and knowledge management of full-text, structured and visual information. The software delivers a combination of accuracy, extensibility,…
Knowledge extraction from evolving spiking neural networks with rank order population coding.
Soltic, Snjezana; Kasabov, Nikola
2010-12-01
This paper demonstrates how knowledge can be extracted from evolving spiking neural networks with rank order population coding. Knowledge discovery is a very important feature of intelligent systems. Yet, a disproportionally small amount of research is centered on the issue of knowledge extraction from spiking neural networks which are considered to be the third generation of artificial neural networks. The lack of knowledge representation compatibility is becoming a major detriment to end users of these networks. We show that a high-level knowledge can be obtained from evolving spiking neural networks. More specifically, we propose a method for fuzzy rule extraction from an evolving spiking network with rank order population coding. The proposed method was used for knowledge discovery on two benchmark taste recognition problems where the knowledge learnt by an evolving spiking neural network was extracted in the form of zero-order Takagi-Sugeno fuzzy IF-THEN rules.
Flood AI: An Intelligent Systems for Discovery and Communication of Disaster Knowledge
NASA Astrophysics Data System (ADS)
Demir, I.; Sermet, M. Y.
2017-12-01
Communities are not immune from extreme events or natural disasters that can lead to large-scale consequences for the nation and public. Improving resilience to better prepare, plan, recover, and adapt to disasters is critical to reduce the impacts of extreme events. The National Research Council (NRC) report discusses how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data, information, gaps and knowledge challenges that need to be addressed, and suggests that every individual access risk and vulnerability information to make their communities more resilient. This project presents an intelligent system, Flood AI, for flooding to improve societal preparedness by providing a knowledge engine using voice recognition, artificial intelligence, and natural language processing based on a generalized ontology for disasters with a primary focus on flooding. The knowledge engine utilizes the flood ontology and concepts to connect user input to relevant knowledge discovery channels on flooding by developing a data acquisition and processing framework utilizing environmental observations, forecast models, and knowledge bases. Communication channels of the framework include web-based systems, agent-based chat bots, smartphone applications, automated web workflows, and smart home devices, opening the knowledge discovery for flooding to many unique use cases.
Duncan, Dean F; Kum, Hye-Chung; Weigensberg, Elizabeth Caplick; Flair, Kimberly A; Stewart, C Joy
2008-11-01
Proper management and implementation of an effective child welfare agency requires the constant use of information about the experiences and outcomes of children involved in the system, emphasizing the need for comprehensive, timely, and accurate data. In the past 20 years, there have been many advances in technology that can maximize the potential of administrative data to promote better evaluation and management in the field of child welfare. Specifically, this article discusses the use of knowledge discovery and data mining (KDD), which makes it possible to create longitudinal data files from administrative data sources, extract valuable knowledge, and make the information available via a user-friendly public Web site. This article demonstrates a successful project in North Carolina where knowledge discovery and data mining technology was used to develop a comprehensive set of child welfare outcomes available through a public Web site to facilitate information sharing of child welfare data to improve policy and practice.
NSF's Perspective on Space Weather Research for Building Forecasting Capabilities
NASA Astrophysics Data System (ADS)
Bisi, M. M.; Pulkkinen, A. A.; Webb, D. F.; Oughton, E. J.; Azeem, S. I.
2017-12-01
Space weather research at the National Science Foundation (NSF) is focused on scientific discovery and on deepening knowledge of the Sun-Geospace system. The process of maturation of knowledge base is a requirement for the development of improved space weather forecast models and for the accurate assessment of potential mitigation strategies. Progress in space weather forecasting requires advancing in-depth understanding of the underlying physical processes, developing better instrumentation and measurement techniques, and capturing the advancements in understanding in large-scale physics based models that span the entire chain of events from the Sun to the Earth. This presentation will provide an overview of current and planned programs pertaining to space weather research at NSF and discuss the recommendations of the Geospace Section portfolio review panel within the context of space weather forecasting capabilities.
Developing integrated crop knowledge networks to advance candidate gene discovery.
Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher
2016-12-01
The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
Hit discovery and hit-to-lead approaches.
Keseru, György M; Makara, Gergely M
2006-08-01
Hit discovery technologies range from traditional high-throughput screening to affinity selection of large libraries, fragment-based techniques and computer-aided de novo design, many of which have been extensively reviewed. Development of quality leads using hit confirmation and hit-to-lead approaches present their own challenges, depending on the hit discovery method used to identify the initial hits. In this paper, we summarize common industry practices adopted to tackle hit-to-lead challenges and review how the advantages and drawbacks of different hit discovery techniques could affect the various issues hit-to-lead groups face.
Cancer drug discovery: recent innovative approaches to tumor modeling.
Lovitt, Carrie J; Shelper, Todd B; Avery, Vicky M
2016-09-01
Cell culture models have been at the heart of anti-cancer drug discovery programs for over half a century. Advancements in cell culture techniques have seen the rapid evolution of more complex in vitro cell culture models investigated for use in drug discovery. Three-dimensional (3D) cell culture research has become a strong focal point, as this technique permits the recapitulation of the tumor microenvironment. Biologically relevant 3D cellular models have demonstrated significant promise in advancing cancer drug discovery, and will continue to play an increasing role in the future. In this review, recent advances in 3D cell culture techniques and their application in tumor modeling and anti-cancer drug discovery programs are discussed. The topics include selection of cancer cells, 3D cell culture assays (associated endpoint measurements and analysis), 3D microfluidic systems and 3D bio-printing. Although advanced cancer cell culture models and techniques are becoming commonplace in many research groups, the use of these approaches has yet to be fully embraced in anti-cancer drug applications. Furthermore, limitations associated with analyzing information-rich biological data remain unaddressed.
X-ray crystallography over the past decade for novel drug discovery – where are we heading next?
Zheng, Heping; Handing, Katarzyna B; Zimmerman, Matthew D; Shabalin, Ivan G; Almo, Steven C; Minor, Wladek
2015-01-01
Introduction: Macromolecular X-ray crystallography has been the primary methodology for determining the three-dimensional structures of proteins, nucleic acids and viruses. Structural information has paved the way for structure-guided drug discovery and laid the foundations for structural bioinformatics. However, X-ray crystallography still has a few fundamental limitations, some of which may be overcome and complemented using emerging methods and technologies in other areas of structural biology. Areas covered: This review describes how structural knowledge gained from X-ray crystallography has been used to advance other biophysical methods for structure determination (and vice versa). This article also covers current practices for integrating data generated by other biochemical and biophysical methods with those obtained from X-ray crystallography. Finally, the authors articulate their vision about how a combination of structural and biochemical/biophysical methods may improve our understanding of biological processes and interactions. Expert opinion: X-ray crystallography has been, and will continue to serve as, the central source of experimental structural biology data used in the discovery of new drugs. However, other structural biology techniques are useful not only to overcome the major limitation of X-ray crystallography, but also to provide complementary structural data that is useful in drug discovery. The use of recent advancements in biochemical, spectroscopy and bioinformatics methods may revolutionize drug discovery, albeit only when these data are combined and analyzed with effective data management systems. Accurate and complete data management is crucial for developing experimental procedures that are robust and reproducible. PMID:26177814
The application of absolute quantitative (1)H NMR spectroscopy in drug discovery and development.
Singh, Suruchi; Roy, Raja
2016-07-01
The identification of a drug candidate and its structural determination is the most important step in the process of the drug discovery and for this, nuclear magnetic resonance (NMR) is one of the most selective analytical techniques. The present review illustrates the various perspectives of absolute quantitative (1)H NMR spectroscopy in drug discovery and development. It deals with the fundamentals of quantitative NMR (qNMR), the physiochemical properties affecting qNMR, and the latest referencing techniques used for quantification. The precise application of qNMR during various stages of drug discovery and development, namely natural product research, drug quantitation in dosage forms, drug metabolism studies, impurity profiling and solubility measurements is elaborated. To achieve this, the authors explore the literature of NMR in drug discovery and development between 1963 and 2015. It also takes into account several other reviews on the subject. qNMR experiments are used for drug discovery and development processes as it is a non-destructive, versatile and robust technique with high intra and interpersonal variability. However, there are several limitations also. qNMR of complex biological samples is incorporated with peak overlap and a low limit of quantification and this can be overcome by using hyphenated chromatographic techniques in addition to NMR.
On the Growth of Scientific Knowledge: Yeast Biology as a Case Study
He, Xionglei; Zhang, Jianzhi
2009-01-01
The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration. PMID:19300476
On the growth of scientific knowledge: yeast biology as a case study.
He, Xionglei; Zhang, Jianzhi
2009-03-01
The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration.
Gladysz, Rafaela; Cleenewerck, Matthias; Joossens, Jurgen; Lambeir, Anne-Marie; Augustyns, Koen; Van der Veken, Pieter
2014-10-13
Fragment-based drug discovery (FBDD) has evolved into an established approach for "hit" identification. Typically, most applications of FBDD depend on specialised cost- and time-intensive biophysical techniques. The substrate activity screening (SAS) approach has been proposed as a relatively cheap and straightforward alternative for identification of fragments for enzyme inhibitors. We have investigated SAS for the discovery of inhibitors of oncology target urokinase (uPA). Although our results support the key hypotheses of SAS, we also encountered a number of unreported limitations. In response, we propose an efficient modified methodology: "MSAS" (modified substrate activity screening). MSAS circumvents the limitations of SAS and broadens its scope by providing additional fragments and more coherent SAR data. As well as presenting and validating MSAS, this study expands existing SAR knowledge for the S1 pocket of uPA and reports new reversible and irreversible uPA inhibitor scaffolds. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Gafurov, O.; Gafurov, D.; Syryamkin, V.
2018-05-01
The paper analyses a field of computer science formed at the intersection of such areas of natural science as artificial intelligence, mathematical statistics, and database theory, which is referred to as "Data Mining" (discovery of knowledge in data). The theory of neural networks is applied along with classical methods of mathematical analysis and numerical simulation. The paper describes the technique protected by the patent of the Russian Federation for the invention “A Method for Determining Location of Production Wells during the Development of Hydrocarbon Fields” [1–3] and implemented using the geoinformation system NeuroInformGeo. There are no analogues in domestic and international practice. The paper gives an example of comparing the forecast of the oil reservoir quality made by the geophysicist interpreter using standard methods and the forecast of the oil reservoir quality made using this technology. The technical result achieved shows the increase of efficiency, effectiveness, and ecological compatibility of development of mineral deposits and discovery of a new oil deposit.
Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns
Volrathongchia, Kanittha
2003-01-01
In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
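The three pre-processing problems named in this abstract (redundant observations, inaccurate records, missing data) can be illustrated with a minimal sketch. It is not the study's own procedure: the function name `preprocess`, the field names, the plausibility range, and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median


def preprocess(records, numeric_field, valid_range):
    """Hypothetical helper: deduplicate, drop implausible records, impute gaps.

    records       -- list of dicts with hashable values (assumed layout)
    numeric_field -- name of the numeric field to check and impute
    valid_range   -- (low, high) bounds treated as plausible (assumption)
    """
    low, high = valid_range

    # 1. Redundant observations: keep the first copy of each identical record.
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))

    # 2. Inaccurate data: drop records whose value falls outside the range.
    plausible = [
        r for r in unique
        if r[numeric_field] is None or low <= r[numeric_field] <= high
    ]

    # 3. Missing data: fill gaps with the median of the observed values.
    observed = [r[numeric_field] for r in plausible if r[numeric_field] is not None]
    fill = median(observed) if observed else None
    for r in plausible:
        if r[numeric_field] is None:
            r[numeric_field] = fill
    return plausible
```

Median imputation is only one simple option; whether dropping or correcting out-of-range records is appropriate depends on the dataset and would need to be decided case by case.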
Exploiting Early Intent Recognition for Competitive Advantage
2009-01-01
basketball [Bhandari et al., 1997; Jug et al., 2003], and Robocup soccer simulations [Riley and Veloso, 2000; 2002; Kuhlmann et al., 2006] and non...actions (e.g. before, after, around). Jug et al. [2003] used a similar framework for offline basketball game analysis. More recently, Hess et al...and K. Ramanujam. Advanced Scout: Data mining and knowledge discovery in NBA data. Data Mining and Knowledge Discovery, 1(1):121–125, 1997. [Chang
ERIC Educational Resources Information Center
Fyfe, Emily R.; DeCaro, Marci S.; Rittle-Johnson, Bethany
2013-01-01
An emerging consensus suggests that guided discovery, which combines discovery and instruction, is a more effective educational approach than either one in isolation. The goal of this study was to examine two specific forms of guided discovery, testing whether conceptual instruction should precede or follow exploratory problem solving. In both…
ERIC Educational Resources Information Center
Liu, Chen-Chung; Don, Ping-Hsing; Chung, Chen-Wei; Lin, Shao-Jun; Chen, Gwo-Dong; Liu, Baw-Jhiune
2010-01-01
While Web discovery is usually undertaken as a solitary activity, Web co-discovery may transform Web learning activities from the isolated individual search process into interactive and collaborative knowledge exploration. Recent studies have proposed Web co-search environments on a single computer, supported by multiple one-to-one technologies.…
Knowledge Management in Higher Education: A Knowledge Repository Approach
ERIC Educational Resources Information Center
Wedman, John; Wang, Feng-Kwei
2005-01-01
One might expect higher education, where the discovery and dissemination of new and useful knowledge is vital, to be among the first to implement knowledge management practices. Surprisingly, higher education has been slow to implement knowledge management practices (Townley, 2003). This article describes an ongoing research and development effort…
Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhary, Alok
Computational scientists must understand results from experimental, observational and computational simulation generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive I/O, storage, acceleration of data manipulation, analysis, and mining tools. Scientists require techniques, tools and infrastructure to facilitate better understanding of their data, in particular the ability to effectively perform complex data analysis, statistical analysis and knowledge discovery. The goal of this work is to enable more effective analysis of scientific datasets through the integration of enhancements in the I/O stack, from active storage support at the file system layer to MPI-IO and high-level I/O library layers. We propose to provide software components to accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing productivity of both scientists and the systems. Our approaches include 1) design the interfaces in high-level I/O libraries, such as parallel netCDF, for applications to activate data mining operations at the lower I/O layers; 2) Enhance MPI-IO runtime systems to incorporate the functionality developed as a part of the runtime system design; 3) Develop parallel data mining programs as part of runtime library for server-side file system in PVFS file system; and 4) Prototype an active storage cluster, which will utilize multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.
Crowdsourcing Knowledge Discovery and Innovations in Medicine
2014-01-01
Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. “Health hackathons” and “data marathons”, in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled. PMID:25239002
Crowdsourcing knowledge discovery and innovations in medicine.
Celi, Leo Anthony; Ippolito, Andrea; Montgomery, Robert A; Moses, Christopher; Stone, David J
2014-09-19
Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. "Health hackathons" and "data marathons", in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled.
Empirical study using network of semantically related associations in bridging the knowledge gap.
Abedi, Vida; Yeasin, Mohammed; Zand, Ramin
2014-11-27
The data overload has created a new set of challenges in finding meaningful and relevant information with minimal cognitive effort. However, designing robust and scalable knowledge discovery systems remains a challenge. Recent innovations in (biological) literature mining tools have opened new avenues to understand the confluence of various diseases, genes, risk factors as well as biological processes in bridging the gaps between the massive amounts of scientific data and harvesting useful knowledge. In this paper, we highlight some of the findings using a text analytics tool, called ARIANA--Adaptive Robust and Integrative Analysis for finding Novel Associations. An empirical study using ARIANA reveals knowledge discovery instances that illustrate the efficacy of such a tool. For example, ARIANA can capture the connection between the drug hexamethonium and pulmonary inflammation and fibrosis that caused the tragic death of a healthy volunteer in a 2001 Johns Hopkins asthma study, even though the abstract of the study was not part of the semantic model. An integrated system, such as ARIANA, could assist the human expert in exploratory literature search by bringing forward hidden associations, promoting data reuse and knowledge discovery as well as stimulating interdisciplinary projects by connecting information across the disciplines.
NASA Astrophysics Data System (ADS)
Bettencourt, Luis; Kaiser, David
2004-03-01
Based on a historically documented example of scientific discovery - Feynman diagrams as the main calculational tool of theoretical high energy physics - we map the time evolution of the social network of early adopters in the US, UK, Japan and the USSR. The spread of the technique, measured as the total number of users in each region, is then modelled in terms of epidemic models, highlighting parallel and divergent aspects of this analogy. We also show that transient social arrangements develop as the idea is introduced and learned, which later disappear as the technique becomes common knowledge. This early transient is characterized by abnormally low connectivity distribution powers and by high clustering. This interesting early non-equilibrium stage of network evolution is captured by a new dynamical model for network evolution, which coincides in its long time limit with familiar preferential aggregation dynamics.
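The abstract above does not say which epidemic model was fitted, so the following is only a generic illustration of the idea of treating adoption of a technique like contagion. It is a minimal SIR-style sketch integrated with Euler steps; the function name `simulate_sir`, the compartment interpretation (not-yet-users, active users, former users) and all parameter values are assumptions made for illustration, not values from the study.

```python
def simulate_sir(beta, gamma, s0, i0, r0, steps, dt=0.1):
    """Euler integration of a generic SIR model (illustrative only).

    s: people who have not yet adopted the idea
    i: people actively using and spreading it
    r: people who have stopped spreading it
    beta: contact/adoption rate, gamma: drop-out rate (both hypothetical)
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, conserved by the update below
    history = [(s, i, r)]
    for _ in range(steps):
        new_adopters = beta * s * i / n * dt  # contacts that lead to adoption
        new_dropouts = gamma * i * dt         # adopters who stop spreading
        s -= new_adopters
        i += new_adopters - new_dropouts
        r += new_dropouts
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical community of 1000 physicists with 10 initial adopters.
    trajectory = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, r0=0, steps=200)
    s_end, i_end, r_end = trajectory[-1]
    print(f"not yet adopted: {s_end:.1f}, active: {i_end:.1f}, former: {r_end:.1f}")
```

Fitting `beta` and `gamma` separately per region would be one way to compare how quickly an idea spreads in different communities, which is the kind of comparison the abstract describes.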
2017-04-01
ADVANCED VISUALIZATION AND INTERACTIVE DISPLAY RAPID INNOVATION AND DISCOVERY EVALUATION RESEARCH (VISRIDER) PROGRAM TASK 6: POINT CLOUD...To) OCT 2013 – SEP 2014 4. TITLE AND SUBTITLE ADVANCED VISUALIZATION AND INTERACTIVE DISPLAY RAPID INNOVATION AND DISCOVERY EVALUATION RESEARCH...various point cloud visualization techniques for viewing large scale LiDAR datasets. Evaluate their potential use for thick client desktop platforms
Text-based discovery in biomedicine: the architecture of the DAD-system.
Weeber, M; Klein, H; Aronson, A R; Mork, J G; de Jong-van den Berg, L T; Vos, R
2000-01-01
Current scientific research takes place in highly specialized contexts with poor communication between disciplines as a likely consequence. Knowledge from one discipline may be useful for the other without researchers knowing it. As scientific publications are a condensation of this knowledge, literature-based discovery tools may help the individual scientist to explore new useful domains. We report on the development of the DAD-system, a concept-based Natural Language Processing system for PubMed citations that provides the biomedical researcher such a tool. We describe the general architecture and illustrate its operation by a simulation of a well-known text-based discovery: The favorable effects of fish oil on patients suffering from Raynaud's disease [1].
Which are the greatest recent discoveries and the greatest future challenges in nutrition?
Katan, M B; Boekschoten, M V; Connor, W E; Mensink, R P; Seidell, J; Vessby, B; Willett, W
2009-01-01
Nutrition science aims to create new knowledge, but scientists rarely sit back to reflect on what nutrition research has achieved in recent decades. We report the outcome of a 1-day symposium at which the audience was asked to vote on the greatest discoveries in nutrition since 1976 and on the greatest challenges for the coming 30 years. Most of the 128 participants were Dutch scientists working in nutrition or related biomedical and public health fields. Candidate discoveries and challenges were nominated by five invited speakers and by members of the audience. Ballot forms were then prepared on which participants selected one discovery and one challenge. A total of 15 discoveries and 14 challenges were nominated. The audience elected Folic acid prevents birth defects as the greatest discovery in nutrition science since 1976. Controlling obesity and insulin resistance through activity and diet was elected as the greatest challenge for the coming 30 years. This selection was probably biased by the interests and knowledge of the speakers and the audience. For the present review, we therefore added 12 discoveries from the period 1976 to 2006 that we judged worthy of consideration, but that had not been nominated at the meeting. The meeting did not represent an objective selection process, but it did demonstrate that the past 30 years have yielded major new discoveries in nutrition and health.
Translating three states of knowledge--discovery, invention, and innovation
2010-01-01
Background: Knowledge Translation (KT) has historically focused on the proper use of knowledge in healthcare delivery. A knowledge base has been created through empirical research and resides in scholarly literature. Some knowledge is amenable to direct application by stakeholders who are engaged during or after the research process, as shown by the Knowledge to Action (KTA) model. Other knowledge requires multiple transformations before achieving utility for end users. For example, conceptual knowledge generated through science or engineering may become embodied as a technology-based invention through development methods. The invention may then be integrated within an innovative device or service through production methods. To what extent is KT relevant to these transformations? How might the KTA model accommodate these additional development and production activities while preserving the KT concepts? Discussion: Stakeholders adopt and use knowledge that has perceived utility, such as a solution to a problem. Achieving a technology-based solution involves three methods that generate knowledge in three states, analogous to the three classic states of matter. Research activity generates discoveries that are intangible and highly malleable like a gas; development activity transforms discoveries into inventions that are moderately tangible yet still malleable like a liquid; and production activity transforms inventions into innovations that are tangible and immutable like a solid. The paper demonstrates how the KTA model can accommodate all three types of activity and address all three states of knowledge. Linking the three activities in one model also illustrates the importance of engaging the relevant stakeholders prior to initiating any knowledge-related activities. Summary: Science and engineering focused on technology-based devices or services change the state of knowledge through three successive activities. Achieving knowledge implementation requires methods that accommodate these three activities and knowledge states. Accomplishing beneficial societal impacts from technology-based knowledge involves the successful progression through all three activities, and the effective communication of each successive knowledge state to the relevant stakeholders. The KTA model appears suitable for structuring and linking these processes. PMID:20205873
NIU, Kiyoshi
2008-01-01
This is a historical review of the discovery of naked charm particles and lifetime differences among charm species. These discoveries in the field of cosmic-ray physics were made by the innovation of nuclear emulsion techniques in Japan. A pair of naked charm particles was discovered in 1971 in a cosmic-ray interaction, three years prior to the discovery of the hidden charm particle, J/Ψ, in western countries. Lifetime differences between charged and neutral charm particles were pointed out in 1975, which were later re-confirmed by the collaborative Experiment E531 at Fermilab. Japanese physicists led by K.Niu made essential contributions to it with improved emulsion techniques, complemented by electronic detectors. This review also discusses the discovery of artificially produced naked charm particles by us in an accelerator experiment at Fermilab in 1975 and of multiple-pair productions of charm particles in a single interaction in 1987 by the collaborative Experiment WA75 at CERN. PMID:18941283
Open-source tools for data mining.
Zupan, Blaz; Demsar, Janez
2008-03-01
With a growing volume of biomedical databases and repositories, the need to develop a set of tools to address their analysis and support knowledge discovery is becoming acute. The data mining community has developed a substantial set of techniques for computational treatment of these data. In this article, we discuss the evolution of open-source toolboxes that data mining researchers and enthusiasts have developed over the span of a few decades and review several currently available open-source data mining suites. The approaches we review are diverse in data mining methods and user interfaces and also demonstrate that the field and its tools are ready to be fully exploited in biomedical research.
Instrumentation for the detection and characterization of exoplanets.
Pepe, Francesco; Ehrenreich, David; Meyer, Michael R
2014-09-18
In no other field of astrophysics has the impact of new instrumentation been as substantial as in the domain of exoplanets. Before 1995 our knowledge of exoplanets was mainly based on philosophical and theoretical considerations. The years that followed have been marked, instead, by surprising discoveries made possible by high-precision instruments. Over the past decade, the availability of new techniques has moved the focus of research from the detection to the characterization of exoplanets. Next-generation facilities will produce even more complementary data that will lead to a comprehensive view of exoplanet characteristics and, by comparison with theoretical models, to a better understanding of planet formation.
Distributed data mining on grids: services, tools, and applications.
Cannataro, Mario; Congiusta, Antonio; Pugliese, Andrea; Talia, Domenico; Trunfio, Paolo
2004-12-01
Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called Knowledge Grid. This paper describes the Knowledge Grid framework and presents the toolset provided by the Knowledge Grid for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the Knowledge Grid tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.
Knowledge Discovery/A Collaborative Approach, an Innovative Solution
NASA Technical Reports Server (NTRS)
Fitts, Mary A.
2009-01-01
Collaboration between Medical Informatics and Healthcare Systems (MIHCS) at NASA/Johnson Space Center (JSC) and the Texas Medical Center (TMC) Library was established to investigate technologies for facilitating knowledge discovery across multiple life sciences research disciplines in multiple repositories. After reviewing 14 potential Enterprise Search System (ESS) solutions, Collexis was determined to best meet the expressed needs. A three month pilot evaluation of Collexis produced positive reports from multiple scientists across 12 research disciplines. The joint venture and a pilot-phased approach achieved the desired results without the high cost of purchasing software, hardware or additional resources to conduct the task. Medical research is highly compartmentalized by discipline, e.g. cardiology, immunology, neurology. The medical research community at large, as well as at JSC, recognizes the need for cross-referencing relevant information to generate best evidence. Cross-discipline collaboration at JSC is specifically required to close knowledge gaps affecting space exploration. To facilitate knowledge discovery across these communities, MIHCS combined expertise with the TMC library and found Collexis to best fit the needs of our researchers including:
Knowledge Discovery in Spectral Data by Means of Complex Networks
Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro
2013-01-01
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease. PMID:24957895
Knowledge discovery in spectral data by means of complex networks.
Zanin, Massimiliano; Papo, David; Solís, José Luis González; Espinosa, Juan Carlos Martínez; Frausto-Reyes, Claudio; Anda, Pascual Palomares; Sevilla-Escoboza, Ricardo; Jaimes-Reategui, Rider; Boccaletti, Stefano; Menasalvas, Ernestina; Sousa, Pedro
2013-03-11
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
Jóźwik, Jagoda; Kałużna-Czaplińska, Joanna
2016-01-01
Currently, analysis of various human body fluids is one of the most essential and promising approaches to enable the discovery of biomarkers or pathophysiological mechanisms for disorders and diseases. Analysis of these fluids is challenging due to their complex composition and unique characteristics. Development of new analytical methods in this field has made it possible to analyze body fluids with higher selectivity, sensitivity, and precision. The composition and concentration of analytes in body fluids are most often determined by chromatography-based techniques. There is no doubt that proper use of knowledge that comes from a better understanding of the role of body fluids requires the cooperation of scientists of diverse specializations, including analytical chemists, biologists, and physicians. This article summarizes current knowledge about the application of different chromatographic methods in analyses of a wide range of compounds in human body fluids in order to diagnose certain diseases and disorders.
The Adam and Eve Robot Scientists for the Automated Discovery of Scientific Knowledge
NASA Astrophysics Data System (ADS)
King, Ross
A Robot Scientist is a physically implemented robotic system that applies techniques from artificial intelligence to execute cycles of automated scientific experimentation. A Robot Scientist can automatically execute cycles of hypothesis formation, selection of efficient experiments to discriminate between hypotheses, execution of experiments using laboratory automation equipment, and analysis of results. The motivation for developing Robot Scientists is to better understand science, and to make scientific research more efficient. The Robot Scientist 'Adam' was the first machine to autonomously discover scientific knowledge, that is, to both form and experimentally confirm novel hypotheses. Adam worked in the domain of yeast functional genomics. The Robot Scientist 'Eve' was originally developed to automate early-stage drug development, with specific application to neglected tropical diseases such as malaria and African sleeping sickness. We are now adapting Eve to work on cancer. We are also teaching Eve to autonomously extract information from the scientific literature.
PubMedMiner: Mining and Visualizing MeSH-based Associations in PubMed.
Zhang, Yucan; Sarkar, Indra Neil; Chen, Elizabeth S
2014-01-01
The exponential growth of biomedical literature provides the opportunity to develop approaches for facilitating the identification of possible relationships between biomedical concepts. Indexing by Medical Subject Headings (MeSH) represents high-quality summaries of much of this literature that can be used to support hypothesis generation and knowledge discovery tasks using techniques such as association rule mining. Based on a survey of literature mining tools, a tool implemented using Ruby and R - PubMedMiner - was developed in this study for mining and visualizing MeSH-based associations for a set of MEDLINE articles. To demonstrate PubMedMiner's functionality, a case study was conducted that focused on identifying and comparing comorbidities for asthma in children and adults. Relative to the tools surveyed, the initial results suggest that PubMedMiner provides complementary functionality for summarizing and comparing topics as well as identifying potentially new knowledge.
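As a rough illustration of association rule mining over MeSH-style index terms, the sketch below counts term co-occurrences across articles and reports pairwise rules with their support and confidence. It is not PubMedMiner's code (that tool is described as using Ruby and R); the function name `pairwise_rules`, the thresholds and the toy article list are all hypothetical, and only two-term rules are considered to keep the example short.

```python
from collections import Counter
from itertools import combinations


def pairwise_rules(articles, min_support=0.3, min_confidence=0.7):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    articles: list of term lists, one list per article (hypothetical input).
    support = fraction of articles containing both terms.
    confidence = P(consequent | antecedent), estimated from counts.
    """
    n = len(articles)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in articles:
        unique = sorted(set(terms))  # ignore repeated terms within an article
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / term_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


# Made-up index terms for five articles; not real MEDLINE data.
toy_articles = [
    ["Asthma", "Child", "Rhinitis"],
    ["Asthma", "Child", "Obesity"],
    ["Asthma", "Adult", "Obesity"],
    ["Asthma", "Child", "Rhinitis"],
    ["Asthma", "Adult"],
]

if __name__ == "__main__":
    for antecedent, consequent, support, confidence in pairwise_rules(toy_articles):
        print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Running the same function on two article sets (for example, one restricted to children and one to adults) and comparing the resulting rules mirrors the kind of comorbidity comparison the abstract mentions.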
Computational knowledge integration in biopharmaceutical research.
Ficenec, David; Osborne, Mark; Pradines, Joel; Richards, Dan; Felciano, Ramon; Cho, Raymond J; Chen, Richard O; Liefeld, Ted; Owen, James; Ruttenberg, Alan; Reich, Christian; Horvath, Joseph; Clark, Tim
2003-09-01
An initiative to increase biopharmaceutical research productivity by capturing, sharing and computationally integrating proprietary scientific discoveries with public knowledge is described. This initiative involves both organisational process change and multiple interoperating software systems. The software components rely on mutually supporting integration techniques. These include a richly structured ontology, statistical analysis of experimental data against stored conclusions, natural language processing of public literature, secure document repositories with lightweight metadata, web services integration, enterprise web portals and relational databases. This approach has already begun to increase scientific productivity in our enterprise by creating an organisational memory (OM) of internal research findings, accessible on the web. Through bringing together these components it has also been possible to construct a very large and expanding repository of biological pathway information linked to this repository of findings which is extremely useful in analysis of DNA microarray data. This repository, in turn, enables our research paradigm to be shifted towards more comprehensive systems-based understandings of drug action.
A brief history of vaccines: smallpox to the present.
Hsu, Jennifer L
2013-01-01
Modern vaccine history began in the late 18th century with the discovery of smallpox immunization by Edward Jenner. This pivotal step led to substantial progress in prevention of infectious diseases with inactivated vaccines for multiple infectious diseases, including typhoid, plague and cholera. Each advance produced significant decreases in infection-associated morbidity and mortality, thus shaping our modern cultures. As knowledge of microbiology and immunology grew through the 20th century, techniques were developed for cell culture of viruses. This allowed for rapid advances in prevention of polio, varicella, influenza and others. Finally, recent research has led to development of alternative vaccine strategies through use of vectored antigens, pathogen subunits (purified proteins or polysaccharides) or genetically engineered antigens. As the science of vaccinology continues to rapidly evolve, knowledge of the past creates added emphasis on the importance of developing safe and effective strategies for infectious disease prevention in the 21st century.
Building Scalable Knowledge Graphs for Earth Science
NASA Technical Reports Server (NTRS)
Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian
2017-01-01
Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.
Genetic discoveries and nursing implications for complex disease prevention and management.
Frazier, Lorraine; Meininger, Janet; Halsey Lea, Dale; Boerwinkle, Eric
2004-01-01
The purpose of this article is to examine the management of patients with complex diseases, in light of recent genetic discoveries, and to explore how these genetic discoveries will impact nursing practice and nursing research. The nursing science processes discussed are not comprehensive of all nursing practice but, instead, are concentrated in areas where genetics will have the greatest influence. Advances in genetic science will revolutionize our approach to patients and to health care in the prevention, diagnosis, and treatment of disease, raising many issues for nursing research and practice. As the scope of genetics expands to encompass multifactorial disease processes, a continuing reexamination of the knowledge base is required for nursing practice, with incorporation of genetic knowledge into the repertoire of every nurse, and with advanced knowledge for nurses who select specialty roles in the genetics area. This article explores the impact of this revolution on nursing science and practice as well as the opportunities for nursing science and practice to participate fully in this revolution. Because of the high proportion of the population at risk for complex diseases and because nurses are occupied every day in the prevention, assessment, treatment, and therapeutic intervention of patients with such diseases in practice and research, there is great opportunity for nurses to improve health care through the application (nursing practice) and discovery (nursing research) of genetic knowledge.
Discovery informatics in biological and biomedical sciences: research challenges and opportunities.
Honavar, Vasant
2015-01-01
New discoveries in biological, biomedical and health sciences are increasingly being driven by our ability to acquire, share, integrate and analyze data, and to construct and simulate predictive models of biological systems. While much attention has focused on automating routine aspects of management and analysis of "big data", realizing the full potential of "big data" to accelerate discovery calls for automating many other aspects of the scientific process that have so far largely resisted automation: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Against this background, the PSB workshop on Discovery Informatics in Biological and Biomedical Sciences explores the opportunities and challenges of automating discovery or assisting humans in discovery through advances in (i) understanding, formalization, and information-processing accounts of the entire scientific process; (ii) design, development, and evaluation of the computational artifacts (representations, processes) that embody such understanding; and (iii) application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science).
Schmalhofer, F J; Tschaitschian, B
1998-11-01
In this paper, we perform a cognitive analysis of knowledge discovery processes. As a result of this analysis, the construction-integration theory is proposed as a general framework for developing cooperative knowledge evolution systems. We thus suggest that for the acquisition of new domain knowledge in medicine, one should first construct pluralistic views on a given topic which may contain inconsistencies as well as redundancies. Only thereafter does this knowledge become consolidated into a situation-specific circumscription and the early inconsistencies become eliminated. As a proof for the viability of such knowledge acquisition processes in medicine, we present the IDEAS system, which can be used for the intelligent documentation of adverse events in clinical studies. This system provides a better documentation of the side-effects of medical drugs. Thereby, knowledge evolution occurs by achieving consistent explanations in increasingly larger contexts (i.e., more cases and more pharmaceutical substrates). Finally, it is shown how prototypes, model-based approaches and cooperative knowledge evolution systems can be distinguished as different classes of knowledge-based systems.
Pituitary Medicine From Discovery to Patient-Focused Outcomes
2016-01-01
Context: This perspective traces a pipeline of discovery in pituitary medicine over the past 75 years. Objective: To place in context past advances and predict future changes in understanding pituitary pathophysiology and clinical care. Design: Author's perspective on reports of pituitary advances in the published literature. Setting: Clinical and translational Endocrinology. Outcomes: Discovery of the hypothalamic-pituitary axis and mechanisms for pituitary control, have culminated in exquisite understanding of anterior pituitary cell function and dysfunction. Challenges facing the discipline include fundamental understanding of pituitary adenoma pathogenesis leading to more effective treatments of inexorably growing and debilitating hormone secreting pituitary tumors as well as medical management of non-secreting pituitary adenomas. Newly emerging pituitary syndromes include those associated with immune-targeted cancer therapies and head trauma. Conclusions: Novel diagnostic techniques including imaging genomic, proteomic, and biochemical analyses will yield further knowledge to enable diagnosis of heretofore cryptic syndromes, as well as sub classifications of pituitary syndromes for personalized treatment approaches. Cost effective personalized approaches to precision therapy must demonstrate value, and will be empowered by multidisciplinary approaches to integrating complex subcellular information to identify therapeutic targets for enabling maximal outcomes. These goals will be challenging to attain given the rarity of pituitary disorders and the difficulty in conducting appropriately powered prospective trials. PMID:26908107
Building Better Decision-Support by Using Knowledge Discovery.
ERIC Educational Resources Information Center
Jurisica, Igor
2000-01-01
Discusses knowledge-based decision-support systems that use artificial intelligence approaches. Addresses the issue of how to create an effective case-based reasoning system for complex and evolving domains, focusing on automated methods for system optimization and domain knowledge evolution that can supplement knowledge acquired from domain…
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.
2012-01-01
Introduction: The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946
Mining the Quantified Self: Personal Knowledge Discovery as a Challenge for Data Science.
Fawcett, Tom
2015-12-01
The last several years have seen an explosion of interest in wearable computing, personal tracking devices, and the so-called quantified self (QS) movement. Quantified self involves ordinary people recording and analyzing numerous aspects of their lives to understand and improve themselves. This is now a mainstream phenomenon, attracting a great deal of attention, participation, and funding. As more people are attracted to the movement, companies are offering various new platforms (hardware and software) that allow ever more aspects of daily life to be tracked. Nearly every aspect of the QS ecosystem is advancing rapidly, except for analytic capabilities, which remain surprisingly primitive. With increasing numbers of quantified self participants collecting ever greater amounts and types of data, many people literally have more data than they know what to do with. This article reviews the opportunities and challenges posed by the QS movement. Data science provides well-tested techniques for knowledge discovery. But making these useful for the QS domain poses unique challenges that derive from the characteristics of the data collected as well as the specific types of actionable insights that people want from the data. Using a small sample of QS time series data containing information about personal health, we provide a formulation of the QS problem that connects data to the decisions of interest to the user.
Rivo, Eduardo; de la Fuente, Javier; Rivo, Ángel; García-Fontán, Eva; Cañizares, Miguel-Ángel; Gil, Pedro
2012-01-01
The aim of this study was to assess the applicability of knowledge discovery in database methodology, based upon data mining techniques, to the investigation of lung cancer surgery. According to CRISP 1.0 methodology, a data mining (DM) project was developed on a data warehouse containing records for 501 patients operated on for lung cancer with curative intention. The modelling technique was logistic regression. The finally selected model presented the following values: sensitivity 9.68%, specificity 100%, global precision 94.02%, positive predictive value 100% and negative predictive value 93.98% for a cut-off point set at 0.5. A receiver operating characteristic (ROC) curve was constructed. The area under the curve (CI 95%) was 0.817 (0.740-0.893) (p < 0.05). Statistical association with perioperative mortality was found for the following variables [odds ratio (CI 95%)]: age over 70 [2.3822 (1.0338-5.4891)], heart disease [2.4875 (1.0089-6.1334)], peripheral arterial disease [5.7705 (1.9296-17.2570)], pneumonectomy [3.6199 (1.4939-8.7715)] and length of surgery (min) [1.0067 (1.0008-1.0126)]. The CRISP-DM process model is very suitable for lung cancer surgery analysis, improving decision making as well as knowledge and quality management.
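The reported sensitivity, specificity, global precision and predictive values all derive from a single confusion matrix, and a short sketch can make that relationship concrete. The counts used below (3 true positives, 28 false negatives, 437 true negatives, 0 false positives) are hypothetical: the abstract does not give them, and they were chosen only because they reproduce the reported percentages. The function name `classification_metrics` is likewise an assumption made for illustration.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return standard diagnostic metrics, in percent, from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": 100.0 * tp / (tp + fn),   # share of actual positives detected
        "specificity": 100.0 * tn / (tn + fp),   # share of actual negatives detected
        "accuracy": 100.0 * (tp + tn) / total,   # called "global precision" in the abstract
        "ppv": 100.0 * tp / (tp + fp),           # positive predictive value
        "npv": 100.0 * tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, NOT taken from the study; picked to match its percentages.
metrics = classification_metrics(tp=3, fn=28, tn=437, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```

Under these assumed counts, the combination of very low sensitivity and perfect specificity indicates that a 0.5 cut-off flags only a small fraction of the perioperative deaths; a lower cut-off would typically raise sensitivity at the cost of specificity.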
Solution NMR Spectroscopy in Target-Based Drug Discovery.
Li, Yan; Kang, Congbao
2017-08-23
Solution NMR spectroscopy is a powerful tool to study protein structures and dynamics under physiological conditions. This technique is particularly useful in target-based drug discovery projects as it provides protein-ligand binding information in solution. Accumulated studies have shown that NMR will play more and more important roles in multiple steps of the drug discovery process. In a fragment-based drug discovery process, ligand-observed and protein-observed NMR spectroscopy can be applied to screen fragments with low binding affinities. The screened fragments can be further optimized into drug-like molecules. In combination with other biophysical techniques, NMR will guide structure-based drug discovery. In this review, we describe the possible roles of NMR spectroscopy in drug discovery. We also illustrate the challenges encountered in the drug discovery process. We include several examples demonstrating the roles of NMR in target-based drug discoveries such as hit identification, ranking ligand binding affinities, and mapping the ligand binding site. We also speculate on the possible roles of NMR in target engagement based on recent progress in in-cell NMR spectroscopy.
Fundamentals of microfluidic cell culture in controlled microenvironments
Young, Edmond W. K.; Beebe, David J.
2010-01-01
Microfluidics has the potential to revolutionize the way we approach cell biology research. The dimensions of microfluidic channels are well suited to the physical scale of biological cells, and the many advantages of microfluidics make it an attractive platform for new techniques in biology. One of the key benefits of microfluidics for basic biology is the ability to control parameters of the cell microenvironment at relevant length and time scales. Considerable progress has been made in the design and use of novel microfluidic devices for culturing cells and for subsequent treatment and analysis. With the recent pace of scientific discovery, it is becoming increasingly important to evaluate existing tools and techniques, and to synthesize fundamental concepts that would further improve the efficiency of biological research at the microscale. This tutorial review integrates fundamental principles from cell biology and local microenvironments with cell culture techniques and concepts in microfluidics. Culturing cells in microscale environments requires knowledge of multiple disciplines including physics, biochemistry, and engineering. We discuss basic concepts related to the physical and biochemical microenvironments of the cell, physicochemical properties of that microenvironment, cell culture techniques, and practical knowledge of microfluidic device design and operation. We also discuss the most recent advances in microfluidic cell culture and their implications on the future of the field. The goal is to guide new and interested researchers to the important areas and challenges facing the scientific community as we strive toward full integration of microfluidics with biology. PMID:20179823
Narumi, Ryohei; Tomonaga, Takeshi
2016-01-01
Mass spectrometry-based phosphoproteomics is an indispensable technique used in the discovery and quantification of phosphorylation events on proteins in biological samples. The application of this technique to tissue samples is especially useful for the discovery of biomarkers as well as biological studies. We herein describe the application of a large-scale phosphoproteome analysis and SRM/MRM-based quantitation to develop a strategy for the systematic discovery and validation of biomarkers using tissue samples.
To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure
NASA Astrophysics Data System (ADS)
Stock, Kristin; Stojanovic, Tim; Reitsma, Femke; Ou, Yang; Bishr, Mohamed; Ortmann, Jens; Robertson, Anne
2012-08-01
A geospatial knowledge infrastructure consists of a set of interoperable components, including software, information, hardware, procedures and standards, that work together to support advanced discovery and creation of geoscientific resources, including publications, data sets and web services. The focus of the work presented is the development of such an infrastructure for resource discovery. Advanced resource discovery is intended to support scientists in finding resources that meet their needs, and focuses on representing the semantic details of the scientific resources, including the detailed aspects of the science that led to the resource being created. This paper describes an information model for a geospatial knowledge infrastructure that uses ontologies to represent these semantic details, including knowledge about domain concepts, the scientific elements of the resource (analysis methods, theories and scientific processes) and web services. This semantic information can be used to enable more intelligent search over scientific resources, and to support new ways to infer and visualise scientific knowledge. The work describes the requirements for semantic support of a knowledge infrastructure, and analyses the different options for information storage based on the twin goals of semantic richness and syntactic interoperability to allow communication between different infrastructures. Such interoperability is achieved by the use of open standards, and the architecture of the knowledge infrastructure adopts such standards, particularly from the geospatial community. The paper then describes an information model that uses a range of different types of ontologies, explaining those ontologies and their content. The information model was successfully implemented in a working geospatial knowledge infrastructure, but the evaluation identified some issues in creating the ontologies.
Hassani-Pak, Keywan; Rawlings, Christopher
2017-06-13
Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.
Reasoning and Knowledge Acquisition Framework for 5G Network Analytics
2017-01-01
Autonomic self-management is a key challenge for next-generation networks. This paper proposes an automated analysis framework to infer knowledge in 5G networks with the aim to understand the network status and to predict potential situations that might disrupt the network operability. The framework is based on the Endsley situational awareness model, and integrates automated capabilities for metrics discovery, pattern recognition, prediction techniques and rule-based reasoning to infer anomalous situations in the current operational context. Those situations should then be mitigated, either proactively or reactively, by a more complex decision-making process. The framework is driven by a use case methodology, where the network administrator is able to customize the knowledge inference rules and operational parameters. The proposal has also been instantiated to prove its adaptability to a real use case. To this end, a reference network traffic dataset was used to identify suspicious patterns and to predict the behavior of the monitored data volume. The preliminary results suggest a good level of accuracy on the inference of anomalous traffic volumes based on a simple configuration. PMID:29065473
NASA Astrophysics Data System (ADS)
Huang, Zan; Chen, Hsinchun; Yip, Alan; Ng, Gavin; Guo, Fei; Chen, Zhi-Kai; Roco, Mihail C.
2003-08-01
Nanoscale science and engineering (NSE) and related areas have seen rapid growth in recent years. The speed and scope of development in the field have made it essential for researchers to be informed on the progress across different laboratories, companies, industries and countries. In this project, we experimented with several analysis and visualization techniques on NSE-related United States patent documents to support various knowledge tasks. This paper presents results on the basic analysis of nanotechnology patents between 1976 and 2002, content map analysis and citation network analysis. The data have been obtained on individual countries, institutions and technology fields. The top 10 countries with the largest number of nanotechnology patents are the United States, Japan, France, the United Kingdom, Taiwan, Korea, the Netherlands, Switzerland, Italy and Australia. The fastest growth in the last 5 years has been in chemical and pharmaceutical fields, followed by semiconductor devices. The results demonstrate potential of information-based discovery and visualization technologies to capture knowledge regarding nanotechnology performance, transfer of knowledge and trends of development through analyzing the patent documents.
Reasoning and Knowledge Acquisition Framework for 5G Network Analytics.
Sotelo Monge, Marco Antonio; Maestre Vidal, Jorge; García Villalba, Luis Javier
2017-10-21
Autonomic self-management is a key challenge for next-generation networks. This paper proposes an automated analysis framework to infer knowledge in 5G networks with the aim to understand the network status and to predict potential situations that might disrupt the network operability. The framework is based on the Endsley situational awareness model, and integrates automated capabilities for metrics discovery, pattern recognition, prediction techniques and rule-based reasoning to infer anomalous situations in the current operational context. Those situations should then be mitigated, either proactively or reactively, by a more complex decision-making process. The framework is driven by a use case methodology, where the network administrator is able to customize the knowledge inference rules and operational parameters. The proposal has also been instantiated to prove its adaptability to a real use case. To this end, a reference network traffic dataset was used to identify suspicious patterns and to predict the behavior of the monitored data volume. The preliminary results suggest a good level of accuracy on the inference of anomalous traffic volumes based on a simple configuration.
NASA Astrophysics Data System (ADS)
Izquierdo, Joaquín; Montalvo, Idel; Campbell, Enrique; Pérez-García, Rafael
2016-08-01
Selecting the most appropriate heuristic for solving a specific problem is not easy, for many reasons. This article focuses on one of these reasons: traditionally, the solution search process has operated in a given manner regardless of the specific problem being solved, and the process has been the same regardless of the size, complexity and domain of the problem. To cope with this situation, search processes should mould the search into areas of the search space that are meaningful for the problem. This article builds on previous work in the development of a multi-agent paradigm using techniques derived from knowledge discovery (data-mining techniques) on databases of so-far visited solutions. The aim is to improve the search mechanisms, increase computational efficiency and use rules to enrich the formulation of optimization problems, while reducing the search space and catering to realistic problems.
Exploring patterns of epigenetic information with data mining techniques.
Aguiar-Pulido, Vanessa; Seoane, José A; Gestal, Marcos; Dorado, Julián
2013-01-01
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable, determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could then be used to extract the previous patterns. This work reviews some of the most important applications of data mining to epigenetics.
ERIC Educational Resources Information Center
Weeber, Marc; Klein, Henny; de Jong-van den Berg, Lolkje T. W.; Vos, Rein
2001-01-01
Proposes a two-step model of discovery in which new scientific hypotheses can be generated and subsequently tested. Applying advanced natural language processing techniques to find biomedical concepts in text, the model is implemented in a versatile interactive discovery support tool. This tool is used to successfully simulate Don R. Swanson's…
Cross-organism learning method to discover new gene functionalities.
Domeniconi, Giacomo; Masseroli, Marco; Moro, Gianluca; Pinoli, Pietro
2016-04-01
Knowledge of gene and protein functions is paramount for the understanding of physiological and pathological biological processes, as well as in the development of new drugs and therapies. Analyses for biomedical knowledge discovery greatly benefit from the availability of gene and protein functional feature descriptions expressed through controlled terminologies and ontologies, i.e., of gene and protein biomedical controlled annotations. In the last years, several databases of such annotations have become available; yet, these valuable annotations are incomplete, include errors and only some of them represent highly reliable human curated information. Computational techniques able to reliably predict new gene or protein annotations with an associated likelihood value are thus paramount. Here, we propose a novel cross-organisms learning approach to reliably predict new functionalities for the genes of an organism based on the known controlled annotations of the genes of another, evolutionarily related and better studied, organism. We leverage a new representation of the annotation discovery problem and a random perturbation of the available controlled annotations to allow the application of supervised algorithms to predict with good accuracy unknown gene annotations. Taking advantage of the numerous gene annotations available for a well-studied organism, our cross-organisms learning method creates and trains better prediction models, which can then be applied to predict new gene annotations of a target organism. We tested and compared our method with the equivalent single organism approach on different gene annotation datasets of five evolutionarily related organisms (Homo sapiens, Mus musculus, Bos taurus, Gallus gallus and Dictyostelium discoideum). 
Results show both the usefulness of the perturbation method of available annotations for better prediction model training and a great improvement of the cross-organism models with respect to the single-organism ones, without influence of the evolutionary distance between the considered organisms. The generated ranked lists of reliably predicted annotations, which describe novel gene functionalities and have an associated likelihood value, are very valuable both to complement available annotations, for better coverage in biomedical knowledge discovery analyses, and to quicken the annotation curation process, by focusing it on the prioritized novel annotations predicted. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Hypergraph Based Feature Selection Technique for Medical Diagnosis.
Somu, Nivethitha; Raman, M R Gauthama; Kirthivasan, Kannan; Sriram, V S Shankar
2016-11-01
The impact of the internet and information systems across various domains has resulted in the substantial generation of multidimensional datasets. The use of data mining and knowledge discovery techniques to extract the original information contained in these multidimensional datasets plays a significant role in fully exploiting the benefits they provide. The presence of a large number of features in high-dimensional datasets incurs a high computational cost in terms of computing power and time. Hence, feature selection techniques have been commonly used to build robust machine learning models by selecting a subset of relevant features that preserves the maximal information content of the original dataset. In this paper, a novel Rough Set based K-Helly feature selection technique (RSKHT), which hybridizes Rough Set Theory (RST) and the K-Helly property of hypergraph representation, has been designed to identify the optimal feature subset or reduct for medical diagnostic applications. Experiments carried out using the medical datasets from the UCI repository demonstrate the dominance of the RSKHT over other feature selection techniques with respect to reduct size, classification accuracy and time complexity. The performance of the RSKHT has been validated using the WEKA tool, which shows that RSKHT is computationally attractive and flexible over massive datasets.
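To make the notion of a rough-set reduct more concrete, the sketch below computes the classical rough-set dependency degree (the fraction of records whose decision is fully determined by a chosen feature subset) and grows a feature subset greedily until it matches the dependency of the full feature set. This is the generic greedy "QuickReduct"-style heuristic, not the RSKHT method of the paper, and it does not use the K-Helly hypergraph property. The toy table, feature names and function names are all hypothetical.

```python
from collections import defaultdict


def dependency_degree(rows, features, decision):
    """Fraction of rows whose decision value is determined by `features`."""
    blocks = defaultdict(list)
    for row in rows:
        # Rows with identical values on `features` are indiscernible.
        key = tuple(row[f] for f in features)
        blocks[key].append(row[decision])
    # A block is consistent when all its rows share one decision value.
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, features, decision):
    """Greedily add the feature that most raises the dependency degree."""
    target = dependency_degree(rows, features, decision)
    reduct = []
    while dependency_degree(rows, reduct, decision) < target:
        best = max(
            (f for f in features if f not in reduct),
            key=lambda f: dependency_degree(rows, reduct + [f], decision),
        )
        reduct.append(best)
    return reduct


# Hypothetical toy decision table, for illustration only.
rows = [
    {"fever": "yes", "cough": "yes", "age": "old", "flu": "yes"},
    {"fever": "yes", "cough": "no", "age": "young", "flu": "yes"},
    {"fever": "no", "cough": "yes", "age": "old", "flu": "no"},
    {"fever": "no", "cough": "no", "age": "young", "flu": "no"},
]
features = ["fever", "cough", "age"]
selected = quick_reduct(rows, features, "flu")
print(selected)
```

A greedy search of this kind is cheap but is not guaranteed to return a minimal reduct, which is one motivation for more elaborate selection schemes such as the one the abstract describes.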
Great Originals of Modern Physics
ERIC Educational Resources Information Center
Decker, Fred W.
1972-01-01
European travel can provide an intimate view of the implements and locales of great discoveries in physics for the knowledgeable traveler. The four museums at Cambridge, London, Remscheid-Lennep, and Munich display a full range of discovery apparatus in modern physics as outlined here. (Author/TS)
ERIC Educational Resources Information Center
MacKenzie, Marion
1983-01-01
Scientific research leading to the discovery of female plants of the red alga Palmaria palmata (dulse) is described. This discovery has not only advanced knowledge of marine organisms and taxonomic relationships but also has practical implications. The complete life cycle of this organism is included. (JN)
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
Scientific Knowledge Discovery in Complex Semantic Networks of Geophysical Systems
NASA Astrophysics Data System (ADS)
Fox, P.
2012-04-01
The vast majority of explorations of the Earth's systems are limited in their ability to effectively explore the most important (often most difficult) problems because they are forced to interconnect at the data-element, or syntactic, level rather than at a higher scientific, or semantic, level. Recent successes in the application of complex network theory and algorithms to climate data raise expectations that more general graph-based approaches offer the opportunity for new discoveries. In the past ~5 years in the natural sciences there has been substantial progress in providing both specialists and non-specialists the ability to describe in machine readable form, geophysical quantities and relations among them in meaningful and natural ways, effectively breaking the prior syntax barrier. The corresponding open-world semantics and reasoning provide higher-level interconnections. That is, semantics are provided around the data structures, using semantically-equipped tools and semantically aware interfaces between science application components, allowing for discovery at the knowledge level. More recently, formal semantic approaches to continuous and aggregate physical processes are beginning to show promise and are soon likely to be ready to apply to geoscientific systems. To illustrate these opportunities, this presentation presents two application examples featuring domain vocabulary (ontology) and property relations (named and typed edges in the graphs). First, a climate knowledge discovery pilot encoding and exploration of CMIP5 catalog information with the eventual goal to encode and explore CMIP5 data. Second, a multi-stakeholder knowledge network for integrated assessments in marine ecosystems, where the data is highly inter-disciplinary.
Beginning to manage drug discovery and development knowledge.
Sumner-Smith, M
2001-05-01
Knowledge management approaches and technologies are beginning to be implemented by the pharmaceutical industry in support of new drug discovery and development processes aimed at greater efficiencies and effectiveness. This trend coincides with moves to reduce paper, coordinate larger teams with more diverse skills that are distributed around the globe, and to comply with regulatory requirements for electronic submissions and the associated maintenance of electronic records. Concurrently, the available technologies have implemented web-based architectures with a greater range of collaborative tools and personalization through portal approaches. However, successful application of knowledge management methods depends on effective cultural change management, as well as proper architectural design to match the organizational and work processes within a company.
Evidence-based medicine: is it a bridge too far?
Fernandez, Ana; Sturmberg, Joachim; Lukersmith, Sue; Madden, Rosamond; Torkfar, Ghazal; Colagiuri, Ruth; Salvador-Carulla, Luis
2015-11-06
This paper aims to describe the contextual factors that gave rise to evidence-based medicine (EBM), as well as its controversies and limitations in the current health context. Our analysis utilizes two frameworks: (1) a complex adaptive view of health that sees both health and healthcare as non-linear phenomena emerging from their different components; and (2) the unified approach to the philosophy of science that provides a new background for understanding the differences between the phases of discovery, corroboration, and implementation in science. The need for standardization, the development of clinical epidemiology, concerns about the economic sustainability of health systems and increasing numbers of clinical trials, together with the increase in the computer's ability to handle large amounts of data, have paved the way for the development of the EBM movement. It was quickly adopted on the basis of authoritative knowledge rather than evidence of its own capacity to improve the efficiency and equity of health systems. The main problem with the EBM approach is the restricted and simplistic approach to scientific knowledge, which prioritizes internal validity as the major quality of the studies to be included in clinical guidelines. As a corollary, the preferred method for generating evidence is the explanatory randomized controlled trial. This method can be useful in the phase of discovery but is inadequate in the field of implementation, which needs to incorporate additional information including expert knowledge, patients' values and the context. EBM needs to move forward and perceive health and healthcare as a complex interaction, i.e. an interconnected, non-linear phenomenon that may be better analysed using a variety of complexity science techniques.
Current progress in Structure-Based Rational Drug Design marks a new mindset in drug discovery
Lounnas, Valère; Ritschel, Tina; Kelder, Jan; McGuire, Ross; Bywater, Robert P.; Foloppe, Nicolas
2013-01-01
The past decade has witnessed a paradigm shift in preclinical drug discovery with structure-based drug design (SBDD) making a comeback while high-throughput screening (HTS) methods have continued to generate disappointing results. There is a deficit of information between identified hits and the many criteria that must be fulfilled in parallel to convert them into preclinical candidates that have a real chance to become a drug. This gap can be bridged by investigating the interactions between the ligands and their receptors. Accurate calculations of the free energy of binding are still elusive; however, progress has been made with respect to how one may deal with the versatile role of water. A corpus of knowledge combining X-ray structures, bioinformatics and molecular modeling techniques now allows drug designers to routinely produce receptor homology models of increasing quality. These models serve as a basis to establish and validate efficient rationales used to tailor and/or screen virtual libraries with enhanced chances of obtaining hits. Many case reports of successful SBDD show how synergy can be gained from the combined use of several techniques. The role of SBDD with respect to two different classes of widely investigated pharmaceutical targets, (a) protein kinases (PK) and (b) G-protein coupled receptors (GPCR), is discussed. Throughout these examples prototypical situations covering the current possibilities and limitations of SBDD are presented. PMID:24688704
DrugQuest - a text mining workflow for drug association discovery.
Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Theodosiou, Theodosios; Vizirianakis, Ioannis S; Iliopoulos, Ioannis
2016-06-06
Text mining and data integration methods are gaining ground in the field of health sciences due to the exponential growth of bio-medical literature and information stored in biological databases. While such methods mostly try to extract bioentity associations from PubMed, very few of them are dedicated to mining other types of repositories such as chemical databases. Herein, we apply a text mining approach on the DrugBank database in order to explore drug associations based on the DrugBank "Description", "Indication", "Pharmacodynamics" and "Mechanism of Action" text fields. We apply Named Entity Recognition (NER) techniques on these fields to identify chemicals, proteins, genes, pathways, diseases, and we utilize the TextQuest algorithm to find additional biologically significant words. Using a plethora of similarity and partitional clustering techniques, we group the DrugBank records based on their common terms and investigate possible scenarios why these records are clustered together. Different views such as clustered chemicals based on their textual information, tag clouds consisting of Significant Terms along with the terms that were used for clustering are delivered to the user through a user-friendly web interface. DrugQuest is a text mining tool for knowledge discovery: it is designed to cluster DrugBank records based on text attributes in order to find new associations between drugs. The service is freely available at http://bioinformatics.med.uoc.gr/drugquest .
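The idea of grouping records by their common terms can be illustrated with a small sketch. It is not the DrugQuest pipeline and does not implement the TextQuest algorithm or any of the clustering methods the tool offers; it simply links records whose term sets have a Jaccard similarity above a threshold and reports the resulting connected groups. The record names, terms, threshold and function names are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_shared_terms(records, threshold=0.3):
    """Link records whose term sets overlap enough, then return connected groups."""
    parent = {name: name for name in records}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(records, 2):
        if jaccard(records[a], records[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in records:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical records: each maps a made-up drug name to terms from its text fields.
records = {
    "drug_a": {"kinase", "inhibitor", "leukemia"},
    "drug_b": {"kinase", "inhibitor", "carcinoma"},
    "drug_c": {"opioid", "receptor", "analgesic"},
    "drug_d": {"opioid", "receptor", "agonist", "pain"},
}
clusters = group_by_shared_terms(records)
print(clusters)
```

Threshold-based linking is only one simple choice; partitional methods such as k-means over term vectors would instead require the number of clusters to be fixed in advance.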
NASA Astrophysics Data System (ADS)
Costanzi, Stefano; Tikhonova, Irina G.; Harden, T. Kendall; Jacobson, Kenneth A.
2009-11-01
Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, not requiring knowledge of the receptor structure, have been historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent from training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered.
Foreword to "The Secret of Childhood."
ERIC Educational Resources Information Center
Stephenson, Margaret E.
2000-01-01
Discusses the basic discoveries of Montessori's Casa dei Bambini. Considers principles of Montessori's organizing theory: the absorbent mind, the unfolding nature of life, the spiritual embryo, self-construction, acquisition of culture, creativity of life, repetition of exercise, freedom within limits, children's discovery of knowledge, the secret…
NASA Astrophysics Data System (ADS)
Harwit, Martin
1984-04-01
In the remarkable opening section of this book, a well-known Cornell astronomer gives precise thumbnail histories of the 43 basic cosmic discoveries - stars, planets, novae, pulsars, comets, gamma-ray bursts, and the like - that form the core of our knowledge of the universe. Many of them, he points out, were made accidentally and outside the mainstream of astronomical research and funding. This observation leads him to speculate on how many more major phenomena there might be and how they might be most effectively sought out in a field now dominated by large instruments and complex investigative modes and observational conditions. The book also examines discovery in terms of its political, financial, and sociological context - the role of new technologies and of industry and the military in revealing new knowledge; and methods of funding, of peer review, and of allotting time on our largest telescopes. It concludes with specific recommendations for organizing astronomy in ways that will best lead to the discovery of the many - at least sixty - phenomena that Harwit estimates are still waiting to be found.
Improve Data Mining and Knowledge Discovery Through the Use of MatLab
NASA Technical Reports Server (NTRS)
Shaykhian, Gholam Ali; Martin, Dawn (Elliott); Beil, Robert
2011-01-01
Data mining is widely used to mine business, engineering, and scientific data. Data mining uses pattern-based queries, searches, or other analyses of one or more electronic databases/datasets in order to discover or locate a predictive pattern or anomaly indicative of system failure, criminal or terrorist activity, etc. Various algorithms, techniques and methods are used to mine data, including neural networks, genetic algorithms, decision trees, nearest neighbor method, rule induction association analysis, slice and dice, segmentation, and clustering. These algorithms, techniques and methods for detecting patterns in a dataset have been used in the development of numerous open source and commercially available products and technology for data mining. Data mining is best realized when latent information in a large quantity of stored data is discovered. No one technique solves all data mining problems; the challenge is to select algorithms or methods appropriate to strengthen data/text mining and trending within given datasets. In recent years, throughout industry, academia and government agencies, thousands of data systems have been designed and tailored to serve specific engineering and business needs. Many of these systems use databases with relational algebra and structured query language to categorize and retrieve data. In these systems, data analyses are limited and require prior explicit knowledge of metadata and database relations; they lack exploratory data mining and discovery of latent information. This presentation introduces MatLab(R) (MATrix LABoratory), an engineering and scientific data analysis tool, as a means to perform data mining. MatLab was originally intended to perform purely numerical calculations (a glorified calculator). Now, in addition to having hundreds of mathematical functions, it is a programming language with hundreds of built-in standard functions and numerous available toolboxes. MatLab's ease of data processing and visualization and its wide availability of built-in functions and toolboxes make it suitable for numerical computations and simulations as well as for data mining. Engineers and scientists can take advantage of the readily available functions/toolboxes to gain wider insight in their respective data mining experiments.
The discovery of medicines for rare diseases
Swinney, David C; Xia, Shuangluo
2015-01-01
There is a pressing need for new medicines (new molecular entities; NMEs) for rare diseases as few of the 6800 rare diseases (according to the NIH) have approved treatments. Drug discovery strategies for the 102 orphan NMEs approved by the US FDA between 1999 and 2012 were analyzed to learn from past success: 46 NMEs were first in class; 51 were followers; and five were imaging agents. First-in-class medicines were discovered with phenotypic assays (15), target-based approaches (12) and biologic strategies (18). Identification of genetic causes in areas with more basic and translational research such as cancer and in-born errors in metabolism contributed to success regardless of discovery strategy. In conclusion, greater knowledge increases the chance of success and empirical solutions can be effective when knowledge is incomplete. PMID:25068983
The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery
2014-01-01
The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions in the domains of chemistry, biology, biochemistry, and bioinformatics. SIO provides an ontological foundation for the Bio2RDF linked data for the life sciences project and is used for semantic integration and discovery for SADI-based semantic web services. SIO is freely available to all users under a creative commons by attribution license. See website for further information: http://sio.semanticscience.org. PMID:24602174
The effects of the structure characteristics on Magnetic Barkhausen noise in commercial steels
NASA Astrophysics Data System (ADS)
Deng, Yu; Li, Zhe; Chen, Juan; Qi, Xin
2018-04-01
This study has been done by separately measuring Magnetic Barkhausen noise (MBN) under different structure characteristics, namely the carbon content, hardness, roughness, and elastic modulus in commercial steels. The result of the experiments shows a strong dependence of MBN parameters (peak height, Root mean square (RMS), and average value) on structure characteristics. These effects, according to this study, can be explained by two kinds of source mechanisms of the MBN, domain wall nucleation and wall propagation. The discovery obtained in this paper can provide basic knowledge to understand the existing surface condition problem of Magnetic Barkhausen noise as a non-destructive evaluation technique and bring MBN into wider application.
Astronomy education and the Astrophysics Source Code Library
NASA Astrophysics Data System (ADS)
Allen, Alice; Nemiroff, Robert J.
2016-01-01
The Astrophysics Source Code Library (ASCL) is an online registry of source codes used in refereed astrophysics research. It currently lists nearly 1,200 codes and covers all aspects of computational astrophysics. How can this resource be of use to educators and to the graduate students they mentor? The ASCL serves as a discovery tool for codes that can be used for one's own research. Graduate students can also investigate existing codes to see how common astronomical problems are approached numerically in practice, and use these codes as benchmarks for their own solutions to these problems. Further, they can deepen their knowledge of software practices and techniques through examination of others' codes.
Mass spectrometry analysis of terpene lactones in Ginkgo biloba.
Ding, Shujing; Dudley, Ed; Song, Qingbao; Plummer, Sue; Tang, Jiandong; Newton, Russell P; Brenton, A Gareth
2008-01-01
Terpene lactones are a family of compounds with unique chemical structures, first recognised in an extract of Ginkgo biloba. The discovery of terpene lactone derivatives has recently been reported in more and more plant extracts and even food products. In this study, mass spectrometric characteristics of the standard terpene lactones in Ginkgo biloba were comprehensively studied using both an ion trap and a quadrupole time-of-flight (QTOF) mass spectrometer. The mass spectral fragmentation data from both techniques was compared to obtain the mass spectrometric fragmentation pathways of the terpene lactones with high confidence. The data obtained will facilitate the analysis and identification of terpene lactones in future plant research via the fragmentation knowledge reported here.
Mass spectrometry-driven drug discovery for development of herbal medicine.
Zhang, Aihua; Sun, Hui; Wang, Xijun
2018-05-01
Herbal medicine (HM) has made a major contribution to the drug discovery process with regard to identifying natural product compounds. Currently, more attention is being focused on drug discovery from the natural compounds of HM. Despite the rapid advancement of modern analytical techniques, drug discovery is still a difficult and lengthy process. Fortunately, mass spectrometry (MS) can provide useful structural information for drug discovery and has been recognized as a sensitive, rapid, and high-throughput technology for advancing drug discovery from HM in the post-genomic era. It is essential to develop an efficient, high-quality, high-throughput screening method integrated with an MS platform for early screening of candidate drug molecules from natural products. We have developed a new chinmedomics strategy reliant on MS that is capable of capturing the candidate molecules and facilitating the identification of their novel chemical structures in the early phase; chinmedomics-guided natural product discovery based on MS may provide an effective tool that addresses challenges in early screening of effective constituents of herbs against disease. This critical review covers the use of MS with related techniques and methodologies for natural product discovery, biomarker identification, and determination of mechanisms of action. It also highlights high-throughput chinmedomics screening methods suitable for lead compound discovery, illustrated by recent successes. © 2016 Wiley Periodicals, Inc.
A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.
Kothari, Cartik R; Payne, Philip R O
2015-01-01
In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.
McCorquodale, Donald; Pucillo, Evan M; Johnson, Nicholas E
2016-01-01
Charcot–Marie–Tooth (CMT) disease is the most common inherited neuropathy and one of the most common inherited diseases in humans. The diagnosis of CMT is traditionally made by the neurologic specialist, yet the optimal management of CMT patients includes genetic counselors, physical and occupational therapists, physiatrists, orthotists, mental health providers, and community resources. Rapidly developing genetic discoveries and novel gene discovery techniques continue to add a growing number of genetic subtypes of CMT. The first large clinical natural history and therapeutic trials have added to our knowledge of each CMT subtype and revealed how CMT impacts patient quality of life. In this review, we discuss several important trends in CMT research factors that will require a collaborative multidisciplinary approach. These include the development of large multicenter patient registries, standardized clinical instruments to assess disease progression and disability, and increasing recognition and use of patient-reported outcome measures. These developments will continue to guide strategies in long-term multidisciplinary efforts to maintain quality of life and preserve functionality in CMT patients. PMID:26855581
Discovering Knowledge from AIS Database for Application in VTS
NASA Astrophysics Data System (ADS)
Tsou, Ming-Cheng
The widespread use of the Automatic Identification System (AIS) has had a significant impact on maritime technology. AIS enables the Vessel Traffic Service (VTS) not only to offer commonly known functions such as identification, tracking and monitoring of vessels, but also to provide rich real-time information that is useful for marine traffic investigation, statistical analysis and theoretical research. However, due to the rapid accumulation of AIS observation data, the VTS platform is often unable quickly and effectively to absorb and analyze it. Traditional observation and analysis methods are becoming less suitable for the modern AIS generation of VTS. In view of this, we applied the same data mining technique used for business intelligence discovery (in Customer Relation Management (CRM) business marketing) to the analysis of AIS observation data. This recasts the marine traffic problem as a business-marketing problem and integrates technologies such as Geographic Information Systems (GIS), database management systems, data warehousing and data mining to facilitate the discovery of hidden and valuable information in a huge amount of observation data. Consequently, this provides the marine traffic managers with a useful strategic planning resource.
PINT, A Modern Software Package for Pulsar Timing
NASA Astrophysics Data System (ADS)
Luo, Jing; Ransom, Scott M.; Demorest, Paul; Ray, Paul S.; Stovall, Kevin; Jenet, Fredrick; Ellis, Justin; van Haasteren, Rutger; Bachetti, Matteo; NANOGrav PINT developer team
2018-01-01
Pulsar timing, first developed decades ago, has provided an extremely wide range of knowledge about our universe. It has been responsible for many important discoveries, such as the discovery of the first exoplanet and the orbital period decay of double neutron star systems. Currently, pulsar timing is the leading technique for detecting low-frequency (about 10^-9 Hertz) gravitational waves (GW), using an array of pulsars as the detectors. To achieve this goal, high-precision pulsar timing data, at about the nanosecond level, are required. Most high-precision pulsar timing data are analyzed using the widely adopted software TEMPO/TEMPO2. But for a robust and believable GW detection, it is important to have independent software that can cross-check the result. In this poster we present the new-generation pulsar timing software PINT. This package will provide a robust system to cross-check high-precision timing results, completely independent of TEMPO and TEMPO2. In addition, PINT is designed to be a package that is easy to extend and modify, through use of a flexible code architecture and a modern programming language, Python, with modern technology and libraries.
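To put the quoted frequency in perspective, a short worked calculation converts a gravitational-wave frequency of about 10^-9 Hz into a period in years. This is plain arithmetic, not part of PINT or TEMPO; the function names and the 10-year observing span are assumptions chosen only for illustration.

```python
# Length of a Julian year in seconds (365.25 days of 86,400 s each).
SECONDS_PER_JULIAN_YEAR = 365.25 * 86400


def gw_period_years(frequency_hz):
    """Period, in Julian years, of a wave with the given frequency in hertz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return (1.0 / frequency_hz) / SECONDS_PER_JULIAN_YEAR


def cycles_observed(frequency_hz, span_years):
    """Number of wave cycles that fit into an observing span given in years."""
    return span_years / gw_period_years(frequency_hz)


# About 10^-9 Hz is the frequency quoted in the abstract above.
period = gw_period_years(1e-9)
# Hypothetical 10-year observing span, for illustration only.
cycles = cycles_observed(1e-9, 10.0)
```

A wave at 10^-9 Hz completes one cycle in roughly 31.7 years, so a decade of observations covers only about a third of a cycle. This is one way to see why long, stable, high-precision timing baselines matter for this kind of search.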
ERIC Educational Resources Information Center
Kraft, Donald H., Ed.
The 2000 ASIS (American Society for Information Science) conference explored knowledge innovation. The tracks in the conference program included knowledge discovery, capture, and creation; classification and representation; information retrieval; knowledge dissemination; and social, behavioral, ethical, and legal aspects. This proceedings is…
Evaluating the Science of Discovery in Complex Health Systems
ERIC Educational Resources Information Center
Norman, Cameron D.; Best, Allan; Mortimer, Sharon; Huerta, Timothy; Buchan, Alison
2011-01-01
Complex health problems such as chronic disease or pandemics require knowledge that transcends disciplinary boundaries to generate solutions. Such transdisciplinary discovery requires researchers to work and collaborate across boundaries, combining elements of basic and applied science. At the same time, calls for more interdisciplinary health…
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
Trends in Modern Drug Discovery.
Eder, Jörg; Herrling, Paul L
2016-01-01
Drugs discovered by the pharmaceutical industry over the past 100 years have dramatically changed the practice of medicine and impacted on many aspects of our culture. For many years, drug discovery was a target- and mechanism-agnostic approach that was based on ethnobotanical knowledge often fueled by serendipity. With the advent of modern molecular biology methods and based on knowledge of the human genome, drug discovery has now largely changed into a hypothesis-driven target-based approach, a development which was paralleled by significant environmental changes in the pharmaceutical industry. Laboratories became increasingly computerized and automated, and geographically dispersed research sites are now more and more clustered into large centers to capture technological and biological synergies. Today, academia, the regulatory agencies, and the pharmaceutical industry all contribute to drug discovery, and, in order to translate the basic science into new medical treatments for unmet medical needs, pharmaceutical companies have to have a critical mass of excellent scientists working in many therapeutic fields, disciplines, and technologies. The imperative for the pharmaceutical industry to discover breakthrough medicines is matched by the increasing numbers of first-in-class drugs approved in recent years and reflects the impact of modern drug discovery approaches, technologies, and genomics.
A Fast Projection-Based Algorithm for Clustering Big Data.
Wu, Yun; He, Zhiquan; Lin, Hao; Zheng, Yufei; Zhang, Jingfen; Xu, Dong
2018-06-07
With the rapid development of various techniques, more and more data have been accumulated that are both large in size (tall) and high in dimension (wide). The era of big data is coming. How to understand and discover new knowledge from these data has attracted increasing attention from scholars and has become one of the most important tasks in data mining. As one of the most important techniques in data mining, clustering analysis, a kind of unsupervised learning, groups a set of data into clusters that are meaningful, useful, or both. Thus, the technique has played a very important role in knowledge discovery in big data. However, when facing large-sized and high-dimensional data, most current clustering methods exhibit poor computational efficiency and high demand for computational resources, which prevents us from clarifying the intrinsic properties of the data and discovering the new knowledge behind them. Based on this consideration, we developed a powerful clustering method called MUFOLD-CL. The principle of the method is to project the data points onto the centroid, and then to measure the similarity between any two points by calculating their projections on the centroid. The proposed method achieves linear time complexity with respect to the sample size. Comparison with the K-Means method on very large data showed that our method could produce better accuracy and require less computational time, demonstrating that MUFOLD-CL can serve as a valuable tool for big data clustering, or at least play a complementary role to other existing methods. Further comparisons with state-of-the-art clustering methods on smaller datasets showed that our method was the fastest and achieved comparable accuracy. For the convenience of most scholars, a free software package was constructed.
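The projection idea described above can be illustrated with a minimal Python sketch. It is not the MUFOLD-CL algorithm or its software package: it only shows, under the assumption that "projection on the centroid" means the scalar projection of each point onto the direction of the data centroid, how one pass over the data yields a per-point value that can then be compared between any two points. All function names and the sample points are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    count = len(points)
    dim = len(points[0])
    return [sum(p[d] for p in points) / count for d in range(dim)]


def projections_on_centroid(points):
    """Scalar projection of every point onto the centroid direction.

    One pass to compute the centroid and one pass to project, so the cost
    grows linearly with the number of points.
    """
    c = centroid(points)
    norm = math.sqrt(sum(x * x for x in c))
    if norm == 0.0:
        raise ValueError("centroid is the zero vector; projection is undefined")
    return [sum(p[d] * c[d] for d in range(len(c))) / norm for p in points]


def projection_distance(projections, i, j):
    """Dissimilarity of points i and j measured on their projections."""
    return abs(projections[i] - projections[j])


# Hypothetical sample: two tight pairs of points far apart on one axis.
sample_points = [(1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
sample_projections = projections_on_centroid(sample_points)
```

For the sample points the centroid is (6, 0), so the projections are simply the first coordinates, and the two tight pairs are close on the projected values while the pairs themselves are far apart. Because each point is reduced to a single number, comparing two points costs constant time after the linear-time projection step, at the price of discarding structure orthogonal to the centroid direction.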
Advances in fragment-based drug discovery platforms.
Orita, Masaya; Warizaya, Masaichi; Amano, Yasushi; Ohno, Kazuki; Niimi, Tatsuya
2009-11-01
Fragment-based drug discovery (FBDD) has been established as a powerful alternative and complement to traditional high-throughput screening techniques for identifying drug leads. At present, this technique is widely used among academic groups as well as small biotech and large pharmaceutical companies. In recent years, > 10 new compounds developed with FBDD have entered clinical development, and more and more attention in the drug discovery field is being focused on this technique. Under the FBDD approach, a fragment library of relatively small compounds (molecular mass = 100 - 300 Da) is screened by various methods and the identified fragment hits which normally weakly bind to the target are used as starting points to generate more potent drug leads. Because FBDD is still a relatively new drug discovery technology, further developments and optimizations in screening platforms and fragment exploitation can be expected. This review summarizes recent advances in FBDD platforms and discusses the factors important for the successful application of this technique. Under the FBDD approach, both identifying the starting fragment hit to be developed and generating the drug lead from that starting fragment hit are important. Integration of various techniques, such as computational technology, X-ray crystallography, NMR, surface plasmon resonance, isothermal titration calorimetry, mass spectrometry and high-concentration screening, must be applied in a situation-appropriate manner.
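The molecular mass window quoted above (100 - 300 Da) can be expressed as a simple library filter. The sketch below is an illustration only, not a description of any screening platform covered by the review: the function names are hypothetical, and the four compounds are well-known molecules chosen merely to show which masses fall inside the window.

```python
# Mass window for fragment-sized compounds, as quoted in the abstract above.
FRAGMENT_MASS_RANGE_DA = (100.0, 300.0)


def is_fragment_sized(mass_da, mass_range=FRAGMENT_MASS_RANGE_DA):
    """True if a molecular mass in daltons lies inside the fragment window."""
    low, high = mass_range
    return low <= mass_da <= high


def select_fragments(compounds):
    """Keep the names of compounds whose mass lies inside the window.

    `compounds` is a list of (name, molecular_mass_in_Da) pairs.
    """
    return [name for name, mass in compounds if is_fragment_sized(mass)]


# Illustrative compounds with approximate molecular masses in Da.
candidates = [
    ("benzene", 78.11),
    ("aspirin", 180.16),
    ("caffeine", 194.19),
    ("imatinib", 493.60),
]
fragment_sized = select_fragments(candidates)
```

Here benzene falls below the window and imatinib, a full-sized drug, falls above it, leaving aspirin and caffeine as fragment-sized by mass alone. A real fragment library would apply further criteria; mass is the only one stated in the abstract.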
Reuniting Virtue and Knowledge
ERIC Educational Resources Information Center
Culham, Tom
2015-01-01
Einstein held that intuition is more important than rational inquiry as a source of discovery. Further, he explicitly and implicitly linked the heart, the sacred, devotion and intuitive knowledge. The raison d'être of universities is the advance of knowledge; however, they have primarily focused on developing student's skills in working with…
Discovery of novel bacterial toxins by genomics and computational biology.
Doxey, Andrew C; Mansfield, Michael J; Montecucco, Cesare
2018-06-01
Hundreds of bacterial protein toxins are presently known. Traditionally, toxin identification begins with pathological studies of bacterial infectious disease. Following identification and cultivation of a bacterial pathogen, the protein toxin is purified from the culture medium and its pathogenic activity is studied using the methods of biochemistry and structural biology, cell biology, tissue and organ biology, and appropriate animal models, supplemented by bioimaging techniques. The ongoing and explosive development of high-throughput DNA sequencing and bioinformatic approaches has set in motion a revolution in many fields of biology, including microbiology. One consequence is that genes encoding novel bacterial toxins can be identified by bioinformatic and computational methods based on previous knowledge accumulated from studies of the biology and pathology of thousands of known bacterial protein toxins. Starting from the paradigmatic cases of diphtheria toxin, tetanus and botulinum neurotoxins, this review discusses traditional experimental approaches as well as bioinformatics and genomics-driven approaches that facilitate the discovery of novel bacterial toxins. We discuss recent work on the identification of novel botulinum-like toxins from genera such as Weissella, Chryseobacterium, and Enterococcus, and the implications of these computationally identified toxins in the field. Finally, we discuss the promise of metagenomics in the discovery of novel toxins and their ecological niches, and present data suggesting the existence of uncharacterized, botulinum-like toxin genes in insect gut metagenomes. Copyright © 2018. Published by Elsevier Ltd.
Gozalbes, Rafael; Carbajo, Rodrigo J; Pineda-Lucena, Antonio
2010-01-01
In the last decade, fragment-based drug discovery (FBDD) has evolved from a novel approach in the search of new hits to a valuable alternative to the high-throughput screening (HTS) campaigns of many pharmaceutical companies. The increasing relevance of FBDD in the drug discovery universe has been concomitant with an implementation of the biophysical techniques used for the detection of weak inhibitors, e.g. NMR, X-ray crystallography or surface plasmon resonance (SPR). At the same time, computational approaches have also been progressively incorporated into the FBDD process and nowadays several computational tools are available. These stretch from the filtering of huge chemical databases in order to build fragment-focused libraries comprising compounds with adequate physicochemical properties, to more evolved models based on different in silico methods such as docking, pharmacophore modelling, QSAR and virtual screening. In this paper we will review the parallel evolution and complementarities of biophysical techniques and computational methods, providing some representative examples of drug discovery success stories by using FBDD.
NASA Astrophysics Data System (ADS)
Leach, Martin O.
2004-02-01
The award of the Nobel Prize in Physiology or Medicine recognizes discoveries concerning the use of magnetic resonance to visualize different structures. The Assembly's decision to recognize the discoveries underpinning efficient spatial mapping of biological properties reflects the singular importance of imaging to the medical application of this technique. Without this, abnormalities in morphology cannot be recognized. Equally, the wealth of physiological information that can be obtained by manipulation of the magnetic resonance signal is of little value unless localized to identified organs, pathology or areas of tissue. Based on these early discoveries, a wide range of imaging and measurement techniques, together with enabling instrumentation, have been developed over the last 30 years. Commercial equipment became available in the early 1980s, and some 60 million MRI examinations are now performed each year. The power of the technique, and the range of applications, continues to develop rapidly. The full text of this editorial is given in the PDF file below.
Exploring the Role of Receptor Flexibility in Structure-Based Drug Discovery
Feixas, Ferran; Lindert, Steffen; Sinko, William; McCammon, J. Andrew
2015-01-01
The proper understanding of biomolecular recognition mechanisms that take place in a drug target is of paramount importance to improve the efficiency of drug discovery and development. The intrinsic dynamic character of proteins has a strong influence on biomolecular recognition mechanisms, and models such as conformational selection have been widely used to account for this dynamic association process. However, conformational changes occurring in the receptor prior to and upon association with other molecules are diverse and not obvious to predict when only a few structures of the receptor are available. In view of the prominent role of protein flexibility in ligand binding and its implications for drug discovery, it is of great interest to identify receptor conformations that play a major role in biomolecular recognition before starting rational drug design efforts. In this review, we discuss a number of recent advances in computer-aided drug discovery techniques that have been proposed to incorporate receptor flexibility into structure-based drug design. The allowance for receptor flexibility provided by computational techniques such as molecular dynamics simulations or enhanced sampling techniques helps to improve the accuracy of methods used to estimate binding affinities and, thus, such methods can contribute to the discovery of novel drug leads. PMID:24332165
Automated Knowledge Discovery from Simulators
NASA Technical Reports Server (NTRS)
Burl, Michael C.; DeCoste, D.; Enke, B. L.; Mazzoni, D.; Merline, W. J.; Scharenbroich, L.
2006-01-01
In this paper, we explore one aspect of knowledge discovery from simulators, the landscape characterization problem, where the aim is to identify regions in the input/parameter/model space that lead to a particular output behavior. Large-scale numerical simulators are in widespread use by scientists and engineers across a range of government agencies, academia, and industry; in many cases, simulators provide the only means to examine processes that are infeasible or impossible to study otherwise. However, the cost of simulation studies can be quite high, both in terms of the time and computational resources required to conduct the trials and the manpower needed to sift through the resulting output. Thus, there is strong motivation to develop automated methods that enable more efficient knowledge extraction.
Collected Notes on the Workshop for Pattern Discovery in Large Databases
NASA Technical Reports Server (NTRS)
Buntine, Wray (Editor); Delalto, Martha (Editor)
1991-01-01
These collected notes are a record of material presented at the Workshop. The notes address core data analysis tasks that have traditionally required statistical or pattern recognition techniques. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.
18 CFR 385.402 - Scope of discovery (Rule 402).
Code of Federal Regulations, 2010 CFR
2010-04-01
... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Scope of discovery (Rule 402). 385.402 Section 385.402 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... persons having any knowledge of any discoverable matter. It is not ground for objection that the...
Doors to Discovery[TM]. What Works Clearinghouse Intervention Report
ERIC Educational Resources Information Center
What Works Clearinghouse, 2013
2013-01-01
"Doors to Discovery"]TM] is a preschool literacy curriculum that uses eight thematic units of activities to help children build fundamental early literacy skills in oral language, phonological awareness, concepts of print, alphabet knowledge, writing, and comprehension. The eight thematic units cover topics such as nature, friendship,…
78 FR 12933 - Proceedings Before the Commodity Futures Trading Commission
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-26
... proceedings. These new amendments also provide that Judgment Officers may conduct sua sponte discovery in... discovery; (4) sound risk management practices; and (5) other public interest considerations. The amendments... representative capacity, it was done with full power and authority to do so; (C) To the best of his knowledge...
76 FR 64803 - Rules of Adjudication and Enforcement
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-19
...) is also amended to clarify the limits on discovery when the Commission orders the ALJ to consider the... that the complainant identify, to the best of its knowledge, the ``like or directly competitive... the taking of discovery by the parties shall be at the discretion of the presiding ALJ. The ITCTLA...
78 FR 63253 - Davidson Kempner Capital Management LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-23
... employees of the Adviser other than the Contributor have any knowledge of the Contribution prior to its discovery by the Adviser on November 2, 2011. The Contribution was discovered by the Adviser's compliance... names of employees. After discovery of the Contribution, the Adviser and Contributor obtained the...
[Bertha Röntgen or the transparency of the hand].
Picard, J D
1996-01-01
It is to Wilhelm Conrad Röntgen, the first elected "radiologist" of our Academy and the first Nobel Prize winner in physics, that we owe the transparency of the hand. We celebrate today the centenary of the great scientific discovery which was to revolutionize the diagnosis, and thereby the treatment, of a large number of illnesses: the discovery of X-rays. It would be unjust not to link the name of this scientist with that of his wife, Bertha, who, ignorant of the dangers of all "novel medical inventions", volunteered her own hand for his research experiments: the hand which was to bring to the world tangible proof of this remarkable discovery. To a lesser degree, but nonetheless essential, we acknowledge, albeit not in exhaustive detail, all the progress made by the work of pioneers using this new investigative technique. So let us now return to the hand, a body part which it was easy to immobilize, remembering that in those days a single radiographic exposure took up to an hour to obtain. We will consider the immortalised hand of Bertha Röntgen, to whom this address is dedicated, and its radiographic exposures, which allow us to appreciate the advances and to perceive the limitations of this technique. They also enable us better to envisage future investigative approaches whereby a deeper knowledge of the human body may be acquired. We note that compared with the histopathological sciences, imaging is not specific. Numerous microscopic structures, in particular neurological and vascular ones, are still insufficiently well visualised and the transmission pathways between the hand and the central nervous system deserve better characterisation. Current research programmes are attempting to overcome these limitations of modern imaging. All the experience gained in studying the transparency of the hand, as we have discussed, is applicable to every part of the human anatomy. To credit Röntgen's discovery with all its originality, we could say that the hand was to radiology what the brain was to CT and MRI scanning: an exceptional victory in rendering the human body transparent.
Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres
2014-01-01
DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245
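To make the SOM idea in the abstract above concrete, here is a minimal sketch of a Self-Organizing Map clustering short expression profiles. It is not the authors' implementation: the map size, learning-rate schedule, neighbourhood function, function names, and the toy profiles are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate of the node whose weights are closest to x."""
    return min(weights, key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(profiles, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM on equal-length expression profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # One weight vector per map node, initialised from randomly chosen profiles.
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    weights = {node: list(rng.choice(profiles)) for node in nodes}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink as training proceeds.
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.3)
        order = profiles[:]
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood: nodes near the winner on the grid move more.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy profiles (hypothetical values, not data from the paper): two early-peaking
# and two late-peaking "genes" sampled at four time points.
toy_profiles = [
    [1.0, 0.8, 0.1, 0.0],
    [0.9, 0.7, 0.2, 0.1],
    [0.0, 0.1, 0.9, 1.0],
    [0.1, 0.2, 0.8, 0.9],
]
som = train_som(toy_profiles)

# Group profile indices by the map node they fall on; each group is one cluster.
clusters = {}
for idx, profile in enumerate(toy_profiles):
    clusters.setdefault(best_matching_unit(som, profile), []).append(idx)
```

Genes whose profiles land on the same or neighbouring nodes are the candidates for "similar behavior"; on real microarray data one would normalise each profile first and use a larger map.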
Network-based approaches to climate knowledge discovery
NASA Astrophysics Data System (ADS)
Budich, Reinhard; Nyberg, Per; Weigel, Tobias
2011-11-01
Climate Knowledge Discovery Workshop; Hamburg, Germany, 30 March to 1 April 2011 Do complex networks combined with semantic Web technologies offer the next generation of solutions in climate science? To address this question, a first Climate Knowledge Discovery (CKD) Workshop, hosted by the German Climate Computing Center (Deutsches Klimarechenzentrum (DKRZ)), brought together climate and computer scientists from major American and European laboratories, data centers, and universities, as well as representatives from industry, the broader academic community, and the semantic Web communities. The participants, representing six countries, were concerned with large-scale Earth system modeling and computational data analysis. The motivation for the meeting was the growing problem that climate scientists generate data faster than it can be interpreted and the need to prepare for further exponential data increases. Current analysis approaches are focused primarily on traditional methods, which are best suited for large-scale phenomena and coarse-resolution data sets. The workshop focused on the open discussion of ideas and technologies to provide the next generation of solutions to cope with the increasing data volumes in climate science.
Mott, Meghan; Koroshetz, Walter
2015-07-01
The mission of the National Institute of Neurological Disorders and Stroke (NINDS) is to seek fundamental knowledge about the brain and nervous system and to use that knowledge to reduce the burden of neurological disease. NINDS supports early- and late-stage therapy development funding programs to accelerate preclinical discovery and the development of new therapeutic interventions for neurological disorders. The NINDS Office of Translational Research facilitates and funds the movement of discoveries from the laboratory to patients. Its grantees include academics, often with partnerships with the private sector, as well as small businesses, which, by Congressional mandate, receive > 3% of the NINDS budget for small business innovation research. This article provides an overview of NINDS-funded therapy development programs offered by the NINDS Office of Translational Research.
Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases
Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand
2010-01-01
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. PMID:20885778
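The ABC-principle described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: CoPub Discovery uses thesauri and statistical scoring, whereas the sketch below simply counts shared B-intermediates, and the concept labels and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(documents):
    """Map each concept to the set of concepts it shares a document with."""
    links = defaultdict(set)
    for concepts in documents:
        for a, b in combinations(sorted(set(concepts)), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def hidden_links(links, start):
    """Rank concepts C that never co-occur with `start` by their shared B terms."""
    candidates = defaultdict(set)
    for b in links[start]:
        for c in links[b]:
            # Keep only C terms with no direct A-C co-occurrence.
            if c != start and c not in links[start]:
                candidates[c].add(b)
    return sorted(candidates.items(), key=lambda kv: (-len(kv[1]), kv[0]))


# Each set stands for the concepts tagged in one abstract (made-up labels).
docs = [
    {"drug_X", "gene_G"},
    {"drug_X", "pathway_P"},
    {"gene_G", "disease_D"},
    {"pathway_P", "disease_D"},
    {"gene_G", "disease_E"},
]
ranked = hidden_links(cooccurrence(docs), "drug_X")
# disease_D is reached through two intermediates, disease_E through one.
```

Ranking by the number of intermediates is the simplest possible score; a realistic tool would weight each A-B and B-C link by how surprising the co-occurrence is.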
Full, Robert J; Dudley, Robert; Koehl, M A R; Libby, Thomas; Schwab, Cheryl
2015-11-01
Experiencing the thrill of an original scientific discovery can be transformative to students unsure about becoming a scientist, yet few courses offer authentic research experiences. Increasingly, cutting-edge discoveries require an interdisciplinary approach not offered in current departmental-based courses. Here, we describe a one-semester, learning laboratory course on organismal biomechanics offered at our large research university that enables interdisciplinary teams of students from biology and engineering to grow intellectually, collaborate effectively, and make original discoveries. To attain this goal, we avoid traditional "cookbook" laboratories by training 20 students to use a dozen research stations. Teams of five students rotate to a new station each week where a professor, graduate student, and/or team member assists in the use of equipment, guides students through stages of critical thinking, encourages interdisciplinary collaboration, and moves them toward authentic discovery. Weekly discussion sections that involve the entire class offer exchange of discipline-specific knowledge, advice on experimental design, methods of collecting and analyzing data, a statistics primer, and best practices for writing and presenting scientific papers. The building of skills in concert with weekly guided inquiry facilitates original discovery via a final research project that can be presented at a national meeting or published in a scientific journal. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Asymmetric threat data mining and knowledge discovery
NASA Astrophysics Data System (ADS)
Gilmore, John F.; Pagels, Michael A.; Palk, Justin
2001-03-01
Asymmetric threats differ from the conventional force-on-force military encounters that the Defense Department has historically been trained to engage. Terrorism by its nature is now an operational activity that is neither easily detected nor countered, as its very existence depends on small covert attacks exploiting the element of surprise. But terrorism does have defined forms, motivations, tactics and organizational structure. Exploiting a terrorism taxonomy provides the opportunity to discover and assess knowledge of terrorist operations. This paper describes the Asymmetric Threat Terrorist Assessment, Countering, and Knowledge (ATTACK) system. ATTACK has been developed to (a) data mine open source intelligence (OSINT) information from web-based newspaper sources, video news web casts, and actual terrorist web sites, (b) evaluate this information against a terrorism taxonomy, (c) exploit country/region specific social, economic, political, and religious knowledge, and (d) discover and predict potential terrorist activities and association links. Details of the asymmetric threat structure and the ATTACK system architecture are presented with results of an actual terrorist data mining and knowledge discovery test case shown.
Hoehndorf, Robert; Dumontier, Michel; Oellrich, Anika; Rebholz-Schuhmann, Dietrich; Schofield, Paul N; Gkoutos, Georgios V
2011-01-01
Researchers design ontologies as a means to accurately annotate and integrate experimental data across heterogeneous and disparate data- and knowledge bases. Formal ontologies make the semantics of terms and relations explicit such that automated reasoning can be used to verify the consistency of knowledge. However, many biomedical ontologies do not sufficiently formalize the semantics of their relations and are therefore limited with respect to automated reasoning for large scale data integration and knowledge discovery. We describe a method to improve automated reasoning over biomedical ontologies and identify several thousand contradictory class definitions. Our approach aligns terms in biomedical ontologies with foundational classes in a top-level ontology and formalizes composite relations as class expressions. We describe the semi-automated repair of contradictions and demonstrate expressive queries over interoperable ontologies. Our work forms an important cornerstone for data integration, automatic inference and knowledge discovery based on formal representations of knowledge. Our results and analysis software are available at http://bioonto.de/pmwiki.php/Main/ReasonableOntologies.
Intelligent Systems: Terrestrial Observation and Prediction Using Remote Sensing Data
NASA Technical Reports Server (NTRS)
Coughlan, Joseph C.
2005-01-01
NASA has made science and technology investments to better utilize its large space-borne remote sensing data holdings of the Earth. With the launch of Terra, NASA created a data-rich environment where the challenge is to fully utilize the data collected from EOS; however, despite unprecedented amounts of observed data, there is a need for increasing the frequency, resolution, and diversity of observations. Current terrestrial models that use remote sensing data were constructed in a relatively data- and compute-limited era and do not take full advantage of on-line learning methods and assimilation techniques that can exploit these data. NASA has invested in visualization, data mining and knowledge discovery methods which have facilitated data exploitation, but these methods are insufficient for improving Earth science models that have extensive background knowledge, nor do these methods refine understanding of complex processes. Investing in interdisciplinary teams that include computational scientists can lead to new models and systems for online operation and analysis of data that can autonomously improve in prediction skill over time.
Pant, Bijaya
2014-01-01
Approximately 80% of the world's inhabitants have depended since ancient times on medicinal plants, in the form of traditional formulations, for their primary health care as well as for the treatment of a number of diseases. Many commercially used drugs derive from indigenous knowledge of plants and their folk uses. Linking indigenous knowledge of medicinal plants to modern research activities provides a new, reliable approach that makes the discovery of novel drugs much more effective than random collection. Population growth and increasing demand for plant products, along with illegal trade, are depleting medicinal plants, and many are threatened in their natural habitat. Plant tissue culture has proved a potential alternative for producing desirable bioactive components from plants, for producing sufficient amounts of the plant material that is needed, and for conserving threatened species. Different plant tissue culture systems have been extensively studied to improve and enhance the production of plant chemicals in various medicinal plants.
NASA Astrophysics Data System (ADS)
Cook, R.; Michener, W.; Vieglais, D.; Budden, A.; Koskela, R.
2012-04-01
Addressing grand environmental science challenges requires unprecedented access to easily understood data that cross the breadth of temporal, spatial, and thematic scales. Tools are needed to plan management of the data, discover the relevant data, integrate heterogeneous and diverse data, and convert the data to information and knowledge. Addressing these challenges requires new approaches for the full data life cycle of managing, preserving, sharing, and analyzing data. DataONE (Observation Network for Earth) represents a virtual organization that enables new science and knowledge creation through preservation and access to data about life on Earth and the environment that sustains it. The DataONE approach is to improve data collection and management techniques; facilitate easy, secure, and persistent storage of data; continue to increase access to data and tools that improve data interoperability; disseminate integrated and user-friendly tools for data discovery and novel analyses; work with researchers to build intuitive data exploration and visualization tools; and support communities of practice via education, outreach, and stakeholder engagement.
A review of recent developments in the speciation and location of arsenic and selenium in rice grain
Carey, Anne-Marie; Lombi, Enzo; Donner, Erica; de Jonge, Martin D.; Punshon, Tracy; Jackson, Brian P.; Guerinot, Mary Lou; Price, Adam H.; Meharg, Andrew A.
2014-01-01
Rice is a staple food yet is a significant dietary source of inorganic arsenic, a class 1, nonthreshold carcinogen. Establishing the location and speciation of arsenic within the edible rice grain is essential for understanding the risk and for developing effective strategies to reduce grain arsenic concentrations. Conversely, selenium is an essential micronutrient and up to 1 billion people worldwide are selenium-deficient. Several studies have suggested that selenium supplementation can reduce the risk of some cancers, generating substantial interest in biofortifying rice. Knowledge of selenium location and speciation is important, because the anti-cancer effects of selenium depend on its speciation. Germanic acid is an arsenite/silicic acid analogue, and location of germanium may help elucidate the mechanisms of arsenite transport into grain. This review summarises recent discoveries in the location and speciation of arsenic, germanium, and selenium in rice grain using state-of-the-art mass spectrometry and synchrotron techniques, and illustrates both the importance of high-sensitivity and high-resolution techniques and the advantages of combining techniques in an integrated quantitative and spatial approach. PMID:22159463
Oak Ridge Graph Analytics for Medical Innovation (ORiGAMI)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, Larry W.; Lee, Sangkeun
2016-01-01
In this era of data-driven decisions and discovery where Big Data is producing Bigger Data, data scientists at the Oak Ridge National Laboratory are leveraging unique leadership infrastructure (e.g., Urika XA and Urika GD appliances) to develop scalable algorithms for semantic, logical and statistical reasoning with Big Data (i.e., data stored in databases as well as unstructured data in documents). ORiGAMI is a next-generation knowledge-discovery framework that is: (a) knowledge nurturing (i.e., evolves seamlessly with newer knowledge and data), (b) smart and curious (i.e., using information-foraging and reasoning algorithms to digest content) and (c) synergistic (i.e., interfaces computers with what they do best to help subject-matter-experts do their best). ORiGAMI has been demonstrated using the National Library of Medicine's SEMANTIC MEDLINE (archive of medical knowledge since 1994).
18 CFR 385.403 - Methods of discovery; general provisions (Rule 403).
Code of Federal Regulations, 2010 CFR
2010-04-01
... 18 Conservation of Power and Water Resources 1 2010-04-01 2010-04-01 false Methods of discovery; general provisions (Rule 403). 385.403 Section 385.403 Conservation of Power and Water Resources FEDERAL... the response is true and accurate to the best of that person's knowledge, information, and belief...
The Prehistory of Discovery: Precursors of Representational Change in Solving Gear System Problems.
ERIC Educational Resources Information Center
Dixon, James A.; Bangert, Ashley S.
2002-01-01
This study investigated whether the process of representational change undergoes developmental change or different processes occupy different niches in the course of knowledge acquisition. Subjects--college, third-, and sixth-grade students--solved gear system problems over two sessions. Findings indicated that for all grades, discovery of the…
40 CFR 300.300 - Phase I-Discovery or notification.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 27 2010-07-01 2010-07-01 false Phase I-Discovery or notification. 300.300 Section 300.300 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) SUPERFUND... person in charge of a vessel or a facility shall, as soon as he or she has knowledge of any discharge...
Serendipity: Accidental Discoveries in Science
NASA Astrophysics Data System (ADS)
Roberts, Royston M.
1989-06-01
Many of the things discovered by accident are important in our everyday lives: Teflon, Velcro, nylon, x-rays, penicillin, safety glass, sugar substitutes, and polyethylene and other plastics. And we owe a debt to accident for some of our deepest scientific knowledge, including Newton's theory of gravitation, the Big Bang theory of Creation, and the discovery of DNA. Even the Rosetta Stone, the Dead Sea Scrolls, and the ruins of Pompeii came to light through chance. This book tells the fascinating stories of these and other discoveries and reveals how the inquisitive human mind turns accident into discovery. Written for the layman, yet scientifically accurate, this illuminating collection of anecdotes portrays invention and discovery as quintessentially human acts, due in part to curiosity, perseverance, and luck.
Closed-Loop Multitarget Optimization for Discovery of New Emulsion Polymerization Recipes
2015-01-01
Self-optimization of chemical reactions enables faster optimization of reaction conditions or discovery of molecules with required target properties. The technology of self-optimization has been expanded to discovery of new process recipes for manufacture of complex functional products. A new machine-learning algorithm, specifically designed for multiobjective target optimization with an explicit aim to minimize the number of “expensive” experiments, guides the discovery process. This “black-box” approach assumes no a priori knowledge of the chemical system and is hence particularly suited to rapid development of processes to manufacture specialist low-volume, high-value products. The approach was demonstrated in discovery of process recipes for a semibatch emulsion copolymerization, targeting a specific particle size and full conversion. PMID:26435638
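The abstract above does not describe its algorithm, so the following is only a sketch of one generic building block of multiobjective optimization: filtering tried recipes down to the non-dominated (Pareto) set. The two objectives mirror the stated targets (particle size error and unconverted fraction, both minimised), but the numbers and function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (absolute particle-size error in nm, unconverted monomer fraction); made-up values.
recipes = [(5.0, 0.10), (12.0, 0.02), (8.0, 0.05), (15.0, 0.12), (8.0, 0.20)]
front = pareto_front(recipes)
# (15.0, 0.12) and (8.0, 0.20) drop out because (5.0, 0.10) beats them on both targets.
```

An optimiser that aims to limit expensive experiments would typically propose the next recipe from a model fitted to the points tried so far and then update this front with the measured result.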
Carpenter, Kristy A; Huang, Xudong
2018-06-07
Virtual Screening (VS) has emerged as an important tool in the drug development process, as it conducts efficient in silico searches over millions of compounds, ultimately increasing yields of potential drug leads. As a subset of Artificial Intelligence (AI), Machine Learning (ML) is a powerful way of conducting VS for drug leads. ML for VS generally involves assembling a filtered training set of compounds, comprised of known actives and inactives. After training the model, it is validated and, if sufficiently accurate, used on previously unseen databases to screen for novel compounds with desired drug target binding activity. The study aims to review ML-based methods used for VS and applications to Alzheimer's disease (AD) drug discovery. To update the current knowledge on ML for VS, we review thorough backgrounds, explanations, and VS applications of the following ML techniques: Naïve Bayes (NB), k-Nearest Neighbors (kNN), Support Vector Machines (SVM), Random Forests (RF), and Artificial Neural Networks (ANN). All techniques have found success in VS, but the future of VS is likely to lean more heavily toward the use of neural networks - and more specifically, Convolutional Neural Networks (CNN), which are a subset of ANN that utilize convolution. We additionally conceptualize a workflow for conducting ML-based VS for potential therapeutics for AD, a complex neurodegenerative disease with no known cure or prevention. This serves both as an example of how to apply the concepts introduced earlier in the review and as a potential workflow for future implementation. Different ML techniques are powerful tools for VS, albeit each with advantages and disadvantages. ML-based VS can be applied to AD drug development. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
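As a rough illustration of the train-then-screen pattern the abstract above describes, the sketch below applies k-Nearest Neighbors, one of the listed techniques, to compounds encoded as sets of "on" fingerprint bits. The Tanimoto similarity measure, the bit sets, the labels, and the function names are all assumptions for illustration and are not taken from the review.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query, training, k=3):
    """Label a query by majority vote of its k most similar training compounds."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Made-up training set of known actives and inactives (fingerprint bits, label).
training = [
    (frozenset({1, 2, 3, 4}), "active"),
    (frozenset({1, 2, 3, 5}), "active"),
    (frozenset({2, 3, 4, 6}), "active"),
    (frozenset({7, 8, 9}), "inactive"),
    (frozenset({7, 8, 10}), "inactive"),
    (frozenset({8, 9, 11}), "inactive"),
]

# Screening previously unseen compounds.
prediction_a = knn_predict(frozenset({1, 2, 3, 6}), training)
prediction_b = knn_predict(frozenset({7, 8, 9, 10}), training)
```

With two classes, an odd `k` avoids tied votes. In practice the fingerprints would come from a cheminformatics toolkit and the model would be validated on held-out compounds before screening.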
Salvador-Carulla, L; Lukersmith, S; Sullivan, W
2017-04-01
Guideline methods to develop recommendations dedicate most effort to organising discovery and corroboration knowledge following the evidence-based medicine (EBM) framework. Guidelines typically use a single dimension of information, and generally discard contextual evidence, formal expert knowledge and consumers' experiences in the process. In recognition of the limitations of guidelines in complex cases, complex interventions and systems research, there has been significant effort to develop new tools, guides, resources and structures to use alongside EBM methods of guideline development. In addition to these advances, a new framework based on the philosophy of science is required. Guidelines should be defined as implementation decision support tools for improving the decision-making process in real-world practice and not only as a procedure to optimise the knowledge base of scientific discovery and corroboration. A shift from the model of the EBM pyramid of corroboration of evidence to the use of a broader multi-domain perspective, graphically depicted as a 'Greek temple', could be considered. This model takes into account the different stages of scientific knowledge (discovery, corroboration and implementation), the sources of knowledge relevant to guideline development (experimental, observational, contextual, expert-based and experiential), their underlying inference mechanisms (deduction, induction, abduction, means-end inferences) and a more precise definition of evidence and related terms. The applicability of this broader approach is presented for the development of the Canadian Consensus Guidelines for the Primary Care of People with Developmental Disabilities.
NASA Astrophysics Data System (ADS)
This document is dedicated first to the criteria used to select a candidate asteroid. It contains the known characteristics of this asteroid as well as the assumptions made about it. It ends with a preliminary study of other possible more favorable candidates which might be found in the near future. Special attention is paid to the possible existence of Earth-Sun Trojan asteroids. Second, there is a description of the current state of our limited knowledge about the asteroids, and of the instruments and techniques being used to improve this knowledge. The contribution to asteroid research which can be expected from the new instruments already in space or due to be launched in this decade is then discussed. The last part of this document gives a description of different ways of improving our knowledge about the asteroids, both quantitatively and qualitatively. A proposal requiring reasonable financing and manpower to improve asteroid research is presented. It is believed that the implementation of such a program would have a dramatic effect on asteroid research. For example, a significant increase in both the rate of discovery of asteroids and their corresponding orbital parameters would be obtained. This program could be fully operational 3 years after its implementation.
NASA Astrophysics Data System (ADS)
Boulicaut, Jean-Francois; Jeudy, Baptiste
Knowledge Discovery in Databases (KDD) is a complex interactive process. The promising theoretical framework of inductive databases considers this to be essentially a querying process. It is enabled by a query language which can deal either with raw data or with patterns which hold in the data. Mining patterns turns out to be the so-called inductive query evaluation process, for which constraint-based Data Mining techniques have to be designed. An inductive query specifies declaratively the desired constraints, and algorithms are used to compute the patterns satisfying the constraints in the data. We survey important results of this active research domain. This chapter emphasizes a real breakthrough for hard problems concerning local pattern mining under various constraints and it points out the current directions of research as well.
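To make the idea of evaluating an inductive query concrete, the sketch below computes every itemset that satisfies one declarative constraint, a minimum support threshold. It uses a standard Apriori-style level-wise search, which is swapped in here as a familiar example and is not claimed to be the algorithm surveyed in the chapter; the function names and toy transactions are assumptions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Minimum support is anti-monotone: no superset of an infrequent itemset
    can be frequent, so size n+1 candidates are built only from frequent
    itemsets of size n.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    result = {}
    while level:
        for itemset in level:
            result[itemset] = support(itemset, transactions)
        frequent_now = set(level)
        # Join step: merge pairs of frequent n-itemsets that differ in one item.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: keep a candidate only if all of its n-subsets are frequent
        # and the candidate itself meets the support constraint.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent_now
                        for sub in combinations(c, len(c) - 1))
                 and support(c, transactions) >= min_support]
    return result
```

With the hypothetical transactions `abc`, `ab`, `ac`, `b` and a threshold of 0.5, the search keeps `ab` and `ac` but never counts `abc`, because its subset `bc` is already known to be infrequent.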
A survey of automated methods for sensemaking support
NASA Astrophysics Data System (ADS)
Llinas, James
2014-05-01
Complex, dynamic problems in general present a challenge for the design of analysis support systems and tools largely because there is limited reliable a priori procedural knowledge descriptive of the dynamic processes in the environment. Problem domains that are non-cooperative or adversarial impute added difficulties involving suboptimal observational data and/or data containing the effects of deception or covertness. The fundamental nature of analysis in these environments is based on composite approaches involving mining or foraging over the evidence, discovery and learning processes, and the synthesis of fragmented hypotheses; together, these can be labeled as sensemaking procedures. This paper reviews and analyzes the features, benefits, and limitations of a variety of automated techniques that offer possible support to sensemaking processes in these problem domains.
Antisense oligonucleotide technologies in drug discovery.
Aboul-Fadl, Tarek
2006-09-01
The principle of antisense oligonucleotide (AS-OD) technologies is based on the specific inhibition of unwanted gene expression by blocking mRNA activity. It has long appeared to be an ideal strategy to leverage new genomic knowledge for drug discovery and development. In recent years, AS-OD technologies have been widely used as potent and promising tools for this purpose. There is a rapid increase in the number of antisense molecules progressing in clinical trials. AS-OD technologies provide a simple and efficient approach for drug discovery and development and are expected to become a reality in the near future. This editorial describes the established and emerging AS-OD technologies in drug discovery.
100 years of elementary particles [Beam Line, vol. 27, issue 1, Spring 1997]
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pais, Abraham; Weinberg, Steven; Quigg, Chris
1997-04-01
This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.
100 years of Elementary Particles [Beam Line, vol. 27, issue 1, Spring 1997]
DOE R&D Accomplishments Database
Pais, Abraham; Weinberg, Steven; Quigg, Chris; Riordan, Michael; Panofsky, Wolfgang K. H.; Trimble, Virginia
1997-04-01
This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.
Sports Stars: Analyzing the Performance of Astronomers at Visualization-based Discovery
NASA Astrophysics Data System (ADS)
Fluke, C. J.; Parrington, L.; Hegarty, S.; MacMahon, C.; Morgan, S.; Hassan, A. H.; Kilborn, V. A.
2017-05-01
In this data-rich era of astronomy, there is a growing reliance on automated techniques to discover new knowledge. The role of the astronomer may change from being a discoverer to being a confirmer. But what do astronomers actually look at when they distinguish between “sources” and “noise?” What are the differences between novice and expert astronomers when it comes to visual-based discovery? Can we identify elite talent or coach astronomers to maximize their potential for discovery? By looking to the field of sports performance analysis, we consider an established, domain-wide approach, where the expertise of the viewer (i.e., a member of the coaching team) plays a crucial role in identifying and determining the subtle features of gameplay that provide a winning advantage. As an initial case study, we investigate whether the SportsCode performance analysis software can be used to understand and document how an experienced H I astronomer makes discoveries in spectral data cubes. We find that the process of timeline-based coding can be applied to spectral cube data by mapping spectral channels to frames within a movie. SportsCode provides a range of easy-to-use methods for annotation, including feature-based codes and labels, text annotations associated with codes, and image-based drawing. The outputs, including instance movies that are uniquely associated with coded events, provide the basis for a training program or team-based analysis that could be used in unison with discipline-specific analysis software. In this coordinated approach to visualization and analysis, SportsCode can act as a visual notebook, recording the insight and decisions in partnership with established analysis methods. Alternatively, in situ annotation and coding of features would be a valuable addition to existing and future visualization and analysis packages.
A knowledgebase system to enhance scientific discovery: Telemakus
Fuller, Sherrilynne S; Revere, Debra; Bugni, Paul F; Martin, George M
2004-01-01
Background: With the rapid expansion of scientific research, the ability to effectively find or integrate new domain knowledge in the sciences is proving increasingly difficult. Efforts to improve and speed up scientific discovery are being explored on a number of fronts. However, much of this work is based on traditional search and retrieval approaches and the bibliographic citation presentation format remains unchanged. Methods: Case study. Results: The Telemakus KnowledgeBase System provides flexible new tools for creating knowledgebases to facilitate retrieval and review of scientific research reports. In formalizing the representation of the research methods and results of scientific reports, Telemakus offers a potential strategy to enhance the scientific discovery process. While other research has demonstrated that aggregating and analyzing research findings across domains augments knowledge discovery, the Telemakus system is unique in combining document surrogates with interactive concept maps of linked relationships across groups of research reports. Conclusion: Based on how scientists conduct research and read the literature, the Telemakus KnowledgeBase System brings together three innovations in analyzing, displaying and summarizing research reports across a domain: (1) research report schema, a document surrogate of extracted research methods and findings presented in a consistent and structured schema format which mimics the research process itself and provides a high-level surrogate to facilitate searching and rapid review of retrieved documents; (2) research findings, used to index the documents, allowing searchers to request, for example, research studies which have studied the relationship between neoplasms and vitamin E; and (3) visual exploration interface of linked relationships for interactive querying of research findings across the knowledgebase and graphical displays of what is known as well as, through gaps in the map, what is yet to be tested. The rationale and system architecture are described and plans for the future are discussed. PMID:15507158
Potential for pharmacological manipulation of human embryonic stem cells
Atkinson, Stuart P; Lako, Majlinda; Armstrong, Lyle
2013-01-01
The therapeutic potential of human embryonic stem cells (hESCs) and induced pluripotent stem cells (iPSCs) is vast, allowing disease modelling, drug discovery and testing and, perhaps most importantly, regenerative therapies. However, problems abound: techniques for cultivating self-renewing hESCs tend to give a heterogeneous population of self-renewing and partially differentiated cells and generally include animal-derived products that can be cost-prohibitive for large-scale production, and effective lineage-specific differentiation protocols still remain relatively undefined and are inefficient at producing large amounts of cells for therapeutic use. Furthermore, the mechanisms and signalling pathways that mediate pluripotency and differentiation are still to be fully appreciated. However, in recent years, the development/discovery of a range of effective small molecule inhibitors/activators has had a huge impact on hESC biology. Large-scale screening techniques, coupled with greater knowledge of the pathways involved, have generated pharmacological agents that can boost hESC pluripotency/self-renewal and survival and have greatly increased the efficiency of various differentiation protocols, while also aiding the delineation of several important signalling pathways. Within this review, we hope to describe the current uses of small molecule inhibitors/activators in hESC biology and their potential uses in the future. LINKED ARTICLES This article is part of a themed section on Regenerative Medicine and Pharmacology: A Look to the Future. To view the other articles in this section visit http://dx.doi.org/10.1111/bph.2013.169.issue-2 PMID:22515554
Knowledge Discovery from Climate Data using Graph-Based Methods
NASA Astrophysics Data System (ADS)
Steinhaeuser, K.
2012-04-01
Climate and Earth sciences have recently experienced a rapid transformation from a historically data-poor to a data-rich environment, thus bringing them into the realm of the Fourth Paradigm of scientific discovery - a term coined by the late Jim Gray (Hey et al. 2009), the other three being theory, experimentation and computer simulation. In particular, climate-related observations from remote sensors on satellites and weather radars, in situ sensors and sensor networks, as well as outputs of climate or Earth system models from large-scale simulations, provide terabytes of spatio-temporal data. These massive and information-rich datasets offer a significant opportunity for advancing climate science and our understanding of the global climate system, yet current analysis techniques are not able to fully realize their potential benefits. We describe a class of computational approaches, specifically from the data mining and machine learning domains, which may be novel to the climate science domain and can assist in the analysis process. Computer scientists have developed spatial and spatio-temporal analysis techniques for a number of years now, and many of them may be applicable and/or adaptable to problems in climate science. We describe a large-scale, NSF-funded project aimed at addressing climate science questions using computational analysis methods; team members include computer scientists, statisticians, and climate scientists from various backgrounds. One of the major thrusts is in the development of graph-based methods, and several illustrative examples of recent work in this area will be presented.
Recent trends in spin-resolved photoelectron spectroscopy
NASA Astrophysics Data System (ADS)
Okuda, Taichi
2017-12-01
Since the discovery of the Rashba effect on crystal surfaces and also the discovery of topological insulators, spin- and angle-resolved photoelectron spectroscopy (SARPES) has become more and more important, as the technique can measure directly the electronic band structure of materials with spin resolution. In the same way that the discovery of high-Tc superconductors promoted the development of high-resolution angle-resolved photoelectron spectroscopy, the discovery of this new class of materials has stimulated the development of new SARPES apparatus with new functions and higher resolution, such as spin vector analysis, ten times higher energy and angular resolution than conventional SARPES, multichannel spin detection, and so on. In addition, the utilization of vacuum ultra violet lasers also opens a pathway to the realization of novel SARPES measurements. In this review, such recent trends in SARPES techniques and measurements will be overviewed.
An Expert System toward Building An Earth Science Knowledge Graph
NASA Astrophysics Data System (ADS)
Zhang, J.; Duan, X.; Ramachandran, R.; Lee, T. J.; Bao, Q.; Gatlin, P. N.; Maskey, M.
2017-12-01
In this ongoing work, we aim to build foundations of Cognitive Computing for Earth Science research. The goal of our project is to develop an end-to-end automated methodology for incrementally constructing Knowledge Graphs for Earth Science (KG4ES). These knowledge graphs can then serve as the foundational components for building cognitive systems in Earth science, enabling researchers to uncover new patterns and hypotheses that are virtually impossible to identify today. In addition, this research focuses on developing mining algorithms needed to exploit these constructed knowledge graphs. As such, these graphs will free knowledge from publications that are generated in a very linear, deterministic manner, and structure knowledge in a way that users can both interact and connect with relevant pieces of information. Our major contributions are two-fold. First, we have developed an end-to-end methodology for constructing Knowledge Graphs for Earth Science (KG4ES) using existing corpus of journal papers and reports. One of the key challenges in any machine learning, especially deep learning applications, is the need for robust and large training datasets. We have developed techniques capable of automatically retraining models and incrementally building and updating KG4ES, based on ever evolving training data. We also adopt the evaluation instrument based on common research methodologies used in Earth science research, especially in Atmospheric Science. Second, we have developed an algorithm to infer new knowledge that can exploit the constructed KG4ES. In more detail, we have developed a network prediction algorithm aiming to explore and predict possible new connections in the KG4ES and aid in new knowledge discovery.
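The abstract does not say how its network prediction algorithm works, so the following is only a generic illustration of link prediction on a graph: it ranks unconnected node pairs by the number of neighbors they share (a common-neighbors baseline). The scoring rule, the function name, and the toy concept graph are all assumptions made for this sketch, not details of KG4ES.

```python
from itertools import combinations


def predict_links(edges, top_n=3):
    """Rank unconnected node pairs by how many neighbors they share.

    edges: iterable of undirected (u, v) pairs.
    Returns up to top_n tuples (u, v, shared_neighbor_count), best first.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scored = []
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        shared = len(neighbors[u] & neighbors[v])
        if shared:
            scored.append((shared, u, v))

    # Highest score first; ties broken alphabetically so output is stable.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return [(u, v, shared) for shared, u, v in scored[:top_n]]


# Hypothetical concept graph (assumption), e.g. terms co-occurring in papers.
EDGES = [
    ("aerosol", "cloud"),
    ("aerosol", "rainfall"),
    ("radar", "cloud"),
    ("radar", "rainfall"),
    ("radar", "hail"),
]
```

In this toy graph, `aerosol` and `radar` are not directly linked but share two neighbors, so the pair is ranked first as a candidate new connection.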
Is there a best strategy for drug discovery?--SMR Meeting. 13 March 2003, London, UK.
Lunec, Anna
2003-05-01
This gathering of members from academia and industry allowed the sharing of ideas and techniques for the acceleration of drug discovery, and it was clear that there is a need for a more streamlined approach to discovery and development. Clearly, new technologies will aid in the discovery process, but the abilities of the human brain to analyze and interpret data should not be overlooked, as many discoveries have been made by chance or as the result of a hunch, and it would be a shame if the advent of artificial intelligence quashed that inquisitive aspect of drug discovery.
A network model of knowledge accumulation through diffusion and upgrade
NASA Astrophysics Data System (ADS)
Zhuang, Enyu; Chen, Guanrong; Feng, Gang
2011-07-01
In this paper, we introduce a model to describe knowledge accumulation through knowledge diffusion and knowledge upgrade in a multi-agent network. Here, knowledge diffusion refers to the distribution of existing knowledge in the network, while knowledge upgrade means the discovery of new knowledge. It is found that the population of the network and the number of each agent’s neighbors affect the speed of knowledge accumulation. Four different policies for updating the neighboring agents are thus proposed, and their influence on the speed of knowledge accumulation and the topology evolution of the network are also studied.
Interfaith Education: An Islamic Perspective
ERIC Educational Resources Information Center
Pallavicini, Yahya Sergio Yahe
2016-01-01
According to a teaching of the Prophet Muhammad, "the quest for knowledge is the duty of each Muslim, male or female", where knowledge is meant as the discovery of the real value of things and of oneself in relationship with the world in which God has placed us. This universal dimension of knowledge is in fact a wealth of wisdom of the…
Comparative study on drug safety surveillance between medical students of Malaysia and Nigeria.
Abubakar, Abdullahi Rabiu; Ismail, Salwani; Rahman, Nor Iza A; Haque, Mainul
2015-01-01
Internationally, there have been remarkable achievements in the areas of drug discovery, drug design, and clinical trials. New and efficient drug formulation techniques are widely available, which has led to success in the treatment of several diseases. Despite these achievements, a large number of patients continue to experience adverse drug reactions (ADRs), and the majority of them are yet to be on record. The purpose of this survey is to compare knowledge, attitude, and practice with respect to ADRs and pharmacovigilance (PV) between medical students of Malaysia and Nigeria and to determine if there is a relationship between their knowledge and practice. A cross-sectional, questionnaire-based survey involving year IV and year V medical students of the Department of Medicine, Universiti Sultan Zainal Abidin and Bayero University Kano was carried out. The questionnaire, which comprised 25 questions on knowledge, attitude, and practice, was adopted, modified, validated, and administered to them. The responses were analyzed using SPSS version 20. The response rate from each country was 74%. There was a statistically significant difference in mean knowledge and practice score on ADRs and PV between medical students of Malaysia and Nigeria, both at P<0.000. No significant difference in attitude was observed at P=0.389. Also, a statistically significant relationship was recorded between their knowledge and practice (r=0.229, P=0.001), although the relationship was weak. Nigerian medical students have better knowledge and practice than those of Malaysia, although they need improvement. Imparting knowledge of ADRs and PV among medical students will upgrade their practice and enhance health care delivery services in the future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yue, Peng; Gong, Jianya; Di, Liping
A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information and discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.
Text Mining in Organizational Research
Kobayashi, Vladimer B.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.
2017-01-01
Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies. PMID:29881248
Text Mining in Organizational Research.
Kobayashi, Vladimer B; Mol, Stefan T; Berkers, Hannah A; Kismihók, Gábor; Den Hartog, Deanne N
2018-07-01
Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
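Of the techniques this abstract lists, distance and similarity computing is the simplest to show in a few lines. The sketch below turns short texts into term-count vectors and compares them with cosine similarity; it is an illustrative example only, with assumed function names and made-up job-vacancy snippets, and it omits preprocessing steps (stop-word removal, stemming, weighting) that a real analysis would likely include.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case a text and count its alphabetic tokens (bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vacancy snippets (assumption), compared pairwise.
analyst = term_counts("data analyst")
engineer = term_counts("data engineer")
nurse = term_counts("registered nurse")
```

Here `analyst` and `engineer` share one of two terms and score about 0.5, while `analyst` and `nurse` share none and score 0.0. The same pairwise scores could feed the clustering step the abstract mentions.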
NASA Astrophysics Data System (ADS)
Demir, I.; Krajewski, W. F.
2013-12-01
As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with the general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data, and modify the parameters to create custom views of the data to gain insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component in building comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed in light of these challenges.
Basic science of anterior cruciate ligament injury and repair
Kiapour, A. M.; Murray, M. M.
2014-01-01
Injury to the anterior cruciate ligament (ACL) is one of the most devastating and frequent injuries of the knee. Surgical reconstruction is the current standard of care for treatment of ACL injuries in active patients. The widespread adoption of ACL reconstruction over primary repair was based on early perception of the limited healing capacity of the ACL. Although the majority of ACL reconstruction surgeries successfully restore gross joint stability, post-traumatic osteoarthritis is commonplace following these injuries, even with ACL reconstruction. The development of new techniques to limit the long-term clinical sequelae associated with ACL reconstruction has been the main focus of research over the past decades. The improved knowledge of healing, along with recent advances in tissue engineering and regenerative medicine, has resulted in the discovery of novel biologically augmented ACL-repair techniques that have satisfactory outcomes in preclinical studies. This instructional review provides a summary of the latest advances made in ACL repair. Cite this article: Bone Joint Res 2014;3:20–31. PMID:24497504
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-20
... both sides would participate in an Exchange Auction, this proposed change would aid in price discovery... auction price. This proposed change would aid in price discovery and help to reduce the likelihood of... Sell Shares and, therefore, a User would never have complete knowledge of liquidity available on both...
ERIC Educational Resources Information Center
Carter, Sunshine; Traill, Stacie
2017-01-01
Electronic resource access troubleshooting is familiar work in most libraries. The added complexity introduced when a library implements a web-scale discovery service, however, creates a strong need for well-organized, rigorous training to enable troubleshooting staff to provide the best service possible. This article outlines strategies, tools,…
ERIC Educational Resources Information Center
Yu, Pulan
2012-01-01
Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…
ERIC Educational Resources Information Center
Silbersack, Elionora W.
2014-01-01
The purpose of this qualitative study was to expand the scarce information available on how mothers first observe their children's early development, assess potential problems, and then come to recognize their concerns. In-depth knowledge about mothers' perspectives on the discovery process can help social workers to promote identification of…
Augmented Reality-Based Simulators as Discovery Learning Tools: An Empirical Study
ERIC Educational Resources Information Center
Ibáñez, María-Blanca; Di-Serio, Ángela; Villarán-Molina, Diego; Delgado-Kloos, Carlos
2015-01-01
This paper reports empirical evidence on having students use AR-SaBEr, a simulation tool based on augmented reality (AR), to discover the basic principles of electricity through a series of experiments. AR-SaBEr was enhanced with knowledge-based support and inquiry-based scaffolding mechanisms, which proved useful for discovery learning in…
76 FR 36320 - Rules of Practice in Proceedings Relative to False Representation and Lottery Orders
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-22
... officers. 952.18 Evidence. 952.19 Subpoenas. 952.20 Witness fees. 952.21 Discovery. 952.22 Transcript. 952..., motions, proposed orders, and other documents for the record. Discovery need not be filed except as may be... witnesses, that the statement correctly states the witness's opinion or knowledge concerning the matters in...
Making the Long Tail Visible: Social Networking Sites and Independent Music Discovery
ERIC Educational Resources Information Center
Gaffney, Michael; Rafferty, Pauline
2009-01-01
Purpose: The purpose of this paper is to investigate users' knowledge and use of social networking sites and folksonomies to discover if social tagging and folksonomies, within the area of independent music, aid in its information retrieval and discovery. The sites examined in this project are MySpace, Lastfm, Pandora and Allmusic. In addition,…
2016-01-01
Observations of individual organisms (data) can be combined with expert ecological knowledge of species, especially causal knowledge, to model and extract from flower–visiting data useful information about behavioral interactions between insect and plant organisms, such as nectar foraging and pollen transfer. We describe and evaluate a method to elicit and represent such expert causal knowledge of behavioral ecology, and discuss the potential for wider application of this method to the design of knowledge-based systems for knowledge discovery in biodiversity and ecosystem informatics. PMID:27851814
Marie Curie's Doctoral Thesis: Prelude to a Nobel Prize.
ERIC Educational Resources Information Center
Wolke, Robert L.
1988-01-01
Traces the life and research techniques of Marie Curie's doctoral dissertation leading to the discovery and purification of radium from ore. Reexamines the discoveries of other scientists that helped lead to this separation. (ML)
Respiratory Toxicity Biomarkers
Advances in high-throughput genomic, proteomic and metabolomic techniques have accelerated the pace of lung biomarker discovery. Recent growth in the discovery of new lung toxicity/disease biomarkers has led to significant advances in our understanding of pathological proce...
Discovery learning model with geogebra assisted for improvement mathematical visual thinking ability
NASA Astrophysics Data System (ADS)
Juandi, D.; Priatna, N.
2018-05-01
The main goal of this study is to improve the mathematical visual thinking ability of high school students through implementation of the Discovery Learning Model assisted by GeoGebra. The study used a quasi-experimental method with a non-random pretest-posttest control design. The sample consisted of 62 grade XI senior high school students at a school in Bandung district. The required data were collected through documentation, observation, written tests, interviews, daily journals, and student worksheets. The results of this study are: 1) The improvement in Mathematical Visual Thinking Ability of students who learned with the GeoGebra-assisted Discovery Learning Model is significantly higher than that of students who received conventional learning; 2) There is a difference in the improvement of students’ Mathematical Visual Thinking Ability between groups based on prior mathematical knowledge (high, medium, and low) who received the treatment; 3) The Mathematical Visual Thinking Ability improvement of the high group is significantly higher than that of the medium and low groups; 4) The quality of improvement for students with high and low prior knowledge is in the moderate category, while the quality of improvement in the high category was achieved by students with medium prior knowledge.
Systematic identification of latent disease-gene associations from PubMed articles.
Zhang, Yuji; Shen, Feichen; Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang
2018-01-01
Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated associations with noise, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.
Danchin, Antoine; Ouzounis, Christos; Tokuyasu, Taku; Zucker, Jean-Daniel
2018-07-01
Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from 'the sequence tells the structure tells the function' fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Systematic identification of latent disease-gene associations from PubMed articles
Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang
2018-01-01
Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated associations with noise, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research. PMID:29373609
2017-01-01
The development of structure-guided drug discovery is a story of knowledge exchange where new ideas originate from all parts of the research ecosystem. Dorothy Crowfoot Hodgkin obtained insulin from Boots Pure Drug Company in the 1930s and insulin crystallization was optimized in the company Novo in the 1950s, allowing the structure to be determined at Oxford University. The structure of renin was developed in academia, on this occasion in London, in response to a need to develop antihypertensives in pharma. The idea of a dimeric aspartic protease came from an international academic team and was discovered in HIV; it eventually led to new HIV antivirals being developed in industry. Structure-guided fragment-based discovery was developed in large pharma and biotechs, but has been exploited in academia for the development of new inhibitors targeting protein–protein interactions and also antimicrobials to combat mycobacterial infections such as tuberculosis. These observations provide a strong argument against the so-called ‘linear model’, where ideas flow only in one direction from academic institutions to industry. Structure-guided drug discovery is a story of applications of protein crystallography and knowledge exchange between academia and industry that has led to new drug approvals for cancer and other common medical conditions by the Food and Drug Administration in the USA, as well as hope for the treatment of rare genetic diseases and infectious diseases that are a particular challenge in the developing world. PMID:28875019
Choosing experiments to accelerate collective discovery
Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.
2015-01-01
A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies. PMID:26554009
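The mapping described in the abstract above (importance as degree centrality, difficulty as network distance) can be illustrated with a minimal sketch. The toy graph, the node names and the helper functions below are hypothetical and are not taken from the paper; they only show how the two network properties are computed.

```python
from collections import deque

# Hypothetical toy knowledge network: nodes are scientific entities,
# edges are links already reported in the literature.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E"), ("E", "F")]


def build_graph(edge_list):
    """Build an undirected adjacency map from a list of edges."""
    graph = {}
    for u, v in edge_list:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def network_distance(graph, source, target):
    """Shortest-path length between two nodes (breadth-first search)."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in graph[node]:
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None  # the two nodes are not connected


graph = build_graph(edges)
centrality = degree_centrality(graph)       # proxy for entity importance
difficulty = network_distance(graph, "B", "F")  # proxy for problem difficulty
```

In this toy network, a candidate problem linking "B" and "F" spans a long path and would count as difficult, while any problem involving the hub "A" touches a highly central entity.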
NASA Astrophysics Data System (ADS)
Demir, I.; Sermet, M. Y.
2016-12-01
Nobody is immune from extreme events or natural hazards that can lead to large-scale consequences for the nation and public. One of the solutions to reduce the impacts of extreme events is to invest in improving resilience with the ability to better prepare, plan, recover, and adapt to disasters. The National Research Council (NRC) report discusses the topic of how to increase resilience to extreme events through a vision of a resilient nation in the year 2030. The report highlights the importance of data, information, gaps and knowledge challenges that need to be addressed, and suggests that every individual have access to risk and vulnerability information to make their communities more resilient. This abstract presents our project on developing a resilience framework for flooding to improve societal preparedness, with the following objectives: (a) develop a generalized ontology for extreme events with primary focus on flooding; (b) develop a knowledge engine with voice recognition, artificial intelligence, natural language processing, and inference engine. The knowledge engine will utilize the flood ontology and concepts to connect user input to relevant knowledge discovery outputs on flooding; (c) develop a data acquisition and processing framework from existing environmental observations, forecast models, and social networks. The system will utilize the framework, capabilities and user base of the Iowa Flood Information System (IFIS) to populate and test the system; (d) develop a communication framework to support user interaction and delivery of information to users. The interaction and delivery channels will include voice and text input via web-based system (e.g. IFIS), agent-based bots (e.g. Microsoft Skype, Facebook Messenger), smartphone and augmented reality applications (e.g. smart assistant), and automated web workflows (e.g. IFTTT, CloudWork) to open the knowledge discovery for flooding to thousands of community extensible web workflows.
State of the Art: Response Assessment in Lung Cancer in the Era of Genomic Medicine
Hatabu, Hiroto; Johnson, Bruce E.; McLoud, Theresa C.
2014-01-01
Tumor response assessment has been a foundation for advances in cancer therapy. Recent discoveries of effective targeted therapy for specific genomic abnormalities in lung cancer and their clinical application have brought revolutionary advances in lung cancer therapy and transformed the oncologist’s approach to patients with lung cancer. Because imaging is a major method of response assessment in lung cancer both in clinical trials and practice, radiologists must understand the genomic alterations in lung cancer and the rapidly evolving therapeutic approaches to effectively communicate with oncology colleagues and maintain the key role in lung cancer care. This article describes the origin and importance of tumor response assessment, presents the recent genomic discoveries in lung cancer and therapies directed against these genomic changes, and describes how these discoveries affect the radiology community. The authors then summarize the conventional Response Evaluation Criteria in Solid Tumors and World Health Organization guidelines, which continue to be the major determinants of trial endpoints, and describe their limitations particularly in an era of genomic-based therapy. More advanced imaging techniques for lung cancer response assessment are presented, including computed tomography tumor volume and perfusion, dynamic contrast material–enhanced and diffusion-weighted magnetic resonance imaging, and positron emission tomography with fluorine 18 fluorodeoxyglucose and novel tracers. State-of-art knowledge of lung cancer biology, treatment, and imaging will help the radiology community to remain effective contributors to the personalized care of lung cancer patients. © RSNA, 2014 PMID:24661292
discovery toolset for Emulytics v. 1.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fritz, David; Crussell, Jonathan
The discovery toolset for Emulytics enables the construction of high-fidelity emulation models of systems. The toolset consists of a set of tools and techniques to automatically go from network discovery of operational systems to emulating those complex systems. Our toolset combines data from host discovery and network mapping tools into an intermediate representation that can then be further refined. Once the intermediate representation reaches the desired state, our toolset supports emitting the Emulytics models with varying levels of specificity based on experiment needs.
Quantitative mass spectrometry: an overview
NASA Astrophysics Data System (ADS)
Urban, Pawel L.
2016-10-01
Mass spectrometry (MS) is a mainstream chemical analysis technique in the twenty-first century. It has contributed to numerous discoveries in chemistry, physics and biochemistry. Hundreds of research laboratories scattered all over the world use MS every day to investigate fundamental phenomena on the molecular level. MS is also widely used by industry, especially in drug discovery, quality control and food safety protocols. In some cases, mass spectrometers are indispensable and irreplaceable by any other metrological tools. The uniqueness of MS is due to the fact that it enables direct identification of molecules based on the mass-to-charge ratios as well as fragmentation patterns. Thus, for several decades now, MS has been used in qualitative chemical analysis. To address the pressing need for quantitative molecular measurements, a number of laboratories focused on technological and methodological improvements that could render MS a fully quantitative metrological platform. In this theme issue, the experts working for some of those laboratories share their knowledge and enthusiasm about quantitative MS. I hope this theme issue will benefit readers, and foster fundamental and applied research based on quantitative MS measurements. This article is part of the themed issue 'Quantitative mass spectrometry'.
Jackson, Rebecca D; Best, Thomas M; Borlawsky, Tara B; Lai, Albert M; James, Stephen; Gurcan, Metin N
2012-01-01
The conduct of clinical and translational research regularly involves the use of a variety of heterogeneous and large-scale data resources. Scalable methods for the integrative analysis of such resources, particularly when attempting to leverage computable domain knowledge in order to generate actionable hypotheses in a high-throughput manner, remain an open area of research. In this report, we describe both a generalizable design pattern for such integrative knowledge-anchored hypothesis discovery operations and our experience in applying that design pattern in the experimental context of a set of driving research questions related to the publicly available Osteoarthritis Initiative data repository. We believe that this ‘test bed’ project and the lessons learned during its execution are both generalizable and representative of common clinical and translational research paradigms. PMID:22647689
Huo, Zhiguang; Tseng, George
2017-01-01
Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso. An algorithm using an alternating direction method of multiplier (ADMM) will be applied for fast optimization. Simulation and three real applications in breast cancer and leukemia will be used to compare is-K means with existing methods and demonstrate its superior clustering accuracy, feature selection, functional annotation of detected molecular features and computing efficiency. PMID:28959370
Huo, Zhiguang; Tseng, George
2017-06-01
Cancer subtypes discovery is the first step to deliver personalized medicine to cancer patients. With the accumulation of massive multi-level omics datasets and established biological knowledge databases, omics data integration with incorporation of rich existing biological knowledge is essential for deciphering a biological mechanism behind the complex diseases. In this manuscript, we propose an integrative sparse K-means (is-K means) approach to discover disease subtypes with the guidance of prior biological knowledge via sparse overlapping group lasso. An algorithm using an alternating direction method of multiplier (ADMM) will be applied for fast optimization. Simulation and three real applications in breast cancer and leukemia will be used to compare is-K means with existing methods and demonstrate its superior clustering accuracy, feature selection, functional annotation of detected molecular features and computing efficiency.
A Semantic Lexicon-Based Approach for Sense Disambiguation and Its WWW Application
NASA Astrophysics Data System (ADS)
di Lecce, Vincenzo; Calabrese, Marco; Soldo, Domenico
This work proposes a basic framework for resolving sense disambiguation through the use of Semantic Lexicon, a machine readable dictionary managing both word senses and lexico-semantic relations. More specifically, polysemous ambiguity characterizing Web documents is discussed. The adopted Semantic Lexicon is WordNet, a lexical knowledge-base of English words widely adopted in many research studies referring to knowledge discovery. The proposed approach extends recent works on knowledge discovery by focusing on the sense disambiguation aspect. By exploiting the structure of WordNet database, lexico-semantic features are used to resolve the inherent sense ambiguity of written text with particular reference to HTML resources. The obtained results may be extended to generic hypertextual repositories as well. Experiments show that polysemy reduction can be used to hint about the meaning of specific senses in given contexts.
Rector, Annabel; Tachezy, Ruth; Van Ranst, Marc
2004-01-01
The discovery of novel viruses has often been accomplished by using hybridization-based methods that necessitate the availability of a previously characterized virus genome probe or knowledge of the viral nucleotide sequence to construct consensus or degenerate PCR primers. In their natural replication cycle, certain viruses employ a rolling-circle mechanism to propagate their circular genomes, and multiply primed rolling-circle amplification (RCA) with φ29 DNA polymerase has recently been applied in the amplification of circular plasmid vectors used in cloning. We employed an isothermal RCA protocol that uses random hexamer primers to amplify the complete genomes of papillomaviruses without the need for prior knowledge of their DNA sequences. We optimized this RCA technique with extracted human papillomavirus type 16 (HPV-16) DNA from W12 cells, using a real-time quantitative PCR assay to determine amplification efficiency, and obtained a 2.4 × 10^4-fold increase in HPV-16 DNA concentration. We were able to clone the complete HPV-16 genome from this multiply primed RCA product. The optimized protocol was subsequently applied to a bovine fibropapillomatous wart tissue sample. Whereas no papillomavirus DNA could be detected by restriction enzyme digestion of the original sample, multiply primed RCA enabled us to obtain a sufficient amount of papillomavirus DNA for restriction enzyme analysis, cloning, and subsequent sequencing of a novel variant of bovine papillomavirus type 1. The multiply primed RCA method allows the discovery of previously unknown papillomaviruses, and possibly also other circular DNA viruses, without a priori sequence information. PMID:15113879
State of the Art in Tumor Antigen and Biomarker Discovery
Even-Desrumeaux, Klervi; Baty, Daniel; Chames, Patrick
2011-01-01
Our knowledge of tumor immunology has resulted in multiple approaches for the treatment of cancer. However, a gap has opened between research on new tumor markers and the development of immunotherapy, and very few markers exist that can be used for treatment. The challenge is now to discover new targets for active and passive immunotherapy. This review aims at describing recent advances in biomarker and tumor antigen discovery in terms of antigen nature and localization, and highlights the most recent approaches used for their discovery, including “omics” technology. PMID:24212823
ERIC Educational Resources Information Center
Pauleen, David J.; Corbitt, Brian; Yoong, Pak
2007-01-01
Purpose: To provide a conceptual model for the discovery and articulation of emergent organizational knowledge, particularly knowledge that develops when people work with new technologies. Design/methodology/approach: The model is based on two widely accepted research methods--action learning and grounded theory--and is illustrated using a case…
Modeling Emergence in Neuroprotective Regulatory Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanfilippo, Antonio P.; Haack, Jereme N.; McDermott, Jason E.
2013-01-05
The use of predictive modeling in the analysis of gene expression data can greatly accelerate the pace of scientific discovery in biomedical research by enabling in silico experimentation to test disease triggers and potential drug therapies. Techniques that focus on modeling emergence, such as agent-based modeling and multi-agent simulations, are of particular interest as they support the discovery of pathways that may have never been observed in the past. Thus far, these techniques have been primarily applied at the multi-cellular level, or have focused on signaling and metabolic networks. We present an approach where emergence modeling is extended to regulatory networks and demonstrate its application to the discovery of neuroprotective pathways. An initial evaluation of the approach indicates that emergence modeling provides novel insights for the analysis of regulatory networks that can advance the discovery of acute treatments for stroke and other diseases.
Cache-Cache Comparison for Supporting Meaningful Learning
ERIC Educational Resources Information Center
Wang, Jingyun; Fujino, Seiji
2015-01-01
The paper presents a meaningful discovery learning environment called "cache-cache comparison" for a personalized learning support system. The processing of seeking hidden relations or concepts in "cache-cache comparison" is intended to encourage learners to actively locate new knowledge in their knowledge framework and check…
From Wisdom to Innocence: Passing on the Knowledge of the Night Sky
NASA Technical Reports Server (NTRS)
Shope, R.
1996-01-01
Memorable learning can happen when the whole family shares the thrill of discovery together. The fascination of the night sky presents a perfect opportunity for gifted parents and children to experience the tradition of passing on knowledge from generation to generation.
Ontology-guided data preparation for discovering genotype-phenotype relationships.
Coulet, Adrien; Smaïl-Tabbone, Malika; Benlian, Pascale; Napoli, Amedeo; Devignes, Marie-Dominique
2008-04-25
Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences. Bio-ontologies may nowadays play key roles in knowledge discovery in life science providing semantics to data and to extracted units, by taking advantage of the progress of Semantic Web technologies concerning the understanding and availability of tools for knowledge representation, extraction, and reasoning. This paper presents a method that exploits bio-ontologies for guiding data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, class descriptions, are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case-study relative to the search of genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results. The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.
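As a minimal sketch of how subsumption in an ontology can guide data selection before the mining step, the example below keeps only the records whose annotated class falls under a chosen ontology class. The hierarchy, the class names and the record layout are all hypothetical; they are not drawn from the paper or from any real bio-ontology.

```python
# Hypothetical "is-a" hierarchy (child class -> parent class).
IS_A = {
    "missense_variant": "coding_variant",
    "nonsense_variant": "coding_variant",
    "coding_variant": "variant",
    "intronic_variant": "variant",
}


def is_subsumed_by(cls, ancestor, is_a=IS_A):
    """Return True if `cls` equals `ancestor` or is one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = is_a.get(cls)  # climb one level; None once the root is passed
    return False


def select_records(records, ancestor):
    """Keep only records whose annotated class is subsumed by `ancestor`."""
    return [r for r in records if is_subsumed_by(r["variant_class"], ancestor)]


# Hypothetical records annotated with ontology classes.
records = [
    {"id": 1, "variant_class": "missense_variant"},
    {"id": 2, "variant_class": "intronic_variant"},
    {"id": 3, "variant_class": "nonsense_variant"},
]

selected = select_records(records, "coding_variant")
selected_ids = [r["id"] for r in selected]
```

Selecting on a class higher in the hierarchy ("variant") would keep all three records, which is one way the choice of ontology level changes the volume of data passed on to mining.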
The dendritic spine story: an intriguing process of discovery.
DeFelipe, Javier
2015-01-01
Dendritic spines are key components of a variety of microcircuits and they represent the majority of postsynaptic targets of glutamatergic axon terminals in the brain. The present article will focus on the discovery of dendritic spines, which was possible thanks to the application of the Golgi technique to the study of the nervous system, and will also explore the early interpretation of these elements. This discovery represents an interesting chapter in the history of neuroscience as it shows us that progress in the study of the structure of the nervous system is based not only on the emergence of new techniques but also on our ability to exploit the methods already available and correctly interpret their microscopic images.
A framework for interval-valued information system
NASA Astrophysics Data System (ADS)
Yin, Yunfei; Gong, Guanghong; Han, Liang
2012-09-01
An interval-valued information system is used to transform a conventional dataset into interval-valued form. To conduct interval-valued data mining, we carry out two investigations: (1) constructing the interval-valued information system, and (2) conducting interval-valued knowledge discovery. In constructing the interval-valued information system, we first discover the paired attributes in the database, then store them in neighbouring locations in a common database and regard them as 'one' new field. In conducting interval-valued knowledge discovery, we use related a priori knowledge as the control objectives and design an approximate closed-loop control mining system. On the implemented experimental platform (prototype), we conduct the corresponding experiments and compare the proposed algorithms with several typical algorithms, such as the Apriori algorithm, the FP-growth algorithm and the CLOSE+ algorithm. The experimental results show that the interval-valued information system method is more effective than the conventional algorithms in discovering interval-valued patterns.
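The construction step described above (finding paired attributes and treating each pair as one field) might look roughly like the following sketch. The `_low`/`_high` naming convention used to detect pairs, the function names and the sample record are assumptions made for illustration; the paper does not specify how pairs are discovered.

```python
def find_paired_attributes(columns, low_suffix="_low", high_suffix="_high"):
    """Return base names of columns that occur as a (low, high) pair.

    Pair detection by suffix is an assumption made for this sketch.
    """
    pairs = []
    for col in columns:
        if col.endswith(low_suffix):
            base = col[: -len(low_suffix)]
            if base + high_suffix in columns:
                pairs.append(base)
    return pairs


def to_interval_record(record, bases, low_suffix="_low", high_suffix="_high"):
    """Merge each paired attribute into one interval-valued field."""
    out = {}
    paired_cols = set()
    for base in bases:
        lo = record[base + low_suffix]
        hi = record[base + high_suffix]
        out[base] = (min(lo, hi), max(lo, hi))  # store as (lower, upper)
        paired_cols.update({base + low_suffix, base + high_suffix})
    for key, value in record.items():
        if key not in paired_cols:
            out[key] = value  # unpaired attributes are copied unchanged
    return out


# Hypothetical conventional record with two paired attributes.
record = {"id": 7, "temp_low": 12.0, "temp_high": 19.5, "bp_low": 80, "bp_high": 120}

bases = find_paired_attributes(list(record))
interval_record = to_interval_record(record, bases)
```

Storing each interval as a single tuple keeps the two bounds together, so a later mining step can treat the pair as one item.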
Knowledge Discovery in Variant Databases Using Inductive Logic Programming
Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D.
2013-01-01
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/. PMID:23589683
Knowledge discovery in variant databases using inductive logic programming.
Nguyen, Hoan; Luu, Tien-Dao; Poch, Olivier; Thompson, Julie D
2013-01-01
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from database (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.
Semantic biomedical resource discovery: a Natural Language Processing framework.
Sfakianaki, Pepi; Koumakis, Lefteris; Sfakianakis, Stelios; Iatraki, Galatia; Zacharioudakis, Giorgos; Graf, Norbert; Marias, Kostas; Tsiknakis, Manolis
2015-09-30
A plethora of publicly available biomedical resources do currently exist and are constantly increasing at a fast rate. In parallel, specialized repositories are been developed, indexing numerous clinical and biomedical tools. The main drawback of such repositories is the difficulty in locating appropriate resources for a clinical or biomedical decision task, especially for non-Information Technology expert users. In parallel, although NLP research in the clinical domain has been active since the 1960s, progress in the development of NLP applications has been slow and lags behind progress in the general NLP domain. The aim of the present study is to investigate the use of semantics for biomedical resources annotation with domain specific ontologies and exploit Natural Language Processing methods in empowering the non-Information Technology expert users to efficiently search for biomedical resources using natural language. A Natural Language Processing engine which can "translate" free text into targeted queries, automatically transforming a clinical research question into a request description that contains only terms of ontologies, has been implemented. The implementation is based on information extraction techniques for text in natural language, guided by integrated ontologies. Furthermore, knowledge from robust text mining methods has been incorporated to map descriptions into suitable domain ontologies in order to ensure that the biomedical resources descriptions are domain oriented and enhance the accuracy of services discovery. The framework is freely available as a web application at ( http://calchas.ics.forth.gr/ ). For our experiments, a range of clinical questions were established based on descriptions of clinical trials from the ClinicalTrials.gov registry as well as recommendations from clinicians. 
Domain experts manually identified the available tools in a tools repository which are suitable for addressing the clinical questions at hand, either individually or as a set of tools forming a computational pipeline. The results were compared with those obtained from an automated discovery of candidate biomedical tools. For the evaluation of the results, precision and recall measurements were used. Our results indicate that the proposed framework has high precision and low recall, implying that the system returns more relevant results than irrelevant ones. Adequate biomedical ontologies, sufficient NLP tools, and biomedical annotation systems of sufficient quality are already available for the implementation of a biomedical resource discovery framework based on the semantic annotation of resources and the use of NLP techniques. The results of the present study demonstrate the clinical utility of the proposed framework, which aims to bridge the gap between a clinical question in natural language and efficient dynamic discovery of biomedical resources.
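The precision and recall evaluation described above can be illustrated with a minimal Python sketch. It assumes the expert-selected tools and the automatically discovered tools are available as simple collections of identifiers; the function name and the tool identifiers are hypothetical and are not taken from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items.

    retrieved: items returned by the automated discovery (assumed input)
    relevant:  items chosen by domain experts as the gold standard (assumed input)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: share of returned items that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
expert_tools = {"tool_a", "tool_b", "tool_c", "tool_d", "tool_e"}
system_tools = ["tool_a", "tool_b"]

p, r = precision_recall(system_tools, expert_tools)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case every returned tool is relevant (high precision), but only two of the five expert-selected tools are found (low recall), which is the pattern the abstract reports.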
Daily life activity routine discovery in hemiparetic rehabilitation patients using topic models.
Seiter, J; Derungs, A; Schuster-Amft, C; Amft, O; Tröster, G
2015-01-01
Monitoring natural behavior and activity routines of hemiparetic rehabilitation patients across the day can provide valuable progress information for therapists and patients and contribute to an optimized rehabilitation process. In particular, continuous patient monitoring could add type, frequency and duration of daily life activity routines and hence complement standard clinical scores that are assessed for particular tasks only. Machine learning methods have been applied to infer activity routines from sensor data. However, supervised methods require activity annotations to build recognition models and thus require extensive patient supervision. Discovery methods, including topic models could provide patient routine information and deal with variability in activity and movement performance across patients. Topic models have been used to discover characteristic activity routine patterns of healthy individuals using activity primitives recognized from supervised sensor data. Yet, the applicability of topic models for hemiparetic rehabilitation patients and techniques to derive activity primitives without supervision needs to be addressed. We investigate, 1) whether a topic model-based activity routine discovery framework can infer activity routines of rehabilitation patients from wearable motion sensor data. 2) We compare the performance of our topic model-based activity routine discovery using rule-based and clustering-based activity vocabulary. We analyze the activity routine discovery in a dataset recorded with 11 hemiparetic rehabilitation patients during up to ten full recording days per individual in an ambulatory daycare rehabilitation center using wearable motion sensors attached to both wrists and the non-affected thigh. We introduce and compare rule-based and clustering-based activity vocabulary to process statistical and frequency acceleration features to activity words. 
Activity words were used for activity routine pattern discovery using topic models based on Latent Dirichlet Allocation. Discovered activity routine patterns were then mapped to six categorized activity routines. Using the rule-based approach, activity routines could be discovered with an average accuracy of 76% across all patients. The rule-based approach outperformed clustering by 10% and showed fewer confusions for predicted activity routines. Topic models are suitable for discovering daily life activity routines in hemiparetic rehabilitation patients without trained classifiers and activity annotations. Activity routines show characteristic patterns regarding activity primitives, including body and extremity postures and movement. A patient-independent rule set can be derived. Including expert knowledge supports successful activity routine discovery over completely data-driven clustering.
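As a rough illustration of the rule-based vocabulary step, the Python sketch below maps windows of sensor features to activity words and groups the words into bag-of-words documents of the kind a topic model such as Latent Dirichlet Allocation takes as input. The feature names, thresholds, and word labels are assumptions made for this example; the abstract does not specify the authors' rule set.

```python
from collections import Counter


def activity_word(features):
    """Map one window of features to an activity word.

    The feature names and thresholds below are hypothetical placeholders.
    """
    # Thigh angle from vertical: small angle is treated as upright, large as seated.
    posture = "upright" if features["thigh_pitch_deg"] < 45 else "seated"
    # Wrist movement energy is binned into three coarse levels.
    if features["wrist_energy"] < 0.1:
        motion = "still"
    elif features["wrist_energy"] < 1.0:
        motion = "light"
    else:
        motion = "active"
    return f"{posture}_{motion}"


def to_documents(windows, words_per_doc):
    """Group consecutive activity words into bag-of-words documents."""
    words = [activity_word(w) for w in windows]
    return [
        Counter(words[i:i + words_per_doc])
        for i in range(0, len(words), words_per_doc)
    ]


windows = [
    {"thigh_pitch_deg": 10, "wrist_energy": 2.0},
    {"thigh_pitch_deg": 80, "wrist_energy": 0.05},
    {"thigh_pitch_deg": 80, "wrist_energy": 0.05},
]
for doc in to_documents(windows, words_per_doc=2):
    print(dict(doc))
```

Each resulting document is a word-count vector over the activity vocabulary; fitting the topic model itself is left out of this sketch.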
ERIC Educational Resources Information Center
Hyman, Harvey
2012-01-01
This dissertation examines the impact of exploration and learning upon eDiscovery information retrieval; it is written in three parts. Part I contains foundational concepts and background on the topics of information retrieval and eDiscovery. This part informs the reader about the research frameworks, methodologies, data collection, and…
ERIC Educational Resources Information Center
Yang, Xi; Chen, Jin
2017-01-01
Botanical gardens (BGs) are important agencies that enhance human knowledge and attitude towards flora conservation. By following free-choice learning model, we developed a "Discovery map" and distributed the map to visitors at the Xishuangbanna Tropical Botanical Garden in Yunnan, China. Visitors, who did and did not receive discovery…
J. D. Solomon; L. Newsome; T. H. Filer
1984-01-01
A stem-boring weevil obtained from infested clusters of mistletoe was subsequently reared and identified as Myrmex sp. To our knowledge its discovery in Mississippi is the easternmost record of mistletoe-feeding Myrmex, previously recorded only from the West and Southwest. Based on current studies, the weevil overwinters as larvae in tunnels within mistletoe stems....
NASA Astrophysics Data System (ADS)
Sharkov, N. A.; Sharkova, O. A.
2018-05-01
The paper identifies the importance of Leonhard Euler's discoveries in the field of shipbuilding for the scientific evolution of academician A. N. Krylov and for modern knowledge of the survivability and safety of ships. Euler's works "Marine Science" and "The Moon Motion New Theory" are discussed.
Application of statistical mining in healthcare data management for allergic diseases
NASA Astrophysics Data System (ADS)
Wawrzyniak, Zbigniew M.; Martínez Santolaya, Sara
2014-11-01
The paper discusses data mining techniques based on statistical tools for medical data management in the case of long-term diseases. Data collected from a population survey are the source for reasoning about and identifying the disease processes responsible for a patient's illness and its symptoms, and for deriving the knowledge and decisions needed to choose a course of action that corrects the patient's condition. The case considered as a sample of a constructive approach to data management is the dependence of chronic allergic diseases on certain symptoms and environmental conditions. The knowledge, summarized systematically as accumulated experience, constitutes a simplified experiential model of the diseases with a feature space constructed from a small set of indicators. We present a disease-symptom-opinion model with knowledge discovery for data management in healthcare. The model is purely data-driven and evaluates knowledge of the disease processes and the probabilistic dependence of future disease events on symptoms and other attributes. An example drawn from the outcomes of a survey of long-term (chronic) disease shows that a small set of core indicators, such as 4 or more symptoms and opinions, could be very helpful in reflecting changes in health status across disease causes. Furthermore, the data-driven understanding of disease mechanisms gives physicians a basis for treatment choices, which underlines the need for data governance in this research domain of knowledge discovered from surveys.
Drewes, Stephan; Straková, Petra; Drexler, Jan F; Jacob, Jens; Ulrich, Rainer G
2017-01-01
Rodents are distributed throughout the world and interact with humans in many ways. They provide vital ecosystem services, some species are useful models in biomedical research and some are kept as pets. However, many rodent species can have adverse effects such as damage to crops and stored produce, and they are of health concern because of the transmission of pathogens to humans and livestock. The first rodent viruses were discovered by isolation approaches and resulted in break-through knowledge in immunology, molecular and cell biology, and cancer research. In addition to rodent-specific viruses, rodent-borne viruses cause a large number of zoonotic diseases. The most prominent examples are reemerging outbreaks of human hemorrhagic fever disease cases caused by arena- and hantaviruses. In addition, rodents are reservoirs for vector-borne pathogens, such as tick-borne encephalitis virus and Borrelia spp., and may carry human pathogenic agents, but are likely not involved in their transmission to humans. Today, next-generation sequencing or high-throughput sequencing (HTS) is revolutionizing the speed of the discovery of novel viruses, but other molecular approaches, such as generic RT-PCR/PCR and rolling circle amplification techniques, contribute significantly to this rapidly ongoing process. However, current knowledge still represents only the tip of the iceberg when comparing the known human viruses to those known for rodents, the mammalian taxon with the largest species number. The diagnostic potential of HTS-based metagenomic approaches is illustrated by their use in the discovery and complete genome determination of novel borna- and adenoviruses as causative disease agents in squirrels. In conclusion, HTS, in combination with conventional RT-PCR/PCR-based approaches, has drastically increased knowledge of the diversity of rodent viruses. 
Future improvements of the used workflows, including bioinformatics analysis, will further enhance our knowledge and preparedness in case of the emergence of novel viruses. Classical virological and additional molecular approaches are needed for genome annotation and functional characterization of novel viruses, discovered by these technologies, and evaluation of their zoonotic potential. © 2017 Elsevier Inc. All rights reserved.
Cryo-EM in drug discovery: achievements, limitations and prospects.
Renaud, Jean-Paul; Chari, Ashwin; Ciferri, Claudio; Liu, Wen-Ti; Rémigy, Hervé-William; Stark, Holger; Wiesmann, Christian
2018-06-08
Cryo-electron microscopy (cryo-EM) of non-crystalline single particles is a biophysical technique that can be used to determine the structure of biological macromolecules and assemblies. Historically, its potential for application in drug discovery has been heavily limited by two issues: the minimum size of the structures it can be used to study and the resolution of the images. However, recent technological advances - including the development of direct electron detectors and more effective computational image analysis techniques - are revolutionizing the utility of cryo-EM, leading to a burst of high-resolution structures of large macromolecular assemblies. These advances have raised hopes that single-particle cryo-EM might soon become an important tool for drug discovery, particularly if they could enable structural determination for 'intractable' targets that are still not accessible to X-ray crystallographic analysis. This article describes the recent advances in the field and critically assesses their relevance for drug discovery as well as discussing at what stages of the drug discovery pipeline cryo-EM can be useful today and what to expect in the near future.
Discovery of the leinamycin family of natural products by mining actinobacterial genomes
Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen
2017-01-01
Nature’s ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF–SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF–SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature’s rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature’s biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity. PMID:29229819
Discovery of the leinamycin family of natural products by mining actinobacterial genomes.
Pan, Guohui; Xu, Zhengren; Guo, Zhikai; Hindra; Ma, Ming; Yang, Dong; Zhou, Hao; Gansemans, Yannick; Zhu, Xiangcheng; Huang, Yong; Zhao, Li-Xing; Jiang, Yi; Cheng, Jinhua; Van Nieuwerburgh, Filip; Suh, Joo-Won; Duan, Yanwen; Shen, Ben
2017-12-26
Nature's ability to generate diverse natural products from simple building blocks has inspired combinatorial biosynthesis. The knowledge-based approach to combinatorial biosynthesis has allowed the production of designer analogs by rational metabolic pathway engineering. While successful, structural alterations are limited, with designer analogs often produced in compromised titers. The discovery-based approach to combinatorial biosynthesis complements the knowledge-based approach by exploring the vast combinatorial biosynthesis repertoire found in Nature. Here we showcase the discovery-based approach to combinatorial biosynthesis by targeting the domain of unknown function and cysteine lyase domain (DUF-SH) didomain, specific for sulfur incorporation from the leinamycin (LNM) biosynthetic machinery, to discover the LNM family of natural products. By mining bacterial genomes from public databases and the actinomycetes strain collection at The Scripps Research Institute, we discovered 49 potential producers that could be grouped into 18 distinct clades based on phylogenetic analysis of the DUF-SH didomains. Further analysis of the representative genomes from each of the clades identified 28 lnm-type gene clusters. Structural diversities encoded by the LNM-type biosynthetic machineries were predicted based on bioinformatics and confirmed by in vitro characterization of selected adenylation proteins and isolation and structural elucidation of the guangnanmycins and weishanmycins. These findings demonstrate the power of the discovery-based approach to combinatorial biosynthesis for natural product discovery and structural diversity and highlight Nature's rich biosynthetic repertoire. Comparative analysis of the LNM-type biosynthetic machineries provides outstanding opportunities to dissect Nature's biosynthetic strategies and apply these findings to combinatorial biosynthesis for natural product discovery and structural diversity.
Video mining using combinations of unsupervised and supervised learning techniques
NASA Astrophysics Data System (ADS)
Divakaran, Ajay; Miyahara, Koji; Peker, Kadir A.; Radhakrishnan, Regunathan; Xiong, Ziyou
2003-12-01
We discuss the meaning and significance of the video mining problem, and present our work on some aspects of video mining. A simple definition of video mining is unsupervised discovery of patterns in audio-visual content. Such purely unsupervised discovery is readily applicable to video surveillance as well as to consumer video browsing applications. We interpret video mining as content-adaptive or "blind" content processing, in which the first stage is content characterization and the second stage is event discovery based on the characterization obtained in the first stage. We discuss the target applications and find that a purely unsupervised approach is too computationally complex to be implemented on our product platform. We then describe various combinations of unsupervised and supervised learning techniques that help discover patterns that are useful to the end-user of the application. We target consumer video browsing applications such as commercial message detection and sports highlights extraction. We employ both audio and video features. We find that supervised audio classification combined with unsupervised unusual event discovery enables accurate supervised detection of desired events. Our techniques are computationally simple and robust to common variations in production styles.
Innovation in medicine: Ignaz the reviled and Egas the regaled.
Csoka, Antonei Benjamin
2016-06-01
In our current climate of rapid technological progress, it seems counterintuitive to think that modern science can learn anything of ethical value from the dark recesses of the nineteenth century or earlier. However, this happens to be quite true, with plenty of knowledge and wisdom to be gleaned by studying our scientific predecessors. Presently, our journals are flooded with original concepts and potential breakthroughs, a continuous stream of ideas pushing the frontiers of knowledge ever forward. Some ideas flourish while others flounder; but what sets the two apart? The distinguishing feature between success and failure within this context is the ability to discern the appropriate time to accept an innovation with open arms, versus when to take a more cautious approach. And the primary arbiters for whether an idea will catch on or not are the professional audience. I illustrate this concept by comparing the initial reception of two innovative ideas from Medicine's past: sterile technique, and prefrontal lobotomy. Sterile technique was first introduced by Dr. Ignaz Semmelweis and was initially ridiculed and rejected, with Semmelweis eventually dying in exile. Conversely, lobotomy was accepted and lauded and its inventor, Dr. Egas Moniz, won the Nobel Prize for his "discovery". This begs the question: why was a technique with the potential to save millions of lives initially rejected, whereas paradoxically, one that compromised and sometimes destroyed lives, accepted? Here I explore and analyze the potential reasons why, suggest how we can learn from these mistakes of the past and apply new insight to some current ethical dilemmas.
Modeling & Informatics at Vertex Pharmaceuticals Incorporated: our philosophy for sustained impact
NASA Astrophysics Data System (ADS)
McGaughey, Georgia; Patrick Walters, W.
2017-03-01
Molecular modelers and informaticians have the unique opportunity to integrate cross-functional data using a myriad of tools, methods and visuals to generate information. Using their drug discovery expertise, they transform this information into knowledge that impacts drug discovery. These insights are often formulated locally and then applied more broadly, influencing the discovery of new medicines. This is particularly true in an organization where the members are exposed to projects throughout the organization, as is the case for the global Modeling & Informatics group at Vertex Pharmaceuticals. From its inception, Vertex has been a leader in the development and use of computational methods for drug discovery. In this paper, we describe the Modeling & Informatics group at Vertex and the underlying philosophy, which has driven this team to sustain impact on the discovery of first-in-class transformative medicines.
FIR: An Effective Scheme for Extracting Useful Metadata from Social Media.
Chen, Long-Sheng; Lin, Zue-Cheng; Chang, Jing-Rong
2015-11-01
The use of social media for health information exchange has recently been expanding among patients, physicians, and other health care professionals. In medical areas, social media allows non-experts to access, interpret, and generate medical information for their own care and the care of others. Researchers have paid much attention to social media in medical education, patient-pharmacist communication, detection of adverse drug reactions, the impact of social media on medicine and healthcare, and so on. However, relatively few papers discuss how to effectively extract useful knowledge from the huge volume of textual comments in social media. Therefore, this study proposes a Fuzzy adaptive resonance theory network based Information Retrieval (FIR) scheme that combines a Fuzzy adaptive resonance theory (ART) network, Latent Semantic Indexing (LSI), and association rules (AR) discovery to extract knowledge from social media. In our FIR scheme, the Fuzzy ART network is first employed to segment comments. Next, for each customer segment, we use the LSI technique to retrieve important keywords. Then, in order to make the extracted keywords understandable, association rule mining is used to organize these keywords into metadata. These extracted voices of customers are then transformed into design needs using Quality Function Deployment (QFD) for further decision making. Unlike conventional information retrieval techniques, which acquire too many keywords to reveal the key points, our FIR scheme can extract understandable metadata from social media.
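The association rule step can be illustrated with a small Python sketch that derives pairwise rules from per-comment keyword sets using support and confidence. This is a generic, simplified illustration under assumed inputs, not the authors' FIR implementation; the function name, the thresholds, and the keywords are hypothetical.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Find rules lhs -> rhs between single keywords.

    transactions: list of keyword lists, one per comment (assumed input)
    Returns a list of (lhs, rhs, support, confidence) tuples.
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of comments containing every keyword in itemset.
        return sum(itemset <= s for s in sets) / n

    rules = []
    for a, b in combinations(items, 2):
        s_ab = support({a, b})
        if s_ab < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # Confidence: how often rhs appears given that lhs appears.
            conf = s_ab / support({lhs})
            if conf >= min_confidence:
                rules.append((lhs, rhs, s_ab, conf))
    return rules


# Hypothetical keywords extracted from four comments.
comments = [
    ["dose", "nausea"],
    ["dose", "nausea", "headache"],
    ["dose", "headache"],
    ["nausea", "sleep"],
]
for lhs, rhs, s, c in pair_rules(comments, min_support=0.5, min_confidence=0.7):
    print(f"{lhs} -> {rhs} (support={s:.2f}, confidence={c:.2f})")
```

With these toy comments, only the rule from "headache" to "dose" passes both thresholds, since every comment mentioning "headache" also mentions "dose".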
Emerging technology becomes an opportunity for EOS
NASA Astrophysics Data System (ADS)
Fargion, Giulietta S.; Harberts, Robert; Masek, Jeffrey G.
1996-11-01
During the last decade, we have seen an explosive growth in our ability to collect and generate data. When implemented, NASA's Earth observing system data information system (EOSDIS) will receive about 50 gigabytes of remotely sensed image data per hour. This will generate an urgent need for new techniques and tools that can automatically and intelligently assist in transforming this abundance of data into useful knowledge. Some emerging technologies that address these challenges include data mining and knowledge discovery in databases (KDD). The most basic data mining application is a content-based search (examples include finding images of particular meteorological phenomena or identifying data that have been previously mined or interpreted). In order that these technologies be effectively exploited for EOSDIS development, a better understanding of data mining and the requirements for using this technology is necessary. The authors are currently undertaking a project exploring the requirements and options of content-based search and data mining for use on EOSDIS. The scope of the project is to develop a prototype with which to investigate user interface concepts, requirements, and designs relevant for EOSDIS core system (ECS) subsystem utilizing these techniques. The goal is to identify a generic handling of these functions. This prototype will help identify opportunities which the earth science community and EOSDIS can use to meet the challenges of collecting, searching, retrieving, and interacting with abundant data resources in highly productive ways.
Castro-Santos, Patricia; Díaz-Peña, Roberto
2017-09-01
Most rheumatic diseases are complex or multifactorial entities with pathogeneses that interact with both multiple genetic factors and a high number of diverse environmental factors. Knowledge of the human genome sequence and its diversity among populations has provided a crucial step forward in our understanding of genetic diseases, identifying many genetic loci or genes associated with diverse phenotypes. In general, susceptibility to autoimmunity is associated with multiple risk factors, but the mechanism of the environmental component influence is poorly understood. Studies in twins have demonstrated that genetics do not explain the totality of the pathogenesis of rheumatic diseases. One method of modulating gene expression through environmental effects is via epigenetic modifications. These techniques open a new field for identifying useful new biomarkers and therapeutic targets. In this context, the development of "-omics" techniques is an opportunity to progress in our knowledge of complex diseases, impacting the discovery of new potential biomarkers suitable for their introduction into clinical practice. In this review, we focus on the recent advances in the fields of genomics and epigenomics in rheumatic diseases and their potential to be useful for the diagnosis, follow-up, and treatment of these diseases. The ultimate aim of genomic studies in any human disease is to understand its pathogenesis, thereby enabling the prediction of the evolution of the disease to establish new treatments and address the development of personalized therapies.
Zare Hosseini, Zeinab; Mohammadzadeh, Mahdi
2016-01-01
The rapid growth of information technology (IT) motivates and creates competitive advantages in the health care industry. Nowadays, many hospitals try to build successful customer relationship management (CRM) to recognize target and potential patients, increase patient loyalty and satisfaction, and finally maximize their profitability. Many hospitals have large data warehouses containing customer demographic and transaction information. Data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public sector hospital in Iran, on the premise that patient loyalty differs from customer loyalty, to estimate customer lifetime value (CLV) for each patient. We used Two-step and K-means algorithms as clustering methods and a Decision tree (CHAID) as the classification technique to segment the patients and identify target, potential and loyal customers in order to strengthen CRM. Two approaches are used for classification: first, the result of clustering is considered as the Decision attribute in the classification process and second, the result of segmentation based on the CLV value of patients (estimated by RFML) is considered as the Decision attribute. Finally, the results of the CHAID algorithm show the significant hidden rules and identify existing patterns of hospital consumers. PMID:27610177
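A minimal Python sketch of how RFML attributes might be computed per patient from visit records is given below. The abstract does not define the four parameters precisely, so the definitions used here are assumptions: Recency as days since the last visit, Frequency as the number of visits, Monetary as the total amount billed, and Length as the days between first and last visit. The record layout and all names are hypothetical.

```python
from datetime import date


def rfml(visits, today):
    """Compute assumed RFML attributes for each patient.

    visits: iterable of (patient_id, visit_date, amount) tuples (assumed layout)
    today:  reference date used to measure recency
    """
    by_patient = {}
    for patient_id, visit_date, amount in visits:
        by_patient.setdefault(patient_id, []).append((visit_date, amount))

    result = {}
    for patient_id, rows in by_patient.items():
        dates = [d for d, _ in rows]
        first, last = min(dates), max(dates)
        result[patient_id] = {
            "recency": (today - last).days,           # days since last visit
            "frequency": len(rows),                   # number of visits
            "monetary": sum(a for _, a in rows),      # total amount billed
            "length": (last - first).days,            # span of the relationship
        }
    return result


# Hypothetical visit records.
visits = [
    ("p1", date(2015, 1, 1), 100.0),
    ("p1", date(2015, 3, 2), 50.0),
    ("p2", date(2015, 2, 1), 30.0),
]
for patient_id, attrs in rfml(visits, today=date(2015, 4, 1)).items():
    print(patient_id, attrs)
```

The resulting attribute vectors could then be scaled and passed to a clustering method such as K-means; that step, and the CLV weighting, are outside this sketch.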
Historical and Current Perspective on Tobacco use and Nicotine Addiction
Dani, John A.; Balfour, David J.K.
2011-01-01
Although the addictive influence of tobacco was recognized very early, the modern concepts of nicotine addiction have relied on knowledge of cholinergic neurotransmission and nicotinic acetylcholine receptors (nAChRs). The discovery of the “receptive substance” by Langley, that would turn out to be nAChRs, and “Vagusstoff” (acetylcholine) by Loewi, coincided with an exciting time when the concept of chemical synaptic transmission was being formulated. More recently, the application of more powerful techniques and the study of animal models that replicate key features of nicotine dependence have led to important advancements in our understanding of molecular, cellular, and systems mechanisms of nicotine addiction. In this Review, we present a historical perspective and overview of the research that has led to our present understanding of nicotine addiction. PMID:21696833
PRO-Elicere: A Study for Create a New Process of Dependability Analysis of Space Computer Systems
NASA Astrophysics Data System (ADS)
da Silva, Glauco; Netto Lahoz, Carlos Henrique
2013-09-01
This paper presents a new approach to computer system dependability analysis, called PRO-ELICERE, which introduces data mining concepts and intelligent decision-support mechanisms to analyze the potential hazards and failures of a critical computer system. It also presents some techniques and tools that support traditional dependability analysis and briefly discusses the concept of knowledge discovery and intelligent databases for critical computer systems. It then introduces the PRO-ELICERE process, an intelligent approach to automating ELICERE, a process created to extract non-functional requirements for critical computer systems. PRO-ELICERE can be used in the V&V activities of projects at the Institute of Aeronautics and Space, such as the Brazilian Satellite Launcher (VLS-1).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tzeng, Nian-Feng; White, Christopher D.; Moreman, Douglas
2012-07-14
The UCoMS research cluster has spearheaded three research areas since August 2004, including wireless and sensor networks, Grid computing, and petroleum applications. The primary goals of UCoMS research are three-fold: (1) creating new knowledge to push forward the technology forefronts on pertinent research on the computing and monitoring aspects of energy resource management, (2) developing and disseminating software codes and toolkits for the research community and the public, and (3) establishing system prototypes and testbeds for evaluating innovative techniques and methods. Substantial progress and diverse accomplishments have been made by research investigators in their respective areas of expertise, working cooperatively on such topics as sensors and sensor networks, wireless communication and systems, and computational Grids, particularly relevant to petroleum applications.
Mass spectrometry for fragment screening.
Chan, Daniel Shiu-Hin; Whitehouse, Andrew J; Coyne, Anthony G; Abell, Chris
2017-11-08
Fragment-based approaches in chemical biology and drug discovery have been widely adopted worldwide in both academia and industry. Fragment hits tend to interact weakly with their targets, necessitating the use of sensitive biophysical techniques to detect their binding. Common fragment screening techniques include differential scanning fluorimetry (DSF) and ligand-observed NMR. Validation and characterization of hits is usually performed using a combination of protein-observed NMR, isothermal titration calorimetry (ITC) and X-ray crystallography. In this context, MS is a relatively underutilized technique in fragment screening for drug discovery. MS-based techniques have the advantage of high sensitivity, low sample consumption and being label-free. This review highlights recent examples of the emerging use of MS-based techniques in fragment screening. © 2017 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.
Automatic Beam Path Analysis of Laser Wakefield Particle Acceleration Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rubel, Oliver; Geddes, Cameron G.R.; Cormier-Michel, Estelle
2009-10-19
Numerical simulations of laser wakefield particle accelerators play a key role in the understanding of the complex acceleration process and in the design of expensive experimental facilities. As the size and complexity of simulation output grows, an increasingly acute challenge is the practical need for computational techniques that aid in scientific knowledge discovery. To that end, we present a set of data-understanding algorithms that work in concert in a pipeline fashion to automatically locate and analyze high energy particle bunches undergoing acceleration in very large simulation datasets. These techniques work cooperatively by first identifying features of interest in individual timesteps, then integrating features across timesteps, and based on the information derived perform analysis of temporally dynamic features. This combination of techniques supports accurate detection of particle beams enabling a deeper level of scientific understanding of physical phenomena than has been possible before. By combining efficient data analysis algorithms and state-of-the-art data management we enable high-performance analysis of extremely large particle datasets in 3D. We demonstrate the usefulness of our methods for a variety of 2D and 3D datasets and discuss the performance of our analysis pipeline.
Genomic impact of cigarette smoke, with application to three smoking-related diseases.
Talikka, M; Sierro, N; Ivanov, N V; Chaudhary, N; Peck, M J; Hoeng, J; Coggins, C R E; Peitsch, M C
2012-11-01
There is considerable evidence that inhaled toxicants such as cigarette smoke can cause both irreversible changes to the genetic material (DNA mutations) and putatively reversible changes to the epigenetic landscape (changes in the DNA methylation and chromatin modification state). The diseases that are believed to involve genetic and epigenetic perturbations include lung cancer, chronic obstructive pulmonary disease (COPD), and cardiovascular disease (CVD), all of which are strongly linked epidemiologically to cigarette smoking. In this review, we highlight the significance of genomics and epigenomics in these major smoking-related diseases. We also summarize the in vitro and in vivo findings on the specific perturbations that smoke and its constituent compounds can inflict upon the genome, particularly on the pulmonary system. Finally, we review state-of-the-art genomics and new techniques such as high-throughput sequencing and genome-wide chromatin assays, rapidly evolving techniques which have allowed epigenetic changes to be characterized at the genome level. These techniques have the potential to significantly improve our understanding of the specific mechanisms by which exposure to environmental chemicals causes disease. Such mechanistic knowledge provides a variety of opportunities for enhanced product safety assessment and the discovery of novel therapeutic interventions.
Application of the MIDAS approach for analysis of lysine acetylation sites.
Evans, Caroline A; Griffiths, John R; Unwin, Richard D; Whetton, Anthony D; Corfe, Bernard M
2013-01-01
Multiple Reaction Monitoring Initiated Detection and Sequencing (MIDAS™) is a mass spectrometry-based technique for the detection and characterization of specific post-translational modifications (Unwin et al. 4:1134-1144, 2005), for example acetylated lysine residues (Griffiths et al. 18:1423-1428, 2007). The MIDAS™ technique has application for discovery and analysis of acetylation sites. It is a hypothesis-driven approach that requires a priori knowledge of the primary sequence of the target protein and a proteolytic digest of this protein. MIDAS essentially performs a targeted search for the presence of modified, for example acetylated, peptides. The detection is based on the combination of the predicted molecular weight (measured as mass-charge ratio) of the acetylated proteolytic peptide and a diagnostic fragment (product ion of m/z 126.1), which is generated by specific fragmentation of acetylated peptides during collision induced dissociation performed in tandem mass spectrometry (MS) analysis. Sequence information is subsequently obtained which enables acetylation site assignment. The technique of MIDAS was later trademarked by ABSciex for targeted protein analysis where an MRM scan is combined with full MS/MS product ion scan to enable sequence confirmation.
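The targeted search described above pairs a predicted precursor m/z with the diagnostic product ion at m/z 126.1. The following is a minimal sketch of how such transition lists might be computed, not the authors' or the vendor's software. The residue masses and the acetyl mass shift are standard monoisotopic values supplied here as assumptions; the function names `precursor_mz` and `midas_transitions` are hypothetical, and only lysine acetylation is considered.

```python
# Monoisotopic residue masses in daltons (standard values, not taken from the abstract).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565          # mass of H2O added for the peptide termini
PROTON = 1.007276          # mass of one charge-carrying proton
ACETYL = 42.010565         # mass shift of one acetyl group (C2H2O)
DIAGNOSTIC_ION_MZ = 126.1  # diagnostic product ion quoted in the abstract


def precursor_mz(peptide, n_acetyl, charge):
    """Predicted m/z of a peptide carrying n_acetyl acetyl groups at a given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + n_acetyl * ACETYL
    return (neutral + charge * PROTON) / charge


def midas_transitions(peptide, charges=(2, 3)):
    """List (n_acetyl, charge, precursor m/z, product m/z) for each acetylated form.

    Assumption: every lysine (K) is a candidate site, so a peptide with
    n lysines yields forms carrying 1..n acetyl groups.
    """
    transitions = []
    for n_acetyl in range(1, peptide.count("K") + 1):
        for z in charges:
            mz = round(precursor_mz(peptide, n_acetyl, z), 4)
            transitions.append((n_acetyl, z, mz, DIAGNOSTIC_ION_MZ))
    return transitions


if __name__ == "__main__":
    for row in midas_transitions("GK", charges=(1, 2)):
        print(row)
```

Peptides without lysine produce an empty list, which reflects the hypothesis-driven nature of the approach: only forms predicted from the known sequence are monitored.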
ERIC Educational Resources Information Center
Lynton, Ernest A.
2016-01-01
New knowledge is created in the course of the application of outreach. Each complex problem in the real world is likely to have unique aspects and thus it requires some modification of standard approaches. Hence, each engagement in outreach is likely to have an element of inquiry and discovery, leading to new knowledge. The flow of knowledge is in…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-08
... Collection; Comment Request; Papahanaumokuakea Marine National Monument Mokupapapa Discovery Center Exhibit... collection. Mokupapapa Discovery Center (Center) is an outreach arm of Papahanaumokuakea Marine National... of automated collection techniques or other forms of information technology. Comments submitted in...
Concepts of formal concept analysis
NASA Astrophysics Data System (ADS)
Žáček, Martin; Homola, Dan; Miarka, Rostislav
2017-07-01
The aim of this article is to apply Formal Concept Analysis to the concept of world. Formal concept analysis (FCA), as a methodology of data analysis, information management and knowledge representation, has the potential to be applied to a variety of linguistic problems. FCA is a mathematical theory of concepts and concept hierarchies that reflects an understanding of concept. Formal concept analysis explicitly formalizes the extension and intension of a concept and their mutual relationships. A distinguishing feature of FCA is an inherent integration of three components of conceptual processing of data and knowledge, namely, the discovery and reasoning with concepts in data, discovery and reasoning with dependencies in data, and visualization of data, concepts, and dependencies with folding/unfolding capabilities.
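To make the notions of extension and intension concrete, here is a small sketch that enumerates the formal concepts of a toy context by brute force. The context (three animals and three attributes) and all function names are hypothetical illustrations, not taken from the article, and the exhaustive enumeration is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical formal context: each object maps to the attributes it has.
CONTEXT = {
    "duck": {"flies", "swims", "lays_eggs"},
    "eagle": {"flies", "lays_eggs"},
    "trout": {"swims", "lays_eggs"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def intent(objects):
    """Attributes shared by every object in the set (all attributes if the set is empty)."""
    shared = set(ALL_ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return frozenset(shared)


def extent(attributes):
    """Objects that have every attribute in the set."""
    return frozenset(o for o, attrs in CONTEXT.items() if attributes <= attrs)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    names = sorted(CONTEXT)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            shared = intent(subset)
            concepts.add((extent(shared), shared))
    return concepts


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

Each pair is closed in both directions: the extent is exactly the set of objects sharing the intent, and the intent is exactly the set of attributes shared by the extent. Ordering the pairs by extent inclusion gives the concept hierarchy.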
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hillis, D.R.
A computer-based simulation with an artificial intelligence component and discovery learning was investigated as a method to formulate training needs for new or unfamiliar technologies. Specifically, the study examined whether this simulation method would provide for the recognition of applications and knowledge/skills which would be the basis for establishing training needs. The study also examined the effect of field-dependence/independence on recognition of applications and knowledge/skills. A pretest-posttest control group experimental design involving fifty-eight college students from an industrial technology program was used. The study concluded that the simulation was effective in developing recognition of applications and the knowledge/skills for a new or unfamiliar technology. Moreover, the simulation's effectiveness in providing this recognition was not limited by an individual's field-dependence/independence.
Semi-automated knowledge discovery: identifying and profiling human trafficking
NASA Astrophysics Data System (ADS)
Poelmans, Jonas; Elzinga, Paul; Ignatov, Dmitry I.; Kuznetsov, Sergei O.
2012-11-01
We propose an iterative and human-centred knowledge discovery methodology based on formal concept analysis. The proposed approach recognizes the important role of the domain expert in mining real-world enterprise applications and makes use of specific domain knowledge, including human intelligence and domain-specific constraints. Our approach was empirically validated at the Amsterdam-Amstelland police to identify suspects and victims of human trafficking in 266,157 suspicious activity reports. Based on guidelines of the Attorney Generals of the Netherlands, we first defined multiple early warning indicators that were used to index the police reports. Using concept lattices, we revealed numerous unknown human trafficking and loverboy suspects. In-depth investigation by the police confirmed their involvement in illegal activities and led to actual arrests. Our human-centred approach was embedded into operational policing practice and is now successfully used on a daily basis to cope with the vastly growing amount of unstructured information.
Database systems for knowledge-based discovery.
Jagarlapudi, Sarma A R P; Kishan, K V Radha
2009-01-01
Several database systems have been developed to provide valuable information in a structured format to users ranging from the bench chemist to the biologist and from the medical practitioner to the pharmaceutical scientist. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.
Rossi, Lorenzo; Gippoliti, Spartaco; Angelici, Francesco Maria
2018-06-04
Although empirical data are necessary to describe new species, their discoveries can be guided from the survey of the so-called circumstantial evidence (that indirectly determines the existence or nonexistence of a fact). Yet this type of evidence, generally linked to traditional ecological knowledge (TEK), is often disputed by field biologists due to its uncertain nature and, on account of that, generally untapped by them. To verify this behavior and the utility of circumstantial evidence, we reviewed the existing literature about the species of apes and monkeys described or rediscovered since January 1, 1980 and submitted a poll to the authors. The results show that circumstantial evidence has proved to be useful in 40.5% of the examined cases and point to the possibility that its use could speed up the process at the heart of the discovery and description of new species, an essential step for conservation purposes.
Dewey: How to Make It Work for You
ERIC Educational Resources Information Center
Panzer, Michael
2013-01-01
As knowledge brokers, librarians are living in interesting times for themselves and libraries. It causes them to wonder sometimes if the traditional tools like the Dewey Decimal Classification (DDC) system can cope with the onslaught of information. The categories provided do not always seem adequate for the knowledge-discovery habits of…
78 FR 29071 - Assessment of Mediation and Arbitration Procedures
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-17
... proceeding. Program participants in the new arbitration program will have prior knowledge of the issues to be... final rules, all parties opting into the arbitration program will have full prior knowledge that these... including discovery, the submission of evidence, and the treatment of confidential information, and the...
21st century environmental problems are wicked and require holistic systems thinking and solutions that integrate social and economic knowledge with knowledge of the environment. Computer-based technologies are fundamental to our ability to research and understand the relevant sy...
Teaching Practice: A Perspective on Inter-Text and Prior Knowledge
ERIC Educational Resources Information Center
Costley, Kevin C.; West, Howard G.
2012-01-01
The use of teaching practices that involve intertextual relationship discovery in today's elementary classrooms is increasingly essential to the success of young learners of reading. Teachers must constantly strive to expand their perspective of how to incorporate the dialogue included in prior knowledge assessment. Teachers must also consider how…
Globalization of Knowledge Discovery and Information Retrieval in Teaching and Learning
ERIC Educational Resources Information Center
Zaidel, Mark; Guerrero, Osiris
2008-01-01
Developments in communication and information technologies in the last decade have had a significant impact on instructional and learning activities. For many students and educators, the Internet became the significant medium for sharing instruction, learning and communication. Access to knowledge beyond boundaries and cultures has an impact on…
Vocational Education Institutions' Role in National Innovation
ERIC Educational Resources Information Center
Moodie, Gavin
2006-01-01
This article distinguishes research--the discovery of new knowledge--from innovation, which is understood to be the transformation of practice in a community or the incorporation of existing knowledge into economic activity. From a survey of roles served by vocational education institutions in a number of OECD countries the paper argues that…
Dai, Jun; Wang, Chunlei; Traeger, Sarah C; Discenza, Lorell; Obermeier, Mary T; Tymiak, Adrienne A; Zhang, Yingru
2017-03-03
Atropisomers are stereoisomers resulting from hindered bond rotation. From synthesis of pure atropisomers, characterization of their interconversion thermodynamics to investigation of biological stereoselectivity, the evaluation of drug candidates subject to atropisomerism creates special challenges and can be complicated in both early drug discovery and later drug development. In this paper, we demonstrate an array of analytical techniques and systematic approaches to study the atropisomerism of drug molecules to meet these challenges. Using a case study of Bruton's tyrosine kinase (BTK) inhibitor drug candidates at Bristol-Myers Squibb, we present the analytical strategies and methodologies used during drug discovery including the detection of atropisomers, the determination of their relative composition, the identification of relative chirality, the isolation of individual atropisomers, the evaluation of interconversion kinetics, and the characterization of chiral stability in the solid state and in solution. In vivo and in vitro stereo-stability and stereo-selectivity were investigated as well as the pharmacological significance of any changes in atropisomer ratios. Techniques applied in these studies include analytical and preparative enantioselective supercritical fluid chromatography (SFC), enantioselective high performance liquid chromatography (HPLC), circular dichroism (CD), and mass spectrometry (MS). Our experience illustrates how atropisomerism can be a very complicated issue in drug discovery and why a thorough understanding of this phenomenon is necessary to provide guidance for pharmaceutical development. Analytical techniques and methodologies facilitate key decisions during the discovery of atropisomeric drug candidates by characterizing time-dependent physicochemical properties that can have significant biological implications and relevance to pharmaceutical development plans. Copyright © 2017 Elsevier B.V. All rights reserved.
Frontline: the liberal arts of psychoanalysis.
Fradenburg, Aranye
2011-01-01
In terms of process, psychoanalysis is more closely related to the disciplines of the arts and humanities than those of the sciences, however much the latter have contributed to our knowledge of the mind and our discussions of technique. Will we, accordingly, assert our support for liberal arts education, at a time when it is under unprecedented attack? Neuroscience has made remarkable strides in establishing the importance of artistic and humanist training to the plasticity and connectedness of mental functioning. But these discoveries have sadly done nothing to protect the academic disciplines of the arts and humanities from budget cuts and closings. It is as if contemporary boosters of technical and scientific education had no interest in, or knew nothing about, the new knowledge of the brain that scientists are actually producing. Will psychiatrists and psychoanalysts, for the sake of the arts and the sciences, support liberal arts education, or will we distance ourselves from it, and thus abandon the well-being of the very minds we will later be trying to tend in our offices? Is it not our responsibility to speak for the importance of thriving, since surviving depends on it?
The dendritic spine story: an intriguing process of discovery
DeFelipe, Javier
2015-01-01
Dendritic spines are key components of a variety of microcircuits and they represent the majority of postsynaptic targets of glutamatergic axon terminals in the brain. The present article will focus on the discovery of dendritic spines, which was possible thanks to the application of the Golgi technique to the study of the nervous system, and will also explore the early interpretation of these elements. This discovery represents an interesting chapter in the history of neuroscience as it shows us that progress in the study of the structure of the nervous system is based not only on the emergence of new techniques but also on our ability to exploit the methods already available and correctly interpret their microscopic images. PMID:25798090
Recombinant organisms for production of industrial products
Adrio, Jose-Luis
2010-01-01
A revolution in industrial microbiology was sparked by the discoveries of the double-stranded structure of DNA and the development of recombinant DNA technology. Traditional industrial microbiology was merged with molecular biology to yield improved recombinant processes for the industrial production of primary and secondary metabolites, protein biopharmaceuticals and industrial enzymes. Novel genetic techniques such as metabolic engineering, combinatorial biosynthesis and molecular breeding techniques and their modifications are contributing greatly to the development of improved industrial processes. In addition, functional genomics, proteomics and metabolomics are being exploited for the discovery of novel valuable small molecules for medicine as well as enzymes for catalysis. The sequencing of industrial microbial genomes is being carried out, which bodes well for future process improvement and discovery of new industrial products. PMID:21326937
Optogenetic Approaches to Drug Discovery in Neuroscience and Beyond.
Zhang, Hongkang; Cohen, Adam E
2017-07-01
Recent advances in optogenetics have opened new routes to drug discovery, particularly in neuroscience. Physiological cellular assays probe functional phenotypes that connect genomic data to patient health. Optogenetic tools, in particular tools for all-optical electrophysiology, now provide a means to probe cellular disease models with unprecedented throughput and information content. These techniques promise to identify functional phenotypes associated with disease states and to identify compounds that improve cellular function regardless of whether the compound acts directly on a target or through a bypass mechanism. This review discusses opportunities and unresolved challenges in applying optogenetic techniques throughout the discovery pipeline - from target identification and validation, to target-based and phenotypic screens, to clinical trials. Copyright © 2017 Elsevier Ltd. All rights reserved.
Perspectives on NMR in drug discovery: a technique comes of age
Pellecchia, Maurizio; Bertini, Ivano; Cowburn, David; Dalvit, Claudio; Giralt, Ernest; Jahnke, Wolfgang; James, Thomas L.; Homans, Steve W.; Kessler, Horst; Luchinat, Claudio; Meyer, Bernd; Oschkinat, Hartmut; Peng, Jeff; Schwalbe, Harald; Siegal, Gregg
2009-01-01
In the past decade, the potential of harnessing the ability of nuclear magnetic resonance (NMR) spectroscopy to monitor intermolecular interactions as a tool for drug discovery has been increasingly appreciated in academia and industry. In this Perspective, we highlight some of the major applications of NMR in drug discovery, focusing on hit and lead generation, and provide a critical analysis of its current and potential utility. PMID:19172689
Comparative study on drug safety surveillance between medical students of Malaysia and Nigeria
Abubakar, Abdullahi Rabiu; Ismail, Salwani; Rahman, Nor Iza A; Haque, Mainul
2015-01-01
Background: Internationally, there is a remarkable achievement in the areas of drug discovery, drug design, and clinical trials. New and efficient drug formulation techniques are widely available which have led to success in treatment of several diseases. Despite these achievements, a large number of patients continue to experience adverse drug reactions (ADRs), and the majority of them are yet to be on record. Objectives: The purpose of this survey is to compare knowledge, attitude, and practice with respect to ADRs and pharmacovigilance (PV) between medical students of Malaysia and Nigeria and to determine if there is a relationship between their knowledge and practice. Method: A cross-sectional, questionnaire-based survey involving year IV and year V medical students of the Department of Medicine, Universiti Sultan Zainal Abidin and Bayero University Kano was carried out. The questionnaire, which comprised 25 questions on knowledge, attitude, and practice, was adopted, modified, validated, and administered to them. The response was analyzed using SPSS version 20. Results: The response rate from each country was 74%. There was a statistically significant difference in mean knowledge and practice score on ADRs and PV between medical students of Malaysia and Nigeria, both at P<0.000. No significant difference in attitude was observed at P=0.389. Also, a statistically significant relationship was recorded between their knowledge and practice (r=0.229, P=0.001), although the relationship was weak. Conclusion: Nigerian medical students have better knowledge and practice than those of Malaysia, although they need improvement. Imparting knowledge of ADRs and PV among medical students will upgrade their practice and enhance health care delivery services in the future. PMID:26170680
NASA Astrophysics Data System (ADS)
McGranaghan, R. M.; Mannucci, A. J.; Verkhoglyadova, O. P.; Malik, N.
2017-12-01
How do we evolve beyond current traditional methods in order to innovate into the future? In what disruptive innovations will the next frontier of space physics and aeronomy (SPA) be grounded? We believe the answer to these compelling, yet equally challenging, questions lies in a shift of focus: from a narrow, field-specific view to a radically inclusive, interdisciplinary new modus operandi at the intersection of SPA and the information and data sciences. Concretely addressing these broader themes, we present results from a novel technique for knowledge discovery in the magnetosphere-ionosphere-thermosphere (MIT) system: complex network analysis (NA). We share findings from the first NA of ionospheric total electron content (TEC) data, including hemispheric and interplanetary magnetic field clock angle dependencies [1]. Our work shows that NA complements more traditional approaches for the investigation of TEC structure and dynamics, by both reaffirming well-established understanding, giving credence to the method, and identifying new connections, illustrating the exciting potential. We contextualize these new results through a discussion of the potential of data-driven discovery in the MIT system when innovative data science techniques are embraced. We address implications and potentially disruptive data analysis approaches for SPA in terms of: 1) the future of the geospace observational system; 2) understanding multi-scale phenomena; and 3) machine learning. [1] McGranaghan, R. M., A. J. Mannucci, O. Verkhoglyadova, and N. Malik (2017), Finding multiscale connectivity in our geospace observational system: Network analysis of total electron content, J. Geophys. Res. Space Physics, 122, doi:10.1002/2017JA024202.
Discovering the structure of nerve tissue: Part 3: From Jan Evangelista Purkyně to Ludwig Mauthner.
Chvátal, Alexandr
2017-01-01
The previous works of Purkyně, Valentin, and Remak showed that the central and peripheral nervous systems contained not only nerve fibers but also cellular elements. The use of microscopes and new fixation techniques enabled them to accurately obtain data on the structure of nerve tissue and consequently in many European universities microscopes started to become widely used in histological and morphological studies. The present review summarizes important discoveries concerning the structure of neural tissue, mostly from vertebrates, during the period from 1838 to 1865. This review describes the discoveries of famous as well as less well-known scholars of the time, who contributed significantly to current understandings about the structure of neural tissue. The period is characterized by the first descriptions of different types of nerve cells and the first attempts of a cytoarchitectonic description of the spinal cord and brain. During the same time, the concept of a neuroglial tissue was introduced, first as a tissue for "gluing" nerve fibers, cells, and blood capillaries into one unit, but later some glial cells were described for the first time. Questions arose as to whether or not cells in ganglia and the central nervous system had the same morphological and functional properties, and whether nerve fibers and cell bodies were interconnected. Microscopic techniques started to be used for the examination of physiological as well as pathological nerve tissues. The overall state of knowledge was just a step away from the emergence of the concept of neurons and glial cells.
Automated Knowledge Discovery From Simulators
NASA Technical Reports Server (NTRS)
Burl, Michael; DeCoste, Dennis; Mazzoni, Dominic; Scharenbroich, Lucas; Enke, Brian; Merline, William
2007-01-01
A computational method, SimLearn, has been devised to facilitate efficient knowledge discovery from simulators. Simulators are complex computer programs used in science and engineering to model diverse phenomena such as fluid flow, gravitational interactions, coupled mechanical systems, and nuclear, chemical, and biological processes. SimLearn uses active-learning techniques to efficiently address the "landscape characterization problem." In particular, SimLearn tries to determine which regions in "input space" lead to a given output from the simulator, where "input space" refers to an abstraction of all the variables going into the simulator, e.g., initial conditions, parameters, and interaction equations. Landscape characterization can be viewed as an attempt to invert the forward mapping of the simulator and recover the inputs that produce a particular output. Given that a single simulation run can take days or weeks to complete even on a large computing cluster, SimLearn attempts to reduce costs by reducing the number of simulations needed to effect discoveries. Unlike conventional data-mining methods that are applied to static predefined datasets, SimLearn involves an iterative process in which a most informative dataset is constructed dynamically by using the simulator as an oracle. On each iteration, the algorithm models the knowledge it has gained through previous simulation trials and then chooses which simulation trials to run next. Running these trials through the simulator produces new data in the form of input-output pairs. The overall process is embodied in an algorithm that combines support vector machines (SVMs) with active learning. SVMs use learning from examples (the examples are the input-output pairs generated by running the simulator) and a principle called maximum margin to derive predictors that generalize well to new inputs. 
In SimLearn, the SVM plays the role of modeling the knowledge that has been gained through previous simulation trials. Active learning is used to determine which new input points would be most informative if their output were known. The selected input points are run through the simulator to generate new information that can be used to refine the SVM. The process is then repeated. SimLearn carefully balances exploration (semi-randomly searching around the input space) versus exploitation (using the current state of knowledge to conduct a tightly focused search). During each iteration, SimLearn uses not one, but an ensemble of SVMs. Each SVM in the ensemble is characterized by different hyper-parameters that control various aspects of the learned predictor - for example, whether the predictor is constrained to be very smooth (nearby points in input space lead to similar output predictions) or whether the predictor is allowed to be "bumpy." The various SVMs will have different preferences about which input points they would like to run through the simulator next. SimLearn includes a formal mechanism for balancing the ensemble SVM preferences so that a single choice can be made for the next set of trials.
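The query loop described above can be sketched in a few lines. This is a deliberately reduced illustration, not SimLearn itself: the input space is one-dimensional, the "simulator" is a hypothetical hidden threshold, and the SVM ensemble is replaced by a single max-margin boundary, which for separable 1-D data is simply the midpoint between the closest examples of opposite class. The `explore_prob` parameter is an assumed stand-in for the exploration versus exploitation balance.

```python
import random


def simulator(x):
    """Hypothetical stand-in for an expensive simulation run: a hidden threshold."""
    return x > 0.6180


def max_margin_boundary(labeled):
    """Max-margin boundary for separable 1-D data: the midpoint between
    the largest negative example and the smallest positive example."""
    largest_negative = max(x for x, y in labeled if not y)
    smallest_positive = min(x for x, y in labeled if y)
    return (largest_negative + smallest_positive) / 2.0


def active_learn(oracle, candidates, n_queries, explore_prob=0.2, seed=0):
    """Choose which inputs to run through the oracle, one query at a time."""
    rng = random.Random(seed)
    lo, hi = min(candidates), max(candidates)
    # Assumption: the two end points carry opposite labels, so a boundary exists.
    labeled = [(lo, oracle(lo)), (hi, oracle(hi))]
    pool = [c for c in candidates if c not in (lo, hi)]
    for _ in range(n_queries):
        if not pool:
            break
        boundary = max_margin_boundary(labeled)
        if rng.random() < explore_prob:
            x = rng.choice(pool)  # exploration: semi-random probe of the input space
        else:
            x = min(pool, key=lambda c: abs(c - boundary))  # exploitation: most uncertain point
        pool.remove(x)
        labeled.append((x, oracle(x)))  # one simulator run yields one input-output pair
    return max_margin_boundary(labeled), len(labeled)


if __name__ == "__main__":
    grid = [i / 1000 for i in range(1001)]
    boundary, runs = active_learn(simulator, grid, n_queries=40)
    print(f"estimated boundary {boundary:.4f} after {runs} simulator runs")
```

Labeling the whole grid would take 1001 simulator runs; the loop above uses 42, which is the kind of saving the abstract attributes to treating the simulator as an oracle.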
Phenotypic mutant library: potential for gene discovery
USDA-ARS's Scientific Manuscript database
The rapid development of high throughput and affordable Next-Generation Sequencing (NGS) techniques has renewed interest in gene discovery using forward genetics. The conventional forward genetic approach starts with isolation of mutants with a phenotype of interest, mapping the mutation within a s...
Urbanowicz, Ryan J.; Granizo-Mackenzie, Ambrose; Moore, Jason H.
2014-01-01
Michigan-style learning classifier systems (M-LCSs) represent an adaptive and powerful class of evolutionary algorithms which distribute the learned solution over a sizable population of rules. However, their application to complex real world data mining problems, such as genetic association studies, has been limited. Traditional knowledge discovery strategies for M-LCS rule populations involve sorting and manual rule inspection. While this approach may be sufficient for simpler problems, the confounding influence of noise and the need to discriminate between predictive and non-predictive attributes call for additional strategies. Additionally, tests of significance must be adapted to M-LCS analyses in order to make them a viable option within fields that require such analyses to assess confidence. In this work we introduce an M-LCS analysis pipeline that combines uniquely applied visualizations with objective statistical evaluation for the identification of predictive attributes, and reliable rule generalizations in noisy single-step data mining problems. This work considers an alternative paradigm for knowledge discovery in M-LCSs, shifting the focus from individual rules to a global, population-wide perspective. We demonstrate the efficacy of this pipeline applied to the identification of epistasis (i.e., attribute interaction) and heterogeneity in noisy simulated genetic association data. PMID:25431544
Behavior change interventions: the potential of ontologies for advancing science and practice.
Larsen, Kai R; Michie, Susan; Hekler, Eric B; Gibson, Bryan; Spruijt-Metz, Donna; Ahern, David; Cole-Lewis, Heather; Ellis, Rebecca J Bartlett; Hesse, Bradford; Moser, Richard P; Yi, Jean
2017-02-01
A central goal of behavioral medicine is the creation of evidence-based interventions for promoting behavior change. Scientific knowledge about behavior change could be more effectively accumulated using "ontologies." In information science, an ontology is a systematic method for articulating a "controlled vocabulary" of agreed-upon terms and their inter-relationships. It involves three core elements: (1) a controlled vocabulary specifying and defining existing classes; (2) specification of the inter-relationships between classes; and (3) codification in a computer-readable format to enable knowledge generation, organization, reuse, integration, and analysis. This paper introduces ontologies, provides a review of current efforts to create ontologies related to behavior change interventions and suggests future work. This paper was written by behavioral medicine and information science experts and was developed in partnership between the Society of Behavioral Medicine's Technology Special Interest Group (SIG) and the Theories and Techniques of Behavior Change Interventions SIG. In recent years significant progress has been made in the foundational work needed to develop ontologies of behavior change. Ontologies of behavior change could facilitate a transformation of behavioral science from a field in which data from different experiments are siloed into one in which data across experiments could be compared and/or integrated. This could facilitate new approaches to hypothesis generation and knowledge discovery in behavioral science.
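The three core elements listed above map naturally onto a small data structure. The sketch below is purely illustrative: the class names, definitions, and relations are hypothetical placeholders rather than terms from any published behavior change ontology, and JSON stands in for a dedicated ontology format.

```python
import json

# Hypothetical miniature ontology with the three core elements.
ontology = {
    # (1) controlled vocabulary: agreed-upon classes with definitions
    "classes": {
        "intervention_component": "Any separable part of an intervention.",
        "behavior_change_technique": "A component intended to alter behavior.",
        "goal_setting": "Agreeing on a target behavior to be achieved.",
        "self_monitoring": "Recording one's own behavior over time.",
    },
    # (2) inter-relationships between classes, as [subject, predicate, object]
    "relations": [
        ["behavior_change_technique", "is_a", "intervention_component"],
        ["goal_setting", "is_a", "behavior_change_technique"],
        ["self_monitoring", "is_a", "behavior_change_technique"],
    ],
}


def ancestors(onto, term):
    """Follow is_a links upward to collect every superclass of a term."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in onto["relations"]:
            if subject == current and predicate == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


if __name__ == "__main__":
    # (3) codification in a computer-readable format
    serialized = json.dumps(ontology, indent=2)
    print(serialized)
    print(sorted(ancestors(json.loads(serialized), "goal_setting")))
```

Once findings from different experiments are tagged with such shared classes, a query over the superclass can pool results that were reported under different specific techniques.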
Constructing a Graph Database for Semantic Literature-Based Discovery.
Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C
2015-01-01
Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
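The network view of LBD described in this abstract can be sketched in a few lines of Python. The sketch below is not the authors' Cypher implementation; it is an assumed in-memory version of the basic "open discovery" pattern, in which a start concept A is linked to candidate concepts C through intermediate concepts B, and C is kept only if no direct A-C relation is already known. The triples are toy data invented for illustration and are not drawn from SemMedDB.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples; illustrative only, not SemMedDB.
TRIPLES = [
    ("fish oil", "AFFECTS", "blood viscosity"),
    ("fish oil", "AFFECTS", "platelet aggregation"),
    ("blood viscosity", "ASSOCIATED_WITH", "Raynaud disease"),
    ("platelet aggregation", "ASSOCIATED_WITH", "Raynaud disease"),
    ("platelet aggregation", "ASSOCIATED_WITH", "thrombosis"),
    ("fish oil", "TREATS", "thrombosis"),
]


def build_graph(triples):
    """Build a directed adjacency map: concept -> set of related concepts."""
    graph = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
    return graph


def open_discovery(graph, a):
    """Return {C: sorted linking Bs} for C reachable via some B but not directly from A."""
    direct = graph.get(a, set())
    candidates = defaultdict(set)
    for b in direct:
        for c in graph.get(b, set()):
            # Skip the start concept and anything already directly related to it.
            if c != a and c not in direct:
                candidates[c].add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}
```

With the toy data, `open_discovery(build_graph(TRIPLES), "fish oil")` proposes "Raynaud disease" via both intermediate concepts and omits "thrombosis" because that relation is already direct. In a graph database the same two-hop traversal would be expressed as a path pattern rather than nested loops.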
Particle swarm optimization with recombination and dynamic linkage discovery.
Chen, Ying-Ping; Peng, Wen-Chih; Jian, Ming-Chung
2007-12-01
In this paper, we try to improve the performance of the particle swarm optimizer by incorporating the linkage concept, which is an essential mechanism in genetic algorithms, and design a new linkage identification technique called dynamic linkage discovery to address the linkage problem in real-parameter optimization problems. Dynamic linkage discovery is a costless and effective linkage recognition technique that adapts the linkage configuration by employing only the selection operator without extra judging criteria irrelevant to the objective function. Moreover, a recombination operator that utilizes the discovered linkage configuration to promote the cooperation of particle swarm optimizer and dynamic linkage discovery is accordingly developed. By integrating the particle swarm optimizer, dynamic linkage discovery, and recombination operator, we propose a new hybridization of optimization methodologies called particle swarm optimization with recombination and dynamic linkage discovery (PSO-RDL). In order to study the capability of PSO-RDL, numerical experiments were conducted on a set of benchmark functions as well as on an important real-world application. The benchmark functions used in this paper were proposed in the 2005 Institute of Electrical and Electronics Engineers Congress on Evolutionary Computation. The experimental results on the benchmark functions indicate that PSO-RDL can provide a level of performance comparable to that given by other advanced optimization techniques. In addition to the benchmark, PSO-RDL was also used to solve the economic dispatch (ED) problem for power systems, which is a real-world problem and highly constrained. The results indicate that PSO-RDL can successfully solve the ED problem for the three-unit power system and obtain the currently known best solution for the 40-unit system.
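For readers unfamiliar with the baseline being extended, the following is a minimal sketch of a plain global-best particle swarm optimizer in Python. It does not implement dynamic linkage discovery or the recombination operator of PSO-RDL; the parameter values (inertia weight and acceleration coefficients), the bound clamping, and the sphere test function are assumptions chosen for illustration, not settings taken from the paper.

```python
import random


def sphere(x):
    """Simple test objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim, bounds, n_particles=20, iterations=200,
        w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Plain global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm-wide best
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

A call such as `pso(sphere, dim=2, bounds=(-5.0, 5.0))` returns a position close to the origin. A linkage-aware variant would additionally group the dimensions and recombine particles along those groups, which this sketch leaves out.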
Park, D L; Stoloff, L
1989-04-01
The control by the Food and Drug Administration (FDA) of aflatoxin, a relatively recently discovered, unavoidable natural contaminant produced by specific molds that invade a number of basic food and feedstuffs, provides an example of the varying forces that affect risk assessment and management by a regulatory Agency. This is the story of how the FDA responded to the initial discovery of a potential carcinogenic hazard to humans in a domestic commodity, to the developing information concerning the nature of the hazard, to the economic and political pressures that are created by the impact of natural forces on regulatory controls, and to the restraints of laws within which the Agency must work. This story covers four periods: the years of discovery and action decisions on the basis of meager knowledge and the fear of cancer; the years of tinkering on paper with the regulatory process, the years of digestion of the accumulating knowledge, and the application of that knowledge to actions forced by natural events; and an audit of the current status of knowledge about the hazard from aflatoxin, and proposals for regulatory control based on that knowledge.
Postgenomic strategies in antibacterial drug discovery.
Brötz-Oesterhelt, Heike; Sass, Peter
2010-10-01
During the last decade the field of antibacterial drug discovery has changed in many aspects including bacterial organisms of primary interest, discovery strategies applied and pharmaceutical companies involved. Target-based high-throughput screening had been disappointingly unsuccessful for antibiotic research. Understanding of this lack of success has increased substantially and the lessons learned refer to characteristics of targets, screening libraries and screening strategies. The 'genomics' approach was replaced by a diverse array of discovery strategies, for example, searching for new natural product leads among previously abandoned compounds or new microbial sources, screening for synthetic inhibitors by targeted approaches including structure-based design and analyses of focused libraries and designing resistance-breaking properties into antibiotics of established classes. Furthermore, alternative treatment options are being pursued including anti-virulence strategies and immunotherapeutic approaches. This article summarizes the lessons learned from the genomics era and describes discovery strategies resulting from that knowledge.
Priority of discovery in the life sciences
Vale, Ronald D; Hyman, Anthony A
2016-01-01
The job of a scientist is to make a discovery and then communicate this new knowledge to others. For a scientist to be successful, he or she needs to be able to claim credit or priority for discoveries throughout their career. However, despite being fundamental to the reward system of science, the principles for establishing the "priority of discovery" are rarely discussed. Here we break down priority into two steps: disclosure, in which the discovery is released to the world-wide community; and validation, in which other scientists assess the accuracy, quality and importance of the work. Currently, in biology, disclosure and an initial validation are combined in a journal publication. Here, we discuss the advantages of separating these steps into disclosure via a preprint, and validation via a combination of peer review at a journal and additional evaluation by the wider scientific community. PMID:27310529
Thirty years of beta Pic and debris disks studies
NASA Astrophysics Data System (ADS)
Lagrange, Anne-Marie; Boccaletti, Anthony
2015-01-01
In the last 30 years, our knowledge of planetary systems has evolved considerably, in particular thanks to the development of observational techniques and computer simulations for modeling. From the observational point of view, emblematic discoveries thirty years ago opened the way to dedicated studies, among them the IRAS detections of IR excess associated with dust surrounding main-sequence stars. Shortly after these discoveries, the first image of a debris disk, around the star beta Pictoris, was made in 1984, followed in the 1990s by the indirect detection of extrasolar planets and, a decade later, by the direct imaging of young giant planets. Beta Pictoris is a ground-breaking object for the study of the formation and evolution of planetary systems. It is a unique system in many regards, as it is made of dust, planetesimals, comets and at least one giant planet. Observations with various techniques (imaging, spectroscopy, interferometry) at multiple wavelengths (from the UV to radio waves) have allowed significant progress in the understanding of this system. Yet many questions are still open, and more results are expected in the coming decade thanks to the next generation of instruments such as ALMA, JWST, SPHERE and many others. To celebrate the thirtieth anniversary of the first debris disk image, we propose to gather experts on the analysis of beta Pictoris and interested colleagues to review and discuss the observational knowledge on this archetypal system (including the latest results), as well as its current understanding and related open questions to be addressed in the next decade, such as the history of the disk and planet formation, dynamical evolution, etc. Similar, well-studied debris disk systems with significant amounts of observational data that allow in-depth modeling will also be presented and discussed.
Second, in a dedicated two-day workshop, we will gather to define an action plan for, typically, the next 3-5 years to achieve a full, comprehensive description of the whole beta Pictoris system, and to organize the necessary work and possible milestones. In the coming years, a similar approach may eventually be applicable to other systems.
Thirty years of beta Pic and debris disks studies
NASA Astrophysics Data System (ADS)
Lagrange, A.-M.; Boccaletti, A.
2014-09-01
In the last 30 years, our knowledge of planetary systems has evolved considerably, in particular thanks to the development of observational techniques and computer simulations for modeling. From the observational point of view, emblematic discoveries thirty years ago opened the way to dedicated studies, among them the IRAS detections of IR excess associated with dust surrounding main-sequence stars. Shortly after these discoveries, the first image of a debris disk, around the star beta Pictoris, was made in 1984, followed in the 1990s by the indirect detection of extrasolar planets and, a decade later, by the direct imaging of young giant planets. Beta Pictoris is a ground-breaking object for the study of the formation and evolution of planetary systems. It is a unique system in many regards, as it is made of dust, planetesimals, comets and at least one giant planet. Observations with various techniques (imaging, spectroscopy, interferometry) at multiple wavelengths (from the UV to radio waves) have allowed significant progress in the understanding of this system. Yet many questions are still open, and more results are expected in the coming decade thanks to the next generation of instruments such as ALMA, JWST, SPHERE and many others. To celebrate the thirtieth anniversary of the first debris disk image, we propose to gather experts on the analysis of beta Pictoris and interested colleagues to review and discuss the observational knowledge on this archetypal system (including the latest results), as well as its current understanding and related open questions to be addressed in the next decade, such as the history of the disk and planet formation, dynamical evolution, etc. Similar, well-studied debris disk systems with significant amounts of observational data that allow in-depth modeling will also be presented and discussed.
Second, in a dedicated two-day workshop, we will gather to define an action plan for, typically, the next 3-5 years to achieve a full, comprehensive description of the whole beta Pictoris system, and to organize the necessary work and possible milestones. In the coming years, a similar approach may eventually be applicable to other systems.
Computational biology for cardiovascular biomarker discovery.
Azuaje, Francisco; Devaux, Yvan; Wagner, Daniel
2009-07-01
Computational biology is essential in the process of translating biological knowledge into clinical practice, as well as in the understanding of biological phenomena based on the resources and technologies originating from the clinical environment. One such key contribution of computational biology is the discovery of biomarkers for predicting clinical outcomes using 'omic' information. This process involves the predictive modelling and integration of different types of data and knowledge for screening, diagnostic or prognostic purposes. Moreover, this requires the design and combination of different methodologies based on statistical analysis and machine learning. This article introduces key computational approaches and applications to biomarker discovery based on different types of 'omic' data. Although we emphasize applications in cardiovascular research, the computational requirements and advances discussed here are also relevant to other domains. We will start by introducing some of the contributions of computational biology to translational research, followed by an overview of methods and technologies used for the identification of biomarkers with predictive or classification value. The main types of 'omic' approaches to biomarker discovery will be presented with specific examples from cardiovascular research. This will include a review of computational methodologies for single-source and integrative data applications. Major computational methods for model evaluation will be described together with recommendations for reporting models and results. We will present recent advances in cardiovascular biomarker discovery based on the combination of gene expression and functional network analyses. The review will conclude with a discussion of key challenges for computational biology, including perspectives from the biosciences and clinical areas.
Standardized plant disease evaluations will enhance resistance gene discovery
USDA-ARS's Scientific Manuscript database
Gene discovery and marker development using DNA-based tools require plant populations with well documented phenotypes. If dissimilar phenotype evaluation methods or data scoring techniques are employed with different crops, or at different labs for the same crops, then data mining for genetic marker...
Roden, Suzanne E; Dutton, Peter H; Morin, Phillip A
2009-01-01
The green sea turtle, Chelonia mydas, was used as a case study for single nucleotide polymorphism (SNP) discovery in a species that has little genetic sequence information available. As green turtles have a complex population structure, additional nuclear markers other than microsatellites could add to our understanding of their complex life history. Amplified fragment length polymorphism technique was used to generate sets of random fragments of genomic DNA, which were then electrophoretically separated with precast gels, stained with SYBR green, excised, and directly sequenced. It was possible to perform this method without the use of polyacrylamide gels, radioactive or fluorescent labeled primers, or hybridization methods, reducing the time, expense, and safety hazards of SNP discovery. Within 13 loci, 2547 base pairs were screened, resulting in the discovery of 35 SNPs. Using this method, it was possible to yield a sufficient number of loci to screen for SNP markers without the availability of prior sequence information.
The top quark (20 years after the discovery)
Boos, Eduard; Brandt, Oleg; Denisov, Dmitri; ...
2015-09-10
On the twentieth anniversary of the observation of the top quark, we trace our understanding of this heaviest of all known particles from the prediction of its existence, through the searches and discovery, to the current knowledge of its production mechanisms and properties. We also discuss the central role of the top quark in the Standard Model and the windows that it opens for seeking new physics beyond the Standard Model.
USDA-ARS's Scientific Manuscript database
Valuable information on the location and context of ecological studies are locked up in publications in myriad formats that are not easily machine readable. This presents significant challenges to building geographic-based tools to search for and visualize sources of ecological knowledge. JournalMap...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-24
... his or her knowledge and belief, the information contained in the document is accurate and complete. The first item in the certification required by SEC Form N-CSR is: ``Based on my knowledge, this..., competitiveness and financial integrity of futures markets; (3) price discovery; (4) sound risk management...
Incremental Knowledge Discovery in Social Media
ERIC Educational Resources Information Center
Tang, Xuning
2013-01-01
In light of the prosperity of online social media, Web users are shifting from data consumers to data producers. To catch the pulse of this rapidly changing world, it is critical to transform online social media data to information and to knowledge. This dissertation centers on the issue of modeling the dynamics of user communities, trending…
Effects of Students' Prior Knowledge on Scientific Reasoning in Density.
ERIC Educational Resources Information Center
Yang, Il-Ho; Kwon, Yong-Ju; Kim, Young-Shin; Jang, Myoung-Duk; Jeong, Jin-Woo; Park, Kuk-Tae
2002-01-01
Investigates the effects of students' prior knowledge on the scientific reasoning processes of performing the task of controlling variables with computer simulation and identifies a number of problems that students encounter in scientific discovery. Involves (n=27) 5th grade students and (n=33) 7th grade students. Indicates that students' prior…
Recombinant organisms for production of industrial products.
Adrio, Jose-Luis; Demain, Arnold L
2010-01-01
A revolution in industrial microbiology was sparked by the discovery of the double-stranded structure of DNA and the development of recombinant DNA technology. Traditional industrial microbiology was merged with molecular biology to yield improved recombinant processes for the industrial production of primary and secondary metabolites, protein biopharmaceuticals and industrial enzymes. Novel genetic techniques such as metabolic engineering, combinatorial biosynthesis and molecular breeding techniques and their modifications are contributing greatly to the development of improved industrial processes. In addition, functional genomics, proteomics and metabolomics are being exploited for the discovery of novel valuable small molecules for medicine as well as enzymes for catalysis. The sequencing of industrial microbial genomes is being carried out, which bodes well for future process improvement and discovery of new industrial products. © 2010 Landes Bioscience
Automatic extraction of relations between medical concepts in clinical texts
Harabagiu, Sanda; Roberts, Kirk
2011-01-01
Objective: A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. Materials and methods: A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. Results: The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Discussion: Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may improve the results of automatic relation extraction. Conclusion: Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available. PMID:21846787
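The F1 definition quoted in this abstract can be checked against the reported precision and recall with a short Python sketch. The function name `f1_score` is chosen here for illustration, and the zero-denominator guard is an added assumption, not part of the abstract's definition.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: 2*P*R / (P + R)."""
    if precision + recall == 0:
        # Assumed convention: define F1 as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from the abstract, in percent.
gold_standard = f1_score(72.0, 75.3)   # concepts and assertions given
automatic = f1_score(57.6, 41.7)       # concepts and assertions discovered
```

Recomputing from the published, rounded figures gives about 48.4 for the automatic setting, matching the abstract, and about 73.6 for the gold-standard setting. The small gap from the reported 73.7 is consistent with the precision and recall themselves having been rounded to one decimal place.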
McAlpine, James B
2009-03-27
Over the past decade major changes have occurred in the access to genome sequences that encode the enzymes responsible for the biosynthesis of secondary metabolites, knowledge of how those sequences translate into the final structure of the metabolite, and the ability to alter the sequence to obtain predicted products via both homologous and heterologous expression. Novel genera have been discovered leading to new chemotypes, but more surprisingly several instances have been uncovered where the apparently general rules of modular translation have not applied. Several new biosynthetic pathways have been unearthed, and our general knowledge grows rapidly. This review aims to highlight some of the more striking discoveries and advances of the decade.
Text mining patents for biomedical knowledge.
Rodriguez-Esteban, Raul; Bundschus, Markus
2016-06-01
Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modelling and enhanced molecular dynamics to steer structure-based drug discovery.
Kalyaanamoorthy, Subha; Chen, Yi-Ping Phoebe
2014-05-01
The ever-increasing gap between the availabilities of the genome sequences and the crystal structures of proteins remains one of the significant challenges to the modern drug discovery efforts. The knowledge of structure-dynamics-functionalities of proteins is important in order to understand several key aspects of structure-based drug discovery, such as drug-protein interactions, drug binding and unbinding mechanisms and protein-protein interactions. This review presents a brief overview on the different state of the art computational approaches that are applied for protein structure modelling and molecular dynamics simulations of biological systems. We give an essence of how different enhanced sampling molecular dynamics approaches, together with regular molecular dynamics methods, assist in steering the structure based drug discovery processes. Copyright © 2013 Elsevier Ltd. All rights reserved.
A bilateral integrative health-care knowledge service mechanism based on 'MedGrid'.
Liu, Chao; Jiang, Zuhua; Zhen, Lu; Su, Hai
2008-04-01
Current health-care organizations perceive a paucity of medical knowledge. This paper classifies medical knowledge along new dimensions. The discovery of a health-care 'knowledge flow' motivates a bilateral integrative health-care knowledge service, in which medical knowledge 'flows' and gains comprehensive effectiveness through six operations (such as knowledge refreshing...). Responding to the active demand of the Chinese health-care revolution, this paper presents 'MedGrid', a platform providing a medical ontology and knowledge content services. Each level of the MedGrid info-structure and its detailed contents are described. Moreover, a new diagnosis and treatment mechanism is formed by technically connecting with electronic health-care records (EHRs).
Zebrafish Models of Human Leukemia: Technological Advances and Mechanistic Insights.
Harrison, Nicholas R; Laroche, Fabrice J F; Gutierrez, Alejandro; Feng, Hui
2016-01-01
Insights concerning leukemic pathophysiology have been acquired in various animal models and further efforts to understand the mechanisms underlying leukemic treatment resistance and disease relapse promise to improve therapeutic strategies. The zebrafish (Danio rerio) is a vertebrate organism with a conserved hematopoietic program and unique experimental strengths suiting it for the investigation of human leukemia. Recent technological advances in zebrafish research including efficient transgenesis, precise genome editing, and straightforward transplantation techniques have led to the generation of a number of leukemia models. The transparency of the zebrafish when coupled with improved lineage-tracing and imaging techniques has revealed exquisite details of leukemic initiation, progression, and regression. With these advantages, the zebrafish represents a unique experimental system for leukemic research and additionally, advances in zebrafish-based high-throughput drug screening promise to hasten the discovery of novel leukemia therapeutics. To date, investigators have accumulated knowledge of the genetic underpinnings critical to leukemic transformation and treatment resistance and without doubt, zebrafish are rapidly expanding our understanding of disease mechanisms and helping to shape therapeutic strategies for improved outcomes in leukemic patients.
Zebrafish Models of Human Leukemia: Technological Advances and Mechanistic Insights
Harrison, Nicholas R.; Laroche, Fabrice J.F.; Gutierrez, Alejandro
2016-01-01
Insights concerning leukemic pathophysiology have been acquired in various animal models and further efforts to understand the mechanisms underlying leukemic treatment resistance and disease relapse promise to improve therapeutic strategies. The zebrafish (Danio rerio) is a vertebrate organism with a conserved hematopoietic program and unique experimental strengths suiting it for the investigation of human leukemia. Recent technological advances in zebrafish research including efficient transgenesis, precise genome editing, and straightforward transplantation techniques have led to the generation of a number of leukemia models. The transparency of the zebrafish when coupled with improved lineage-tracing and imaging techniques has revealed exquisite details of leukemic initiation, progression, and regression. With these advantages, the zebrafish represents a unique experimental system for leukemic research and additionally, advances in zebrafish-based high-throughput drug screening promise to hasten the discovery of novel leukemia therapeutics. To date, investigators have accumulated knowledge of the genetic underpinnings critical to leukemic transformation and treatment resistance and without doubt, zebrafish are rapidly expanding our understanding of disease mechanisms and helping to shape therapeutic strategies for improved outcomes in leukemic patients. PMID:27165361
Theory and practice: How do we teach our students about light?
NASA Astrophysics Data System (ADS)
Creath, Katherine
2007-08-01
As optical scientists and engineers we have an educational paradigm that stresses passing knowledge from teacher to student. We are also taught to use inductive reasoning to solve problems. Yet many of the fundamental questions in optics such as the topic of this conference "What are photons?" require that we use retroductive reasoning to deduce the possible and probable cause of the observations and measurements we make. We can agree that we don't have all the answers for many fundamental questions in optics. The retroductive reasoning process requires a different way of thinking from our traditional classroom setting. Most of us learned to do this through working in a research lab or industry. With the amount of information and new discoveries to consider, it makes it difficult to cover everything in the classroom. This paper looks at transformational learning techniques and how they have been applied in science and engineering. These techniques show promise to prepare our students to learn how to learn and develop skills they can directly apply to research and industry.
Biomarker Discovery by Novel Sensors Based on Nanoproteomics Approaches
Dasilva, Noelia; Díez, Paula; Matarraz, Sergio; González-González, María; Paradinas, Sara; Orfao, Alberto; Fuentes, Manuel
2012-01-01
In recent years, proteomics has facilitated biomarker discovery by coupling high-throughput techniques with novel nanosensors. In the present review, we focus on the study of label-based and label-free detection systems, as well as nanotechnology approaches, indicating their advantages and applications in biomarker discovery. In addition, several disease biomarkers are presented to illustrate the clinical importance of improving sensitivity and selectivity by using nanoproteomics approaches as novel sensors. PMID:22438764
Current status and future prospects for enabling chemistry technology in the drug discovery process.
Djuric, Stevan W; Hutchins, Charles W; Talaty, Nari N
2016-01-01
This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of "dangerous" reagents. Also featured are advances in the "computer-assisted drug design" area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities.
Ishii, Masaru
2015-06-01
Recent advances in intravital bone imaging technology have enabled us to grasp the real cellular behaviors and functions in vivo, revolutionizing the field of drug discovery for novel therapeutics against intractable bone diseases. In this chapter, I present updated information on the pharmacological actions of several antibone resorptive agents, which could only be derived from advanced imaging techniques, and also discuss the future perspectives of this new trend in drug discovery.
Enhancing knowledge discovery from cancer genomics data with Galaxy
Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.
2017-01-01
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. PMID:28327945
Democratizing data science through data science training.
Van Horn, John Darrell; Fierro, Lily; Kamdar, Jeana; Gordon, Jonathan; Stewart, Crystal; Bhattrai, Avnish; Abe, Sumiko; Lei, Xiaoxiao; O'Driscoll, Caroline; Sinha, Aakanchha; Jain, Priyambada; Burns, Gully; Lerman, Kristina; Ambite, José Luis
2018-01-01
The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific lectures and tutorials. ERuDIte now indexes over 9,500 resources. Given the richness of online training materials and the constant evolution of biomedical data science, computational methods applying information retrieval, natural language processing, and machine learning techniques are required - in effect, using data science to inform training in data science. In so doing, the TCC seeks to democratize novel insights and discoveries brought forth via large-scale data science training.
Enhancing knowledge discovery from cancer genomics data with Galaxy.
Albuquerque, Marco A; Grande, Bruno M; Ritch, Elie J; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K; Shah, Sohrab P; Boutros, Paul C; Morin, Ryan D
2017-05-01
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. © The Author 2017. Published by Oxford University Press.
iADRs: towards online adverse drug reaction analysis.
Lin, Wen-Yang; Li, He-Yi; Du, Jhih-Wei; Feng, Wen-Yu; Lo, Chiao-Feng; Soo, Von-Wun
2012-12-01
Adverse Drug Reaction (ADR) is one of the most important issues in the assessment of drug safety. In fact, many adverse drug reactions are not discovered during limited pre-marketing clinical trials; instead, they are only observed after long term post-marketing surveillance of drug usage. In light of this, the detection of adverse drug reactions, as early as possible, is an important topic of research for the pharmaceutical industry. Recently, large numbers of adverse events and the development of data mining technology have motivated the development of statistical and data mining methods for the detection of ADRs. These stand-alone methods, with no integration into knowledge discovery systems, are tedious and inconvenient for users and the processes for exploration are time-consuming. This paper proposes an interactive system platform for the detection of ADRs. By integrating an ADR data warehouse and innovative data mining techniques, the proposed system not only supports OLAP style multidimensional analysis of ADRs, but also allows the interactive discovery of associations between drugs and symptoms, called a drug-ADR association rule, which can be further developed using other factors of interest to the user, such as demographic information. The experiments indicate that interesting and valuable drug-ADR association rules can be efficiently mined.
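The drug-ADR association rules described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the mining algorithm, so the code below assumes the standard support and confidence measures for a rule "drug -> symptom"; the function name, thresholds, and report layout are hypothetical and are not taken from the iADRs system.

```python
from collections import Counter
from itertools import product


def mine_drug_adr_rules(reports, min_support=0.2, min_confidence=0.6):
    """Return (drug, symptom, support, confidence) rules from adverse event reports.

    reports: list of (set_of_drugs, set_of_symptoms) pairs, one per report.
    This layout is an assumption made for illustration only.
    """
    n = len(reports)
    drug_counts = Counter()   # number of reports mentioning each drug
    pair_counts = Counter()   # number of reports mentioning drug and symptom together
    for drugs, symptoms in reports:
        drug_counts.update(drugs)
        pair_counts.update(product(drugs, symptoms))

    rules = []
    for (drug, symptom), count in pair_counts.items():
        support = count / n                     # P(drug and symptom)
        confidence = count / drug_counts[drug]  # P(symptom | drug)
        if support >= min_support and confidence >= min_confidence:
            rules.append((drug, symptom, support, confidence))
    # Strongest rules first; names break ties so the order is deterministic.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))
```

Restricting the input reports to a demographic subgroup before calling the function is one simple way to approximate the refinement "using other factors of interest" that the abstract mentions.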
Democratizing data science through data science training
Van Horn, John Darrell; Fierro, Lily; Kamdar, Jeana; Gordon, Jonathan; Stewart, Crystal; Bhattrai, Avnish; Abe, Sumiko; Lei, Xiaoxiao; O’Driscoll, Caroline; Sinha, Aakanchha; Jain, Priyambada; Burns, Gully; Lerman, Kristina; Ambite, José Luis
2017-01-01
The biomedical sciences have experienced an explosion of data which promises to overwhelm many current practitioners. Without easy access to data science training resources, biomedical researchers may find themselves unable to wrangle their own datasets. In 2014, to address the challenges posed by such a data onslaught, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative. To this end, the BD2K Training Coordinating Center (TCC; bigdatau.org) was funded to facilitate both in-person and online learning, and open up the concepts of data science to the widest possible audience. Here, we describe the activities of the BD2K TCC and its focus on the construction of the Educational Resource Discovery Index (ERuDIte), which identifies, collects, describes, and organizes online data science materials from BD2K awardees, open online courses, and videos from scientific lectures and tutorials. ERuDIte now indexes over 9,500 resources. Given the richness of online training materials and the constant evolution of biomedical data science, computational methods applying information retrieval, natural language processing, and machine learning techniques are required - in effect, using data science to inform training in data science. In so doing, the TCC seeks to democratize novel insights and discoveries brought forth via large-scale data science training. PMID:29218890
Wains: a pattern-seeking artificial life species.
de Buitléir, Amy; Russell, Michael; Daly, Mark
2012-01-01
We describe the initial phase of a research project to develop an artificial life framework designed to extract knowledge from large data sets with minimal preparation or ramp-up time. In this phase, we evolved an artificial life population with a new brain architecture. The agents have sufficient intelligence to discover patterns in data and to make survival decisions based on those patterns. The species uses diploid reproduction, Hebbian learning, and Kohonen self-organizing maps, in combination with novel techniques such as using pattern-rich data as the environment and framing the data analysis as a survival problem for artificial life. The first generation of agents mastered the pattern discovery task well enough to thrive. Evolution further adapted the agents to their environment by making them a little more pessimistic, and also by making their brains more efficient.
Assessment of microbiota:host interactions at the vaginal mucosa interface.
Pruski, Pamela; Lewis, Holly V; Lee, Yun S; Marchesi, Julian R; Bennett, Phillip R; Takats, Zoltan; MacIntyre, David A
2018-04-27
There is increasing appreciation of the role that vaginal microbiota play in health and disease throughout a woman's lifespan. This has been driven partly by molecular techniques that enable detailed identification and characterisation of microbial community structures. However, these methods do not enable assessment of the biochemical and immunological interactions between host and vaginal microbiota involved in pathophysiology. This review examines our current knowledge of the relationships that exist between vaginal microbiota and the host at the level of the vaginal mucosal interface. We also consider methodological approaches to microbiomic, immunologic and metabolic profiling that permit assessment of these interactions. Integration of information derived from these platforms brings the potential for biomarker discovery, disease risk stratification and improved understanding of the mechanisms regulating vaginal microbial community dynamics in health and disease. Copyright © 2018 Elsevier Inc. All rights reserved.
Challenges in Timeseries Analysis from Microlensing
NASA Astrophysics Data System (ADS)
Street, R. A.
2017-06-01
Despite a flood of discoveries over the last ~ 20 years, our knowledge of the exoplanet population is incomplete owing to a gap between the sensitivities of different detection techniques. However, a census of exoplanets at all separations from their host stars is essential to fully understand planet formation mechanisms. Microlensing offers an effective way to bridge the gap around 1-10 AU and is therefore one of the major science goals of the Wide Field Infrared Survey Telescope (WFIRST) mission. WFIRST's survey of the Galactic Bulge is expected to discover ~ 20,000 microlensing events, including ~ 3000 planets, which represents a substantial data analysis challenge with the modeling software currently available. This paper highlights areas where further work is needed. The community is encouraged to join new software development efforts aimed at making the modeling of microlensing events both more accessible and rigorous.
Vilariño Besteiro, M P; Pérez Franco, C; Gallego Morales, L; Calvo Sagardoy, R; García de Lorenzo, A
2009-01-01
This paper presents a combination of therapeutic strategies for the treatment of long-standing eating disorders. This way of working, called the "Modelo Santa Cristina", is based on several theoretical paradigms: the Enabling Model, the Action Control Model, the Transtheoretical Model of Change Processes and the Cognitive-Behavioural Model (Cognitive Restructuring and Learning Theories). It also draws on Gestalt, Systemic and Psychodrama-oriented techniques. The purpose of the treatment is both the normalization of eating patterns and the increase in patients' self-knowledge, self-acceptance and self-efficacy. The main areas of intervention include the exploration of ambivalence to change, the discovery of the functions of symptoms and the search for alternative behaviours, the normalization of eating patterns, body image, cognitive restructuring, decision making, communication skills and the working through of traumatic experiences.
Convection in Cool Stars, as Seen Through Kepler's Eyes
NASA Astrophysics Data System (ADS)
Bastien, Fabienne A.
2015-01-01
Stellar surface processes represent a fundamental limit to the detection of extrasolar planets with the currently most heavily-used techniques. As such, considerable effort has gone into trying to mitigate the impact of these processes on planet detection, with most studies focusing on magnetic spots. Meanwhile, high-precision photometric planet surveys like CoRoT and Kepler have unveiled a wide variety of stellar variability at previously inaccessible levels. We demonstrate that these newly revealed variations are not solely magnetically driven but also trace surface convection through light curve "flicker." We show that "flicker" not only yields a simple measurement of surface gravity with a precision of ~0.1 dex, but it may also improve our knowledge of planet properties, enhance radial velocity planet detection and discovery, and provide new insights into stellar evolution.
Harnessing the potential of natural products in drug discovery from a cheminformatics vantage point.
Rodrigues, Tiago
2017-11-15
Natural products (NPs) present a privileged source of inspiration for chemical probe and drug design. Despite the biological pre-validation of the underlying molecular architectures and their relevance in drug discovery, the poor accessibility to NPs, complexity of the synthetic routes and scarce knowledge of their macromolecular counterparts in phenotypic screens still hinder their broader exploration. Cheminformatics algorithms now provide a powerful means of circumventing the abovementioned challenges and unlocking the full potential of NPs in a drug discovery context. Herein, I discuss recent advances in the computer-assisted design of NP mimics and how artificial intelligence may accelerate future NP-inspired molecular medicine.
Integrated Bio-Entity Network: A System for Biological Knowledge Discovery
Bell, Lindsey; Chowdhary, Rajesh; Liu, Jun S.; Niu, Xufeng; Zhang, Jinfeng
2011-01-01
A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses—the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs. PMID:21738677
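The idea of a most probable path in a network whose edges carry relationship probabilities can be sketched as follows. The abstract does not give the details of the authors' MPP algorithm, so this is only an assumed, standard formulation: the probability of a path is taken as the product of its edge probabilities, and the maximum is found with Dijkstra's algorithm on the costs -log(p). The graph layout and the entity names in the usage note are hypothetical.

```python
import heapq
import math


def most_probable_path(graph, source, target):
    """Return (path, probability) for the most probable path, or (None, 0.0).

    graph: dict mapping node -> {neighbor: probability in (0, 1]}.
    Maximizing a product of probabilities equals minimizing the sum of -log(p),
    so Dijkstra's algorithm applies because every cost is non-negative.
    """
    best = {source: 0.0}   # lowest known cost (-log probability) per node
    previous = {}          # back-pointers used to rebuild the path
    heap = [(0.0, source)]
    visited = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, prob in graph.get(node, {}).items():
            if prob <= 0.0:
                continue  # an impossible relationship cannot be part of a path
            new_cost = cost - math.log(prob)
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])
```

For example, with edges geneA -> proteinB (0.9), proteinB -> diseaseD (0.8), geneA -> pathwayC (0.5) and pathwayC -> diseaseD (0.95), the route through proteinB has probability 0.72 and is preferred over the route through pathwayC (0.475). An indirect geneA-diseaseD relationship found this way would be a candidate hypothesis, not an established result.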
A Discovery-Oriented Approach to Solid-Phase Peptide Synthesis
ERIC Educational Resources Information Center
Bockman, Matthew R.; Miedema, Christopher J.; Brennan, Brian B.
2012-01-01
In this discovery-oriented laboratory experiment, students use solid-phase synthesis techniques to construct a dipeptide containing an unknown amino acid. Following synthesis and cleavage from the polymeric support, electrospray ionization-mass spectrometry is employed to identify the unknown amino acid that was used in the peptide coupling. This…
Analyzing Student Inquiry Data Using Process Discovery and Sequence Classification
ERIC Educational Resources Information Center
Emond, Bruno; Buffett, Scott
2015-01-01
This paper reports on results of applying process discovery mining and sequence classification mining techniques to a data set of semi-structured learning activities. The main research objective is to advance educational data mining to model and support self-regulated learning in heterogeneous environments of learning content, activities, and…
Strategies for adding adaptive learning mechanisms to rule-based diagnostic expert systems
NASA Technical Reports Server (NTRS)
Stclair, D. C.; Sabharwal, C. L.; Bond, W. E.; Hacke, Keith
1988-01-01
Rule-based diagnostic expert systems can be used to perform many of the diagnostic chores necessary in today's complex space systems. These expert systems typically take a set of symptoms as input and produce diagnostic advice as output. The primary objective of such expert systems is to provide accurate and comprehensive advice which can be used to help return the space system in question to nominal operation. The development and maintenance of diagnostic expert systems is time and labor intensive since the services of both knowledge engineer(s) and domain expert(s) are required. The use of adaptive learning mechanisms to incrementally evaluate and refine rules promises to reduce both time and labor costs associated with such systems. This paper describes the basic adaptive learning mechanisms of strengthening, weakening, generalization, discrimination, and discovery. Next, basic strategies are discussed for adding these learning mechanisms to rule-based diagnostic expert systems. These strategies support the incremental evaluation and refinement of rules in the knowledge base by comparing the set of advice given by the expert system (A) with the correct diagnosis (C). Techniques are described for selecting those rules in the knowledge base which should participate in adaptive learning. The strategies presented may be used with a wide variety of learning algorithms. Further, these strategies are applicable to a large number of rule-based diagnostic expert systems. They may be used to provide either immediate or deferred updating of the knowledge base.
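The strengthening and weakening mechanisms, driven by comparing the advice set A with the correct diagnosis C, can be sketched as below. The abstract does not state the update formula, so the proportional update, the step size, and all names here are assumptions made purely for illustration; generalization, discrimination, and discovery are not covered.

```python
def update_rule_strengths(strengths, fired, correct, step=0.1):
    """Return new rule strengths after comparing advice (A) with the diagnosis (C).

    strengths: dict rule_id -> strength in [0, 1]
    fired:     dict rule_id -> diagnosis that the rule contributed to the advice set A
    correct:   set of diagnoses confirmed as correct (C)
    Rules whose advice lies in C are strengthened toward 1; rules whose advice
    lies outside C are weakened toward 0. Rules that did not fire are unchanged.
    """
    updated = dict(strengths)
    for rule_id, diagnosis in fired.items():
        current = updated[rule_id]
        if diagnosis in correct:
            updated[rule_id] = current + step * (1.0 - current)  # strengthening
        else:
            updated[rule_id] = current - step * current          # weakening
    return updated
```

Calling the function after every diagnosis corresponds to immediate updating of the knowledge base; collecting (fired, correct) pairs and applying them later in a batch corresponds to deferred updating.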
Code of Federal Regulations, 2011 CFR
2011-07-01
... increasing knowledge or understanding in science and engineering. Applied research is defined as efforts that attempt to determine and exploit the potential of scientific discoveries or improvements in technology...
76 FR 4452 - Privacy Act of 1974; Report of Modified or Altered System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-25
... Disease Control and Prevention (CDC) for more complete knowledge of the disease/condition in the following... the light of future discoveries and proven associations so that relevant data collected at the time of... professional staff at the Centers for Disease Control and Prevention (CDC) for more complete knowledge of the...
Trying to Teach Well: A Story of Small Discoveries
ERIC Educational Resources Information Center
Lewis, P. J.
2004-01-01
"Stories do not simply contain knowledge, they are themselves the knowledge" (Jackson (In: K. Eagan, H. McEwan (Eds.), Narrative in Teaching, Learning and Research, Teacher College Press, New York, 1995, p. 5)). How can we teach well? Perhaps we can find answers through our stories from the classroom. It is through our stories that we make sense…
ERIC Educational Resources Information Center
Polavaram, Sridevi
2016-01-01
Neuroscience can greatly benefit from using novel methods in computer science and informatics, which enable knowledge discovery in unexpected ways. Currently one of the biggest challenges in Neuroscience is to map the functional circuitry of the brain. The applications of this goal range from understanding structural reorganization of neurons to…
Knowledge Translation versus Knowledge Integration: A "Funder's" Perspective
ERIC Educational Resources Information Center
Kerner, Jon F.
2006-01-01
Each year, billions of US tax dollars are spent on basic discovery, intervention development, and efficacy research, while hundreds of billions of US tax dollars are also spent on health service delivery programs. However, little is spent on or known about how best to ensure that the lessons learned from science inform and improve the quality of…
The Assessment of Self-Directed Learning Readiness in Medical Education
ERIC Educational Resources Information Center
Monroe, Katherine Swint
2014-01-01
The rapid pace of scientific discovery has catalyzed the need for medical students to be able to find and assess new information. The knowledge required for physicians' skillful practice will change over the course of their careers, and, to keep up, they must be able to recognize their deficiencies, search for new knowledge, and critically evaluate…
EPA's Web Taxonomy is a faceted hierarchical vocabulary used to tag web pages with terms from a controlled vocabulary. Tagging enables search and discovery of EPA's Web-based information assets. EPA's Web Taxonomy is being provided in Simple Knowledge Organization System (SKOS) format. SKOS is a standard for sharing and linking knowledge organization systems that promises to make Federal terminology resources more interoperable.
The Emergence of Organizing Structure in Conceptual Representation.
Lake, Brenden M; Lawrence, Neil D; Tenenbaum, Joshua B
2018-06-01
Both scientists and children make important structural discoveries, yet their computational underpinnings are not well understood. Structure discovery has previously been formalized as probabilistic inference about the right structural form, where form could be a tree, ring, chain, grid, etc. (Kemp & Tenenbaum, 2008). Although this approach can learn intuitive organizations, including a tree for animals and a ring for the color circle, it assumes a strong inductive bias that considers only these particular forms, and each form is explicitly provided as initial knowledge. Here we introduce a new computational model of how organizing structure can be discovered, utilizing a broad hypothesis space with a preference for sparse connectivity. Given that the inductive bias is more general, the model's initial knowledge shows little qualitative resemblance to some of the discoveries it supports. As a consequence, the model can also learn complex structures for domains that lack intuitive description, as well as predict human property induction judgments without explicit structural forms. By allowing form to emerge from sparsity, our approach clarifies how both the richness and flexibility of human conceptual organization can coexist. Copyright © 2018 Cognitive Science Society, Inc.
Knowledge discovery by accuracy maximization
Cacciatore, Stefano; Luchinat, Claudio; Tenori, Leonardo
2014-01-01
Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold’s topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approach differs from previous methods in that it has an integrated procedure of validation of the results. In this way, the method ensures the highest robustness of the obtained solution. This robustness is demonstrated on experimental datasets of gene expression and metabolomics, where KODAMA compares favorably with other existing feature extraction methods. KODAMA is then applied to an astronomical dataset, revealing unexpected features. Interesting and not easily predictable features are also found in the analysis of the State of the Union speeches by American presidents: KODAMA reveals an abrupt linguistic transition sharply separating all post-Reagan from all pre-Reagan speeches. The transition occurs during Reagan’s presidency and not from its beginning. PMID:24706821
The Knowledge-Integrated Network Biomarkers Discovery for Major Adverse Cardiac Events
Jin, Guangxu; Zhou, Xiaobo; Wang, Honghui; Zhao, Hong; Cui, Kemi; Zhang, Xiang-Sun; Chen, Luonan; Hazen, Stanley L.; Li, King; Wong, Stephen T. C.
2010-01-01
Mass spectrometry (MS) technology in clinical proteomics is very promising for the discovery of new biomarkers for disease management. To overcome the obstacle of data noise in MS analysis, we proposed a new approach of knowledge-integrated biomarker discovery using data from Major Adverse Cardiac Events (MACE) patients. We first built up a cardiovascular-related network based on protein information coming from protein annotations in Uniprot, protein–protein interaction (PPI), and signal transduction databases. Distinct from the previous machine learning methods in MS data processing, we then used statistical methods to discover biomarkers in the cardiovascular-related network. Through the tradeoff between known protein information and data noise in mass spectrometry data, we could finally identify high-confidence biomarkers. Most importantly, aided by the protein–protein interaction network, that is, the cardiovascular-related network, we proposed a new type of biomarker, the network biomarker, composed of a set of proteins and the interactions among them. The candidate network biomarkers can classify the two groups of patients more accurately than current single biomarkers that do not consider biological molecular interactions. PMID:18665624
Usage and applications of Semantic Web techniques and technologies to support chemistry research
2014-01-01
Background: The drug discovery process is now highly dependent on the management, curation and integration of large amounts of potentially useful data. Semantics are necessary in order to interpret the information and derive knowledge. Advances in recent years have mitigated concerns that the lack of robust, usable tools has inhibited the adoption of methodologies based on semantics. Results: This paper presents three examples of how Semantic Web techniques and technologies can be used in order to support chemistry research: a controlled vocabulary for quantities, units and symbols in physical chemistry; a controlled vocabulary for the classification and labelling of chemical substances and mixtures; and, a database of chemical identifiers. This paper also presents a Web-based service that uses the datasets in order to assist with the completion of risk assessment forms, along with a discussion of the legal implications and value-proposition for the use of such a service. Conclusions: We have introduced the Semantic Web concepts, technologies, and methodologies that can be used to support chemistry research, and have demonstrated the application of those techniques in three areas very relevant to modern chemistry research, generating three new datasets that we offer as exemplars of an extensible portfolio of advanced data integration facilities. We have thereby established the importance of Semantic Web techniques and technologies for meeting Wild’s fourth “grand challenge”. PMID:24855494
Usage and applications of Semantic Web techniques and technologies to support chemistry research.
Borkum, Mark I; Frey, Jeremy G
2014-01-01
The drug discovery process is now highly dependent on the management, curation and integration of large amounts of potentially useful data. Semantics are necessary in order to interpret the information and derive knowledge. Advances in recent years have mitigated concerns that the lack of robust, usable tools has inhibited the adoption of methodologies based on semantics. This paper presents three examples of how Semantic Web techniques and technologies can be used in order to support chemistry research: a controlled vocabulary for quantities, units and symbols in physical chemistry; a controlled vocabulary for the classification and labelling of chemical substances and mixtures; and, a database of chemical identifiers. This paper also presents a Web-based service that uses the datasets in order to assist with the completion of risk assessment forms, along with a discussion of the legal implications and value-proposition for the use of such a service. We have introduced the Semantic Web concepts, technologies, and methodologies that can be used to support chemistry research, and have demonstrated the application of those techniques in three areas very relevant to modern chemistry research, generating three new datasets that we offer as exemplars of an extensible portfolio of advanced data integration facilities. We have thereby established the importance of Semantic Web techniques and technologies for meeting Wild's fourth "grand challenge".
Geerts, Hugo; Dacks, Penny A; Devanarayan, Viswanath; Haas, Magali; Khachaturian, Zaven S; Gordon, Mark Forrest; Maudsley, Stuart; Romero, Klaus; Stephenson, Diane
2016-09-01
Massive investment and technological advances in the collection of extensive and longitudinal information on thousands of Alzheimer patients result in large amounts of data. These "big-data" databases can potentially advance CNS research and drug development. However, although necessary, they are not sufficient, and we posit that they must be matched with analytical methods that go beyond retrospective data-driven associations with various clinical phenotypes. Although these empirically derived associations can generate novel and useful hypotheses, they need to be organically integrated in a quantitative understanding of the pathology that can be actionable for drug discovery and development. We argue that mechanism-based modeling and simulation approaches, where existing domain knowledge is formally integrated using complexity science and quantitative systems pharmacology, can be combined with data-driven analytics to generate predictive actionable knowledge for drug discovery programs, target validation, and optimization of clinical development. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
A knowledge discovery object model API for Java
Zuyderduyn, Scott D; Jones, Steven JM
2003-01-01
Background: Biological data resources have become heterogeneous and derive from multiple sources. This introduces challenges in the management and utilization of this data in software development. Although efforts are underway to create a standard format for the transmission and storage of biological data, this objective has yet to be fully realized. Results: This work describes an application programming interface (API) that provides a framework for developing an effective biological knowledge ontology for Java-based software projects. The API provides a robust framework for the data acquisition and management needs of an ontology implementation. In addition, the API contains classes to assist in creating GUIs to represent this data visually. Conclusions: The Knowledge Discovery Object Model (KDOM) API is particularly useful for medium to large applications, or for a number of smaller software projects with common characteristics or objectives. KDOM can be coupled effectively with other biologically relevant APIs and classes. Source code, libraries, documentation and examples are available at . PMID:14583100
Emergence of Chinese drug discovery research: impact of hit and lead identification.
Zhou, Caihong; Zhou, Yan; Wang, Jia; Zhu, Yue; Deng, Jiejie; Wang, Ming-Wei
2015-03-01
The identification of hits and the generation of viable leads is an early and yet crucial step in drug discovery. In the West, the main players of drug discovery are pharmaceutical and biotechnology companies, while in China, academic institutions remain central in the field of drug discovery. There has been a tremendous amount of investment from the public as well as private sectors to support infrastructure buildup and expertise consolidation relative to drug discovery and development in the past two decades. A large-scale compound library has been established in China, and a series of high-impact discoveries of lead compounds have been made by integrating information obtained from different technology-based strategies. Natural products are a major source in China's drug discovery efforts. Knowledge has been enhanced via disruptive breakthroughs such as the discovery of Boc5 as a nonpeptidic agonist of glucagon-like peptide 1 receptor (GLP-1R), one of the class B G protein-coupled receptors (GPCRs). Most of the original hit identification and lead generation were carried out by academic institutions, including universities and specialized research institutes. The Chinese pharmaceutical industry is gradually transforming itself from manufacturing low-end generics and active pharmaceutical ingredients to inventing new drugs. © 2014 Society for Laboratory Automation and Screening.
Fragment-based approaches to anti-HIV drug discovery: state of the art and future opportunities.
Huang, Boshi; Kang, Dongwei; Zhan, Peng; Liu, Xinyong
2015-12-01
The search for additional drugs to treat HIV infection is a continuing effort due to the emergence and spread of HIV strains resistant to nearly all current drugs. The recent literature reveals that fragment-based drug design/discovery (FBDD) has become an effective alternative to conventional high-throughput screening strategies for drug discovery. In this critical review, the authors describe the state of the art in FBDD strategies for the discovery of anti-HIV drug-like compounds. The article focuses on fragment screening techniques, direct fragment-based design and early hit-to-lead progress. Rapid progress in biophysical detection and in silico techniques has greatly aided the application of FBDD to discover candidate agents directed at a variety of anti-HIV targets. Growing evidence suggests that structural insights on key proteins in the HIV life cycle can be applied in the early phase of drug discovery campaigns, providing valuable information on the binding modes and efficiently prompting fragment hit-to-lead progression. The combination of structural insights with improved methodologies for FBDD, including the privileged fragment-based reconstruction approach, fragment hybridization based on crystallographic overlays, fragment growth exploiting dynamic combinatorial chemistry, and high-speed fragment assembly via diversity-oriented synthesis followed by in situ screening, offers the possibility of more efficient and rapid discovery of novel drugs for HIV-1 prevention or treatment. Though the use of FBDD in anti-HIV drug discovery is still in its infancy, it is anticipated that anti-HIV agents developed via fragment-based strategies will be introduced into the clinic in the future.
From laptop to benchtop to bedside: Structure-based Drug Design on Protein Targets
Chen, Lu; Morrow, John K.; Tran, Hoang T.; Phatak, Sharangdhar S.; Du-Cuny, Lei; Zhang, Shuxing
2013-01-01
As an important aspect of computer-aided drug design, structure-based drug design has brought a new horizon to pharmaceutical development. This in silico method permeates all aspects of drug discovery today, including lead identification, lead optimization, ADMET prediction and drug repurposing. Structure-based drug design has produced notable successes in drug discovery targeting protein-ligand and protein-protein interactions. Meanwhile, challenges such as low accuracy and combinatorial issues may also cause failures. In this review, state-of-the-art techniques for protein modeling (e.g. structure prediction, modeling protein flexibility, etc.), hit identification/optimization (e.g. molecular docking, focused library design, fragment-based design, molecular dynamics, etc.), and polypharmacology design will be discussed. We will explore how structure-based techniques can facilitate the drug discovery process and interplay with other experimental approaches. PMID:22316152
Mathematical modeling for novel cancer drug discovery and development.
Zhang, Ping; Brusic, Vladimir
2014-10-01
Mathematical modeling enables the in silico classification of cancers, the prediction of disease outcomes, optimization of therapy, identification of promising drug targets and prediction of resistance to anticancer drugs. In silico pre-screened drug targets can be validated by a small number of carefully selected experiments. This review discusses the basics of mathematical modeling in cancer drug discovery and development. The topics include in silico discovery of novel molecular drug targets, optimization of immunotherapies, personalized medicine and guiding preclinical and clinical trials. Breast cancer has been used to demonstrate the applications of mathematical modeling in cancer diagnostics, the identification of high-risk populations, cancer screening strategies, prediction of tumor growth and guiding cancer treatment. Mathematical models are key components of the toolkit used in the fight against cancer. The combinatorial complexity of new drug discovery is enormous, making systematic drug discovery by experimentation alone difficult if not impossible. The biggest challenges include seamless integration of growing data, information and knowledge, and making them available for a multiplicity of analyses. Mathematical models are essential for bringing cancer drug discovery into the era of Omics, Big Data and personalized medicine.
Lifeomics leads the age of grand discoveries.
He, Fuchu
2013-03-01
When our knowledge of a field accumulates to a certain level, we are bound to see the rise of one or more great scientists. They will make a series of grand discoveries/breakthroughs and push the discipline into an 'age of grand discoveries'. Mathematics, geography, physics and chemistry have all experienced their ages of grand discoveries; and in life sciences, the age of grand discoveries has appeared countless times since the 16th century. Thanks to the ever-changing development of molecular biology over the past 50 years, contemporary life science is once again approaching its breaking point and the trigger for this is most likely to be 'lifeomics'. At the end of the 20th century, genomics wrote out the 'script of life'; proteomics decoded the script; and RNAomics, glycomics and metabolomics came into bloom. These 'omics', with their unique epistemology and methodology, quickly became the thrust of life sciences, pushing the discipline to a new high. Lifeomics, which encompasses all omics, has taken shape and is now signalling the dawn of a new era, the age of grand discoveries.
Current status and future prospects for enabling chemistry technology in the drug discovery process
Djuric, Stevan W.; Hutchins, Charles W.; Talaty, Nari N.
2016-01-01
This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of “dangerous” reagents. Also featured are advances in the “computer-assisted drug design” area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities. PMID:27781094
Hively, Lee M [Philadelphia, TN]
2011-07-12
The invention relates to a method and apparatus for simultaneously processing different sources of test data into informational data and then processing different categories of informational data into knowledge-based data. The knowledge-based data can then be communicated between nodes in a system of multiple computers according to rules for a type of complex, hierarchical computer system modeled on a human brain.
Computer-aided drug discovery research at a global contract research organization
NASA Astrophysics Data System (ADS)
Kitchen, Douglas B.
2017-03-01
Computer-aided drug discovery started at Albany Molecular Research, Inc in 1997. Over nearly 20 years the role of cheminformatics and computational chemistry has grown throughout the pharmaceutical industry and at AMRI. This paper will describe the infrastructure and roles of CADD throughout drug discovery and some of the lessons learned regarding the success of several methods. Various contributions provided by computational chemistry and cheminformatics in chemical library design, hit triage, hit-to-lead and lead optimization are discussed. Some frequently used computational chemistry techniques are described. The ways in which they may contribute to discovery projects are presented based on a few examples from recent publications.
2006-05-19
KENNEDY SPACE CENTER, FLA. -- Near Launch Pad 39B, wild pigs (at right) root for food near a stand of trees while Space Shuttle Discovery rolls out to the pad. The 4.2-mile journey from the Vehicle Assembly Building began at 12:45 p.m. EDT. The rollout is an important step before launch of Discovery on mission STS-121 to the International Space Station. Discovery's launch is targeted for July 1 in a launch window that extends to July 19. During the 12-day mission, Discovery's crew will test new hardware and techniques to improve shuttle safety, as well as deliver supplies and make repairs to the station. Photo credit: NASA/Ken Thornsley
From Residency to Lifelong Learning.
Brandt, Keith
2015-11-01
The residency training experience is the perfect environment for learning. The university/institution patient population provides a never-ending supply of patients with unique management challenges. Resources abound that allow the discovery of knowledge about similar situations. Senior teachers provide counseling and help direct appropriate care. Periodic testing and evaluations identify deficiencies, which can be corrected with future study. What happens, however, when the resident graduates? Do they possess all the knowledge they'll need for the rest of their career? Will medical discovery stand still, limiting the need for future study? If initial certification establishes that the physician has the skills and knowledge to function as an independent physician and surgeon, how do we assure the public that plastic surgeons will practice lifelong learning and remain safe throughout their career? Enter Maintenance of Certification (MOC). In an ideal world, MOC would provide many of the same tools as residency training: identification of gaps in knowledge, resources to correct those deficiencies, overall assessment of knowledge, feedback about communication skills and professionalism, and methods to evaluate and improve one's practice. This article discusses the need for education and self-assessment that extends beyond residency training and a commitment to lifelong learning. The American Board of Plastic Surgery MOC program is described to demonstrate how it helps the diplomate reach the goal of continuous practice improvement.
Semantically-enabled Knowledge Discovery in the Deep Carbon Observatory
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.; Ma, X.; Erickson, J. S.; West, P.; Fox, P. A.
2013-12-01
The Deep Carbon Observatory (DCO) is a decadal effort aimed at transforming scientific and public understanding of carbon in the complex deep earth system from the perspectives of Deep Energy, Deep Life, Extreme Physics and Chemistry, and Reservoirs and Fluxes. Over the course of the decade DCO scientific activities will generate a massive volume of data across a variety of disciplines, presenting significant challenges in terms of data integration, management, analysis and visualization, and ultimately limiting the ability of scientists across disciplines to make insights and unlock new knowledge. The DCO Data Science Team (DCO-DS) is applying Semantic Web methodologies to construct a knowledge representation focused on the DCO Earth science disciplines, and use it together with other technologies (e.g. natural language processing and data mining) to create a more expressive representation of the distributed corpus of DCO artifacts including datasets, metadata, instruments, sensors, platforms, deployments, researchers, organizations, funding agencies, grants and various awards. The embodiment of this knowledge representation is the DCO Data Science Infrastructure, in which unique entities within the DCO domain and the relations between them are recognized and explicitly identified. The DCO-DS Infrastructure will serve as a platform for more efficient and reliable searching, discovery, access, and publication of information and knowledge for the DCO scientific community and beyond.
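The idea of explicitly identifying entities and the relations between them can be illustrated with a minimal subject-predicate-object store. This is only a sketch of the general Semantic Web pattern, not the DCO-DS implementation: the `TripleStore` class, the identifiers (`dataset:example-1`, `researcher:example-a`, `instrument:example-x`) and the predicate names are all hypothetical.

```python
class TripleStore:
    """Toy store of (subject, predicate, object) statements.

    Hypothetical illustration only; real Semantic Web systems use RDF
    libraries and shared vocabularies rather than plain strings.
    """

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Record one explicit relation between two identified entities."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the given pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Hypothetical entities and relations, invented for illustration.
store = TripleStore()
store.add("dataset:example-1", "createdBy", "researcher:example-a")
store.add("dataset:example-1", "collectedWith", "instrument:example-x")
store.add("researcher:example-a", "memberOf", "organization:example-org")

# Which entities is the example dataset directly related to?
related = store.match(subject="dataset:example-1")
```

Because every statement has the same three-part shape, a single pattern-matching query can serve search and discovery across otherwise unrelated kinds of artifacts.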
Scaffold Repurposing of Old Drugs Towards New Cancer Drug Discovery.
Chen, Haijun; Wu, Jianlei; Gao, Yu; Chen, Haiying; Zhou, Jia
2016-01-01
In line with the Nobelist James Black's comment that "The most fruitful basis of the discovery of a new drug is to start with an old drug", drug repurposing represents an attractive drug discovery strategy. Despite the success of several repurposed drugs on the market, the ultimate therapeutic potential of a large number of non-cancer drugs is hindered during their repositioning by various issues, including limited efficacy and intellectual property. With increasing knowledge about their pharmacological properties and newly identified targets, the scaffolds of old drugs are emerging as a treasure trove for new cancer drug discovery. In this review, we summarize the recent advances in the development of novel small molecules for cancer therapy by scaffold repurposing with highlighted examples. The relevant strategies, advantages, challenges and future research directions associated with this approach are also discussed.
Why Quantify Uncertainty in Ecosystem Studies: Obligation versus Discovery Tool?
NASA Astrophysics Data System (ADS)
Harmon, M. E.
2016-12-01
There are multiple motivations for quantifying uncertainty in ecosystem studies. One is as an obligation; the other is as a tool useful in moving ecosystem science toward discovery. While reporting uncertainty should become a routine expectation, a more convincing motivation involves discovery. By clarifying what is known and to what degree it is known, uncertainty analyses can point the way toward improvements in measurements, sampling designs, and models. While some of these improvements (e.g., better sampling designs) may lead to incremental gains, those involving models (particularly model selection) may require large gains in knowledge. To be fully harnessed as a discovery tool, attitudes toward uncertainty may have to change: rather than viewing uncertainty as a negative assessment of what was done, it should be viewed as a positive, helpful assessment of what remains to be done.
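As one concrete way of stating "to what degree it is known", sampling uncertainty in a plot-level mean can be summarized with a percentile bootstrap interval. The sketch below is a generic illustration, not a method taken from the abstract; the function name, its defaults and the example measurements are assumptions invented for the purpose.

```python
import random
from statistics import fmean


def bootstrap_mean_interval(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`.

    Resamples the data with replacement, recomputes the mean each time,
    and returns the (alpha/2, 1 - alpha/2) percentiles of those means.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    means = sorted(
        fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements from eight sample plots (arbitrary units).
plot_values = [12.0, 15.0, 9.0, 14.0, 11.0, 18.0, 10.0, 13.0]
interval = bootstrap_mean_interval(plot_values)
```

A wide interval here points at the sampling design (more plots, better stratification) rather than at the analysis, which is the "what remains to be done" reading of uncertainty.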