A Virtual Bioinformatics Knowledge Environment for Early Cancer Detection
NASA Technical Reports Server (NTRS)
Crichton, Daniel; Srivastava, Sudhir; Johnsey, Donald
2003-01-01
Discovery of disease biomarkers for cancer is a leading focus of early detection. The National Cancer Institute created the Early Detection Research Network (EDRN), a network of collaborating institutions focused on the discovery and validation of cancer biomarkers. Informatics plays a key role in enabling a virtual knowledge environment that provides scientists real-time access to distributed data sets located at research institutions across the nation. The distributed and heterogeneous nature of the collaboration makes data sharing across institutions very difficult. EDRN has developed a comprehensive informatics effort focused on developing a national infrastructure enabling seamless access, sharing and discovery of science data resources across all EDRN sites. This paper discusses the EDRN knowledge system architecture, its objectives and its accomplishments.
Virtual Observatories, Data Mining, and Astroinformatics
NASA Astrophysics Data System (ADS)
Borne, Kirk
The historical, current, and future trends in knowledge discovery from data in astronomy are presented here. The story begins with a brief history of data gathering and data organization. A description of the development of new information science technologies for astronomical discovery is then presented. Among these are e-Science and the virtual observatory, with its data discovery, access, display, and integration protocols; astroinformatics and data mining for exploratory data analysis, information extraction, and knowledge discovery from distributed data collections; new sky surveys' databases, including rich multivariate observational parameter sets for large numbers of objects; and the emerging discipline of data-oriented astronomical research, called astroinformatics. Astroinformatics is described as the fourth paradigm of astronomical research, following the three traditional research methodologies: observation, theory, and computation/modeling. Astroinformatics research areas include machine learning, data mining, visualization, statistics, semantic science, and scientific data management. Each of these areas is now an active research discipline, with significant science-enabling applications in astronomy. Research challenges and sample research scenarios are presented in these areas, in addition to sample algorithms for data-oriented research. These information science technologies enable scientific knowledge discovery from the increasingly large and complex data collections in astronomy. The education and training of the modern astronomy student must consequently include skill development in these areas, whose practitioners have traditionally been limited to applied mathematicians, computer scientists, and statisticians. Modern astronomical researchers must cross these traditional discipline boundaries, thereby borrowing the best-of-breed methodologies from multiple disciplines. In the era of large sky surveys and numerous large telescopes, the potential for astronomical discovery is equally large, and so the data-oriented research methods, algorithms, and techniques that are presented here will enable the greatest discovery potential from the ever-growing data and information resources in astronomy.
Translational Research 2.0: a framework for accelerating collaborative discovery.
Asakiewicz, Chris
2014-05-01
The world wide web has revolutionized the conduct of global, cross-disciplinary research. In the life sciences, interdisciplinary approaches to problem solving and collaboration are becoming increasingly important in facilitating knowledge discovery and integration. Web 2.0 technologies promise to have a profound impact, enabling reproducibility, aiding in discovery, and accelerating and transforming medical and healthcare research across the healthcare ecosystem. However, knowledge integration and discovery require a consistent foundation upon which to operate, one capable of addressing some of the critical issues associated with how research is conducted within the ecosystem today and how it should be conducted in the future. This article discusses a framework for enhancing collaborative knowledge discovery across the medical and healthcare research ecosystem, a framework that could serve as a foundation upon which ecosystem stakeholders can enhance the way data, information and knowledge are created, shared and used to accelerate the translation of knowledge from one area of the ecosystem to another.
Semantics-enabled service discovery framework in the SIMDAT pharma grid.
Qu, Cangtao; Zimmermann, Falk; Kumpf, Kai; Kamuzinzi, Richard; Ledent, Valérie; Herzog, Robert
2008-03-01
We present the design and implementation of a semantics-enabled service discovery framework in the data Grids for process and product development using numerical simulation and knowledge discovery (SIMDAT) Pharma Grid, an industry-oriented Grid environment for integrating thousands of Grid-enabled biological data services and analysis services. The framework consists of three major components: a biological domain ontology based on the Web Ontology Language (OWL) description logic (DL), service annotation based on the OWL Web service ontology (OWL-S), and a semantic matchmaker based on ontology reasoning. Built upon the framework, workflow technologies are extensively exploited in SIMDAT to assist biologists in (semi)automatically performing in silico experiments. We present a typical usage scenario through the case study of a biological workflow: IXodus.
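As an illustration of the matchmaking step only (the SIMDAT framework itself performs OWL-DL reasoning over OWL-S annotations), the sketch below ranks candidate services against a requested output concept using a toy is-a hierarchy; all concept and service names are invented for the example.

```python
# Toy subsumption-based service matchmaking sketch (not the SIMDAT OWL-DL matchmaker).
# The concept hierarchy and service descriptions below are invented for illustration.

# Parent links of a miniature biological concept hierarchy.
IS_A = {
    "ProteinSequence": "Sequence",
    "NucleotideSequence": "Sequence",
    "Sequence": "BiologicalData",
    "Alignment": "BiologicalData",
}

def ancestors(concept):
    """Return the set of concepts reachable by following is-a links upward."""
    seen = set()
    while concept in IS_A:
        concept = IS_A[concept]
        seen.add(concept)
    return seen

def match_degree(advertised, requested):
    """Classify the match between a service's advertised output and a request."""
    if advertised == requested:
        return "exact"
    if requested in ancestors(advertised):
        return "plug-in"      # service output is more specific than requested
    if advertised in ancestors(requested):
        return "subsumes"     # service output is more general than requested
    return "fail"

services = {
    "BlastService": "Alignment",
    "UniProtFetch": "ProteinSequence",
    "GenericRetrieval": "BiologicalData",
}

requested = "Sequence"
for name, output in services.items():
    print(name, match_degree(output, requested))
```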
Big, Deep, and Smart Data in Scanning Probe Microscopy
Kalinin, Sergei V.; Strelcov, Evgheni; Belianinov, Alex; ...
2016-09-27
Scanning probe microscopy (SPM) techniques open the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter on nanometer and atomic scales. We analyze the discovery process in SPM in terms of the information flow from the tip-surface junction to knowledge adoption by the scientific community. Furthermore, we discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.
Big, Deep, and Smart Data in Scanning Probe Microscopy.
Kalinin, Sergei V; Strelcov, Evgheni; Belianinov, Alex; Somnath, Suhas; Vasudevan, Rama K; Lingerfelt, Eric J; Archibald, Richard K; Chen, Chaomei; Proksch, Roger; Laanait, Nouamane; Jesse, Stephen
2016-09-27
Scanning probe microscopy (SPM) techniques have opened the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter at nanometer and atomic scales. Here, we analyze the scientific discovery process in SPM by following the information flow from the tip-surface junction, to knowledge adoption by the wider scientific community. We further discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.
Human Exploration and Development of Space: Strategic Plan
NASA Technical Reports Server (NTRS)
Branscome, Darrell (Editor); Allen, Marc (Editor); Bihner, William (Editor); Craig, Mark (Editor); Crouch, Matthew (Editor); Crouch, Roger (Editor); Flaherty, Chris (Editor); Haynes, Norman (Editor); Horowitz, Steven (Editor)
2000-01-01
The five goals of the Human Exploration and Development of Space include: 1) Explore the Space Frontier; 2) Expand Scientific Knowledge; 3) Enable Humans to Live and Work Permanently in Space; 4) Enable the Commercial Development of Space; and 5) Share the Experience and Benefits of Discovery.
ERIC Educational Resources Information Center
Carter, Sunshine; Traill, Stacie
2017-01-01
Electronic resource access troubleshooting is familiar work in most libraries. The added complexity introduced when a library implements a web-scale discovery service, however, creates a strong need for well-organized, rigorous training to enable troubleshooting staff to provide the best service possible. This article outlines strategies, tools,…
An integrative model for in-silico clinical-genomics discovery science.
Lussier, Yves A; Sarkar, Indra Nell; Cantor, Michael
2002-01-01
Human Genome discovery research has set the pace for post-genomic discovery research. While post-genomic fields focused on the molecular level are intensively pursued, little effort is being deployed in the later stages of molecular medicine discovery research, such as clinical-genomics. The objective of this study is to demonstrate the relevance and significance of integrating mainstream clinical informatics decision support systems with current bioinformatics genomic discovery science. This paper is a feasibility study of an original model enabling novel "in silico" clinical-genomic discovery science. The model is designed to mediate queries among clinical and genomic knowledge bases with relevant bioinformatic analytic tools (e.g. gene clustering). Briefly, trait-disease-gene relationships were successfully illustrated using QMR, OMIM, SNOMED-RT, GeneCluster and TreeView. The analyses were visualized as two-dimensional dendrograms of clinical observations clustered around genes. To our knowledge, this is the first study using knowledge bases of clinical decision support systems for genomic discovery. Although this study is a proof of principle, it provides a framework for the development of clinical decision-support-system driven, high-throughput clinical-genomic technologies which could potentially unveil significant high-level functions of genes.
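A minimal sketch of the kind of analysis described above, clustering clinical observations by their gene-association profiles into a dendrogram; it uses SciPy rather than the GeneCluster/TreeView tools of the study, and the tiny trait-gene matrix is an invented placeholder.

```python
# Minimal sketch of clustering clinical observations around genes, in the spirit of
# the dendrogram analysis described above; the toy binary trait-gene matrix is invented.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Rows: clinical observations (traits); columns: genes; 1 = documented association.
traits = ["hypotonia", "seizures", "tall stature", "lens dislocation"]
genes = ["GENE_A", "GENE_B", "GENE_C"]          # placeholder gene symbols
matrix = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
], dtype=float)

# Hierarchical clustering of traits by their gene-association profiles.
Z = linkage(matrix, method="average", metric="jaccard")
tree = dendrogram(Z, labels=traits, no_plot=True)
print(tree["ivl"])   # trait order along the dendrogram leaves
```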
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruebel, Oliver
2009-11-20
Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the growing number of data dimensions and data objects presents tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data, and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable, for the first time, measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The MATLAB analysis framework and the visualization have been integrated, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics. To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges, this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams, enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.
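As a small illustration of the query-driven selection that FastBit and VisIt perform at scale over indexed LWFA data, the sketch below applies an equivalent compound range condition with plain NumPy masks on synthetic particle arrays; the thresholds are arbitrary.

```python
# Sketch of the kind of compound range query used to isolate high-energy particle
# beams in LWFA simulation output (FastBit/VisIt perform this at much larger scale
# over indexed data; here plain NumPy masks illustrate the selection logic).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
px = rng.normal(0.0, 1.0, n)       # longitudinal momentum (arbitrary units)
x = rng.uniform(0.0, 100.0, n)     # longitudinal position
y = rng.uniform(-5.0, 5.0, n)      # transverse position

# Compound range condition: energetic particles inside a spatial window.
beam = (px > 3.0) & (x > 60.0) & (np.abs(y) < 1.0)
print(f"{beam.sum()} of {n} particles selected for visualization")
```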
Concept of operations for knowledge discovery from Big Data across enterprise data warehouses
NASA Astrophysics Data System (ADS)
Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.
2013-05-01
The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts face several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.
2013-01-01
Background Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods that assist professional end users in the field of knowledge discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that, for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. Results A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed using Java on the Spring framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, as well as jQuery for interactive user interface operations. Conclusions The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers who are new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework. PMID:23763826
Holzinger, Andreas; Zupan, Mario
2013-06-13
Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods that assist professional end users in the field of knowledge discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that, for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed using Java on the Spring framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, as well as jQuery for interactive user interface operations. The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers who are new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
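KNODWAT wraps WEKA's CART and C4.5 (J48) learners in Java; purely to illustrate the kind of test the framework supports, the Python sketch below runs a CART-style decision tree with cross-validation on a bundled scikit-learn dataset. It is an analogue, not the KNODWAT implementation.

```python
# CART-style decision tree test on an existing dataset, analogous to the testing
# workflow described for KNODWAT (which uses WEKA's CART and J48 in Java).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=4, random_state=0)   # CART-style learner
scores = cross_val_score(tree, X, y, cv=10)                  # 10-fold cross-validation
print(f"mean accuracy: {scores.mean():.3f}")
```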
Label-assisted mass spectrometry for the acceleration of reaction discovery and optimization
NASA Astrophysics Data System (ADS)
Cabrera-Pardo, Jaime R.; Chai, David I.; Liu, Song; Mrksich, Milan; Kozmin, Sergey A.
2013-05-01
The identification of new reactions expands our knowledge of chemical reactivity and enables new synthetic applications. Accelerating the pace of this discovery process remains challenging. We describe a highly effective and simple platform for screening a large number of potential chemical reactions in order to discover and optimize previously unknown catalytic transformations, thereby revealing new chemical reactivity. Our strategy is based on labelling one of the reactants with a polyaromatic chemical tag, which selectively undergoes a photoionization/desorption process upon laser irradiation, without the assistance of an external matrix, and enables rapid mass spectrometric detection of any products originating from such labelled reactants in complex reaction mixtures without any chromatographic separation. This method was successfully used for high-throughput discovery and subsequent optimization of two previously unknown benzannulation reactions.
ERIC Educational Resources Information Center
Polavaram, Sridevi
2016-01-01
Neuroscience can greatly benefit from using novel methods in computer science and informatics, which enable knowledge discovery in unexpected ways. Currently one of the biggest challenges in Neuroscience is to map the functional circuitry of the brain. The applications of this goal range from understanding structural reorganization of neurons to…
EPA's Web Taxonomy is a faceted hierarchical vocabulary used to tag web pages with terms from a controlled vocabulary. Tagging enables search and discovery of EPA's Web-based information assets. EPA's Web Taxonomy is being provided in Simple Knowledge Organization System (SKOS) format. SKOS is a standard for sharing and linking knowledge organization systems that promises to make Federal terminology resources more interoperable.
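A short sketch of consuming a SKOS vocabulary with the rdflib library, printing each concept's preferred label and its broader concepts; the file name is a placeholder, not the actual distribution path of the EPA Web Taxonomy.

```python
# Sketch of reading a SKOS vocabulary with rdflib; the file name is a placeholder.
from rdflib import Graph
from rdflib.namespace import SKOS

g = Graph()
g.parse("epa_web_taxonomy.ttl", format="turtle")   # hypothetical local copy

# List each concept's preferred label and its broader (parent) concepts.
for concept, label in g.subject_objects(SKOS.prefLabel):
    broader = [str(b) for b in g.objects(concept, SKOS.broader)]
    print(str(label), "->", broader)
```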
Chen, Yi-An; Tripathi, Lokesh P; Mizuguchi, Kenji
2016-01-01
Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org. © The Author(s) 2016. Published by Oxford University Press.
Chen, Yi-An; Tripathi, Lokesh P.; Mizuguchi, Kenji
2016-01-01
Data analysis is one of the most critical and challenging steps in drug discovery and disease biology. A user-friendly resource to visualize and analyse high-throughput data provides a powerful medium for both experimental and computational biologists to understand vastly different biological data types and obtain a concise, simplified and meaningful output for better knowledge discovery. We have previously developed TargetMine, an integrated data warehouse optimized for target prioritization. Here we describe how upgraded and newly modelled data types in TargetMine can now survey the wider biological and chemical data space, relevant to drug discovery and development. To enhance the scope of TargetMine from target prioritization to broad-based knowledge discovery, we have also developed a new auxiliary toolkit to assist with data analysis and visualization in TargetMine. This toolkit features interactive data analysis tools to query and analyse the biological data compiled within the TargetMine data warehouse. The enhanced system enables users to discover new hypotheses interactively by performing complicated searches with no programming and obtaining the results in an easy to comprehend output format. Database URL: http://targetmine.mizuguchilab.org PMID:26989145
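TargetMine is built on the InterMine framework, so in principle the generic InterMine Python client can query it; the sketch below is a hypothetical example in that spirit, and the service URL, class name, views and constraint are assumptions rather than verified details of the live TargetMine API.

```python
# Hypothetical InterMine-style query against TargetMine; the endpoint path, views
# and constraint values are illustrative assumptions, not documented API details.
from intermine.webservice import Service

service = Service("https://targetmine.mizuguchilab.org/targetmine/service")  # assumed endpoint
query = service.new_query("Gene")
query.add_view("symbol", "name", "organism.name")
query.add_constraint("symbol", "=", "TP53")            # example gene symbol

for row in query.rows():
    print(row["symbol"], row["name"], row["organism.name"])
```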
Integrative Systems Biology for Data Driven Knowledge Discovery
Greene, Casey S.; Troyanskaya, Olga G.
2015-01-01
Integrative systems biology is an approach that brings together diverse high throughput experiments and databases to gain new insights into biological processes or systems at molecular through physiological levels. These approaches rely on diverse high-throughput experimental techniques that generate heterogeneous data by assaying varying aspects of complex biological processes. Computational approaches are necessary to provide an integrative view of these experimental results and enable data-driven knowledge discovery. Hypotheses generated from these approaches can direct definitive molecular experiments in a cost effective manner. Using integrative systems biology approaches, we can leverage existing biological knowledge and large-scale data to improve our understanding of yet unknown components of a system of interest and how its malfunction leads to disease. PMID:21044756
Automated Knowledge Discovery from Simulators
NASA Technical Reports Server (NTRS)
Burl, Michael C.; DeCoste, D.; Enke, B. L.; Mazzoni, D.; Merline, W. J.; Scharenbroich, L.
2006-01-01
In this paper, we explore one aspect of knowledge discovery from simulators, the landscape characterization problem, where the aim is to identify regions in the input/parameter/model space that lead to a particular output behavior. Large-scale numerical simulators are in widespread use by scientists and engineers across a range of government agencies, academia, and industry; in many cases, simulators provide the only means to examine processes that are infeasible or impossible to study otherwise. However, the cost of simulation studies can be quite high, both in terms of the time and computational resources required to conduct the trials and the manpower needed to sift through the resulting output. Thus, there is strong motivation to develop automated methods that enable more efficient knowledge extraction.
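One common way to attack the landscape characterization problem is to sample the input/parameter space, label each run by whether the output exhibits the behavior of interest, and fit a classifier that maps out the region; the sketch below illustrates this with a toy stand-in simulator, not the authors' method or any specific NASA code.

```python
# Sketch of landscape characterization: sample a (toy) simulator's parameter space,
# label each run by whether the output shows the behavior of interest, and fit a
# classifier that summarizes which input regions produce it. The simulator and
# threshold are stand-ins for a real large-scale code.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def toy_simulator(params):
    """Placeholder for an expensive simulation; returns a scalar output."""
    x, y = params
    return np.sin(3 * x) * np.cos(2 * y) + 0.1 * x

rng = np.random.default_rng(1)
X = rng.uniform(-2.0, 2.0, size=(500, 2))                # sampled input/parameter space
labels = np.array([toy_simulator(p) > 0.5 for p in X])   # behavior-of-interest flag

clf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X, labels)
# The fitted model now acts as a cheap surrogate map of the "interesting" region:
grid = np.array([[0.0, 0.0], [1.5, -0.5]])
print(clf.predict(grid))
```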
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R
2013-01-01
The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts face several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge nurturing data-system architectures.
Closed-Loop Multitarget Optimization for Discovery of New Emulsion Polymerization Recipes
2015-01-01
Self-optimization of chemical reactions enables faster optimization of reaction conditions or discovery of molecules with required target properties. The technology of self-optimization has been expanded to the discovery of new process recipes for the manufacture of complex functional products. A new machine-learning algorithm, specifically designed for multiobjective target optimization with an explicit aim to minimize the number of “expensive” experiments, guides the discovery process. This “black-box” approach assumes no a priori knowledge of the chemical system and hence is particularly suited to rapid development of processes to manufacture specialist low-volume, high-value products. The approach was demonstrated in the discovery of process recipes for a semibatch emulsion copolymerization, targeting a specific particle size and full conversion. PMID:26435638
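The paper's algorithm is a purpose-built multiobjective learner; the sketch below only illustrates the shared closed-loop idea (fit a surrogate to the experiments run so far, choose the most promising next recipe, run it, repeat) using a Gaussian-process surrogate and a synthetic single-objective stand-in for the polymerization experiment.

```python
# Generic closed-loop optimization sketch: surrogate model + acquisition rule.
# The "experiment" here is a synthetic function standing in for a real polymerization;
# this is not the paper's algorithm, only the shared loop structure.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_experiment(recipe):
    """Stand-in for an expensive lab experiment; returns a loss vs. the target."""
    surfactant, initiator = recipe
    return (surfactant - 0.3) ** 2 + (initiator - 0.7) ** 2

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(4, 2))                     # initial recipes
y = np.array([run_experiment(r) for r in X])

for _ in range(10):                                    # 10 "expensive" iterations
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    candidates = rng.uniform(0, 1, size=(500, 2))
    mean, std = gp.predict(candidates, return_std=True)
    nxt = candidates[np.argmin(mean - std)]            # lower-confidence-bound pick
    X = np.vstack([X, nxt])
    y = np.append(y, run_experiment(nxt))

print("best recipe found:", X[np.argmin(y)], "loss:", y.min())
```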
Crowdsourcing Knowledge Discovery and Innovations in Medicine
2014-01-01
Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. “Health hackathons” and “data marathons”, in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled. PMID:25239002
Crowdsourcing knowledge discovery and innovations in medicine.
Celi, Leo Anthony; Ippolito, Andrea; Montgomery, Robert A; Moses, Christopher; Stone, David J
2014-09-19
Clinicians face difficult treatment decisions in contexts that are not well addressed by available evidence as formulated based on research. The digitization of medicine provides an opportunity for clinicians to collaborate with researchers and data scientists on solutions to previously ambiguous and seemingly insolvable questions. But these groups tend to work in isolated environments, and do not communicate or interact effectively. Clinicians are typically buried in the weeds and exigencies of daily practice such that they do not recognize or act on ways to improve knowledge discovery. Researchers may not be able to identify the gaps in clinical knowledge. For data scientists, the main challenge is discerning what is relevant in a domain that is both unfamiliar and complex. Each type of domain expert can contribute skills unavailable to the other groups. "Health hackathons" and "data marathons", in which diverse participants work together, can leverage the current ready availability of digital data to discover new knowledge. Utilizing the complementary skills and expertise of these talented, but functionally divided groups, innovations are formulated at the systems level. As a result, the knowledge discovery process is simultaneously democratized and improved, real problems are solved, cross-disciplinary collaboration is supported, and innovations are enabled.
Enabling knowledge discovery: taxonomy development for NASA
NASA Technical Reports Server (NTRS)
Dutra, J.; Busch, J.
2003-01-01
This white paper provides the background for why it is important to take the next steps with the NASA taxonomy, including testing and validation, XML schema development, and integration with the FirstGov federal search engine, the OneNASA portal and its supporting web content management system.
GalenOWL: Ontology-based drug recommendations discovery
2012-01-01
Background Identification of drug-drug and drug-disease interactions can pose a difficult problem to cope with, as the increasingly large number of available drugs, coupled with the ongoing research activities in the pharmaceutical domain, makes the task of discovering relevant information difficult. Although international standards, such as the ICD-10 classification and the UNII registration, have been developed in order to enable efficient knowledge sharing, medical staff need to be constantly updated in order to effectively discover drug interactions before prescription. The use of Semantic Web technologies has been proposed in earlier works in order to tackle this problem. Results This work presents a semantics-enabled online service, named GalenOWL, capable of offering real-time drug-drug and drug-disease interaction discovery. For enabling this kind of service, medical information and terminology had to be translated to ontological terms and appropriately coupled with medical knowledge of the field. International standards such as the aforementioned ICD-10 and UNII provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base which makes use of the aforementioned standards. Details of the system architecture are presented, along with an outline of the difficulties that had to be overcome. A comparison of the developed ontology-based system with a similar system developed using a traditional business logic rule engine is performed, giving insights on the advantages and drawbacks of both implementations. Conclusions The use of Semantic Web technologies has been found to be a good match for developing drug recommendation systems. Ontologies can effectively encapsulate medical knowledge, and rule-based reasoning can capture and encode the drug interaction knowledge. PMID:23256945
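GalenOWL expresses this knowledge as ontology-backed rules over ICD-10 and UNII codes; as a loose, non-semantic illustration of the same check, the sketch below looks prescriptions up in a plain rule table. All codes and interactions shown are invented placeholders, not clinical facts and not GalenOWL's rule base.

```python
# Plain rule-table illustration of interaction checking over coded terms.
# All codes and interactions are invented placeholders, not clinical facts.
INTERACTION_RULES = {
    ("UNII-AAA111", "UNII-BBB222"): "drug-drug interaction: avoid co-prescription",
    ("UNII-AAA111", "ICD10-E11"): "drug-disease interaction: caution in this condition",
}

def check_prescription(drug_codes, diagnosis_codes):
    """Return all rule-base warnings triggered by a prescription."""
    warnings = []
    codes = list(drug_codes) + list(diagnosis_codes)
    for i, a in enumerate(codes):
        for b in codes[i + 1:]:
            for pair in ((a, b), (b, a)):
                if pair in INTERACTION_RULES:
                    warnings.append((pair, INTERACTION_RULES[pair]))
    return warnings

print(check_prescription(["UNII-AAA111", "UNII-BBB222"], ["ICD10-E11"]))
```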
Lötsch, Jörn; Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred
2017-01-01
Genes causally involved in human insensitivity to pain provide a unique molecular source of studying the pathophysiology of pain and the development of novel analgesic drugs. The increasing availability of “big data” enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 20 genes causally involved in human hereditary insensitivity to pain with the knowledge about the functions of thousands of genes. An integrated computational analysis proposed that among the functions of this set of genes, the processes related to nervous system development and to ceramide and sphingosine signaling pathways are particularly important. This is in line with earlier suggestions to use these pathways as therapeutic target in pain. Following identification of the biological processes characterizing hereditary insensitivity to pain, the biological processes were used for a similarity analysis with the functions of n = 4,834 database-queried drugs. Using emergent self-organizing maps, a cluster of n = 22 drugs was identified sharing important functional features with hereditary insensitivity to pain. Several members of this cluster had been implicated in pain in preclinical experiments. Thus, the present concept of machine-learned knowledge discovery for pain research provides biologically plausible results and seems to be suitable for drug discovery by identifying a narrow choice of repurposing candidates, demonstrating that contemporary machine-learned methods offer innovative approaches to knowledge discovery from available evidence. PMID:28848388
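A minimal sketch of the self-organizing-map step described above, clustering drugs by binary functional-annotation profiles; it uses the third-party MiniSom package rather than the emergent SOM software of the study, and the profiles are random placeholders.

```python
# Sketch of clustering drugs by binary functional-annotation profiles with a
# self-organizing map; uses MiniSom, and the toy profiles are random placeholders.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(3)
drug_profiles = rng.integers(0, 2, size=(200, 50)).astype(float)  # drugs x annotation terms

som = MiniSom(10, 10, input_len=50, sigma=1.5, learning_rate=0.5, random_seed=3)
som.train_random(drug_profiles, num_iteration=5000)

# Drugs mapping to the same (or neighboring) map units form candidate clusters.
units = [som.winner(p) for p in drug_profiles]
print(units[:10])
```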
Text mining patents for biomedical knowledge.
Rodriguez-Esteban, Raul; Bundschus, Markus
2016-06-01
Biomedical text mining of scientific knowledge bases, such as Medline, has received much attention in recent years. Given that text mining is able to automatically extract biomedical facts that revolve around entities such as genes, proteins, and drugs, from unstructured text sources, it is seen as a major enabler to foster biomedical research and drug discovery. In contrast to the biomedical literature, research into the mining of biomedical patents has not reached the same level of maturity. Here, we review existing work and highlight the associated technical challenges that emerge from automatically extracting facts from patents. We conclude by outlining potential future directions in this domain that could help drive biomedical research and drug discovery. Copyright © 2016 Elsevier Ltd. All rights reserved.
Developing integrated crop knowledge networks to advance candidate gene discovery.
Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher
2016-12-01
The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpin traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer to having the basic information, at the gene level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species wheat and barley. We present the global characteristics of such knowledge networks and, with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
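A minimal sketch of the knowledge-network idea using networkx: heterogeneous nodes (genes, traits, orthologs, publications) are linked into one graph whose evidence paths can be traversed for candidate gene discovery. The node names echo the barley WRKY/TTG2 example, but the edges are illustrative and are not data drawn from the Ondex/KnetMiner resources.

```python
# Minimal heterogeneous knowledge-network sketch; edges are invented placeholders.
import networkx as nx

G = nx.Graph()
G.add_edge("HvWRKY_candidate", "seed size QTL", relation="co-locates_with")
G.add_edge("HvWRKY_candidate", "AtTTG2", relation="ortholog_of")
G.add_edge("AtTTG2", "seed development", relation="involved_in")
G.add_edge("AtTTG2", "PMID:placeholder", relation="described_in")

# Evidence paths linking a phenotype of interest to a candidate gene's known biology:
for path in nx.all_simple_paths(G, "seed size QTL", "seed development", cutoff=3):
    print(" -> ".join(path))
```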
Featured Article: Genotation: Actionable knowledge for the scientific reader
Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-01-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge in disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers with a challenge in knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. The self-updating GenoDB contains information from the NCBI Gene, ClinVar, MedGen, dbSNP, KEGG, PharmGKB, UniProt, and HUGO Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy of 0.65 (F-score). GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug–gene relationships, 5981 gene–disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to the concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge of human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement the content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. PMID:26900164
Featured Article: Genotation: Actionable knowledge for the scientific reader.
Nagahawatte, Panduka; Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-06-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge in disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers with a challenge in knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. The self-updating GenoDB contains information from the NCBI Gene, ClinVar, MedGen, dbSNP, KEGG, PharmGKB, UniProt, and HUGO Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy of 0.65 (F-score). GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug-gene relationships, 5981 gene-disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to the concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge of human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement the content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. © 2016 by the Society for Experimental Biology and Medicine.
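Genomine itself is not specified here, so the sketch below only shows the shape of the task: a small dictionary-based gene-symbol tagger and the F-score used to report its accuracy. The symbol list and example sentence are invented.

```python
# Toy gene-symbol tagger plus F-score calculation; not the Genomine algorithm.
import re

KNOWN_SYMBOLS = {"TP53", "BRCA1", "EGFR"}          # tiny placeholder gene dictionary

def tag_gene_symbols(text):
    tokens = re.findall(r"[A-Z0-9]{2,10}", text)   # crude all-caps token candidates
    return {t for t in tokens if t in KNOWN_SYMBOLS}

predicted = tag_gene_symbols("Mutations in TP53 and EGFR were observed; AKT2 was not assayed.")
gold = {"TP53", "EGFR", "AKT2"}

tp = len(predicted & gold)
precision = tp / len(predicted) if predicted else 0.0
recall = tp / len(gold) if gold else 0.0
f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(predicted, round(f_score, 2))
```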
Knowledge Discovery in Chess Using an Aesthetics Approach
ERIC Educational Resources Information Center
Iqbal, Azlan
2012-01-01
Computational aesthetics is a relatively new subfield of artificial intelligence (AI). It includes research that enables computers to "recognize" (and evaluate) beauty in various domains such as visual art, music, and games. Aside from the benefit this gives to humans in terms of creating and appreciating art in these domains, there are perhaps…
NASA Astrophysics Data System (ADS)
Dabiru, L.; O'Hara, C. G.; Shaw, D.; Katragadda, S.; Anderson, D.; Kim, S.; Shrestha, B.; Aanstoos, J.; Frisbie, T.; Policelli, F.; Keblawi, N.
2006-12-01
The Research Project Knowledge Base (RPKB) is currently being designed and will be implemented in a manner that is fully compatible and interoperable with the enterprise architecture tools developed to support NASA's Applied Sciences Program. Through user needs assessment and collaboration with Stennis Space Center, Goddard Space Flight Center, and NASA's DEVELOP staff, insights into information needs for the RPKB were gathered from across NASA scientific communities of practice. To enable efficient, consistent, standard, structured, and managed data entry and research results compilation, a prototype RPKB has been designed and fully integrated with the existing NASA Earth Science Systems Components database. The RPKB will compile research project and keyword information of relevance to the six major science focus areas, the 12 national applications, and the Global Change Master Directory (GCMD). The RPKB will include information about projects awarded from NASA research solicitations, project investigator information, research publications, NASA data products employed, and model or decision support tools used or developed, as well as new data product information. The RPKB will be developed in a multi-tier architecture that will include a SQL Server relational database backend, middleware, and front-end client interfaces for data entry. The purpose of this project is to intelligently harvest the results of research sponsored by the NASA Applied Sciences Program and related research programs. We present various approaches for a wide spectrum of knowledge discovery over research results, publications, projects, etc. from the NASA Systems Components database and global information systems, and show how this is implemented in a SQL Server database. The application of knowledge discovery is useful for intelligent query answering and multiple-layered database construction. Using advanced EA tools such as the Earth Science Architecture Tool (ESAT), the RPKB will enable NASA and partner agencies to efficiently identify significant results for new experiment directions and enable principal investigators to formulate experiment directions for new proposals.
Berler, Alexander; Pavlopoulos, Sotiris; Koutsouris, Dimitris
2005-06-01
The advantages of introducing information and communication technologies into the complex health-care sector are already well known and well stated. It is nevertheless paradoxical that, although the medical community has embraced with satisfaction most of the technological discoveries allowing improvement in patient care, this has not happened with health-care informatics. With this concern in mind, our work proposes an information model for knowledge management (KM) based upon the use of key performance indicators (KPIs) in health-care systems. Based upon the balanced scorecard (BSC) framework (Kaplan/Norton) and quality assurance techniques in health care (Donabedian), this paper proposes a patient-journey-centered approach that drives information flow at all levels of the day-to-day process of delivering effective and managed care, toward information assessment and knowledge discovery. In order to persuade health-care decision-makers to assess the added value of KM tools, those tools should be used to propose new performance measurement and performance management techniques at all levels of a health-care system. The proposed KPIs form a complete set of metrics that enable the performance management of a regional health-care system. In addition, the established performance framework is technically applied through the use of state-of-the-art KM tools such as data warehouses and business intelligence information systems. In that sense, the proposed infrastructure is, technologically speaking, an important KM tool that enables knowledge sharing among various health-care stakeholders and between different health-care groups. The use of the BSC is an enabling framework toward a KM strategy in health care.
Collaborative Web-Enabled GeoAnalytics Applied to OECD Regional Data
NASA Astrophysics Data System (ADS)
Jern, Mikael
Recent advances in web-enabled graphics technologies have the potential to make a dramatic impact on the development of collaborative geovisual analytics (GeoAnalytics). In this paper, tools are introduced that help establish progress initiatives at international and sub-national levels aimed at measuring and collaborating, through statistical indicators, on economic, social and environmental developments, and at engaging both statisticians and the public in such activities. Given the global dimension of such a task, the "dream" of building a repository of progress indicators, where experts and public users can use collaborative GeoAnalytics tools to compare situations for two or more countries, regions or local communities, could be accomplished. While the benefits of GeoAnalytics tools are many, it remains a challenge to adapt these dynamic visual tools to the Internet: for example, dynamic web-enabled animation that enables statisticians to explore temporal, spatial and multivariate demographic data from multiple perspectives, discover interesting relationships, share their incremental discoveries with colleagues and finally communicate selected relevant knowledge to the public. These discoveries often emerge through the diverse backgrounds and experiences of expert domains and are precious in a creative analytics reasoning process. In this context, we introduce a demonstrator, "OECD eXplorer", a customized tool for interactively analyzing and collaborating on gained insights and discoveries, based on a novel story mechanism that captures, re-uses and shares task-related explorative events.
Autonomy enables new science missions
NASA Astrophysics Data System (ADS)
Doyle, Richard J.; Gor, Victoria; Man, Guy K.; Stolorz, Paul E.; Chapman, Clark; Merline, William J.; Stern, Alan
1997-01-01
The challenge of space flight in NASA's future is to enable smaller, more frequent and intensive space exploration at much lower total cost without substantially decreasing mission reliability, capability, or the scientific return on investment. The most effective way to achieve this goal is to build intelligent capabilities into the spacecraft themselves. Our technological vision for meeting the challenge of returning quality science through limited communication bandwidth will actually put scientists in a more direct link with the spacecraft than they have enjoyed to date. Technologies such as pattern recognition and machine learning can place a part of the scientist's awareness onboard the spacecraft to prioritize downlink or to autonomously trigger time-critical follow-up observations (particularly important in flyby missions) without ground interaction. Onboard knowledge discovery methods can be used to include candidate discoveries in each downlink for scientists' scrutiny. Such capabilities will allow scientists to quickly reprioritize missions in a much more intimate and efficient manner than is possible today. Ultimately, new classes of exploration missions will be enabled.
To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure
NASA Astrophysics Data System (ADS)
Stock, Kristin; Stojanovic, Tim; Reitsma, Femke; Ou, Yang; Bishr, Mohamed; Ortmann, Jens; Robertson, Anne
2012-08-01
A geospatial knowledge infrastructure consists of a set of interoperable components, including software, information, hardware, procedures and standards, that work together to support advanced discovery and creation of geoscientific resources, including publications, data sets and web services. The focus of the work presented is the development of such an infrastructure for resource discovery. Advanced resource discovery is intended to support scientists in finding resources that meet their needs, and focuses on representing the semantic details of the scientific resources, including the detailed aspects of the science that led to the resource being created. This paper describes an information model for a geospatial knowledge infrastructure that uses ontologies to represent these semantic details, including knowledge about domain concepts, the scientific elements of the resource (analysis methods, theories and scientific processes) and web services. This semantic information can be used to enable more intelligent search over scientific resources, and to support new ways to infer and visualise scientific knowledge. The work describes the requirements for semantic support of a knowledge infrastructure, and analyses the different options for information storage based on the twin goals of semantic richness and syntactic interoperability to allow communication between different infrastructures. Such interoperability is achieved by the use of open standards, and the architecture of the knowledge infrastructure adopts such standards, particularly from the geospatial community. The paper then describes an information model that uses a range of different types of ontologies, explaining those ontologies and their content. The information model was successfully implemented in a working geospatial knowledge infrastructure, but the evaluation identified some issues in creating the ontologies.
Full, Robert J; Dudley, Robert; Koehl, M A R; Libby, Thomas; Schwab, Cheryl
2015-11-01
Experiencing the thrill of an original scientific discovery can be transformative to students unsure about becoming a scientist, yet few courses offer authentic research experiences. Increasingly, cutting-edge discoveries require an interdisciplinary approach not offered in current departmental-based courses. Here, we describe a one-semester, learning laboratory course on organismal biomechanics offered at our large research university that enables interdisciplinary teams of students from biology and engineering to grow intellectually, collaborate effectively, and make original discoveries. To attain this goal, we avoid traditional "cookbook" laboratories by training 20 students to use a dozen research stations. Teams of five students rotate to a new station each week where a professor, graduate student, and/or team member assists in the use of equipment, guides students through stages of critical thinking, encourages interdisciplinary collaboration, and moves them toward authentic discovery. Weekly discussion sections that involve the entire class offer exchange of discipline-specific knowledge, advice on experimental design, methods of collecting and analyzing data, a statistics primer, and best practices for writing and presenting scientific papers. The building of skills in concert with weekly guided inquiry facilitates original discovery via a final research project that can be presented at a national meeting or published in a scientific journal. © The Author 2015. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
Big data analytics in immunology: a knowledge-based approach.
Zhang, Guang Lan; Sun, Jing; Chitkushev, Lou; Brusic, Vladimir
2014-01-01
With the vast amount of immunological data available, immunology research is entering the big data era. These data vary in granularity, quality, and complexity and are stored in various formats, including publications, technical reports, and databases. The challenge is to make the transition from data to actionable knowledge and wisdom and to bridge the knowledge gap and application gap. We report a knowledge-based approach based on a framework called KB-builder that facilitates data mining by enabling fast development and deployment of web-accessible immunological data knowledge warehouses. Immunological knowledge discovery relies heavily on both the availability of accurate, up-to-date, and well-organized data and the proper analytics tools. We propose the use of knowledge-based approaches by developing knowledgebases combining well-annotated data with specialized analytical tools and integrating them into analytical workflows. A set of well-defined workflow types with rich summarization and visualization capacity facilitates the transformation from data to critical information and knowledge. By using KB-builder, we streamlined the normally time-consuming process of database development. The knowledgebases built using KB-builder will speed up rational vaccine design by providing accurate and well-annotated data coupled with tailored computational analysis tools and workflows.
NASA Astrophysics Data System (ADS)
Stranieri, Andrew; Yearwood, John; Pham, Binh
1999-07-01
The development of data warehouses for the storage and analysis of very large corpora of medical image data represents a significant trend in health care and research. Amongst other benefits, the trend toward warehousing enables the use of techniques for automatically discovering knowledge from large and distributed databases. In this paper, we present an application design in which knowledge discovery from databases (KDD) techniques enhance the performance of the problem-solving strategy known as case-based reasoning (CBR) for the diagnosis of radiological images. The problem of diagnosing abnormality of the cervical spine is used to illustrate the method. The design of a case-based medical image diagnostic support system has three essential characteristics. The first is a case representation that comprises textual descriptions of the image, visual features that are known to be useful for indexing images, and additional visual features to be discovered by data mining many existing images. The second characteristic of the approach presented here involves the development of a case base that comprises an optimal number and distribution of cases. The third characteristic involves the automatic discovery, using KDD techniques, of adaptation knowledge to enhance the performance of the case-based reasoner. Together, the three characteristics of our approach can overcome real-time efficiency obstacles that otherwise militate against applying CBR in the domain of medical image analysis.
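A minimal sketch of the retrieval step in such a case-based reasoner: each stored case pairs a feature vector (combining text-derived and visual features) with its recorded diagnosis, and the nearest cases supply candidate diagnoses for a new image. Features, weights and labels are invented; in the proposed design, the feature weights and adaptation knowledge would come from KDD over the warehouse.

```python
# Sketch of weighted nearest-neighbor case retrieval for a case-based reasoner;
# feature vectors, weights and diagnosis labels are invented placeholders.
import numpy as np

# Case base: feature vector (e.g., alignment, disc-space, density metrics)
# plus the diagnosis recorded for that case.
case_features = np.array([
    [0.90, 0.20, 0.10],
    [0.30, 0.80, 0.70],
    [0.85, 0.25, 0.15],
])
case_labels = ["normal", "degenerative change", "normal"]
weights = np.array([1.0, 2.0, 1.5])        # feature weights, possibly learned by KDD

def retrieve(query, k=2):
    """Return the diagnoses of the k stored cases closest to the query image."""
    d = np.sqrt((((case_features - query) ** 2) * weights).sum(axis=1))
    return [case_labels[i] for i in np.argsort(d)[:k]]

print(retrieve(np.array([0.8, 0.3, 0.2])))
```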
ERIC Educational Resources Information Center
Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.
2000-01-01
These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)
Mathematical modeling for novel cancer drug discovery and development.
Zhang, Ping; Brusic, Vladimir
2014-10-01
Mathematical modeling enables the in silico classification of cancers, the prediction of disease outcomes, the optimization of therapy, the identification of promising drug targets and the prediction of resistance to anticancer drugs. In silico pre-screened drug targets can be validated by a small number of carefully selected experiments. This review discusses the basics of mathematical modeling in cancer drug discovery and development. The topics include the in silico discovery of novel molecular drug targets, the optimization of immunotherapies, personalized medicine and guiding preclinical and clinical trials. Breast cancer has been used to demonstrate the applications of mathematical modeling in cancer diagnostics, the identification of high-risk populations, cancer screening strategies, prediction of tumor growth and guiding cancer treatment. Mathematical models are key components of the toolkit used in the fight against cancer. The combinatorial complexity of new drug discovery is enormous, making systematic drug discovery by experimentation alone difficult, if not impossible. The biggest challenges include the seamless integration of growing data, information and knowledge, and making them available for a multiplicity of analyses. Mathematical models are essential for bringing cancer drug discovery into the era of omics, big data and personalized medicine.
Recent advances in inkjet dispensing technologies: applications in drug discovery.
Zhu, Xiangcheng; Zheng, Qiang; Yang, Hu; Cai, Jin; Huang, Lei; Duan, Yanwen; Xu, Zhinan; Cen, Peilin
2012-09-01
Inkjet dispensing technology is a promising fabrication methodology widely applied in drug discovery. Its automated, programmable characteristics and high-throughput efficiency make this approach potentially very useful in miniaturizing the design patterns for assays and drug screening. Various custom-made inkjet dispensing systems, as well as specialized bio-ink and substrates, have been developed and applied to fulfill the increasing demands of basic drug discovery studies. The incorporation of other modern technologies has further exploited the potential of inkjet dispensing technology in drug discovery and development. This paper reviews and discusses the recent developments and practical applications of inkjet dispensing technology in several areas of drug discovery and development, including fundamental assays of cells and proteins, microarrays, biosensors, tissue engineering, and basic biological and pharmaceutical studies. Progress in a number of areas of research, including biomaterials, inkjet mechanical systems and modern analytical techniques, as well as the exploration and accumulation of profound biological knowledge, has enabled different inkjet dispensing technologies to be developed and adapted for high-throughput pattern fabrication and miniaturization. This in turn presents a great opportunity to propel inkjet dispensing technology into drug discovery.
NASA Astrophysics Data System (ADS)
McGibbney, L. J.; Jiang, Y.; Burgess, A. B.
2017-12-01
Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent knowledge present across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources, e.g. Web pages, Web Services, etc., via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project advances the ESIP collaboration areas of Discovery and Semantic Technologies by putting graph information right at our fingertips in an interactive, modern manner and by reducing the effort of constructing ontologies. To demonstrate the ESKG concept, we demonstrate the use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.
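A minimal, hedged sketch of the graph-building and traversal idea (the triples below are invented stand-ins for what ESKG extracts with NLP; this is not ESKG's schema or code):

```python
import networkx as nx

# Triples that in an ESKG-style system would come from NLP/knowledge
# extraction over agency web pages and service descriptions
# (hypothetical examples only).
triples = [
    ("MODIS", "observes", "sea surface temperature"),
    ("sea surface temperature", "relevant_to", "El Nino"),
    ("PO.DAAC", "distributes", "sea surface temperature"),
    ("NOAA EORES", "tracks_requirement", "sea surface temperature"),
]

g = nx.DiGraph()
for subj, rel, obj in triples:
    g.add_edge(subj, obj, relation=rel)

# Graph traversal as a simple form of data discovery: which entities can
# reach a phenomenon of interest within two hops?
print(nx.single_source_shortest_path_length(g.reverse(), "El Nino", cutoff=2))
```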
Relating the "mirrorness" of mirror neurons to their origins.
Kilner, James M; Friston, Karl J
2014-04-01
Ever since their discovery, mirror neurons have generated much interest and debate. A commonly held view of mirror neuron function is that they transform "visual information into knowledge," thus enabling action understanding and non-verbal social communication between conspecifics (Rizzolatti & Craighero 2004). This functionality is thought to be so important that it has been argued that mirror neurons must be a result of selective pressure.
Translational biomarkers: from discovery and development to clinical practice.
Subramanyam, Meena; Goyal, Jaya
The refinement of disease taxonomy utilizing molecular phenotypes has led to significant improvements in the precision of disease diagnosis and customization of treatment options. This has also spurred efforts to identify novel biomarkers to understand the impact of therapeutically altering the underlying molecular network on disease course, and to support decision-making in drug discovery and development. However, gaps in knowledge regarding disease heterogeneity, combined with the inadequacies of surrogate disease model systems, make it challenging to demonstrate the unequivocal association of molecular and physiological biomarkers to disease pathology. This article will discuss the current landscape in biomarker research and highlight strategies being adopted to increase the likelihood of transitioning biomarkers from discovery to medical practice, to enable more objective decision-making, and to improve health outcomes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Williams, Kevin; Bilsland, Elizabeth; Sparkes, Andrew; Aubrey, Wayne; Young, Michael; Soldatova, Larisa N; De Grave, Kurt; Ramon, Jan; de Clare, Michaela; Sirawaraporn, Worachart; Oliver, Stephen G; King, Ross D
2015-03-06
There is an urgent need to make drug discovery cheaper and faster. This will enable the development of treatments for diseases currently neglected for economic reasons, such as tropical and orphan diseases, and generally increase the supply of new drugs. Here, we report the Robot Scientist 'Eve' designed to make drug discovery more economical. A Robot Scientist is a laboratory automation system that uses artificial intelligence (AI) techniques to discover scientific knowledge through cycles of experimentation. Eve integrates and automates library-screening, hit-confirmation, and lead generation through cycles of quantitative structure activity relationship learning and testing. Using econometric modelling we demonstrate that the use of AI to select compounds economically outperforms standard drug screening. For further efficiency Eve uses a standardized form of assay to compute Boolean functions of compound properties. These assays can be quickly and cheaply engineered using synthetic biology, enabling more targets to be assayed for a given budget. Eve has repositioned several drugs against specific targets in parasites that cause tropical diseases. One validated discovery is that the anti-cancer compound TNP-470 is a potent inhibitor of dihydrofolate reductase from the malaria-causing parasite Plasmodium vivax.
Semantically-enabled Knowledge Discovery in the Deep Carbon Observatory
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.; Ma, X.; Erickson, J. S.; West, P.; Fox, P. A.
2013-12-01
The Deep Carbon Observatory (DCO) is a decadal effort aimed at transforming scientific and public understanding of carbon in the complex deep earth system from the perspectives of Deep Energy, Deep Life, Extreme Physics and Chemistry, and Reservoirs and Fluxes. Over the course of the decade DCO scientific activities will generate a massive volume of data across a variety of disciplines, presenting significant challenges in terms of data integration, management, analysis and visualization, and ultimately limiting the ability of scientists across disciplines to generate insights and unlock new knowledge. The DCO Data Science Team (DCO-DS) is applying Semantic Web methodologies to construct a knowledge representation focused on the DCO Earth science disciplines, and to use it together with other technologies (e.g. natural language processing and data mining) to create a more expressive representation of the distributed corpus of DCO artifacts, including datasets, metadata, instruments, sensors, platforms, deployments, researchers, organizations, funding agencies, grants and various awards. The embodiment of this knowledge representation is the DCO Data Science Infrastructure, in which unique entities within the DCO domain and the relations between them are recognized and explicitly identified. The DCO-DS Infrastructure will serve as a platform for more efficient and reliable searching, discovery, access, and publication of information and knowledge for the DCO scientific community and beyond.
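A hedged sketch of representing such artifacts as linked data with rdflib; the namespace, classes, and properties below are placeholders, not the actual DCO-DS vocabulary:

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace standing in for the DCO-DS vocabulary.
DCO = Namespace("http://example.org/dco/")

g = Graph()
g.add((DCO.dataset42, RDF.type, DCO.Dataset))
g.add((DCO.dataset42, DCO.title, Literal("Deep Life borehole fluid chemistry")))
g.add((DCO.dataset42, DCO.producedBy, DCO.instrument7))
g.add((DCO.instrument7, RDF.type, DCO.MassSpectrometer))
g.add((DCO.dataset42, DCO.contributor, DCO.researcherJane))

# SPARQL query over the explicit relations: datasets and the instruments
# that produced them.
q = """
SELECT ?ds ?inst WHERE {
  ?ds a <http://example.org/dco/Dataset> ;
      <http://example.org/dco/producedBy> ?inst .
}
"""
for row in g.query(q):
    print(row.ds, row.inst)
```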
Suram, Santosh K.; Xue, Yexiang; Bai, Junwen; ...
2016-11-21
Rapid construction of phase diagrams is a central tenet of combinatorial materials science, with accelerated materials discovery efforts often hampered by challenges in interpreting combinatorial X-ray diffraction data sets, which we address by developing AgileFD, an artificial intelligence algorithm that enables rapid phase mapping from a combinatorial library of X-ray diffraction patterns. AgileFD models alloying-based peak shifting through a novel expansion of convolutional nonnegative matrix factorization, which not only improves the identification of constituent phases but also maps their concentration and lattice parameter as a function of composition. By incorporating Gibbs' phase rule into the algorithm, physically meaningful phase maps are obtained with unsupervised operation, and more refined solutions are attained by injecting expert knowledge of the system. The algorithm is demonstrated through investigation of the V–Mn–Nb oxide system, where decomposition of eight oxide phases, including two with substantial alloying, provides the first phase map for this pseudoternary system. This phase map enables interpretation of high-throughput band gap data, leading to the discovery of new solar light absorbers and the alloying-based tuning of the direct-allowed band gap energy of MnV2O6. Lastly, the open-source family of AgileFD algorithms can be implemented into a broad range of high-throughput workflows to accelerate materials discovery.
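AgileFD itself extends convolutional non-negative matrix factorization with peak-shift modelling and Gibbs-rule constraints; as a rough, hedged illustration of only the underlying factorization step (synthetic data, plain NMF from scikit-learn, not the paper's algorithm):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Synthetic stand-in for a combinatorial XRD library: 50 diffraction
# patterns, each a 200-bin intensity profile built from non-negative
# mixtures of 3 hidden "phase" basis patterns plus noise.
basis = np.abs(rng.normal(size=(3, 200)))
weights = np.abs(rng.normal(size=(50, 3)))
patterns = weights @ basis + 0.01 * np.abs(rng.normal(size=(50, 200)))

# Factorize: W gives per-sample phase fractions, H the basis patterns.
model = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(patterns)
H = model.components_
print(W.shape, H.shape)  # (50, 3) (3, 200)
```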
Object-graphs for context-aware visual category discovery.
Lee, Yong Jae; Grauman, Kristen
2012-02-01
How can knowing about some categories help us to discover new ones in unlabeled images? Unsupervised visual category discovery is useful to mine for recurring objects without human supervision, but existing methods assume no prior information and thus tend to perform poorly for cluttered scenes with multiple objects. We propose to leverage knowledge about previously learned categories to enable more accurate discovery, and address challenges in estimating their familiarity in unsegmented, unlabeled images. We introduce two variants of a novel object-graph descriptor to encode the 2D and 3D spatial layout of object-level co-occurrence patterns relative to an unfamiliar region and show that by using them to model the interaction between an image’s known and unknown objects, we can better detect new visual categories. Rather than mine for all categories from scratch, our method identifies new objects while drawing on useful cues from familiar ones. We evaluate our approach on several benchmark data sets and demonstrate clear improvements in discovery over conventional purely appearance-based baselines.
BioTextQuest: a web-based biomedical text mining suite for concept discovery.
Papanikolaou, Nikolas; Pafilis, Evangelos; Nikolaou, Stavros; Ouzounis, Christos A; Iliopoulos, Ioannis; Promponas, Vasilis J
2011-12-01
BioTextQuest combines automated discovery of significant terms in article clusters with structured knowledge annotation, via Named Entity Recognition services, offering interactive, user-friendly visualization. The terms labeling each document cluster are illustrated as a tag cloud and semantically annotated according to the biological entity they represent, and a list of document titles enables users to simultaneously compare the terms and documents of each cluster, facilitating concept association and hypothesis generation. BioTextQuest allows customization of analysis parameters, e.g. clustering/stemming algorithms and exclusion of documents/significant terms, to better match the biological question addressed. Availability: http://biotextquest.biol.ucy.ac.cy. Contact: vprobon@ucy.ac.cy; iliopj@med.uoc.gr. Supplementary data are available at Bioinformatics online.
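A hedged sketch of the generic pipeline behind such a suite, clustering documents and listing each cluster's highest-weighted terms (toy abstracts and scikit-learn; not BioTextQuest's implementation):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "p53 mutation in lung cancer tumours",
    "tumour suppressor p53 pathway analysis",
    "antibiotic resistance genes in E. coli",
    "beta-lactamase resistance in bacterial isolates",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# For each cluster, list the highest-weighted terms (a crude tag cloud).
terms = vec.get_feature_names_out()
for c in range(2):
    centre = km.cluster_centers_[c]
    top = centre.argsort()[::-1][:3]
    print(c, [terms[i] for i in top])
```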
Brancaccio, Rosario N; Robitaille, Alexis; Dutta, Sankhadeep; Cuenin, Cyrille; Santare, Daiga; Skenders, Girts; Leja, Marcis; Fischer, Nicole; Giuliano, Anna R; Rollison, Dana E; Grundhoff, Adam; Tommasino, Massimo; Gheit, Tarik
2018-05-07
With the advent of new molecular tools, the discovery of new papillomaviruses (PVs) has accelerated during the past decade, enabling the expansion of knowledge about the viral populations that inhabit the human body. Human PVs (HPVs) are etiologically linked to benign or malignant lesions of the skin and mucosa. The detection of HPV types can vary widely, depending mainly on the methodology and the quality of the biological sample. Next-generation sequencing is one of the most powerful tools, enabling the discovery of novel viruses in a wide range of biological material. Here, we report a novel protocol for the detection of known and unknown HPV types in human skin and oral gargle samples using improved PCR protocols combined with next-generation sequencing. We identified 105 putative new PV types in addition to 296 known types, thus providing important information about the viral distribution in the oral cavity and skin. Copyright © 2018. Published by Elsevier Inc.
Panacea, a semantic-enabled drug recommendations discovery framework.
Doulaverakis, Charalampos; Nikolaidis, George; Kleontas, Athanasios; Kompatsiaris, Ioannis
2014-03-06
Personalized drug prescription can benefit from intelligent information management and sharing. International standard classifications and terminologies have been developed in order to provide unique and unambiguous information representation. Such standards can be used as the basis of automated decision support systems for providing drug-drug and drug-disease interaction discovery. Additionally, Semantic Web technologies have been proposed in earlier works in order to support such systems. The paper presents Panacea, a semantic framework capable of offering drug-drug and drug-disease interaction discovery. For enabling this kind of service, medical information and terminology had to be translated to ontological terms and be appropriately coupled with medical knowledge of the field. International standard classifications and terminologies provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base which makes use of the aforementioned standards. Representation is based on a lightweight ontology. A layered reasoning approach is implemented, where at the first layer ontological inference is used in order to discover underlying knowledge, while at the second layer a two-step rule selection strategy is followed, resulting in a computationally efficient reasoning approach. Details of the system architecture are presented, along with an outline of the difficulties that had to be overcome. Panacea is evaluated both in terms of quality of recommendations against real clinical data and in terms of performance. The recommendation quality evaluation gave useful insights regarding requirements for real-world deployment and revealed several parameters that affected the recommendation results. Performance-wise, Panacea is compared to GalenOWL, a drug recommendation service previously published by the authors; the comparison presents their differences in modeling and approach to the problem, while also pinpointing the advantages of Panacea. Overall, the paper presents a framework for providing an efficient drug recommendation service where Semantic Web technologies are coupled with traditional business rule engines.
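A minimal, hedged sketch of the rule-over-coded-terminology idea (the codes, rules, and messages are invented for illustration; Panacea itself couples ontological inference with a layered rule engine):

```python
# Hypothetical coded facts: a patient's active drugs and diagnoses,
# expressed with made-up codes standing in for standard terminologies.
patient = {"drugs": {"ATC:B01AA03"},        # warfarin-like anticoagulant
           "conditions": {"ICD:I48"}}        # atrial fibrillation

# Hypothetical rule base: (kind, code_a, code_b, message).
rules = [
    ("drug-drug", "ATC:B01AA03", "ATC:N02BA01",
     "anticoagulant + salicylate: increased bleeding risk"),
    ("drug-disease", "ATC:M01AE01", "ICD:I50",
     "NSAID with heart failure: fluid retention risk"),
]

def check(patient, candidate_drug):
    """Return warnings triggered by adding candidate_drug to the record."""
    warnings = []
    for kind, a, b, msg in rules:
        if kind == "drug-drug" and {a, b} <= patient["drugs"] | {candidate_drug}:
            warnings.append(msg)
        if kind == "drug-disease" and a == candidate_drug and b in patient["conditions"]:
            warnings.append(msg)
    return warnings

print(check(patient, "ATC:N02BA01"))
```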
The lessons of Varsovian's reconnaissance
NASA Technical Reports Server (NTRS)
Bents, D. J.
1990-01-01
The role played by advanced technology is illustrated with respect to the anticipated era of discovery and exploration (in space): how bold new exploration initiatives may or may not be enabled. Enabling technology makes the mission feasible. To be truly enabling, however, the technology must not only render the proposed mission technically feasible, but also make it viable economically; that is, low enough in cost (relative to the economy supporting it) that urgent national need is not required for justification, low enough that risks can be programmatically tolerated. An allegorical parallel is drawn to the Roman Empire of the second century AD, shown to have possessed by that time the necessary knowledge, motivation, means, and technical capability of mounting, through the use of innovative mission planning, an initiative similar to Columbus' voyage. They failed to do so; not because they lacked the vision, but because their technology was not advanced enough to make it an acceptable proposition economically. Speculation, based on the historical perspective, is made on the outcome of contemporary plans for future exploration showing how they will be subjected to the same historical forces, within limits imposed by the state of technology development, that shaped the timing of that previous era of discovery and exploration.
Scientific Knowledge and Technology, Animal Experimentation, and Pharmaceutical Development.
Kinter, Lewis B; DeGeorge, Joseph J
2016-12-01
Human discovery of pharmacologically active substances is arguably the oldest of the biomedical sciences with origins >3500 years ago. Since ancient times, four major transformations have dramatically impacted pharmaceutical development, each driven by advances in scientific knowledge, technology, and/or regulation: (1) anesthesia, analgesia, and antisepsis; (2) medicinal chemistry; (3) regulatory toxicology; and (4) targeted drug discovery. Animal experimentation in pharmaceutical development is a modern phenomenon dating from the 20th century and enabling several of the four transformations. While each transformation resulted in more effective and/or safer pharmaceuticals, overall attrition, cycle time, cost, numbers of animals used, and low probability of success for new products remain concerns, and pharmaceutical development remains a very high risk business proposition. In this manuscript we review pharmaceutical development since ancient times, describe its coevolution with animal experimentation, and attempt to predict the characteristics of future transformations. © The Author 2016. Published by Oxford University Press on behalf of the Institute for Laboratory Animal Research. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Use of model organism and disease databases to support matchmaking for human disease gene discovery.
Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A
2015-10-01
The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and validation of variant-disease causality. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. These public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family within the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. © 2015 WILEY PERIODICALS, INC.
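The Monarch algorithms use ontology-aware semantic similarity; as a hedged stand-in that still illustrates profile matching, simple set overlap between HPO term profiles can rank candidate matches (the profiles below are invented):

```python
def jaccard(a, b):
    """Set-overlap similarity between two phenotype profiles."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented patient and candidate profiles, coded with HPO terms.
patient = ["HP:0001250", "HP:0001263", "HP:0000252"]   # seizures, delay, microcephaly
candidates = {
    "disease_X":   ["HP:0001250", "HP:0000252", "HP:0001249"],
    "mouse_model": ["HP:0001263", "HP:0000707"],
}

ranked = sorted(candidates, key=lambda k: jaccard(patient, candidates[k]), reverse=True)
for name in ranked:
    print(name, round(jaccard(patient, candidates[name]), 2))
```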
Pituitary Medicine From Discovery to Patient-Focused Outcomes
2016-01-01
Context: This perspective traces a pipeline of discovery in pituitary medicine over the past 75 years. Objective: To place past advances in context and predict future changes in understanding pituitary pathophysiology and clinical care. Design: Author's perspective on reports of pituitary advances in the published literature. Setting: Clinical and translational endocrinology. Outcomes: Discovery of the hypothalamic-pituitary axis and of mechanisms for pituitary control has culminated in an exquisite understanding of anterior pituitary cell function and dysfunction. Challenges facing the discipline include a fundamental understanding of pituitary adenoma pathogenesis leading to more effective treatments of inexorably growing and debilitating hormone-secreting pituitary tumors, as well as medical management of non-secreting pituitary adenomas. Newly emerging pituitary syndromes include those associated with immune-targeted cancer therapies and head trauma. Conclusions: Novel diagnostic techniques, including imaging and genomic, proteomic, and biochemical analyses, will yield further knowledge to enable diagnosis of heretofore cryptic syndromes, as well as subclassifications of pituitary syndromes for personalized treatment approaches. Cost-effective personalized approaches to precision therapy must demonstrate value, and will be empowered by multidisciplinary approaches to integrating complex subcellular information to identify therapeutic targets for enabling maximal outcomes. These goals will be challenging to attain given the rarity of pituitary disorders and the difficulty of conducting appropriately powered prospective trials. PMID:26908107
Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C
2016-01-01
Many endangered captive populations exhibit reduced genetic diversity, resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge into the management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms, in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.
Open cyberGIS software for geospatial research and education in the big data era
NASA Astrophysics Data System (ADS)
Wang, Shaowen; Liu, Yan; Padmanabhan, Anand
CyberGIS represents an interdisciplinary field combining advanced cyberinfrastructure, geographic information science and systems (GIS), spatial analysis and modeling, and a number of geospatial domains to improve research productivity and enable scientific breakthroughs. It has emerged as a new generation of GIS that enables unprecedented advances in data-driven knowledge discovery, visualization and visual analytics, and collaborative problem solving and decision-making. This paper describes three open software strategies (open access, open source, and open integration) to serve the various research and education purposes of diverse geospatial communities. These strategies have been implemented in a leading-edge cyberGIS software environment through three corresponding software modalities, CyberGIS Gateway, Toolkit, and Middleware, and have achieved broad and significant impacts.
Knowledge Discovery from Databases: An Introductory Review.
ERIC Educational Resources Information Center
Vickery, Brian
1997-01-01
Introduces new procedures being used to extract knowledge from databases and discusses rationales for developing knowledge discovery methods. Methods are described for such techniques as classification, clustering, and the detection of deviations from pre-established norms. Examines potential uses of knowledge discovery in the information field.…
Discovery: Under the Microscope at Kennedy Space Center
NASA Technical Reports Server (NTRS)
Howard, Philip M.
2013-01-01
The National Aeronautics & Space Administration (NASA) is known for discovery, exploration, and advancement of knowledge. Since the days of Leeuwenhoek, microscopy has been at the forefront of discovery and knowledge. No truer is that statement than today at Kennedy Space Center (KSC), where microscopy plays a major role in contamination identification and is an integral part of failure analysis. Space exploration involves flight hardware undergoing rigorous "visually clean" inspections at every step of processing. The unknown contaminants discovered during these inspections can directly impact the mission by decreasing the performance of sensors and scientific detectors on spacecraft and satellites, acting as micrometeorites, damaging critical sealing surfaces, and causing hazards to the crew of manned missions. This talk will discuss how microscopy has played a major role in all aspects of spaceport operations at KSC. Case studies will highlight years of analysis at the Materials Science Division, including facility and payload contamination for the Navigation Signal Timing and Ranging Global Positioning Satellite (NAVSTAR GPS) missions, quality control monitoring of monomethyl hydrazine fuel procurement for launch vehicle operations, Shuttle Solid Rocket Booster (SRB) foam processing failure analysis, and Space Shuttle Main Engine Cut-off (ECO) flight sensor anomaly analysis. What I hope to share with my fellow microscopists is some of the excitement of microscopy and how its discoveries have led to hardware processing that has helped enable the successful launch of vehicles and space flight missions here at Kennedy Space Center.
Company Profile: Selventa, Inc.
Fryburg, David A; Latino, Louis J; Tagliamonte, John; Kenney, Renee D; Song, Diane H; Levine, Arnold J; de Graaf, David
2012-08-01
Selventa, Inc. (MA, USA) is a biomarker discovery company that enables personalized healthcare. Originally founded as Genstruct, Inc., Selventa has undergone significant evolution from a technology-based service provider to an active partner in the development of diagnostic tests, functioning as a molecular dashboard of disease activity using a unique platform. As part of that evolution, approximately 2 years ago the company was rebranded as Selventa to reflect its new identity and mission. The contributions to biomedical research by Selventa are based on in silico, reverse-engineering methods to determine biological causality. That is, given a set of in vitro or in vivo biological observations, which biological mechanisms can explain the measured results? Facilitated by a large and carefully curated knowledge base, these in silico methods generated new insights into the mechanisms driving a disease. As Selventa's methods would enable biomarker discovery and be directly applicable to generating novel diagnostics, the scientists at Selventa have focused on the development of predictive biomarkers of response in autoimmune and oncologic diseases. Selventa is presently building a portfolio of independent, as well as partnered, biomarker projects with the intention to create diagnostic tests that predict response to therapy.
NASA Technical Reports Server (NTRS)
Bents, D. J.
1990-01-01
The key role played by technology advancement with respect to the anticipated era of discovery and exploration (in space) is illustrated: how bold new initiatives may or may not be enabled. A truly enabling technology not only renders the proposed missions technically feasible, but also makes them viable economically; that is, low enough in cost (relative to the economy supporting them) that urgent national need is not required for justification, low enough in cost that high risk can be programmatically tolerated. A fictional parallel is drawn to the Roman Empire of the second century A.D., shown to have possessed by that time the necessary knowledge, motivation, means, and technical capability of mounting, through the use of innovative mission planning, an initiative similar to Columbus' voyage. They failed to do so because they lacked the advanced technology necessary to make it an acceptable proposition economically. Speculation, based on the historical perspective, is made on the outcome of contemporary plans for future exploration showing how they will be subjected to the same historical forces, within limits imposed by the state of technology development, that shaped the timing of that previous era of discovery and exploration.
Towards a Web-Enabled Geovisualization and Analytics Platform for the Energy and Water Nexus
NASA Astrophysics Data System (ADS)
Sanyal, J.; Chandola, V.; Sorokine, A.; Allen, M.; Berres, A.; Pang, H.; Karthik, R.; Nugent, P.; McManamay, R.; Stewart, R.; Bhaduri, B. L.
2017-12-01
Interactive data analytics are playing an increasingly vital role in the generation of new, critical insights regarding the complex dynamics of the energy/water nexus (EWN) and its interactions with climate variability and change. Integration of impacts, adaptation, and vulnerability (IAV) science with emerging, and increasingly critical, data science capabilities offers promising potential to meet the needs of the EWN community. To enable the exploration of pertinent research questions, a web-based geospatial visualization platform is being built that integrates a data analysis toolbox with advanced data fusion and data visualization capabilities to create a knowledge discovery framework for the EWN. The system, when fully built out, will offer several geospatial visualization capabilities, including statistical visual analytics, clustering, principal-component analysis, and dynamic time warping; support uncertainty visualization and the exploration of data provenance; and support machine learning discoveries to render diverse types of geospatial data and facilitate interactive analysis. Key components in the system architecture include NASA's WebWorldWind, the Globus toolkit, and PostgreSQL, as well as other custom-built software modules.
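One listed capability, dynamic time warping, can be sketched directly with the textbook dynamic-programming recurrence (an illustration only, not the platform's implementation; the series are invented):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Two toy monthly series (e.g., water withdrawal vs. power generation),
# one lagging the other: DTW tolerates the misalignment.
x = [1, 2, 5, 9, 5, 2, 1]
y = [1, 1, 2, 5, 9, 5, 2]
print(dtw_distance(x, y))
```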
The relation between prior knowledge and students' collaborative discovery learning processes
NASA Astrophysics Data System (ADS)
Gijlers, Hannie; de Jong, Ton
2005-03-01
In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction with the environment was logged. Based on students' individual judgments of the truth-value and testability of a series of domain-specific propositions, a detailed description of the knowledge configuration for each dyad was created before they entered the learning environment. Qualitative analyses of two dialogues illustrated that prior knowledge influences the discovery learning processes, and knowledge development in a pair of students. Assessments of student and dyad definitional (domain-specific) knowledge, generic (mathematical and graph) knowledge, and generic (discovery) skills were related to the students' dialogue in different discovery learning processes. Results show that a high level of definitional prior knowledge is positively related to the proportion of communication regarding the interpretation of results. Heterogeneity with respect to generic prior knowledge was positively related to the number of utterances made in the discovery process categories hypotheses generation and experimentation. Results of the qualitative analyses indicated that collaboration between extremely heterogeneous dyads is difficult when the high achiever is not willing to scaffold information and work in the low achiever's zone of proximal development.
The development of health care data warehouses to support data mining.
Lyman, Jason A; Scully, Kenneth; Harrison, James H
2008-03-01
Clinical data warehouses offer tremendous benefits as a foundation for data mining. By serving as a source for comprehensive clinical and demographic information on large patient populations, they streamline knowledge discovery efforts by providing standard and efficient mechanisms to replace time-consuming and expensive original data collection, organization, and processing. Building effective data warehouses requires knowledge of and attention to key issues in database design, data acquisition and processing, and data access and security. In this article, the authors provide an operational and technical definition of data warehouses, present examples of data mining projects enabled by existing data warehouses, and describe key issues and challenges related to warehouse development and implementation.
Information visualisation for science and policy: engaging users and avoiding bias.
McInerny, Greg J; Chen, Min; Freeman, Robin; Gavaghan, David; Meyer, Miriah; Rowland, Francis; Spiegelhalter, David J; Stefaner, Moritz; Tessarolo, Geizi; Hortal, Joaquin
2014-03-01
Visualisations and graphics are fundamental to studying complex subject matter. However, beyond acknowledging this value, scientists and science-policy programmes rarely consider how visualisations can enable discovery, create engaging and robust reporting, or support online resources. Producing accessible and unbiased visualisations from complicated, uncertain data requires expertise and knowledge from science, policy, computing, and design. However, visualisation is rarely found in our scientific training, organisations, or collaborations. As new policy programmes develop [e.g., the Intergovernmental Platform on Biodiversity and Ecosystem Services (IPBES)], we need information visualisation to permeate increasingly both the work of scientists and science policy. The alternative is increased potential for missed discoveries, miscommunications, and, at worst, creating a bias towards the research that is easiest to display. Copyright © 2014 Elsevier Ltd. All rights reserved.
Enabling the Discovery of Gravitational Radiation
NASA Astrophysics Data System (ADS)
Isaacson, Richard
2017-01-01
The discovery of gravitational radiation was announced with the publication of the results of a physics experiment involving over a thousand participants. This was preceded by a century of theoretical work, involving a similarly large group of physicists, mathematicians, and computer scientists. This huge effort was enabled by a substantial commitment of resources, both public and private, to develop the different strands of this complex research enterprise, and to build a community of scientists to carry it out. In the excitement following the discovery, the role of key enablers of this success has not always been adequately recognized in popular accounts. In this talk, I will try to call attention to a few of the key ingredients that proved crucial to enabling the successful discovery of gravitational waves, and the opening of a new field of science.
Information Fusion for Natural and Man-Made Disasters
2007-01-31
comprehensively large, and metaphysically accurate model of situations, through which specific tasks such as situation assessment, knowledge discovery, or the ... significance" is always context specific. Event discovery is a very important element of the HLF process, which can lead to knowledge discovery about ... expected, given the current state of knowledge. Examples of such behavior may include discovery of a new aggregate or situation, a specific pattern of
Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita
2015-07-14
In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.
Introduction to biological complexity as a missing link in drug discovery.
Gintant, Gary A; George, Christopher H
2018-06-06
Despite a burgeoning knowledge of the intricacies and mechanisms responsible for human disease, technological advances in medicinal chemistry, and more efficient assays used for drug screening, it remains difficult to discover novel and effective pharmacologic therapies. Areas covered: By reference to the primary literature and concepts emerging from the academic and industrial drug screening landscapes, the authors propose that this disconnect arises from the inability to scale and integrate responses from simpler model systems to outcomes from more complex, human-based biological systems. Expert opinion: Further collaborative efforts combining target-based and phenotypic-based screening, along with systems-based pharmacology and informatics, will be necessary to harness the technological breakthroughs of today to derive the novel drug candidates of tomorrow. New questions must be asked of enabling technologies, while recognizing their inherent limitations, in a way that moves drug development forward. Attempts to integrate mechanistic and observational information acquired across multiple scales frequently expose the gap between our knowledge and our understanding as the level of complexity increases. We hope that the thoughts and actionable items highlighted here will help to inform the directed evolution of the drug discovery process.
Healthcare applications of knowledge discovery in databases.
DeGruy, K B
2000-01-01
Many healthcare leaders find themselves overwhelmed with data, but lack the information they need to make informed decisions. Knowledge discovery in databases (KDD) can help organizations turn their data into information. KDD is the process of finding complex patterns and relationships in data. The tools and techniques of KDD have achieved impressive results in other industries, and healthcare needs to take advantage of advances in this exciting field. Recent advances in the KDD field have brought it from the realm of research institutions and large corporations to many smaller companies. Software and hardware advances enable small organizations to tap the power of KDD using desktop PCs. KDD has been used extensively for fraud detection and focused marketing. There is a wealth of data available within the healthcare industry that would benefit from the application of KDD tools and techniques. Providers and payers have a vast quantity of data (such as charges and claims), but no effective way to analyze the data to accurately determine relationships and trends. Organizations that take advantage of KDD techniques will find that they offer valuable assistance in the quest to lower healthcare costs while improving healthcare quality.
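As a hedged illustration of the fraud-detection use mentioned above (synthetic claims data, not a production method), unsupervised outlier detection can flag unusual charge patterns:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)

# Synthetic claims: [billed amount, number of procedures] per claim.
normal = np.column_stack([rng.normal(200, 50, 500), rng.poisson(2, 500)])
suspicious = np.array([[5000, 40], [4200, 35]])      # inflated claims
claims = np.vstack([normal, suspicious])

model = IsolationForest(contamination=0.01, random_state=0).fit(claims)
flags = model.predict(claims)        # -1 marks likely outliers
print(np.where(flags == -1)[0])      # indices of flagged claims
```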
A New System To Support Knowledge Discovery: Telemakus.
ERIC Educational Resources Information Center
Revere, Debra; Fuller, Sherrilynne S.; Bugni, Paul F.; Martin, George M.
2003-01-01
The Telemakus System builds on the areas of concept representation, schema theory, and information visualization to enhance knowledge discovery from scientific literature. This article describes the underlying theories and an overview of a working implementation designed to enhance the knowledge discovery process through retrieval, visual and…
Mining Hierarchies and Similarity Clusters from Value Set Repositories.
Peterson, Kevin J; Jiang, Guoqian; Brue, Scott M; Shen, Feichen; Liu, Hongfang
2017-01-01
A value set is a collection of permissible values used to describe a specific conceptual domain for a given purpose. By helping to establish a shared semantic understanding across use cases, these artifacts are important enablers of interoperability and data standardization. As the size of repositories cataloging these value sets expands, knowledge management challenges become more pronounced. Specifically, discovering value sets applicable to a given use case may be challenging in a large repository. In this study, we describe methods to extract implicit relationships between value sets, and we utilize these relationships to overlay organizational structure onto value set repositories. We successfully extract two different structurings, hierarchy and clustering, and show how tooling can leverage these structures to enable more effective value set discovery.
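A hedged sketch of the kind of implicit relationships described, strict containment suggesting hierarchy and Jaccard overlap suggesting similarity clusters (the value sets are invented, not drawn from the study):

```python
def jaccard(a, b):
    return len(a & b) / len(a | b)

# Invented value sets of diagnosis codes.
value_sets = {
    "diabetes_all":       {"E10", "E11", "E13", "O24"},
    "diabetes_type2":     {"E11"},
    "pregnancy_diabetes": {"O24", "E11"},
}

names = list(value_sets)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        sa, sb = value_sets[a], value_sets[b]
        if sa < sb or sb < sa:               # strict containment -> hierarchy edge
            child, parent = (a, b) if sa < sb else (b, a)
            print(f"{child} is-narrower-than {parent}")
        print(f"similarity({a}, {b}) = {jaccard(sa, sb):.2f}")
```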
Fang, Hai; Knezevic, Bogdan; Burnham, Katie L; Knight, Julian C
2016-12-13
Biological interpretation of genomic summary data, such as those resulting from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies, is one of the major bottlenecks in medical genomics research, calling for efficient and integrative tools to resolve this problem. We introduce eXploring Genomic Relations (XGR), an open source tool designed for enhanced interpretation of genomic summary data, enabling downstream knowledge discovery. Targeting users of varying computational skills, XGR utilises prior biological knowledge and relationships in a highly integrated but easily accessible way to make user-input genomic summary datasets more interpretable. We show how, by incorporating ontology, annotation, and systems biology network-driven approaches, XGR generates more informative results than conventional analyses. We apply XGR to GWAS and eQTL summary data to explore the genomic landscape of the activated innate immune response and common immunological diseases. We provide genomic evidence for a disease taxonomy supporting the concept of a disease spectrum from autoimmune to autoinflammatory disorders. We also show how XGR can define SNP-modulated gene networks and pathways that are shared and distinct between diseases, how it achieves functional, phenotypic and epigenomic annotations of genes and variants, and how it enables exploring annotation-based relationships between genetic variants. XGR provides a single integrated solution to enhance interpretation of genomic summary data for downstream biological discovery. XGR is released as both an R package and a web-app, freely available at http://galahad.well.ox.ac.uk/XGR.
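A hedged sketch of the core ontology-enrichment calculation that tools of this kind perform (illustrative numbers, not XGR's API): a hypergeometric test asking whether an input gene list is unexpectedly rich in members of an annotation term:

```python
from scipy.stats import hypergeom

# Illustrative numbers: 20,000 background genes, 300 annotated to a term,
# a 150-gene input list of which 12 carry the annotation.
background, annotated, listed, overlap = 20000, 300, 150, 12

# P(X >= overlap) under sampling without replacement.
p_value = hypergeom.sf(overlap - 1, background, annotated, listed)
fold = (overlap / listed) / (annotated / background)
print(f"fold enrichment = {fold:.1f}, p = {p_value:.2e}")
```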
Research to knowledge: promoting the training of physician-scientists in the biology of pregnancy.
Sadovsky, Yoel; Caughey, Aaron B; DiVito, Michelle; D'Alton, Mary E; Murtha, Amy P
2018-01-01
Common disorders of pregnancy, such as preeclampsia, preterm birth, and fetal growth abnormalities, continue to challenge perinatal biologists seeking insights into disease pathogenesis that will result in better diagnosis, therapy, and disease prevention. These challenges have recently been intensified with discoveries that associate gestational diseases with long-term maternal and neonatal outcomes. Whereas modern high-throughput investigative tools enable scientists and clinicians to noninvasively probe the maternal-fetal genome, epigenome, and other analytes, their implications for clinical medicine remain uncertain. Bridging these knowledge gaps depends on strengthening the existing pool of scientists with expertise in basic, translational, and clinical tools to address pertinent questions in the biology of pregnancy. Although PhD researchers are critical in this quest, physician-scientists would facilitate the inquiry by bringing together clinical challenges and investigative tools, promoting a culture of intellectual curiosity among clinical providers, and helping transform discoveries into relevant knowledge and clinical solutions. Uncertainties related to future administration of health care, federal support for research, attrition of physician-scientists, and an inadequate supply of new scholars may jeopardize our ability to address these challenges. New initiatives are necessary to attract current scholars and future generations of researchers seeking expertise in the scientific method and to support them, through mentorship and guidance, in pursuing a career that combines scientific investigation with clinical medicine. These efforts will promote breadth and depth of inquiry into the biology of pregnancy and enhance the pace of translation of scientific discoveries into better medicine and disease prevention. Copyright © 2017 Elsevier Inc. All rights reserved.
Knowledge Discovery as an Aid to Organizational Creativity.
ERIC Educational Resources Information Center
Siau, Keng
2000-01-01
This article presents the concept of knowledge discovery, a process of searching for associations in large volumes of computer data, as an aid to creativity. It then discusses the various techniques in knowledge discovery. Mednick's associative theory of creative thought serves as the theoretical foundation for this research. (Contains…
2017-06-27
Final report (17-03-2017 to 15-03-2018), contract FA2386-17-1-0102: Advances in Knowledge Discovery and ... Springer; Switzerland. Abstract: The Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) is a leading international conference ... in the areas of knowledge discovery and data mining (KDD). We had three keynote speeches, delivered by Sang Cha from Seoul National University
Belyanskaya, Svetlana L; Ding, Yun; Callahan, James F; Lazaar, Aili L; Israel, David I
2017-05-04
DNA-encoded chemical library technology was developed with the vision of its becoming a transformational platform for drug discovery. The hope was that a new paradigm for the discovery of low-molecular-weight drugs would be enabled by combining the vast molecular diversity achievable with combinatorial chemistry, the information-encoding attributes of DNA, the power of molecular biology, and a streamlined selection-based discovery process. Here, we describe the discovery and early clinical development of GSK2256294, an inhibitor of soluble epoxide hydrolase (sEH, EPHX2), by using encoded-library technology (ELT). GSK2256294 is an orally bioavailable, potent and selective inhibitor of sEH that has a long half life and produced no serious adverse events in a first-time-in-human clinical study. To our knowledge, GSK2256294 is the first molecule discovered from this technology to enter human clinical testing and represents a realization of the vision that DNA-encoded chemical library technology can efficiently yield molecules with favorable properties that can be readily progressed into high-quality drugs. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Relation between Prior Knowledge and Students' Collaborative Discovery Learning Processes
ERIC Educational Resources Information Center
Gijlers, Hannie; de Jong, Ton
2005-01-01
In this study we investigate how prior knowledge influences knowledge development during collaborative discovery learning. Fifteen dyads of students (pre-university education, 15-16 years old) worked on a discovery learning task in the physics field of kinematics. The (face-to-face) communication between students was recorded and the interaction…
Shestakova, M V
2011-01-01
A revolution in knowledge about the structure and the physiological and pathophysiological effects of the renin-angiotensin-aldosterone system (RAAS) took place recently, when it was discovered that local synthesis of all the RAAS components occurs in target organs and their tissues (the heart, kidneys, vessels, and brain). It was found that, besides the classic RAAS acting via activation of angiotensin II (Ang-II) and its receptors, there is an alternative RAAS that opposes the atherogenic potential of Ang-II. Renin and prorenin are shown to have both enzymatic and hormonal activities. A wider understanding has emerged of the extrarenal effects of aldosterone and its non-genomic activity. These discoveries open new opportunities for pharmacological regulation of RAAS activity, enabling more effective correction of the overactivity of this system in organs at risk from the negative impact of Ang-II.
Visualising nursing data using correspondence analysis.
Kokol, Peter; Blažun Vošner, Helena; Železnik, Danica
2016-09-01
Digitally stored, large healthcare datasets enable nurses to use 'big data' techniques and tools in nursing research. Big data is complex and multi-dimensional, so visualisation may be a preferable approach to analysing and understanding it. The aim of this article is to demonstrate the use of visualisation of big data with a technique called correspondence analysis. In the authors' study, relations among data in a nursing dataset were shown visually in graphs using correspondence analysis. The case presented demonstrates that correspondence analysis is easy to use, shows relations between data visually in a form that is simple to interpret, and can reveal hidden associations between data. Correspondence analysis supports the discovery of new knowledge. Implications for practice: knowledge obtained using correspondence analysis can be transferred immediately into practice or used to foster further research.
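Correspondence analysis itself reduces to an SVD of the standardized residuals of a contingency table; a hedged numeric sketch with an invented table (the row and column coordinates are what one would then plot):

```python
import numpy as np

# Invented contingency table: nursing units (rows) x documented issue types (cols).
table = np.array([[30, 10,  5],
                  [10, 25, 15],
                  [ 5, 15, 35]], dtype=float)

P = table / table.sum()
r = P.sum(axis=1)                   # row masses
c = P.sum(axis=0)                   # column masses
S = np.diag(r**-0.5) @ (P - np.outer(r, c)) @ np.diag(c**-0.5)

U, sv, Vt = np.linalg.svd(S, full_matrices=False)
row_coords = np.diag(r**-0.5) @ U * sv        # principal coordinates of rows
col_coords = np.diag(c**-0.5) @ Vt.T * sv     # principal coordinates of columns
print(np.round(row_coords[:, :2], 3))
print(np.round(col_coords[:, :2], 3))
```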
SemaTyP: a knowledge graph based literature mining method for drug discovery.
Sang, Shengtian; Yang, Zhihao; Wang, Lei; Liu, Xiaoxia; Lin, Hongfei; Wang, Jian
2018-05-30
Drug discovery is the process through which potential new medicines are identified. High-throughput screening and computer-aided drug discovery/design are currently the two main drug discovery methods, and they have successfully produced a series of drugs. However, development of new drugs is still an extremely time-consuming and expensive process. Biomedical literature contains important clues for the identification of potential treatments and can support experts in biomedicine on their way towards new discoveries. Here, we propose a biomedical knowledge graph-based drug discovery method called SemaTyP, which discovers candidate drugs for diseases by mining published biomedical literature. We first construct a biomedical knowledge graph from relations extracted from biomedical abstracts; a logistic regression model is then trained on the semantic types of the paths that known drug therapies trace through the knowledge graph; finally, the learned model is used to discover drug therapies for new diseases. The experimental results show that our method can not only effectively discover new drug therapies for new diseases but also suggest the potential mechanism of action of the candidate drugs. In this paper we propose a novel knowledge graph-based literature mining method for drug discovery. It could serve as a supplementary method to current drug discovery approaches.
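The following toy sketch illustrates the path-feature idea described above, under the assumption that each drug-disease pair has already been reduced to counts of the semantic path types connecting it in the knowledge graph; the path types, counts and labels are invented, not SemaTyP's actual features.

```python
# Toy sketch of scoring drug-disease pairs from knowledge-graph path-type counts
# with logistic regression (scikit-learn assumed); all numbers are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each column is a hypothetical semantic path type, e.g.
#   drug -INHIBITS-> gene -ASSOCIATED_WITH-> disease
#   drug -TREATS->   disease2 -SIMILAR_TO->  disease
#   drug -INTERACTS_WITH-> protein -CAUSES-> disease
X_train = np.array([
    [3, 1, 0],   # known therapy
    [2, 2, 1],   # known therapy
    [0, 0, 2],   # random (negative) pair
    [0, 1, 0],   # random (negative) pair
])
y_train = np.array([1, 1, 0, 0])

model = LogisticRegression().fit(X_train, y_train)

# Score a new candidate drug-disease pair by its path-type profile.
candidate = np.array([[2, 0, 1]])
print("predicted therapy probability:", model.predict_proba(candidate)[0, 1])
```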
Aptamer-Based Multiplexed Proteomic Technology for Biomarker Discovery
Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N.; Carter, Jeff; Dalby, Andrew B.; Eaton, Bruce E.; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R.; Kim, Nancy; Koch, Tad H.; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K.; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M.; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I.; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D.; Vrkljan, Mike; Walker, Jeffrey J.; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K.; Wolfson, Alexey; Wolk, Steven K.; Zhang, Chi; Zichi, Dom
2010-01-01
Background The interrogation of proteomes (“proteomics”) in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. Methodology/Principal Findings We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (∼100 fM–1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. Conclusions/Significance We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine. PMID:21165148
Aptamer-based multiplexed proteomic technology for biomarker discovery.
Gold, Larry; Ayers, Deborah; Bertino, Jennifer; Bock, Christopher; Bock, Ashley; Brody, Edward N; Carter, Jeff; Dalby, Andrew B; Eaton, Bruce E; Fitzwater, Tim; Flather, Dylan; Forbes, Ashley; Foreman, Trudi; Fowler, Cate; Gawande, Bharat; Goss, Meredith; Gunn, Magda; Gupta, Shashi; Halladay, Dennis; Heil, Jim; Heilig, Joe; Hicke, Brian; Husar, Gregory; Janjic, Nebojsa; Jarvis, Thale; Jennings, Susan; Katilius, Evaldas; Keeney, Tracy R; Kim, Nancy; Koch, Tad H; Kraemer, Stephan; Kroiss, Luke; Le, Ngan; Levine, Daniel; Lindsey, Wes; Lollo, Bridget; Mayfield, Wes; Mehan, Mike; Mehler, Robert; Nelson, Sally K; Nelson, Michele; Nieuwlandt, Dan; Nikrad, Malti; Ochsner, Urs; Ostroff, Rachel M; Otis, Matt; Parker, Thomas; Pietrasiewicz, Steve; Resnicow, Daniel I; Rohloff, John; Sanders, Glenn; Sattin, Sarah; Schneider, Daniel; Singer, Britta; Stanton, Martin; Sterkel, Alana; Stewart, Alex; Stratford, Suzanne; Vaught, Jonathan D; Vrkljan, Mike; Walker, Jeffrey J; Watrobka, Mike; Waugh, Sheela; Weiss, Allison; Wilcox, Sheri K; Wolfson, Alexey; Wolk, Steven K; Zhang, Chi; Zichi, Dom
2010-12-07
The interrogation of proteomes ("proteomics") in a highly multiplexed and efficient manner remains a coveted and challenging goal in biology and medicine. We present a new aptamer-based proteomic technology for biomarker discovery capable of simultaneously measuring thousands of proteins from small sample volumes (15 µL of serum or plasma). Our current assay measures 813 proteins with low limits of detection (1 pM median), 7 logs of overall dynamic range (~100 fM-1 µM), and 5% median coefficient of variation. This technology is enabled by a new generation of aptamers that contain chemically modified nucleotides, which greatly expand the physicochemical diversity of the large randomized nucleic acid libraries from which the aptamers are selected. Proteins in complex matrices such as plasma are measured with a process that transforms a signature of protein concentrations into a corresponding signature of DNA aptamer concentrations, which is quantified on a DNA microarray. Our assay takes advantage of the dual nature of aptamers as both folded protein-binding entities with defined shapes and unique nucleotide sequences recognizable by specific hybridization probes. To demonstrate the utility of our proteomics biomarker discovery technology, we applied it to a clinical study of chronic kidney disease (CKD). We identified two well known CKD biomarkers as well as an additional 58 potential CKD biomarkers. These results demonstrate the potential utility of our technology to rapidly discover unique protein signatures characteristic of various disease states. We describe a versatile and powerful tool that allows large-scale comparison of proteome profiles among discrete populations. This unbiased and highly multiplexed search engine will enable the discovery of novel biomarkers in a manner that is unencumbered by our incomplete knowledge of biology, thereby helping to advance the next generation of evidence-based medicine.
Realising the knowledge spiral in healthcare: the role of data mining and knowledge management.
Wickramasinghe, Nilmini; Bali, Rajeev K; Gibbons, M Chris; Schaffer, Jonathan
2008-01-01
Knowledge Management (KM) is an emerging business approach aimed at solving current problems, such as competitiveness and the need to innovate, that businesses face today. The case for KM rests on a paradigm shift in the business environment in which knowledge is central to organizational performance. Organizations trying to embrace KM have many tools, techniques and strategies at their disposal. A vital technique in KM is data mining, which enables critical knowledge to be gained from the analysis of large amounts of data and information. The healthcare industry is a very information-rich industry: the collection of data and information permeates most, if not all, areas of the industry; however, healthcare has yet to fully embrace KM, let alone the newly evolving techniques of data mining. In this paper, we demonstrate the broad benefits of data mining and KM to healthcare by highlighting their potential to enable and facilitate superior clinical practice and administrative management. Specifically, we show how data mining can realize the knowledge spiral by effecting the four key transformations identified by Nonaka: turning (1) existing explicit knowledge into new explicit knowledge, (2) existing explicit knowledge into new tacit knowledge, (3) existing tacit knowledge into new explicit knowledge and (4) existing tacit knowledge into new tacit knowledge. This is done through theoretical models that respectively identify the function of the knowledge spiral and the powers of data mining, both exploratory and predictive, in the knowledge discovery process. Our models are then applied to a healthcare data set to demonstrate the potential of this approach as well as its implications for the clinical and administrative aspects of healthcare. Further, we demonstrate how these techniques can help hospitals address the six healthcare quality dimensions identified by the Committee for Quality Healthcare.
Semantically enabling pharmacogenomic data for the realization of personalized medicine
Samwald, Matthias; Coulet, Adrien; Huerga, Iker; Powers, Robert L; Luciano, Joanne S; Freimuth, Robert R; Whipple, Frederick; Pichler, Elgar; Prud’hommeaux, Eric; Dumontier, Michel; Marshall, M Scott
2014-01-01
Understanding how each individual’s genetics and physiology influences pharmaceutical response is crucial to the realization of personalized medicine and the discovery and validation of pharmacogenomic biomarkers is key to its success. However, integration of genotype and phenotype knowledge in medical information systems remains a critical challenge. The inability to easily and accurately integrate the results of biomolecular studies with patients’ medical records and clinical reports prevents us from realizing the full potential of pharmacogenomic knowledge for both drug development and clinical practice. Herein, we describe approaches using Semantic Web technologies, in which pharmacogenomic knowledge relevant to drug development and medical decision support is represented in such a way that it can be efficiently accessed both by software and human experts. We suggest that this approach increases the utility of data, and that such computational technologies will become an essential part of personalized medicine, alongside diagnostics and pharmaceutical products. PMID:22256869
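As a concrete, minimal sketch of such a representation, assuming the rdflib library and an invented example.org vocabulary (not the ontologies used by the authors), a pharmacogenomic statement can be encoded as RDF triples and queried with SPARQL:

```python
# Minimal RDF/SPARQL sketch (rdflib assumed); the namespace, predicates and
# resources are illustrative stand-ins, not a real pharmacogenomics ontology.
from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/pgx/")
g = Graph()

# Assert a pharmacogenomic statement: a variant alters response to a drug.
g.add((EX.CYP2C19_star2, EX.decreasesMetabolismOf, EX.clopidogrel))
g.add((EX.CYP2C19_star2, EX.foundInGene, EX.CYP2C19))
g.add((EX.clopidogrel, EX.hasIndication, Literal("acute coronary syndrome")))

# A decision-support style query: which variants affect this drug?
query = """
PREFIX ex: <http://example.org/pgx/>
SELECT ?variant WHERE { ?variant ex:decreasesMetabolismOf ex:clopidogrel . }
"""
for row in g.query(query):
    print("variant affecting clopidogrel response:", row.variant)
```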
A New Student Performance Analysing System Using Knowledge Discovery in Higher Educational Databases
ERIC Educational Resources Information Center
Guruler, Huseyin; Istanbullu, Ayhan; Karahasan, Mehmet
2010-01-01
Knowledge discovery is a wide ranged process including data mining, which is used to find out meaningful and useful patterns in large amounts of data. In order to explore the factors having impact on the success of university students, knowledge discovery software, called MUSKUP, has been developed and tested on student data. In this system a…
Mamykina, Lena; Heitkemper, Elizabeth M.; Smaldone, Arlene M.; Kukafka, Rita; Cole-Lewis, Heather J.; Davidson, Patricia G.; Mynatt, Elizabeth D.; Cassells, Andrea; Tobin, Jonathan N.; Hripcsak, George
2017-01-01
Objective To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. Materials and methods We conducted an observational qualitative study of discovery with personal data among adults attending a diabetes self-management education (DSME) program that utilized a discovery-based curriculum. The study included observations of class sessions, and interviews and focus groups with the educator and attendees of the program (n = 14). Results The main discovery in diabetes self-management evolved around discovering patterns of association between characteristics of individuals’ activities and changes in their blood glucose levels that the participants referred to as “cause and effect”. This discovery empowered individuals to actively engage in self-management and provided a desired flexibility in selection of personalized self-management strategies. We show that discovery of cause and effect involves four essential phases: (1) feature selection, (2) hypothesis generation, (3) feature evaluation, and (4) goal specification. Further, we identify opportunities to support discovery at each stage with informatics and data visualization solutions by providing assistance with: (1) active manipulation of collected data (e.g., grouping, filtering and side-by-side inspection), (2) hypotheses formulation (e.g., using natural language statements or constructing visual queries), (3) inference evaluation (e.g., through aggregation and visual comparison, and statistical analysis of associations), and (4) translation of discoveries into actionable goals (e.g., tailored selection from computable knowledge sources of effective diabetes self-management behaviors). Discussion The study suggests that discovery of cause and effect in diabetes can be a powerful approach to helping individuals to improve their self-management strategies, and that self-monitoring data can serve as a driving engine for personal discovery that may lead to sustainable behavior changes. Conclusions Enabling personal discovery is a promising new approach to enhancing chronic disease self-management with informatics interventions. PMID:28974460
Mamykina, Lena; Heitkemper, Elizabeth M; Smaldone, Arlene M; Kukafka, Rita; Cole-Lewis, Heather J; Davidson, Patricia G; Mynatt, Elizabeth D; Cassells, Andrea; Tobin, Jonathan N; Hripcsak, George
2017-12-01
To outline new design directions for informatics solutions that facilitate personal discovery with self-monitoring data. We investigate this question in the context of chronic disease self-management with the focus on type 2 diabetes. We conducted an observational qualitative study of discovery with personal data among adults attending a diabetes self-management education (DSME) program that utilized a discovery-based curriculum. The study included observations of class sessions, and interviews and focus groups with the educator and attendees of the program (n = 14). The main discovery in diabetes self-management evolved around discovering patterns of association between characteristics of individuals' activities and changes in their blood glucose levels that the participants referred to as "cause and effect". This discovery empowered individuals to actively engage in self-management and provided a desired flexibility in selection of personalized self-management strategies. We show that discovery of cause and effect involves four essential phases: (1) feature selection, (2) hypothesis generation, (3) feature evaluation, and (4) goal specification. Further, we identify opportunities to support discovery at each stage with informatics and data visualization solutions by providing assistance with: (1) active manipulation of collected data (e.g., grouping, filtering and side-by-side inspection), (2) hypotheses formulation (e.g., using natural language statements or constructing visual queries), (3) inference evaluation (e.g., through aggregation and visual comparison, and statistical analysis of associations), and (4) translation of discoveries into actionable goals (e.g., tailored selection from computable knowledge sources of effective diabetes self-management behaviors). The study suggests that discovery of cause and effect in diabetes can be a powerful approach to helping individuals to improve their self-management strategies, and that self-monitoring data can serve as a driving engine for personal discovery that may lead to sustainable behavior changes. Enabling personal discovery is a promising new approach to enhancing chronic disease self-management with informatics interventions. Copyright © 2017 Elsevier Inc. All rights reserved.
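A minimal sketch of the kind of informatics support described above, assuming a pandas data frame of invented meal-log records: group the self-monitoring data by candidate features, evaluate a "cause and effect" hypothesis, and flag the pattern for an actionable goal.

```python
# Sketch of the discovery loop on self-monitoring data (pandas assumed);
# the meal-log records below are invented for illustration.
import pandas as pd

log = pd.DataFrame({
    "meal_carbs_g":  [15, 60, 20, 75, 30, 80],
    "post_walk":     [True, False, True, False, True, False],
    "glucose_delta": [18, 72, 25, 90, 30, 85],   # mg/dL rise after the meal
})

# Feature selection, hypothesis generation and evaluation:
# "high-carb meals without a walk raise glucose more".
log["high_carb"] = log["meal_carbs_g"] > 45
print(log.groupby(["high_carb", "post_walk"])["glucose_delta"].mean())

# Goal specification: flag meals matching the risky pattern for follow-up.
risky = log[log["high_carb"] & ~log["post_walk"]]
print("meals matching the risky pattern:", len(risky))
```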
Knowledge Discovery in Databases.
ERIC Educational Resources Information Center
Norton, M. Jay
1999-01-01
Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…
Causality discovery technology
NASA Astrophysics Data System (ADS)
Chen, M.; Ertl, T.; Jirotka, M.; Trefethen, A.; Schmidt, A.; Coecke, B.; Bañares-Alcántara, R.
2012-11-01
Causality is the fabric of our dynamic world. We all make frequent attempts to reason about the causal relationships behind everyday events (e.g., what was the cause of my headache, or what has upset Alice?). We attempt to manage causality all the time through planning and scheduling. The greatest scientific discoveries are usually about causality (e.g., Newton found the cause of an apple's fall, and Darwin discovered natural selection). Meanwhile, we continue to seek a comprehensive understanding of the causes of numerous complex phenomena, such as social divisions, economic crises, global warming and home-grown terrorism. Humans analyse and reason about causality based on observation, experimentation and acquired a priori knowledge. Today's technologies enable us to make observations and carry out experiments on an unprecedented scale, which has created data mountains everywhere. While there are exciting opportunities to discover new causal relationships, there are also unparalleled challenges in benefiting from such data mountains. In this article, we present a case for developing a new piece of ICT, called Causality Discovery Technology. We reason about the necessity, feasibility and potential impact of such a technology.
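One small building block of such a technology is testing whether an observed association between two variables disappears once a third is accounted for. The sketch below does this with a partial-correlation check on synthetic data; it is a minimal illustration, not the authors' proposal.

```python
# Conditional-independence check via partial correlation (numpy only);
# the data are synthetic, with Z causing both X and Y.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = -1.5 * z + rng.normal(size=n)

def partial_corr(a, b, control):
    """Correlation between a and b after regressing out the control variable."""
    ra = a - np.polyval(np.polyfit(control, a, 1), control)
    rb = b - np.polyval(np.polyfit(control, b, 1), control)
    return np.corrcoef(ra, rb)[0, 1]

print("corr(X, Y)    :", round(np.corrcoef(x, y)[0, 1], 3))   # strongly negative
print("corr(X, Y | Z):", round(partial_corr(x, y, z), 3))     # close to zero
# A vanishing partial correlation is consistent with Z explaining the X-Y link,
# the kind of evidence constraint-based causal discovery algorithms build on.
```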
Nim, Hieu T; Furtado, Milena B; Costa, Mauro W; Rosenthal, Nadia A; Kitano, Hiroaki; Boyd, Sarah E
2015-05-01
Existing de novo software platforms have largely overlooked a valuable resource: the expertise of the intended biologist users. Typical data representations, such as long gene lists or highly dense and overlapping transcription factor networks, often hinder biologists from relating these results to their expertise. VISIONET, a streamlined visualisation tool built from experimental needs, enables biologists to transform large, dense, overlapping transcription factor networks into sparse, human-readable graphs via numerical filtering. The VISIONET interface allows users without a computing background to interactively explore and filter their data, and empowers them to apply their specialist knowledge to far more complex and substantial data sets than is currently possible. Applying VISIONET to the Tbx20-Gata4 transcription factor network led to the discovery and validation of Aldh1a2, an essential developmental gene associated with various important cardiac disorders, as a healthy adult cardiac fibroblast gene co-regulated by the cardiogenic transcription factors Gata4 and Tbx20. We demonstrate with experimental validations the utility of VISIONET for expertise-driven gene discovery, opening new experimental directions that would not otherwise have been identified.
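The numeric-filtering step can be pictured with a short sketch, assuming networkx and invented edge weights and gene names (Aldh1a2 aside, these are not the study's data): keep only targets whose scores for both transcription factors clear a threshold.

```python
# Sketch of edge-weight filtering of a two-factor network (networkx assumed);
# the binding scores and the non-Aldh1a2 gene names are invented.
import networkx as nx

g = nx.Graph()
for tf, gene, score in [
    ("Gata4", "Aldh1a2", 0.92), ("Tbx20", "Aldh1a2", 0.88),
    ("Gata4", "GeneB",   0.15), ("Tbx20", "GeneB",   0.74),
    ("Gata4", "GeneC",   0.81), ("Tbx20", "GeneC",   0.10),
]:
    g.add_edge(tf, gene, weight=score)

threshold = 0.5
co_regulated = [
    gene for gene in g.nodes
    if gene not in ("Gata4", "Tbx20")
    and g.has_edge("Gata4", gene) and g["Gata4"][gene]["weight"] > threshold
    and g.has_edge("Tbx20", gene) and g["Tbx20"][gene]["weight"] > threshold
]
print("candidate co-regulated targets:", co_regulated)   # ['Aldh1a2']
```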
The discovery of the growth cone and its influence on the study of axon guidance
Tamariz, Elisa; Varela-Echavarría, Alfredo
2015-01-01
For over a century, there has been a great deal of interest in understanding how neural connectivity is established during development and regeneration. Interest in the latter arises from the possibility that knowledge of this process can be used to re-establish lost connections after lesion or neurodegeneration. At the end of the XIX century, Santiago Ramón y Cajal discovered that the distal tip of growing axons contained a structure that he called the growth cone. He proposed that this structure enabled the axon’s oriented growth in response to attractants, now known as chemotropic molecules. He further proposed that the physical properties of the surrounding tissues could influence the growth cone and the direction of growth. This seminal discovery afforded a plausible explanation for directed axonal growth and has led to the discovery of axon guidance mechanisms that include diffusible attractants and repellants and guidance cues anchored to cell membranes or extracellular matrix. In this review the major events in the development of this field are discussed. PMID:26029056
Chemical Informatics and the Drug Discovery Knowledge Pyramid
Lushington, Gerald H.; Dong, Yinghua; Theertham, Bhargav
2012-01-01
The magnitude of the challenges in preclinical drug discovery is evident in the large amount of capital invested in such efforts in pursuit of a small static number of eventually successful marketable therapeutics. An explosion in the availability of potentially drug-like compounds and chemical biology data on these molecules can provide us with the means to improve the eventual success rates for compounds being considered at the preclinical level, but only if the community is able to access available information in an efficient and meaningful way. Thus, chemical database resources are critical to any serious drug discovery effort. This paper explores the basic principles underlying the development and implementation of chemical databases, and examines key issues of how molecular information may be encoded within these databases so as to enhance the likelihood that users will be able to extract meaningful information from data queries. In addition to a broad survey of conventional data representation and query strategies, key enabling technologies such as new context-sensitive chemical similarity measures and chemical cartridges are examined, with recommendations on how such resources may be integrated into a practical database environment. PMID:23782037
BioTextQuest(+): a knowledge integration platform for literature mining and concept discovery.
Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Pafilis, Evangelos; Theodosiou, Theodosios; Schneider, Reinhard; Satagopam, Venkata P; Ouzounis, Christos A; Eliopoulos, Aristides G; Promponas, Vasilis J; Iliopoulos, Ioannis
2014-11-15
The iterative process of finding relevant information in biomedical literature and performing bioinformatics analyses might result in an endless loop for an inexperienced user, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed(®) and related biological databases. Herein, we describe BioTextQuest(+), a web-based interactive knowledge exploration platform with significant advances to its predecessor (BioTextQuest), aiming to bridge processes such as bioentity recognition, functional annotation, document clustering and data integration towards literature mining and concept discovery. BioTextQuest(+) enables PubMed and OMIM querying, retrieval of abstracts related to a targeted request and optimal detection of genes, proteins, molecular functions, pathways and biological processes within the retrieved documents. The front-end interface facilitates the browsing of document clustering per subject, the analysis of term co-occurrence, the generation of tag clouds containing highly represented terms per cluster and at-a-glance popup windows with information about relevant genes and proteins. Moreover, to support experimental research, BioTextQuest(+) addresses integration of its primary functionality with biological repositories and software tools able to deliver further bioinformatics services. The Google-like interface extends beyond simple use by offering a range of advanced parameterization for expert users. We demonstrate the functionality of BioTextQuest(+) through several exemplary research scenarios including author disambiguation, functional term enrichment, knowledge acquisition and concept discovery linking major human diseases, such as obesity and ageing. The service is accessible at http://bioinformatics.med.uoc.gr/biotextquest. g.pavlopoulos@gmail.com or georgios.pavlopoulos@esat.kuleuven.be Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
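The document-clustering step of such a pipeline can be sketched as follows, assuming scikit-learn and a handful of placeholder abstracts rather than real PubMed output:

```python
# Sketch of clustering retrieved abstracts by subject (scikit-learn assumed);
# the four "abstracts" are placeholders, not PubMed results.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

abstracts = [
    "leptin receptor signalling in obesity and energy balance",
    "adipose tissue inflammation drives insulin resistance in obesity",
    "telomere shortening is a hallmark of cellular ageing",
    "senescent cells accumulate in ageing tissues and secrete cytokines",
]

X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for doc, label in zip(abstracts, labels):
    print(f"cluster {label}: {doc[:45]}...")
```

Term co-occurrence counts and per-cluster tag clouds of the highest-weighted terms would be built on top of the same vectorized representation.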
Chemistry of berkelium: A review
NASA Astrophysics Data System (ADS)
Hobart, D. E.; Peterson, J. R.
Element 97 was first produced in December 1949 by the bombardment of americium-241 with accelerated alpha particles. This new element was named berkelium (Bk) after Berkeley, California, the city of its discovery. In the 36 years since the discovery of Bk, a substantial amount of knowledge concerning the physicochemical properties of this relatively scarce transplutonium element has been acquired. All of the Bk isotopes with mass numbers 240 and 242 through 251 are presently known, but only berkelium-249 is available in sufficient quantities for bulk chemical studies. About 0.7 gram of this isotope has been isolated at the HFIR/TRU Complex in Oak Ridge, Tennessee, over the last 18 years. Over the same period, the scale of experimental work using berkelium-249 has increased from the tracer level, to bulk studies at the microgram level, to solution and solid-state investigations with milligram quantities. Extended knowledge of the physicochemical behavior of berkelium is important in its own right, because Bk is the first member of the second half of the actinide series. In addition, such information should enable more accurate extrapolations of the predicted behavior of heavier elements, for which experimental studies are severely limited by lack of material and/or by intense radioactivity.
Computational approaches to predict bacteriophage–host relationships
Edwards, Robert A.; McNair, Katelyn; Faust, Karoline; Raes, Jeroen; Dutilh, Bas E.
2015-01-01
Metagenomics has changed the face of virus discovery by enabling the accurate identification of viral genome sequences without requiring isolation of the viruses. As a result, metagenomic virus discovery leaves the first and most fundamental question about any novel virus unanswered: What host does the virus infect? The diversity of the global virosphere and the volumes of data obtained in metagenomic sequencing projects demand computational tools for virus–host prediction. We focus on bacteriophages (phages, viruses that infect bacteria), the most abundant and diverse group of viruses found in environmental metagenomes. By analyzing 820 phages with annotated hosts, we review and assess the predictive power of in silico phage–host signals. Sequence homology approaches are the most effective at identifying known phage–host pairs. Compositional and abundance-based methods contain significant signal for phage–host classification, providing opportunities for analyzing the unknowns in viral metagenomes. Together, these computational approaches further our knowledge of the interactions between phages and their hosts. Importantly, we find that all reviewed signals significantly link phages to their hosts, illustrating how current knowledge and insights about the interaction mechanisms and ecology of coevolving phages and bacteria can be exploited to predict phage–host relationships, with potential relevance for medical and industrial applications. PMID:26657537
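A compositional signal of the kind assessed above can be sketched in a few lines: compare tetranucleotide frequency profiles of a phage sequence and candidate hosts. The sequences below are short toy strings (real analyses would read whole genomes from FASTA files), so this is only an illustration of the idea.

```python
# Sketch of a composition-based phage-host signal: L1 distance between
# tetranucleotide profiles (numpy assumed); sequences are toy strings.
from itertools import product
import numpy as np

KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {k: i for i, k in enumerate(KMERS)}

def tetra_profile(seq):
    counts = np.zeros(len(KMERS))
    for i in range(len(seq) - 3):
        counts[INDEX[seq[i:i + 4]]] += 1
    return counts / max(counts.sum(), 1.0)

phage  = "ATGCGTACGTTAGCGCGTACGATCGATCGTACGTAGCTAGCTAGGCTA" * 10
host_a = "ATGCGTACGTTAGCGCGTACGATCGATCGTACGAAGCTAGCTAGGCTA" * 10   # similar
host_b = "GGGGCCCCTTTTAAAAGGGGCCCCTTTTAAAAGGGGCCCCTTTTAAAA" * 10   # dissimilar

p = tetra_profile(phage)
for name, genome in [("host_a", host_a), ("host_b", host_b)]:
    dist = np.abs(p - tetra_profile(genome)).sum()
    print(f"{name}: composition distance = {dist:.3f}")
# The smaller distance points to the more plausible host.
```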
A Knowledge Discovery framework for Planetary Defense
NASA Astrophysics Data System (ADS)
Jiang, Y.; Yang, C. P.; Li, Y.; Yu, M.; Bambacus, M.; Seery, B.; Barbee, B.
2016-12-01
Planetary Defense, a project funded by NASA Goddard and the NSF, is a multi-faceted effort focused on the mitigation of Near Earth Object (NEO) threats to our planet. Currently, information concerning NEOs is dispersed among different organizations and scientists, leaving no coherent system of information for efficient NEO mitigation. In this paper, a planetary defense knowledge discovery engine is proposed to better support the development and integration of a NEO response system. Specifically, we have implemented an organized information framework by two means: 1) a semantic knowledge base, which provides a structure for relevant information and is built with web crawling and natural language processing techniques that allow us to collect and store the most relevant structured information on a regular basis; and 2) a knowledge discovery engine, which allows for the efficient retrieval of information from the knowledge base. The knowledge discovery engine has been built on top of Elasticsearch, an open-source full-text search engine, together with cutting-edge machine learning ranking and recommendation algorithms. The proposed framework is expected to advance knowledge discovery and innovation in the planetary science domain.
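As an illustration of the retrieval layer, the sketch below sends a full-text query to Elasticsearch's REST search endpoint; the host, index name (neo_documents) and field names are assumptions, and a populated Elasticsearch instance must already be running for the call to succeed.

```python
# Sketch of a full-text query against an Elasticsearch index via its REST API
# (requests assumed); index and field names are invented, server must be running.
import requests

query = {
    "query": {"match": {"abstract": "near earth object deflection mission"}},
    "size": 5,
}
resp = requests.post(
    "http://localhost:9200/neo_documents/_search",
    json=query,
    timeout=10,
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title", "<untitled>"))
```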
NASA Astrophysics Data System (ADS)
Raschka, Sebastian; Scott, Anne M.; Liu, Nan; Gunturu, Santosh; Huertas, Mar; Li, Weiming; Kuhn, Leslie A.
2018-03-01
While the advantage of screening vast databases of molecules to cover greater molecular diversity is often mentioned, in reality, only a few studies have been published demonstrating inhibitor discovery by screening more than a million compounds for features that mimic a known three-dimensional (3D) ligand. Two factors contribute: the general difficulty of discovering potent inhibitors, and the lack of free, user-friendly software to incorporate project-specific knowledge and user hypotheses into 3D ligand-based screening. The Screenlamp modular toolkit presented here was developed with these needs in mind. We show Screenlamp's ability to screen more than 12 million commercially available molecules and identify potent in vivo inhibitors of a G protein-coupled bile acid receptor within the first year of a discovery project. This pheromone receptor governs sea lamprey reproductive behavior, and to our knowledge, this project is the first to establish the efficacy of computational screening in discovering lead compounds for aquatic invasive species control. Significant enhancement in activity came from selecting compounds based on one of the hypotheses: that matching two distal oxygen groups in the 3D structure of the pheromone is crucial for activity. Six of the 15 most active compounds met these criteria. A second hypothesis—that presence of an alkyl sulfate side chain results in high activity—identified another 6 compounds in the top 10, demonstrating the significant benefits of hypothesis-driven screening.
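A hypothesis-driven filter of the second kind can be sketched with RDKit; the SMILES strings and the alkyl-sulfate SMARTS pattern below are illustrative stand-ins, not Screenlamp's actual queries or chemistry.

```python
# Sketch of substructure filtering for an alkyl sulfate side chain (RDKit
# assumed); the pattern and the candidate molecules are illustrative only.
from rdkit import Chem

alkyl_sulfate = Chem.MolFromSmarts("[CX4]O[SX4](=O)(=O)[OX1-,OX2H1]")

candidates = {
    "decyl sulfate": "CCCCCCCCCCOS(=O)(=O)[O-]",
    "decanol":       "CCCCCCCCCCO",
    "benzene":       "c1ccccc1",
}
for name, smiles in candidates.items():
    mol = Chem.MolFromSmiles(smiles)
    hit = mol is not None and mol.HasSubstructMatch(alkyl_sulfate)
    print(f"{name:14s} matches the alkyl-sulfate hypothesis: {hit}")
```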
Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; ...
2015-07-14
In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG’s comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC’s focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG’s extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.
The discovery of HTLV-1, the first pathogenic human retrovirus.
Coffin, John M
2015-12-22
After the discovery of retroviral reverse transcriptase in 1970, there was a flurry of activity, sparked by the "War on Cancer," to identify human cancer retroviruses. After many false claims resulting from various artifacts, most scientists abandoned the search, but the Gallo laboratory carried on, developing both specific assays and new cell culture methods that enabled them to report, in the accompanying 1980 PNAS paper, identification and partial characterization of human T-cell leukemia virus (HTLV; now known as HTLV-1) produced by a T-cell line from a lymphoma patient. Follow-up studies, including collaboration with the group that first identified a cluster of adult T-cell leukemia (ATL) cases in Japan, provided conclusive evidence that HTLV was the cause of this disease. HTLV-1 is now known to infect at least 4-10 million people worldwide, about 5% of whom will develop ATL. Despite intensive research, knowledge of the viral etiology has not led to improvement in treatment or outcome of ATL. However, the technology for discovery of HTLV and acknowledgment of the existence of pathogenic human retroviruses laid the technical and intellectual foundation for the discovery of the cause of AIDS soon afterward. Without this advance, our ability to diagnose and treat HIV infection most likely would have been long delayed.
Mapping Quantitative Field Resistance Against Apple Scab in a 'Fiesta' x 'Discovery' Progeny.
Liebhard, R; Koller, B; Patocchi, A; Kellerhals, M; Pfammatter, W; Jermini, M; Gessler, C
2003-04-01
Breeding of resistant apple cultivars (Malus x domestica) as a disease management strategy relies on the knowledge and understanding of the underlying genetics. The availability of molecular markers and genetic linkage maps enables the detection and the analysis of major resistance genes as well as of quantitative trait loci (QTL) contributing to the resistance of a genotype. Such a genetic linkage map was constructed, based on a segregating population of the cross between apple cvs. Fiesta (syn. Red Pippin) and Discovery. The progeny was observed for 3 years at three different sites in Switzerland and field resistance against apple scab (Venturia inaequalis) was assessed. Only a weak correlation was detected between leaf scab and fruit scab. A QTL analysis was performed, based on the genetic linkage map consisting of 804 molecular markers and covering all 17 chromosomes of apple. With the maximum likelihood-based interval mapping method, eight genomic regions were identified, six conferring resistance against leaf scab and two conferring fruit scab resistance. Although cv. Discovery showed a much stronger resistance against scab in the field, most QTL identified were attributed to the more susceptible parent 'Fiesta'. This indicated a high degree of homozygosity at the scab resistance loci in 'Discovery', preventing their detection in the progeny due to the lack of segregation.
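As a much-simplified illustration of marker-trait association (not the maximum likelihood interval mapping used in the study), the sketch below scans simulated markers one at a time and reports which genotype classes differ in scab score.

```python
# Simplified single-marker scan on simulated data (numpy/scipy assumed);
# the real analysis used interval mapping over 804 markers.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_progeny, n_markers = 100, 6
genotypes = rng.integers(0, 2, size=(n_progeny, n_markers))   # 0/1 allele classes
# Simulate a resistance locus linked to marker 2.
scab_score = 5.0 - 2.0 * genotypes[:, 2] + rng.normal(scale=1.0, size=n_progeny)

for m in range(n_markers):
    grp0 = scab_score[genotypes[:, m] == 0]
    grp1 = scab_score[genotypes[:, m] == 1]
    t, p = stats.ttest_ind(grp0, grp1)
    print(f"marker {m}: -log10(p) = {-np.log10(p):.2f}")
# The peak at marker 2 is the kind of signal interval mapping then localizes
# between flanking markers.
```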
Current status and future prospects for enabling chemistry technology in the drug discovery process.
Djuric, Stevan W; Hutchins, Charles W; Talaty, Nari N
2016-01-01
This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of "dangerous" reagents. Also featured are advances in the "computer-assisted drug design" area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 49 Transportation 4 2012-10-01 2012-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 4 2011-10-01 2011-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...
Code of Federal Regulations, 2013 CFR
2013-10-01
... 49 Transportation 4 2013-10-01 2013-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...
Code of Federal Regulations, 2014 CFR
2014-10-01
... 49 Transportation 4 2014-10-01 2014-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 4 2010-10-01 2010-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...
Joslin, John; Gilligan, James; Anderson, Paul; Garcia, Catherine; Sharif, Orzala; Hampton, Janice; Cohen, Steven; King, Miranda; Zhou, Bin; Jiang, Shumei; Trussell, Christopher; Dunn, Robert; Fathman, John W; Snead, Jennifer L; Boitano, Anthony E; Nguyen, Tommy; Conner, Michael; Cooke, Mike; Harris, Jennifer; Ainscow, Ed; Zhou, Yingyao; Shaw, Chris; Sipes, Dan; Mainquist, James; Lesley, Scott
2018-05-01
The goal of high-throughput screening is to enable screening of compound libraries in an automated manner to identify quality starting points for optimization. This often involves screening a large diversity of compounds in an assay that preserves a connection to the disease pathology. Phenotypic screening is a powerful tool for drug identification, in that assays can be run without prior understanding of the target and with primary cells that closely mimic the therapeutic setting. Advanced automation and high-content imaging have enabled many complex assays, but these are still relatively slow and low throughput. To address this limitation, we have developed an automated workflow that is dedicated to processing complex phenotypic assays for flow cytometry. The system can achieve a throughput of 50,000 wells per day, resulting in a fully automated platform that enables robust phenotypic drug discovery. Over the past 5 years, this screening system has been used for a variety of drug discovery programs, across many disease areas, with many molecules advancing quickly into preclinical development and into the clinic. This report will highlight a diversity of approaches that automated flow cytometry has enabled for phenotypic drug discovery.
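Downstream of such a platform, per-plate hit calling is often a simple robust-statistics step. The sketch below applies a generic z-score rule to simulated well readouts; it is a convention-level illustration, not the authors' analysis pipeline.

```python
# Sketch of per-plate hit calling on simulated 384-well readouts
# (numpy/pandas assumed); the 3-sigma rule is a generic convention.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
readout = rng.normal(loc=100.0, scale=8.0, size=384)   # % of control activity
readout[[12, 200, 305]] = [35.0, 40.0, 150.0]          # spiked-in "actives"

plate = pd.DataFrame({"well": range(384), "readout": readout})
plate["z"] = (plate["readout"] - plate["readout"].median()) / plate["readout"].std()

hits = plate[plate["z"].abs() > 3]
print(hits)
```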
Knowledge Discovery from Biomedical Ontologies in Cross Domains.
Shen, Feichen; Lee, Yugyung
2016-01-01
In recent years, there has been an increasing demand for the sharing and integration of medical data in biomedical research. Improving a health care system requires supporting data integration by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems because the conceptual models underlying the datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve semantic interoperability between different ontologies. For this purpose, we focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies, and on the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross-domain knowledge discovery from the multiple biological ontologies. The patterns thus make a greater contribution to knowledge discovery across multiple ontologies. We have demonstrated cross-domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross-domain query processing approach SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies.
Knowledge Discovery from Biomedical Ontologies in Cross Domains
Shen, Feichen; Lee, Yugyung
2016-01-01
In recent years, there has been an increasing demand for the sharing and integration of medical data in biomedical research. Improving a health care system requires supporting data integration by facilitating semantic interoperability systems and practices. Semantic interoperability is difficult to achieve in these systems because the conceptual models underlying the datasets are not fully exploited. In this paper, we propose a semantic framework, called Medical Knowledge Discovery and Data Mining (MedKDD), that aims to build a topic hierarchy and serve semantic interoperability between different ontologies. For this purpose, we focus on the discovery of semantic patterns about the association of relations in the heterogeneous information network representing different types of objects and relationships in multiple biological ontologies, and on the creation of a topic hierarchy through the analysis of the discovered patterns. These patterns are used to cluster heterogeneous information networks into a set of smaller topic graphs in a hierarchical manner and then to conduct cross-domain knowledge discovery from the multiple biological ontologies. The patterns thus make a greater contribution to knowledge discovery across multiple ontologies. We have demonstrated cross-domain knowledge discovery in the MedKDD framework using a case study with 9 primary biological ontologies from Bio2RDF and compared it with the cross-domain query processing approach SLAP. We have confirmed the effectiveness of the MedKDD framework in knowledge discovery from multiple medical ontologies. PMID:27548262
Knowledge discovery with classification rules in a cardiovascular dataset.
Podgorelec, Vili; Kokol, Peter; Stiglic, Milojka Molan; Hericko, Marjan; Rozman, Ivan
2005-12-01
In this paper we study an evolutionary machine learning approach to data mining and knowledge discovery based on the induction of classification rules. A method for automatic rules induction called AREX using evolutionary induction of decision trees and automatic programming is introduced. The proposed algorithm is applied to a cardiovascular dataset consisting of different groups of attributes which should possibly reveal the presence of some specific cardiovascular problems in young patients. A case study is presented that shows the use of AREX for the classification of patients and for discovering possible new medical knowledge from the dataset. The defined knowledge discovery loop comprises a medical expert's assessment of induced rules to drive the evolution of rule sets towards more appropriate solutions. The final result is the discovery of a possible new medical knowledge in the field of pediatric cardiology.
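For readers unfamiliar with rule induction, the sketch below shows how readable classification rules fall out of a tree model on invented clinical-style attributes; note that AREX evolves its rules with an evolutionary algorithm, whereas this stand-in uses a greedy scikit-learn decision tree purely for illustration.

```python
# Rule induction illustration with a greedy decision tree (scikit-learn assumed);
# AREX itself uses evolutionary induction, and these data are invented.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical attributes: [age_years, systolic_bp, cholesterol]
X = np.array([
    [10, 105, 150], [12, 110, 160], [15, 140, 240], [14, 150, 260],
    [11, 100, 155], [16, 145, 250], [13, 115, 170], [15, 155, 270],
])
y = np.array([0, 0, 1, 1, 0, 1, 0, 1])   # 1 = cardiovascular problem suspected

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["age_years", "systolic_bp", "cholesterol"]))
```

In a workflow like the one described, rules printed this way would then be reviewed by the medical expert to steer the next round of induction.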
Progress in Biomedical Knowledge Discovery: A 25-year Retrospective
Sacchi, L.
2016-01-01
Summary Objectives We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992, and then now, 25 years later, mainly focused on supervised learning. Methods We performed a rigorous systematic search of PubMed and latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery in and between time periods and compare these trends. We restricted the result set using a bracket of five years previous, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set between 2011 and 2015. This was to reflect the current literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. Results A total 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992-2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Conclusions Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources, the evolution of new algorithmic approaches to knowledge discovery, and we consider from legal, social, and political perspectives possible explanations of the growth of the field. Finally, we reflect on the achievements of the past 25 years to consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods that could be, and are being now developed for the discovery of new knowledge in biomedical data. PMID:27488403
Progress in Biomedical Knowledge Discovery: A 25-year Retrospective.
Sacchi, L; Holmes, J H
2016-08-02
We sought to explore, via a systematic review of the literature, the state of the art of knowledge discovery in biomedical databases as it existed in 1992, and then now, 25 years later, mainly focused on supervised learning. We performed a rigorous systematic search of PubMed and latent Dirichlet allocation to identify themes in the literature and trends in the science of knowledge discovery in and between time periods and compare these trends. We restricted the result set using a bracket of five years previous, such that the 1992 result set was restricted to articles published between 1987 and 1992, and the 2015 set between 2011 and 2015. This was to reflect the current literature available at the time to researchers and others at the target dates of 1992 and 2015. The search term was framed as: Knowledge Discovery OR Data Mining OR Pattern Discovery OR Pattern Recognition, Automated. A total 538 and 18,172 documents were retrieved for 1992 and 2015, respectively. The number and type of data sources increased dramatically over the observation period, primarily due to the advent of electronic clinical systems. The period 1992- 2015 saw the emergence of new areas of research in knowledge discovery, and the refinement and application of machine learning approaches that were nascent or unknown in 1992. Over the 25 years of the observation period, we identified numerous developments that impacted the science of knowledge discovery, including the availability of new forms of data, new machine learning algorithms, and new application domains. Through a bibliometric analysis we examine the striking changes in the availability of highly heterogeneous data resources, the evolution of new algorithmic approaches to knowledge discovery, and we consider from legal, social, and political perspectives possible explanations of the growth of the field. Finally, we reflect on the achievements of the past 25 years to consider what the next 25 years will bring with regard to the availability of even more complex data and to the methods that could be, and are being now developed for the discovery of new knowledge in biomedical data.
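The theme-identification step mentioned above can be sketched with latent Dirichlet allocation from scikit-learn; the four toy "abstracts" stand in for the retrieved PubMed document sets.

```python
# Sketch of LDA topic extraction over a handful of placeholder abstracts
# (scikit-learn assumed); real input would be the retrieved PubMed corpora.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "neural network deep learning electronic health records prediction",
    "electronic health records clinical decision support machine learning",
    "association rule mining expert system knowledge base induction",
    "rule induction expert system pattern recognition knowledge discovery",
]
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"topic {k}: {', '.join(top)}")
```

Comparing the top terms per topic between the 1992 and 2015 corpora is the kind of contrast the retrospective draws on.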
Single-Cell Genomics Unravels Brain Cell-Type Complexity.
Guillaumet-Adkins, Amy; Heyn, Holger
2017-01-01
In terms of the cell types it comprises, the brain is the most complex tissue, to the extent that it is still poorly understood. Single-cell genome and transcriptome profiling make it possible to disentangle neuronal heterogeneity, enabling the categorization of individual neurons into groups with similar molecular signatures. Herein, we review the current state of knowledge in single-cell neurogenomics. We describe the molecular understanding of the cellular architecture of the mammalian nervous system in health and in disease that these state-of-the-art technologies provide, from the discovery of unrecognized cell types to the validation of known ones.
Physiopathology of the cochlear microcirculation.
Shi, Xiaorui
2011-12-01
Normal blood supply to the cochlea is critically important for establishing the endocochlear potential and sustaining production of endolymph. Abnormal cochlear microcirculation has long been considered an etiologic factor in noise-induced hearing loss, age-related hearing loss (presbycusis), sudden hearing loss or vestibular function, and Meniere's disease. Knowledge of the mechanisms underlying the pathophysiology of cochlear microcirculation is of fundamental clinical importance. A better understanding of cochlear blood flow (CoBF) will enable more effective management of hearing disorders resulting from aberrant blood flow. This review focuses on recent discoveries and findings related to the physiopathology of the cochlear microvasculature. Published by Elsevier B.V.
Physiopathology of the Cochlear Microcirculation
Shi, Xiaorui
2011-01-01
Normal blood supply to the cochlea is critically important for establishing the endocochlear potential and sustaining production of endolymph. Abnormal cochlear microcirculation has long been considered an etiologic factor in noise-induced hearing loss, age-related hearing loss (presbycusis), sudden hearing loss or vestibular function, and Meniere's disease. Knowledge of the mechanisms underlying the pathophysiology of cochlear microcirculation is of fundamental clinical importance. A better understanding of cochlear blood flow (CoBF) will enable more effective management of hearing disorders resulting from aberrant blood flow. This review focuses on recent discoveries and findings related to the physiopathology of the cochlear microvasculature. PMID:21875658
Calling on a million minds for community annotation in WikiProteins
Mons, Barend; Ashburner, Michael; Chichester, Christine; van Mulligen, Erik; Weeber, Marc; den Dunnen, Johan; van Ommen, Gert-Jan; Musen, Mark; Cockerill, Matthew; Hermjakob, Henning; Mons, Albert; Packer, Abel; Pacheco, Roberto; Lewis, Suzanna; Berkeley, Alfred; Melton, William; Barris, Nickolas; Wales, Jimmy; Meijssen, Gerard; Moeller, Erik; Roes, Peter Jan; Borner, Katy; Bairoch, Amos
2008-01-01
WikiProteins enables community annotation in a Wiki-based system. Extracts of major data sources have been fused into an editable environment that links out to the original sources. Data from community edits create automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. In addition, indirect associations between concepts have been calculated. We call on a 'million minds' to annotate a 'million concepts' and to collect facts from the literature with the reward of collaborative knowledge discovery. The system is available for beta testing at . PMID:18507872
VAiRoma: A Visual Analytics System for Making Sense of Places, Times, and Events in Roman History.
Cho, Isaac; Dou, Wenwen; Wang, Derek Xiaoyu; Sauda, Eric; Ribarsky, William
2016-01-01
Learning and gaining knowledge of Roman history is an area of interest for students and citizens at large. This is an example of a subject with great sweep (with many interrelated sub-topics over, in this case, a 3,000 year history) that is hard to grasp by any individual and, in its full detail, is not available as a coherent story. In this paper, we propose a visual analytics approach to construct a data driven view of Roman history based on a large collection of Wikipedia articles. Extracting and enabling the discovery of useful knowledge on events, places, times, and their connections from large amounts of textual data has always been a challenging task. To this aim, we introduce VAiRoma, a visual analytics system that couples state-of-the-art text analysis methods with an intuitive visual interface to help users make sense of events, places, times, and more importantly, the relationships between them. VAiRoma goes beyond textual content exploration, as it permits users to compare, make connections, and externalize the findings all within the visual interface. As a result, VAiRoma allows users to learn and create new knowledge regarding Roman history in an informed way. We evaluated VAiRoma with 16 participants through a user study, with the task being to learn about roman piazzas through finding relevant articles and new relationships. Our study results showed that the VAiRoma system enables the participants to find more relevant articles and connections compared to Web searches and literature search conducted in a roman library. Subjective feedback on VAiRoma was also very positive. In addition, we ran two case studies that demonstrate how VAiRoma can be used for deeper analysis, permitting the rapid discovery and analysis of a small number of key documents even when the original collection contains hundreds of thousands of documents.
Communication in Collaborative Discovery Learning
ERIC Educational Resources Information Center
Saab, Nadira; van Joolingen, Wouter R.; van Hout-Wolters, Bernadette H. A. M.
2005-01-01
Background: Constructivist approaches to learning focus on learning environments in which students have the opportunity to construct knowledge themselves, and negotiate this knowledge with others. "Discovery learning" and "collaborative learning" are examples of learning contexts that cater for knowledge construction processes. We introduce a…
Practice-Based Knowledge Discovery for Comparative Effectiveness Research: An Organizing Framework
Lucero, Robert J.; Bakken, Suzanne
2014-01-01
Electronic health information systems can increase the ability of health-care organizations to investigate the effects of clinical interventions. The authors present an organizing framework that integrates outcomes and informatics research paradigms to guide knowledge discovery in electronic clinical databases. They illustrate its application using the example of hospital acquired pressure ulcers (HAPU). The Knowledge Discovery through Informatics for Comparative Effectiveness Research (KDI-CER) framework was conceived as a heuristic to conceptualize study designs and address potential methodological limitations imposed by using a single research perspective. Advances in informatics research can play a complementary role in advancing the field of outcomes research including CER. The KDI-CER framework can be used to facilitate knowledge discovery from routinely collected electronic clinical data. PMID:25278645
Assessment of microbiota:host interactions at the vaginal mucosa interface.
Pruski, Pamela; Lewis, Holly V; Lee, Yun S; Marchesi, Julian R; Bennett, Phillip R; Takats, Zoltan; MacIntyre, David A
2018-04-27
There is increasing appreciation of the role that vaginal microbiota play in health and disease throughout a woman's lifespan. This has been driven partly by molecular techniques that enable detailed identification and characterisation of microbial community structures. However, these methods do not enable assessment of the biochemical and immunological interactions between host and vaginal microbiota involved in pathophysiology. This review examines our current knowledge of the relationships that exist between vaginal microbiota and the host at the level of the vaginal mucosal interface. We also consider methodological approaches to microbiomic, immunologic and metabolic profiling that permit assessment of these interactions. Integration of information derived from these platforms brings the potential for biomarker discovery, disease risk stratification and improved understanding of the mechanisms regulating vaginal microbial community dynamics in health and disease. Copyright © 2018 Elsevier Inc. All rights reserved.
The future of poultry science research: things I think I think.
Taylor, R L
2009-06-01
Much poultry research progress has occurred over the first century of the Poultry Science Association. During that time, specific problems have been solved and much basic biological knowledge has been gained. Scientific discovery has exceeded its integration into foundation concepts. Researchers need to be involved in the public's development of critical thinking skills to enable discernment of fact versus fiction. Academic, government, and private institutions need to hire the best people. Issues of insufficient research funding will be remedied by a combination of strategies rather than by a single cure. Scientific advocacy for poultry-related issues is critical to success. Two other keys to the future are funding for higher-risk projects, whose outcome is truly unknown, and specific allocations for new investigators. Diligent, ongoing efforts by poultry scientists will enable progress beyond the challenges.
Current status and future prospects for enabling chemistry technology in the drug discovery process
Djuric, Stevan W.; Hutchins, Charles W.; Talaty, Nari N.
2016-01-01
This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of “dangerous” reagents. Also featured are advances in the “computer-assisted drug design” area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities. PMID:27781094
Three-Component Reaction Discovery Enabled by Mass Spectrometry of Self-Assembled Monolayers
Montavon, Timothy J.; Li, Jing; Cabrera-Pardo, Jaime R.; Mrksich, Milan; Kozmin, Sergey A.
2011-01-01
Multi-component reactions have been extensively employed in many areas of organic chemistry. Despite significant progress, the discovery of such enabling transformations remains challenging. Here, we present the development of a parallel, label-free reaction-discovery platform, which can be used for identification of new multi-component transformations. Our approach is based on the parallel mass spectrometric screening of interfacial chemical reactions on arrays of self-assembled monolayers. This strategy enabled the identification of a simple organic phosphine that can catalyze a previously unknown condensation of siloxy alkynes, aldehydes and amines to produce 3-hydroxy amides with high efficiency and diastereoselectivity. The reaction was further optimized using solution phase methods. PMID:22169871
Knowledge-based public health situation awareness
NASA Astrophysics Data System (ADS)
Mirhaji, Parsa; Zhang, Jiajie; Srinivasan, Arunkumar; Richesson, Rachel L.; Smith, Jack W.
2004-09-01
There have been numerous efforts to create comprehensive databases from multiple sources to monitor the dynamics of public health and, most specifically, to detect potential threats of bioterrorism before widespread dissemination. However, there is little evidence that these systems are timely and dependable, or that they can reliably distinguish man-made from natural incidents. One must evaluate the value of so-called 'syndromic surveillance systems' along with the costs involved in the design, development, implementation and maintenance of such systems and the costs involved in investigating the inevitable false alarms. In this article we introduce a new perspective on the problem domain, with a paradigm shift from 'surveillance' toward 'awareness'. As we conceptualize a rather different approach to the problem, we introduce a different methodology for applying information science, computer science, cognitive science and human-computer interaction concepts to the design and development of so-called 'public health situation awareness systems'. We share some of our design and implementation concepts for the prototype system under development at the Center for Biosecurity and Public Health Informatics Research at the University of Texas Health Science Center at Houston. The system is based on a knowledgebase containing ontologies with different layers of abstraction, from multiple domains, that provide the context for information integration, knowledge discovery, interactive data mining, information visualization, information sharing and communications. The modular design of the knowledgebase and its knowledge representation formalism enables incremental evolution of the system from a partial system to a comprehensive knowledgebase of 'public health situation awareness' as it acquires new knowledge through interactions with domain experts or automatic discovery of new knowledge.
1994-09-30
relational versus object-oriented DBMS, knowledge discovery, data models, metadata, data filtering, clustering techniques, and synthetic data. A secondary...The first was the investigation of AI/ES applications (knowledge discovery, data mining, and clustering). Here CAST collaborated with Dr. Fred Petry...knowledge discovery system based on clustering techniques; implemented an on-line data browser to the DBMS; completed preliminary efforts to apply object
Integration of cardiac proteome biology and medicine by a specialized knowledgebase.
Zong, Nobel C; Li, Haomin; Li, Hua; Lam, Maggie P Y; Jimenez, Rafael C; Kim, Christina S; Deng, Ning; Kim, Allen K; Choi, Jeong Ho; Zelaya, Ivette; Liem, David; Meyer, David; Odeberg, Jacob; Fang, Caiyun; Lu, Hao-Jie; Xu, Tao; Weiss, James; Duan, Huilong; Uhlen, Mathias; Yates, John R; Apweiler, Rolf; Ge, Junbo; Hermjakob, Henning; Ping, Peipei
2013-10-12
Omics sciences enable a systems-level perspective in characterizing cardiovascular biology. Integration of diverse proteomics data via a computational strategy will catalyze the assembly of contextualized knowledge, foster discoveries through multidisciplinary investigations, and minimize unnecessary redundancy in research efforts. The goal of this project is to develop a consolidated cardiac proteome knowledgebase with novel bioinformatics pipeline and Web portals, thereby serving as a new resource to advance cardiovascular biology and medicine. We created Cardiac Organellar Protein Atlas Knowledgebase (COPaKB; www.HeartProteome.org), a centralized platform of high-quality cardiac proteomic data, bioinformatics tools, and relevant cardiovascular phenotypes. Currently, COPaKB features 8 organellar modules, comprising 4203 LC-MS/MS experiments from human, mouse, drosophila, and Caenorhabditis elegans, as well as expression images of 10,924 proteins in human myocardium. In addition, the Java-coded bioinformatics tools provided by COPaKB enable cardiovascular investigators in all disciplines to retrieve and analyze pertinent organellar protein properties of interest. COPaKB provides an innovative and interactive resource that connects research interests with the new biological discoveries in protein sciences. With an array of intuitive tools in this unified Web server, nonproteomics investigators can conveniently collaborate with proteomics specialists to dissect the molecular signatures of cardiovascular phenotypes.
Girardi, Dominic; Küng, Josef; Kleiser, Raimund; Sonnberger, Michael; Csillag, Doris; Trenkler, Johannes; Holzinger, Andreas
2016-09-01
Established process models for knowledge discovery place the domain expert in a customer-like, supervising role. In the field of biomedical research, it is necessary to move domain experts into the center of this process, with far-reaching consequences for both their research output and the process itself. In this paper, we revise the established process models for knowledge discovery and propose a new process model for domain-expert-driven interactive knowledge discovery. Furthermore, we present a research infrastructure adapted to this new process model and demonstrate how the domain expert can be deeply integrated even into the highly complex data-mining process and data-exploration tasks. We evaluated this approach in the medical domain for the case of cerebral aneurysm research.
Application of Ontologies for Big Earth Data
NASA Astrophysics Data System (ADS)
Huang, T.; Chang, G.; Armstrong, E. M.; Boening, C.
2014-12-01
Connected data is smarter data! Earth Science research infrastructure must do more than support temporal, geospatial discovery of satellite data. As the Earth Science data archives continue to expand across NASA data centers, the research communities are demanding smarter data services. A successful research infrastructure must be able to present researchers the complete picture, that is, datasets with linked citations, related interdisciplinary data, imagery, current events, social media discussions, and scientific data tools that are relevant to the particular dataset. The popular Semantic Web for Earth and Environmental Terminology (SWEET) is a collection of ontologies and concepts designed to improve discovery and application of Earth Science data. The SWEET ontology collection was initially developed to capture the relationships between keywords in the NASA Global Change Master Directory (GCMD). Over the years this collection has expanded to over 200 ontologies and 6,000 concepts, enabling scalable classification of Earth system science and space science concepts. This presentation discusses semantic web technologies as the enabling technology for data-intensive science. We discuss the application of the SWEET ontologies as a critical component in knowledge-driven research infrastructure for several recent projects, including the DARPA Ontological System for Context Artifact and Resources (OSCAR), the 2013 NASA ACCESS Virtual Quality Screening Service (VQSS), and the 2013 NASA Sea Level Change Portal (SLCP). The presentation also discusses the benefits of using semantic web technologies in developing research infrastructure for Big Earth Science Data in an attempt to "accommodate all domains and provide the necessary glue for information to be cross-linked, correlated, and discovered in a semantically rich manner." [1] [1] Savas Parastatidis: A platform for all that we know: creating a knowledge-driven research infrastructure. The Fourth Paradigm 2009: 165-172
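To make the ontology-driven discovery step more tangible, here is a minimal Python sketch that loads one SWEET module with rdflib and lists a few classes. The local filename is an assumption (SWEET modules would need to be downloaded separately), and the query is generic rather than taken from any of the projects above.

```python
# Minimal sketch: load one SWEET ontology module with rdflib and list a few
# classes and labels. The filename below is an assumption, standing in for a
# locally downloaded copy of a SWEET Turtle/OWL module.
from rdflib import Graph, RDFS
from rdflib.namespace import OWL

g = Graph()
g.parse("sweet_phenomena.ttl", format="turtle")  # hypothetical local copy

query = """
SELECT ?cls ?label WHERE {
  ?cls a owl:Class .
  OPTIONAL { ?cls rdfs:label ?label }
} LIMIT 10
"""
for cls, label in g.query(query, initNs={"owl": OWL, "rdfs": RDFS}):
    print(cls, label)
```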
Tenenbaum, Jessica D.; Whetzel, Patricia L.; Anderson, Kent; Borromeo, Charles D.; Dinov, Ivo D.; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R.; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D.; Becich, Michael J.; Ginsburg, Geoffrey S.; Musen, Mark A.; Smith, Kevin A.; Tarantal, Alice F.; Rubin, Daniel L; Lyster, Peter
2010-01-01
The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. PMID:20955817
Knowledge Discovery in Textual Documentation: Qualitative and Quantitative Analyses.
ERIC Educational Resources Information Center
Loh, Stanley; De Oliveira, Jose Palazzo M.; Gastal, Fabio Leite
2001-01-01
Presents an application of knowledge discovery in texts (KDT) concerning medical records of a psychiatric hospital. The approach helps physicians to extract knowledge about patients and diseases that may be used for epidemiological studies, for training professionals, and to support physicians to diagnose and evaluate diseases. (Author/AEF)
Ahmed, Wamiq M; Lenz, Dominik; Liu, Jia; Paul Robinson, J; Ghafoor, Arif
2008-03-01
High-throughput biological imaging uses automated imaging devices to collect a large number of microscopic images for analysis of biological systems and validation of scientific hypotheses. Efficient manipulation of these datasets for knowledge discovery requires high-performance computational resources, efficient storage, and automated tools for extracting and sharing such knowledge among different research sites. Newly emerging grid technologies provide powerful means for exploiting the full potential of these imaging techniques. Efficient utilization of grid resources requires the development of knowledge-based tools and services that combine domain knowledge with analysis algorithms. In this paper, we first investigate how grid infrastructure can facilitate high-throughput biological imaging research, and present an architecture for providing knowledge-based grid services for this field. We identify two levels of knowledge-based services. The first level provides tools for extracting spatiotemporal knowledge from image sets and the second level provides high-level knowledge management and reasoning services. We then present the Cellular Imaging Markup Language, an XML-based language for modeling biological images and representing spatiotemporal knowledge. This scheme can be used for spatiotemporal event composition, matching, and automated knowledge extraction and representation for large biological imaging datasets. We demonstrate the expressive power of this formalism by means of different examples and extensive experimental results.
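The snippet below is purely illustrative of how a spatiotemporal imaging event might be composed as XML from Python. The element and attribute names are hypothetical; they do not reproduce the paper's actual markup-language schema.

```python
# Illustrative only: compose a simple XML description of a spatiotemporal
# imaging event. Element and attribute names here are hypothetical and are
# not taken from the Cellular Imaging Markup Language schema.
import xml.etree.ElementTree as ET

event = ET.Element("event", type="cell_division")
ET.SubElement(event, "time", frame="42", seconds="126.0")
ET.SubElement(event, "location", x="118.5", y="240.2", z="3.0")
ET.SubElement(event, "object", id="cell_017", label="mitotic")

print(ET.tostring(event, encoding="unicode"))
```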
Building Knowledge Graphs for NASA's Earth Science Enterprise
NASA Astrophysics Data System (ADS)
Zhang, J.; Lee, T. J.; Ramachandran, R.; Shi, R.; Bao, Q.; Gatlin, P. N.; Weigel, A. M.; Maskey, M.; Miller, J. J.
2016-12-01
Inspired by Google Knowledge Graph, we have been building a prototype Knowledge Graph for Earth scientists, connecting information and data in NASA's Earth science enterprise. Our primary goal is to advance the state-of-the-art NASA knowledge extraction capability by going beyond traditional catalog search and linking different distributed information (such as data, publications, services, tools and people). This will enable a more efficient pathway to knowledge discovery. While Google Knowledge Graph provides impressive semantic-search and aggregation capabilities, it is limited to search topics for the general public. We use a similar knowledge graph approach to semantically link information gathered from a wide variety of sources within the NASA Earth Science enterprise. Our prototype serves as a proof of concept of the viability of building an operational "knowledge base" system for NASA Earth science. Information is pulled from structured sources (such as the NASA CMR catalog, GCMD, and Climate and Forecast Conventions) and unstructured sources (such as research papers). Leveraging modern techniques of machine learning, information retrieval, and deep learning, we provide an integrated data mining and information discovery environment to help Earth scientists use the best data, tools, methodologies, and models available to answer a hypothesis. Our knowledge graph would be able to answer questions like: Which articles discuss topics investigating similar hypotheses? How have these methods been tested for accuracy? Which approaches have been highly cited within the scientific community? What variables were used for this method and what datasets were used to represent them? What processing was necessary to use this data? These questions then lead researchers and citizen scientists to investigate the sources where data can be found, available user guides, information on how the data was acquired, and available tools and models to use with this data. As a proof of concept, we focus on a well-defined domain - hurricane science - linking research articles and their findings, data, people, and tools/services. Modern information retrieval, natural language processing, machine learning, and deep learning techniques are applied to build the knowledge network.
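A toy sketch of the linking idea follows: a handful of hand-written nodes and labeled edges stand in for harvested catalog and paper metadata, and one of the questions above is answered by walking the graph. This is an assumption-laden illustration, not the NASA prototype.

```python
# Minimal sketch (an assumption, not the NASA prototype): link papers,
# datasets, and tools in a small labeled graph and answer one linkage question.
import networkx as nx

kg = nx.MultiDiGraph()
# Hypothetical nodes/edges standing in for harvested catalog and paper metadata.
kg.add_edge("paper:Smith2015", "dataset:TRMM_3B42", key="uses")
kg.add_edge("paper:Smith2015", "variable:precipitation_rate", key="measures")
kg.add_edge("paper:Lee2016", "dataset:TRMM_3B42", key="uses")
kg.add_edge("dataset:TRMM_3B42", "tool:Giovanni", key="analyzed_with")

# "Which articles used this dataset, and with what tools?"
papers = [u for u, v, k in kg.edges(keys=True)
          if v == "dataset:TRMM_3B42" and k == "uses"]
tools = [v for u, v, k in kg.out_edges("dataset:TRMM_3B42", keys=True)
         if k == "analyzed_with"]
print(papers, tools)
```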
Enabling drug discovery project decisions with integrated computational chemistry and informatics
NASA Astrophysics Data System (ADS)
Tsui, Vickie; Ortwine, Daniel F.; Blaney, Jeffrey M.
2017-03-01
Computational chemistry/informatics scientists and software engineers in Genentech Small Molecule Drug Discovery collaborate with experimental scientists in a therapeutic project-centric environment. Our mission is to enable and improve pre-clinical drug discovery design and decisions. Our goal is to deliver timely data, analysis, and modeling to our therapeutic project teams using best-in-class software tools. We describe our strategy, the organization of our group, and our approaches to reach this goal. We conclude with a summary of the interdisciplinary skills required for computational scientists and recommendations for their training.
IsoMAP (Isoscape Modeling, Analysis, and Prediction)
NASA Astrophysics Data System (ADS)
Miller, C. C.; Bowen, G. J.; Zhang, T.; Zhao, L.; West, J. B.; Liu, Z.; Rapolu, N.
2009-12-01
IsoMAP is a TeraGrid-based web portal aimed at building the infrastructure that brings together distributed multi-scale and multi-format geospatial datasets to enable statistical analysis and modeling of environmental isotopes. A typical workflow enabled by the portal includes (1) data source exploration and selection, (2) statistical analysis and model development; (3) predictive simulation of isotope distributions using models developed in (1) and (2); (4) analysis and interpretation of simulated spatial isotope distributions (e.g., comparison with independent observations, pattern analysis). The gridded models and data products created by one user can be shared and reused among users within the portal, enabling collaboration and knowledge transfer. This infrastructure and the research it fosters can lead to fundamental changes in our knowledge of the water cycle and ecological and biogeochemical processes through analysis of network-based isotope data, but it will be important A) that those with whom the data and models are shared can be sure of the origin, quality, inputs, and processing history of these products, and B) the system is agile and intuitive enough to facilitate this sharing (rather than just ‘allow’ it). IsoMAP researchers are therefore building into the portal’s architecture several components meant to increase the amount of metadata about users’ products and to repurpose those metadata to make sharing and discovery more intuitive and robust to both expected, professional users as well as unforeseeable populations from other sectors.
A collaborative filtering-based approach to biomedical knowledge discovery.
Lever, Jake; Gakkhar, Sitanshu; Gottlieb, Michael; Rashnavadi, Tahereh; Lin, Santina; Siu, Celia; Smith, Maia; Jones, Martin R; Krzywinski, Martin; Jones, Steven J M; Wren, Jonathan
2018-02-15
The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. sjones@bcgsc.ca. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
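The core scoring idea can be illustrated in a few lines: take a concept co-occurrence matrix, form a low-rank SVD reconstruction, and read off scores for pairs that have never co-occurred. The matrix below is invented, and the sketch omits the paper's text mining and time-sliced evaluation.

```python
# Toy sketch of SVD-based scoring of candidate concept associations.
# The co-occurrence counts below are invented; the authors' full pipeline
# (literature mining, evaluation against future publications) is not reproduced.
import numpy as np

concepts = ["geneA", "geneB", "drugX", "diseaseY", "pathwayZ"]
cooc = np.array([
    [0, 4, 0, 2, 3],
    [4, 0, 1, 0, 2],
    [0, 1, 0, 5, 0],
    [2, 0, 5, 0, 1],
    [3, 2, 0, 1, 0],
], dtype=float)

# A low-rank reconstruction pools global structure; high reconstructed values
# for zero cells suggest plausible, not-yet-observed associations.
U, s, Vt = np.linalg.svd(cooc)
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

i, j = concepts.index("geneA"), concepts.index("drugX")
print(f"predicted geneA-drugX association score: {approx[i, j]:.2f}")
```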
Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses
Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M.; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V.; Ma’ayan, Avi
2018-01-01
Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated ‘canned’ analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also indexes 4,901 published bioinformatics software tools and all of the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools. PMID:29485625
Building Faculty Capacity through the Learning Sciences
ERIC Educational Resources Information Center
Moy, Elizabeth; O'Sullivan, Gerard; Terlecki, Melissa; Jernstedt, Christian
2014-01-01
Discoveries in the learning sciences (especially in neuroscience) have yielded a rich and growing body of knowledge about how students learn, yet this knowledge is only half of the story. The other half is "know how," i.e. the application of this knowledge. For faculty members, that means applying the discoveries of the learning sciences…
A deep learning and novelty detection framework for rapid phenotyping in high-content screening
Sommer, Christoph; Hoefler, Rudolf; Samwer, Matthias; Gerlich, Daniel W.
2017-01-01
Supervised machine learning is a powerful and widely used method for analyzing high-content screening data. Despite its accuracy, efficiency, and versatility, supervised machine learning has drawbacks, most notably its dependence on a priori knowledge of expected phenotypes and time-consuming classifier training. We provide a solution to these limitations with CellCognition Explorer, a generic novelty detection and deep learning framework. Application to several large-scale screening data sets on nuclear and mitotic cell morphologies demonstrates that CellCognition Explorer enables discovery of rare phenotypes without user training, which has broad implications for improved assay development in high-content screening. PMID:28954863
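For readers unfamiliar with novelty detection, the following generic Python sketch flags unusual feature vectors with an isolation forest. It is a stand-in using synthetic data, not CellCognition Explorer's deep-learning pipeline.

```python
# Generic novelty-detection stand-in (not CellCognition Explorer itself):
# fit on features from "normal" nuclei and flag morphologies that deviate.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal_features = rng.normal(loc=0.0, scale=1.0, size=(500, 16))   # synthetic
screen_features = np.vstack([
    rng.normal(0.0, 1.0, size=(95, 16)),      # mostly normal morphologies
    rng.normal(4.0, 1.0, size=(5, 16)),       # a rare, unexpected phenotype
])

detector = IsolationForest(random_state=0).fit(normal_features)
flags = detector.predict(screen_features)      # -1 marks novel samples
print(f"{np.sum(flags == -1)} candidate rare-phenotype cells flagged")
```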
2015-03-01
MPMI has played a leading role in disseminating new insights into plant-microbe interactions and promoting new approaches. Articles in this Focus Issue highlight the power of genomic studies in uncovering novel determinants of plant interactions with microbial symbionts (good), pathogens (bad), and complex microbial communities (unknown). Many articles also illustrate how genomics can support translational research by quickly advancing our knowledge of important microbes that have not been widely studied.
NASA Technical Reports Server (NTRS)
Centrella, Joan M.
2011-01-01
The Laser Interferometer Space Antenna (LISA) is a space-borne observatory that will open the low-frequency (approx. 0.1-100 mHz) gravitational wave window on the universe. LISA will observe a rich variety of gravitational wave sources, including mergers of massive black holes, captures of stellar black holes by massive black holes in the centers of galaxies, and compact Galactic binaries. These sources are generally long-lived, providing unprecedented opportunities for multi-messenger astronomy in the transient sky. This talk will present an overview of these scientific arenas, highlighting how LISA will enable stunning discoveries in origins, understanding the cosmic order, and the frontiers of knowledge.
Toledo, Jon B.; Van Deerlin, Vivianna M.; Lee, Edward B.; Suh, EunRan; Baek, Young; Robinson, John L.; Xie, Sharon X.; McBride, Jennifer; Wood, Elisabeth M.; Schuck, Theresa; Irwin, David J.; Gross, Rachel G.; Hurtig, Howard; McCluskey, Leo; Elman, Lauren; Karlawish, Jason; Schellenberg, Gerard; Chen-Plotkin, Alice; Wolk, David; Grossman, Murray; Arnold, Steven E.; Shaw, Leslie M.; Lee, Virginia M.-Y.; Trojanowski, John Q.
2014-01-01
Neurodegenerative diseases (NDs) are defined by the accumulation of abnormal protein deposits in the central nervous system (CNS), and only neuropathological examination enables a definitive diagnosis. Brain banks and their associated scientific programs have shaped the actual knowledge of NDs, identifying and characterizing the CNS deposits that define new diseases, formulating staging schemes, and establishing correlations between neuropathological changes and clinical features. However, brain banks have evolved to accommodate the banking of biofluids as well as DNA and RNA samples. Moreover, the value of biobanks is greatly enhanced if they link all the multidimensional clinical and laboratory information of each case, which is accomplished, optimally, using systematic and standardized operating procedures, and in the framework of multidisciplinary teams with the support of a flexible and user-friendly database system that facilitates the sharing of information of all the teams in the network. We describe a biobanking system that is a platform for discovery research at the Center for Neurodegenerative Disease Research at the University of Pennsylvania. PMID:23978324
The center for causal discovery of biomedical knowledge from big data
Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-01-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
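The constraint-based intuition can be shown with a toy example: in a chain X -> Y -> Z, X and Z are correlated marginally but nearly uncorrelated once Y is conditioned on, so no direct X-Z edge is kept. The sketch below uses simulated data and a partial-correlation check; it is far simpler than the Center's algorithms and software.

```python
# Tiny illustration of the constraint-based idea behind causal discovery:
# an edge X-Z is dropped if X and Z are independent given Y. This is a toy,
# not the Center's algorithms or tools.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(scale=0.5, size=n)   # X -> Y
z = 0.8 * y + rng.normal(scale=0.5, size=n)   # Y -> Z

def partial_corr(a, b, given):
    """Correlation of the residuals after regressing a and b on 'given'."""
    g = np.column_stack([np.ones_like(given), given])
    ra = a - g @ np.linalg.lstsq(g, a, rcond=None)[0]
    rb = b - g @ np.linalg.lstsq(g, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

print(f"corr(X, Z)     = {np.corrcoef(x, z)[0, 1]:.3f}")   # clearly nonzero
print(f"corr(X, Z | Y) = {partial_corr(x, z, y):.3f}")     # near zero -> no direct edge
```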
The Effect of Rules and Discovery in the Retention and Retrieval of Braille Inkprint Letter Pairs.
ERIC Educational Resources Information Center
Nagengast, Daniel L.; And Others
The effects of rule knowledge were investigated using Braille inkprint pairs. Both recognition and recall were studied in three groups of subjects: rule knowledge, rule discovery, and no rule. Two hypotheses were tested: (1) that the group exposed to the rule would score better than would a discovery group and a control group; and (2) that all…
Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.
Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo
Unsupervised object discovery and localization aims to discover dominant object classes and localize all object instances in a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. On the other hand, the topic models used in those methods suffer from the topic coherence issue: some inferred topics do not have clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in terms of so-called must-links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the must-link is re-defined so that one must-link constrains only one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes; thus the must-links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.
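As a baseline for the topic-modeling machinery involved, the sketch below runs plain LDA over synthetic visual-word counts and reads off a dominant topic per image. The must-link / Dirichlet-tree extension that is the paper's contribution is not implemented here.

```python
# Baseline topic model over visual-word "documents" (one per image). This is
# plain LDA only; the paper's must-link / Dirichlet-tree extension is not
# implemented, and the visual-word counts are synthetic.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_images, vocab_size = 200, 50
visual_word_counts = rng.poisson(lam=2.0, size=(n_images, vocab_size))

lda = LatentDirichletAllocation(n_components=5, random_state=0)
image_topics = lda.fit_transform(visual_word_counts)

# The dominant topic per image is a crude proxy for a discovered object class.
dominant = image_topics.argmax(axis=1)
print(np.bincount(dominant, minlength=5))
```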
User needs analysis and usability assessment of DataMed - a biomedical data discovery index.
Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R
2017-11-30
To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Rosario, Karyna; Padilla-Rodriguez, Marco; Kraberger, Simona; Stainton, Daisy; Martin, Darren P; Breitbart, Mya; Varsani, Arvind
2013-01-01
Geminiviruses have emerged as serious agricultural pathogens. Despite all the species that have been already catalogued, new molecular techniques continue to expand the diversity and geographical ranges of these single-stranded DNA viruses and their associated satellite molecules. Since all geminiviruses are insect-transmitted, examination of insect vector populations through vector-enabled metagenomics (VEM) has been recently used to investigate the diversity of geminiviruses transmitted by a specific vector in a given region. Here we used a more comprehensive adaptation of the VEM approach by surveying small circular DNA viruses found within top insect predators, specifically dragonflies (Epiprocta). This 'predator-enabled' approach is not limited to viral groups transmitted by specific vectors since dragonflies can accumulate the wide range of viruses transmitted by their diverse insect prey. Analysis of six dragonflies collected from an agricultural field in Puerto Rico culminated in the discovery of the first mastrevirus (Dragonfly-associated mastrevirus; DfasMV) and alphasatellite molecule (Dragonfly-associated alphasatellite; Dfas-alphasatellite) from the Caribbean. Since DfasMV and Dfas-alphasatellite are divergent from the limited number of sequences that have been reported from the Americas, this study unequivocally demonstrates that there have been at least two independent past introductions of both mastreviruses and alphasatellites to the New World. Overall, the use of predacious insects as sampling tools can profoundly alter our views of natural plant virus diversity and biogeography by allowing the discovery of novel geminiviruses and associated satellite molecules without a priori knowledge of the types of viruses or insect vectors in a given area. Copyright © 2012 Elsevier B.V. All rights reserved.
Jóźwik, Jagoda; Kałużna-Czaplińska, Joanna
2016-01-01
Currently, analysis of various human body fluids is one of the most essential and promising approaches to enable the discovery of biomarkers or pathophysiological mechanisms for disorders and diseases. Analysis of these fluids is challenging due to their complex composition and unique characteristics. Development of new analytical methods in this field has made it possible to analyze body fluids with higher selectivity, sensitivity, and precision. The composition and concentration of analytes in body fluids are most often determined by chromatography-based techniques. There is no doubt that proper use of knowledge that comes from a better understanding of the role of body fluids requires the cooperation of scientists of diverse specializations, including analytical chemists, biologists, and physicians. This article summarizes current knowledge about the application of different chromatographic methods in analyses of a wide range of compounds in human body fluids in order to diagnose certain diseases and disorders.
Computational knowledge integration in biopharmaceutical research.
Ficenec, David; Osborne, Mark; Pradines, Joel; Richards, Dan; Felciano, Ramon; Cho, Raymond J; Chen, Richard O; Liefeld, Ted; Owen, James; Ruttenberg, Alan; Reich, Christian; Horvath, Joseph; Clark, Tim
2003-09-01
An initiative to increase biopharmaceutical research productivity by capturing, sharing and computationally integrating proprietary scientific discoveries with public knowledge is described. This initiative involves both organisational process change and multiple interoperating software systems. The software components rely on mutually supporting integration techniques. These include a richly structured ontology, statistical analysis of experimental data against stored conclusions, natural language processing of public literature, secure document repositories with lightweight metadata, web services integration, enterprise web portals and relational databases. This approach has already begun to increase scientific productivity in our enterprise by creating an organisational memory (OM) of internal research findings, accessible on the web. Through bringing together these components it has also been possible to construct a very large and expanding repository of biological pathway information linked to this repository of findings which is extremely useful in analysis of DNA microarray data. This repository, in turn, enables our research paradigm to be shifted towards more comprehensive systems-based understandings of drug action.
Good surgeon: A search for meaning.
Akopov, Andrey L; Artioukh, Dmitri Y
2017-01-01
The art and philosophy of surgery are not as often discussed as scientific discoveries and technological advances in the modern era of surgery. Although these are difficult to teach and pass on to the next generations of surgeons, they are no less important for training good surgeons and maintaining their high standards. The authors of this review and opinion article tried to define what being a good surgeon really means and to look into the subject by analysing the essential conditions for being a good surgeon and the qualities that such a specialist should possess. In addition to strong theoretical knowledge and practical skills, and among the several described professional and personal characteristics, a good surgeon is expected to have common sense. It enables a surgeon to make a sound practical judgment independent of specialized medical knowledge and training. The possible ways of developing and/or enhancing common sense during surgical training and subsequent practice require separate analysis.
Nurses collaborating with cross disciplinary networks: starting to integrate genomics into practice.
Adegbola, Maxine
2010-07-01
Nurses and other health-care providers are poised to include genetic discoveries into practice settings and to translate such knowledge for consumer benefit within culturally appropriate contexts. Nurses must seek collaboration with multi-disciplinary networks both locally and internationally. They must also capitalize on the expertise of other seasoned researchers in order to gain national and international exposure, recognition, and funding. Scholarly tailgating is using network relationships to achieve one's professional goals, and capitalizing on expert knowledge from seasoned researchers, educators, and practitioners from diverse international groups. By using scholarly tailgating principles, nurses can become important agents of change for multi-disciplinary networks, and thereby assist in decreasing health disparities. The purpose of this document is to encourage and inspire nurses to seek collaborative multi-disciplinary networks to enable genomic integration into health-care practice and education. Strategies for integrating genomics into practice settings are discussed.
Concept Formation in Scientific Knowledge Discovery from a Constructivist View
NASA Astrophysics Data System (ADS)
Peng, Wei; Gero, John S.
The central goal of scientific knowledge discovery is to learn cause-effect relationships among natural phenomena presented as variables and the consequences of their interactions. Scientific knowledge is normally expressed as scientific taxonomies and qualitative and quantitative laws [1]. This type of knowledge represents intrinsic regularities of the observed phenomena that can be used to explain and predict behaviors of the phenomena. It is a generalization that is abstracted and externalized from a set of contexts and applicable to a broader scope. Scientific knowledge is a type of third-person knowledge, i.e., knowledge that is independent of a specific enquirer. Artificial intelligence approaches, particularly the data mining algorithms used to identify meaningful patterns in large data sets, aim to facilitate the knowledge discovery process [2]. A broad spectrum of algorithms has been developed to address classification, associative learning, and clustering problems. However, their linkages to the people who use them have not been adequately explored. Issues in relation to supporting the interpretation of the patterns, the application of prior knowledge to the data mining process, and addressing user interactions remain challenges for building knowledge discovery tools [3]. As a consequence, scientists rely on their experience to formulate problems, evaluate hypotheses, reason about untraceable factors and derive new problems. This type of knowledge, which they have developed during their careers, is called “first-person” knowledge. The formation of scientific knowledge (third-person knowledge) is highly influenced by the enquirer’s first-person knowledge construct, which is a result of his or her interactions with the environment. There have been attempts to craft automatic knowledge discovery tools, but these systems are limited in their capability to handle the dynamics of personal experience. There are now trends in developing approaches to assist scientists in applying their expertise to model formation, simulation, and prediction in various domains [4], [5]. On the other hand, first-person knowledge becomes third-person theory only if it is proven general by evidence and acknowledged by a scientific community. Researchers have started to focus on building interactive cooperation platforms [1] to accommodate different views into the knowledge discovery process. There are some fundamental questions in relation to scientific knowledge development. What are the major components of knowledge construction, and how do people construct their knowledge? How is this personal construct assimilated and accommodated into a scientific paradigm? How can one design a computational system to facilitate these processes? This chapter does not attempt to answer all these questions but serves as a basis to foster thinking along these lines. A brief literature review of how people develop their knowledge is carried out from a constructivist view. A hydrological modeling scenario is presented to elucidate the approach.
Knowledge Discovery and Data Mining: An Overview
NASA Technical Reports Server (NTRS)
Fayyad, U.
1995-01-01
The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2014 CFR
2014-01-01
(a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2012 CFR
2012-01-01
(a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts...
A Semiautomated Framework for Integrating Expert Knowledge into Disease Marker Identification
Wang, Jing; Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Varnum, Susan M.; Brown, Joseph N.; Riensche, Roderick M.; Adkins, Joshua N.; Jacobs, Jon M.; Hoidal, John R.; Scholand, Mary Beth; Pounds, Joel G.; Blackburn, Michael R.; Rodland, Karin D.; McDermott, Jason E.
2013-01-01
Background. The availability of large complex data sets generated by high throughput technologies has enabled the recent proliferation of disease biomarker studies. However, a recurring problem in deriving biological information from large data sets is how to best incorporate expert knowledge into the biomarker selection process. Objective. To develop a generalizable framework that can incorporate expert knowledge into data-driven processes in a semiautomated way while providing a metric for optimization in a biomarker selection scheme. Methods. The framework was implemented as a pipeline consisting of five components for the identification of signatures from integrated clustering (ISIC). Expert knowledge was integrated into the biomarker identification process using a combination of two distinct approaches: a distance-based clustering approach and an expert knowledge-driven functional selection. Results. The utility of the developed framework ISIC was demonstrated on proteomics data from a study of chronic obstructive pulmonary disease (COPD). Biomarker candidates were identified in a mouse model using ISIC and validated in a study of a human cohort. Conclusions. Expert knowledge can be introduced into a biomarker discovery process in different ways to enhance the robustness of selected marker candidates. Developing strategies for extracting orthogonal and robust features from large data sets increases the chances of success in biomarker identification. PMID:24223463
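A much-simplified stand-in for the two-step idea (data-driven clustering plus an expert-knowledge filter) is sketched below. The protein names, abundance matrix, and expert list are hypothetical, and the code does not reproduce the published ISIC pipeline.

```python
# Simplified stand-in for combining data-driven clustering with an expert
# knowledge filter (not the published ISIC pipeline). Protein names and the
# expert list are hypothetical.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
proteins = [f"prot_{i}" for i in range(12)]
abundance = rng.normal(size=(30, 12))             # 30 samples x 12 proteins
expert_relevant = {"prot_1", "prot_4", "prot_7"}  # e.g., known pathway members

# Step 1: distance-based clustering of features on correlation distance.
corr = np.corrcoef(abundance, rowvar=False)
condensed = (1.0 - np.abs(corr))[np.triu_indices(12, k=1)]
clusters = fcluster(linkage(condensed, method="average"), t=4, criterion="maxclust")

# Step 2: expert-driven selection -- keep one candidate per cluster, preferring
# proteins the domain experts already flagged as biologically relevant.
candidates = []
for c in np.unique(clusters):
    members = [p for p, lab in zip(proteins, clusters) if lab == c]
    flagged = [p for p in members if p in expert_relevant]
    candidates.append(flagged[0] if flagged else members[0])
print(candidates)
```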
ERIC Educational Resources Information Center
Benoit, Gerald
2002-01-01
Discusses data mining (DM) and knowledge discovery in databases (KDD), taking the view that KDD denotes the entire process, with DM emphasizing the cleaning, warehousing, mining, and visualization stages of knowledge discovery in databases. Highlights include algorithms; users; the Internet; text mining; and information extraction.…
Gibert, Karina; García-Rudolph, Alejandro; Curcoll, Lluïsa; Soler, Dolors; Pla, Laura; Tormos, José María
2009-01-01
In this paper, an integral Knowledge Discovery Methodology, named Clustering based on rules by States, which incorporates artificial intelligence (AI) and statistical methods as well as interpretation-oriented tools, is used for extracting knowledge patterns about the evolution over time of the Quality of Life (QoL) of patients with Spinal Cord Injury. The methodology incorporates the interaction with experts as a crucial element with the clustering methodology to guarantee usefulness of the results. Four typical patterns are discovered by taking into account prior expert knowledge. Several hypotheses are elaborated about the reasons for psychological distress or decreases in QoL of patients over time. The knowledge discovery from data (KDD) approach turns out, once again, to be a suitable formal framework for handling multidimensional complexity of the health domains.
Semantically Enabling Knowledge Representation of Metamorphic Petrology Data
NASA Astrophysics Data System (ADS)
West, P.; Fox, P. A.; Spear, F. S.; Adali, S.; Nguyen, C.; Hallett, B. W.; Horkley, L. K.
2012-12-01
More and more metamorphic petrology data are being collected around the world and are now being organized into different virtual data portals by means of virtual organizations. For example, the Petrological Database of the Ocean Floor (PetDB, http://www.petdb.org) organizes scientific information about geochemical data of ocean floor igneous and metamorphic rocks, and The Metamorphic Petrology Database (MetPetDB, http://metpetdb.rpi.edu) is being created by a global community of metamorphic petrologists in collaboration with software engineers and data managers at Rensselaer Polytechnic Institute. The current focus is to provide the ability for scientists and researchers to register their data and search the databases for information regarding sample collections. What we present here is the next step in the evolution of the MetPetDB portal, utilizing semantically enabled features such as discovery, data casting, faceted search, knowledge representation, and linked data, as well as organizing information about the community and collaboration within the virtual community itself. We take the information that is currently represented in a relational database and make it available through web services, SPARQL endpoints, and semantic triple stores where inferencing is enabled. We will be leveraging research that has taken place in virtual observatories, such as the Virtual Solar Terrestrial Observatory (VSTO) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO); vocabulary work done in various communities such as Observations and Measurements (ISO 19156), FOAF (Friend of a Friend), Bibo (Bibliography Ontology), and domain-specific ontologies; enabling provenance traces of samples and subsamples using the different provenance ontologies; and providing the much needed linking of data from the various research organizations into a common, collaborative virtual observatory. In addition to better representing and presenting the actual data, we also aim to organize and represent the knowledge and expertise behind the data. Domain experts hold a great deal of knowledge in their minds, in their presentations and publications, and elsewhere. This is not only a technical issue but also a social one: domain experts need to be encouraged to share their knowledge in a way that can be searched and queried. With this additional focus, the MetPetDB site can be used more efficiently by other domain experts, and it can also be used by non-specialists, both to convey the importance of the work being done and to help develop future domain experts.
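A small example of the kind of query such a semantically enabled portal could answer is sketched below with Python and SPARQLWrapper; the endpoint URL and the vocabulary terms describing samples are placeholders, not the actual MetPetDB service or schema.

```python
# Hypothetical query against a semantically enabled sample catalogue.
# The endpoint URL and property terms are placeholders for illustration only.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("http://example.org/metpetdb/sparql")  # placeholder endpoint
endpoint.setQuery("""
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX geo:     <http://www.w3.org/2003/01/geo/wgs84_pos#>
    SELECT ?sample ?lat ?long WHERE {
        ?sample dcterms:subject "garnet" ;
                geo:lat  ?lat ;
                geo:long ?long .
    } LIMIT 10
""")
endpoint.setReturnFormat(JSON)
for row in endpoint.query().convert()["results"]["bindings"]:
    print(row["sample"]["value"], row["lat"]["value"], row["long"]["value"])
```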
Klimovskaia, Anna; Ganscha, Stefan; Claassen, Manfred
2016-12-01
Stochastic chemical reaction networks constitute a model class to quantitatively describe dynamics and cell-to-cell variability in biological systems. The topology of these networks typically is only partially characterized due to experimental limitations. Current approaches for refining network topology are based on the explicit enumeration of alternative topologies and are therefore restricted to small problem instances with almost complete knowledge. We propose the reactionet lasso, a computational procedure that derives a stepwise sparse regression approach on the basis of the Chemical Master Equation, enabling large-scale structure learning for reaction networks by implicitly accounting for billions of topology variants. We have assessed the structure learning capabilities of the reactionet lasso on synthetic data for the complete TRAIL induced apoptosis signaling cascade comprising 70 reactions. We find that the reactionet lasso is able to efficiently recover the structure of these reaction systems, ab initio, with high sensitivity and specificity. With only < 1% false discoveries, the reactionet lasso is able to recover 45% of all true reactions ab initio among > 6000 possible reactions and over 10^2000 network topologies. In conjunction with information rich single cell technologies such as single cell RNA sequencing or mass cytometry, the reactionet lasso will enable large-scale structure learning, particularly in areas with partial network structure knowledge, such as cancer biology, and thereby enable the detection of pathological alterations of reaction networks. We provide software to allow for wide applicability of the reactionet lasso.
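The reactionet lasso derives its regression problem from the Chemical Master Equation; that derivation is not reproduced here. The sketch below only illustrates the generic sparse-regression step on synthetic data, where the columns of X stand for candidate reaction propensities and y for an estimated rate of change, with scikit-learn's Lasso as the selector.

```python
# Generic sparse-regression step on synthetic data (illustrative only, not the
# reactionet lasso): a handful of "true" reactions among many candidates.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_obs, n_candidates = 200, 50
X = rng.poisson(5.0, size=(n_obs, n_candidates)).astype(float)  # candidate propensities
true_rates = np.zeros(n_candidates)
true_rates[[3, 17, 42]] = [0.8, -0.5, 1.2]                      # only three real reactions
y = X @ true_rates + rng.normal(scale=0.5, size=n_obs)          # noisy rate of change

model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(np.abs(model.coef_) > 1e-3)
print("selected candidate reactions:", selected)
```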
Low, Eric; Bountra, Chas; Lee, Wen Hwa
2016-01-01
We are experiencing a new era enabled by unencumbered access to high quality data through the emergence of open science initiatives in the historically challenging area of early stage drug discovery. At the same time, many patient-centric organisations are taking matters into their own hands by participating in, enabling and funding research. Here we present the rationale behind the innovative partnership between the Structural Genomics Consortium (SGC), an open, pre-competitive pre-clinical research consortium, and the research-focused patient organisation Myeloma UK to create a new, comprehensive platform to accelerate the discovery and development of new treatments for multiple myeloma.
Computational approaches for drug discovery.
Hung, Che-Lun; Chen, Chi-Chun
2014-09-01
Cellular proteins mediate many organism functions and are involved in both physiological mechanisms and disease. By discovering lead compounds that affect the function of target proteins, the target diseases or physiological mechanisms can be modulated. Based on knowledge of the ligand-receptor interaction, the chemical structures of leads can be modified to improve efficacy and selectivity and to reduce side effects. One rational drug design technology, which enables drug discovery based on knowledge of target structures, functional properties and mechanisms, is computer-aided drug design (CADD). The application of CADD can be cost-effective, using experiments to compare predicted and actual drug activity, the results from which can be used iteratively to improve compound properties. The two major CADD-based approaches are structure-based drug design, where protein structures are required, and ligand-based drug design, where ligands and their activities can be used to design compounds interacting with the protein structure. Approaches in structure-based drug design include docking, de novo design, fragment-based drug discovery and structure-based pharmacophore modeling. Approaches in ligand-based drug design include quantitative structure-affinity relationships and pharmacophore modeling based on ligand properties. Based on whether the structure of the receptor and its interaction with the ligand are known, different design strategies can be selected. After lead compounds are generated, the rule of five can be used to assess whether these have drug-like properties. Several quality validation methods, such as cost function analysis, Fisher's cross-validation analysis and the goodness-of-hit test, can be used to estimate the metrics of different drug design strategies. To further improve CADD performance, multiple computers and graphics processing units may be applied to reduce costs. © 2014 Wiley Periodicals, Inc.
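The rule-of-five check mentioned in the abstract is simple to express in code. The sketch below applies it to precomputed descriptors; in practice molecular weight, logP and hydrogen-bond counts would come from a cheminformatics toolkit, and the example compounds and their property values are invented.

```python
# Lipinski rule-of-five filter over precomputed descriptors (values invented;
# a cheminformatics toolkit would normally supply them). Commonly at most one
# violation is tolerated for a compound to be considered drug-like.
def passes_rule_of_five(mw, logp, h_donors, h_acceptors, max_violations=1):
    violations = sum([
        mw > 500,           # molecular weight
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ])
    return violations <= max_violations

leads = {   # hypothetical lead compounds
    "cmpd_A": dict(mw=342.4, logp=3.1, h_donors=2, h_acceptors=5),
    "cmpd_B": dict(mw=612.7, logp=6.2, h_donors=4, h_acceptors=11),
}
for name, props in leads.items():
    print(name, "drug-like" if passes_rule_of_five(**props) else "flagged")
```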
Active Storage with Analytics Capabilities and I/O Runtime System for Petascale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhary, Alok
Computational scientists must understand results from experimental, observational and computational simulation-generated data to gain insights and perform knowledge discovery. As systems approach the petascale range, problems that were unimaginable a few years ago are within reach. With the increasing volume and complexity of data produced by ultra-scale simulations and high-throughput experiments, understanding the science is largely hampered by the lack of comprehensive I/O, storage, acceleration of data manipulation, analysis, and mining tools. Scientists require techniques, tools and infrastructure to facilitate better understanding of their data, in particular the ability to effectively perform complex data analysis, statistical analysis and knowledge discovery. The goal of this work is to enable more effective analysis of scientific datasets through the integration of enhancements in the I/O stack, from active storage support at the file system layer to MPI-IO and high-level I/O library layers. We propose to provide software components to accelerate data analytics, mining, I/O, and knowledge discovery for large-scale scientific applications, thereby increasing the productivity of both scientists and the systems. Our approaches include 1) designing the interfaces in high-level I/O libraries, such as parallel netCDF, for applications to activate data mining operations at the lower I/O layers; 2) enhancing MPI-IO runtime systems to incorporate the functionality developed as a part of the runtime system design; 3) developing parallel data mining programs as part of the runtime library for the server-side file system in PVFS; and 4) prototyping an active storage cluster, which will utilize multicore CPUs, GPUs, and FPGAs to carry out the data mining workload.
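The layered I/O-stack integration described in this project (parallel netCDF interfaces, MPI-IO extensions, PVFS-side mining kernels) cannot be reproduced in a few lines, but the underlying pattern of computing partial results next to the data and moving only small aggregates can be sketched with mpi4py. This is an illustrative stand-in, not the project's active-storage software.

```python
# "Move the analysis to the data" pattern with mpi4py: each rank reduces its
# own slice and only small partial results travel. Stand-in sketch only; the
# real work targets analytics embedded in MPI-IO and file-system layers.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# each rank pretends to have read its own slice of a large simulation output
local_slice = np.random.default_rng(rank).normal(size=1_000_000)

total_sum = comm.reduce(local_slice.sum(), op=MPI.SUM, root=0)
total_count = comm.reduce(local_slice.size, op=MPI.SUM, root=0)

if rank == 0:
    print("global mean:", total_sum / total_count)
# run with: mpiexec -n 4 python mean_reduce.py
```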
Lung tumor diagnosis and subtype discovery by gene expression profiling.
Wang, Lu-yong; Tu, Zhuowen
2006-01-01
The optimal treatment of patients with complex diseases, such as cancers, depends on accurate diagnosis using a combination of clinical and histopathological data. In many scenarios, this becomes tremendously difficult because of limitations in clinical presentation and histopathology. To diagnose complex diseases accurately, molecular classification based on gene or protein expression profiles is indispensable for modern medicine. Moreover, many heterogeneous diseases consist of various potential subtypes at the molecular level and differ remarkably in their response to therapies. It is therefore critical to predict subgroups accurately from disease gene expression profiles. More fundamental knowledge of the molecular basis and classification of disease could aid in the prediction of patient outcome, the informed selection of therapies, and the identification of novel molecular targets for therapy. In this paper, we propose a new disease diagnostic method, the probabilistic boosting tree (PB tree) method, applied to gene expression profiles of lung tumors. It enables accurate disease classification and subtype discovery. It automatically constructs a tree in which each node combines a number of weak classifiers into a strong classifier, and subtype discovery is naturally embedded in the learning process. Our algorithm achieves excellent diagnostic performance and is also capable of detecting disease subtypes based on gene expression profiles.
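The probabilistic boosting tree builds, at every tree node, a strong classifier out of many weak ones and routes samples down the tree by the node's posterior. The sketch below mimics that structure using scikit-learn's AdaBoost as the node-level strong classifier; it is a simplified illustration with arbitrary stopping rules, not the authors' implementation.

```python
# Rough PB-tree-style sketch (illustrative, not the authors' code): each node
# boosts weak learners into a strong classifier and routes samples by its
# posterior probability. y is expected to hold integer class labels (0/1).
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

class BoostingTreeNode:
    def __init__(self, depth=0, max_depth=2, min_samples=20):
        self.depth, self.max_depth, self.min_samples = depth, max_depth, min_samples
        self.clf = self.left = self.right = None

    def fit(self, X, y):
        if len(np.unique(y)) < 2 or self.depth >= self.max_depth or len(y) < self.min_samples:
            self.label = np.bincount(y).argmax()          # leaf: majority class
            return self
        self.clf = AdaBoostClassifier(n_estimators=25).fit(X, y)
        right = self.clf.predict_proba(X)[:, 1] >= 0.5    # route by posterior
        if right.all() or (~right).all():                 # degenerate split -> leaf
            self.label, self.clf = np.bincount(y).argmax(), None
            return self
        self.left = BoostingTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[~right], y[~right])
        self.right = BoostingTreeNode(self.depth + 1, self.max_depth, self.min_samples).fit(X[right], y[right])
        return self

    def predict(self, x):
        if self.clf is None:
            return self.label
        branch = self.right if self.clf.predict_proba([x])[0, 1] >= 0.5 else self.left
        return branch.predict(x)

# toy usage with synthetic "expression profiles"
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(BoostingTreeNode().fit(X, y).predict(X[0]))
```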
[Economics, politics, and public health in Porfirian Mexico (1876-1910)].
Carrillo, Ana María
2002-01-01
The article examines the scientific, political, and economic elements that permitted the birth of modern public health in Mexico under the Porfirio Díaz administration (1876-1910). Firstly, a portion of Mexican physicians were open to the discoveries of microbiology, immunology, and epidemiology. Secondly, the State's growing concentration of power in public health matters ran parallel to its concentration of disciplinary political power and enabled this new knowledge to be placed at the service of collective health problem prevention. Lastly, both imperialism and the Porfirian elite needed to protect their business interests. The article evaluates public health achievements and limitations during the Porfirian period, abruptly interrupted by the revolution begun in 1910.
From IHE Audit Trails to XES Event Logs Facilitating Process Mining.
Paster, Ferdinand; Helm, Emmanuel
2015-01-01
Recently Business Intelligence approaches like process mining are applied to the healthcare domain. The goal of process mining is to gain process knowledge, compliance and room for improvement by investigating recorded event data. Previous approaches focused on process discovery by event data from various specific systems. IHE, as a globally recognized basis for healthcare information systems, defines in its ATNA profile how real-world events must be recorded in centralized event logs. The following approach presents how audit trails collected by the means of ATNA can be transformed to enable process mining. Using the standardized audit trails provides the ability to apply these methods to all IHE based information systems.
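Turning such audit trails into an XES log is mostly a matter of grouping events into traces and emitting the standard XES attributes. A minimal sketch follows; the input field names (patient_id, event_type, timestamp) are assumptions about what has already been parsed from ATNA audit messages, while concept:name and time:timestamp are the usual XES extension keys.

```python
# Minimal audit-trail-to-XES sketch: group events into per-case traces and
# write standard XES attributes with ElementTree. Input field names assumed.
import xml.etree.ElementTree as ET

def to_xes(events):
    log = ET.Element("log", {"xes.version": "1.0"})
    by_case = {}
    for e in sorted(events, key=lambda e: e["timestamp"]):
        by_case.setdefault(e["patient_id"], []).append(e)
    for case_id, case_events in by_case.items():
        trace = ET.SubElement(log, "trace")
        ET.SubElement(trace, "string", {"key": "concept:name", "value": case_id})
        for e in case_events:
            ev = ET.SubElement(trace, "event")
            ET.SubElement(ev, "string", {"key": "concept:name", "value": e["event_type"]})
            ET.SubElement(ev, "date", {"key": "time:timestamp", "value": e["timestamp"]})
    return ET.tostring(log, encoding="unicode")

print(to_xes([
    {"patient_id": "P1", "event_type": "Order Placed", "timestamp": "2015-03-01T08:00:00"},
    {"patient_id": "P1", "event_type": "Images Stored", "timestamp": "2015-03-01T08:40:00"},
]))
```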
NASA EOSDIS: Enabling Science by Improving User Knowledge
NASA Technical Reports Server (NTRS)
Lindsay, Francis; Brennan, Jennifer; Blumenfeld, Joshua
2016-01-01
Lessons learned and impacts of applying these newer methods are explained and include several examples from our current efforts such as the interactive, on-line webinars focusing on data discovery and access including tool usage, informal and informative data chats with data experts across our EOSDIS community, data user profile interviews with scientists actively using EOSDIS data in their research, and improved conference and meeting interactions via EOSDIS data interactively used during hyper-wall talks and Worldview application. The suite of internet-based, interactive capabilities and technologies has allowed our project to expand our user community by making the data and applications from numerous Earth science missions more engaging, approachable and meaningful.
Gathering and Exploring Scientific Knowledge in Pharmacovigilance
Lopes, Pedro; Nunes, Tiago; Campos, David; Furlong, Laura Ines; Bauer-Mehren, Anna; Sanz, Ferran; Carrascosa, Maria Carmen; Mestres, Jordi; Kors, Jan; Singh, Bharat; van Mulligen, Erik; Van der Lei, Johan; Diallo, Gayo; Avillach, Paul; Ahlberg, Ernst; Boyer, Scott; Diaz, Carlos; Oliveira, José Luís
2013-01-01
Pharmacovigilance plays a key role in the healthcare domain through the assessment, monitoring and discovery of interactions amongst drugs and their effects in the human organism. However, technological advances in this field have been slowing down over the last decade due to miscellaneous legal, ethical and methodological constraints. Pharmaceutical companies started to realize that collaborative and integrative approaches boost current drug research and development processes. Hence, new strategies are required to connect researchers, datasets, biomedical knowledge and analysis algorithms, allowing them to fully exploit the true value behind state-of-the-art pharmacovigilance efforts. This manuscript introduces a new platform directed towards pharmacovigilance knowledge providers. This system, based on a service-oriented architecture, adopts a plugin-based approach to solve fundamental pharmacovigilance software challenges. With the wealth of collected clinical and pharmaceutical data, it is now possible to connect knowledge providers’ analysis and exploration algorithms with real data. As a result, new strategies allow a faster identification of high-risk interactions between marketed drugs and adverse events, and enable the automated uncovering of scientific evidence behind them. With this architecture, the pharmacovigilance field has a new platform to coordinate large-scale drug evaluation efforts in a unique ecosystem, publicly available at http://bioinformatics.ua.pt/euadr/. PMID:24349421
Resource Discovery within the Networked "Hybrid" Library.
ERIC Educational Resources Information Center
Leigh, Sally-Anne
This paper focuses on the development, adoption, and integration of resource discovery, knowledge management, and/or knowledge sharing interfaces such as interactive portals, and the use of the library's World Wide Web presence to increase the availability and usability of information services. The introduction addresses changes in library…
A biological compression model and its applications.
Cao, Minh Duc; Dix, Trevor I; Allison, Lloyd
2011-01-01
A biological compression model, the expert model, is presented that is superior to existing compression algorithms in both compression performance and speed. The model is able to compress whole eukaryotic genomes. Most importantly, the model provides a framework for knowledge discovery from biological data. It can be used for repeat element discovery, sequence alignment and phylogenetic analysis. We demonstrate that the model can handle statistically biased sequences and distantly related sequences where conventional knowledge discovery tools often fail.
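The expert model itself is not reproduced here, but the general principle that compressibility measures relatedness can be illustrated with the normalized compression distance computed by a generic compressor. This is only a toy demonstration of compression-based sequence comparison under that assumption, not the authors' model.

```python
# Normalized compression distance (NCD) with a generic compressor: related
# sequences compress better together than unrelated ones. Toy example only.
import random
import zlib

def c(s: bytes) -> int:
    return len(zlib.compress(s, 9))

def ncd(a: bytes, b: bytes) -> float:
    return (c(a + b) - min(c(a), c(b))) / max(c(a), c(b))

random.seed(0)
seq1 = b"ACGTACGTACGTACGTTTGACA" * 20
seq2 = b"ACGTACGAACGTACGTTTGACA" * 20                           # one substitution per repeat
seq3 = bytes(random.choice(b"ACGT") for _ in range(len(seq1)))  # unrelated random sequence
print("related  :", round(ncd(seq1, seq2), 3))
print("unrelated:", round(ncd(seq1, seq3), 3))
```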
Data mining in pharma sector: benefits.
Ranjan, Jayanthi
2009-01-01
The amount of data generated in any sector today is enormous. The information flow in the pharma industry is huge. Pharma firms are moving toward increasingly technology-enabled products and services. Data mining, which is knowledge discovery from large sets of data, helps pharma firms discover patterns that improve the quality of drug discovery and delivery methods. The paper aims to present how data mining is useful in the pharma industry, how its techniques can yield good results in the pharma sector, and how data mining can enhance decision making based on pharmaceutical data. This conceptual paper is written based on secondary study, research and observations from magazines, reports and notes. The author has listed the types of patterns that can be discovered using data mining in pharma data. The paper shows how data mining is useful in the pharma industry and how its techniques can yield good results in the pharma sector. Although much more could be done to discover knowledge in pharma data using data mining, the paper is limited to conceptualizing the ideas and viewpoints at this stage; future work may include applying data mining techniques to pharma data in primary research using established data mining tools. Research and conceptual papers on data mining in the pharma industry are rare; this is the motivation for the paper.
FY10 Engineering Innovations, Research and Technology Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lane, M A; Aceves, S M; Paulson, C N
This report summarizes key research, development, and technology advancements in Lawrence Livermore National Laboratory's Engineering Directorate for FY2010. These efforts exemplify Engineering's nearly 60-year history of developing and applying the technology innovations needed for the Laboratory's national security missions, and embody Engineering's mission to "Enable program success today and ensure the Laboratory's vitality tomorrow." Leading off the report is a section featuring compelling engineering innovations. These innovations range from advanced hydrogen storage that enables clean vehicles, to new nuclear material detection technologies, to a landmine detection system using ultra-wideband ground-penetrating radar. Many have been recognized with R&D Magazine's prestigious R&D 100 Award; all are examples of the forward-looking application of innovative engineering to pressing national problems and challenging customer requirements. Engineering's capability development strategy includes both fundamental research and technology development. Engineering research creates the competencies of the future where discovery-class groundwork is required. Our technology development (or reduction to practice) efforts enable many of the research breakthroughs across the Laboratory to translate from the world of basic research to the national security missions of the Laboratory. This portfolio approach produces new and advanced technological capabilities, and is a unique component of the value proposition of the Lawrence Livermore Laboratory. The balance of the report highlights this work in research and technology, organized into thematic technical areas: Computational Engineering; Micro/Nano-Devices and Structures; Measurement Technologies; Engineering Systems for Knowledge Discovery; and Energy Manipulation. Our investments in these areas serve not only known programmatic requirements of today and tomorrow, but also anticipate the breakthrough engineering innovations that will be needed in the future.
Modelling Chemical Reasoning to Predict and Invent Reactions.
Segler, Marwin H S; Waller, Mark P
2017-05-02
The ability to reason beyond established knowledge allows organic chemists to solve synthetic problems and invent novel transformations. Herein, we propose a model that mimics chemical reasoning, and formalises reaction prediction as finding missing links in a knowledge graph. We have constructed a knowledge graph containing 14.4 million molecules and 8.2 million binary reactions, which represents the bulk of all chemical reactions ever published in the scientific literature. Our model outperforms a rule-based expert system in the reaction prediction task for 180 000 randomly selected binary reactions. The data-driven model generalises even beyond known reaction types, and is thus capable of effectively (re-)discovering novel transformations (even including transition metal-catalysed reactions). Our model enables computers to infer hypotheses about reactivity and reactions by considering only the intrinsic local structure of the graph, and because each single reaction prediction is typically achieved in a sub-second time frame, the model can be used as a high-throughput generator of reaction hypotheses for reaction discovery. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
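The published knowledge graph holds millions of molecules, but the local-structure intuition behind treating reaction prediction as link prediction can be shown on a toy graph. The molecules, edges and scoring rule below are illustrative assumptions, not the authors' model.

```python
# Toy link-prediction illustration on a tiny "reacts with" graph (assumed
# edges): a missing pair is scored by how much one partner resembles molecules
# already known to react with the other.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("benzaldehyde", "aniline"), ("benzaldehyde", "methylamine"),
    ("acetaldehyde", "aniline"), ("acetaldehyde", "methylamine"),
    ("acetaldehyde", "hydrazine"),
])

def link_score(G, u, v):
    # similarity (shared reaction partners) of u to the known partners of v
    sims = nx.jaccard_coefficient(G, [(u, w) for w in G.neighbors(v)])
    return max((score for _, _, score in sims), default=0.0)

print(link_score(G, "benzaldehyde", "hydrazine"))   # > 0: plausible missing link
```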
2010-01-01
Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245
Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong
2010-01-18
The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.
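The paper's integration method is not reproduced here; the sketch below only illustrates the generic idea of combining an expression-derived association score between a transcription factor and a candidate target with a prior drawn from a knowledge base. The expression data, gene names, prior weights and mixing weight are all invented.

```python
# Generic knowledge-assisted scoring sketch (not the paper's algorithm):
# blend an expression correlation with a knowledge-base prior.
import numpy as np

rng = np.random.default_rng(2)
n_samples = 60
tf_expr = rng.normal(size=n_samples)                 # transcription factor profile
targets = {
    "GENE_A": 0.9 * tf_expr + rng.normal(scale=0.5, size=n_samples),
    "GENE_B": rng.normal(size=n_samples),
}
kb_prior = {"GENE_A": 1.0, "GENE_B": 0.1}            # hypothetical curated support

w = 0.5                                              # weight on prior knowledge
for gene, expr in targets.items():
    data_score = abs(np.corrcoef(tf_expr, expr)[0, 1])
    combined = (1 - w) * data_score + w * kb_prior[gene]
    print(gene, "combined score:", round(combined, 2))
```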
Form-Focused Discovery Activities in English Classes
ERIC Educational Resources Information Center
Ogeyik, Muhlise Cosgun
2011-01-01
Form-focused discovery activities allow language learners to grasp various aspects of a target language by contributing implicit knowledge by using discovered explicit knowledge. Moreover, such activities can assist learners to perceive and discover the features of their language input. In foreign language teaching environments, they can be used…
Covington, Brett C; McLean, John A; Bachmann, Brian O
2017-01-04
Covering: 2000 to 2016. The labor-intensive process of microbial natural product discovery is contingent upon identifying discrete secondary metabolites of interest within complex biological extracts, which contain inventories of all extractable small molecules produced by an organism or consortium. Historically, compound isolation prioritization has been driven by observed biological activity and/or relative metabolite abundance and followed by dereplication via accurate mass analysis. Decades of discovery using variants of these methods have generated the natural pharmacopeia but also contribute to recent high rediscovery rates. However, genomic sequencing reveals substantial untapped potential in previously mined organisms, and can provide useful prescience of potentially new secondary metabolites that ultimately enables isolation. Recently, advances in comparative metabolomics analyses have been coupled to secondary metabolic predictions to accelerate bioactivity- and abundance-independent discovery workflows. In this review we discuss the various analytical and computational techniques that enable MS-based metabolomic applications to natural product discovery and discuss the future prospects for comparative metabolomics in natural product discovery.
Ceríaco, Luis M P; Marques, Mariana P; Madeira, Natália C; Vila-Viçosa, Carlos M; Mendes, Paula
2011-09-05
Traditional Ecological Knowledge (TEK) and folklore are repositories of large amounts of information about the natural world. Ideas, perceptions and empirical data held by human communities regarding local species are important sources which enable new scientific discoveries to be made, as well as offering the potential to solve a number of conservation problems. We documented the gecko-related folklore and TEK of the people of southern Portugal, with the particular aim of understanding the main ideas relating to gecko biology and ecology. Our results suggest that local knowledge of gecko ecology and biology is both accurate and relevant. As a result of information provided by local inhabitants, knowledge of the current geographic distribution of Hemidactylus turcicus was expanded, with its presence reported in nine new locations. It was also discovered that locals still have some misconceptions of geckos as poisonous and carriers of dermatological diseases. The presence of these ideas has led the population to a fear of and aversion to geckos, resulting in direct persecution being one of the major conservation problems facing these animals. It is essential, from both a scientific and conservationist perspective, to understand the knowledge and perceptions that people have towards the animals, since, only then, may hitherto unrecognized pertinent information and conservation problems be detected and resolved.
75 FR 66766 - NIAID Blue Ribbon Panel Meeting on Adjuvant Discovery and Development
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-29
... identifies gaps in knowledge and capabilities, and defines NIAID's goals for the continued discovery... agenda for the discovery, development and clinical evaluation of adjuvants for use with preventive...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2011 CFR
2011-01-01
(a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...
12 CFR 19.170 - Discovery depositions.
Code of Federal Regulations, 2010 CFR
2010-01-01
(a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...
12 CFR 19.170 - Discovery depositions.
Code of Federal Regulations, 2011 CFR
2011-01-01
(a) General rule. In any... deposition of an expert, or of a person, including another party, who has direct knowledge of matters that...
12 CFR 263.53 - Discovery depositions.
Code of Federal Regulations, 2010 CFR
2010-01-01
(a) In general. In addition to the discovery permitted in subpart A of this part, limited discovery by means of depositions shall be allowed for individuals with knowledge of facts material to the...
BioGraph: unsupervised biomedical knowledge discovery via automated hypothesis generation
2011-01-01
We present BioGraph, a data integration and data mining platform for the exploration and discovery of biomedical information. The platform offers prioritizations of putative disease genes, supported by functional hypotheses. We show that BioGraph can retrospectively confirm recently discovered disease genes and identify potential susceptibility genes, outperforming existing technologies, without requiring prior domain knowledge. Additionally, BioGraph allows for generic biomedical applications beyond gene discovery. BioGraph is accessible at http://www.biograph.be. PMID:21696594
Ratnam, Joseline; Zdrazil, Barbara; Digles, Daniela; Cuadrado-Rodriguez, Emiliano; Neefs, Jean-Marc; Tipney, Hannah; Siebes, Ronald; Waagmeester, Andra; Bradley, Glyn; Chau, Chau Han; Richter, Lars; Brea, Jose; Evelo, Chris T.; Jacoby, Edgar; Senger, Stefan; Loza, Maria Isabel; Ecker, Gerhard F.; Chichester, Christine
2014-01-01
Integration of open access, curated, high-quality information from multiple disciplines in the Life and Biomedical Sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly-available pharmacological, physicochemical and biomolecular data, represents it in a stable infrastructure and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover 1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program, 2) the identification of compounds active against all targets in the Epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease, and 3) the evaluation of established targets in the Vitamin D metabolism pathway to aid novel Vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery. PMID:25522365
Imam, Fahim T.; Larson, Stephen D.; Bandrowski, Anita; Grethe, Jeffery S.; Gupta, Amarnath; Martone, Maryann E.
2012-01-01
An initiative of the NIH Blueprint for neuroscience research, the Neuroscience Information Framework (NIF) project advances neuroscience by enabling discovery and access to public research data and tools worldwide through an open source, semantically enhanced search portal. One of the critical components for the overall NIF system, the NIF Standardized Ontologies (NIFSTD), provides an extensive collection of standard neuroscience concepts along with their synonyms and relationships. The knowledge models defined in the NIFSTD ontologies enable an effective concept-based search over heterogeneous types of web-accessible information entities in NIF’s production system. NIFSTD covers major domains in neuroscience, including diseases, brain anatomy, cell types, sub-cellular anatomy, small molecules, techniques, and resource descriptors. Since the first production release in 2008, NIF has grown significantly in content and functionality, particularly with respect to the ontologies and ontology-based services that drive the NIF system. We report here on the structure, design principles, community engagement, and current state of the NIFSTD ontologies. PMID:22737162
Medical knowledge discovery and management.
Prior, Fred
2009-05-01
Although the volume of medical information is growing rapidly, the ability to rapidly convert this data into "actionable insights" and new medical knowledge is lagging far behind. The first step in the knowledge discovery process is data management and integration, which logically can be accomplished through the application of data warehouse technologies. A key insight that arises from efforts in biosurveillance and the global scope of military medicine is that information must be integrated over both time (longitudinal health records) and space (spatial localization of health-related events). Once data are compiled and integrated it is essential to encode the semantics and relationships among data elements through the use of ontologies and semantic web technologies to convert data into knowledge. Medical images form a special class of health-related information. Traditionally knowledge has been extracted from images by human observation and encoded via controlled terminologies. This approach is rapidly being replaced by quantitative analyses that more reliably support knowledge extraction. The goals of knowledge discovery are the improvement of both the timeliness and accuracy of medical decision making and the identification of new procedures and therapies.
Newton, Mandi S; Scott-Findlay, Shannon
2007-01-01
Background In the past 15 years, knowledge translation in healthcare has emerged as a multifaceted and complex agenda. Theoretical and polemical discussions, the development of a science to study and measure the effects of translating research evidence into healthcare, and the role of key stakeholders including academe, healthcare decision-makers, the public, and government funding bodies have brought scholarly, organizational, social, and political dimensions to the agenda. Objective This paper discusses the current knowledge translation agenda in Canadian healthcare and how elements in this agenda shape the discovery and translation of health knowledge. Discussion The current knowledge translation agenda in Canadian healthcare involves the influence of values, priorities, and people; stakes which greatly shape the discovery of research knowledge and how it is or is not instituted in healthcare delivery. As this agenda continues to take shape and direction, ensuring that it is accountable for its influences is essential and should be at the forefront of concern to the Canadian public and healthcare community. This transparency will allow for scrutiny, debate, and improvements in health knowledge discovery and health services delivery. PMID:17916256
A Drupal-Based Collaborative Framework for Science Workflows
NASA Astrophysics Data System (ADS)
Pinheiro da Silva, P.; Gandara, A.
2010-12-01
Cyber-infrastructure combines technical infrastructure with organizational practices and social norms to support scientific teams that work together, or depend on each other, to conduct scientific research. Such cyber-infrastructure enables the sharing of information and data so that scientists can leverage knowledge and expertise through automation. Scientific workflow systems have been used to build automated scientific systems used by scientists to conduct scientific research and, as a result, create artifacts in support of scientific discoveries. These complex systems are often developed by teams of scientists who are located in different places, e.g., scientists working in distinct buildings, and sometimes in different time zones, e.g., scientists working in distinct national laboratories. The sharing of these specifications is currently supported by the use of version control systems such as CVS or Subversion. Discussions about the design, improvement, and testing of these specifications, however, often happen elsewhere, e.g., through the exchange of email messages and IM chatting. Carrying on a discussion about these specifications is challenging because comments and specifications are not necessarily connected. For instance, the person reading a comment about a given workflow specification may not be able to see the workflow, and even if the person can see the workflow, the person may not know specifically to which part of the workflow a given comment applies. In this paper, we discuss the design, implementation and use of CI-Server, a Drupal-based infrastructure, to support the collaboration of both local and distributed teams of scientists using scientific workflows. CI-Server has three primary goals: to enable information sharing by providing tools that scientists can use within their scientific research to process data, publish and share artifacts; to build community by providing tools that support discussions between scientists about artifacts used or created through scientific processes; and to leverage the knowledge collected within the artifacts and scientific collaborations to support scientific discoveries.
Discovering Knowledge from AIS Database for Application in VTS
NASA Astrophysics Data System (ADS)
Tsou, Ming-Cheng
The widespread use of the Automatic Identification System (AIS) has had a significant impact on maritime technology. AIS enables the Vessel Traffic Service (VTS) not only to offer commonly known functions such as identification, tracking and monitoring of vessels, but also to provide rich real-time information that is useful for marine traffic investigation, statistical analysis and theoretical research. However, due to the rapid accumulation of AIS observation data, the VTS platform is often unable to absorb and analyze it quickly and effectively. Traditional observation and analysis methods are becoming less suitable for the modern AIS generation of VTS. In view of this, we applied the same data mining technique used for business intelligence discovery (in Customer Relationship Management (CRM) business marketing) to the analysis of AIS observation data. This recasts the marine traffic problem as a business-marketing problem and integrates technologies such as Geographic Information Systems (GIS), database management systems, data warehousing and data mining to facilitate the discovery of hidden and valuable information in a huge amount of observation data. Consequently, this provides marine traffic managers with a useful strategic planning resource.
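As a small illustration of the kind of pattern hidden in accumulated AIS archives, the sketch below bins synthetic position reports into a traffic-density grid and reports the busiest cell. The coordinates are fabricated; a real analysis would start from decoded AIS latitude/longitude fields.

```python
# Traffic-density grid over synthetic AIS position reports (fabricated data;
# real input would be decoded latitude/longitude from AIS messages).
import numpy as np

rng = np.random.default_rng(3)
lat = 22.2 + 0.3 * rng.random(10_000)
lon = 114.0 + 0.4 * rng.random(10_000)

density, lat_edges, lon_edges = np.histogram2d(lat, lon, bins=20)
i, j = np.unravel_index(density.argmax(), density.shape)
print("busiest cell centre: %.3f N, %.3f E (%d reports)" % (
    (lat_edges[i] + lat_edges[i + 1]) / 2,
    (lon_edges[j] + lon_edges[j + 1]) / 2,
    int(density[i, j]),
))
```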
2017-12-08
Image released April 19, 2013. Astronomers have used NASA's Hubble Space Telescope to photograph the iconic Horsehead Nebula in a new, infrared light to mark the 23rd anniversary of the famous observatory's launch aboard the space shuttle Discovery on April 24, 1990. Looking like an apparition rising from whitecaps of interstellar foam, the iconic Horsehead Nebula has graced astronomy books ever since its discovery more than a century ago. The nebula is a favorite target for amateur and professional astronomers. It is shadowy in optical light. It appears transparent and ethereal when seen at infrared wavelengths. The rich tapestry of the Horsehead Nebula pops out against the backdrop of Milky Way stars and distant galaxies that easily are visible in infrared light. Credit: NASA, ESA, and the Hubble Heritage Team (STScI/AURA) NASA Goddard Space Flight Center enables NASA’s mission through four scientific endeavors: Earth Science, Heliophysics, Solar System Exploration, and Astrophysics. Goddard plays a leading role in NASA’s accomplishments by contributing compelling scientific knowledge to advance the Agency’s mission.
A historical perspective on the role of sensory nerves in neurogenic inflammation.
Sousa-Valente, João; Brain, Susan D
2018-05-01
The term 'neurogenic inflammation' is commonly used, especially with respect to the role of sensory nerves within inflammatory disease. However, despite over a century of research, we remain unclear about the role of these nerves in the vascular biology of inflammation, as compared with their interacting role in pain processing and of their potential for therapeutic manipulation. This chapter attempts to discuss the progress in understanding, from the initial discovery of sensory nerves until the present day. This covers pioneering findings that these nerves exist, are involved in vascular events and act as important sensors of environmental changes, including injury and infection. This is followed by discovery of the contents they release such as the established vasoactive neuropeptides substance P and CGRP as well as anti-inflammatory peptides such as the opioids and somatostatin. The more recent emergence of the importance of the transient receptor potential (TRP) channels has revealed some of the mechanisms by which these nerves sense environmental stimuli. This knowledge enables a platform from which to learn of the potential role of neurogenic inflammation in disease and in turn of novel therapeutic targets.
Dissimilatory Reduction of Extracellular Electron Acceptors in Anaerobic Respiration
Richter, Katrin; Schicklberger, Marcus
2012-01-01
An extension of the respiratory chain to the cell surface is necessary to reduce extracellular electron acceptors like ferric iron or manganese oxides. In the past few years, more and more compounds were revealed to be reduced at the surface of the outer membrane of Gram-negative bacteria, and the list does not seem to have an end so far. Shewanella as well as Geobacter strains are model organisms to discover the biochemistry that enables the dissimilatory reduction of extracellular electron acceptors. In both cases, c-type cytochromes are essential electron-transferring proteins. They make the journey of respiratory electrons from the cytoplasmic membrane through periplasm and over the outer membrane possible. Outer membrane cytochromes have the ability to catalyze the last step of the respiratory chains. Still, recent discoveries provided evidence that they are accompanied by further factors that allow or at least facilitate extracellular reduction. This review gives a condensed overview of our current knowledge of extracellular respiration, highlights recent discoveries, and discusses critically the influence of different strategies for terminal electron transfer reactions. PMID:22179232
The center for causal discovery of biomedical knowledge from big data.
Cooper, Gregory F; Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-11-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
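The Center's tools implement full constraint-based and Bayesian causal discovery at supercomputing scale; the sketch below shows only the elementary building block of the constraint-based family, removing edges of a fully connected skeleton whenever a (partial-)correlation test cannot reject conditional independence, limited to conditioning sets of size 0 and 1. It is an illustration of the idea, not the Center's software.

```python
# Minimal PC-style skeleton search (conditioning sets of size 0 and 1 only);
# real constraint-based tools handle larger sets, orientation rules and scale.
import itertools
import numpy as np
from scipy import stats

def partial_corr(data, i, j, k=None):
    if k is None:
        return np.corrcoef(data[:, i], data[:, j])[0, 1]
    resid = []
    for v in (i, j):                       # regress out k, correlate residuals
        slope, intercept, *_ = stats.linregress(data[:, k], data[:, v])
        resid.append(data[:, v] - (slope * data[:, k] + intercept))
    return np.corrcoef(resid[0], resid[1])[0, 1]

def skeleton(data, alpha=0.01):
    n, p = data.shape
    edges = set(itertools.combinations(range(p), 2))
    for i, j in list(edges):
        for k in [None] + [k for k in range(p) if k not in (i, j)]:
            r = partial_corr(data, i, j, k)
            dof = n - (4 if k is not None else 3)
            z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(dof)   # Fisher z test
            if 2 * (1 - stats.norm.cdf(abs(z))) > alpha:         # independence not rejected
                edges.discard((i, j))
                break
    return edges

# toy chain X0 -> X1 -> X2: the X0-X2 edge should vanish given X1
rng = np.random.default_rng(4)
x0 = rng.normal(size=2000)
x1 = x0 + 0.5 * rng.normal(size=2000)
x2 = x1 + 0.5 * rng.normal(size=2000)
print(skeleton(np.column_stack([x0, x1, x2])))
```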
NASA Astrophysics Data System (ADS)
McGovern, Mary Francis
Non-formal environmental education provides students the opportunity to learn in ways that would not be possible in a traditional classroom setting. Outdoor learning allows students to make connections to their environment and helps to foster an appreciation for nature. This type of education can be interdisciplinary---students not only develop skills in science, but also in mathematics, social studies, technology, and critical thinking. This case study focuses on a non-formal marine education program, the South Carolina Department of Natural Resources' (SCDNR) Discovery vessel based program. The Discovery curriculum was evaluated to determine impact on student knowledge about and attitude toward the estuary. Students from two South Carolina coastal counties who attended the boat program during fall 2014 were asked to complete a brief survey before, immediately after, and two weeks following the program. The results of this study indicate that both student knowledge about and attitude significantly improved after completion of the Discovery vessel based program. Knowledge and attitude scores demonstrated a positive correlation.
The Virtual Learning Commons (VLC): Enabling Co-Innovation Across Disciplines
NASA Astrophysics Data System (ADS)
Pennington, D. D.; Gandara, A.; Del Rio, N.
2014-12-01
A key challenge for scientists addressing grand-challenge problems is identifying, understanding, and integrating potentially relevant methods, models and tools that are rapidly evolving in the informatics community. Such tools are essential for effectively integrating data and models in complex research projects, yet it is often difficult to know what tools are available and it is not easy to understand or evaluate how they might be used in a given research context. The goal of the National Science Foundation-funded Virtual Learning Commons (VLC) is to improve awareness and understanding of emerging methodologies and technologies, facilitate individual and group evaluation of these, and trace the impact of innovations within and across teams, disciplines, and communities. The VLC is a Web-based social bookmarking site designed specifically to support knowledge exchange in research communities. It is founded on well-developed models of technology adoption, diffusion of innovation, and experiential learning. The VLC makes use of Web 2.0 (Social Web) and Web 3.0 (Semantic Web) approaches. Semantic Web approaches enable discovery of potentially relevant methods, models, and tools, while Social Web approaches enable collaborative learning about their function. The VLC is under development and the first release is expected Fall 2014.
Isidro-Llobet, Albert; Hadje Georgiou, Kathy; Galloway, Warren R J D; Giacomini, Elisa; Hansen, Mette R; Méndez-Abt, Gabriela; Tan, Yaw Sing; Carro, Laura; Sore, Hannah F; Spring, David R
2015-04-21
Macrocyclic peptidomimetics are associated with a broad range of biological activities. However, despite such potentially valuable properties, the macrocyclic peptidomimetic structural class is generally considered as being poorly explored within drug discovery. This has been attributed to the lack of general methods for producing collections of macrocyclic peptidomimetics with high levels of structural, and thus shape, diversity. In particular, there is a lack of scaffold diversity in current macrocyclic peptidomimetic libraries; indeed, the efficient construction of diverse molecular scaffolds presents a formidable general challenge to the synthetic chemist. Herein we describe a new, advanced strategy for the diversity-oriented synthesis (DOS) of macrocyclic peptidomimetics that enables the combinatorial variation of molecular scaffolds (core macrocyclic ring architectures). The generality and robustness of this DOS strategy is demonstrated by the step-efficient synthesis of a structurally diverse library of over 200 macrocyclic peptidomimetic compounds, each based around a distinct molecular scaffold and isolated in milligram quantities, from readily available building-blocks. To the best of our knowledge this represents an unprecedented level of scaffold diversity in a synthetically derived library of macrocyclic peptidomimetics. Cheminformatic analysis indicated that the library compounds access regions of chemical space that are distinct from those addressed by top-selling brand-name drugs and macrocyclic natural products, illustrating the value of our DOS approach to sample regions of chemical space underexploited in current drug discovery efforts. An analysis of three-dimensional molecular shapes illustrated that the DOS library has a relatively high level of shape diversity.
Three-Year College Discovery Master Plan, Bronx Community College, 1998-2001, Parts I-III.
ERIC Educational Resources Information Center
Smith, Shirley; Santa Rita, Emilio
Bronx Community College created a three-year College Discovery (CD) master plan for 1998-2001 to help restructure its counseling programs and support services and enable CD students to acquire an associate's degree level of education. The first area of restructuring is in the role of the director of College Discovery and Counseling. General…
37 CFR 1.71 - Detailed description and specification of the invention.
Code of Federal Regulations, 2011 CFR
2011-07-01
... enable any person skilled in the art or science to which the invention or discovery appertains, or with... specification must include a written description of the invention or discovery and of the manner and process of...
37 CFR 1.71 - Detailed description and specification of the invention.
Code of Federal Regulations, 2010 CFR
2010-07-01
... enable any person skilled in the art or science to which the invention or discovery appertains, or with... specification must include a written description of the invention or discovery and of the manner and process of...
[Challenges and strategies of drug innovation].
Guo, Zong-Ru; Zhao, Hong-Yu
2013-07-01
Drug research involves scientific discovery, technological inventions and product development. This multidimensional effort embodies both high risk and high reward and is considered one of the most complicated human activities. Prior to the initiation of a program, an in-depth analysis of "what to do" and "how to do it" must be conducted. On the macro level, market prospects, capital required, risk assessment, necessary human resources, etc. need to be evaluated critically. For execution, drug candidates need to be optimized in multiple properties such as potency, selectivity, pharmacokinetics, safety, formulation, etc., all under the constraint of a finite amount of time and resources, to maximize the probability of success in clinical development. Drug discovery is enormously complicated, both in terms of technological innovation and organizing capital and other resources. A deep understanding of the complexity of drug research and our competitive edge is critical for success. Our unique government-enterprise-academia system represents a distinct advantage. As a new player, we have not heavily invested in any particular discovery paradigm, which allows us to select the optimal approach with little organizational burden. The virtual R&D model using CROs has gained momentum lately, and China is a global leader in the CRO market. Essentially all technological support for drug discovery can be found in China, which greatly enables domestic R&D efforts. The information technology revolution ensures the globalization of drug discovery knowledge, which has bridged much of the gap between China and the developed countries. The blockbuster model and the target-centric drug discovery paradigm have overlooked research in several important fields, such as injectable drugs, orphan drugs, and the follow-up of high-quality therapeutic leads. Prejudice against covalent ligands, prodrugs and non-drug-like ligands can also be turned to advantage in finding novel medicines. This article will discuss the current challenges and future opportunities for drug innovation in China.
Structure-Based Virtual Screening for Drug Discovery: Principles, Applications and Recent Advances
Lionta, Evanthia; Spyrou, George; Vassilatis, Demetrios K.; Cournia, Zoe
2014-01-01
Structure-based drug discovery (SBDD) is becoming an essential tool in assisting fast and cost-efficient lead discovery and optimization. The application of rational, structure-based drug design has proven to be more efficient than the traditional way of drug discovery, since it aims to understand the molecular basis of a disease and utilizes knowledge of the three-dimensional structure of the biological target in the process. In this review, we focus on the principles and applications of Virtual Screening (VS) within the context of SBDD and examine different procedures, ranging from the initial stages of the process that include receptor and library pre-processing, to docking, scoring and post-processing of top-scoring hits. Recent improvements in structure-based virtual screening (SBVS) efficiency through ensemble docking, induced fit and consensus docking are also discussed. The review highlights advances in the field within the framework of several success stories that have led to nM inhibition directly from VS, presents recent trends in library design, and discusses limitations of the method. Applications of SBVS in the design of substrates for engineered proteins that enable the discovery of new metabolic and signal transduction pathways, and in the design of inhibitors of multifunctional proteins, are also reviewed. Finally, we contribute two promising VS protocols recently developed by us that aim to increase inhibitor selectivity. In the first protocol, we describe the discovery of micromolar inhibitors through SBVS designed to inhibit the mutant H1047R PI3Kα kinase. Second, we discuss a strategy for the identification of selective binders for the RXRα nuclear receptor. In this protocol, a set of target structures is constructed for ensemble docking based on binding site shape characterization and clustering, aiming to enhance the hit rate of selective inhibitors for the desired protein target through the SBVS process. PMID:25262799
A bioinformatics knowledge discovery in text application for grid computing
Castellano, Marcello; Mastronardi, Giuseppe; Bellotti, Roberto; Tarricone, Gianfranco
2009-01-01
Background A fundamental activity in biomedical research is knowledge discovery, which involves searching through large amounts of biomedical information such as documents and data. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of information and communication resources in the life sciences. The goal of this work was to develop a software middleware solution that exploits knowledge discovery applications on scalable and distributed computing systems to achieve intensive use of ICT resources. Methods The development of a grid application for Knowledge Discovery in Text, based on a middleware solution, is presented. The system must be able to model the user application and split the processing into many parallel jobs distributed across the computational nodes. Finally, the system must be aware of the available computational resources and their status, and must be able to monitor the execution of the parallel jobs. These operational requirements led to the design of a middleware that is specialized through user application modules. It includes a graphical user interface providing access to a node search system, a load-balancing system, and a transfer optimizer that reduces communication costs. Results A prototype of the middleware solution and its performance evaluation in terms of the speed-up factor are presented. It was written in Java on Globus Toolkit 4 to build a grid infrastructure based on GNU/Linux computing nodes. A test was carried out, and results are shown for named entity recognition of symptoms and pathologies applied to a collection of 5,000 scientific documents taken from PubMed. Conclusion In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a Knowledge Discovery in Databases computation was applied to the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities. PMID:19534749
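To make the speed-up evaluation above concrete, the following minimal Python sketch tags symptom and pathology terms in a batch of documents serially and then in parallel, and reports the resulting speed-up factor. The dictionary-based tagger, the term lists, and the four-worker pool are illustrative assumptions, not the middleware or the named entity recognizer used in the paper.

# Minimal sketch: dictionary-based named entity recognition for symptoms and
# pathologies, run serially and in parallel to estimate a speed-up factor.
# The term lists and documents are illustrative placeholders.
import re
import time
from multiprocessing import Pool

SYMPTOMS = {"fever", "cough", "fatigue", "headache"}
PATHOLOGIES = {"pneumonia", "influenza", "asthma"}

def tag_document(text):
    """Return the symptom and pathology terms found in one document."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {"symptoms": tokens & SYMPTOMS, "pathologies": tokens & PATHOLOGIES}

if __name__ == "__main__":
    docs = ["Patient presents with fever and cough, suspected pneumonia."] * 5000

    t0 = time.time()
    serial = [tag_document(d) for d in docs]
    t_serial = time.time() - t0

    t0 = time.time()
    with Pool(processes=4) as pool:          # stands in for the grid nodes
        parallel = pool.map(tag_document, docs)
    t_parallel = time.time() - t0

    print("speed-up factor:", t_serial / t_parallel)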
NASA Astrophysics Data System (ADS)
Cook, R.; Michener, W.; Vieglais, D.; Budden, A.; Koskela, R.
2012-04-01
Addressing grand environmental science challenges requires unprecedented access to easily understood data that cross the breadth of temporal, spatial, and thematic scales. Tools are needed to plan management of the data, discover the relevant data, integrate heterogeneous and diverse data, and convert the data to information and knowledge. Addressing these challenges requires new approaches for the full data life cycle of managing, preserving, sharing, and analyzing data. DataONE (Observation Network for Earth) represents a virtual organization that enables new science and knowledge creation through preservation and access to data about life on Earth and the environment that sustains it. The DataONE approach is to improve data collection and management techniques; facilitate easy, secure, and persistent storage of data; continue to increase access to data and tools that improve data interoperability; disseminate integrated and user-friendly tools for data discovery and novel analyses; work with researchers to build intuitive data exploration and visualization tools; and support communities of practice via education, outreach, and stakeholder engagement.
ERIC Educational Resources Information Center
Tsantis, Linda; Castellani, John
2001-01-01
This article explores how knowledge-discovery applications can empower educators with the information they need to provide anticipatory guidance for teaching and learning, forecast school and district needs, and find critical markers for making the best program decisions for children and youth with disabilities. Data mining for schools is…
ERIC Educational Resources Information Center
Molina, Otilia Alejandro; Ratté, Sylvie
2017-01-01
This research introduces a method to construct a unified representation of teachers and students perspectives based on the actionable knowledge discovery (AKD) and delivery framework. The representation is constructed using two models: one obtained from student evaluations and the other obtained from teachers' reflections about their teaching…
ERIC Educational Resources Information Center
Taft, Laritza M.
2010-01-01
In its report "To Err is Human", The Institute of Medicine recommended the implementation of internal and external voluntary and mandatory automatic reporting systems to increase detection of adverse events. Knowledge Discovery in Databases (KDD) allows the detection of patterns and trends that would be hidden or less detectable if analyzed by…
Knowledge Discovery Process: Case Study of RNAV Adherence of Radar Track Data
NASA Technical Reports Server (NTRS)
Matthews, Bryan
2018-01-01
This talk is an introduction to the knowledge discovery process, covering: identifying the problem, choosing data sources, matching the appropriate machine learning tools, and reviewing the results. The overview will be given in the context of an ongoing study that is assessing RNAV adherence of commercial aircraft in the national airspace.
Antibody-enabled small-molecule drug discovery.
Lawson, Alastair D G
2012-06-29
Although antibody-based therapeutics have become firmly established as medicines for serious diseases, the value of antibodies as tools in the early stages of small-molecule drug discovery is only beginning to be realized. In particular, antibodies may provide information to reduce risk in small-molecule drug discovery by enabling the validation of targets and by providing insights into the design of small-molecule screening assays. Moreover, antibodies can act as guides in the quest for small molecules that have the ability to modulate protein-protein interactions, which have traditionally only been considered to be tractable targets for biological drugs. The development of small molecules that have similar therapeutic effects to current biologics has the potential to benefit a broader range of patients at earlier stages of disease.
Organs-on-chips at the frontiers of drug discovery
Esch, Eric W.; Bahinski, Anthony; Huh, Dongeun
2016-01-01
Improving the effectiveness of preclinical predictions of human drug responses is critical to reducing costly failures in clinical trials. Recent advances in cell biology, microfabrication and microfluidics have enabled the development of microengineered models of the functional units of human organs — known as organs-on-chips — that could provide the basis for preclinical assays with greater predictive power. Here, we examine the new opportunities for the application of organ-on-chip technologies in a range of areas in preclinical drug discovery, such as target identification and validation, target-based screening, and phenotypic screening. We also discuss emerging drug discovery opportunities enabled by organs-on-chips, as well as important challenges in realizing the full potential of this technology. PMID:25792263
Cernak, Tim; Gesmundo, Nathan J; Dykstra, Kevin; Yu, Yang; Wu, Zhicai; Shi, Zhi-Cai; Vachal, Petr; Sperbeck, Donald; He, Shuwen; Murphy, Beth Ann; Sonatore, Lisa; Williams, Steven; Madeira, Maria; Verras, Andreas; Reiter, Maud; Lee, Claire Heechoon; Cuff, James; Sherer, Edward C; Kuethe, Jeffrey; Goble, Stephen; Perrotto, Nicholas; Pinto, Shirly; Shen, Dong-Ming; Nargund, Ravi; Balkovec, James; DeVita, Robert J; Dreher, Spencer D
2017-05-11
Miniaturization and parallel processing play an important role in the evolution of many technologies. We demonstrate the application of miniaturized high-throughput experimentation methods to resolve synthetic chemistry challenges on the frontlines of a lead optimization effort to develop diacylglycerol acyltransferase (DGAT1) inhibitors. Reactions were performed on ∼1 mg scale using glass microvials, providing a miniaturized high-throughput experimentation capability that was used to study a challenging SNAr reaction. The availability of robust synthetic chemistry conditions discovered in these miniaturized investigations enabled the development of structure-activity relationships that ultimately led to the discovery of soluble, selective, and potent inhibitors of DGAT1.
Plowright, Alleyn T; Johnstone, Craig; Kihlberg, Jan; Pettersson, Jonas; Robb, Graeme; Thompson, Richard A
2012-01-01
In drug discovery, the central process of constructing and testing hypotheses, carefully conducting experiments and analysing the associated data for new findings and information is known as the design-make-test-analyse cycle. Each step relies heavily on the inputs and outputs of the other three components. In this article we report our efforts to improve and integrate all parts to enable smooth and rapid flow of high quality ideas. Key improvements include enhancing multi-disciplinary input into 'Design', increasing the use of knowledge and reducing cycle times in 'Make', providing parallel sets of relevant data within ten working days in 'Test' and maximising the learning in 'Analyse'. Copyright © 2011 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Boulicaut, Jean-Francois; Jeudy, Baptiste
Knowledge Discovery in Databases (KDD) is a complex interactive process. The promising theoretical framework of inductive databases considers it to be essentially a querying process, enabled by a query language that can deal either with raw data or with the patterns that hold in the data. Mining patterns then becomes the so-called inductive query evaluation process, for which constraint-based data mining techniques have to be designed. An inductive query specifies declaratively the desired constraints, and algorithms are used to compute the patterns satisfying those constraints in the data. We survey important results of this active research domain. This chapter emphasizes a real breakthrough for hard problems concerning local pattern mining under various constraints, and it points out current directions of research as well.
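As a concrete illustration of an inductive query, the short Python sketch below enumerates itemsets that satisfy both a minimum-support constraint and a syntactic must-contain constraint over a toy transaction set. The data, thresholds, and brute-force enumeration are illustrative assumptions; real constraint-based miners prune the search space far more aggressively.

# Minimal sketch of an "inductive query": mine itemsets that satisfy both a
# frequency constraint (minimum support) and a syntactic constraint (the
# pattern must contain a given item). Data and thresholds are illustrative.
from itertools import combinations

transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"a", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
]

def support(itemset):
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def inductive_query(min_support=0.4, must_contain="c", max_size=3):
    items = sorted(set().union(*transactions))
    results = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            pattern = set(combo)
            # declarative constraints: anti-monotone (support) plus syntactic
            if must_contain in pattern and support(pattern) >= min_support:
                results.append((pattern, support(pattern)))
    return results

for pattern, sup in inductive_query():
    print(sorted(pattern), round(sup, 2))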
ERIC Educational Resources Information Center
Harmon, Glynn
2013-01-01
The term discovery applies herein to the successful outcome of inquiry in which a significant personal, professional or scholarly breakthrough or insight occurs, and which is individually or socially acknowledged as a key contribution to knowledge. Since discoveries culminate at fixed points in time, discoveries can serve as an outcome metric for…
Jiang, Guoqian; Wang, Chen; Zhu, Qian; Chute, Christopher G
2013-01-01
Knowledge-driven text mining is becoming an important research area for identifying pharmacogenomics target genes. However, few such studies have focused on the pharmacogenomics targets of adverse drug events (ADEs). The objective of the present study is to build a framework of knowledge integration and discovery that aims to support pharmacogenomics target prediction for ADEs. We integrate a semantically annotated literature corpus, Semantic MEDLINE, with a semantically coded ADE knowledge base known as ADEpedia using a semantic web based framework. We developed a knowledge discovery approach combining network analysis of a protein-protein interaction (PPI) network with a gene functional classification approach. We performed a case study of drug-induced long QT syndrome to demonstrate the usefulness of the framework in predicting potential pharmacogenomics targets of ADEs.
Pohl, Barbara; Fins, Joseph J
2014-04-01
Although health care reform efforts are laudably directed at promoting quality and efficiency, added bureaucracy may have the unintended consequence of constraining physicians' creativity. This has the potential to undermine clinicians' freedom to reframe their thinking in response to unfolding biological knowledge, a defining feature of academic medicine. In this Perspective, the authors illustrate the confluence of creativity, context, and discovery through a historical example: the evolution of tuberculosis (TB) multidrug chemotherapy as espoused by Walsh McDermott and his colleagues during the 1940s and 1950s.Before the discovery of streptomycin in 1943, clinician-researchers aimed to identify a "magic bullet" that would rapidly eradicate tubercle bacilli from the body. In the years following the discovery of streptomycin, it became clear that the biology of TB did not conform to researchers' expectations. The recognition that treatment would neither be simple nor quick prompted further attempts to devise an optimal streptomycin regimen, which would enable the host's immune system to suppress infection and prevent the emergence of streptomycin-resistant strains. By the late 1950s, investigators clarified the limits of streptomycin's effectiveness, which led to combined chemotherapy. In so doing, they gained a better understanding of drug-bacilli-host interactions and shifted attention from the host to the drug-resistant microbe.The authors argue that this tale of discovery offers a latent lesson for academic medicine: As the health care system undergoes systemic restructuring, it is essential to preserve the freedom to reframe thinking and creatively solve translational problems in research and practice.
discovery toolset for Emulytics v. 1.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fritz, David; Crussell, Jonathan
The discovery toolset for Emulytics enables the construction of high-fidelity emulation models of systems. The toolset consists of a set of tools and techniques to automatically go from network discovery of operational systems to emulating those complex systems. Our toolset combines data from host discovery and network mapping tools into an intermediate representation that can then be further refined. Once the intermediate representation reaches the desired state, our toolset supports emitting the Emulytics models with varying levels of specificity based on experiment needs.
Hooks and Shifts: A Dialectical Study of Mediated Discovery
ERIC Educational Resources Information Center
Abrahamson, Dor; Trninic, Dragan; Gutierrez, Jose F.; Huth, Jacob; Lee, Rosa G.
2011-01-01
Radical constructivists advocate discovery-based pedagogical regimes that enable students to incrementally and continuously adapt their cognitive structures to the instrumented cultural environment. Some sociocultural theorists, however, maintain that learning implies discontinuity in conceptual development, because novices must appropriate expert…
'Big Data' Collaboration: Exploring, Recording and Sharing Enterprise Knowledge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sukumar, Sreenivas R; Ferrell, Regina Kay
2013-01-01
As data sources and data sizes proliferate, knowledge discovery from "Big Data" is starting to pose several challenges. In this paper, we address a specific challenge in the practice of enterprise knowledge management when extracting actionable nuggets from diverse data sources of seemingly related information. In particular, we address the challenge of archiving knowledge gained through collaboration, dissemination and visualization as part of the data analysis, inference and decision-making lifecycle. We motivate the implementation of an enterprise data-discovery and knowledge recorder tool, called SEEKER, based on a real-world case study. We demonstrate SEEKER capturing schema and data-element relationships, and tracking the data elements of value based on the queries and the analytical artifacts being created by analysts as they use the data. We show how the tool serves as a digital record of institutional domain knowledge and as documentation of the evolution of data elements, queries and schemas over time. As a knowledge management service, a tool like SEEKER saves enterprise resources and time by avoiding analytic silos, expediting the process of multi-source data integration and intelligently documenting discoveries from fellow analysts.
NASA Technical Reports Server (NTRS)
Tilton, James C.; Cook, Diane J.
2008-01-01
Under a project recently selected for funding by NASA's Science Mission Directorate under the Applied Information Systems Research (AISR) program, Tilton and Cook will design and implement the integration of the Subdue graph based knowledge discovery system, developed at the University of Texas Arlington and Washington State University, with image segmentation hierarchies produced by the RHSEG software, developed at NASA GSFC, and perform pilot demonstration studies of data analysis, mining and knowledge discovery on NASA data. Subdue represents a method for discovering substructures in structural databases. Subdue is devised for general-purpose automated discovery, concept learning, and hierarchical clustering, with or without domain knowledge. Subdue was developed by Cook and her colleague, Lawrence B. Holder. For Subdue to be effective in finding patterns in imagery data, the data must be abstracted up from the pixel domain. An appropriate abstraction of imagery data is a segmentation hierarchy: a set of several segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. The RHSEG program, a recursive approximation to a Hierarchical Segmentation approach (HSEG), can produce segmentation hierarchies quickly and effectively for a wide variety of images. RHSEG and HSEG were developed at NASA GSFC by Tilton. In this presentation we provide background on the RHSEG and Subdue technologies and present a preliminary analysis on how RHSEG and Subdue may be combined to enhance image data analysis, mining and knowledge discovery.
Peng, Bin; Zhu, Dianwen; Ander, Bradley P; Zhang, Xiaoshuai; Xue, Fuzhong; Sharp, Frank R; Yang, Xiaowei
2013-01-01
The discovery of genetic or genomic markers plays a central role in the development of personalized medicine. A notable challenge exists when dealing with the high dimensionality of the data sets, as thousands of genes or millions of genetic variants are collected on a relatively small number of subjects. Traditional gene-wise selection methods using univariate analyses have difficulty incorporating correlational, structural, or functional relationships among the molecular measures. For microarray gene expression data, we first summarize solutions for dealing with 'large p, small n' problems, and then propose an integrative Bayesian variable selection (iBVS) framework for simultaneously identifying causal or marker genes and regulatory pathways. A novel partial least squares (PLS) g-prior for iBVS is developed to allow the incorporation of prior knowledge on gene-gene interactions or functional relationships. From the point of view of systems biology, iBVS enables users to directly target the joint effects of multiple genes and pathways in a hierarchical modeling diagram to predict disease status or phenotype. The estimated posterior selection probabilities offer probabilistic and biological interpretations. Both simulated data and a set of microarray data for predicting stroke status are used to validate the performance of iBVS in a probit model with binary outcomes. iBVS offers a general framework for effective discovery of various molecular biomarkers by combining data-based statistics and knowledge-based priors. Guidelines on making posterior inferences, determining Bayesian significance levels, and improving computational efficiency are also discussed.
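The sketch below illustrates the general idea of Bayesian variable selection with a probit likelihood on simulated data: it enumerates small gene subsets, approximates each model's evidence with BIC, and reports posterior inclusion probabilities. It is a toy stand-in for intuition only and does not implement the iBVS framework or its PLS g-prior; the simulated data, the BIC approximation, and the exhaustive enumeration are all assumptions made for illustration.

# Toy sketch of Bayesian variable selection with a probit likelihood.
# Not the iBVS method: model evidence is approximated with BIC and all
# 2^p subsets of a few simulated "genes" are enumerated.
import numpy as np
from itertools import combinations
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n, p = 120, 6
X = rng.normal(size=(n, p))
y = (X[:, 0] - 1.2 * X[:, 2] + rng.normal(size=n) > 0).astype(float)  # genes 0 and 2 matter

def neg_loglik(beta, Xs, y):
    eta = Xs @ beta
    return -(y * norm.logcdf(eta) + (1 - y) * norm.logcdf(-eta)).sum()

def bic(subset):
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    res = minimize(neg_loglik, np.zeros(Xs.shape[1]), args=(Xs, y), method="BFGS")
    return 2 * res.fun + Xs.shape[1] * np.log(n)

models = [subset for k in range(p + 1) for subset in combinations(range(p), k)]
bics = np.array([bic(m) for m in models])
weights = np.exp(-0.5 * (bics - bics.min()))   # BIC-based model weights
weights /= weights.sum()

for j in range(p):
    prob = sum(w for m, w in zip(models, weights) if j in m)
    print(f"gene {j}: posterior inclusion probability {prob:.2f}")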
Predicting future discoveries from current scientific literature.
Petrič, Ingrid; Cestnik, Bojan
2014-01-01
Knowledge discovery in biomedicine is a time-consuming process, starting from basic research, through preclinical testing, towards possible clinical applications. Crossing conceptual boundaries is often needed for groundbreaking biomedical research that generates highly inventive discoveries. We demonstrate the ability of a creative literature mining method to advance valuable new discoveries based on rare ideas from existing literature. When emerging ideas from the scientific literature are put together as fragments of knowledge in a systematic way, they may lead to original, sometimes surprising, research findings. If enough scientific evidence has already been published for the association of such findings, they can be considered scientific hypotheses. In this chapter, we describe a method for the computer-aided generation of such hypotheses based on the existing scientific literature. Our literature-based discovery of NF-kappaB and its possible connections to autism was recently confirmed by the scientific community, which demonstrates the ability of our literature mining methodology to accelerate future discoveries based on rare ideas from existing literature.
Janero, David R
2014-08-01
Technology often serves as a handmaiden and catalyst of invention. The discovery of safe, effective medications depends critically upon experimental approaches capable of providing high-impact information on the biological effects of drug candidates early in the discovery pipeline. This information can enable reliable lead identification, pharmacological compound differentiation and successful translation of research output into clinically useful therapeutics. The shallow preclinical profiling of candidate compounds promulgates a minimalistic understanding of their biological effects and undermines the level of value creation necessary for finding quality leads worth moving forward within the development pipeline with efficiency and prognostic reliability sufficient to help remediate the current pharma-industry productivity drought. Three specific technologies discussed herein, in addition to experimental areas intimately associated with contemporary drug discovery, appear to hold particular promise for strengthening the preclinical valuation of drug candidates by deepening lead characterization. These are: i) hydrogen-deuterium exchange mass spectrometry for characterizing structural and ligand-interaction dynamics of disease-relevant proteins; ii) activity-based chemoproteomics for profiling the functional diversity of mammalian proteomes; and iii) nuclease-mediated precision gene editing for developing more translatable cellular and in vivo models of human diseases. When applied in an informed manner congruent with the clinical understanding of disease processes, technologies such as these that span levels of biological organization can serve as valuable enablers of drug discovery and potentially contribute to reducing the current, unacceptably high rates of compound clinical failure.
Bioenergy Knowledge Discovery Framework Fact Sheet
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
The Bioenergy Knowledge Discovery Framework (KDF) supports the development of a sustainable bioenergy industry by providing access to a variety of data sets, publications, and collaboration and mapping tools that support bioenergy research, analysis, and decision making. In the KDF, users can search for information, contribute data, and use the tools and map interface to synthesize, analyze, and visualize information in a spatially integrated manner.
Teachers' Journal Club: Bridging between the Dynamics of Biological Discoveries and Biology Teachers
ERIC Educational Resources Information Center
Brill, Gilat; Falk, Hedda; Yarden, Anat
2003-01-01
Since biology is one of the most dynamic research fields within the natural sciences, the gap between the accumulated knowledge in biology and the knowledge that is taught in schools, increases rapidly with time. Our long-term objective is to develop means to bridge between the dynamics of biological discoveries and the biology teachers and…
Discovering Mendeleev's Model.
ERIC Educational Resources Information Center
Sterling, Donna
1996-01-01
Presents an activity that introduces the historical developments in science that led to the discovery of the periodic table and lets students experience scientific discovery firsthand. Enables students to learn about patterns among the elements and experience how scientists analyze data to discover patterns and build models. (JRH)
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.
2013-01-01
The advent of high-throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for both purely statistical and expert knowledge-based approaches, and would benefit from improved integration of the two. Areas covered: In this review we present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We describe our recommendations for possible approaches to this problem, including metrics for the evaluation of biomarkers.
Computational functional genomics-based approaches in analgesic drug discovery and repurposing.
Lippmann, Catharina; Kringel, Dario; Ultsch, Alfred; Lötsch, Jörn
2018-06-01
Persistent pain is a major healthcare problem affecting a fifth of adults worldwide with still limited treatment options. The search for new analgesics increasingly includes the novel research area of functional genomics, which combines data derived from various processes related to DNA sequence, gene expression or protein function and uses advanced methods of data mining and knowledge discovery with the goal of understanding the relationship between the genome and the phenotype. Its use in drug discovery and repurposing for analgesic indications has so far been performed using knowledge discovery in gene function and drug target-related databases; next-generation sequencing; and functional proteomics-based approaches. Here, we discuss recent efforts in functional genomics-based approaches to analgesic drug discovery and repurposing and highlight the potential of computational functional genomics in this field including a demonstration of the workflow using a novel R library 'dbtORA'.
Buckler, Andrew J; Liu, Tiffany Ting; Savig, Erica; Suzek, Baris E; Ouellette, M; Danagoulian, J; Wernsing, G; Rubin, Daniel L; Paik, David
2013-08-01
A widening array of novel imaging biomarkers is being developed using ever more powerful clinical and preclinical imaging modalities. These biomarkers have demonstrated effectiveness in quantifying biological processes as they occur in vivo and in the early prediction of therapeutic outcomes. However, quantitative imaging biomarker data and knowledge are not standardized, representing a critical barrier to accumulating medical knowledge based on quantitative imaging data. We use an ontology to represent, integrate, and harmonize heterogeneous knowledge across the domain of imaging biomarkers. This advances the goal of developing applications to (1) improve precision and recall of storage and retrieval of quantitative imaging-related data using standardized terminology; (2) streamline the discovery and development of novel imaging biomarkers by normalizing knowledge across heterogeneous resources; (3) effectively annotate imaging experiments thus aiding comprehension, re-use, and reproducibility; and (4) provide validation frameworks through rigorous specification as a basis for testable hypotheses and compliance tests. We have developed the Quantitative Imaging Biomarker Ontology (QIBO), which currently consists of 488 terms spanning the following upper classes: experimental subject, biological intervention, imaging agent, imaging instrument, image post-processing algorithm, biological target, indicated biology, and biomarker application. We have demonstrated that QIBO can be used to annotate imaging experiments with standardized terms in the ontology and to generate hypotheses for novel imaging biomarker-disease associations. Our results established the utility of QIBO in enabling integrated analysis of quantitative imaging data.
HCV versus HIV drug discovery: Déjà vu all over again?
Watkins, William J; Desai, Manoj C
2013-04-15
Efforts to address HIV infection have been highly successful, enabling chronic suppression of viral replication with once-daily regimens. More recent research into HCV therapeutics has also resulted in very promising clinical candidates. This Digest explores similarities and differences between the two fields, compares the chronology of drug discovery relative to the availability of enabling tools, and concludes that safe and convenient once-daily regimens are likely to reach approval much more rapidly for HCV than was the case for HIV. Copyright © 2013 Elsevier Ltd. All rights reserved.
Discovery Mechanisms for the Sensor Web
Jirka, Simon; Bröring, Arne; Stasch, Christoph
2009-01-01
This paper addresses the discovery of sensors within the OGC Sensor Web Enablement framework. Whereas services like the OGC Web Map Service or Web Coverage Service are already well supported through catalogue services, discovery mechanisms for sensor networks remain a challenge. The focus of this article is on the use of existing OGC Sensor Web components for realizing a discovery solution. After discussing the requirements for a Sensor Web discovery mechanism, an approach is presented that was developed within the EU-funded project "OSIRIS". This solution offers mechanisms to search for sensors, exploit basic semantic relationships, harvest sensor metadata and integrate sensor discovery into already existing catalogues. PMID:22574038
Surgical data science: The new knowledge domain
Vedula, S. Swaroop; Hager, Gregory D.
2017-01-01
Healthcare in general, and surgery/interventional care in particular, is evolving through rapid advances in technology and increasing complexity of care with the goal of maximizing quality and value of care. While innovations in diagnostic and therapeutic technologies have driven past improvements in quality of surgical care, future transformation in care will be enabled by data. Conventional methodologies, such as registry studies, are limited in their scope for discovery and research, extent and complexity of data, breadth of analytic techniques, and translation or integration of research findings into patient care. We foresee the emergence of Surgical/Interventional Data Science (SDS) as a key element to addressing these limitations and creating a sustainable path toward evidence-based improvement of interventional healthcare pathways. SDS will create tools to measure, model and quantify the pathways or processes within the context of patient health states or outcomes, and use information gained to inform healthcare decisions, guidelines, best practices, policy, and training, thereby improving the safety and quality of healthcare and its value. Data is pervasive throughout the surgical care pathway; thus, SDS can impact various aspects of care including prevention, diagnosis, intervention, or post-operative recovery. Existing literature already provides preliminary results suggesting how a data science approach to surgical decision-making could more accurately predict severe complications using complex data from pre-, intra-, and post-operative contexts, how it could support intra-operative decision-making using both existing knowledge and continuous data streams throughout the surgical care pathway, and how it could enable effective collaboration between human care providers and intelligent technologies. In addition, SDS is poised to play a central role in surgical education, for example, through objective assessments, automated virtual coaching, and robot-assisted active learning of surgical skill. However, the potential for transforming surgical care and training through SDS may only be realized through a cultural shift that not only institutionalizes technology to seamlessly capture data but also assimilates individuals with expertise in data science into clinical research teams. Furthermore, collaboration with industry partners from the inception of the discovery process promotes optimal design of data products as well as their efficient translation and commercialization. As surgery continues to evolve through advances in technology that enhance delivery of care, SDS represents a new knowledge domain to engineer surgical care of the future. PMID:28936475
Knowledge Discovery from Posts in Online Health Communities Using Unified Medical Language System.
Chen, Donghua; Zhang, Runtong; Liu, Kecheng; Hou, Lei
2018-06-19
Patient-reported posts in Online Health Communities (OHCs) contain a wealth of valuable information that can help establish knowledge-based online support for patients. However, utilizing these reports to improve online patient services in the absence of appropriate medical and healthcare expert knowledge is difficult. Thus, we propose a comprehensive knowledge discovery method based on the Unified Medical Language System for the analysis of narrative posts in OHCs. First, we propose a domain-knowledge support framework for OHCs to provide a basis for post analysis. Second, we develop a Knowledge-Involved Topic Modeling (KI-TM) method to extract and expand explicit knowledge within the text. We propose four metrics, namely explicit knowledge rate, latent knowledge rate, knowledge correlation rate, and perplexity, for the evaluation of the KI-TM method. Our experimental results indicate that the proposed method outperforms existing methods in terms of providing knowledge support. Our method enhances knowledge support for online patients and can help develop intelligent OHCs in the future.
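For readers unfamiliar with topic modeling of patient posts, the following sketch runs a standard LDA model over a few toy posts and reports perplexity, one of the four evaluation metrics listed above. It does not implement the KI-TM method or its UMLS-based knowledge involvement; the example posts, the two-topic setting, and the use of scikit-learn are assumptions for illustration.

# Minimal sketch: standard LDA over toy patient posts, with perplexity as an
# evaluation metric. Posts and vocabulary are illustrative placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

posts = [
    "chest pain after taking the new medication",
    "headache and nausea since starting treatment",
    "medication helped with pain but caused nausea",
    "looking for advice on side effects of treatment",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(posts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-4:][::-1]]   # four strongest terms
    print(f"topic {k}: {top}")

print("perplexity:", lda.perplexity(X))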
Learning in the context of distribution drift
2017-05-09
published in the leading data mining journal, Data Mining and Knowledge Discovery (Webb et al., 2016). We have shown that the previous qualitative... Figure 7: Architecture for learning from streaming data in the context of variable or unknown drift (low-bias learner, aggregated classifier). ...Learning limited dependence Bayesian classifiers, in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD...
A Bioinformatic Approach to Inter Functional Interactions within Protein Sequences
2009-02-23
AFOSR/AOARD Reference Number: USAFAOGA07: FA4869-07-1-4050. AFOSR/AOARD Program Manager: Hiroshi Motoda, Ph.D. Period of... Conference on Knowledge Discovery and Data Mining.) In a separate study we have applied our approaches to the problem of whole genome alignment. We have... SIGKDD Conference on Knowledge Discovery and Data Mining.
Xiang, Yang; Lu, Kewei; James, Stephen L.; Borlawsky, Tara B.; Huang, Kun; Payne, Philip R.O.
2011-01-01
The Unified Medical Language System (UMLS) is the largest thesaurus in the biomedical informatics domain. Previous works have shown that knowledge constructs comprised of transitively-associated UMLS concepts are effective for discovering potentially novel biomedical hypotheses. However, the extremely large size of the UMLS becomes a major challenge for these applications. To address this problem, we designed a k-neighborhood Decentralization Labeling Scheme (kDLS) for the UMLS, and the corresponding method to effectively evaluate the kDLS indexing results. kDLS provides a comprehensive solution for indexing the UMLS for very efficient large scale knowledge discovery. We demonstrated that it is highly effective to use kDLS paths to prioritize disease-gene relations across the whole genome, with extremely high fold-enrichment values. To our knowledge, this is the first indexing scheme capable of supporting efficient large scale knowledge discovery on the UMLS as a whole. Our expectation is that kDLS will become a vital engine for retrieving information and generating hypotheses from the UMLS for future medical informatics applications. PMID:22154838
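The toy sketch below conveys the underlying idea of path-based prioritization over a concept graph: candidate genes are scored by the short transitive paths linking them to a disease concept. The graph fragment, the scoring function, and the use of networkx are illustrative assumptions; this is not the kDLS indexing scheme itself.

# Toy sketch of path-based disease-gene prioritization over a small concept
# graph. The graph is a made-up fragment, not the UMLS.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("Long QT syndrome", "cardiac arrhythmia"),
    ("cardiac arrhythmia", "potassium channel"),
    ("potassium channel", "KCNH2"),
    ("Long QT syndrome", "QT interval"),
    ("QT interval", "KCNH2"),
    ("cardiac arrhythmia", "SCN5A"),
    ("Long QT syndrome", "drug-induced toxicity"),
    ("drug-induced toxicity", "CYP3A4"),
])

def score(gene, disease="Long QT syndrome", k=3):
    """Sum 1/length over all simple paths with at most k edges between gene and disease."""
    paths = nx.all_simple_paths(G, source=disease, target=gene, cutoff=k)
    return sum(1.0 / (len(p) - 1) for p in paths)

for gene in ["KCNH2", "SCN5A", "CYP3A4"]:
    print(gene, round(score(gene), 2))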
Knowledge Discovery and Data Mining in Iran's Climatic Researches
NASA Astrophysics Data System (ADS)
Karimi, Mostafa
2013-04-01
Advances in measurement technology and data collection have made databases ever larger, and large databases require powerful tools for data analysis. The iterative process of acquiring knowledge from information obtained through data processing takes place in various forms across all scientific fields; however, when data volumes become large, traditional methods cannot cope with many of the resulting problems. In recent years, the use of databases in various scientific fields, especially atmospheric databases in climatology, has expanded. In addition, the growing amount of data generated by climate models poses a challenge for extracting hidden patterns and knowledge. The approach taken to this problem in recent years applies the knowledge discovery process and data mining techniques, drawing on concepts from machine learning, artificial intelligence, and expert systems. Data mining is an analytical process for mining massive volumes of data; its ultimate goal is access to information and, finally, knowledge. Climatology is a science that uses varied and massive data volumes, and the goal of climate data mining is to obtain information from varied and massive atmospheric and non-atmospheric data. Knowledge discovery performs these activities in a logical, predetermined, and largely automatic process. The goal of this research is to study the use of knowledge discovery and data mining techniques in Iranian climate research. To achieve this goal, a descriptive content analysis was carried out and the studies were classified by method and topic. The results show that in Iranian climatic research clustering methods, in particular k-means and Ward's method, are the most widely applied, and that precipitation and atmospheric circulation patterns are the most frequently addressed topics. Although several studies in geography and climate have applied statistical techniques such as clustering and pattern extraction, it cannot yet be said that data mining and knowledge discovery techniques are genuinely used in domestic climate studies. It is therefore necessary to adopt the KDD approach and data mining techniques in climatic studies, particularly for interpreting climate model results.
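The following minimal sketch applies the two clustering techniques most often reported in the surveyed studies, k-means and Ward's hierarchical method, to a small synthetic station-by-month precipitation matrix. The synthetic data and the two-cluster setting are assumptions for illustration; they do not reproduce any particular Iranian climate study.

# Minimal sketch: k-means and Ward's clustering of a synthetic
# station-by-month precipitation matrix (30 stations x 12 months, mm).
import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(42)
humid = rng.gamma(shape=6.0, scale=20.0, size=(15, 12))   # wetter regime
arid = rng.gamma(shape=2.0, scale=5.0, size=(15, 12))     # drier regime
precip = np.vstack([humid, arid])

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(precip)
ward_labels = fcluster(linkage(precip, method="ward"), t=2, criterion="maxclust")

print("k-means cluster sizes:", np.bincount(kmeans_labels))
print("Ward cluster sizes:   ", np.bincount(ward_labels)[1:])  # fcluster labels start at 1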
Knowledge Retrieval Solutions.
ERIC Educational Resources Information Center
Khan, Kamran
1998-01-01
Excalibur RetrievalWare offers true knowledge retrieval solutions. Its fundamental technologies, Adaptive Pattern Recognition Processing and Semantic Networks, have capabilities for knowledge discovery and knowledge management of full-text, structured and visual information. The software delivers a combination of accuracy, extensibility,…
Knowledge extraction from evolving spiking neural networks with rank order population coding.
Soltic, Snjezana; Kasabov, Nikola
2010-12-01
This paper demonstrates how knowledge can be extracted from evolving spiking neural networks with rank order population coding. Knowledge discovery is a very important feature of intelligent systems. Yet, a disproportionately small amount of research is centered on the issue of knowledge extraction from spiking neural networks, which are considered to be the third generation of artificial neural networks. The lack of knowledge representation compatibility is becoming a major detriment to end users of these networks. We show that high-level knowledge can be obtained from evolving spiking neural networks. More specifically, we propose a method for fuzzy rule extraction from an evolving spiking network with rank order population coding. The proposed method was used for knowledge discovery on two benchmark taste recognition problems, where the knowledge learnt by an evolving spiking neural network was extracted in the form of zero-order Takagi-Sugeno fuzzy IF-THEN rules.
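To show the form that extracted zero-order Takagi-Sugeno rules can take, the sketch below turns two class prototypes from a toy taste problem into IF-THEN rules with Gaussian membership functions and evaluates them on a sample. The prototypes, the membership width, and the winner-take-all inference are assumptions for illustration; the sketch does not implement the evolving spiking network or rank order population coding.

# Illustrative sketch of zero-order Takagi-Sugeno fuzzy rules of the form
# "IF x is close to prototype_k THEN class = c_k". Toy data only.
import numpy as np

# toy "taste" features (e.g., sweetness, bitterness) with class prototypes
prototypes = {"sweet": np.array([0.8, 0.2]), "bitter": np.array([0.2, 0.9])}
sigma = 0.3  # width of the Gaussian membership functions

def rule_firing(x):
    """Degree to which each zero-order TS rule fires for input x."""
    return {label: float(np.exp(-np.sum((x - p) ** 2) / (2 * sigma ** 2)))
            for label, p in prototypes.items()}

def infer(x):
    firing = rule_firing(x)
    return max(firing, key=firing.get), firing

sample = np.array([0.7, 0.3])
label, firing = infer(sample)
print("fired rules:", {k: round(v, 3) for k, v in firing.items()})
print("zero-order TS output (winning consequent):", label)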
Comparison: Discovery on WSMOLX and miAamics/jABC
NASA Astrophysics Data System (ADS)
Kubczak, Christian; Vitvar, Tomas; Winkler, Christian; Zaharia, Raluca; Zaremba, Maciej
This chapter compares the solutions to the SWS-Challenge discovery problems provided by DERI Galway and the joint solution from the Technical University of Dortmund and the University of Potsdam. The two approaches are described in depth in Chapters 10 and 13. The discovery scenario raises problems associated with making service discovery an automated process. It requires fine-grained specifications of search requests and service functionality, including support for fetching dynamic information during the discovery process (e.g., shipment price). Both teams utilize semantics to describe services, service requests and data models in order to enable search at the required fine-grained level of detail.
mHealth Visual Discovery Dashboard.
Fang, Dezhi; Hohman, Fred; Polack, Peter; Sarker, Hillol; Kahng, Minsuk; Sharmin, Moushumi; al'Absi, Mustafa; Chau, Duen Horng
2017-09-01
We present Discovery Dashboard, a visual analytics system for exploring large volumes of time series data from mobile medical field studies. Discovery Dashboard offers interactive exploration tools and a data mining motif discovery algorithm to help researchers formulate hypotheses, discover trends and patterns, and ultimately gain a deeper understanding of their data. Discovery Dashboard emphasizes user freedom and flexibility during the data exploration process and enables researchers to do things that were previously challenging or impossible, in the web browser and in real time. We demonstrate our system visualizing data from a mobile sensor study conducted at the University of Minnesota that included 52 participants who were trying to quit smoking.
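A minimal example of the kind of motif discovery such a dashboard can expose: the brute-force sketch below plants the same pattern twice in a synthetic signal and then finds the pair of fixed-length subsequences with the smallest z-normalized Euclidean distance. The synthetic signal, the window length, and the quadratic search are assumptions for illustration; production systems rely on much faster algorithms.

# Minimal sketch of time series motif discovery: find the closest pair of
# non-overlapping subsequences under z-normalized Euclidean distance.
import numpy as np

def znorm(x):
    s = x.std()
    return (x - x.mean()) / s if s > 0 else x - x.mean()

def find_motif(series, window):
    n = len(series) - window + 1
    subs = [znorm(series[i:i + window]) for i in range(n)]
    best = (np.inf, None, None)
    for i in range(n):
        for j in range(i + window, n):        # skip trivially overlapping matches
            d = np.linalg.norm(subs[i] - subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best

rng = np.random.default_rng(1)
signal = rng.normal(size=300)
pattern = np.sin(np.linspace(0, 2 * np.pi, 25))
signal[40:65] += 3 * pattern                  # plant the same motif twice
signal[200:225] += 3 * pattern

dist, i, j = find_motif(signal, window=25)
print(f"best motif pair starts at {i} and {j} (distance {dist:.2f})")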
Flood AI: An Intelligent Systems for Discovery and Communication of Disaster Knowledge
NASA Astrophysics Data System (ADS)
Demir, I.; Sermet, M. Y.
2017-12-01
Communities are not immune from extreme events or natural disasters that can lead to large-scale consequences for the nation and the public. Improving resilience to better prepare for, plan for, recover from, and adapt to disasters is critical to reducing the impacts of extreme events. The National Research Council (NRC) report on increasing resilience to extreme events articulates a vision of a resilient nation in the year 2030. The report highlights the importance of data and information, identifies gaps and knowledge challenges that need to be addressed, and recommends that every individual have access to risk and vulnerability information to make their communities more resilient. This project presents Flood AI, an intelligent system for flooding that aims to improve societal preparedness by providing a knowledge engine that uses voice recognition, artificial intelligence, and natural language processing, built on a generalized ontology for disasters with a primary focus on flooding. The knowledge engine utilizes the flood ontology and concepts to connect user input to relevant knowledge discovery channels on flooding, drawing on a data acquisition and processing framework that incorporates environmental observations, forecast models, and knowledge bases. Communication channels of the framework include web-based systems, agent-based chat bots, smartphone applications, automated web workflows, and smart home devices, opening flood knowledge discovery to many unique use cases.
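As a small illustration of connecting user input to knowledge discovery channels, the sketch below routes a natural-language flood question to a channel by matching keywords against a tiny ontology-like intent table. The intents, keywords, and fallback channel are illustrative assumptions and are not part of the Flood AI system.

# Toy sketch: keyword-based routing of flood questions to knowledge channels.
import re

INTENTS = {
    "flood_forecast": {"forecast", "tomorrow", "expected", "rain"},
    "river_level":    {"level", "stage", "gauge", "river"},
    "preparedness":   {"prepare", "sandbags", "evacuate", "safety"},
}

def route(question):
    words = set(re.findall(r"[a-z]+", question.lower()))
    scores = {intent: len(words & keywords) for intent, keywords in INTENTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general_flood_info"

print(route("What is the river level at the downtown gauge?"))       # river_level
print(route("How should I prepare for the rain expected tomorrow?")) # flood_forecast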
Duncan, Dean F; Kum, Hye-Chung; Weigensberg, Elizabeth Caplick; Flair, Kimberly A; Stewart, C Joy
2008-11-01
Proper management and implementation of an effective child welfare agency requires the constant use of information about the experiences and outcomes of children involved in the system, emphasizing the need for comprehensive, timely, and accurate data. In the past 20 years, there have been many advances in technology that can maximize the potential of administrative data to promote better evaluation and management in the field of child welfare. Specifically, this article discusses the use of knowledge discovery and data mining (KDD), which makes it possible to create longitudinal data files from administrative data sources, extract valuable knowledge, and make the information available via a user-friendly public Web site. This article demonstrates a successful project in North Carolina where knowledge discovery and data mining technology was used to develop a comprehensive set of child welfare outcomes available through a public Web site to facilitate information sharing of child welfare data to improve policy and practice.
Quantitative proteomics in cardiovascular research: global and targeted strategies
Shen, Xiaomeng; Young, Rebeccah; Canty, John M.; Qu, Jun
2014-01-01
Extensive technical advances in the past decade have substantially expanded quantitative proteomics in cardiovascular research. These advances hold great promise for elucidating the mechanisms of cardiovascular disease (CVD) and for discovering cardiac biomarkers for diagnosis and treatment evaluation. Global and targeted proteomics are the two major avenues of quantitative proteomics. While global approaches enable unbiased discovery of altered proteins via relative quantification at the proteome level, targeted techniques provide higher sensitivity and accuracy, and are capable of multiplexed absolute quantification in numerous clinical/biological samples. While promising, technical challenges need to be overcome to enable full utilization of these techniques in cardiovascular medicine. Here we discuss recent advances in quantitative proteomics and summarize applications in cardiovascular research with an emphasis on biomarker discovery and elucidating molecular mechanisms of disease. We propose the integration of global and targeted strategies as a high-throughput pipeline for cardiovascular proteomics. Targeted approaches enable rapid, extensive validation of biomarker candidates discovered by global proteomics. These approaches provide a promising alternative to immunoassays and other low-throughput means currently used for limited validation. PMID:24920501
Understanding drug targets: no such thing as bad news.
Roberts, Ruth A
2018-05-24
How can small-to-medium pharma and biotech companies enhance the chances of running a successful drug project and maximise the return on a limited number of assets? Having a full appreciation of the safety risks associated with proposed drug targets is a crucial element in understanding the unwanted side-effects that might stop a project in its tracks. Having this information is necessary to complement knowledge about the probable efficacy of a future drug. However, the lack of data-rich insight into drug-target safety is one of the major causes of drug-project failure today. Conducting comprehensive target-safety reviews early in the drug discovery process enables project teams to make the right decisions about which drug targets to take forward. Copyright © 2018 Elsevier Ltd. All rights reserved.
The Suess-Urey mission (return of solar matter to Earth).
Rapp, D; Naderi, F; Neugebauer, M; Sevilla, D; Sweetnam, D; Burnett, D; Wiens, R; Smith, N; Clark, B; McComas, D; Stansbery, E
1996-01-01
The Suess-Urey (S-U) mission has been proposed as a NASA Discovery mission to return samples of matter from the Sun to the Earth for isotopic and chemical analyses in terrestrial laboratories, providing a major improvement in our knowledge of the average chemical and isotopic composition of the solar system. The S-U spacecraft and sample return capsule will be placed in a halo orbit around the L1 Sun-Earth libration point for two years to collect solar wind ions, which implant into large passive collectors made of ultra-pure materials. The constant spacecraft-Sun-Earth geometry enables simple spin-stabilized attitude control, simple passive thermal control, and a fixed medium-gain antenna. Low data requirements and the safety of a Sun-pointed spinner result in extremely low mission operations costs.
Evolution and Distribution of Saxitoxin Biosynthesis in Dinoflagellates
Orr, Russell J. S.; Stüken, Anke; Murray, Shauna A.; Jakobsen, Kjetill S.
2013-01-01
Numerous species of marine dinoflagellates synthesize the potent environmental neurotoxic alkaloid saxitoxin, the agent of the human illness paralytic shellfish poisoning. In addition, certain freshwater species of cyanobacteria also synthesize the same toxic compound, with the biosynthetic pathway and genes responsible being recently reported. Three theories have been postulated to explain the origin of saxitoxin in dinoflagellates: production of saxitoxin by co-cultured bacteria rather than by the dinoflagellates themselves; convergent evolution within both dinoflagellates and bacteria; and horizontal gene transfer between dinoflagellates and bacteria. The discovery of cyanobacterial saxitoxin homologs in dinoflagellates has enabled us for the first time to evaluate these theories. Here, we review the distribution of saxitoxin within the dinoflagellates and our knowledge of its genetic basis to determine the likely evolutionary origins of this potent neurotoxin. PMID:23966031
Evolution and distribution of saxitoxin biosynthesis in dinoflagellates.
Orr, Russell J S; Stüken, Anke; Murray, Shauna A; Jakobsen, Kjetill S
2013-08-08
Numerous species of marine dinoflagellates synthesize the potent environmental neurotoxic alkaloid saxitoxin, the agent of the human illness paralytic shellfish poisoning. In addition, certain freshwater species of cyanobacteria also synthesize the same toxic compound, with the biosynthetic pathway and genes responsible being recently reported. Three theories have been postulated to explain the origin of saxitoxin in dinoflagellates: production of saxitoxin by co-cultured bacteria rather than by the dinoflagellates themselves; convergent evolution within both dinoflagellates and bacteria; and horizontal gene transfer between dinoflagellates and bacteria. The discovery of cyanobacterial saxitoxin homologs in dinoflagellates has enabled us for the first time to evaluate these theories. Here, we review the distribution of saxitoxin within the dinoflagellates and our knowledge of its genetic basis to determine the likely evolutionary origins of this potent neurotoxin.
Universal fragment descriptors for predicting properties of inorganic crystals
NASA Astrophysics Data System (ADS)
Isayev, Olexandr; Oses, Corey; Toher, Cormac; Gossett, Eric; Curtarolo, Stefano; Tropsha, Alexander
2017-06-01
Although historically materials discovery has been driven by a laborious trial-and-error process, knowledge-driven materials design can now be enabled by the rational combination of Machine Learning methods and materials databases. Here, data from the AFLOW repository for ab initio calculations is combined with Quantitative Materials Structure-Property Relationship models to predict important properties: metal/insulator classification, band gap energy, bulk/shear moduli, Debye temperature and heat capacities. The prediction's accuracy compares well with the quality of the training data for virtually any stoichiometric inorganic crystalline material, reciprocating the available thermomechanical experimental data. The universality of the approach is attributed to the construction of the descriptors: Property-Labelled Materials Fragments. The representations require only minimal structural input allowing straightforward implementations of simple heuristic design rules.
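The abstract describes Quantitative Materials Structure-Property Relationship models trained on fragment-based descriptors derived from AFLOW data; the construction of Property-Labelled Materials Fragments is not reproduced here. The sketch below only illustrates the general workflow under simplified assumptions: synthetic fragment-count vectors stand in for the real descriptors, and a gradient-boosted tree regressor stands in for the published models.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: each row is a material described by counts of
# hypothetical structural fragments; the target is a band gap in eV.
rng = np.random.default_rng(42)
n_materials, n_fragments = 500, 40
X = rng.poisson(lam=2.0, size=(n_materials, n_fragments)).astype(float)
true_w = rng.normal(0, 0.15, n_fragments)
y = np.clip(X @ true_w + rng.normal(0, 0.2, n_materials), 0, None)  # band gaps are non-negative

# Fragment descriptors -> property model; cross-validation estimates accuracy.
model = GradientBoostingRegressor(n_estimators=300, max_depth=3, learning_rate=0.05)
scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
print(f"5-fold MAE: {-scores.mean():.3f} eV (synthetic data)")
```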
Paediatric genomics: diagnosing rare disease in children.
Wright, Caroline F; FitzPatrick, David R; Firth, Helen V
2018-05-01
The majority of rare diseases affect children, most of whom have an underlying genetic cause for their condition. However, making a molecular diagnosis with current technologies and knowledge is often still a challenge. Paediatric genomics is an immature but rapidly evolving field that tackles this issue by incorporating next-generation sequencing technologies, especially whole-exome sequencing and whole-genome sequencing, into research and clinical workflows. This complex multidisciplinary approach, coupled with the increasing availability of population genetic variation data, has already resulted in an increased discovery rate of causative genes and in improved diagnosis of rare paediatric disease. Importantly, for affected families, a better understanding of the genetic basis of rare disease translates to more accurate prognosis, management, surveillance and genetic advice; stimulates research into new therapies; and enables provision of better support.
Improved Access to NSF Funded Ocean Research Data
NASA Astrophysics Data System (ADS)
Chandler, C. L.; Groman, R. C.; Kinkade, D.; Shepherd, A.; Rauch, S.; Allison, M. D.; Gegg, S. R.; Wiebe, P. H.; Glover, D. M.
2015-12-01
Data from NSF-funded, hypothesis-driven research comprise an essential part of the research results upon which we base our knowledge and improved understanding of the impacts of climate change. Initially funded in 2006, the Biological and Chemical Oceanography Data Management Office (BCO-DMO) works with marine scientists to ensure that data from NSF-funded ocean research programs are fully documented and freely available for future use. BCO-DMO works in partnership with information technology professionals, other marine data repositories and national data archive centers to ensure long-term preservation of these valuable environmental research data. Data contributed to BCO-DMO by the original investigators are enhanced with sufficient discipline-specific documentation and published in a variety of standards-compliant forms designed to enable discovery and support accurate re-use.
Parker, Michael T.
2016-01-01
Recent advances in sequencing technologies have opened the door for the classification of the human virome. While taxonomic classification can be applied to the viruses identified in such studies, this gives no information as to the type of interaction the virus has with the host. As follow-up studies are performed to address these questions, the description of these virus-host interactions would be greatly enriched by applying a standard set of definitions that typify them. This paper describes a framework with which all members of the human virome can be classified based on principles of ecology. The scaffold not only enables categorization of the human virome, but can also inform research aimed at identifying novel virus-host interactions. PMID:27698618
Rough Set Theory based prognostication of life expectancy for terminally ill patients.
Gil-Herrera, Eleazar; Yalcin, Ali; Tsalatsanis, Athanasios; Barnes, Laura E; Djulbegovic, Benjamin
2011-01-01
We present a novel knowledge discovery methodology that relies on Rough Set Theory to predict the life expectancy of terminally ill patients in an effort to improve the hospice referral process. Life expectancy prognostication is particularly valuable for terminally ill patients since it enables them and their families to initiate end-of-life discussions and choose the most desired management strategy for the remainder of their lives. We utilize retrospective data from 9105 patients to demonstrate the design and implementation details of a series of classifiers developed to identify potential hospice candidates. Preliminary results confirm the efficacy of the proposed methodology. We envision our work as a part of a comprehensive decision support system designed to assist terminally ill patients in making end-of-life care decisions.
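For readers unfamiliar with Rough Set Theory, the sketch below shows its central construction, the lower and upper approximations of a decision class under an indiscernibility relation over condition attributes, applied to an invented toy dataset; it is not the classifier design used in the study.

```python
from collections import defaultdict

def indiscernibility_classes(records, attrs):
    """Group record indices that are indistinguishable on the chosen condition attributes."""
    groups = defaultdict(set)
    for idx, rec in enumerate(records):
        groups[tuple(rec[a] for a in attrs)].add(idx)
    return list(groups.values())

def approximations(records, attrs, target_ids):
    """Rough-set lower/upper approximation of the set `target_ids`
    (e.g. patients who died within 6 months) w.r.t. condition attributes."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(records, attrs):
        if block <= target_ids:
            lower |= block            # block certainly inside the concept
        if block & target_ids:
            upper |= block            # block possibly inside the concept
    return lower, upper

if __name__ == "__main__":
    # Toy, invented patient records: condition attributes plus an outcome.
    patients = [
        {"mobility": "low",  "appetite": "poor", "died_6mo": True},
        {"mobility": "low",  "appetite": "poor", "died_6mo": True},
        {"mobility": "low",  "appetite": "good", "died_6mo": False},
        {"mobility": "high", "appetite": "good", "died_6mo": False},
        {"mobility": "low",  "appetite": "good", "died_6mo": True},
    ]
    target = {i for i, p in enumerate(patients) if p["died_6mo"]}
    low, up = approximations(patients, ["mobility", "appetite"], target)
    print("lower approximation:", sorted(low))   # certain hospice candidates
    print("upper approximation:", sorted(up))    # possible hospice candidates
```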
Universal fragment descriptors for predicting properties of inorganic crystals.
Isayev, Olexandr; Oses, Corey; Toher, Cormac; Gossett, Eric; Curtarolo, Stefano; Tropsha, Alexander
2017-06-05
Although historically materials discovery has been driven by a laborious trial-and-error process, knowledge-driven materials design can now be enabled by the rational combination of Machine Learning methods and materials databases. Here, data from the AFLOW repository for ab initio calculations is combined with Quantitative Materials Structure-Property Relationship models to predict important properties: metal/insulator classification, band gap energy, bulk/shear moduli, Debye temperature and heat capacities. The prediction's accuracy compares well with the quality of the training data for virtually any stoichiometric inorganic crystalline material, reciprocating the available thermomechanical experimental data. The universality of the approach is attributed to the construction of the descriptors: Property-Labelled Materials Fragments. The representations require only minimal structural input allowing straightforward implementations of simple heuristic design rules.
Rehm, Markus; Prehn, Jochen H M
2013-06-01
Systems biology and systems medicine, i.e. the application of systems biology in a clinical context, are becoming increasingly important in biology, drug discovery and health care. Systems biology incorporates knowledge and methods that are applied in mathematics, physics and engineering, but may not be part of classical training in biology. We here provide an introduction to basic concepts and methods relevant to the construction and application of systems models for apoptosis research. We present the key methods relevant to the representation of biochemical processes in signal transduction models, with particular reference to apoptotic processes. We demonstrate how such models enable a quantitative and temporal analysis of changes in molecular entities in response to an apoptosis-inducing stimulus, and provide information on cell survival and cell death decisions. We introduce methods for analyzing the spatial propagation of cell death signals, and discuss the concepts of sensitivity analysis that enable prediction of network responses to disturbances of single or multiple parameters. Copyright © 2013 Elsevier Inc. All rights reserved.
Salehi, Ali; Jimenez-Berni, Jose; Deery, David M; Palmer, Doug; Holland, Edward; Rozas-Larraondo, Pablo; Chapman, Scott C; Georgakopoulos, Dimitrios; Furbank, Robert T
2015-01-01
To our knowledge, there is no software or database solution that supports large volumes of biological time series sensor data efficiently and enables data visualization and analysis in real time. Existing solutions for managing data typically use unstructured file systems or relational databases. These systems are not designed to provide instantaneous response to user queries. Furthermore, they do not support rapid data analysis and visualization to enable interactive experiments. In large scale experiments, this behaviour slows research discovery, discourages the widespread sharing and reuse of data that could otherwise inform critical decisions in a timely manner and encourage effective collaboration between groups. In this paper we present SensorDB, a web based virtual laboratory that can manage large volumes of biological time series sensor data while supporting rapid data queries and real-time user interaction. SensorDB is sensor agnostic and uses web-based, state-of-the-art cloud and storage technologies to efficiently gather, analyse and visualize data. Collaboration and data sharing between different agencies and groups is thereby facilitated. SensorDB is available online at http://sensordb.csiro.au.
Augmented reality enabling intelligence exploitation at the edge
NASA Astrophysics Data System (ADS)
Kase, Sue E.; Roy, Heather; Bowman, Elizabeth K.; Patton, Debra
2015-05-01
Today's Warfighters need to make quick decisions while interacting in densely populated environments comprised of friendly, hostile, and neutral host nation locals. However, there is a gap in the real-time processing of big data streams for edge intelligence. We introduce a big data processing pipeline called ARTEA that ingests, monitors, and performs a variety of analytics including noise reduction, pattern identification, and trend and event detection in the context of an area of operations (AOR). Results of the analytics are presented to the Soldier via an augmented reality (AR) device Google Glass (Glass). Non-intrusive AR devices such as Glass can visually communicate contextually relevant alerts to the Soldier based on the current mission objectives, time, location, and observed or sensed activities. This real-time processing and AR presentation approach to knowledge discovery flattens the intelligence hierarchy enabling the edge Soldier to act as a vital and active participant in the analysis process. We report preliminary observations testing ARTEA and Glass in a document exploitation and person of interest scenario simulating edge Soldier participation in the intelligence process in disconnected deployment conditions.
DATS, the data tag suite to enable discoverability of datasets.
Sansone, Susanna-Assunta; Gonzalez-Beltran, Alejandra; Rocca-Serra, Philippe; Alter, George; Grethe, Jeffrey S; Xu, Hua; Fore, Ian M; Lyle, Jared; Gururaj, Anupama E; Chen, Xiaoling; Kim, Hyeon-Eui; Zong, Nansu; Li, Yueling; Liu, Ruiling; Ozyurt, I Burak; Ohno-Machado, Lucila
2017-06-06
Today's science increasingly requires effective ways to find and access existing datasets that are distributed across a range of repositories. For researchers in the life sciences, discoverability of datasets may soon become as essential as identifying the latest publications via PubMed. Through an international collaborative effort funded by the National Institutes of Health (NIH)'s Big Data to Knowledge (BD2K) initiative, we have designed and implemented the DAta Tag Suite (DATS) model to support the DataMed data discovery index. DataMed's goal is to be for data what PubMed has been for the scientific literature. Akin to the Journal Article Tag Suite (JATS) used in PubMed, the DATS model enables submission of metadata on datasets to DataMed. DATS has a core set of elements, which are generic and applicable to any type of dataset, and an extended set that can accommodate more specialized data types. DATS is a platform-independent model also available as an annotated serialization in schema.org, which in turn is widely used by major search engines like Google, Microsoft, Yahoo and Yandex.
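The DATS specification itself is not reproduced here; as a rough illustration of what an annotated serialization in schema.org looks like, the sketch below builds a simplified dataset record using generic schema.org Dataset properties. The field values are placeholders and the property set is illustrative, not the normative DATS element list.

```python
import json

# Illustrative dataset record in schema.org-style JSON-LD; the fields shown are
# generic schema.org Dataset properties, not the normative DATS elements.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example plasma proteomics cohort",           # hypothetical dataset
    "description": "De-identified proteomic profiles from a hypothetical study.",
    "identifier": "doi:10.0000/example",                   # placeholder identifier
    "keywords": ["proteomics", "biomarker", "cohort"],
    "creator": {"@type": "Organization", "name": "Example Research Consortium"},
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.org/data/cohort.csv",
    },
}

print(json.dumps(record, indent=2))   # payload a discovery index could ingest
```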
Computational tools for comparative phenomics; the role and promise of ontologies
Gkoutos, Georgios V.; Schofield, Paul N.; Hoehndorf, Robert
2012-01-01
A major aim of the biological sciences is to gain an understanding of human physiology and disease. One important step towards this goal is the discovery of gene functions, which leads to a better understanding of the physiology and pathophysiology of organisms and ultimately to improved diagnosis and therapy. Our increasing ability to phenotypically characterise genetic variants of model organisms, coupled with systematic and hypothesis-driven mutagenesis, is resulting in a wealth of information that could potentially provide insight into the functions of all genes in an organism. The challenge we are now facing is to develop computational methods that can integrate and analyse such data. The introduction of formal ontologies that make their semantics explicit and accessible to automated reasoning promises the tantalizing possibility of standardizing biomedical knowledge, allowing for novel, powerful queries that bridge multiple domains, disciplines, species and levels of granularity. We review recent computational approaches that facilitate the integration of experimental data from model organisms with clinical observations in humans. These methods foster novel cross-species analysis approaches, thereby enabling comparative phenomics and leading to the potential of translating basic discoveries from model systems into diagnostic and therapeutic advances at the clinical level. PMID:22814867
Das, Mohua; Tianming, Yang; Jinghua, Dong; Prasetya, Fransisca; Yiming, Xie; Wong, Kendra; Cheong, Adeline; Woon, Esther C Y
2018-06-19
Dynamic combinatorial chemistry (DCC) is a powerful supramolecular approach for discovering ligands for biomolecules. To date, most, if not all, biologically-templated DCC employ only a single biomolecule in directing the self-assembly process. To expand the scope and potential of DCC, herein, we developed a novel multi-protein DCC strategy which combines the discriminatory power of zwitterionic 'thermal-tag' with the sensitivity of differential scanning fluorimetry. This strategy enables the discovery of ligands against several proteins of interest concurrently. It is remarkably sensitive and could differentiate the binding of ligands to structurally-similar subfamily members, which is extremely challenging to achieve. Through this approach, we were able to simultaneously identify subfamily-selective probes against two clinically important epigenetic enzymes, FTO (7; IC₅₀ = 2.6 µM) and ALKBH3 (8; IC₅₀ = 3.7 µM). To our knowledge, this is the first report of a subfamily-selective ALKBH3 inhibitor. The developed strategy could, in principle, be adapted to a broad range of proteins, thus it shall be of widespread scientific interest. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quantitative mass spectrometry: an overview
NASA Astrophysics Data System (ADS)
Urban, Pawel L.
2016-10-01
Mass spectrometry (MS) is a mainstream chemical analysis technique in the twenty-first century. It has contributed to numerous discoveries in chemistry, physics and biochemistry. Hundreds of research laboratories scattered all over the world use MS every day to investigate fundamental phenomena on the molecular level. MS is also widely used by industry, especially in drug discovery, quality control and food safety protocols. In some cases, mass spectrometers are indispensable and irreplaceable by any other metrological tools. The uniqueness of MS is due to the fact that it enables direct identification of molecules based on the mass-to-charge ratios as well as fragmentation patterns. Thus, for several decades now, MS has been used in qualitative chemical analysis. To address the pressing need for quantitative molecular measurements, a number of laboratories focused on technological and methodological improvements that could render MS a fully quantitative metrological platform. In this theme issue, the experts working for some of those laboratories share their knowledge and enthusiasm about quantitative MS. I hope this theme issue will benefit readers, and foster fundamental and applied research based on quantitative MS measurements. This article is part of the themed issue 'Quantitative mass spectrometry'.
A semantic web ontology for small molecules and their biological targets.
Choi, Jooyoung; Davis, Melissa J; Newman, Andrew F; Ragan, Mark A
2010-05-24
A wide range of data on sequences, structures, pathways, and networks of genes and gene products is available for hypothesis testing and discovery in biological and biomedical research. However, data describing the physical, chemical, and biological properties of small molecules have not been well-integrated with these resources. Semantically rich representations of chemical data, combined with Semantic Web technologies, have the potential to enable the integration of small molecule and biomolecular data resources, expanding the scope and power of biomedical and pharmacological research. We employed the Semantic Web technologies Resource Description Framework (RDF) and Web Ontology Language (OWL) to generate a Small Molecule Ontology (SMO) that represents concepts and provides unique identifiers for biologically relevant properties of small molecules and their interactions with biomolecules, such as proteins. We instantiated SMO using data from three public data sources (DrugBank, PubChem and UniProt) and converted them to RDF triples. Evaluation of SMO by use of predetermined competency questions implemented as SPARQL queries demonstrated that data from chemical and biomolecular data sources were effectively represented and that useful knowledge can be extracted. These results illustrate the potential of Semantic Web technologies in chemical, biological, and pharmacological research and in drug discovery.
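A minimal sketch of the RDF/SPARQL pattern the abstract describes, using the rdflib library: a few hypothetical triples linking a small molecule to a protein target, followed by a competency-style query. The namespace, class names, and properties are invented for illustration and are not the actual SMO vocabulary.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/smo/")   # hypothetical namespace, not the real SMO IRI

g = Graph()
g.bind("ex", EX)

# Hypothetical triples: a small molecule, its target protein, and a property value.
g.add((EX.aspirin, RDF.type, EX.SmallMolecule))
g.add((EX.PTGS2,   RDF.type, EX.Protein))
g.add((EX.aspirin, EX.interactsWith, EX.PTGS2))
g.add((EX.aspirin, EX.molecularWeight, Literal(180.16)))

# Competency-style query: which proteins does each small molecule interact with?
q = """
PREFIX ex: <http://example.org/smo/>
SELECT ?mol ?target WHERE {
    ?mol a ex:SmallMolecule ;
         ex:interactsWith ?target .
    ?target a ex:Protein .
}
"""
for mol, target in g.query(q):
    print(mol, "->", target)
```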
Magnetars: Challenging the Extremes of Nature
NASA Technical Reports Server (NTRS)
Kouveliotou, Chryssa
2011-01-01
Magnetars are magnetically powered rotating neutron stars with extreme magnetic fields (over 10^14 Gauss). They are discovered in, and emit predominantly in, the X-ray and gamma-ray bands. Very few sources (roughly 15) have been found since their discovery in 1987. NASA's Fermi Observatory was launched June 11, 2008; the Fermi Gamma Ray Burst Monitor (GBM) began normal operations on July 14, about a month after launch, when the trigger algorithms were enabled. Since then, we have recorded emission from four magnetar sources; of these, only one was an old magnetar: SGR 1806-20. The other three detections comprised two brand-new sources, SGR J0501+4516 (discovered with Swift and extensively monitored with both Swift and GBM) and SGR J0418+5729 (discovered with GBM and the Interplanetary Network, IPN), plus SGR J1550-5418, a source originally classified as an Anomalous X-ray Pulsar (AXP 1E1547.0-5408). In my talk I will give a short history of the discovery of magnetars and describe how this once relatively esoteric field has emerged as a link between several astrophysical areas, including Gamma-Ray Bursts. Finally, I will describe the properties of these sources and the current status of our knowledge of the magnetar population and birth rate.
NASA Astrophysics Data System (ADS)
Furfaro, R.; Linares, R.; Gaylor, D.; Jah, M.; Walls, R.
2016-09-01
In this paper, we present an end-to-end approach that employs machine learning techniques and Ontology-based Bayesian Networks (BN) to characterize the behavior of resident space objects. State-of-the-Art machine learning architectures (e.g. Extreme Learning Machines, Convolutional Deep Networks) are trained on physical models to learn the Resident Space Object (RSO) features in the vectorized energy and momentum states and parameters. The mapping from measurements to vectorized energy and momentum states and parameters enables behavior characterization via clustering in the features space and subsequent RSO classification. Additionally, Space Object Behavioral Ontologies (SOBO) are employed to define and capture the domain knowledge-base (KB) and BNs are constructed from the SOBO in a semi-automatic fashion to execute probabilistic reasoning over conclusions drawn from trained classifiers and/or directly from processed data. Such an approach enables integrating machine learning classifiers and probabilistic reasoning to support higher-level decision making for space domain awareness applications. The innovation here is to use these methods (which have enjoyed great success in other domains) in synergy so that it enables a "from data to discovery" paradigm by facilitating the linkage and fusion of large and disparate sources of information via a Big Data Science and Analytics framework.
Text mining resources for the life sciences.
Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia
2016-01-01
Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable: those that have the crucial ability to share information, enabling smooth integration and reusability. © The Author(s) 2016. Published by Oxford University Press.
Text mining resources for the life sciences
Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia
2016-01-01
Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231
Exploring the Possibilities: Earth and Space Science Missions in the Context of Exploration
NASA Technical Reports Server (NTRS)
Pfarr, Barbara; Calabrese, Michael; Kirkpatrick, James; Malay, Jonathan T.
2006-01-01
According to Dr. Edward J. Weiler, Director of the Goddard Space Flight Center, "Exploration without science is tourism". At the American Astronautical Society's 43rd Annual Robert H. Goddard Memorial Symposium it was quite apparent to all that NASA's current Exploration Initiative is tightly coupled to multiple scientific initiatives: exploration will enable new science and science will enable exploration. NASA's Science Mission Directorate plans to develop priority science missions that deliver science that is vital, compelling and urgent. This paper will discuss the theme of the Goddard Memorial Symposium that science plays a key role in exploration. It will summarize the key scientific questions and some of the space and Earth science missions proposed to answer them, including the Mars and Lunar Exploration Programs, the Beyond Einstein and Navigator Programs, and the Earth-Sun System missions. It will also discuss some of the key technologies that will enable these missions, including the latest in instruments and sensors, large space optical system technologies and optical communications, and briefly discuss developments and achievements since the Symposium. Throughout history, humans have made the biggest scientific discoveries by visiting unknown territories; by going to the Moon and other planets and by seeking out habitable worlds, NASA is continuing humanity's quest for scientific knowledge.
An Expert System toward Building an Earth Science Knowledge Graph
NASA Astrophysics Data System (ADS)
Zhang, J.; Duan, X.; Ramachandran, R.; Lee, T. J.; Bao, Q.; Gatlin, P. N.; Maskey, M.
2017-12-01
In this ongoing work, we aim to build foundations of Cognitive Computing for Earth Science research. The goal of our project is to develop an end-to-end automated methodology for incrementally constructing Knowledge Graphs for Earth Science (KG4ES). These knowledge graphs can then serve as the foundational components for building cognitive systems in Earth science, enabling researchers to uncover new patterns and hypotheses that are virtually impossible to identify today. In addition, this research focuses on developing the mining algorithms needed to exploit these constructed knowledge graphs. As such, these graphs will free knowledge from publications that are generated in a very linear, deterministic manner, and structure knowledge in a way that users can both interact and connect with relevant pieces of information. Our major contributions are two-fold. First, we have developed an end-to-end methodology for constructing Knowledge Graphs for Earth Science (KG4ES) using an existing corpus of journal papers and reports. One of the key challenges in any machine learning application, especially deep learning, is the need for large and robust training datasets. We have developed techniques capable of automatically retraining models and incrementally building and updating KG4ES, based on ever-evolving training data. We also adopt an evaluation instrument based on common research methodologies used in Earth science research, especially in Atmospheric Science. Second, we have developed an algorithm to infer new knowledge by exploiting the constructed KG4ES. In more detail, we have developed a network prediction algorithm aiming to explore and predict possible new connections in the KG4ES and aid in new knowledge discovery.
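The network prediction algorithm for KG4ES is not detailed in the abstract; the sketch below shows a generic neighborhood-based link-prediction baseline (Jaccard coefficient over non-adjacent node pairs) on a tiny invented concept graph, purely to make the idea of predicting possible new connections concrete.

```python
import networkx as nx

# Toy concept co-occurrence graph; node names are invented for illustration.
G = nx.Graph()
G.add_edges_from([
    ("sea surface temperature", "hurricane intensity"),
    ("sea surface temperature", "El Nino"),
    ("El Nino", "precipitation anomaly"),
    ("hurricane intensity", "wind shear"),
    ("precipitation anomaly", "soil moisture"),
])

# Score non-adjacent concept pairs by the Jaccard coefficient of their neighborhoods;
# high-scoring pairs are candidate "missing" edges worth examining.
candidates = nx.jaccard_coefficient(G)          # defaults to all non-edges
ranked = sorted(candidates, key=lambda t: t[2], reverse=True)
for u, v, score in ranked[:5]:
    print(f"{u!r} -- {v!r}: {score:.2f}")
```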
2015 Army Science Planning and Strategy Meeting Series: Outcomes and Conclusions
2017-12-21
modeling and nanoscale characterization tools to enable efficient design of hybridized manufacturing; real-time, multiscale computational capability...to enable predictive analytics for expeditionary on-demand manufacturing • Discovery of design principles to enable programming advanced genetic...goals, significant research is needed to mature the fundamental materials science, processing and manufacturing sciences, design methodologies, data
On the Growth of Scientific Knowledge: Yeast Biology as a Case Study
He, Xionglei; Zhang, Jianzhi
2009-01-01
The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration. PMID:19300476
On the growth of scientific knowledge: yeast biology as a case study.
He, Xionglei; Zhang, Jianzhi
2009-03-01
The tempo and mode of human knowledge expansion is an enduring yet poorly understood topic. Through a temporal network analysis of three decades of discoveries of protein interactions and genetic interactions in baker's yeast, we show that the growth of scientific knowledge is exponential over time and that important subjects tend to be studied earlier. However, expansions of different domains of knowledge are highly heterogeneous and episodic such that the temporal turnover of knowledge hubs is much greater than expected by chance. Familiar subjects are preferentially studied over new subjects, leading to a reduced pace of innovation. While research is increasingly done in teams, the number of discoveries per researcher is greater in smaller teams. These findings reveal collective human behaviors in scientific research and help design better strategies in future knowledge exploration.
Ishii, Masaru
2015-06-01
Recent advances in intravital bone imaging technology have enabled us to grasp real cellular behaviors and functions in vivo, revolutionizing the field of drug discovery for novel therapeutics against intractable bone diseases. In this chapter, I present updated information on the pharmacological actions of several anti-bone-resorptive agents, which could only be derived from advanced imaging techniques, and also discuss the future perspectives of this new trend in drug discovery.
Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns
Volrathongchia, Kanittha
2003-01-01
In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
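As a concrete illustration of the kind of simple pre-processing the abstract refers to, the pandas sketch below removes redundant observations, flags implausible values, and imputes missing ones; the column names, values, and thresholds are assumptions for illustration, not the study's actual dataset.

```python
import pandas as pd

# Hypothetical public-health extract; column names are assumptions for illustration.
raw = pd.DataFrame({
    "case_id": [1, 1, 2, 3, 4],
    "age":     [34, 34, None, 51, 430],   # a missing value and an impossible age
    "county":  ["A", "A", "B", None, "C"],
})

clean = (
    raw.drop_duplicates(subset="case_id")                                   # redundant observations
       .assign(age=lambda d: d["age"].where(d["age"].between(0, 120)))      # mask implausible ages
)
clean["age"] = clean["age"].fillna(clean["age"].median())    # simple imputation
clean["county"] = clean["county"].fillna("unknown")

print(clean)
```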
Exploiting Early Intent Recognition for Competitive Advantage
2009-01-01
basketball [Bhandari et al., 1997; Jug et al., 2003], and Robocup soccer simulations [Riley and Veloso, 2000; 2002; Kuhlmann et al., 2006] and non...actions (e.g. before, after, around). Jug et al. [2003] used a similar framework for offline basketball game analysis. More recently, Hess et al...and K. Ramanujam. Advanced Scout: Data mining and knowledge discovery in NBA data. Data Mining and Knowledge Discovery, 1(1):121–125, 1997. [Chang
ERIC Educational Resources Information Center
Fyfe, Emily R.; DeCaro, Marci S.; Rittle-Johnson, Bethany
2013-01-01
An emerging consensus suggests that guided discovery, which combines discovery and instruction, is a more effective educational approach than either one in isolation. The goal of this study was to examine two specific forms of guided discovery, testing whether conceptual instruction should precede or follow exploratory problem solving. In both…
ERIC Educational Resources Information Center
Liu, Chen-Chung; Don, Ping-Hsing; Chung, Chen-Wei; Lin, Shao-Jun; Chen, Gwo-Dong; Liu, Baw-Jhiune
2010-01-01
While Web discovery is usually undertaken as a solitary activity, Web co-discovery may transform Web learning activities from the isolated individual search process into interactive and collaborative knowledge exploration. Recent studies have proposed Web co-search environments on a single computer, supported by multiple one-to-one technologies.…
Knowledge Management in Higher Education: A Knowledge Repository Approach
ERIC Educational Resources Information Center
Wedman, John; Wang, Feng-Kwei
2005-01-01
One might expect higher education, where the discovery and dissemination of new and useful knowledge is vital, to be among the first to implement knowledge management practices. Surprisingly, higher education has been slow to implement knowledge management practices (Townley, 2003). This article describes an ongoing research and development effort…
Rector, Annabel; Tachezy, Ruth; Van Ranst, Marc
2004-01-01
The discovery of novel viruses has often been accomplished by using hybridization-based methods that necessitate the availability of a previously characterized virus genome probe or knowledge of the viral nucleotide sequence to construct consensus or degenerate PCR primers. In their natural replication cycle, certain viruses employ a rolling-circle mechanism to propagate their circular genomes, and multiply primed rolling-circle amplification (RCA) with φ29 DNA polymerase has recently been applied in the amplification of circular plasmid vectors used in cloning. We employed an isothermal RCA protocol that uses random hexamer primers to amplify the complete genomes of papillomaviruses without the need for prior knowledge of their DNA sequences. We optimized this RCA technique with extracted human papillomavirus type 16 (HPV-16) DNA from W12 cells, using a real-time quantitative PCR assay to determine amplification efficiency, and obtained a 2.4 × 10^4-fold increase in HPV-16 DNA concentration. We were able to clone the complete HPV-16 genome from this multiply primed RCA product. The optimized protocol was subsequently applied to a bovine fibropapillomatous wart tissue sample. Whereas no papillomavirus DNA could be detected by restriction enzyme digestion of the original sample, multiply primed RCA enabled us to obtain a sufficient amount of papillomavirus DNA for restriction enzyme analysis, cloning, and subsequent sequencing of a novel variant of bovine papillomavirus type 1. The multiply primed RCA method allows the discovery of previously unknown papillomaviruses, and possibly also other circular DNA viruses, without a priori sequence information. PMID:15113879
Improving drug safety: From adverse drug reaction knowledge discovery to clinical implementation.
Tan, Yuxiang; Hu, Yong; Liu, Xiaoxiao; Yin, Zhinan; Chen, Xue-Wen; Liu, Mei
2016-11-01
Adverse drug reactions (ADRs) are a major public health concern, causing over 100,000 fatalities in the United States every year with an annual cost of $136 billion. Early detection and accurate prediction of ADRs is thus vital for drug development and patient safety. Multiple scientific disciplines, namely pharmacology, pharmacovigilance, and pharmacoinformatics, have been addressing the ADR problem from different perspectives. With the same goal of improving drug safety, this article summarizes and links the research efforts of these disciplines into a single framework, moving from a comprehensive understanding of the interactions between drugs and biological systems, to the identification of genetic and phenotypic predispositions of patients susceptible to higher ADR risk, and finally to the current state of implementation of medication-related decision support systems. We start by describing available computational resources for building drug-target interaction networks with biological annotations, which provide fundamental knowledge for ADR prediction. Databases are classified by function to help users in selection. Post-marketing surveillance is then introduced, where data-driven approaches can not only enhance the prediction accuracy of ADRs but also enable the discovery of genetic and phenotypic risk factors of ADRs. Understanding genetic risk factors for ADRs requires well-organized patient genetic information and analysis by pharmacogenomic approaches. Finally, the current state of clinical decision support systems is presented, describing how clinicians can be assisted by the integrated knowledge base to minimize the risk of ADRs. This review ends with a discussion of existing challenges in each of these disciplines, with potential solutions and future directions. Copyright © 2016 Elsevier Inc. All rights reserved.
Empirical study using network of semantically related associations in bridging the knowledge gap.
Abedi, Vida; Yeasin, Mohammed; Zand, Ramin
2014-11-27
The data overload has created a new set of challenges in finding meaningful and relevant information with minimal cognitive effort. However, designing robust and scalable knowledge discovery systems remains a challenge. Recent innovations in (biological) literature mining tools have opened new avenues to understand the confluence of various diseases, genes, risk factors and biological processes, bridging the gaps between the massive amounts of scientific data and the harvesting of useful knowledge. In this paper, we highlight some of the findings obtained using a text analytics tool called ARIANA (Adaptive Robust and Integrative Analysis for finding Novel Associations). Empirical study using ARIANA reveals knowledge discovery instances that illustrate the efficacy of such a tool. For example, ARIANA can capture the connection between the drug hexamethonium and pulmonary inflammation and fibrosis that caused the tragic death of a healthy volunteer in a 2001 Johns Hopkins asthma study, even though the abstract of the study was not part of the semantic model. An integrated system such as ARIANA could assist the human expert in exploratory literature search by bringing forward hidden associations, promoting data reuse and knowledge discovery, and stimulating interdisciplinary projects by connecting information across disciplines.
Scientific Training in the Era of Big Data: A New Pedagogy for Graduate Education.
Aikat, Jay; Carsey, Thomas M; Fecho, Karamarie; Jeffay, Kevin; Krishnamurthy, Ashok; Mucha, Peter J; Rajasekar, Arcot; Ahalt, Stanley C
2017-03-01
The era of "big data" has radically altered the way scientific research is conducted and new knowledge is discovered. Indeed, the scientific method is rapidly being complemented and even replaced in some fields by data-driven approaches to knowledge discovery. This paradigm shift is sometimes referred to as the "fourth paradigm" of data-intensive and data-enabled scientific discovery. Interdisciplinary research with a hard emphasis on translational outcomes is becoming the norm in all large-scale scientific endeavors. Yet, graduate education remains largely focused on individual achievement within a single scientific domain, with little training in team-based, interdisciplinary data-oriented approaches designed to translate scientific data into new solutions to today's critical challenges. In this article, we propose a new pedagogy for graduate education: data-centered learning for the domain-data scientist. Our approach is based on four tenets: (1) Graduate training must incorporate interdisciplinary training that couples the domain sciences with data science. (2) Graduate training must prepare students for work in data-enabled research teams. (3) Graduate training must include education in teaming and leadership skills for the data scientist. (4) Graduate training must provide experiential training through academic/industry practicums and internships. We emphasize that this approach is distinct from today's graduate training, which offers training in either data science or a domain science (e.g., biology, sociology, political science, economics, and medicine), but does not integrate the two within a single curriculum designed to prepare the next generation of domain-data scientists. We are in the process of implementing the proposed pedagogy through the development of a new graduate curriculum based on the above four tenets, and we describe herein our strategy, progress, and lessons learned. While our pedagogy was developed in the context of graduate education, the general approach of data-centered learning can and should be applied to students and professionals at any stage of their education, including at the K-12, undergraduate, graduate, and professional levels. We believe that the time is right to embed data-centered learning within our educational system and, thus, generate the talent required to fully harness the potential of big data.
MetaCoMET: a web platform for discovery and visualization of the core microbiome
USDA-ARS?s Scientific Manuscript database
A key component of the analysis of microbiome datasets is the identification of OTUs shared between multiple experimental conditions, commonly referred to as the core microbiome. Results: We present a web platform named MetaCoMET that enables the discovery and visualization of the core microbiome an...
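The MetaCoMET snippet above is truncated, so the sketch below only illustrates the underlying set operation: a core microbiome computed as the OTUs that exceed a prevalence threshold in every experimental group. The sample data and the "present in at least a given fraction of samples per group" definition are illustrative assumptions.

```python
def core_microbiome(samples_by_group, min_prevalence=0.8):
    """Return OTUs present in at least `min_prevalence` of samples
    in every group -- one common working definition of the 'core'."""
    core_per_group = []
    for group, samples in samples_by_group.items():
        counts = {}
        for otus in samples:                   # each sample is a set of OTU ids
            for otu in otus:
                counts[otu] = counts.get(otu, 0) + 1
        threshold = min_prevalence * len(samples)
        core_per_group.append({o for o, c in counts.items() if c >= threshold})
    return set.intersection(*core_per_group)

# Invented example: OTU sets per sample in two experimental conditions.
groups = {
    "treatment": [{"otu1", "otu2", "otu3"}, {"otu1", "otu2"}, {"otu1", "otu2", "otu4"}],
    "control":   [{"otu1", "otu2"}, {"otu1", "otu5"}, {"otu1", "otu2", "otu5"}],
}
print(core_microbiome(groups, min_prevalence=0.66))   # -> {'otu1', 'otu2'}
```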
Service-based analysis of biological pathways
Zheng, George; Bouguettaya, Athman
2009-01-01
Background: Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled by leveraging a Web service-based modeling and mining approach. Results: Inspired by molecular recognition and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation verifying the user's hypotheses. Conclusion: Web service modeling of biological processes allows easy access to and invocation of these processes on the Web. The Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. The algorithms presented here for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using the simulation algorithm that is also described in this paper. PMID:19796403
'Ethos' Enabling Organisational Knowledge Creation
NASA Astrophysics Data System (ADS)
Matsudaira, Yoshito
This paper examines knowledge creation in relation to improvements on the production line in the manufacturing department of Nissan Motor Company, and aims to clarify the embodied knowledge observed in the actions of organisational members who enable knowledge creation. For that purpose, this study adopts an approach that adds first-, second-, and third-person viewpoints to the theory of knowledge creation. The embodied knowledge observed in the actions of organisational members who enable knowledge creation is the continued practice of 'ethos' (in Greek), founded in the Nissan Production Way as an ethical basis. Ethos is a knowledge (intangible) asset for knowledge-creating companies. Substantiated analysis classifies ethos into three categories: the individual, the team, and the organisation. This indicates the precise actions of the organisational members in each category during the knowledge creation process. This research shows the indispensability of ethos, a new concept of knowledge assets that enables knowledge creation, for future knowledge-based management in the knowledge society.
Analytic Steering: Inserting Context into the Information Dialog
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bohn, Shawn J.; Calapristi, Augustin J.; Brown, Shyretha D.
2011-10-23
An analyst's intrinsic domain knowledge is a primary asset in almost any analysis task. Unstructured text analysis systems that apply unsupervised content analysis approaches can be more effective if they can leverage this domain knowledge in a manner that augments the information discovery process without obfuscating new or unexpected content. Current unsupervised approaches rely upon the prowess of the analyst to submit the right queries or to observe generalized document and term relationships from ranked or visual results. We propose a new approach that allows the user to control, or steer, the analytic view within the unsupervised space. This process is controlled through the data characterization step via user-supplied context in the form of a collection of key terms. We show that steering with an appropriate choice of key terms can provide better relevance to the analytic domain while still enabling the analyst to uncover unexpected relationships; this paper discusses cases where various analytic steering approaches provide enhanced analysis results and cases where analytic steering can have a negative impact on the analysis process.
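The paper's steering mechanism is not spelled out in the abstract; one simple way to realize the idea is to boost the weight of analyst-supplied key terms in the document representation before unsupervised clustering, as in the hedged sketch below. The documents, key terms, and boost factor are invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "flooding damaged the levee and the river gauge network",
    "the new vaccine trial reported strong antibody response",
    "river discharge forecasts improved after the sensor upgrade",
    "clinical trial enrollment slowed during the winter",
]
key_terms = ["river", "flooding", "gauge"]   # analyst-supplied steering context
BOOST = 3.0                                  # illustrative weighting factor

vec = TfidfVectorizer()
X = vec.fit_transform(docs).toarray()
for term in key_terms:
    col = vec.vocabulary_.get(term)
    if col is not None:
        X[:, col] *= BOOST                   # steer the representation toward the analyst's domain

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(list(zip(labels, docs)))
```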
Gramatica, Ruggero; Di Matteo, T; Giorgetti, Stefano; Barbiani, Massimo; Bevec, Dorian; Aste, Tomaso
2014-01-01
We introduce a methodology to efficiently exploit biomedical knowledge expressed in natural language for repurposing existing drugs towards diseases for which they were not initially intended. Leveraging developments in Computational Linguistics and Graph Theory, a methodology is defined to build a graph representation of knowledge, which is automatically analysed to discover hidden relations between any drug and any disease: these relations are specific paths among the biomedical entities of the graph, representing possible Modes of Action for any given pharmacological compound. We propose a measure for the likeliness of these paths based on a stochastic process on the graph. This measure depends on the abundance of indirect paths between a peptide and a disease, rather than solely on the strength of the shortest path connecting them. We provide real-world examples, showing how the method successfully retrieves known pathophysiological Modes of Action and finds new ones by meaningfully selecting and aggregating contributions from known bio-molecular interactions. Applications of this methodology are presented, and prove the efficacy of the method for selecting drugs as treatment options for rare diseases.
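A minimal numerical sketch of the kind of path-abundance measure the abstract describes: on a tiny invented drug-protein-process-disease graph, a drug-disease pair is scored by a damped sum of powers of the row-normalized adjacency matrix, so that many indirect paths raise the score while longer paths contribute less. This illustrates the general idea only, not the authors' exact stochastic process.

```python
import numpy as np

# Tiny invented knowledge graph: drug -> protein/process -> disease links.
nodes = ["drugA", "proteinP", "processQ", "diseaseX"]
A = np.array([
    [0, 1, 1, 0],   # drugA interacts with proteinP and processQ
    [0, 0, 1, 1],   # proteinP participates in processQ, associated with diseaseX
    [0, 0, 0, 1],   # processQ associated with diseaseX
    [0, 0, 0, 0],
], dtype=float)

# Row-normalized transition matrix for a random walk on the graph.
row_sums = A.sum(axis=1, keepdims=True)
P = np.divide(A, row_sums, out=np.zeros_like(A), where=row_sums > 0)

# Score = damped sum over walk lengths; longer indirect paths contribute less.
beta, K = 0.5, 6
score = sum((beta ** k) * np.linalg.matrix_power(P, k) for k in range(1, K + 1))
i, j = nodes.index("drugA"), nodes.index("diseaseX")
print(f"path-abundance score drugA -> diseaseX: {score[i, j]:.3f}")
```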
Hackathons as a means of accelerating scientific discoveries and knowledge transfer.
Ghouila, Amel; Siwo, Geoffrey Henry; Entfellner, Jean-Baka Domelevo; Panji, Sumir; Button-Simons, Katrina A; Davis, Sage Zenon; Fadlelmola, Faisal M; Ferdig, Michael T; Mulder, Nicola
2018-05-01
Scientific research plays a key role in the advancement of human knowledge and pursuit of solutions to important societal challenges. Typically, research occurs within specific institutions where data are generated and subsequently analyzed. Although collaborative science bringing together multiple institutions is now common, in such collaborations the analytical processing of the data is often performed by individual researchers within the team, with only limited internal oversight and critical analysis of the workflow prior to publication. Here, we show how hackathons can be a means of enhancing collaborative science by enabling peer review before results of analyses are published by cross-validating the design of studies or underlying data sets and by driving reproducibility of scientific analyses. Traditionally, in data analysis processes, data generators and bioinformaticians are divided and do not collaborate on analyzing the data. Hackathons are a good strategy to build bridges over the traditional divide and are potentially a great agile extension to the more structured collaborations between multiple investigators and institutions. © 2018 Ghouila et al.; Published by Cold Spring Harbor Laboratory Press.
Transforming practice into clinical scholarship.
Limoges, Jacqueline; Acorn, Sonia
2016-04-01
The aims of this paper were to explicate clinical scholarship as synonymous with the scholarship of application and to explore the evolution of scholarly practice to clinical scholarship. Boyer contributed an expanded view of scholarship that recognized various approaches to knowledge production beyond pure research (discovery) to include the scholarship of integration, application and teaching. There is growing interest in using Boyer's framework to advance knowledge production in nursing but the discussion of clinical scholarship in relation to Boyer's framework is sparse. Discussion paper. Literature from 1983-2015 and Boyer's framework. When clinical scholarship is viewed as a synonym for Boyer's scholarship of application, it can be aligned to this well established framework to support knowledge generated in clinical practice. For instance, applying the three criteria for scholarship (documentation, peer review and dissemination) can ensure that the knowledge produced is rigorous, available for critique and used by others to advance nursing practice and patient care. Understanding the differences between scholarly practice and clinical scholarship can promote the development of clinical scholarship. Supporting clinical leaders to identify issues confronting nursing practice can enable scholarly practice to be transformed into clinical scholarship. Expanding the understanding of clinical scholarship and linking it to Boyer's scholarship of application can assist nurses to generate knowledge that addresses clinical concerns. Further dialogue about how clinical scholarship can address the theory-practice gap and how publication of clinical scholarship could be expanded given the goals of clinical scholarship is warranted. © 2016 John Wiley & Sons Ltd.
Text-based discovery in biomedicine: the architecture of the DAD-system.
Weeber, M; Klein, H; Aronson, A R; Mork, J G; de Jong-van den Berg, L T; Vos, R
2000-01-01
Current scientific research takes place in highly specialized contexts with poor communication between disciplines as a likely consequence. Knowledge from one discipline may be useful for the other without researchers knowing it. As scientific publications are a condensation of this knowledge, literature-based discovery tools may help the individual scientist to explore new useful domains. We report on the development of the DAD-system, a concept-based Natural Language Processing system for PubMed citations that provides the biomedical researcher such a tool. We describe the general architecture and illustrate its operation by a simulation of a well-known text-based discovery: The favorable effects of fish oil on patients suffering from Raynaud's disease [1].
Which are the greatest recent discoveries and the greatest future challenges in nutrition?
Katan, M B; Boekschoten, M V; Connor, W E; Mensink, R P; Seidell, J; Vessby, B; Willett, W
2009-01-01
Nutrition science aims to create new knowledge, but scientists rarely sit back to reflect on what nutrition research has achieved in recent decades. We report the outcome of a 1-day symposium at which the audience was asked to vote on the greatest discoveries in nutrition since 1976 and on the greatest challenges for the coming 30 years. Most of the 128 participants were Dutch scientists working in nutrition or related biomedical and public health fields. Candidate discoveries and challenges were nominated by five invited speakers and by members of the audience. Ballot forms were then prepared on which participants selected one discovery and one challenge. A total of 15 discoveries and 14 challenges were nominated. The audience elected "Folic acid prevents birth defects" as the greatest discovery in nutrition science since 1976. "Controlling obesity and insulin resistance through activity and diet" was elected as the greatest challenge for the coming 30 years. This selection was probably biased by the interests and knowledge of the speakers and the audience. For the present review, we therefore added 12 discoveries from the period 1976 to 2006 that we judged worthy of consideration, but that had not been nominated at the meeting. The meeting did not represent an objective selection process, but it did demonstrate that the past 30 years have yielded major new discoveries in nutrition and health.
Translating three states of knowledge--discovery, invention, and innovation
2010-01-01
Background: Knowledge Translation (KT) has historically focused on the proper use of knowledge in healthcare delivery. A knowledge base has been created through empirical research and resides in scholarly literature. Some knowledge is amenable to direct application by stakeholders who are engaged during or after the research process, as shown by the Knowledge to Action (KTA) model. Other knowledge requires multiple transformations before achieving utility for end users. For example, conceptual knowledge generated through science or engineering may become embodied as a technology-based invention through development methods. The invention may then be integrated within an innovative device or service through production methods. To what extent is KT relevant to these transformations? How might the KTA model accommodate these additional development and production activities while preserving the KT concepts? Discussion: Stakeholders adopt and use knowledge that has perceived utility, such as a solution to a problem. Achieving a technology-based solution involves three methods that generate knowledge in three states, analogous to the three classic states of matter. Research activity generates discoveries that are intangible and highly malleable like a gas; development activity transforms discoveries into inventions that are moderately tangible yet still malleable like a liquid; and production activity transforms inventions into innovations that are tangible and immutable like a solid. The paper demonstrates how the KTA model can accommodate all three types of activity and address all three states of knowledge. Linking the three activities in one model also illustrates the importance of engaging the relevant stakeholders prior to initiating any knowledge-related activities. Summary: Science and engineering focused on technology-based devices or services change the state of knowledge through three successive activities. Achieving knowledge implementation requires methods that accommodate these three activities and knowledge states. Accomplishing beneficial societal impacts from technology-based knowledge involves the successful progression through all three activities, and the effective communication of each successive knowledge state to the relevant stakeholders. The KTA model appears suitable for structuring and linking these processes. PMID:20205873
Teng, Rui; Leibnitz, Kenji; Miura, Ryu
2013-01-01
An essential application of wireless sensor networks is to successfully respond to user queries. Query packet losses occur during query dissemination due to wireless communication problems such as interference, multipath fading, packet collisions, etc. The losses of query messages at sensor nodes result in the failure of sensor nodes to report the requested data. Hence, the reliable and successful dissemination of query messages to sensor nodes is a non-trivial problem. The target of this paper is to enable highly successful query delivery to sensor nodes by localized and energy-efficient discovery and recovery of query losses. We adopt local and collective cooperation among sensor nodes to increase the success rate of distributed discoveries and recoveries. To enable scalability in the operations of discovery and recovery, we employ a distributed name resolution mechanism at each sensor node that allows sensor nodes to self-detect correlated queries and query losses, and then respond to the query losses locally and efficiently. We prove that the collective discovery of query losses has a high impact on the success of query dissemination and reveal that scalability can be achieved by using the proposed approach. We further study the novel features of cooperation and competition in the collective recovery at the PHY and MAC layers, and show that an appropriate number of detectors can achieve an optimal recovery success rate. We evaluate the proposed approach with both mathematical analyses and computer simulations. The proposed approach enables a high rate of successful delivery of query messages and results in short route lengths when recovering from query losses. It is scalable and operates in a fully distributed manner. PMID:23748172
Semantic Service Design for Collaborative Business Processes in Internetworked Enterprises
NASA Astrophysics Data System (ADS)
Bianchini, Devis; Cappiello, Cinzia; de Antonellis, Valeria; Pernici, Barbara
Modern collaborating enterprises can be seen as borderless organizations whose processes are dynamically transformed and integrated with those of their partners (Internetworked Enterprises, IE), thus enabling the design of collaborative business processes. The adoption of Semantic Web and service-oriented technologies for implementing collaboration in such distributed and heterogeneous environments promises significant benefits. IE can model their own processes independently by using the Software as a Service (SaaS) paradigm. Each enterprise maintains a catalog of available services, and these can be shared across IE and reused to build up complex collaborative processes. Moreover, each enterprise can adopt its own terminology and concepts to describe business processes and component services. This creates a requirement to manage the semantic heterogeneity of process descriptions distributed across different enterprise systems. To enable effective service-based collaboration, IEs have to standardize their process descriptions and model them through component services using the same approach and principles. For enabling collaborative business processes across IE, services should be designed following a homogeneous approach, possibly maintaining a uniform level of granularity. In the paper we propose an ontology-based semantic modeling approach for enriching and reconciling the semantics of process descriptions, in order to facilitate process knowledge management and to enable semantic service design (through discovery, reuse and integration of process elements/constructs). The approach brings together Semantic Web technologies, process modeling techniques, ontology building and semantic matching in order to provide a comprehensive semantic modeling framework.
Yewdell, Jonathan W.
2009-01-01
Making discoveries is the most important part of being a scientist, and also the most fun. Young scientists need to develop the experimental and mental skill sets that enable them to make discoveries, including how to recognize and exploit serendipity when it strikes. Here, I provide practical advice to young scientists on choosing a research topic, designing, performing and interpreting experiments and, last but not least, on maintaining your sanity in the process. PMID:18401347
Photoreactive Stapled BH3 Peptides to Dissect the BCL-2 Family Interactome
Braun, Craig R.; Mintseris, Julian; Gavathiotis, Evripidis; Bird, Gregory H.; Gygi, Steven P.; Walensky, Loren D.
2010-01-01
Summary: Defining protein interactions forms the basis for discovery of biological pathways, disease mechanisms, and opportunities for therapeutic intervention. To harness the robust binding affinity and selectivity of structured peptides for interactome discovery, we engineered photoreactive stapled BH3 peptide helices that covalently capture their physiologic BCL-2 family targets. The crosslinking α-helices covalently trap both static and dynamic protein interactors, and enable rapid identification of interaction sites, providing a critical link between interactome discovery and targeted drug design. PMID:21168768
Distributed data mining on grids: services, tools, and applications.
Cannataro, Mario; Congiusta, Antonio; Pugliese, Andrea; Talia, Domenico; Trunfio, Paolo
2004-12-01
Data mining algorithms are widely used today for the analysis of large corporate and scientific datasets stored in databases and data archives. Industry, science, and commerce fields often need to analyze very large datasets maintained over geographically distributed sites by using the computational power of distributed and parallel systems. The grid can play a significant role in providing an effective computational support for distributed knowledge discovery applications. For the development of data mining applications on grids we designed a system called Knowledge Grid. This paper describes the Knowledge Grid framework and presents the toolset provided by the Knowledge Grid for implementing distributed knowledge discovery. The paper discusses how to design and implement data mining applications by using the Knowledge Grid tools starting from searching grid resources, composing software and data components, and executing the resulting data mining process on a grid. Some performance results are also discussed.
Knowledge Discovery/A Collaborative Approach, an Innovative Solution
NASA Technical Reports Server (NTRS)
Fitts, Mary A.
2009-01-01
Collaboration between Medical Informatics and Healthcare Systems (MIHCS) at NASA/Johnson Space Center (JSC) and the Texas Medical Center (TMC) Library was established to investigate technologies for facilitating knowledge discovery across multiple life sciences research disciplines in multiple repositories. After a review of 14 potential Enterprise Search System (ESS) solutions, Collexis was determined to best meet the expressed needs. A three-month pilot evaluation of Collexis produced positive reports from multiple scientists across 12 research disciplines. The joint venture and a pilot-phased approach achieved the desired results without the high cost of purchasing software, hardware or additional resources to conduct the task. Medical research is highly compartmentalized by discipline, e.g. cardiology, immunology, neurology. The medical research community at large, as well as at JSC, recognizes the need for cross-referencing relevant information to generate best evidence. Cross-discipline collaboration at JSC is specifically required to close knowledge gaps affecting space exploration. To facilitate knowledge discovery across these communities, MIHCS combined expertise with the TMC library and found Collexis to best fit the needs of our researchers including:
Building Scalable Knowledge Graphs for Earth Science
NASA Technical Reports Server (NTRS)
Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian
2017-01-01
Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource that knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end, semi-automated methodology for constructing Knowledge Graphs for Earth Science.
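A minimal sketch of the kind of entity-relationship graph such a methodology might produce is given below, using networkx. The entities, relations, and the two-hop recommendation query are illustrative assumptions, not part of the methodology described above.

```python
import networkx as nx

# Illustrative (subject, relation, object) triples that might be extracted
# from Earth-science papers; these examples are assumptions, not output of
# the methodology described in the abstract.
triples = [
    ("GPM", "observes", "precipitation"),
    ("precipitation", "drives", "flooding"),
    ("SMAP", "observes", "soil moisture"),
    ("soil moisture", "modulates", "flooding"),
]

kg = nx.DiGraph()
for subj, rel, obj in triples:
    kg.add_edge(subj, obj, relation=rel)

# A simple recommendation-style query: which entities are linked, within two
# hops, to the phenomenon "flooding"?
related = nx.single_source_shortest_path_length(kg.reverse(), "flooding", cutoff=2)
print(sorted(n for n, d in related.items() if d > 0))
# ['GPM', 'SMAP', 'precipitation', 'soil moisture']
```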
Genetic discoveries and nursing implications for complex disease prevention and management.
Frazier, Lorraine; Meininger, Janet; Halsey Lea, Dale; Boerwinkle, Eric
2004-01-01
The purpose of this article is to examine the management of patients with complex diseases, in light of recent genetic discoveries, and to explore how these genetic discoveries will impact nursing practice and nursing research. The nursing science processes discussed are not comprehensive of all nursing practice but, instead, are concentrated in areas where genetics will have the greatest influence. Advances in genetic science will revolutionize our approach to patients and to health care in the prevention, diagnosis, and treatment of disease, raising many issues for nursing research and practice. As the scope of genetics expands to encompass multifactorial disease processes, a continuing reexamination of the knowledge base is required for nursing practice, with incorporation of genetic knowledge into the repertoire of every nurse, and with advanced knowledge for nurses who select specialty roles in the genetics area. This article explores the impact of this revolution on nursing science and practice as well as the opportunities for nursing science and practice to participate fully in this revolution. Because of the high proportion of the population at risk for complex diseases and because nurses are occupied every day in the prevention, assessment, treatment, and therapeutic intervention of patients with such diseases in practice and research, there is great opportunity for nurses to improve health care through the application (nursing practice) and discovery (nursing research) of genetic knowledge.
Medical data mining: knowledge discovery in a clinical data warehouse.
Prather, J. C.; Lobach, D. F.; Goodwin, L. K.; Hales, J. W.; Hage, M. L.; Hammond, W. E.
1997-01-01
Clinical databases have accumulated large quantities of information about patients and their medical conditions. Relationships and patterns within this data could provide new medical knowledge. Unfortunately, few methodologies have been developed and applied to discover this hidden knowledge. In this study, the techniques of data mining (also known as Knowledge Discovery in Databases) were used to search for relationships in a large clinical database. Specifically, data accumulated on 3,902 obstetrical patients were evaluated for factors potentially contributing to preterm birth using exploratory factor analysis. Three factors were identified by the investigators for further exploration. This paper describes the processes involved in mining a clinical database including data warehousing, data query and cleaning, and data analysis. PMID:9357597
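The sketch below illustrates the exploratory-factor-analysis step on synthetic data standing in for warehoused clinical variables. The variable counts and the two-factor structure are assumptions for illustration, not the study's actual data or model.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for warehoused clinical variables (columns) on patients
# (rows); the real study used 3,902 obstetrical records with many more fields.
n_patients, n_vars = 500, 8
latent = rng.normal(size=(n_patients, 2))          # two hidden risk factors
loadings = rng.normal(size=(2, n_vars))
X = latent @ loadings + 0.5 * rng.normal(size=(n_patients, n_vars))

fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(X)

# Variables loading strongly on the same factor are candidates for a shared
# underlying construct, e.g. a cluster of correlated risk indicators that
# investigators would then examine further.
for i, row in enumerate(fa.components_):
    top = np.argsort(np.abs(row))[::-1][:3]
    print(f"factor {i}: strongest variables -> {top.tolist()}")
```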
NIPTE: a multi-university partnership supporting academic drug development.
Gurvich, Vadim J; Byrn, Stephen R
2013-10-01
The strategic goal of academic translational research is to accelerate translational science through the improvement and development of resources for moving discoveries across translational barriers through 'first in humans' studies. To achieve this goal, access to drug discovery resources and preclinical IND-enabling infrastructure is crucial. One potential approach of research institutions for coordinating preclinical development, based on a model from the National Institute for Pharmaceutical Technology and Education (NIPTE), can provide academic translational and medical centers with access to a wide variety of enabling infrastructure for developing small molecule clinical candidates in an efficient, cost-effective manner. Copyright © 2013 Elsevier Ltd. All rights reserved.
Investigation of the pathogenesis of autoimmune diseases by iPS cells.
Natsumoto, Bunki; Shoda, Hirofumi; Fujio, Keishi; Otsu, Makoto; Yamamoto, Kazuhiko
2017-01-01
Pluripotent stem cells have the ability to self-renew and can, in principle, be differentiated into all cell types. Induced pluripotent stem (iPS) cells overcome the ethical problems associated with human embryonic stem (ES) cells and enable pathological analysis of intractable diseases as well as drug discovery. In vitro disease models based on disease-specific iPS cells allow repeated analyses of human cells without the influence of environmental factors. Although autoimmune diseases are polygenic, autoimmune disease-specific iPS cells are thought to be a promising tool for analyzing the pathogenesis of these diseases and for future drug discovery.
Discovery informatics in biological and biomedical sciences: research challenges and opportunities.
Honavar, Vasant
2015-01-01
New discoveries in biological, biomedical and health sciences are increasingly being driven by our ability to acquire, share, integrate and analyze, and construct and simulate predictive models of biological systems. While much attention has focused on automating routine aspects of management and analysis of "big data", realizing the full potential of "big data" to accelerate discovery calls for automating many other aspects of the scientific process that have so far largely resisted automation: identifying gaps in the current state of knowledge; generating and prioritizing questions; designing studies; designing, prioritizing, planning, and executing experiments; interpreting results; forming hypotheses; drawing conclusions; replicating studies; validating claims; documenting studies; communicating results; reviewing results; and integrating results into the larger body of knowledge in a discipline. Against this background, the PSB workshop on Discovery Informatics in Biological and Biomedical Sciences explores the opportunities and challenges of automating discovery, or of assisting humans in discovery, through advances in (i) understanding, formalizing, and developing information-processing accounts of the entire scientific process; (ii) the design, development, and evaluation of computational artifacts (representations, processes) that embody such understanding; and (iii) the application of the resulting artifacts and systems to advance science (by augmenting individual or collective human efforts, or by fully automating science).
Ligand-based receptor tyrosine kinase partial agonists: New paradigm for cancer drug discovery?
Riese, David J
2011-02-01
INTRODUCTION: Receptor tyrosine kinases (RTKs) are validated targets for oncology drug discovery and several RTK antagonists have been approved for the treatment of human malignancies. Nonetheless, the discovery and development of RTK antagonists has lagged behind the discovery and development of agents that target G-protein coupled receptors. In part, this is because it has been difficult to discover analogs of naturally-occurring RTK agonists that function as antagonists. AREAS COVERED: Here we describe ligands of ErbB receptors that function as partial agonists for these receptors, thereby enabling these ligands to antagonize the activity of full agonists for these receptors. We provide insights into the mechanisms by which these ligands function as antagonists. We discuss how information concerning these mechanisms can be translated into screens for novel small molecule- and antibody-based antagonists of ErbB receptors and how such antagonists hold great potential as targeted cancer chemotherapeutics. EXPERT OPINION: While there have been a number of important findings in this field, identifying the structural basis of ligand functional specificity remains of the greatest importance. While it is true that, with some notable exceptions, peptide hormones and growth factors have not proven to be good platforms for oncology drug discovery, addressing the fundamental issues of antagonistic partial agonists for receptor tyrosine kinases has the potential to steer oncology drug discovery in new directions. Mechanism-based approaches are now emerging to enable the discovery of RTK partial agonists that may antagonize both agonist-dependent and -independent RTK signaling and may hold tremendous promise as targeted cancer chemotherapeutics.
Schmalhofer, F J; Tschaitschian, B
1998-11-01
In this paper, we perform a cognitive analysis of knowledge discovery processes. As a result of this analysis, the construction-integration theory is proposed as a general framework for developing cooperative knowledge evolution systems. We thus suggest that, for the acquisition of new domain knowledge in medicine, one should first construct pluralistic views on a given topic, which may contain inconsistencies as well as redundancies. Only thereafter is this knowledge consolidated into a situation-specific circumscription and the early inconsistencies eliminated. As proof of the viability of such knowledge acquisition processes in medicine, we present the IDEAS system, which can be used for the intelligent documentation of adverse events in clinical studies. This system provides better documentation of the side effects of medical drugs. Knowledge evolution thereby occurs by achieving consistent explanations in increasingly larger contexts (i.e., more cases and more pharmaceutical substrates). Finally, it is shown how prototypes, model-based approaches and cooperative knowledge evolution systems can be distinguished as different classes of knowledge-based systems.
Building Better Decision-Support by Using Knowledge Discovery.
ERIC Educational Resources Information Center
Jurisica, Igor
2000-01-01
Discusses knowledge-based decision-support systems that use artificial intelligence approaches. Addresses the issue of how to create an effective case-based reasoning system for complex and evolving domains, focusing on automated methods for system optimization and domain knowledge evolution that can supplement knowledge acquired from domain…
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.
2012-01-01
Introduction: The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946
Enabling a new Paradigm to Address Big Data and Open Science Challenges
NASA Astrophysics Data System (ADS)
Ramamurthy, Mohan; Fisher, Ward
2017-04-01
Data are not only the lifeblood of the geosciences but have become the currency of the modern world in science and society. Rapid advances in computing, communications, and observational technologies — along with concomitant advances in high-resolution modeling, ensemble and coupled-systems predictions of the Earth system — are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyper-spectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change. And NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data bring challenges along with opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources to extract knowledge and useful information in ways that were previously impossible, so as to enable discoveries and gain new insights. At the same time, there is a growing focus on "reproducibility or replicability in science," which has implications for Open Science. The advent of cloud computing has opened new avenues for addressing both big data and Open Science challenges and for accelerating scientific discoveries. However, to successfully leverage the enormous potential of cloud technologies, data providers and the scientific communities will need to develop new paradigms that enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition. Data providers also need to give scientists an ecosystem that includes data, tools, workflows and other services needed to perform analytics, integration, interpretation, and synthesis, all in the same environment or platform. Instead of moving data to processing systems near users, as is the tradition, the cloud permits one to bring processing, computing, analysis and visualization to the data, the so-called data-proximate workbench capability, also known as server-side processing. In this talk, I will present the ongoing work at Unidata to facilitate a new paradigm for doing science by offering a suite of tools, resources, and platforms that leverage cloud technologies to address both big data and Open Science/reproducibility challenges. That work includes the development and deployment of new protocols for data access and server-side operations, Docker container images of key applications, JupyterHub Python notebook tools, and cloud-based analysis and visualization capability via the CloudIDV tool, all aimed at enabling reproducible workflows and effective use of the accessed data.
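A minimal sketch of the data-proximate (server-side) access pattern described above is shown below, assuming a hypothetical OPeNDAP endpoint on a THREDDS server. The URL, variable name, and coordinate names are placeholders, and running the snippet requires a live endpoint plus an OPeNDAP-capable backend such as netCDF4 or pydap.

```python
import xarray as xr

# Hypothetical OPeNDAP endpoint on a THREDDS server; the URL is a placeholder,
# not a real Unidata service address.
url = "https://thredds.example.org/thredds/dodsC/forecast/temperature.nc"

# Opening the dataset reads only metadata; no bulk data crosses the network yet.
ds = xr.open_dataset(url)

# Subsetting before .load() means only the requested slice is transferred,
# the "data proximate" pattern described above, rather than downloading the
# whole multi-terabyte archive. Variable and coordinate names are assumptions.
subset = ds["air_temperature"].sel(
    time="2017-04-01", lat=slice(30, 50), lon=slice(-110, -90))
local = subset.load()
print(local.shape)
```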
Behavior change interventions: the potential of ontologies for advancing science and practice.
Larsen, Kai R; Michie, Susan; Hekler, Eric B; Gibson, Bryan; Spruijt-Metz, Donna; Ahern, David; Cole-Lewis, Heather; Ellis, Rebecca J Bartlett; Hesse, Bradford; Moser, Richard P; Yi, Jean
2017-02-01
A central goal of behavioral medicine is the creation of evidence-based interventions for promoting behavior change. Scientific knowledge about behavior change could be more effectively accumulated using "ontologies." In information science, an ontology is a systematic method for articulating a "controlled vocabulary" of agreed-upon terms and their inter-relationships. It involves three core elements: (1) a controlled vocabulary specifying and defining existing classes; (2) specification of the inter-relationships between classes; and (3) codification in a computer-readable format to enable knowledge generation, organization, reuse, integration, and analysis. This paper introduces ontologies, provides a review of current efforts to create ontologies related to behavior change interventions and suggests future work. This paper was written by behavioral medicine and information science experts and was developed in partnership between the Society of Behavioral Medicine's Technology Special Interest Group (SIG) and the Theories and Techniques of Behavior Change Interventions SIG. In recent years significant progress has been made in the foundational work needed to develop ontologies of behavior change. Ontologies of behavior change could facilitate a transformation of behavioral science from a field in which data from different experiments are siloed into one in which data across experiments could be compared and/or integrated. This could facilitate new approaches to hypothesis generation and knowledge discovery in behavioral science.
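The sketch below illustrates the three core elements in miniature: a controlled vocabulary with definitions, is-a relations between classes, and a computer-readable encoding that supports aggregation across levels. The class names and definitions are illustrative inventions, not terms from any published behavior change ontology.

```python
# A tiny, computer-readable codification of a controlled vocabulary with
# parent/child (is-a) relations; class names are illustrative only.
ontology = {
    "intervention_technique": {"parent": None,
                               "definition": "Any active component of an intervention"},
    "goal_setting":           {"parent": "intervention_technique",
                               "definition": "Agreeing on a behavioral target"},
    "self_monitoring":        {"parent": "intervention_technique",
                               "definition": "Recording one's own behavior"},
    "diary_keeping":          {"parent": "self_monitoring",
                               "definition": "Self-monitoring via a written diary"},
}

def ancestors(term):
    """Walk the is-a relation upward, so data coded with a specific term can
    be aggregated under its broader classes."""
    chain = []
    parent = ontology[term]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = ontology[parent]["parent"]
    return chain

# A study coded with "diary_keeping" can be pooled with studies coded at the
# broader "self_monitoring" level.
print(ancestors("diary_keeping"))   # ['self_monitoring', 'intervention_technique']
```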
Garseth, Å H; Fritsvold, C; Svendsen, J C; Bang Jensen, B; Mikalsen, A B
2018-01-01
Cardiomyopathy syndrome (CMS) is a severe cardiac disease affecting Atlantic salmon Salmo salar L. The disease was first recognized in farmed Atlantic salmon in Norway in 1985 and subsequently in farmed salmon in the Faroe Islands, Scotland and Ireland. CMS has also been described in wild Atlantic salmon in Norway. The demonstration of CMS as a transmissible disease in 2009, and the subsequent detection and initial characterization of piscine myocarditis virus (PMCV) in 2010 and 2011 were significant discoveries that gave new impetus to the CMS research. In Norway, CMS usually causes mortality in large salmon in ongrowing and broodfish farms, resulting in reduced fish welfare, significant management-related challenges and substantial economic losses. The disease thus has a significant impact on the Atlantic salmon farming industry. There is a need to gain further basic knowledge about the virus, the disease and its epidemiology, but also applied knowledge from the industry to enable the generation and implementation of effective prevention and control measures. This review summarizes the currently available, scientific information on CMS and PMCV with special focus on epidemiology and factors influencing the development of CMS. © 2017 The Authors. Journal of Fish Diseases Published by John Wiley & Sons Ltd.
Toward a Unified Theory of Visual Area V4
Roe, Anna W.; Chelazzi, Leonardo; Connor, Charles E.; Conway, Bevil R.; Fujita, Ichiro; Gallant, Jack L.; Lu, Haidong; Vanduffel, Wim
2016-01-01
Visual area V4 is a midtier cortical area in the ventral visual pathway. It is crucial for visual object recognition and has been a focus of many studies on visual attention. However, there is no unifying view of V4’s role in visual processing. Neither is there an understanding of how its role in feature processing interfaces with its role in visual attention. This review captures our current knowledge of V4, largely derived from electrophysiological and imaging studies in the macaque monkey. Based on recent discovery of functionally specific domains in V4, we propose that the unifying function of V4 circuitry is to enable selective extraction of specific functional domain-based networks, whether it be by bottom-up specification of object features or by top-down attentionally driven selection. PMID:22500626
A two-step approach for mining patient treatment pathways in administrative healthcare databases.
Najjar, Ahmed; Reinharz, Daniel; Girouard, Catherine; Gagné, Christian
2018-05-01
Clustering electronic medical records allows the discovery of information on healthcare practices. Entries in such medical records are usually composed of a succession of diagnostic or therapeutic steps. The corresponding processes are complex and heterogeneous, since they depend on medical knowledge integrating clinical guidelines, the physician's individual experience, and patient data and conditions. To analyze such data, we first propose to cluster medical visits, consultations, and hospital stays into homogeneous groups, and then to construct higher-level patient treatment pathways over these different groups. These pathways are then also clustered to distill typical pathways, enabling interpretation of the clusters by experts. The approach is evaluated on a real-world administrative database of elderly people in Québec suffering from heart failure. Copyright © 2018 Elsevier B.V. All rights reserved.
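A minimal sketch of the two-step idea follows: cluster individual events into homogeneous groups, then represent each patient as a sequence of group labels and cluster those sequences with a precomputed dissimilarity. The features, the toy sequence construction, and the mismatch-rate distance are assumptions for illustration, not the paper's actual method.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering

rng = np.random.default_rng(1)

# Step 1: cluster individual medical events (visits/stays) described by
# illustrative numeric features (e.g. cost, length of stay, number of drugs).
events = rng.normal(size=(300, 3))
event_groups = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(events)

# Each patient is now a sequence of event-group labels (toy assignment here).
patients = [event_groups[i::30] for i in range(30)]

def seq_distance(a, b):
    """Crude sequence dissimilarity: mismatch rate over the aligned prefix."""
    n = min(len(a), len(b))
    return float(np.mean(a[:n] != b[:n]))

# Step 2: cluster the pathways themselves using a precomputed distance matrix
# (older scikit-learn releases use affinity= instead of metric=).
d = np.array([[seq_distance(p, q) for q in patients] for p in patients])
pathway_clusters = AgglomerativeClustering(
    n_clusters=3, metric="precomputed", linkage="average").fit_predict(d)
print(np.bincount(pathway_clusters))
```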
Physics of Intracellular Organization in Bacteria.
Wingreen, Ned S; Huang, Kerwyn Casey
2015-01-01
With the realization that bacteria achieve exquisite levels of spatiotemporal organization has come the challenge of discovering the underlying mechanisms. In this review, we describe three classes of such mechanisms, each of which has physical origins: the use of landmarks, the creation of higher-order structures that enable geometric sensing, and the emergence of length scales from systems of chemical reactions coupled to diffusion. We then examine the diversity of geometric cues that exist even in cells with relatively simple geometries, and end by discussing both new technologies that could drive further discovery and the implications of our current knowledge for the behavior, fitness, and evolution of bacteria. The organizational strategies described here are employed in a wide variety of systems and in species across all kingdoms of life; in many ways they provide a general blueprint for organizing the building blocks of life.
Soh, Jung; Turinsky, Andrei L; Trinh, Quang M; Chang, Jasmine; Sabhaney, Ajay; Dong, Xiaoli; Gordon, Paul Mk; Janzen, Ryan Pw; Hau, David; Xia, Jianguo; Wishart, David S; Sensen, Christoph W
2009-01-01
We have developed a computational framework for spatiotemporal integration of molecular and anatomical datasets in a virtual reality environment. Using two case studies involving gene expression data and pharmacokinetic data, respectively, we demonstrate how existing knowledge bases for molecular data can be semantically mapped onto a standardized anatomical context of the human body. Our data mapping methodology uses ontological representations of heterogeneous biomedical datasets and an ontology reasoner to create complex semantic descriptions of biomedical processes. This framework provides a means to systematically combine an increasing amount of biomedical imaging and numerical data into spatiotemporally coherent graphical representations. Our work enables medical researchers with different expertise to simulate complex phenomena visually and to develop insights through the use of shared data, thus paving the way for pathological inference, developmental pattern discovery and biomedical hypothesis testing.
Cretaceous Footprints Found on Goddard Campus
2012-08-20
About 110 million light years away, the bright, barred spiral galaxy NGC3259 was just forming stars in dark bands of dust and gas. On Earth, a plant-eating dinosaur left footprints in the Cretaceous mud of what would later become the grounds of NASA’s Goddard Space Flight Center in Greenbelt, Md. Local dinosaur hunter Ray Stanford speaks to local press and Goddard officials about this discovery. To read more go to: www.nasa.gov/centers/goddard/news/features/2012/nodosaur.... Credit: NASA/Goddard/Rebecca Roth
Vilariño Besteiro, M P; Pérez Franco, C; Gallego Morales, L; Calvo Sagardoy, R; García de Lorenzo, A
2009-01-01
This paper aims to show how therapeutic strategies can be combined in the treatment of long-standing eating disorders. This way of working, called the "Modelo Santa Cristina", is based on several theoretical paradigms: the Enabling Model, the Action Control Model, the Transtheoretical Model of the change process and the Cognitive-Behavioural Model (cognitive restructuring and learning theories), as well as Gestalt, systemic and psychodrama-oriented techniques. The purpose of the treatment is both the normalization of eating patterns and an increase in patients' self-knowledge, self-acceptance and self-efficacy. The main areas of intervention include exploring ambivalence to change, discovering the functions of symptoms and searching for alternative behaviours, normalizing eating patterns, body image, cognitive restructuring, decision making, communication skills and the elaboration of traumatic experiences.
Application of theoretical methods to increase succinate production in engineered strains.
Valderrama-Gomez, M A; Kreitmayer, D; Wolf, S; Marin-Sanguino, A; Kremling, A
2017-04-01
Computational methods have enabled the discovery of non-intuitive strategies to enhance the production of a variety of target molecules. In the case of succinate production, reviews covering the topic have not yet analyzed the impact and future potential that such methods may have. In this work, we review the application of computational methods to the production of succinic acid. We found that while a total of 26 theoretical studies were published between 2002 and 2016, only 10 studies reported the successful experimental implementation of any kind of theoretical knowledge. None of the experimental studies reported an exact application of the computational predictions. However, the combination of computational analysis with complementary strategies, such as directed evolution and comparative genome analysis, serves as a proof of concept and demonstrates that successful metabolic engineering can be guided by rational computational methods.
Perinatal biomarkers in prematurity: Early identification of neurologic injury
Andrikopoulou, Maria; Almalki, Ahmad; Farzin, Azadeh; Cordeiro, Christina N.; Johnston, Michael V.; Burd, Irina
2014-01-01
Over the past few decades, biomarkers have become increasingly utilized as non-invasive tools in the early diagnosis and management of various clinical conditions. In perinatal medicine, the improved survival of extremely premature infants who are at high risk for adverse neurologic outcomes has increased the demand for the discovery of biomarkers in detecting and predicting the prognosis of infants with neonatal brain injury. By enabling the clinician to recognize potential brain damage early, biomarkers could allow clinicians to intervene at the early stages of disease, and to monitor the efficacy of those interventions. This review will first examine the potential perinatal biomarkers for neurologic complications of prematurity, specifically, intraventricular hemorrhage (IVH), periventricular leukomalacia (PVL) and posthemorrhagic hydrocephalus (PHH). It will also evaluate knowledge gained from animal models regarding the pathogenesis of perinatal brain injury in prematurity. PMID:24768951
Combining data from multiple sources using the CUAHSI Hydrologic Information System
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Ames, D. P.; Horsburgh, J. S.; Goodall, J. L.
2012-12-01
The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has developed a Hydrologic Information System (HIS) to provide better access to data by enabling the publication, cataloging, discovery, retrieval, and analysis of hydrologic data using web services. The CUAHSI HIS is an Internet based system comprised of hydrologic databases and servers connected through web services as well as software for data publication, discovery and access. The HIS metadata catalog lists close to 100 web services registered to provide data through this system, ranging from large federal agency data sets to experimental watersheds managed by University investigators. The system's flexibility in storing and enabling public access to similarly formatted data and metadata has created a community data resource from governmental and academic data that might otherwise remain private or analyzed only in isolation. Comprehensive understanding of hydrology requires integration of this information from multiple sources. HydroDesktop is the client application developed as part of HIS to support data discovery and access through this system. HydroDesktop is founded on an open source GIS client and has a plug-in architecture that has enabled the integration of modeling and analysis capability with the functionality for data discovery and access. Model integration is possible through a plug-in built on the OpenMI standard and data visualization and analysis is supported by an R plug-in. This presentation will demonstrate HydroDesktop, showing how it provides an analysis environment within which data from multiple sources can be discovered, accessed and integrated.
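The sketch below shows, in schematic form, how a client might retrieve a time series from a WaterOneFlow-style service and pull out timestamped values. The endpoint URL, query parameters, and the simplified XML handling are assumptions; real WaterML responses are namespaced and carry far more metadata than this sketch inspects.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder REST endpoint for a GetValues-style call; the URL and parameter
# names are assumptions for illustration, not an actual HIS service address.
url = ("https://hydroserver.example.org/GetValues"
       "?site=NWIS:10109000&variable=NWIS:00060"
       "&startDate=2012-01-01&endDate=2012-01-31")

with urllib.request.urlopen(url) as resp:
    tree = ET.parse(resp)

# Collect any element carrying a dateTime attribute and a numeric body,
# which is how WaterML-like responses encode individual observations.
series = []
for elem in tree.iter():
    when = elem.get("dateTime")
    if when is None or elem.text is None:
        continue
    try:
        series.append((when, float(elem.text)))
    except ValueError:
        continue

print(f"retrieved {len(series)} observations")
```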
Bioinformatics in translational drug discovery.
Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G
2017-08-31
Bioinformatics approaches are becoming ever more essential in translational drug discovery both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' that identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).
An interactive web application for the dissemination of human systems immunology data.
Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien
2015-06-19
Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State-of-the-art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. The resource is publicly available online at https://gxb.benaroyaresearch.org/dm3/landing.gsp and the source code is openly available at https://github.com/BenaroyaResearch/gxbrowser . We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.
An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.
Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K
2014-01-01
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and considerable expertise, which is daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools with a graphical user interface, for SNP discovery and utilization in developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of the genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and of any errors. The pipeline also provides a confidence score or polymorphism information content value, with flanking sequences, for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data, at high speed. It is very useful for the plant genetics and breeding community with no computational expertise, enabling them to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge next generation sequencing datasets. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as standalone free software.
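As a schematic of the alignment and variant-calling steps that such a pipeline wraps behind its interface, the sketch below shells out to BWA, SAMtools, and BCFtools. The file names and flags are a common minimal invocation chosen for illustration, not the exact commands issued by ISMU.

```python
import subprocess

# Illustrative file names; the flags shown are a common minimal invocation,
# not the command lines generated by the ISMU pipeline itself.
ref, reads, sample = "reference.fa", "sample_reads.fq", "sample"

def run(cmd):
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

run(f"bwa index {ref}")                                      # index the reference
run(f"bwa mem {ref} {reads} > {sample}.sam")                 # align reads
run(f"samtools sort -o {sample}.bam {sample}.sam")           # coordinate-sort
run(f"samtools index {sample}.bam")                          # index the BAM
run(f"bcftools mpileup -f {ref} {sample}.bam | "
    f"bcftools call -mv -Ov -o {sample}.snps.vcf")           # call variants
```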
Isidro-Llobet, Albert; Hadje Georgiou, Kathy; Galloway, Warren R. J. D.; Giacomini, Elisa; Hansen, Mette R.; Méndez-Abt, Gabriela; Tan, Yaw Sing; Carro, Laura; Sore, Hannah F.
2015-01-01
Macrocyclic peptidomimetics are associated with a broad range of biological activities. However, despite such potentially valuable properties, the macrocyclic peptidomimetic structural class is generally considered as being poorly explored within drug discovery. This has been attributed to the lack of general methods for producing collections of macrocyclic peptidomimetics with high levels of structural, and thus shape, diversity. In particular, there is a lack of scaffold diversity in current macrocyclic peptidomimetic libraries; indeed, the efficient construction of diverse molecular scaffolds presents a formidable general challenge to the synthetic chemist. Herein we describe a new, advanced strategy for the diversity-oriented synthesis (DOS) of macrocyclic peptidomimetics that enables the combinatorial variation of molecular scaffolds (core macrocyclic ring architectures). The generality and robustness of this DOS strategy is demonstrated by the step-efficient synthesis of a structurally diverse library of over 200 macrocyclic peptidomimetic compounds, each based around a distinct molecular scaffold and isolated in milligram quantities, from readily available building-blocks. To the best of our knowledge this represents an unprecedented level of scaffold diversity in a synthetically derived library of macrocyclic peptidomimetics. Cheminformatic analysis indicated that the library compounds access regions of chemical space that are distinct from those addressed by top-selling brand-name drugs and macrocyclic natural products, illustrating the value of our DOS approach to sample regions of chemical space underexploited in current drug discovery efforts. An analysis of three-dimensional molecular shapes illustrated that the DOS library has a relatively high level of shape diversity. PMID:25778821
Hassani-Pak, Keywan; Rawlings, Christopher
2017-06-13
Genetics and "omics" studies designed to uncover genotype to phenotype relationships often identify large numbers of potential candidate genes, among which the causal genes are hidden. Scientists generally lack the time and technical expertise to review all relevant information available from the literature, from key model species and from a potentially wide range of related biological databases in a variety of data formats with variable quality and coverage. Computational tools are needed for the integration and evaluation of heterogeneous information in order to prioritise candidate genes and components of interaction networks that, if perturbed through potential interventions, have a positive impact on the biological outcome in the whole organism without producing negative side effects. Here we review several bioinformatics tools and databases that play an important role in biological knowledge discovery and candidate gene prioritization. We conclude with several key challenges that need to be addressed in order to facilitate biological knowledge discovery in the future.
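A minimal sketch of the evidence-integration idea behind candidate gene prioritization is given below: each candidate is scored by weighted counts of supporting items from several knowledge sources. The genes, sources, counts, and weights are illustrative assumptions, not values from any particular tool reviewed above.

```python
# Illustrative evidence table: counts of supporting items per knowledge source
# for each candidate gene (gene names and numbers are invented for the sketch).
evidence = {
    "GeneA": {"literature": 12, "coexpression": 3, "pathway": 1, "ortholog_phenotype": 1},
    "GeneB": {"literature": 2,  "coexpression": 8, "pathway": 0, "ortholog_phenotype": 0},
    "GeneC": {"literature": 5,  "coexpression": 4, "pathway": 2, "ortholog_phenotype": 1},
}

# Source weights reflect how much each evidence type is trusted; these values
# are assumptions and would normally be tuned or learned.
weights = {"literature": 0.5, "coexpression": 1.0, "pathway": 2.0, "ortholog_phenotype": 3.0}

def score(gene_evidence):
    """Weighted sum of evidence counts across sources."""
    return sum(weights[src] * count for src, count in gene_evidence.items())

ranking = sorted(evidence, key=lambda g: score(evidence[g]), reverse=True)
for gene in ranking:
    print(gene, round(score(evidence[gene]), 1))
```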
A Community Roadmap for Discovery of Geosciences Data
NASA Astrophysics Data System (ADS)
Baru, C.
2012-12-01
This talk will summarize on-going discussions and deliberations related to data discovery undertaken as part of the EarthCube initiative and in the context of current trends and technologies in search and discovery of scientific data and information. The goal of the EarthCube initiative is to transform the conduct of research by supporting the development of community-guided cyberinfrastructure to integrate data and information for knowledge management across the Geosciences. The vision of EarthCube is to provide a coherent framework for finding and using information about the Earth system across the entire research enterprise that will allow for substantially improved collaboration between specialties using each other's data (e.g. subdomains of geo- and biological sciences). Indeed, data discovery is an essential prerequisite to any action that an EarthCube user would undertake. The community roadmap activity addresses challenges in data discovery, beginning with an assessment of the state of the art, and then identifying issues, challenges, and risks in reaching the data discovery vision. Many of the lessons learned are general and applicable not only to the geosciences but also to a variety of other science communities. The roadmap considers data discovery issues in Geoscience that include but are not limited to metadata-based discovery and the use of semantic information and ontologies; content-based discovery and integration with data mining activities; integration with data access services; and policy and governance issues. Furthermore, many geoscience use cases require access to heterogeneous data from multiple disciplinary sources in order to analyze and make intelligent connections between data to advance research frontiers. Examples include, say, assessing the rise of sea surface temperatures; modeling geodynamical earth systems from deep time to present; or examining in detail the causes and consequences of global climate change. It has taken the past one to two decades for the community to arrive at a few commonly understood and commonly agreed-upon standards for metadata and services. There have been significant advancements in the development of prototype systems in the area of metadata-based data discovery, including efforts such as OPeNDAP and THREDDS catalogs, the GEON Portal and Catalog Services (www.geongrid.org), OGC standards, and development of systems like OneGeology (onegeology.org), the USGIN (usgin.org), the Earth System Grid, and EOSDIS. Such efforts have set the stage for the development of next-generation, production-quality, advanced discovery services. The next challenge is in converting these into robust, sustained services for the community, developing capabilities such as content-based search and ontology-enabled search, and ensuring that the long tail of geoscience data is fully included in any future discovery services. As EarthCube attempts to pursue these challenges, the key question to pose is whether we will be able to establish a cultural environment that can sustain, extend, and manage an infrastructure that will last 50 or 100 years.
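As a small illustration of the metadata-based discovery pattern discussed above, the sketch below filters catalog records by keyword, bounding-box overlap, and time coverage. The record schema and the entries are invented for illustration and do not correspond to any actual EarthCube or agency catalog.

```python
from dataclasses import dataclass

@dataclass
class CatalogRecord:
    title: str
    keywords: set
    bbox: tuple      # (min_lon, min_lat, max_lon, max_lat)
    start_year: int
    end_year: int

# Toy catalog; entries are illustrative, not real holdings.
catalog = [
    CatalogRecord("Global SST reanalysis", {"sea surface temperature", "ocean"},
                  (-180, -90, 180, 90), 1982, 2012),
    CatalogRecord("Pacific tide gauges", {"sea level", "ocean"},
                  (-180, -60, -70, 60), 1950, 2012),
    CatalogRecord("Appalachian geology map", {"geology", "bedrock"},
                  (-85, 33, -75, 42), 2005, 2005),
]

def discover(keyword, bbox, year):
    """Return titles of records matching a keyword whose bounding box overlaps
    the query box and whose time span covers the query year."""
    qminx, qminy, qmaxx, qmaxy = bbox
    hits = []
    for rec in catalog:
        rminx, rminy, rmaxx, rmaxy = rec.bbox
        overlaps = (rminx <= qmaxx and rmaxx >= qminx and
                    rminy <= qmaxy and rmaxy >= qminy)
        if keyword in rec.keywords and overlaps and rec.start_year <= year <= rec.end_year:
            hits.append(rec.title)
    return hits

print(discover("sea surface temperature", (-140, 0, -100, 40), 2000))
# ['Global SST reanalysis']
```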
Open data in drug discovery and development: lessons from malaria.
Wells, Timothy N C; Willis, Paul; Burrows, Jeremy N; Hooft van Huijsduijnen, Rob
2016-10-01
There is a growing consensus that drug discovery thrives in an open environment. Here, we describe how the malaria community has embraced four levels of open data - open science, open innovation, open access and open source - to catalyse the development of new medicines, and consider principles that could enable open data approaches to be applied to other disease areas.
ERIC Educational Resources Information Center
Young, Barbara N.; Hoffman, Lyubov
Demonstration of chemical reactions is a tool used in the teaching of inorganic descriptive chemistry to enable students to understand the fundamental concepts of chemistry through the use of concrete examples. For maximum benefit, students need to learn through discovery to observe, interpret, hypothesize, and draw conclusions; however, chemical…
ERIC Educational Resources Information Center
McMillin, Bill; Gibson, Sally; MacDonald, Jean
2016-01-01
Animated maps of the library stacks were integrated into the catalog interface at Pratt Institute and into the EBSCO Discovery Service interface at Illinois State University. The mapping feature was developed for optimal automation of the update process to enable a range of library personnel to update maps and call-number ranges. The development…
2006-02-18
KENNEDY SPACE CENTER, FLA. - In NASA Kennedy Space Center's Orbiter Processing Facility bay 3, United Space Alliance shuttle technicians remove the hard cover from a window on Space Shuttle Discovery to enable STS-121 crew members to inspect the window from the cockpit. Launch of Space Shuttle Discovery on mission STS-121, the second return-to-flight mission, is scheduled no earlier than May.
[From the discovery of antibiotics to emerging highly drug-resistant bacteria].
Meunier, Olivier
2015-01-01
The discovery of antibiotics has enabled serious infections to be treated. However, bacteria resistant to several families of antibiotics and the emergence of new highly drug-resistant bacteria constitute a public health issue in France and across the world. Actions to prevent their transmission are being put in place. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
How Formal Methods Impels Discovery: A Short History of an Air Traffic Management Project
NASA Technical Reports Server (NTRS)
Butler, Ricky W.; Hagen, George; Maddalon, Jeffrey M.; Munoz, Cesar A.; Narkawicz, Anthony; Dowek, Gilles
2010-01-01
In this paper we describe a process of algorithmic discovery that was driven by our goal of achieving complete, mechanically verified algorithms that compute conflict prevention bands for use in en route air traffic management. The algorithms were originally defined in the PVS specification language and have subsequently been implemented in Java and C++. We do not present the proofs in this paper; instead, we describe the process of discovery and the key ideas that enabled the final formal proof of correctness.
Ontology for Transforming Geo-Spatial Data for Discovery and Integration of Scientific Data
NASA Astrophysics Data System (ADS)
Nguyen, L.; Chee, T.; Minnis, P.
2013-12-01
Discovery of and access to geo-spatial scientific data across heterogeneous repositories and multi-discipline datasets can present challenges for scientists. We propose to build a workflow for transforming geo-spatial datasets into a semantic environment by using relationships to describe each resource with the OWL Web Ontology Language, RDF, and a proposed geo-spatial vocabulary. We will present methods for transforming traditional scientific datasets, the use of a semantic repository, and querying with SPARQL to integrate and access datasets. This unique repository will enable discovery of scientific data by geospatial bounds or other criteria.
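A minimal sketch of the proposed pattern, describing a dataset as RDF triples and querying it with SPARQL via rdflib, is shown below. The namespace and property names are placeholders rather than the proposed geo-spatial vocabulary itself.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

# Placeholder vocabulary; the property names here are illustrative, not the
# geo-spatial vocabulary proposed in the abstract above.
GEO = Namespace("http://example.org/geo-vocab#")

g = Graph()
ds = URIRef("http://example.org/dataset/ceres-sw-flux")
g.add((ds, RDF.type, GEO.Dataset))
g.add((ds, GEO.parameter, Literal("shortwave flux")))
g.add((ds, GEO.westBound, Literal(-130.0, datatype=XSD.double)))
g.add((ds, GEO.eastBound, Literal(-60.0, datatype=XSD.double)))

# SPARQL query: find datasets whose bounding box reaches west of -100 degrees.
q = """
PREFIX geo: <http://example.org/geo-vocab#>
SELECT ?d ?p WHERE {
  ?d a geo:Dataset ;
     geo:parameter ?p ;
     geo:westBound ?w .
  FILTER (?w < -100)
}
"""
for row in g.query(q):
    print(row.d, row.p)
```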
Tyndall, Timothy; Tyndall, Ayami
2018-01-01
Healthcare directories are vital for interoperability among healthcare providers, researchers and patients. Past efforts at directory services have not provided the tools to allow integration of the diverse data sources. Many are overly strict, incompatible with legacy databases, and do not provide Data Provenance. A more architecture-independent system is needed to enable secure, GDPR-compatible service discovery across organizational boundaries. We review our development of a portable Data Provenance Toolkit supporting provenance within Health Information Exchange (HIE) systems. The Toolkit has been integrated with client software and successfully leveraged in clinical data integration. The Toolkit validates provenance stored in a Blockchain or Directory record and creates provenance signatures, providing standardized provenance that moves with the data. This healthcare directory suite implements discovery of healthcare data by HIE and EHR systems via FHIR. Shortcomings of past directory efforts include the inability to map complex datasets and to enable interoperability via exchange endpoint discovery. By delivering data without dictating how it is stored, we improve exchange and facilitate discovery on a multi-national level through open source, fully interoperable tools. With the development of Data Provenance resources we enhance exchange and improve security and usability throughout the health data continuum.
Great Originals of Modern Physics
ERIC Educational Resources Information Center
Decker, Fred W.
1972-01-01
European travel can provide an intimate view of the implements and locales of great discoveries in physics for the knowledgeable traveler. The four museums at Cambridge, London, Remscheid-Lennep, and Munich display a full range of discovery apparatus in modern physics as outlined here. (Author/TS)
ERIC Educational Resources Information Center
MacKenzie, Marion
1983-01-01
Scientific research leading to the discovery of female plants of the red alga Palmaria plamata (dulse) is described. This discovery has not only advanced knowledge of marine organisms and taxonomic relationships but also has practical implications. The complete life cycle of this organism is included. (JN)
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
43 CFR 4.1132 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-10-01
..., the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature, custody... persons having knowledge of any discoverable matter. (b) It is not ground for objection that information...
Scientific Knowledge Discovery in Complex Semantic Networks of Geophysical Systems
NASA Astrophysics Data System (ADS)
Fox, P.
2012-04-01
The vast majority of explorations of the Earth's systems are limited in their ability to effectively explore the most important (often most difficult) problems because they are forced to interconnect at the data-element, or syntactic, level rather than at a higher scientific, or semantic, level. Recent successes in the application of complex network theory and algorithms to climate data raise expectations that more general graph-based approaches offer the opportunity for new discoveries. In the past ~5 years in the natural sciences there has been substantial progress in providing both specialists and non-specialists the ability to describe, in machine-readable form, geophysical quantities and the relations among them in meaningful and natural ways, effectively breaking the prior syntax barrier. The corresponding open-world semantics and reasoning provide higher-level interconnections. That is, semantics are provided around the data structures, using semantically equipped tools and semantically aware interfaces between science application components, allowing for discovery at the knowledge level. More recently, formal semantic approaches to continuous and aggregate physical processes are beginning to show promise and are soon likely to be ready to apply to geoscientific systems. To illustrate these opportunities, this presentation describes two application examples featuring domain vocabulary (ontology) and property relations (named and typed edges in the graphs). First, a climate knowledge discovery pilot that encodes and explores CMIP5 catalog information, with the eventual goal of encoding and exploring CMIP5 data. Second, a multi-stakeholder knowledge network for integrated assessments in marine ecosystems, where the data is highly interdisciplinary.
Molecular dynamics-driven drug discovery: leaping forward with confidence.
Ganesan, Aravindhan; Coote, Michelle L; Barakat, Khaled
2017-02-01
Given the significant time and financial costs of developing a commercial drug, it remains important to constantly reform the drug discovery pipeline with novel technologies that can narrow the candidates down to the most promising lead compounds for clinical testing. The past decade has witnessed tremendous growth in computational capabilities that enable in silico approaches to expedite drug discovery processes. Molecular dynamics (MD) has become a particularly important tool in drug design and discovery. From classical MD methods to more sophisticated hybrid classical/quantum mechanical (QM) approaches, MD simulations are now able to offer extraordinary insights into ligand-receptor interactions. In this review, we discuss how the applications of MD approaches are significantly transforming current drug discovery and development efforts. Copyright © 2016 Elsevier Ltd. All rights reserved.
Johns, Margaret A; Meyerkord-Belton, Cheryl L; Du, Yuhong; Fu, Haian
2014-03-01
The Emory Chemical Biology Discovery Center (ECBDC) aims to accelerate high throughput biology and translation of biomedical research discoveries into therapeutic targets and future medicines by providing high throughput research platforms to scientific collaborators worldwide. ECBDC research is focused at the interface of chemistry and biology, seeking to fundamentally advance understanding of disease-related biology with its HTS/HCS platforms and chemical tools, ultimately supporting drug discovery. Established HTS/HCS capabilities, university setting, and expertise in diverse assay formats, including protein-protein interaction interrogation, have enabled the ECBDC to contribute to national chemical biology efforts, empower translational research, and serve as a training ground for young scientists. With these resources, the ECBDC is poised to leverage academic innovation to advance biology and therapeutic discovery.
Teaching a changing paradigm in physiology: a historical perspective on gut interstitial cells.
Drumm, Bernard T; Baker, Salah A
2017-03-01
The study and teaching of gastrointestinal (GI) physiology necessitates an understanding of the cellular basis of contractile and electrical coupling behaviors in the muscle layers that comprise the gut wall. Our knowledge of the cellular origin of GI motility has drastically changed over the last 100 yr. While the pacing and coordination of GI contraction was once thought to be solely attributable to smooth muscle cells, it is now widely accepted that the motility patterns observed in the GI tract exist as a result of a multicellular system, consisting of not only smooth muscle cells but also enteric neurons and distinct populations of specialized interstitial cells that all work in concert to ensure proper GI functions. In this historical perspective, we focus on the emerging role of interstitial cells in GI motility and examine the key discoveries and experiments that led to a major shift in a paradigm of GI physiology regarding the role of interstitial cells in modulating GI contractile patterns. A review of these now classic experiments and papers will enable students and educators to fully appreciate the complex, multicellular nature of GI muscles as well as impart lessons on how shifting paradigms in physiology are fueled by new technologies that lead to new emerging discoveries. Copyright © 2017 the American Physiological Society.
Global Health Innovation Technology Models.
Harding, Kimberly
2016-01-01
Chronic technology and business process disparities between High Income, Low Middle Income and Low Income (HIC, LMIC, LIC) research collaborators directly prevent the growth of sustainable Global Health innovation for infectious and rare diseases. There is a need for an Open Source-Open Science Architecture Framework to bridge this divide. We are proposing such a framework for consideration by the Global Health community, by utilizing a hybrid approach of integrating agnostic Open Source technology and healthcare interoperability standards and Total Quality Management principles. We will validate this architecture framework through our programme called Project Orchid. Project Orchid is a conceptual Clinical Intelligence Exchange and Virtual Innovation platform utilizing this approach to support clinical innovation efforts for multi-national collaboration that can be locally sustainable for LIC and LMIC research cohorts. The goal is to enable LIC and LMIC research organizations to accelerate their clinical trial process maturity in the field of drug discovery, population health innovation initiatives and public domain knowledge networks. When sponsored, this concept will be tested by 12 confirmed clinical research and public health organizations in six countries. The potential impact of this platform is reduced drug discovery and public health innovation lag time and improved clinical trial interventions, due to reliable clinical intelligence and bio-surveillance across all phases of the clinical innovation process.
Enhancing knowledge discovery from cancer genomics data with Galaxy
Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.
2017-01-01
Abstract The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. PMID:28327945
Enhancing knowledge discovery from cancer genomics data with Galaxy.
Albuquerque, Marco A; Grande, Bruno M; Ritch, Elie J; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K; Shah, Sohrab P; Boutros, Paul C; Morin, Ryan D
2017-05-01
The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinement of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutation. Workflows that integrate these to achieve data integration and visualizations are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. © The Author 2017. Published by Oxford University Press.
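As a hedged illustration of how such Galaxy tools can be driven programmatically, the sketch below uses the BioBlend client library to connect to a Galaxy server and list available tools and workflows; the server URL, API key, and tool-name filter are placeholders, not identifiers taken from the toolkit described above.

```python
# Hedged sketch: scripting a Galaxy instance with BioBlend. Replace the URL and key
# with those of the instance hosting the cancer-genomics tools; the name filter is
# illustrative only.
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")  # assumed endpoint

# Find candidate variant-calling tools by name and list available workflows.
tools = gi.tools.get_tools(name="MuTect")       # name filter is an assumption
workflows = gi.workflows.get_workflows()

for t in tools[:5]:
    print(t["id"], t["name"])
for wf in workflows[:5]:
    print(wf["id"], wf["name"])
```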
NASA Astrophysics Data System (ADS)
Cui, Wei; Parker, Laurie L.
2016-07-01
Fluorescent drug screening assays are essential for tyrosine kinase inhibitor discovery. Here we demonstrate a flexible, antibody-free TR-LRET kinase assay strategy that is enabled by the combination of streptavidin-coated quantum dot (QD) acceptors and biotinylated, Tb3+ sensitizing peptide donors. By exploiting the spectral features of Tb3+ and QD, and the high binding affinity of the streptavidin-biotin interaction, we achieved multiplexed detection of kinase activity in a modular fashion without requiring additional covalent labeling of each peptide substrate. This strategy is compatible with high-throughput screening, and should be adaptable to the rapidly changing workflows and targets involved in kinase inhibitor discovery.
Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures
NASA Astrophysics Data System (ADS)
Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.
2016-12-01
The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the-art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which offers several new features, including an on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
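For readers unfamiliar with the kernel being accelerated, here is a minimal, vectorized Lloyd's k-means in NumPy; it deliberately omits the triangle-inequality acceleration, MPI distribution, and SIMD tuning described above, and the synthetic data and cluster count are arbitrary.

```python
# Hedged sketch: plain Lloyd's k-means, the kernel that the paper parallelizes and
# accelerates. Not the authors' implementation.
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every center (N x k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties out.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers

# Example: cluster synthetic 12-dimensional "phenology" feature vectors into 4 classes.
X = np.random.default_rng(1).normal(size=(10_000, 12))
labels, centers = kmeans(X, k=4)
```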
Automatic Beam Path Analysis of Laser Wakefield Particle Acceleration Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rubel, Oliver; Geddes, Cameron G.R.; Cormier-Michel, Estelle
2009-10-19
Numerical simulations of laser wakefield particle accelerators play a key role in the understanding of the complex acceleration process and in the design of expensive experimental facilities. As the size and complexity of simulation output grows, an increasingly acute challenge is the practical need for computational techniques that aid in scientific knowledge discovery. To that end, we present a set of data-understanding algorithms that work in concert in a pipeline fashion to automatically locate and analyze high energy particle bunches undergoing acceleration in very large simulation datasets. These techniques work cooperatively by first identifying features of interest in individual timesteps, then integrating features across timesteps, and, based on the information derived, performing analysis of temporally dynamic features. This combination of techniques supports accurate detection of particle beams, enabling a deeper level of scientific understanding of physical phenomena than has been possible before. By combining efficient data analysis algorithms and state-of-the-art data management we enable high-performance analysis of extremely large particle datasets in 3D. We demonstrate the usefulness of our methods for a variety of 2D and 3D datasets and discuss the performance of our analysis pipeline.
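A rough sketch of the two-stage idea (per-timestep feature detection followed by temporal integration) is given below; the field names, momentum threshold, and persistence criterion are assumptions for illustration, not the parameters used in the paper.

```python
# Hedged sketch: flag high-momentum particles in each timestep, then link detections
# across timesteps by particle ID to follow a candidate beam through time.
import numpy as np

def detect_beam(timesteps, momentum_threshold=1e9):
    """timesteps: list of structured arrays with 'id' and 'px' fields (assumed layout)."""
    tracked = {}
    for t, particles in enumerate(timesteps):
        hot = particles[particles["px"] > momentum_threshold]   # per-timestep feature
        for pid in hot["id"]:
            tracked.setdefault(int(pid), []).append(t)          # integrate across timesteps
    # Keep only particles that stay above threshold long enough to count as a beam.
    return {pid: ts for pid, ts in tracked.items() if len(ts) >= 3}

# Tiny synthetic example: particle 7 exceeds the threshold in three timesteps.
dtype = [("id", np.int64), ("px", np.float64)]
steps = [np.array([(7, 2e9), (8, 1e8)], dtype=dtype) for _ in range(3)]
print(detect_beam(steps))   # {7: [0, 1, 2]}
```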
Application of the MIDAS approach for analysis of lysine acetylation sites.
Evans, Caroline A; Griffiths, John R; Unwin, Richard D; Whetton, Anthony D; Corfe, Bernard M
2013-01-01
Multiple Reaction Monitoring Initiated Detection and Sequencing (MIDAS™) is a mass spectrometry-based technique for the detection and characterization of specific post-translational modifications (Unwin et al. 4:1134-1144, 2005), for example acetylated lysine residues (Griffiths et al. 18:1423-1428, 2007). The MIDAS™ technique has application for discovery and analysis of acetylation sites. It is a hypothesis-driven approach that requires a priori knowledge of the primary sequence of the target protein and a proteolytic digest of this protein. MIDAS essentially performs a targeted search for the presence of modified, for example acetylated, peptides. The detection is based on the combination of the predicted molecular weight (measured as a mass-to-charge ratio) of the acetylated proteolytic peptide and a diagnostic fragment (product ion of m/z 126.1), which is generated by specific fragmentation of acetylated peptides during collision-induced dissociation performed in tandem mass spectrometry (MS) analysis. Sequence information is subsequently obtained which enables acetylation site assignment. The technique of MIDAS was later trademarked by ABSciex for targeted protein analysis in which an MRM scan is combined with a full MS/MS product ion scan to enable sequence confirmation.
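The mass arithmetic behind such a targeted search can be illustrated with a short script that predicts the precursor m/z of a hypothetical acetylated peptide, to be paired with the diagnostic product ion at m/z 126.1; the residue masses are standard monoisotopic values, while the peptide sequence and charge state are made up for the example.

```python
# Hedged sketch of the MIDAS-style mass calculation: neutral peptide mass plus one
# lysine acetylation (+42.0106 Da), converted to m/z for the chosen charge state.
MONO = {"G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
        "L": 113.08406, "K": 128.09496, "E": 129.04259, "R": 156.10111}
WATER, PROTON, ACETYL = 18.010565, 1.007276, 42.010565

def precursor_mz(sequence, charge, n_acetyl=1):
    neutral = sum(MONO[aa] for aa in sequence) + WATER + n_acetyl * ACETYL
    return (neutral + charge * PROTON) / charge

# Example MRM transition for a hypothetical acetylated tryptic peptide, doubly charged.
print(round(precursor_mz("GASKVLER", charge=2), 4), "-> product ion m/z 126.1")
```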
Sadler, Euan; Fisher, Helen R.; Maher, John; Wolfe, Charles D. A.; McKevitt, Christopher
2016-01-01
Introduction: Translational research is central to international health policy, research and funding initiatives. Despite increasing use of the term, the translation of basic science discoveries into clinical practice is not straightforward. This systematic search and narrative synthesis aimed to examine factors enabling or hindering translational research from the perspective of basic and clinician scientists, a key stakeholder group in translational research, and to draw policy-relevant implications for organisations seeking to optimise translational research opportunities. Methods and Results: We searched SCOPUS and Web of Science from inception until April 2015 for papers reporting scientists' views of the factors they perceive as enabling or hindering the conduct of translational research. We screened 8,295 papers from electronic database searches and 20 papers from hand searches and citation tracking, identifying 26 studies of qualitative, quantitative or mixed method designs. We used a narrative synthesis approach and identified the following themes: 1) differing concepts of translational research; 2) research processes as a barrier to translational research; 3) perceived cultural divide between research and clinical care; 4) interdisciplinary collaboration as enabling translational research, but dependent on the quality of prior and current social relationships; 5) translational research as entrepreneurial science. Across all five themes, factors enabling or hindering translational research were largely shaped by wider social, organisational, and structural factors. Conclusion: To optimise translational research, policy could consider refining translational research models to better reflect scientists' experiences, fostering greater collaboration and buy-in from all types of scientists. Organisations could foster cultural change, ensuring that organisational practices and systems keep pace with the change in knowledge production brought about by the translational research agenda. PMID:27490373
Fudge, Nina; Sadler, Euan; Fisher, Helen R; Maher, John; Wolfe, Charles D A; McKevitt, Christopher
2016-01-01
Translational research is central to international health policy, research and funding initiatives. Despite increasing use of the term, the translation of basic science discoveries into clinical practice is not straightforward. This systematic search and narrative synthesis aimed to examine factors enabling or hindering translational research from the perspective of basic and clinician scientists, a key stakeholder group in translational research, and to draw policy-relevant implications for organisations seeking to optimise translational research opportunities. We searched SCOPUS and Web of Science from inception until April 2015 for papers reporting scientists' views of the factors they perceive as enabling or hindering the conduct of translational research. We screened 8,295 papers from electronic database searches and 20 papers from hand searches and citation tracking, identifying 26 studies of qualitative, quantitative or mixed method designs. We used a narrative synthesis approach and identified the following themes: 1) differing concepts of translational research; 2) research processes as a barrier to translational research; 3) perceived cultural divide between research and clinical care; 4) interdisciplinary collaboration as enabling translational research, but dependent on the quality of prior and current social relationships; 5) translational research as entrepreneurial science. Across all five themes, factors enabling or hindering translational research were largely shaped by wider social, organisational, and structural factors. To optimise translational research, policy could consider refining translational research models to better reflect scientists' experiences, fostering greater collaboration and buy-in from all types of scientists. Organisations could foster cultural change, ensuring that organisational practices and systems keep pace with the change in knowledge production brought about by the translational research agenda.
Beginning to manage drug discovery and development knowledge.
Sumner-Smith, M
2001-05-01
Knowledge management approaches and technologies are beginning to be implemented by the pharmaceutical industry in support of new drug discovery and development processes aimed at greater efficiencies and effectiveness. This trend coincides with moves to reduce paper, coordinate larger teams with more diverse skills that are distributed around the globe, and to comply with regulatory requirements for electronic submissions and the associated maintenance of electronic records. Concurrently, the available technologies have implemented web-based architectures with a greater range of collaborative tools and personalization through portal approaches. However, successful application of knowledge management methods depends on effective cultural change management, as well as proper architectural design to match the organizational and work processes within a company.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.
Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida
2012-07-20
Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and for biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
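As a hedged stand-in for the module-detection step (atBioNet uses SCAN, which is not reproduced here), the snippet below builds a toy interaction network with NetworkX and extracts densely connected modules using greedy modularity optimization; the gene pairs are illustrative only.

```python
# Hedged sketch of module finding on a PPI graph. Greedy modularity communities is a
# different algorithm than SCAN and is used here only to illustrate the idea.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy interaction list: (geneA, geneB) pairs seeded from a user-supplied signature.
edges = [("TP53", "MDM2"), ("MDM2", "MDM4"), ("TP53", "ATM"),
         ("BRCA1", "BARD1"), ("BRCA1", "RAD51"), ("RAD51", "BARD1")]
G = nx.Graph(edges)

for i, module in enumerate(greedy_modularity_communities(G), start=1):
    print(f"module {i}: {sorted(module)}")
```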
NASA Astrophysics Data System (ADS)
Seul, M.; Brazil, L.; Castronova, A. M.
2017-12-01
CUAHSI Data Services: Tools and Cyberinfrastructure for Water Data Discovery, Research and Collaboration. Enabling research surrounding interdisciplinary topics often requires a combination of finding, managing, and analyzing large data sets and models from multiple sources. This challenge has led the National Science Foundation to make strategic investments in developing community data tools and cyberinfrastructure that focus on water data, as it is a central need for many of these research topics. CUAHSI (The Consortium of Universities for the Advancement of Hydrologic Science, Inc.) is a non-profit organization funded by the National Science Foundation to aid students, researchers, and educators in using and managing data and models to support research and education in the water sciences. This presentation will focus on open-source CUAHSI-supported tools that enable enhanced data discovery online using advanced searching capabilities and computational analysis run in virtual environments pre-designed for educators and scientists so they can focus their efforts on data analysis rather than IT set-up.
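One way such discovery tools can be exercised programmatically is sketched below against HydroShare's public REST interface, one of the CUAHSI-supported services; the endpoint path, the full_text_search parameter, and the response fields are assumptions based on the published hsapi documentation and may need adjusting against the live API.

```python
# Hedged sketch: full-text discovery of water-data resources via the HydroShare hsapi.
# Endpoint path, query parameter, and response keys are assumptions.
import requests

resp = requests.get(
    "https://www.hydroshare.org/hsapi/resource/",
    params={"full_text_search": "snowmelt streamflow"},
    timeout=30,
)
resp.raise_for_status()
for res in resp.json().get("results", []):
    print(res.get("resource_id"), "-", res.get("resource_title"))
```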
From Information Center to Discovery System: Next Step for Libraries?
ERIC Educational Resources Information Center
Marcum, James W.
2001-01-01
Proposes a discovery system model to guide technology integration in academic libraries that fuses organizational learning, systems learning, and knowledge creation techniques with constructivist learning practices to suggest possible future directions for digital libraries. Topics include accessing visual and continuous media; information…
Foreword to "The Secret of Childhood."
ERIC Educational Resources Information Center
Stephenson, Margaret E.
2000-01-01
Discusses the basic discoveries of Montessori's Casa dei Bambini. Considers principles of Montessori's organizing theory: the absorbent mind, the unfolding nature of life, the spiritual embryo, self-construction, acquisition of culture, creativity of life, repetition of exercise, freedom within limits, children's discovery of knowledge, the secret…
NASA Astrophysics Data System (ADS)
Harwit, Martin
1984-04-01
In the remarkable opening section of this book, a well-known Cornell astronomer gives precise thumbnail histories of the 43 basic cosmic discoveries - stars, planets, novae, pulsars, comets, gamma-ray bursts, and the like - that form the core of our knowledge of the universe. Many of them, he points out, were made accidentally and outside the mainstream of astronomical research and funding. This observation leads him to speculate on how many more major phenomena there might be and how they might be most effectively sought out in a field now dominated by large instruments and complex investigative modes and observational conditions. The book also examines discovery in terms of its political, financial, and sociological context - the role of new technologies and of industry and the military in revealing new knowledge; and methods of funding, of peer review, and of allotting time on our largest telescopes. It concludes with specific recommendations for organizing astronomy in ways that will best lead to the discovery of the many - at least sixty - phenomena that Harwit estimates are still waiting to be found.
MOPED enables discoveries through consistently processed proteomics data
Higdon, Roger; Stewart, Elizabeth; Stanberry, Larissa; Haynes, Winston; Choiniere, John; Montague, Elizabeth; Anderson, Nathaniel; Yandl, Gregory; Janko, Imre; Broomall, William; Fishilevich, Simon; Lancet, Doron; Kolker, Natali; Kolker, Eugene
2014-01-01
The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is an expanding proteomics resource to enable biological and biomedical discoveries. MOPED aggregates simple, standardized and consistently processed summaries of protein expression and metadata from proteomics (mass spectrometry) experiments from human and model organisms (mouse, worm and yeast). The latest version of MOPED adds new estimates of protein abundance and concentration, as well as relative (differential) expression data. MOPED provides a new updated query interface that allows users to explore information by organism, tissue, localization, condition, experiment, or keyword. MOPED supports the Human Proteome Project's efforts to generate chromosome- and disease-specific proteomes by providing links from proteins to chromosome and disease information, as well as many complementary resources. MOPED supports a new omics metadata checklist in order to harmonize data integration, analysis and use. MOPED's development is driven by the user community, which spans 90 countries and guides future development that will transform MOPED into a multi-omics resource. MOPED encourages users to submit data in a simple format. They can use the metadata checklist to generate a data publication for this submission. As a result, MOPED will provide even greater insights into complex biological processes and systems and enable deeper and more comprehensive biological and biomedical discoveries. PMID:24350770
The discovery of medicines for rare diseases
Swinney, David C; Xia, Shuangluo
2015-01-01
There is a pressing need for new medicines (new molecular entities; NMEs) for rare diseases as few of the 6800 rare diseases (according to the NIH) have approved treatments. Drug discovery strategies for the 102 orphan NMEs approved by the US FDA between 1999 and 2012 were analyzed to learn from past success: 46 NMEs were first in class; 51 were followers; and five were imaging agents. First-in-class medicines were discovered with phenotypic assays (15), target-based approaches (12) and biologic strategies (18). Identification of genetic causes in areas with more basic and translational research such as cancer and in-born errors in metabolism contributed to success regardless of discovery strategy. In conclusion, greater knowledge increases the chance of success and empirical solutions can be effective when knowledge is incomplete. PMID:25068983
The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery
2014-01-01
The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level composed of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions in the domains of chemistry, biology, biochemistry, and bioinformatics. SIO provides an ontological foundation for the Bio2RDF linked data for the life sciences project and is used for semantic integration and discovery for SADI-based semantic web services. SIO is freely available to all users under a Creative Commons Attribution license. See the website for further information: http://sio.semanticscience.org. PMID:24602174
Knowledge discovery from structured mammography reports using inductive logic programming.
Burnside, Elizabeth S; Davis, Jesse; Costa, Victor Santos; Dutra, Inês de Castro; Kahn, Charles E; Fine, Jason; Page, David
2005-01-01
The development of large mammography databases provides an opportunity for knowledge discovery and data mining techniques to recognize patterns not previously appreciated. Using a database from a breast imaging practice containing patient risk factors, imaging findings, and biopsy results, we tested whether inductive logic programming (ILP) could discover interesting hypotheses that could subsequently be tested and validated. The ILP algorithm discovered two hypotheses from the data that were 1) judged as interesting by a subspecialty trained mammographer and 2) validated by analysis of the data itself.
A Metadata based Knowledge Discovery Methodology for Seeding Translational Research.
Kothari, Cartik R; Payne, Philip R O
2015-01-01
In this paper, we present a semantic, metadata based knowledge discovery methodology for identifying teams of researchers from diverse backgrounds who can collaborate on interdisciplinary research projects: projects in areas that have been identified as high-impact areas at The Ohio State University. This methodology involves the semantic annotation of keywords and the postulation of semantic metrics to improve the efficiency of the path exploration algorithm as well as to rank the results. Results indicate that our methodology can discover groups of experts from diverse areas who can collaborate on translational research projects.
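A minimal sketch of the underlying idea (researchers and annotated keywords as a graph, with candidate collaborators ranked by short connecting paths) is shown below; the toy graph, the node-naming convention, and the inverse-path-length score are illustrative stand-ins for the paper's semantic metrics.

```python
# Hedged sketch: rank potential collaborators for a seed researcher by the length of
# the shortest keyword-mediated path in a researcher/keyword graph.
import networkx as nx

G = nx.Graph()
G.add_edge("Dr. A", "machine learning")
G.add_edge("Dr. B", "machine learning")
G.add_edge("Dr. B", "electronic health records")
G.add_edge("Dr. C", "electronic health records")

def rank_collaborators(G, seed):
    scores = {}
    for node in G:
        if node == seed or not node.startswith("Dr."):
            continue                              # rank researcher nodes only
        try:
            d = nx.shortest_path_length(G, seed, node)
            scores[node] = 1.0 / d                # shorter keyword path -> higher rank
        except nx.NetworkXNoPath:
            continue
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(rank_collaborators(G, "Dr. A"))   # Dr. B (shared keyword) ranks above Dr. C
```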
NASA Astrophysics Data System (ADS)
Pouchard, L. C.; Depriest, A.; Huhns, M.
2012-12-01
Ontologies and semantic technologies are an essential infrastructure component of systems supporting knowledge integration in the Earth Sciences. Numerous earth science ontologies exist, but are hard to discover because they tend to be hosted with the projects that develop them. There are often few quality measures and sparse metadata associated with these ontologies, such as modification dates, versioning, purpose, number of classes, and properties. Projects often develop ontologies for their own needs without considering existing ontology entities or derivations from formal and more basic ontologies. The result is mostly orthogonal ontologies, and ontologies that are not modular enough to reuse in part or adapt for new purposes, in spite of existing standards for ontology representation. Additional obstacles to sharing and reuse include a lack of maintenance once a project is completed. These obstacles prevent the full exploitation of semantic technologies in a context where they could become needed enablers for service discovery and for matching data with services. To start addressing this gap, we have deployed BioPortal, a mature, domain-independent ontology and semantic service system developed by the National Center for Biomedical Ontology (NCBO), on the ESIP Testbed under the governance of the ESIP Semantic Web cluster. ESIP provides a forum for a broad-based, distributed community of data and information technology practitioners and stakeholders to coordinate their efforts and develop new ideas for interoperability solutions. The Testbed provides an environment where innovations and best practices can be explored and evaluated. One objective of this deployment is to provide a community platform that would harness the organizational and cyber infrastructure provided by ESIP at minimal costs. Another objective is to host ontology services on a scalable, public cloud and investigate the business case for crowd sourcing of ontology maintenance. We deployed the system on Amazon's Elastic Compute Cloud (EC2), where ESIP maintains an account. Our approach had three phases: 1) set up a private cloud environment at the University of South Carolina to become familiar with the complex architecture of the system and enable some basic customization, 2) coordinate the production of a Virtual Appliance for the system with NCBO and deploy it on the Amazon cloud, and 3) outreach to the ESIP community to solicit participation, populate the repository, and develop new use cases. Phase 2 is nearing completion and Phase 3 is underway. Ontologies were gathered during updates to the ESIP cluster. Discussion points included the criteria for a shareable ontology and how to determine the best size for an ontology to be reusable. Outreach highlighted that the system can start addressing an integration of discovery frameworks via linking data and services in a pull model (data and service casting), a key issue of the Discovery cluster. This work thus presents several contributions: 1) technology injection from another domain into the earth sciences, 2) the deployment of a mature knowledge platform on the EC2 cloud, and 3) the successful engagement of the community through the ESIP clusters and Testbed model.
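For orientation, the sketch below queries a BioPortal-style REST API for its ontology listing; it targets the public NCBO instance at data.bioontology.org as an example, since the URL of the ESIP Testbed deployment is not given here, and the API key is a placeholder.

```python
# Hedged sketch: list ontologies from a BioPortal-style REST service. An ESIP-hosted
# deployment would expose the same interface at its own host; supply your own API key.
import requests

API_KEY = "YOUR_NCBO_API_KEY"
resp = requests.get(
    "https://data.bioontology.org/ontologies",
    params={"apikey": API_KEY},
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
for onto in resp.json()[:10]:
    print(onto.get("acronym"), "-", onto.get("name"))
```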
Enabling Service Discovery in a Federation of Systems: WS-Discovery Case Study
2014-06-01
found that Pastry [3] coupled with SCRIBE [4] provides everything we require from the overlay network: Pastry nodes form a decentralized, self...application-independent manner. Furthermore, Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault...recovery. Add SCRIBE to Pastry, and you get a generic, scalable and efficient group communication and event notification system providing
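To ground the case study, here is a minimal sketch of a WS-Discovery Probe sent over SOAP-over-UDP multicast, with a loop that reports any ProbeMatch replies; the namespaces follow the widely deployed 2005/04 draft of WS-Discovery (the OASIS 1.1 standard uses different URIs), and nothing here reflects the Pastry/SCRIBE overlay discussed above.

```python
# Hedged sketch: ad hoc WS-Discovery probing on the standard multicast group
# 239.255.255.250:3702. Responses, if any, arrive as unicast ProbeMatch messages.
import socket
import uuid

PROBE = f"""<?xml version="1.0" encoding="utf-8"?>
<e:Envelope xmlns:e="http://www.w3.org/2003/05/soap-envelope"
            xmlns:w="http://schemas.xmlsoap.org/ws/2004/08/addressing"
            xmlns:d="http://schemas.xmlsoap.org/ws/2005/04/discovery">
  <e:Header>
    <w:MessageID>urn:uuid:{uuid.uuid4()}</w:MessageID>
    <w:To>urn:schemas-xmlsoap-org:ws:2005:04:discovery</w:To>
    <w:Action>http://schemas.xmlsoap.org/ws/2005/04/discovery/Probe</w:Action>
  </e:Header>
  <e:Body><d:Probe/></e:Body>
</e:Envelope>"""

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.settimeout(3.0)
sock.sendto(PROBE.encode("utf-8"), ("239.255.255.250", 3702))
try:
    while True:
        data, addr = sock.recvfrom(65535)
        print(f"ProbeMatch from {addr[0]}: {len(data)} bytes")
except socket.timeout:
    pass
```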
ERIC Educational Resources Information Center
Kraft, Donald H., Ed.
The 2000 ASIS (American Society for Information Science) conference explored knowledge innovation. The tracks in the conference program included knowledge discovery, capture, and creation; classification and representation; information retrieval; knowledge dissemination; and social, behavioral, ethical, and legal aspects. This proceedings is…
Evaluating the Science of Discovery in Complex Health Systems
ERIC Educational Resources Information Center
Norman, Cameron D.; Best, Allan; Mortimer, Sharon; Huerta, Timothy; Buchan, Alison
2011-01-01
Complex health problems such as chronic disease or pandemics require knowledge that transcends disciplinary boundaries to generate solutions. Such transdisciplinary discovery requires researchers to work and collaborate across boundaries, combining elements of basic and applied science. At the same time, calls for more interdisciplinary health…
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2012 CFR
2012-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2011 CFR
2011-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
29 CFR 18.14 - Scope of discovery.
Code of Federal Regulations, 2013 CFR
2013-07-01
... administrative law judge in accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the... things and the identity and location of persons having knowledge of any discoverable matter. (b) It is...
49 CFR 386.38 - Scope of discovery.
Code of Federal Regulations, 2014 CFR
2014-10-01
... accordance with these rules, the parties may obtain discovery regarding any matter, not privileged, which is relevant to the subject matter involved in the proceeding, including the existence, description, nature... location of persons having knowledge of any discoverable matter. (b) It is not ground for objection that...
Shared strategies for β-lactam catabolism in the soil microbiome.
Crofts, Terence S; Wang, Bin; Spivak, Aaron; Gianoulis, Tara A; Forsberg, Kevin J; Gibson, Molly K; Johnsky, Lauren A; Broomall, Stacey M; Rosenzweig, C Nicole; Skowronski, Evan W; Gibbons, Henry S; Sommer, Morten O A; Dantas, Gautam
2018-06-01
The soil microbiome can produce, resist, or degrade antibiotics and even catabolize them. While resistance genes are widely distributed in the soil, there is a dearth of knowledge concerning antibiotic catabolism. Here we describe a pathway for penicillin catabolism in four isolates. Genomic and transcriptomic sequencing revealed β-lactamase, amidase, and phenylacetic acid catabolon upregulation. Knocking out part of the phenylacetic acid catabolon or an apparent penicillin utilization operon (put) resulted in loss of penicillin catabolism in one isolate. A hydrolase from the put operon was found to degrade benzylpenicilloic acid, the β-lactamase product of penicillin, in vitro. To test the generality of this strategy, an Escherichia coli strain was engineered to co-express a β-lactamase and a penicillin amidase or the put operon, enabling it to grow using penicillin or benzylpenicilloic acid, respectively. Elucidation of additional pathways may allow bioremediation of antibiotic-contaminated soils and discovery of antibiotic-remodeling enzymes with industrial utility.
Reconstitution reveals motor activation for intraflagellar transport.
Mohamed, Mohamed A A; Stepp, Willi L; Ökten, Zeynep
2018-05-01
The human body represents a notable example of ciliary diversification. Extending from the surface of most cells, cilia accomplish a diverse set of tasks. Predictably, mutations in ciliary genes cause a wide range of human diseases such as male infertility and blindness. In Caenorhabditis elegans sensory cilia, this functional diversity appears to be traceable to the differential regulation of the kinesin-2-powered intraflagellar-transport (IFT) machinery. Here we reconstituted the first, to our knowledge, functional multi-component IFT complex that is deployed in the sensory cilia of C. elegans. Our bottom-up approach revealed the molecular basis of specific motor recruitment to the IFT trains. We identified the key component that incorporates homodimeric kinesin-2 into its physiologically relevant context, which in turn allosterically activates the motor for efficient transport. These results will enable the molecular delineation of IFT regulation, which has eluded understanding since its discovery more than two decades ago.
Roussigne, Myriam; Blader, Patrick; Wilson, Stephen W
2012-03-01
How does left-right asymmetry develop in the brain and how does the resultant asymmetric circuitry impact on brain function and lateralized behaviors? By enabling scientists to address these questions at the levels of genes, neurons, circuitry and behavior, the zebrafish model system provides a route to resolve the complexity of brain lateralization. In this review, we present the progress made towards characterizing the nature of the gene networks and the sequence of morphogenetic events involved in the asymmetric development of zebrafish epithalamus. In an attempt to integrate the recent extensive knowledge into a working model and to identify the future challenges, we discuss how insights gained at a cellular/developmental level can be linked to the data obtained at a molecular/genetic level. Finally, we present some evolutionary thoughts and discuss how significant discoveries made in zebrafish should provide entry points to better understand the evolutionary origins of brain lateralization.
Comparative Genomics and Host Resistance against Infectious Diseases
Qureshi, Salman T.; Skamene, Emil
1999-01-01
The large size and complexity of the human genome have limited the identification and functional characterization of components of the innate immune system that play a critical role in front-line defense against invading microorganisms. However, advances in genome analysis (including the development of comprehensive sets of informative genetic markers, improved physical mapping methods, and novel techniques for transcript identification) have reduced the obstacles to discovery of novel host resistance genes. Study of the genomic organization and content of widely divergent vertebrate species has shown a remarkable degree of evolutionary conservation and enables meaningful cross-species comparison and analysis of newly discovered genes. Application of comparative genomics to host resistance will rapidly expand our understanding of human immune defense by facilitating the translation of knowledge acquired through the study of model organisms. We review the rationale and resources for comparative genomic analysis and describe three examples of host resistance genes successfully identified by this approach. PMID:10081670
2017-12-08
Dr. Robert Weems, emeritus paleontologist for the USGS verifies the recently discovered dinosaur track found on the NASA Goddard Space Flight Center campus. This imprint shows the right rear foot of a nodosaur - a low-slung, spiny leaf-eater - apparently moving in haste as the heel did not fully settle in the cretaceous mud, according to dinosaur tracker Ray Stanford. It was found recently on NASA's Goddard Space Flight Center campus and is being preserved for study. To read more about this discovery go to: 1.usa.gov/P9NYg7 Credit: NASA/GSFC/Rebecca Roth
2017-12-08
Dinosaur tracker Ray Stanford describes the cretaceous-era nodosaur track he found on the Goddard Space Flight Center campus with Dr. Robert Weems, emeritus paleontologist for the USGS who verified his discovery. This imprint shows the right rear foot of a nodosaur - a low-slung, spiny leaf-eater - apparently moving in haste as the heel did not fully settle in the cretaceous mud, according to dinosaur tracker Ray Stanford. It was found recently on NASA's Goddard Space Flight Center campus and is being preserved for study. To read more go to: 1.usa.gov/P9NYg7 Credit: NASA/GSFC/Rebecca Roth
2012-08-23
Dr. Robert Weems, emeritus paleontologist for the USGS verifies the recently discovered dinosaur track found on the NASA Goddard Space Flight Center campus. This imprint shows the right rear foot of a nodosaur - a low-slung, spiny leaf-eater - apparently moving in haste as the heel did not fully settle in the cretaceous mud, according to dinosaur tracker Ray Stanford. It was found recently on NASA's Goddard Space Flight Center campus and is being preserved for study. To read more about this discovery go to: 1.usa.gov/P9NYg7 Credit: NASA/GSFC/Rebecca Roth
2012-08-23
Dinosaur tracker Ray Stanford describes the cretaceous-era nodosaur track he found on the Goddard Space Flight Center campus this year. The imprint shows the right rear foot of a nodosaur - a low-slung, spiny leaf-eater - apparently moving in haste as the heel did not fully settle in the cretaceous mud, according to dinosaur tracker Ray Stanford. It was found recently on NASA's Goddard Space Flight Center campus and is being preserved for study. To read more about this discovery go to: 1.usa.gov/P9NYg7 Credit: NASA/GSFC/Rebecca Roth
Building on the Cornerstone: Destinations for Nearside Sample Return
NASA Technical Reports Server (NTRS)
Lawrence, S. J.; Jolliff, B. L.; Draper, D.; Stopar, J. D.; Petro, N. E.; Cohen, B. A.; Speyerer, E. J.; Gruener, J. E.
2016-01-01
Discoveries from LRO (Lunar Reconnaissance Orbiter) have transformed our knowledge of the Moon, but LRO's instruments were originally designed to collect the measurements required to enable future lunar surface exploration. Compelling science questions and critical resources make the Moon a key destination for future human and robotic exploration. Lunar surface exploration, including rovers and other landed missions, must be part of a balanced planetary science and exploration portfolio. Among the highest planetary exploration priorities is the collection of new samples and their return to Earth for more comprehensive analysis than can be done in situ. The Moon is the closest and most accessible location to address key science questions through targeted sample return. The Moon is the only other planetary body from which we have contextualized samples, yet critical issues need to be addressed: we lack important details of the Moon's early and recent geologic history, the full compositional and age ranges of its crust, and its bulk composition.
Single-cell RNA-sequencing: The future of genome biology is now
Picelli, Simone
2017-01-01
Genome-wide single-cell analysis represents the ultimate frontier of genomics research. In particular, single-cell RNA-sequencing (scRNA-seq) studies have been boosted in the last few years by an explosion of new technologies enabling the study of the transcriptomic landscape of thousands of single cells in complex multicellular organisms. More sensitive and automated methods are being continuously developed and promise to deliver better data quality and higher throughput with less hands-on time. The outstanding amount of knowledge that is going to be gained from present and future studies will have a profound impact on many aspects of our society, from the introduction of truly tailored cancer treatments, to a better understanding of antibiotic resistance and host-pathogen interactions; from the discovery of the mechanisms regulating stem cell differentiation to the characterization of the early events of human embryogenesis. PMID:27442339
McEntire, Robin; Szalkowski, Debbie; Butler, James; Kuo, Michelle S; Chang, Meiping; Chang, Man; Freeman, Darren; McQuay, Sarah; Patel, Jagruti; McGlashen, Michael; Cornell, Wendy D; Xu, Jinghai James
2016-05-01
External content sources such as MEDLINE®, National Institutes of Health (NIH) grants and conference websites provide access to the latest breaking biomedical information, which can inform pharmaceutical and biotechnology company pipeline decisions. The value of these sites for industry, however, is limited by the use of the public internet, limited synonym support, the rarity of batch searching capability and the disconnected nature of the sites. Fortunately, many sites now offer their content for download, and we have developed an automated internal workflow that uses text mining and tailored ontologies for programmatic search and knowledge extraction. We believe such an efficient and secure approach provides a competitive advantage to companies needing access to the latest information for a range of use cases and complements manually curated commercial sources. Copyright © 2016. Published by Elsevier Ltd.
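The workflow above is described only at a high level; the following is a minimal sketch of the kind of ontology-driven tagging such a pipeline might apply to downloaded content. The term list, synonym sets, document identifier and matching rules are illustrative assumptions, not the authors' system.

```python
# Minimal sketch: dictionary-based tagging of downloaded abstracts using a
# tailored ontology of preferred terms and synonyms (all names illustrative).
import re
from collections import defaultdict

ONTOLOGY = {  # preferred term -> synonyms (toy example)
    "EGFR": ["EGFR", "epidermal growth factor receptor", "ERBB1", "HER1"],
    "NSCLC": ["NSCLC", "non-small cell lung cancer", "non small cell lung carcinoma"],
}

# Compile one case-insensitive pattern per preferred term.
PATTERNS = {
    term: re.compile(r"\b(" + "|".join(map(re.escape, syns)) + r")\b", re.IGNORECASE)
    for term, syns in ONTOLOGY.items()
}

def tag_document(doc_id: str, text: str) -> dict:
    """Return the preferred terms found in a document, with matched surface forms."""
    hits = defaultdict(set)
    for term, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits[term].add(match.group(0))
    return {"doc": doc_id, "hits": {t: sorted(s) for t, s in hits.items()}}

if __name__ == "__main__":
    abstract = ("We evaluated an epidermal growth factor receptor (ERBB1) inhibitor "
                "in non-small cell lung cancer models.")
    print(tag_document("doc-0001", abstract))  # hypothetical document identifier
    # -> both EGFR and NSCLC are reported, via their synonyms
```

In a batch setting this tagging step would run over every downloaded record, with the resulting term hits stored for downstream search and knowledge extraction.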
Killion, Chery M
2007-01-01
A nurse anthropologist with a background in international collaborations attended Project LEAD for two years, which enabled her to continue to serve as an advocate for the mentally ill in Belize. The anthropologist collaborated with a psychiatrist from Belize to develop a cross-cultural, cross-discipline publication, "Mental Health in Belize: A National Priority," which highlights the work of psychiatric nurse practitioners in the country. The researcher learned to collaborate with her peer in Belize through face-to-face discussions and e-mail and overcame technological difficulties and cultural barriers to produce an international publication. Project LEAD gave the author a sense of self-discovery and self-knowledge, reinforced core values, and developed a frame of reference for leadership. The author also benefited from discussions by local, national, and international leaders on leadership in terms of its key components, contexts, challenges, triumphs, and styles.
Trends in Modern Drug Discovery.
Eder, Jörg; Herrling, Paul L
2016-01-01
Drugs discovered by the pharmaceutical industry over the past 100 years have dramatically changed the practice of medicine and impacted on many aspects of our culture. For many years, drug discovery was a target- and mechanism-agnostic approach that was based on ethnobotanical knowledge often fueled by serendipity. With the advent of modern molecular biology methods and based on knowledge of the human genome, drug discovery has now largely changed into a hypothesis-driven target-based approach, a development which was paralleled by significant environmental changes in the pharmaceutical industry. Laboratories became increasingly computerized and automated, and geographically dispersed research sites are now more and more clustered into large centers to capture technological and biological synergies. Today, academia, the regulatory agencies, and the pharmaceutical industry all contribute to drug discovery, and, in order to translate the basic science into new medical treatments for unmet medical needs, pharmaceutical companies have to have a critical mass of excellent scientists working in many therapeutic fields, disciplines, and technologies. The imperative for the pharmaceutical industry to discover breakthrough medicines is matched by the increasing numbers of first-in-class drugs approved in recent years and reflects the impact of modern drug discovery approaches, technologies, and genomics.
The Role of Learning Goals in Building a Knowledge Base for Elementary Mathematics Teacher Education
ERIC Educational Resources Information Center
Jansen, Amanda; Bartell, Tonya; Berk, Dawn
2009-01-01
In this article, we describe features of learning goals that enable indexing knowledge for teacher education. Learning goals are the key enabler for building a knowledge base for teacher education; they define what counts as essential knowledge for prospective teachers. We argue that 2 characteristics of learning goals support knowledge-building…
Reuniting Virtue and Knowledge
ERIC Educational Resources Information Center
Culham, Tom
2015-01-01
Einstein held that intuition is more important than rational inquiry as a source of discovery. Further, he explicitly and implicitly linked the heart, the sacred, devotion and intuitive knowledge. The raison d'être of universities is the advance of knowledge; however, they have primarily focused on developing students' skills in working with…
The Service Environment for Enhanced Knowledge and Research (SEEKR) Framework
NASA Astrophysics Data System (ADS)
King, T. A.; Walker, R. J.; Weigel, R. S.; Narock, T. W.; McGuire, R. E.; Candey, R. M.
2011-12-01
The Service Environment for Enhanced Knowledge and Research (SEEKR) Framework is a configurable, service-oriented framework to enable the discovery, access and analysis of data shared in a community. The SEEKR framework integrates many existing independent services through the use of web technologies and standard metadata. Services are hosted using an application server and are callable through REpresentational State Transfer (REST) protocols. Messages and metadata are transferred with eXtensible Markup Language (XML) encoding that conforms to a published XML schema. Space Physics Archive Search and Extract (SPASE) metadata is central to utilizing the services. Resources (data, documents, software, etc.) are described with SPASE and the associated Resource Identifier is used to access and exchange resources. The configurable options for the service can be set by using a web interface. Services are packaged as web application resource (WAR) files for direct deployment on application servers such as Tomcat or Jetty. We discuss the composition of the SEEKR framework, how new services can be integrated and the steps necessary to deploy the framework. The SEEKR Framework emerged from NASA's Virtual Magnetospheric Observatory (VMO) and other systems and we present an overview of these systems from a SEEKR Framework perspective.
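As a rough illustration of the interaction pattern the abstract describes (REST calls returning SPASE-described resources), the sketch below queries a hypothetical registry endpoint and extracts Resource Identifiers from the XML reply. The endpoint URL, query parameter and response layout are assumptions; only the general REST-plus-SPASE-XML pattern comes from the abstract.

```python
# Sketch of a SEEKR-style client: query a REST service and pull SPASE Resource
# Identifiers out of the XML reply. ENDPOINT and its query parameter are
# hypothetical placeholders, not a documented SEEKR service address.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ENDPOINT = "https://example.org/seekr/registry"  # placeholder

def find_resource_ids(keyword: str):
    url = ENDPOINT + "?query=" + urllib.parse.quote(keyword)
    with urllib.request.urlopen(url, timeout=30) as resp:
        tree = ET.parse(resp)
    # SPASE descriptions carry a ResourceID element used to access and exchange
    # resources; matching on the local tag name avoids hard-coding the namespace.
    return [el.text for el in tree.getroot().iter() if el.tag.endswith("ResourceID")]

if __name__ == "__main__":
    for rid in find_resource_ids("magnetometer"):
        print(rid)
```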
de la Iglesia, D; Cachau, R E; García-Remesal, M; Maojo, V
2013-11-27
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.
Bellen, Hugo J; Tong, Chao; Tsuda, Hiroshi
2010-07-01
Discoveries in fruit flies have greatly contributed to our understanding of neuroscience. The use of an unparalleled wealth of tools, many of which originated between 1910 and 1960, has enabled milestone discoveries in nervous system development and function. Such findings have triggered and guided many research efforts in vertebrate neuroscience. After 100 years, fruit flies continue to be the model system of choice for many neuroscientists. The combined use of powerful research tools will ensure that this model organism will continue to lead to key discoveries that will impact vertebrate neuroscience. PMID:20383202
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gerber, Richard; Allcock, William; Beggio, Chris
2014-10-17
U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.
Selective transformations of complex molecules are enabled by aptameric protective groups
NASA Astrophysics Data System (ADS)
Bastian, Andreas A.; Marcozzi, Alessio; Herrmann, Andreas
2012-10-01
Emerging trends in drug discovery are prompting a renewed interest in natural products as a source of chemical diversity and lead structures. However, owing to the structural complexity of many natural compounds, the synthesis of derivatives is not easily realized. Here, we demonstrate a conceptually new approach using oligonucleotides as aptameric protective groups. These block several functionalities by non-covalent interactions in a complex molecule and enable the highly chemo- and regioselective derivatization (>99%) of natural antibiotics in a single synthetic step with excellent conversions of up to 83%. This technique reveals an important structure-activity relationship in neamine-based antibiotics and should help both to accelerate the discovery of new biologically active structures and to avoid potentially costly and cumbersome synthetic routes.
Laboratory Astrophysics: Enabling Scientific Discovery and Understanding
NASA Technical Reports Server (NTRS)
Kirby, K.
2006-01-01
NASA's Science Strategic Roadmap for Universe Exploration lays out a series of science objectives on a grand scale and discusses the various missions, over a wide range of wavelengths, which will enable discovery. Astronomical spectroscopy is arguably the most powerful tool we have for exploring the Universe. Experimental and theoretical studies in Laboratory Astrophysics convert "hard-won data into scientific understanding". However, the development of instruments with increasingly high spectroscopic resolution demands atomic and molecular data of unprecedented accuracy and completeness. How to meet these needs, in a time of severe budgetary constraints, poses a significant challenge to NASA, to astronomical observers and model-builders, and to the laboratory astrophysics community. I will discuss these issues, together with some recent examples of productive astronomy/lab astro collaborations.
18 CFR 385.402 - Scope of discovery (Rule 402).
Code of Federal Regulations, 2010 CFR
2010-04-01
... 18 Conservation of Power and Water Resources Scope of discovery (Rule 402). 385.402 Section 385.402 Conservation of Power and Water Resources FEDERAL ENERGY REGULATORY... persons having any knowledge of any discoverable matter. It is not ground for objection that the...
Doors to Discovery[TM]. What Works Clearinghouse Intervention Report
ERIC Educational Resources Information Center
What Works Clearinghouse, 2013
2013-01-01
"Doors to Discovery"]TM] is a preschool literacy curriculum that uses eight thematic units of activities to help children build fundamental early literacy skills in oral language, phonological awareness, concepts of print, alphabet knowledge, writing, and comprehension. The eight thematic units cover topics such as nature, friendship,…
78 FR 12933 - Proceedings Before the Commodity Futures Trading Commission
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-26
... proceedings. These new amendments also provide that Judgment Officers may conduct sua sponte discovery in... discovery; (4) sound risk management practices; and (5) other public interest considerations. The amendments... representative capacity, it was done with full power and authority to do so; (C) To the best of his knowledge...
76 FR 64803 - Rules of Adjudication and Enforcement
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-19
...) is also amended to clarify the limits on discovery when the Commission orders the ALJ to consider the... that the complainant identify, to the best of its knowledge, the ``like or directly competitive... the taking of discovery by the parties shall be at the discretion of the presiding ALJ. The ITCTLA...
78 FR 63253 - Davidson Kempner Capital Management LLC; Notice of Application
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-23
... employees of the Adviser other than the Contributor have any knowledge of the Contribution prior to its discovery by the Adviser on November 2, 2011. The Contribution was discovered by the Adviser's compliance... names of employees. After discovery of the Contribution, the Adviser and Contributor obtained the...
Exobiology opportunities from Discovery-class missions. [Abstract only
NASA Technical Reports Server (NTRS)
Meyer, Michael A.; Rummel, John D.
1994-01-01
Discovery-class missions that are now planned, and those in the concept stage, have the potential to expand our knowledge of the origins and evolution of biogenic compounds, and ultimately, of the origins of life in the solar system. This class of missions, recently developed within NASA's Solar System Exploration Program, is designed to meet important scientific objectives within stringent guidelines--$150 million cap on development cost and a 3-year cap on the development schedule. The Discovery Program will effectively enable "faster, cheaper" missions to explore the inner solar system. The first two missions are Mars Environmental Survey (MESUR) Pathfinder and Near Earth Asteroid Rendezvous (NEAR). MESUR Pathfinder will be the first Discovery mission, with launch planned for November/December 1996. It will be primarily a technical demonstration and validation of the MESUR Program--a network of automated landers to study the internal structure, meteorology, and surface properties of Mars. Besides providing engineering data, Pathfinder will carry atmospheric instrumentation and imaging capabilities, and may deploy a microrover equipped with an alpha proton X-ray spectrometer to determine elemental composition, particularly the lighter elements of exobiological interest. NEAR is expected to be launched in 1998 and to rendezvous with a near-Earth asteroid for up to 1 year. During this time, the spacecraft will assess the asteroid's mass, size, density, map its surface topography and composition, determine its internal properties, and study its interaction with the interplanetary environment. A gamma ray or X-ray spectrometer will be used to determine elemental composition. An imaging spectrograph, with 0.35 to 2.5 micron spectral range, will be used to determine the asteroid's compositional distribution. Of the 11 Discovery mission concepts that have been designated as warranting further study, several are promising in terms of determining the composition and chemical evolution of organic matter on small planetary bodies. The following mission concepts are of particular interest to the Exobiology Program: Cometary coma chemical composition, comet nucleus tour, near-Earth asteroid returned sample, small missions to asteroids and comets, and solar wind sample return. The following three Discovery mission concepts that have been targeted for further consideration are relevant to the study of the evolution of biogenic compounds: Comet nucleus penetrator, mainbelt asteroid rendezvous explorer, and the Mars polar Pathfinder.
Revisiting lab-on-a-chip technology for drug discovery.
Neuži, Pavel; Giselbrecht, Stefan; Länge, Kerstin; Huang, Tony Jun; Manz, Andreas
2012-08-01
The field of microfluidics or lab-on-a-chip technology aims to improve and extend the possibilities of bioassays, cell biology and biomedical research based on the idea of miniaturization. Microfluidic systems allow more accurate modelling of physiological situations for both fundamental research and drug development, and enable systematic high-volume testing for various aspects of drug discovery. Microfluidic systems are in development that not only model biological environments but also physically mimic biological tissues and organs; such 'organs on a chip' could have an important role in expediting early stages of drug discovery and help reduce reliance on animal testing. This Review highlights the latest lab-on-a-chip technologies for drug discovery and discusses the potential for future developments in this field.
Ultsch, Alfred; Kringel, Dario; Kalso, Eija; Mogil, Jeffrey S; Lötsch, Jörn
2016-12-01
The increasing availability of "big data" enables novel research approaches to chronic pain while also requiring novel techniques for data mining and knowledge discovery. We used machine learning to combine the knowledge about n = 535 genes identified empirically as relevant to pain with the knowledge about the functions of thousands of genes. Starting from an accepted description of chronic pain as displaying systemic features described by the terms "learning" and "neuronal plasticity," a functional genomics analysis proposed that among the functions of the 535 "pain genes," the biological processes "learning or memory" (P = 8.6 × 10) and "nervous system development" (P = 2.4 × 10) are statistically significantly overrepresented as compared with the annotations to these processes expected by chance. After establishing that the hypothesized biological processes were among important functional genomics features of pain, a subset of n = 34 pain genes were found to be annotated with both Gene Ontology terms. Published empirical evidence supporting their involvement in chronic pain was identified for almost all these genes, including 1 gene identified in March 2016 as being involved in pain. By contrast, such evidence was virtually absent in a randomly selected set of 34 other human genes. Hence, the present computational functional genomics-based method can be used for candidate gene selection, providing an alternative to established methods.
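The statistical core of such a functional-genomics analysis is an over-representation test of GO annotations among the selected genes. A generic sketch using a hypergeometric test is given below; all gene counts in the example are made up for illustration and are not the study's data.

```python
# Sketch of a Gene Ontology over-representation test of the kind used to ask
# whether "learning or memory" genes are enriched among pain-relevant genes.
# All counts below are illustrative, not the study's values.
from scipy.stats import hypergeom

def enrichment_p(total_genes, annotated_to_term, selected_genes, overlap):
    """P(X >= overlap) when drawing `selected_genes` genes without replacement
    from `total_genes`, of which `annotated_to_term` carry the GO term."""
    return hypergeom.sf(overlap - 1, total_genes, annotated_to_term, selected_genes)

if __name__ == "__main__":
    # e.g. 20,000 annotated human genes, 600 tagged "learning or memory",
    # 535 pain-relevant genes, 60 of which carry the term (made-up numbers).
    p = enrichment_p(20000, 600, 535, 60)
    print(f"over-representation p-value: {p:.3e}")
```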
Network-based approaches to climate knowledge discovery
NASA Astrophysics Data System (ADS)
Budich, Reinhard; Nyberg, Per; Weigel, Tobias
2011-11-01
Climate Knowledge Discovery Workshop; Hamburg, Germany, 30 March to 1 April 2011 Do complex networks combined with semantic Web technologies offer the next generation of solutions in climate science? To address this question, a first Climate Knowledge Discovery (CKD) Workshop, hosted by the German Climate Computing Center (Deutsches Klimarechenzentrum (DKRZ)), brought together climate and computer scientists from major American and European laboratories, data centers, and universities, as well as representatives from industry, the broader academic community, and the semantic Web communities. The participants, representing six countries, were concerned with large-scale Earth system modeling and computational data analysis. The motivation for the meeting was the growing problem that climate scientists generate data faster than it can be interpreted and the need to prepare for further exponential data increases. Current analysis approaches are focused primarily on traditional methods, which are best suited for large-scale phenomena and coarse-resolution data sets. The workshop focused on the open discussion of ideas and technologies to provide the next generation of solutions to cope with the increasing data volumes in climate science.
Mott, Meghan; Koroshetz, Walter
2015-07-01
The mission of the National Institute of Neurological Disorders and Stroke (NINDS) is to seek fundamental knowledge about the brain and nervous system and to use that knowledge to reduce the burden of neurological disease. NINDS supports early- and late-stage therapy development funding programs to accelerate preclinical discovery and the development of new therapeutic interventions for neurological disorders. The NINDS Office of Translational Research facilitates and funds the movement of discoveries from the laboratory to patients. Its grantees include academics, often with partnerships with the private sector, as well as small businesses, which, by Congressional mandate, receive > 3% of the NINDS budget for small business innovation research. This article provides an overview of NINDS-funded therapy development programs offered by the NINDS Office of Translational Research.
Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases
Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand
2010-01-01
The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs. PMID:20885778
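The ABC principle described above can be sketched in a few lines: concepts C that never co-occur directly with A are ranked by how many shared B intermediates connect them. The toy co-occurrence data below are purely illustrative and do not reproduce CoPub Discovery's thesauri or scoring.

```python
# Sketch of the ABC co-occurrence principle behind tools like CoPub Discovery:
# A and C have no direct co-occurrence, but share intermediate B concepts.
from collections import defaultdict

# concept -> set of concepts it directly co-occurs with in the literature (toy data)
cooccurs = {
    "drug_A": {"gene_B1", "gene_B2"},
    "gene_B1": {"drug_A", "disease_C1"},
    "gene_B2": {"drug_A", "disease_C1", "disease_C2"},
    "disease_C1": {"gene_B1", "gene_B2"},
    "disease_C2": {"gene_B2"},
}

def hidden_links(a: str):
    """Rank concepts C not directly linked to A by the number of shared B intermediates."""
    direct = cooccurs.get(a, set())
    supports = defaultdict(set)
    for b in direct:
        for c in cooccurs.get(b, set()):
            if c != a and c not in direct:
                supports[c].add(b)
    return sorted(((c, sorted(bs)) for c, bs in supports.items()),
                  key=lambda item: len(item[1]), reverse=True)

if __name__ == "__main__":
    for c, intermediates in hidden_links("drug_A"):
        print(f"{c} linked to drug_A via {intermediates}")
    # disease_C1 ranks above disease_C2 because two B genes support the link
```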
Asymmetric threat data mining and knowledge discovery
NASA Astrophysics Data System (ADS)
Gilmore, John F.; Pagels, Michael A.; Palk, Justin
2001-03-01
Asymmetric threats differ from the conventional force-on-force military encounters that the Defense Department has historically been trained to engage. Terrorism by its nature is now an operational activity that is neither easily detected nor countered, as its very existence depends on small covert attacks exploiting the element of surprise. But terrorism does have defined forms, motivations, tactics and organizational structure. Exploiting a terrorism taxonomy provides the opportunity to discover and assess knowledge of terrorist operations. This paper describes the Asymmetric Threat Terrorist Assessment, Countering, and Knowledge (ATTACK) system. ATTACK has been developed to (a) data mine open source intelligence (OSINT) information from web-based newspaper sources, video news web casts, and actual terrorist web sites, (b) evaluate this information against a terrorism taxonomy, (c) exploit country/region specific social, economic, political, and religious knowledge, and (d) discover and predict potential terrorist activities and association links. Details of the asymmetric threat structure and the ATTACK system architecture are presented with results of an actual terrorist data mining and knowledge discovery test case shown.
Hoehndorf, Robert; Dumontier, Michel; Oellrich, Anika; Rebholz-Schuhmann, Dietrich; Schofield, Paul N; Gkoutos, Georgios V
2011-01-01
Researchers design ontologies as a means to accurately annotate and integrate experimental data across heterogeneous and disparate data- and knowledge bases. Formal ontologies make the semantics of terms and relations explicit such that automated reasoning can be used to verify the consistency of knowledge. However, many biomedical ontologies do not sufficiently formalize the semantics of their relations and are therefore limited with respect to automated reasoning for large scale data integration and knowledge discovery. We describe a method to improve automated reasoning over biomedical ontologies and identify several thousand contradictory class definitions. Our approach aligns terms in biomedical ontologies with foundational classes in a top-level ontology and formalizes composite relations as class expressions. We describe the semi-automated repair of contradictions and demonstrate expressive queries over interoperable ontologies. Our work forms an important cornerstone for data integration, automatic inference and knowledge discovery based on formal representations of knowledge. Our results and analysis software are available at http://bioonto.de/pmwiki.php/Main/ReasonableOntologies.
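A minimal sketch of the kind of consistency check the abstract refers to is shown below, assuming the owlready2 package and its bundled reasoner; the ontology file name is a placeholder, and the alignment and repair steps described in the paper are not reproduced.

```python
# Minimal sketch: check an OWL ontology for contradictory (unsatisfiable) class
# definitions with a reasoner, via owlready2. "my_ontology.owl" is a placeholder
# path, not one of the ontologies analyzed in the paper.
from owlready2 import get_ontology, sync_reasoner, default_world

onto = get_ontology("file://my_ontology.owl").load()

with onto:
    sync_reasoner()  # classify; unsatisfiable classes become subclasses of Nothing

inconsistent = list(default_world.inconsistent_classes())
print(f"{len(inconsistent)} contradictory class definitions found")
for cls in inconsistent:
    print(" -", cls.iri)
```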
Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B
2009-03-01
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
Oak Ridge Graph Analytics for Medical Innovation (ORiGAMI)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, Larry W.; Lee, Sangkeun
2016-01-01
In this era of data-driven decisions and discovery where Big Data is producing Bigger Data, data scientists at the Oak Ridge National Laboratory are leveraging unique leadership infrastructure (e.g., Urika XA and Urika GD appliances) to develop scalable algorithms for semantic, logical and statistical reasoning with Big Data (i.e., data stored in databases as well as unstructured data in documents). ORiGAMI is a next-generation knowledge-discovery framework that is: (a) knowledge nurturing (i.e., evolves seamlessly with newer knowledge and data), (b) smart and curious (i.e., using information-foraging and reasoning algorithms to digest content) and (c) synergistic (i.e., interfaces computers with what they do best to help subject-matter experts do their best). ORiGAMI has been demonstrated using the National Library of Medicine's SEMANTIC MEDLINE (archive of medical knowledge since 1994).
Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R
2017-07-05
Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.
NASA Astrophysics Data System (ADS)
Cantwell, K. L.; Kennedy, B. R.; Malik, M.; Gray, L. M.; Elliott, K.; Lobecker, E.; Drewniak, J.; Reser, B.; Crum, E.; Lovalvo, D.
2016-02-01
Since its commissioning in 2008, NOAA Ship Okeanos Explorer has used telepresence technology both as an outreach tool and as a new way to conduct interdisciplinary science expeditions. NOAA's Office of Ocean Exploration and Research (OER) has developed a set of collaboration tools and protocols to enable extensive shore-based participation. Telepresence offers unique advantages including access to a large pool of expertise on shore and flexibility to react to new discoveries as they occur. During early years, the telepresence experience was limited to Internet2-enabled Exploration Command Centers, but with the advent of improved bandwidth and new video transcoders, scientists from anywhere with an internet connection can participate in a telepresence expedition. Scientists have also capitalized on social media (Twitter, Facebook, Reddit, etc.) by sharing discoveries to leverage the intellectual capital of scientists worldwide and engaging the general public in real time. Aside from using telepresence to stream video off the ship, the high-bandwidth satellite connection allows for the transfer of large quantities of data in near real time. This enables not only ship-shore data transfers, but can also support ship-ship collaborations, as demonstrated during the 2015 and 2014 seasons when Okeanos worked directly with science teams onboard other vessels to share data and immediately follow up on features of interest, leading to additional discoveries. OER continues to expand its use of telepresence by experimenting with procedures to offload roles previously tied to the ship, such as data acquisition watch standers; prototyping tools for distributed user data analysis and video annotation; and incorporating in-situ sampling devices. OER has also developed improved tools to provide access to archived data to increase data distribution and facilitate additional discoveries post-expedition.
Celedon, J M; Bohlmann, J
2016-01-01
Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (eg, simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (eg, santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example for genomics/transcriptomics enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance defining, final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.
García-Alonso, Carlos; Pérez-Naranjo, Leonor
2009-01-01
Introduction Knowledge management, based on information transfer between experts and analysts, is crucial for the validity and usability of data envelopment analysis (DEA). Aim To design and develop a methodology: i) to assess technical efficiency of small health areas (SHA) in an uncertainty environment, and ii) to transfer information between experts and operational models, in both directions, for improving experts' knowledge. Method A procedure derived from knowledge discovery from data (KDD) is used to select, interpret and weigh DEA inputs and outputs. Based on KDD results, an expert-driven Monte-Carlo DEA model has been designed to assess the technical efficiency of SHA in Andalusia. Results In terms of probability, SHA 29 is the most efficient while, on the contrary, SHA 22 is very inefficient. 73% of the analysed SHA have a probability of being efficient (Pe) >0.9 and 18% <0.5. Conclusions Expert knowledge is necessary to design and validate any operational model. KDD techniques make the transfer of information from experts to any operational model easy, and the results obtained from the latter improve experts' knowledge.
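A simplified sketch of an expert-driven Monte-Carlo DEA of this kind is given below: an input-oriented CCR model is solved per small health area (DMU) and re-solved over perturbed data to estimate each area's probability of being efficient. The toy data, noise model and tolerance are assumptions, and the expert-derived input/output weighting from the KDD step is not modelled.

```python
# Simplified sketch of a Monte-Carlo DEA: solve an input-oriented CCR model per
# health area (DMU), perturb the inputs, and estimate the probability of being
# efficient. Data and noise model are illustrative only.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, o):
    """theta* for DMU o: min theta s.t. X@lam <= theta*X[:,o], Y@lam >= Y[:,o], lam >= 0."""
    m, n = X.shape                       # m inputs, n DMUs
    s = Y.shape[0]                       # s outputs
    c = np.r_[1.0, np.zeros(n)]          # decision variables: [theta, lam_1..lam_n]
    A_in = np.c_[-X[:, [o]], X]          # X@lam - theta*x_o <= 0
    A_out = np.c_[np.zeros((s, 1)), -Y]  # -Y@lam <= -y_o
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[:, o]]
    bounds = [(0, None)] * (n + 1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.fun

rng = np.random.default_rng(0)
X = rng.uniform(5, 15, size=(2, 6))      # 2 inputs, 6 toy health areas
Y = rng.uniform(10, 30, size=(2, 6))     # 2 outputs

draws, tol = 500, 1e-4
efficient = np.zeros(X.shape[1])
for _ in range(draws):                   # expert-weighted noise would go here
    Xp = X * rng.normal(1.0, 0.05, size=X.shape)
    for o in range(X.shape[1]):
        if ccr_efficiency(Xp, Y, o) >= 1 - tol:
            efficient[o] += 1
print("P(efficient) per area:", np.round(efficient / draws, 2))
```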
NASA Astrophysics Data System (ADS)
Kicza, Mary; Bruegge, Richard Vorder
1995-01-01
NASA's Discovery Program represents a new era in planetary exploration. Discovery's primary goal: to maintain U.S. scientific leadership in planetary research by conducting a series of highly focused, cost-effective missions to answer critical questions in solar system science. The Program will stimulate the development of innovative management approaches by encouraging new teaming arrangements among industry, universities and the government. The program encourages the prudent use of new technologies to enable/enhance science return and to reduce life cycle cost, and it supports the transfer of these technologies to the private sector for secondary applications. The Near-Earth Asteroid Rendezvous and Mars Pathfinder missions have been selected as the first two Discovery missions. Both will be launched in 1996. Subsequent, competitively selected missions will be conceived and proposed to NASA by teams of scientists and engineers from industry, academia, and government organizations. This paper summarizes the status of Discovery Program planning.
Simple animal models for amyotrophic lateral sclerosis drug discovery.
Patten, Shunmoogum A; Parker, J Alex; Wen, Xiao-Yan; Drapeau, Pierre
2016-08-01
Simple animal models have enabled great progress in uncovering the disease mechanisms of amyotrophic lateral sclerosis (ALS) and are helping in the selection of therapeutic compounds through chemical genetic approaches. Within this article, the authors provide a concise overview of simple model organisms, C. elegans, Drosophila and zebrafish, which have been employed to study ALS and discuss their value to ALS drug discovery. In particular, the authors focus on innovative chemical screens that have established simple organisms as important models for ALS drug discovery. There are several advantages of using simple animal model organisms to accelerate drug discovery for ALS. It is the authors' particular belief that the amenability of simple animal models to various genetic manipulations, the availability of a wide range of transgenic strains for labelling motoneurons and other cell types, combined with live imaging and chemical screens should allow for new detailed studies elucidating early pathological processes in ALS and subsequent drug and target discovery.
Discovery of novel drugs for promising targets.
Martell, Robert E; Brooks, David G; Wang, Yan; Wilcoxen, Keith
2013-09-01
Once a promising drug target is identified, the steps to actually discover and optimize a drug are diverse and challenging. The goal of this study was to provide a road map for navigating drug discovery. We review the general steps of drug discovery and provide illustrative references. A number of approaches are available to enhance and accelerate target identification and validation. Consideration of a variety of potential mechanisms of action of potential drugs can guide discovery efforts. The hit-to-lead stage may involve techniques such as high-throughput screening, fragment-based screening, and structure-based design, with informatics playing an ever-increasing role. Biologically relevant screening models are discussed, including cell lines, 3-dimensional culture, and in vivo screening. The process of enabling human studies for an investigational drug is also discussed. Drug discovery is a complex process that has significantly evolved in recent years. © 2013 Elsevier HS Journals, Inc. All rights reserved.
Phenotypic screening in cancer drug discovery - past, present and future.
Moffat, John G; Rudolph, Joachim; Bailey, David
2014-08-01
There has been a resurgence of interest in the use of phenotypic screens in drug discovery as an alternative to target-focused approaches. Given that oncology is currently the most active therapeutic area, and also one in which target-focused approaches have been particularly prominent in the past two decades, we investigated the contribution of phenotypic assays to oncology drug discovery by analysing the origins of all new small-molecule cancer drugs approved by the US Food and Drug Administration (FDA) over the past 15 years and those currently in clinical development. Although the majority of these drugs originated from target-based discovery, we identified a significant number whose discovery depended on phenotypic screening approaches. We postulate that the contribution of phenotypic screening to cancer drug discovery has been hampered by a reliance on 'classical' nonspecific drug effects such as cytotoxicity and mitotic arrest, exacerbated by a paucity of mechanistically defined cellular models for therapeutically translatable cancer phenotypes. However, technical and biological advances that enable such mechanistically informed phenotypic models have the potential to empower phenotypic drug discovery in oncology.
The CUAHSI Water Data Center: Enabling Data Publication, Discovery and Re-use
NASA Astrophysics Data System (ADS)
Seul, M.; Pollak, J.
2014-12-01
The CUAHSI Water Data Center (WDC) supports a standards-based, services-oriented architecture for time-series data and provides a separate service to publish spatial data layers as shape files. Two new services that the WDC offers are a cloud-based server (Cloud HydroServer) for publishing data and a web-based client for data discovery. The Cloud HydroServer greatly simplifies data publication by eliminating the need for scientists to set up an SQL-server database, a requirement that has proven to be a significant barrier, and ensures greater reliability and continuity of service. Uploaders have been developed to simplify the metadata documentation process. The web-based data client eliminates the need for installing a program to be used as a client and works across all computer operating systems. The services provided by the WDC are a foundation for big data use, re-use, and meta-analyses. Using data transmission standards enables far more effective data sharing and discovery; standards used by the WDC are part of a global set of standards that should enable scientists to access an unprecedented amount of data to address larger-scale research questions than was previously possible. A central mission of the WDC is to ensure these services meet the needs of the water science community and are effective at advancing water science.
2010-01-01
Background An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
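The enumerate-and-score idea behind such a toolkit can be sketched in a single process, as below: count every DNA word of a given length and score it against a background Markov model. The order-0 background, log-odds-style score and toy sequences are simplifying assumptions; WordSeeker itself partitions the word space across worker nodes and uses its own distributed Markov chain model.

```python
# Single-process sketch of enumerative motif discovery: count every DNA word of
# length k and score it against an order-0 background model (observed vs.
# expected counts). The distributed controller/worker layer is omitted here.
import math
from collections import Counter
from itertools import product

def score_words(sequences, k=6):
    words = Counter()
    base_counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        base_counts.update(seq)
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= set("ACGT"):
                words[word] += 1
    total_bases = sum(base_counts[b] for b in "ACGT")
    base_freq = {b: base_counts[b] / total_bases for b in "ACGT"}
    total_words = sum(words.values())
    scores = {}
    for word in ("".join(p) for p in product("ACGT", repeat=k)):
        expected = total_words * math.prod(base_freq[b] for b in word)
        observed = words.get(word, 0)
        if observed and expected > 0:
            scores[word] = observed * math.log(observed / expected)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    toy = ["ACGTACGTTTGACGTACGT", "TTGACGTACGTAAACGTAC"]
    for word, s in score_words(toy, k=4)[:5]:
        print(word, round(s, 2))
```

A distributed version would hand each worker a disjoint slice of the 4^k word space (for example, by word prefix) and merge the per-worker score tables at the controller.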
18 CFR 385.403 - Methods of discovery; general provisions (Rule 403).
Code of Federal Regulations, 2010 CFR
2010-04-01
... 18 Conservation of Power and Water Resources Methods of discovery; general provisions (Rule 403). 385.403 Section 385.403 Conservation of Power and Water Resources FEDERAL... the response is true and accurate to the best of that person's knowledge, information, and belief...
ERIC Educational Resources Information Center
Heffernan, Bernadette M.
1998-01-01
Describes work done to provide staff of the Sandy Point Discovery Center with methods for evaluating exhibits and interpretive programming. Quantitative and qualitative evaluation measures were designed to assess the program's objective of estuary education. Pretest-posttest questionnaires and interviews are used to measure subjects' knowledge and…
The Prehistory of Discovery: Precursors of Representational Change in Solving Gear System Problems.
ERIC Educational Resources Information Center
Dixon, James A.; Bangert, Ashley S.
2002-01-01
This study investigated whether the process of representational change undergoes developmental change or different processes occupy different niches in the course of knowledge acquisition. Subjects--college, third-, and sixth-grade students--solved gear system problems over two sessions. Findings indicated that for all grades, discovery of the…
40 CFR 300.300 - Phase I-Discovery or notification.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment Phase I-Discovery or notification. 300.300 Section 300.300 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) SUPERFUND... person in charge of a vessel or a facility shall, as soon as he or she has knowledge of any discharge...
Current Advances on Virus Discovery and Diagnostic Role of Viral Metagenomics in Aquatic Organisms
Munang'andu, Hetron M.; Mugimba, Kizito K.; Byarugaba, Denis K.; Mutoloki, Stephen; Evensen, Øystein
2017-01-01
The global expansion of the aquaculture industry has brought with it a corresponding increase in novel viruses infecting different aquatic organisms. These emerging viral pathogens have proved to be a challenge to the use of traditional cell-cultures and immunoassays for identification of new viruses, especially in situations where the novel viruses are unculturable and no antibodies exist for their identification. Viral metagenomics has the potential to identify novel viruses without prior knowledge of their genomic sequence data and may provide a solution for the study of unculturable viruses. This review provides a synopsis of the contribution of viral metagenomics to the discovery of viruses infecting different aquatic organisms as well as its potential role in viral diagnostics. High-throughput next-generation sequencing (NGS) and library construction used in metagenomic projects have simplified the task of generating complete viral genomes, unlike the challenge faced in traditional methods that use multiple primers targeted at different segments and VPs to generate the entire genome of a novel virus. In terms of diagnostics, studies carried out thus far show that viral metagenomics has the potential to serve as a multifaceted tool able to study and identify etiological agents of single infections, co-infections, tissue tropism, profiling viral infections of different aquatic organisms, epidemiological monitoring of disease prevalence, evolutionary phylogenetic analyses, and the study of genomic diversity in quasispecies viruses. With sequencing technologies and bioinformatics analytical tools becoming cheaper and easier, we anticipate that metagenomics will soon become a routine tool for the discovery, study, and identification of novel pathogens including viruses to enable timely disease control for emerging diseases in aquaculture. PMID:28382024
NASA Technical Reports Server (NTRS)
Griffin, Amanda
2012-01-01
Among 2011's many accomplishments, we safely retired the Space Shuttle Program after 30 incredible years; completed the International Space Station and are taking steps to enable it to reach its full potential as a multi-purpose laboratory; and helped to expand scientific knowledge with missions like Aquarius, GRAIL, and the Mars Science Laboratory. Responding to national budget challenges, we are prioritizing critical capabilities and divesting ourselves of assets no longer needed for NASA's future exploration programs. Since these facilities do not have to be maintained or demolished, the government saves money. At the same time, our commercial partners save money because they do not have to build new facilities. It is a win-win for everyone. Moving forward, 2012 will be even more historically significant as we celebrate the 50th Anniversary of Kennedy Space Center. In the coming year, KSC will facilitate commercial transportation to low-Earth orbit and support the evolution of the Space Launch System and Orion crew vehicle as they ready for exploration missions, which will shape how human beings view the universe. While NASA's Vision is to lead scientific and technological advances in aeronautics and space for a Nation on the frontier of discovery, KSC's vision is to be the world's preeminent launch complex for government and commercial space access, enabling the world to explore and work in space. KSC's Mission is to safely manage, develop, integrate, and sustain space systems through partnerships that enable innovative, diverse access to space and inspire the Nation's future explorers.
Serendipity: Accidental Discoveries in Science
NASA Astrophysics Data System (ADS)
Roberts, Royston M.
1989-06-01
Many of the things discovered by accident are important in our everyday lives: Teflon, Velcro, nylon, x-rays, penicillin, safety glass, sugar substitutes, and polyethylene and other plastics. And we owe a debt to accident for some of our deepest scientific knowledge, including Newton's theory of gravitation, the Big Bang theory of Creation, and the discovery of DNA. Even the Rosetta Stone, the Dead Sea Scrolls, and the ruins of Pompeii came to light through chance. This book tells the fascinating stories of these and other discoveries and reveals how the inquisitive human mind turns accident into discovery. Written for the layman, yet scientifically accurate, this illuminating collection of anecdotes portrays invention and discovery as quintessentially human acts, due in part to curiosity, perseverance, and luck.
Explorations of Psyche and Callisto Enabled by Ion Propulsion
NASA Technical Reports Server (NTRS)
Wenkert, Daniel D.; Landau, Damon F.; Bills, Bruce G.; Elkins-Tanton, Linda T.
2013-01-01
Recent developments in ion propulsion (specifically solar electric propulsion - SEP) have the potential for dramatically reducing the transportation cost of planetary missions. We examine two representative cases, where these new developments enable missions which, until recently, would have required resources well beyond those allocated to the Discovery program. The two cases of interest address differentiation of asteroids and large icy satellites.
Bigger data, collaborative tools and the future of predictive drug discovery
NASA Astrophysics Data System (ADS)
Ekins, Sean; Clark, Alex M.; Swamidass, S. Joshua; Litterman, Nadia; Williams, Antony J.
2014-10-01
Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software as a service commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger amounts of data as they accumulate from high-throughput screening, and how they enable the user to draw insights, make predictions and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas.
NHS-Esters As Versatile Reactivity-Based Probes for Mapping Proteome-Wide Ligandable Hotspots.
Ward, Carl C; Kleinman, Jordan I; Nomura, Daniel K
2017-06-16
Most of the proteome is considered undruggable, oftentimes hindering translational efforts for drug discovery. Identifying previously unknown druggable hotspots in proteins would enable strategies for pharmacologically interrogating these sites with small molecules. Activity-based protein profiling (ABPP) has arisen as a powerful chemoproteomic strategy that uses reactivity-based chemical probes to map reactive, functional, and ligandable hotspots in complex proteomes, which has enabled inhibitor discovery against various therapeutic protein targets. Here, we report an alkyne-functionalized N-hydroxysuccinimide-ester (NHS-ester) as a versatile reactivity-based probe for mapping the reactivity of a wide range of nucleophilic ligandable hotspots, including lysines, serines, threonines, and tyrosines, encompassing active sites, allosteric sites, post-translational modification sites, protein interaction sites, and previously uncharacterized potential binding sites. Surprisingly, we also show that fragment-based NHS-ester ligands can be made to confer selectivity for specific lysine hotspots on specific targets including Dpyd, Aldh2, and Gstt1. We thus put forth NHS-esters as promising reactivity-based probes and chemical scaffolds for covalent ligand discovery.
Salvador-Carulla, L; Lukersmith, S; Sullivan, W
2017-04-01
Guideline methods to develop recommendations dedicate most of their effort to organising discovery and corroboration knowledge following the evidence-based medicine (EBM) framework. Guidelines typically use a single dimension of information, and generally discard contextual evidence, formal expert knowledge and consumers' experiences in the process. In recognition of the limitations of guidelines in complex cases, complex interventions and systems research, there has been significant effort to develop new tools, guides, resources and structures to use alongside EBM methods of guideline development. In addition to these advances, a new framework based on the philosophy of science is required. Guidelines should be defined as implementation decision support tools for improving the decision-making process in real-world practice and not only as a procedure to optimise the knowledge base of scientific discovery and corroboration. A shift from the model of the EBM pyramid of corroboration of evidence to the use of a broader multi-domain perspective, graphically depicted as a 'Greek temple', could be considered. This model takes into account the different stages of scientific knowledge (discovery, corroboration and implementation), the sources of knowledge relevant to guideline development (experimental, observational, contextual, expert-based and experiential); their underlying inference mechanisms (deduction, induction, abduction, means-end inferences) and a more precise definition of evidence and related terms. The applicability of this broader approach is presented for the development of the Canadian Consensus Guidelines for the Primary Care of People with Developmental Disabilities.
Anguera, A; Barreiro, J M; Lara, J A; Lizcano, D
2016-01-01
One of the major challenges in the medical domain today is how to exploit the huge amount of data that this field generates. To do this, approaches are required that are capable of discovering knowledge that is useful for decision making in the medical field. Time series are data types that are common in the medical domain and require specialized analysis techniques and tools, especially if the information of interest to specialists is concentrated within particular time series regions, known as events. This research followed the steps specified by the so-called knowledge discovery in databases (KDD) process to discover knowledge from medical time series derived from stabilometric (396 series) and electroencephalographic (200) patient electronic health records (EHR). The view offered in the paper is based on the experience gathered as part of the VIIP project. Knowledge discovery in medical time series has a number of difficulties and implications that are highlighted by illustrating the application of several techniques that cover the entire KDD process through two case studies. This paper illustrates the application of different knowledge discovery techniques for the purposes of classification within the above domains. The accuracy of this application for the two classes considered in each case is 99.86% and 98.11% for epilepsy diagnosis in the electroencephalography (EEG) domain and 99.4% and 99.1% for early-age sports talent classification in the stabilometry domain. The KDD techniques achieve better results than other traditional neural network-based classification techniques.
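The final classification stage of such a KDD pipeline can be sketched generically, as below: each time series' event region is reduced to a few summary features and fed to a standard classifier. The synthetic series, feature set and classifier choice are illustrative assumptions and do not reproduce the VIIP project's stabilometric or EEG processing.

```python
# Generic sketch of the classification stage of a time-series KDD pipeline:
# summarize each series' event region with simple features, then train a
# classifier. The synthetic data and feature choices are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)

def event_features(series, start, end):
    """Summarize the event region [start, end) of one series."""
    event = series[start:end]
    return [event.mean(), event.std(), event.max() - event.min(),
            np.abs(np.diff(event)).mean()]

# Synthetic "patients": class 1 has a higher-amplitude event region.
X, y = [], []
for label in (0, 1):
    for _ in range(100):
        base = rng.normal(0, 1, 300)
        base[100:150] += rng.normal(3 * label, 1, 50)   # event window
        X.append(event_features(base, 100, 150))
        y.append(label)

X_train, X_test, y_train, y_test = train_test_split(np.array(X), np.array(y),
                                                    test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```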
ERIC Educational Resources Information Center
Powell, Lesley; Cheshire, Anna
2008-01-01
The purpose of this study is to adapt, deliver, and pilot test the Self-discovery Programme (SDP) for teachers in mainstream school. The study used a pre-test post-test design. Quantitative data were collected by self-administered questionnaires given to teachers at two points in time: baseline (immediately pre-SDP) and immediately post-SDP.…
Burghaus, R; Cosson, V; Cheung, SYA; Chenel, M; DellaPasqua, O; Frey, N; Hamrén, B; Harnisch, L; Ivanow, F; Kerbusch, T; Lippert, J; Milligan, PA; Rohou, S; Staab, A; Steimer, JL; Tornøe, C; Visser, SAG
2016-01-01
This document was developed to enable greater consistency in the practice, application, and documentation of Model‐Informed Drug Discovery and Development (MID3) across the pharmaceutical industry. A collection of “good practice” recommendations is assembled here in order to minimize the heterogeneity in both the quality and content of MID3 implementation and documentation. The three major objectives of this white paper are to: i) inform company decision makers how the strategic integration of MID3 can benefit R&D efficiency; ii) provide MID3 analysts with sufficient material to enhance the planning, rigor, and consistency of the application of MID3; and iii) provide regulatory authorities with substrate to develop MID3-related and/or MID3-enabled guidelines. PMID:27069774
Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB)
Ekins, Sean; Spektor, Anna Coulon; Clark, Alex M.; Dole, Krishna; Bunin, Barry A.
2016-01-01
Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public–private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project. PMID:27884746
Flagg, Jennifer L; Lane, Joseph P; Lockett, Michelle M
2013-02-15
Traditional government policies suggest that upstream investment in scientific research is necessary and sufficient to generate technological innovations. The expected downstream beneficial socio-economic impacts are presumed to occur through non-government market mechanisms. However, there is little quantitative evidence for such a direct and formulaic relationship between public investment at the input end and marketplace benefits at the impact end. Instead, the literature demonstrates that the technological innovation process involves a complex interaction between multiple sectors, methods, and stakeholders. The authors theorize that accomplishing the full process of technological innovation in a deliberate and systematic manner requires an operational-level model encompassing three underlying methods, each designed to generate knowledge outputs in different states: scientific research generates conceptual discoveries; engineering development generates prototype inventions; and industrial production generates commercial innovations. Given the critical roles of engineering and business, the entire innovation process should continuously consider the practical requirements and constraints of the commercial marketplace. The Need to Knowledge (NtK) Model encompasses the activities required to successfully generate innovations, along with associated strategies for effectively communicating knowledge outputs in all three states to the various stakeholders involved. It is intentionally grounded in evidence drawn from academic analysis to facilitate objective and quantitative scrutiny, and industry best practices to enable practical application. The Need to Knowledge (NtK) Model offers a practical, market-oriented approach that avoids the gaps, constraints and inefficiencies inherent in undirected activities and disconnected sectors. The NtK Model is a means to realizing increased returns on public investments in those science and technology programs expressly intended to generate beneficial socio-economic impacts.
2013-01-01
Background Traditional government policies suggest that upstream investment in scientific research is necessary and sufficient to generate technological innovations. The expected downstream beneficial socio-economic impacts are presumed to occur through non-government market mechanisms. However, there is little quantitative evidence for such a direct and formulaic relationship between public investment at the input end and marketplace benefits at the impact end. Instead, the literature demonstrates that the technological innovation process involves a complex interaction between multiple sectors, methods, and stakeholders. Discussion The authors theorize that accomplishing the full process of technological innovation in a deliberate and systematic manner requires an operational-level model encompassing three underlying methods, each designed to generate knowledge outputs in different states: scientific research generates conceptual discoveries; engineering development generates prototype inventions; and industrial production generates commercial innovations. Given the critical roles of engineering and business, the entire innovation process should continuously consider the practical requirements and constraints of the commercial marketplace. The Need to Knowledge (NtK) Model encompasses the activities required to successfully generate innovations, along with associated strategies for effectively communicating knowledge outputs in all three states to the various stakeholders involved. It is intentionally grounded in evidence drawn from academic analysis to facilitate objective and quantitative scrutiny, and industry best practices to enable practical application. Summary The Need to Knowledge (NtK) Model offers a practical, market-oriented approach that avoids the gaps, constraints and inefficiencies inherent in undirected activities and disconnected sectors. The NtK Model is a means to realizing increased returns on public investments in those science and technology programs expressly intended to generate beneficial socio-economic impacts. PMID:23414369
NASA Astrophysics Data System (ADS)
Ganzert, Steven; Guttmann, Josef; Steinmann, Daniel; Kramer, Stefan
Lung protective ventilation strategies reduce the risk of ventilator-associated lung injury. To develop such strategies, knowledge about the mechanical properties of the mechanically ventilated human lung is essential. This study was designed to develop an equation discovery system to identify mathematical models of the respiratory system in time-series data obtained from mechanically ventilated patients. Two techniques were combined: (i) the use of declarative bias to reduce search-space complexity and inherently provide for the processing of background knowledge, and (ii) a newly developed heuristic for traversing the hypothesis space with a greedy, randomized strategy analogous to the GSAT algorithm. In 96.8% of all runs the equation discovery system was able to detect the well-established equation-of-motion model of the respiratory system in the provided data. We see the potential of this semi-automatic approach to detect more complex mathematical descriptions of the respiratory system from respiratory data.
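For context, the single-compartment "equation of motion" of the respiratory system that such a system rediscovers is conventionally written as follows (standard respiratory-mechanics notation, not taken from the abstract):

```latex
P_{aw}(t) = R\,\dot{V}(t) + E\,V(t) + P_{0}
```

where P_aw is airway pressure, R resistance, E elastance, V̇ flow, V the volume above functional residual capacity, and P_0 the total end-expiratory pressure.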
Antisense oligonucleotide technologies in drug discovery.
Aboul-Fadl, Tarek
2006-09-01
The principle of antisense oligonucleotide (AS-OD) technologies is based on the specific inhibition of unwanted gene expression by blocking mRNA activity. It has long appeared to be an ideal strategy to leverage new genomic knowledge for drug discovery and development. In recent years, AS-OD technologies have been widely used as potent and promising tools for this purpose. There is a rapid increase in the number of antisense molecules progressing in clinical trials. AS-OD technologies provide a simple and efficient approach for drug discovery and development and are expected to become a reality in the near future. This editorial describes the established and emerging AS-OD technologies in drug discovery.
100 years of elementary particles [Beam Line, vol. 27, issue 1, Spring 1997
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pais, Abraham; Weinberg, Steven; Quigg, Chris
1997-04-01
This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.
100 years of Elementary Particles [Beam Line, vol. 27, issue 1, Spring 1997
DOE R&D Accomplishments Database
Pais, Abraham; Weinberg, Steven; Quigg, Chris; Riordan, Michael; Panofsky, Wolfgang K. H.; Trimble, Virginia
1997-04-01
This issue of Beam Line commemorates the 100th anniversary of the April 30, 1897 report of the discovery of the electron by J.J. Thomson and the ensuing discovery of other subatomic particles. In the first three articles, theorists Abraham Pais, Steven Weinberg, and Chris Quigg provide their perspectives on the discoveries of elementary particles as well as the implications and future directions resulting from these discoveries. In the following three articles, Michael Riordan, Wolfgang Panofsky, and Virginia Trimble apply our knowledge about elementary particles to high-energy research, electronics technology, and understanding the origin and evolution of our Universe.
A knowledgebase system to enhance scientific discovery: Telemakus
Fuller, Sherrilynne S; Revere, Debra; Bugni, Paul F; Martin, George M
2004-01-01
Background With the rapid expansion of scientific research, the ability to effectively find or integrate new domain knowledge in the sciences is proving increasingly difficult. Efforts to improve and speed up scientific discovery are being explored on a number of fronts. However, much of this work is based on traditional search and retrieval approaches and the bibliographic citation presentation format remains unchanged. Methods Case study. Results The Telemakus KnowledgeBase System provides flexible new tools for creating knowledgebases to facilitate retrieval and review of scientific research reports. In formalizing the representation of the research methods and results of scientific reports, Telemakus offers a potential strategy to enhance the scientific discovery process. While other research has demonstrated that aggregating and analyzing research findings across domains augments knowledge discovery, the Telemakus system is unique in combining document surrogates with interactive concept maps of linked relationships across groups of research reports. Conclusion Based on how scientists conduct research and read the literature, the Telemakus KnowledgeBase System brings together three innovations in analyzing, displaying and summarizing research reports across a domain: (1) research report schema, a document surrogate of extracted research methods and findings presented in a consistent and structured schema format which mimics the research process itself and provides a high-level surrogate to facilitate searching and rapid review of retrieved documents; (2) research findings, used to index the documents, allowing searchers to request, for example, research studies which have studied the relationship between neoplasms and vitamin E; and (3) visual exploration interface of linked relationships for interactive querying of research findings across the knowledgebase and graphical displays of what is known as well as, through gaps in the map, what is yet to be tested. The rationale and system architecture are described and plans for the future are discussed. PMID:15507158
Li, Jin; Zheng, Le; Uchiyama, Akihiko; Bin, Lianghua; Mauro, Theodora M; Elias, Peter M; Pawelczyk, Tadeusz; Sakowicz-Burkiewicz, Monika; Trzeciak, Magdalena; Leung, Donald Y M; Morasso, Maria I; Yu, Peng
2018-06-13
A large volume of biological data is being generated for studying mechanisms of various biological processes. These precious data enable large-scale computational analyses to gain biological insights. However, it remains a challenge to mine the data efficiently for knowledge discovery. The heterogeneity of these data makes it difficult to consistently integrate them, slowing down the process of biological discovery. We introduce a data processing paradigm to identify key factors in biological processes via systematic collection of gene expression datasets, primary analysis of data, and evaluation of consistent signals. To demonstrate its effectiveness, our paradigm was applied to epidermal development and identified many genes that play a potential role in this process. Besides the known epidermal development genes, a substantial proportion of the identified genes are still not supported by gain- or loss-of-function studies, yielding many novel genes for future studies. Among them, we selected a top gene for loss-of-function experimental validation and confirmed its function in epidermal differentiation, proving the ability of this paradigm to identify new factors in biological processes. In addition, this paradigm revealed many key genes in cold-induced thermogenesis using data from cold-challenged tissues, demonstrating its generalizability. This paradigm can lead to fruitful results for studying molecular mechanisms in an era of explosive accumulation of publicly available biological data.
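The abstract summarizes the paradigm (collect datasets, analyze each, keep consistent signals) without implementation details; the following is a purely hypothetical sketch of the "consistent signal" step, with all gene names and the consistency threshold invented for illustration.

```python
from collections import Counter

# per_dataset_hits: for each gene-expression dataset, the set of genes called
# significant by the primary per-dataset analysis (toy data).
per_dataset_hits = [
    {"KRT1", "KRT10", "LOR", "FLG"},
    {"KRT1", "LOR", "IVL"},
    {"KRT1", "FLG", "LOR"},
]

# Count in how many datasets each gene is a hit.
counts = Counter(g for hits in per_dataset_hits for g in hits)

min_datasets = 2  # hypothetical consistency threshold
consistent = sorted(g for g, n in counts.items() if n >= min_datasets)
print(consistent)  # genes with a consistent signal across datasets
```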
Novel ageing-biomarker discovery using data-intensive technologies.
Griffiths, H R; Augustyniak, E M; Bennett, S J; Debacq-Chainiaux, F; Dunston, C R; Kristensen, P; Melchjorsen, C J; Navarrete, Santos A; Simm, A; Toussaint, O
2015-11-01
Ageing is accompanied by many visible characteristics. Other biological and physiological markers are also well-described e.g. loss of circulating sex hormones and increased inflammatory cytokines. Biomarkers for healthy ageing studies are presently predicated on existing knowledge of ageing traits. The increasing availability of data-intensive methods enables deep-analysis of biological samples for novel biomarkers. We have adopted two discrete approaches in MARK-AGE Work Package 7 for biomarker discovery; (1) microarray analyses and/or proteomics in cell systems e.g. endothelial progenitor cells or T cell ageing including a stress model; and (2) investigation of cellular material and plasma directly from tightly-defined proband subsets of different ages using proteomic, transcriptomic and miR array. The first approach provided longitudinal insight into endothelial progenitor and T cell ageing. This review describes the strategy and use of hypothesis-free, data-intensive approaches to explore cellular proteins, miR, mRNA and plasma proteins as healthy ageing biomarkers, using ageing models and directly within samples from adults of different ages. It considers the challenges associated with integrating multiple models and pilot studies as rational biomarkers for a large cohort study. From this approach, a number of high-throughput methods were developed to evaluate novel, putative biomarkers of ageing in the MARK-AGE cohort. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
NASA's Hubble Celebrates 21st Anniversary with "Rose" of Galaxies
2017-12-08
NASA image release April 20, 2011 To see a video of this image go here: www.flickr.com/photos/gsfc/5637796622 To celebrate the 21st anniversary of the Hubble Space Telescope's deployment into space, astronomers at the Space Telescope Science Institute in Baltimore, Md., pointed Hubble's eye at an especially photogenic pair of interacting galaxies called Arp 273. The larger of the spiral galaxies, known as UGC 1810, has a disk that is distorted into a rose-like shape by the gravitational tidal pull of the companion galaxy below it, known as UGC 1813. This image is a composite of Hubble Wide Field Camera 3 data taken on December 17, 2010, with three separate filters that allow a broad range of wavelengths covering the ultraviolet, blue, and red portions of the spectrum. Hubble was launched April 24, 1990, aboard Discovery's STS-31 mission. Hubble discoveries revolutionized nearly all areas of current astronomical research from planetary science to cosmology. Credit: NASA, ESA, and the Hubble Heritage Team (STScI/AURA) To read more about this image go here: www.nasa.gov/mission_pages/hubble/science/hubble-rose.html NASA Goddard Space Flight Center enables NASA’s mission through four scientific endeavors: Earth Science, Heliophysics, Solar System Exploration, and Astrophysics. Goddard plays a leading role in NASA’s accomplishments by contributing compelling scientific knowledge to advance the Agency’s mission.
Discovering Free Energy Basins for Macromolecular Systems via Guided Multiscale Simulation
Sereda, Yuriy V.; Singharoy, Abhishek B.; Jarrold, Martin F.; Ortoleva, Peter J.
2012-01-01
An approach for the automated discovery of low free energy states of macromolecular systems is presented. The method does not involve delineating the entire free energy landscape but proceeds in a sequential free energy minimizing state discovery, i.e., it first discovers one low free energy state and then automatically seeks a distinct neighboring one. These states and the associated ensembles of atomistic configurations are characterized by coarse-grained variables capturing the large-scale structure of the system. A key facet of our approach is the identification of such coarse-grained variables. Evolution of these variables is governed by Langevin dynamics driven by thermal-average forces and mediated by diffusivities, both of which are constructed by an ensemble of short molecular dynamics runs. In the present approach, the thermal-average forces are modified to account for the entropy changes following from our knowledge of the free energy basins already discovered. Such forces guide the system away from the known free energy minima, over free energy barriers, and to a new one. The theory is demonstrated for lactoferrin, known to have multiple energy-minimizing structures. The approach is validated using experimental structures and traditional molecular dynamics. The method can be generalized to enable the interpretation of nanocharacterization data (e.g., ion mobility – mass spectrometry, atomic force microscopy, chemical labeling, and nanopore measurements). PMID:22423635
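The abstract does not reproduce the working equations; a generic overdamped Langevin form of the kind described (our notation: Φ the coarse-grained variables, D the diffusivities, f̄ the thermal-average forces, β = 1/k_BT) would read:

```latex
\frac{d\Phi_k}{dt} = \sum_l D_{kl}\,\beta\,\bar{f}_l(\Phi) + \xi_k(t),
\qquad
\langle \xi_k(t)\,\xi_l(t') \rangle = 2 D_{kl}\,\delta(t - t')
```

with the thermal-average forces supplemented by a bias that accounts for basins already discovered, so the dynamics are guided over barriers toward new minima, as the abstract describes.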
A network model of knowledge accumulation through diffusion and upgrade
NASA Astrophysics Data System (ADS)
Zhuang, Enyu; Chen, Guanrong; Feng, Gang
2011-07-01
In this paper, we introduce a model to describe knowledge accumulation through knowledge diffusion and knowledge upgrade in a multi-agent network. Here, knowledge diffusion refers to the distribution of existing knowledge in the network, while knowledge upgrade means the discovery of new knowledge. It is found that the population of the network and the number of each agent’s neighbors affect the speed of knowledge accumulation. Four different policies for updating the neighboring agents are thus proposed, and their influence on the speed of knowledge accumulation and on the topology evolution of the network is also studied.
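The abstract gives no equations; as an illustration only, agent-based dynamics of this general flavor (diffusion of existing knowledge between neighbors plus occasional upgrade) could be simulated as below. All parameters, update rules and the use of networkx are hypothetical and not taken from the paper.

```python
import random
import networkx as nx  # assumption: any graph library would do


def simulate(n_agents=100, k_neighbors=4, steps=10000, p_upgrade=0.01, seed=0):
    """Toy model: agents copy knowledge from neighbors (diffusion)
    and occasionally discover new knowledge (upgrade)."""
    random.seed(seed)
    g = nx.watts_strogatz_graph(n_agents, k_neighbors, 0.1, seed=seed)
    knowledge = {i: 0.0 for i in g.nodes}  # each agent's knowledge stock
    for _ in range(steps):
        agent = random.choice(list(g.nodes))
        if random.random() < p_upgrade:
            knowledge[agent] += 1.0  # upgrade: discovery of new knowledge
        else:
            nbrs = list(g.neighbors(agent))
            if nbrs:
                other = random.choice(nbrs)
                # diffusion: both agents end up with the higher of their levels
                top = max(knowledge[agent], knowledge[other])
                knowledge[agent] = knowledge[other] = top
    return sum(knowledge.values()) / n_agents  # average knowledge level


if __name__ == "__main__":
    print(simulate())
```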
Interfaith Education: An Islamic Perspective
ERIC Educational Resources Information Center
Pallavicini, Yahya Sergio Yahe
2016-01-01
According to a teaching of the Prophet Muhammad, "the quest for knowledge is the duty of each Muslim, male or female", where knowledge is meant as the discovery of the real value of things and of oneself in relationship with the world in which God has placed us. This universal dimension of knowledge is in fact a wealth of wisdom of the…
Time-course human urine proteomics in space-flight simulation experiments.
Binder, Hans; Wirth, Henry; Arakelyan, Arsen; Lembcke, Kathrin; Tiys, Evgeny S; Ivanisenko, Vladimir A; Kolchanov, Nikolay A; Kononikhin, Alexey; Popov, Igor; Nikolaev, Evgeny N; Pastushkova, Lyudmila; Larina, Irina M
2014-01-01
Long-term space travel simulation experiments have enabled the discovery of different aspects of human metabolism, such as the complexity of NaCl salt balance. Detailed proteomics data were collected during the Mars105 isolation experiment, enabling a deeper insight into the molecular processes involved. We studied the abundance of about two thousand proteins extracted from urine samples of six volunteers, collected weekly during a 105-day isolation experiment under controlled dietary conditions including progressive reduction of salt consumption. Machine learning using self-organizing maps (SOM) in combination with different analysis tools was applied to describe the time trajectories of protein abundance in urine. The method enables a personalized and intuitive view of the physiological state of the volunteers. The abundance of more than half of the proteins measured clearly changes over the course of the experiment. The trajectory splits roughly into three time ranges: an early (weeks 1-6), an intermediate (weeks 7-11) and a late one (weeks 12-15). Regulatory modes associated with distinct biological processes were identified from previous knowledge by applying enrichment and pathway flow analysis. Early protein activation modes can be related to immune response and inflammatory processes, activation at intermediate times to developmental and proliferative processes, and late activations to stress and responses to chemicals. The protein abundance profiles support previous results about alternative mechanisms of salt storage in an osmotically inactive form. We hypothesize that reduced NaCl consumption of about 6 g/day will presumably reduce or even prevent the activation of the inflammatory processes observed in the early time range of isolation. SOM machine learning in combination with analysis methods for class discovery and functional annotation enables the straightforward analysis of complex proteomics data sets generated by means of mass spectrometry.
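The paper does not name a software package; purely as an illustration of fitting a SOM to a protein-abundance matrix, the minisom package could be used as follows (the package choice, map size and placeholder data are our assumptions, not the authors').

```python
import numpy as np
from minisom import MiniSom  # assumption: one of several available SOM implementations

# rows = urine samples (e.g. 6 subjects x 15 weekly time points), columns = protein abundances
data = np.random.rand(90, 2000)  # placeholder matrix for ~2000 proteins
data = (data - data.mean(axis=0)) / (data.std(axis=0) + 1e-9)  # z-score each protein

som = MiniSom(x=10, y=10, input_len=data.shape[1],
              sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(data)
som.train_random(data, num_iteration=5000)

# Map each sample to its best-matching unit; the sequence of units over time
# traces a per-subject trajectory across the map.
bmus = [som.winner(sample) for sample in data]
print(bmus[:5])
```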
A metabolic pathway for catabolizing levulinic acid in bacteria
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rand, Jacqueline M.; Pisithkul, Tippapha; Clark, Ryan L.
Microorganisms can catabolize a wide range of organic compounds and therefore have the potential to perform many industrially relevant bioconversions. One barrier to realizing the potential of biorefining strategies lies in our incomplete knowledge of metabolic pathways, including those that can be used to assimilate naturally abundant or easily generated feedstocks. For instance, levulinic acid (LA) is a carbon source that is readily obtainable as a dehydration product of lignocellulosic biomass and can serve as the sole carbon source for some bacteria. Yet, the genetics and structure of LA catabolism have remained unknown. Here, we report the identification and characterization of a seven-gene operon that enables LA catabolism in Pseudomonas putida KT2440. When the pathway was reconstituted with purified proteins, we observed the formation of four acyl-CoA intermediates, including a unique 4-phosphovaleryl-CoA and the previously observed 3-hydroxyvaleryl-CoA product. Using adaptive evolution, we obtained a mutant of Escherichia coli LS5218 with functional deletions of fadE and atoC that was capable of robust growth on LA when it expressed the five enzymes from the P. putida operon. This discovery will enable more efficient use of biomass hydrolysates and metabolic engineering to develop bioconversions using LA as a feedstock.
A metabolic pathway for catabolizing levulinic acid in bacteria
Rand, Jacqueline M.; Pisithkul, Tippapha; Clark, Ryan L.; ...
2017-09-25
Microorganisms can catabolize a wide range of organic compounds and therefore have the potential to perform many industrially relevant bioconversions. One barrier to realizing the potential of biorefining strategies lies in our incomplete knowledge of metabolic pathways, including those that can be used to assimilate naturally abundant or easily generated feedstocks. For instance, levulinic acid (LA) is a carbon source that is readily obtainable as a dehydration product of lignocellulosic biomass and can serve as the sole carbon source for some bacteria. Yet, the genetics and structure of LA catabolism have remained unknown. Here, we report the identification and characterization of a seven-gene operon that enables LA catabolism in Pseudomonas putida KT2440. When the pathway was reconstituted with purified proteins, we observed the formation of four acyl-CoA intermediates, including a unique 4-phosphovaleryl-CoA and the previously observed 3-hydroxyvaleryl-CoA product. Using adaptive evolution, we obtained a mutant of Escherichia coli LS5218 with functional deletions of fadE and atoC that was capable of robust growth on LA when it expressed the five enzymes from the P. putida operon. This discovery will enable more efficient use of biomass hydrolysates and metabolic engineering to develop bioconversions using LA as a feedstock.
NASA Astrophysics Data System (ADS)
Yuan, Yifei; Amine, Khalil; Lu, Jun; Shahbazian-Yassar, Reza
2017-08-01
An in-depth understanding of material behaviours under complex electrochemical environment is critical for the development of advanced materials for the next-generation rechargeable ion batteries. The dynamic conditions inside a working battery had not been intensively explored until the advent of various in situ characterization techniques. Real-time transmission electron microscopy of electrochemical reactions is one of the most significant breakthroughs poised to enable a radical shift in our knowledge of how materials behave in the electrochemical environment. This review, therefore, summarizes the scientific discoveries enabled by in situ transmission electron microscopy, and specifically emphasizes the applicability of this technique to address the critical challenges in rechargeable ion battery electrodes, electrolytes and their interfaces. New electrochemical systems such as lithium-oxygen, lithium-sulfur and sodium ion batteries are included, considering the rapidly increasing application of in situ transmission electron microscopy in these areas. A systematic comparison between lithium ion-based electrochemistry and sodium ion-based electrochemistry is also given in terms of their thermodynamic and kinetic differences. The effect of the electron beam on the validity of in situ observation is also covered. This review concludes by providing a renewed perspective for the future directions of in situ transmission electron microscopy in rechargeable ion batteries.
Using Linked Open Data and Semantic Integration to Search Across Geoscience Repositories
NASA Astrophysics Data System (ADS)
Mickle, A.; Raymond, L. M.; Shepherd, A.; Arko, R. A.; Carbotte, S. M.; Chandler, C. L.; Cheatham, M.; Fils, D.; Hitzler, P.; Janowicz, K.; Jones, M.; Krisnadhi, A.; Lehnert, K. A.; Narock, T.; Schildhauer, M.; Wiebe, P. H.
2014-12-01
The MBLWHOI Library is a partner in the OceanLink project, an NSF EarthCube Building Block, applying semantic technologies to enable knowledge discovery, sharing and integration. OceanLink is testing ontology design patterns that link together: two data repositories, Rolling Deck to Repository (R2R) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO); the MBLWHOI Library Institutional Repository (IR) Woods Hole Open Access Server (WHOAS); National Science Foundation (NSF) funded awards; and American Geophysical Union (AGU) conference presentations. The Library is collaborating with scientific users, data managers, DSpace engineers, experts in ontology design patterns, and user interface developers to make WHOAS, a DSpace repository, linked open data enabled. The goal is to allow searching across repositories without any of the information providers having to change how they manage their collections. The tools developed for DSpace will be made available to the community of users. There are 257 registered DSpace repositories in the United States and over 1700 worldwide. Outcomes include: integration of DSpace with the OpenRDF Sesame triple store to provide a SPARQL endpoint for the storage and query of RDF representations of DSpace resources; mapping of DSpace resources to the OceanLink ontology; and a DSpace "data" add-on to provide resolvable linked open data representations of DSpace resources.
Moser, Richard P.; Hesse, Bradford W.; Shaikh, Abdul R.; Courtney, Paul; Morgan, Glen; Augustson, Erik; Kobrin, Sarah; Levin, Kerry; Helba, Cynthia; Garner, David; Dunn, Marsha; Coa, Kisha
2011-01-01
Scientists are taking advantage of the Internet and collaborative web technology to accelerate discovery in a massively connected, participative environment, a phenomenon referred to by some as Science 2.0. As a new way of doing science, this phenomenon has the potential to push science forward in a more efficient manner than was previously possible. The Grid-Enabled Measures (GEM) database has been conceptualized as an instantiation of Science 2.0 principles by the National Cancer Institute with two overarching goals: (1) promote the use of standardized measures, which are tied to theoretically based constructs; and (2) facilitate the ability to share harmonized data resulting from the use of standardized measures. This is done by creating an online venue connected to the Cancer Biomedical Informatics Grid (caBIG®) where a virtual community of researchers can collaborate and come to consensus on measures by rating, commenting and viewing meta-data about the measures and associated constructs. This paper will describe the web 2.0 principles on which the GEM database is based, describe its functionality, and discuss some of the important issues involved with creating the GEM database, such as the role of mutually agreed-on ontologies (i.e., knowledge categories and the relationships among these categories) for data sharing. PMID:21521586
Yuan, Yifei; Amine, Khalil; Lu, Jun; Shahbazian-Yassar, Reza
2017-01-01
An in-depth understanding of material behaviours under complex electrochemical environment is critical for the development of advanced materials for the next-generation rechargeable ion batteries. The dynamic conditions inside a working battery had not been intensively explored until the advent of various in situ characterization techniques. Real-time transmission electron microscopy of electrochemical reactions is one of the most significant breakthroughs poised to enable a radical shift in our knowledge of how materials behave in the electrochemical environment. This review, therefore, summarizes the scientific discoveries enabled by in situ transmission electron microscopy, and specifically emphasizes the applicability of this technique to address the critical challenges in rechargeable ion battery electrodes, electrolytes and their interfaces. New electrochemical systems such as lithium–oxygen, lithium–sulfur and sodium ion batteries are included, considering the rapidly increasing application of in situ transmission electron microscopy in these areas. A systematic comparison between lithium ion-based electrochemistry and sodium ion-based electrochemistry is also given in terms of their thermodynamic and kinetic differences. The effect of the electron beam on the validity of in situ observation is also covered. This review concludes by providing a renewed perspective for the future directions of in situ transmission electron microscopy in rechargeable ion batteries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yue, Peng; Gong, Jianya; Di, Liping
A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information and discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and made discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.
NASA Astrophysics Data System (ADS)
Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham
2016-11-01
New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and set of delivery tools built to meet them. At the BGS, relational database management systems (RDBMS) have facilitated effective data management using normalised, subject-based database designs with business rules in a centralised, vocabulary-controlled architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery, as they struggled to resolve the complex underlying normalised structures, giving poor data access speeds. Users developed bespoke access tools to structures they did not fully understand, sometimes obtaining incorrect results. BGS has therefore developed PropBase, a generic denormalised data structure within an RDBMS for storing property data, to facilitate rapid and standardised data discovery and access, incorporating 2D and 3D physical and chemical property data with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture is an optimised query object that delivers geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance to deliver data in multiple standardised formats using a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources, facilitating searches across related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases, incorporating over 50 property data types with a vocabulary set of 557 property terms. By enabling property data searches across multiple databases, PropBase has facilitated new scientific research previously considered impractical. PropBase is easily extended to incorporate 4D (time series) data and is providing a baseline for new "big data" monitoring projects.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-20
... both sides would participate in an Exchange Auction, this proposed change would aid in price discovery... auction price. This proposed change would aid in price discovery and help to reduce the likelihood of... Sell Shares and, therefore, a User would never have complete knowledge of liquidity available on both...
ERIC Educational Resources Information Center
Yu, Pulan
2012-01-01
Classification, clustering and association mining are major tasks of data mining and have been widely used for knowledge discovery. Associative classification mining, the combination of both association rule mining and classification, has emerged as an indispensable way to support decision making and scientific research. In particular, it offers a…
ERIC Educational Resources Information Center
Silbersack, Elionora W.
2014-01-01
The purpose of this qualitative study was to expand the scarce information available on how mothers first observe their children's early development, assess potential problems, and then come to recognize their concerns. In-depth knowledge about mothers' perspectives on the discovery process can help social workers to promote identification of…
Augmented Reality-Based Simulators as Discovery Learning Tools: An Empirical Study
ERIC Educational Resources Information Center
Ibáñez, María-Blanca; Di-Serio, Ángela; Villarán-Molina, Diego; Delgado-Kloos, Carlos
2015-01-01
This paper reports empirical evidence on having students use AR-SaBEr, a simulation tool based on augmented reality (AR), to discover the basic principles of electricity through a series of experiments. AR-SaBEr was enhanced with knowledge-based support and inquiry-based scaffolding mechanisms, which proved useful for discovery learning in…
76 FR 36320 - Rules of Practice in Proceedings Relative to False Representation and Lottery Orders
Federal Register 2010, 2011, 2012, 2013, 2014
2011-06-22
... officers. 952.18 Evidence. 952.19 Subpoenas. 952.20 Witness fees. 952.21 Discovery. 952.22 Transcript. 952..., motions, proposed orders, and other documents for the record. Discovery need not be filed except as may be... witnesses, that the statement correctly states the witness's opinion or knowledge concerning the matters in...
Making the Long Tail Visible: Social Networking Sites and Independent Music Discovery
ERIC Educational Resources Information Center
Gaffney, Michael; Rafferty, Pauline
2009-01-01
Purpose: The purpose of this paper is to investigate users' knowledge and use of social networking sites and folksonomies to discover if social tagging and folksonomies, within the area of independent music, aid in its information retrieval and discovery. The sites examined in this project are MySpace, Lastfm, Pandora and Allmusic. In addition,…
Towards a Conceptual Design of a Cross-Domain Integrative Information System for the Geosciences
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Richard, S. M.; Valentine, D. W.; Malik, T.; Gupta, A.
2013-12-01
As geoscientists increasingly focus on studying processes that span multiple research domains, there is an increased need for cross-domain interoperability solutions that can scale to the entire geosciences, bridging information and knowledge systems, models, software tools, as well as connecting researchers and organizations. Creating a community-driven cyberinfrastructure (CI) to address the grand challenges of integrative Earth science research and education is the focus of EarthCube, a new research initiative of the U.S. National Science Foundation. We are approaching EarthCube design as a complex socio-technical system of systems, in which communication between various domain subsystems, people and organizations enables more comprehensive, data-intensive research designs and knowledge sharing. In particular, we focus on integrating 'traditional' layered CI components - including information sources, catalogs, vocabularies, services, analysis and modeling tools - with CI components supporting scholarly communication, self-organization and social networking (e.g. research profiles, Q&A systems, annotations), in a manner that follows and enhances existing patterns of data, information and knowledge exchange within and across geoscience domains. We describe an initial architecture design focused on enabling the CI to (a) provide an environment for scientifically sound information and software discovery and reuse; (b) evolve by factoring in the impact of maturing movements like linked data, 'big data', and social collaborations, as well as experience from work on large information systems in other domains; (c) handle the ever increasing volume, complexity and diversity of geoscience information; (d) incorporate new information and analytical requirements, tools, and techniques, and emerging types of earth observations and models; (e) accommodate different ideas and approaches to research and data stewardship; (f) be responsive to the existing and anticipated needs of researchers and organizations representing both established and emerging CI users; and (g) make best use of NSF's current investment in the geoscience CI. The presentation will focus on the challenges and methodology of EarthCube CI design, in particular on supporting social engagement and interaction between geoscientists and computer scientists as a core function of EarthCube architecture. This capability must include mechanisms to not only locate and integrate available geoscience resources, but also engage individuals and projects, research products and publications, and enable efficient communication across many EarthCube stakeholders leading to long-term institutional alignment and trusted collaborations.
Experiments for Modern Introductory Chemistry.
ERIC Educational Resources Information Center
Kildahl, Nicholas; Berka, Ladislav H.
1995-01-01
Presents a headspace gas chromatography experiment that enables discovery of the temperature dependence of the vapor pressure of a pure liquid. Illustrates liquid-vapor phase equilibrium of pure liquids. Contains 22 references. (JRH)
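The temperature dependence recovered in such an experiment is conventionally analyzed with the Clausius-Clapeyron relation (standard form; not stated in the abstract itself):

```latex
\ln\frac{P_2}{P_1} = -\frac{\Delta H_{\mathrm{vap}}}{R}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)
```

so a plot of ln P against 1/T gives the enthalpy of vaporization from its slope.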
2016-01-01
Observations of individual organisms (data) can be combined with expert ecological knowledge of species, especially causal knowledge, to model and extract from flower–visiting data useful information about behavioral interactions between insect and plant organisms, such as nectar foraging and pollen transfer. We describe and evaluate a method to elicit and represent such expert causal knowledge of behavioral ecology, and discuss the potential for wider application of this method to the design of knowledge-based systems for knowledge discovery in biodiversity and ecosystem informatics. PMID:27851814
Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong
Graph analysis is now considered a promising technique to discover useful knowledge in data with a new perspective. We envision that there are two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM), where each respectively focuses on subgraph pattern matching and automatic knowledge discovery in graphs. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis which covers both OLGAP and GM in a single system is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existing graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which were originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach makes a wide range of available RDF data sets directly applicable for holistic graph analysis within a single system. For validation of our approach, we evaluate the performance of our implementations with nine real-world datasets and three different computing environments - a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimental results show that our implementation can provide promising and scalable performance for real-world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.
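The paper expresses graph-mining algorithms directly in SPARQL; as a hedged illustration of the general flavor of such queries (the endpoint URL is hypothetical and the query is ours, not one of the paper's six algorithms), a simple node-degree computation could be issued from Python with SPARQLWrapper:

```python
from SPARQLWrapper import SPARQLWrapper, JSON  # assumption: any SPARQL client would do

# Hypothetical endpoint; replace with a real triplestore URL.
sparql = SPARQLWrapper("http://localhost:8890/sparql")
sparql.setReturnFormat(JSON)

# Degree of each node: count triples in which it appears as subject or object.
sparql.setQuery("""
SELECT ?node (COUNT(*) AS ?degree)
WHERE {
  { ?node ?p ?o . } UNION { ?s ?p ?node . FILTER(isIRI(?node)) }
}
GROUP BY ?node
ORDER BY DESC(?degree)
LIMIT 10
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["node"]["value"], row["degree"]["value"])
```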
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component of large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputers needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.
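For orientation only, libvirt's Python bindings offer the kind of programmatic control over QEMU/KVM described; the minimal sketch below (stock qemu:///system URI, nothing Cray-specific, not the authors' tooling) simply connects to the local hypervisor and lists running domains.

```python
import libvirt  # assumption: the standard libvirt Python bindings

# Connect to the local QEMU/KVM hypervisor on a compute node.
conn = libvirt.open("qemu:///system")

# A per-node domain definition would normally be generated when provisioning
# a virtual cluster; here we only list what is already running as a smoke test.
for dom in conn.listAllDomains():
    state, _reason = dom.state()
    print(dom.name(), "running" if state == libvirt.VIR_DOMAIN_RUNNING else state)

conn.close()
```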
Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis
Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong; ...
2016-01-01
Graph analysis is now considered a promising technique to discover useful knowledge in data with a new perspective. We envision that there are two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM), where each respectively focuses on subgraph pattern matching and automatic knowledge discovery in graphs. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis which covers both OLGAP and GM in a single system is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existing graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which were originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach makes a wide range of available RDF data sets directly applicable for holistic graph analysis within a single system. For validation of our approach, we evaluate the performance of our implementations with nine real-world datasets and three different computing environments - a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimental results show that our implementation can provide promising and scalable performance for real-world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.
Discovery learning model with geogebra assisted for improvement mathematical visual thinking ability
NASA Astrophysics Data System (ADS)
Juandi, D.; Priatna, N.
2018-05-01
The main goal of this study is to improve the mathematical visual thinking ability of high school students through implementation of the Discovery Learning Model with GeoGebra assistance. This objective was pursued through a quasi-experimental study with a non-random pretest-posttest control design. The sample consisted of 62 grade XI senior high school students in one school in Bandung district. Data were collected through documentation, observation, written tests, interviews, daily journals, and student worksheets. The results of this study are: 1) the improvement in mathematical visual thinking ability of students who learn with the Discovery Learning Model with GeoGebra assistance is significantly higher than that of students who receive conventional learning; 2) there are differences in the improvement of students' mathematical visual thinking ability between treatment groups based on prior mathematical knowledge (high, medium, and low); 3) the improvement of the high prior-knowledge group is significantly greater than that of the medium and low groups; and 4) the quality of improvement for students with high and low prior knowledge is in the moderate category, while students with medium prior knowledge achieve improvement in the high category.
Systematic identification of latent disease-gene associations from PubMed articles.
Zhang, Yuji; Shen, Feichen; Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang
2018-01-01
Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated, noisy associations, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research.
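As a hedged sketch of the LDA step only (the authors do not name an implementation; gensim and the toy gene lists below are our assumptions), disease documents represented as bags of associated genes could be grouped into topics like this:

```python
from gensim import corpora, models  # assumption: gensim's LDA as a stand-in implementation

# Each "document" is the bag of gene symbols associated with one disease (toy data).
docs = [
    ["BRCA1", "BRCA2", "TP53"],
    ["TP53", "EGFR", "KRAS"],
    ["APP", "PSEN1", "APOE"],
]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary,
                      passes=10, random_state=0)

# Topic mixture for each disease document; similar diseases share dominant topics.
for i, bow in enumerate(corpus):
    print(i, lda.get_document_topics(bow))
```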
Danchin, Antoine; Ouzounis, Christos; Tokuyasu, Taku; Zucker, Jean-Daniel
2018-07-01
Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from 'the sequence tells the structure tells the function' fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Systematic identification of latent disease-gene associations from PubMed articles
Mojarad, Majid Rastegar; Li, Dingcheng; Liu, Sijia; Tao, Cui; Yu, Yue; Liu, Hongfang
2018-01-01
Recent scientific advances have accumulated a tremendous amount of biomedical knowledge providing novel insights into the relationship between molecular and cellular processes and diseases. Literature mining is one of the commonly used methods to retrieve and extract information from scientific publications for understanding these associations. However, due to the large data volume and complicated, noisy associations, the interpretability of such association data for semantic knowledge discovery is challenging. In this study, we describe an integrative computational framework aiming to expedite the discovery of latent disease mechanisms by dissecting 146,245 disease-gene associations from over 25 million PubMed-indexed articles. We take advantage of both Latent Dirichlet Allocation (LDA) modeling and network-based analysis for their capabilities of detecting latent associations and reducing noise in large-volume data, respectively. Our results demonstrate that (1) the LDA-based modeling is able to group similar diseases into disease topics; (2) the disease-specific association networks follow the scale-free network property; (3) certain subnetwork patterns were enriched in the disease-specific association networks; and (4) genes were enriched in topic-specific biological processes. Our approach offers promising opportunities for latent disease-gene knowledge discovery in biomedical research. PMID:29373609
Temporal data mining for the quality assessment of hemodialysis services.
Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto
2005-05-01
This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for the assessment of the clinical performance of hemodialysis (HD) services, on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge on the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. Such methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters and the measured variables were examined by the domain experts, which were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians to improve their understanding of the patients' behavior.
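As an illustrative sketch of the support/confidence arithmetic that underlies association-rule discovery of this kind (the session labels and rule below are hypothetical, not taken from the study), once temporal abstraction has turned each dialysis session into a set of qualitative labels:

```python
# Toy sessions after temporal abstraction: each is a set of qualitative labels.
sessions = [
    {"low_blood_flow", "poor_outcome"},
    {"low_blood_flow", "poor_outcome", "hypotension"},
    {"normal_blood_flow", "good_outcome"},
    {"low_blood_flow", "good_outcome"},
]


def support(itemset):
    """Fraction of sessions containing every label in the itemset."""
    return sum(itemset <= s for s in sessions) / len(sessions)


def confidence(antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)


rule = (frozenset({"low_blood_flow"}), frozenset({"poor_outcome"}))
print("support:", support(rule[0] | rule[1]))      # 0.5
print("confidence:", confidence(*rule))            # ~0.67
```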
Infrared heterodyne spectroscopy. [for observation of thermal emission from astrophysical objects]
NASA Technical Reports Server (NTRS)
Mumma, M. J.; Kostiuk, T.; Buhl, D.; Chin, G.; Zipoy, D.
1982-01-01
Infrared heterodyne spectroscopy is an extremely useful tool for Doppler-limited studies of atomic and molecular lines in diverse astrophysical regions. The current state of the art is reviewed, and the analysis of CO2 lines in the atmosphere of Mars is outlined. Doppler-limited observations have enabled the discovery of natural laser emission in the mesosphere of Mars and of the failure of local thermodynamic equilibrium near its surface.
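The "Doppler-limited" qualifier refers to the thermal Doppler width of the observed lines, which sets the spectral resolution such measurements must reach. As a rough, textbook-level illustration (not taken from the paper), the full width at half maximum of a thermally broadened line is

$$\Delta\nu_D \;=\; \nu_0\,\sqrt{\frac{8\,k_B T\,\ln 2}{m c^{2}}},$$

where $\nu_0$ is the line-center frequency, $T$ the gas temperature and $m$ the molecular mass. For a CO2 line near 10.6 μm at an assumed Mars-like temperature of roughly 200 K, this comes out to a few tens of MHz, which the MHz-scale resolution of infrared heterodyne receivers can resolve.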
Applying flow chemistry: methods, materials, and multistep synthesis.
McQuade, D Tyler; Seeberger, Peter H
2013-07-05
The synthesis of complex molecules requires control over both chemical reactivity and reaction conditions. While reactivity drives the majority of chemical discovery, advances in reaction condition control have accelerated method development/discovery. Recent tools include automated synthesizers and flow reactors. In this Synopsis, we describe how flow reactors have enabled chemical advances in our groups in the areas of single-stage reactions, materials synthesis, and multistep reactions. In each section, we detail the lessons learned and propose future directions.
An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening.
MacConnell, Andrew B; Price, Alexander K; Paegel, Brian M
2017-03-13
DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing.
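The hit-calling arithmetic reported above can be reproduced in a short sketch: the raw false discovery rate is the fraction of inspected hit beads that were negative controls, and the replicate filter keeps only encoding sequences observed on more than two beads. The inspection counts come from the abstract; the per-sequence bead tallies below are hypothetical.

```python
# Illustrative sketch of the hit-calling arithmetic described in the abstract;
# the per-sequence bead counts below are hypothetical, only the formulas
# (FDR = FP / (FP + TP), replicate filter on bead counts) mirror the text.
from collections import Counter

# Reported inspection counts: pepstatin A (true positive) vs negative-control beads.
true_positive_beads = 1166
false_positive_beads = 366
fdr = false_positive_beads / (true_positive_beads + false_positive_beads)
print(f"raw false discovery rate ~ {fdr:.0%}")   # ~24%

# Replicate filter: count how many distinct hit beads carry each encoding
# sequence (hypothetical sequencing readout), keep sequences seen on > 2 beads.
hit_bead_sequences = (["SEQ_POS_A"] * 7 + ["SEQ_POS_B"] * 5 +
                      ["SEQ_NEG_X"] * 1 + ["SEQ_NEG_Y"] * 2)
beads_per_sequence = Counter(hit_bead_sequences)
confident_hits = {seq for seq, n in beads_per_sequence.items() if n > 2}
print("sequences passing the >2-bead replicate filter:", confident_hits)
```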
2017-01-01
The development of structure-guided drug discovery is a story of knowledge exchange where new ideas originate from all parts of the research ecosystem. Dorothy Crowfoot Hodgkin obtained insulin from Boots Pure Drug Company in the 1930s, and insulin crystallization was optimized at the company Novo in the 1950s, allowing the structure to be determined at Oxford University. The structure of renin was determined in academia, on this occasion in London, in response to a need to develop antihypertensives in pharma. The idea of a dimeric aspartic protease came from an international academic team and was discovered in HIV; it eventually led to new HIV antivirals being developed in industry. Structure-guided fragment-based discovery was developed in large pharma and biotechs, but has been exploited in academia for the development of new inhibitors targeting protein–protein interactions and also antimicrobials to combat mycobacterial infections such as tuberculosis. These observations provide a strong argument against the so-called ‘linear model’, where ideas flow only in one direction from academic institutions to industry. Structure-guided drug discovery is a story of applications of protein crystallography and knowledge exchange between academia and industry that has led to new drug approvals for cancer and other common medical conditions by the Food and Drug Administration in the USA, as well as hope for the treatment of rare genetic diseases and infectious diseases that are a particular challenge in the developing world. PMID:28875019
Choosing experiments to accelerate collective discovery
Rzhetsky, Andrey; Foster, Jacob G.; Foster, Ian T.
2015-01-01
A scientist’s choice of research problem affects his or her personal career trajectory. Scientists’ combined choices affect the direction and efficiency of scientific discovery as a whole. In this paper, we infer preferences that shape problem selection from patterns of published findings and then quantify their efficiency. We represent research problems as links between scientific entities in a knowledge network. We then build a generative model of discovery informed by qualitative research on scientific problem selection. We map salient features from this literature to key network properties: an entity’s importance corresponds to its degree centrality, and a problem’s difficulty corresponds to the network distance it spans. Drawing on millions of papers and patents published over 30 years, we use this model to infer the typical research strategy used to explore chemical relationships in biomedicine. This strategy generates conservative research choices focused on building up knowledge around important molecules. These choices become more conservative over time. The observed strategy is efficient for initial exploration of the network and supports scientific careers that require steady output, but is inefficient for science as a whole. Through supercomputer experiments on a sample of the network, we study thousands of alternatives and identify strategies much more efficient at exploring mature knowledge networks. We find that increased risk-taking and the publication of experimental failures would substantially improve the speed of discovery. We consider institutional shifts in grant making, evaluation, and publication that would help realize these efficiencies. PMID:26554009
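The mapping the abstract describes (an entity's importance as its degree centrality, a problem's difficulty as the network distance it spans) can be sketched on a toy knowledge network. The graph, candidate links and scoring rule below are hypothetical illustrations of that mapping, not the authors' generative model.

```python
# Illustrative sketch (not the authors' model): score candidate "experiments"
# (new links) on a toy knowledge network, where an entity's importance is its
# degree centrality and a problem's difficulty is the current network distance
# between the two entities the experiment would connect.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("aspirin", "COX1"), ("aspirin", "COX2"), ("COX2", "celecoxib"),
    ("COX1", "ibuprofen"), ("celecoxib", "inflammation"),
    ("inflammation", "TNF"), ("TNF", "etanercept"),
])

centrality = nx.degree_centrality(G)

def score(u, v, risk_appetite=0.5):
    """Higher combined importance favors a candidate link; larger distance
    (a riskier, more 'difficult' problem) is discounted by (1 - risk_appetite)."""
    importance = centrality[u] + centrality[v]
    difficulty = nx.shortest_path_length(G, u, v)
    return importance - (1.0 - risk_appetite) * difficulty

candidates = [("aspirin", "inflammation"), ("ibuprofen", "etanercept")]
for u, v in candidates:
    print(u, "-", v, "score:", round(score(u, v), 3))
```

Raising the hypothetical `risk_appetite` parameter weakens the distance penalty, which is one way to express the paper's finding that more risk-tolerant strategies explore a mature knowledge network more efficiently.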