Sample records for enabling information discovery

  1. 49 CFR 209.313 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 49 Transportation 4 2012-10-01 2012-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  2. 49 CFR 209.313 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 49 Transportation 4 2011-10-01 2011-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  3. 49 CFR 209.313 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 49 Transportation 4 2013-10-01 2013-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  4. 49 CFR 209.313 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 49 Transportation 4 2014-10-01 2014-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  5. 49 CFR 209.313 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 49 Transportation 4 2010-10-01 2010-10-01 false Discovery. 209.313 Section 209.313 Transportation... TRANSPORTATION RAILROAD SAFETY ENFORCEMENT PROCEDURES Disqualification Procedures § 209.313 Discovery. (a... parties. Discovery is designed to enable a party to obtain relevant information needed for preparation of...

  6. Virtual Observatories, Data Mining, and Astroinformatics

    NASA Astrophysics Data System (ADS)

    Borne, Kirk

    The historical, current, and future trends in knowledge discovery from data in astronomy are presented here. The story begins with a brief history of data gathering and data organization. A description of the development of new information science technologies for astronomical discovery is then presented. Among these are e-Science and the virtual observatory, with its data discovery, access, display, and integration protocols; astroinformatics and data mining for exploratory data analysis, information extraction, and knowledge discovery from distributed data collections; new sky surveys' databases, including rich multivariate observational parameter sets for large numbers of objects; and the emerging discipline of data-oriented astronomical research, called astroinformatics. Astroinformatics is described as the fourth paradigm of astronomical research, following the three traditional research methodologies: observation, theory, and computation/modeling. Astroinformatics research areas include machine learning, data mining, visualization, statistics, semantic science, and scientific data management. Each of these areas is now an active research discipline, with significant science-enabling applications in astronomy. Research challenges and sample research scenarios are presented in these areas, in addition to sample algorithms for data-oriented research. These information science technologies enable scientific knowledge discovery from the increasingly large and complex data collections in astronomy. The education and training of the modern astronomy student must consequently include skill development in these areas, whose practitioners have traditionally been limited to applied mathematicians, computer scientists, and statisticians. Modern astronomical researchers must cross these traditional discipline boundaries, thereby borrowing the best-of-breed methodologies from multiple disciplines.
In the era of large sky surveys and numerous large telescopes, the potential for astronomical discovery is equally large, and so the data-oriented research methods, algorithms, and techniques that are presented here will enable the greatest discovery potential from the ever-growing data and information resources in astronomy.

  7. The future of drug discovery: enabling technologies for enhancing lead characterization and profiling therapeutic potential.

    PubMed

    Janero, David R

    2014-08-01

    Technology often serves as a handmaiden and catalyst of invention. The discovery of safe, effective medications depends critically upon experimental approaches capable of providing high-impact information on the biological effects of drug candidates early in the discovery pipeline. This information can enable reliable lead identification, pharmacological compound differentiation and successful translation of research output into clinically useful therapeutics. The shallow preclinical profiling of candidate compounds promulgates a minimalistic understanding of their biological effects and undermines the level of value creation necessary for finding quality leads worth moving forward within the development pipeline with efficiency and prognostic reliability sufficient to help remediate the current pharma-industry productivity drought. Three specific technologies discussed herein, in addition to experimental areas intimately associated with contemporary drug discovery, appear to hold particular promise for strengthening the preclinical valuation of drug candidates by deepening lead characterization. These are: i) hydrogen-deuterium exchange mass spectrometry for characterizing structural and ligand-interaction dynamics of disease-relevant proteins; ii) activity-based chemoproteomics for profiling the functional diversity of mammalian proteomes; and iii) nuclease-mediated precision gene editing for developing more translatable cellular and in vivo models of human diseases. When applied in an informed manner congruent with the clinical understanding of disease processes, technologies such as these that span levels of biological organization can serve as valuable enablers of drug discovery and potentially contribute to reducing the current, unacceptably high rates of compound clinical failure.

  8. Combining data from multiple sources using the CUAHSI Hydrologic Information System

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Ames, D. P.; Horsburgh, J. S.; Goodall, J. L.

    2012-12-01

    The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has developed a Hydrologic Information System (HIS) to provide better access to data by enabling the publication, cataloging, discovery, retrieval, and analysis of hydrologic data using web services. The CUAHSI HIS is an Internet-based system composed of hydrologic databases and servers connected through web services, together with software for data publication, discovery, and access. The HIS metadata catalog lists close to 100 web services registered to provide data through this system, ranging from large federal agency data sets to experimental watersheds managed by University investigators. The system's flexibility in storing and enabling public access to similarly formatted data and metadata has created a community data resource from governmental and academic data that might otherwise remain private or analyzed only in isolation. Comprehensive understanding of hydrology requires integration of this information from multiple sources. HydroDesktop is the client application developed as part of HIS to support data discovery and access through this system. HydroDesktop is founded on an open source GIS client and has a plug-in architecture that has enabled the integration of modeling and analysis capability with the functionality for data discovery and access. Model integration is possible through a plug-in built on the OpenMI standard, and data visualization and analysis are supported by an R plug-in. This presentation will demonstrate HydroDesktop, showing how it provides an analysis environment within which data from multiple sources can be discovered, accessed and integrated.
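
    A minimal sketch of consuming a time-series response such as a HIS-style water data service might return. The XML snippet and element names below are an invented illustration, not the actual WaterML schema or a real CUAHSI endpoint.

```python
# Sketch: parsing a minimal WaterML-style time-series response.
# The sample XML is hand-written for illustration only.
import xml.etree.ElementTree as ET

SAMPLE = """
<timeSeriesResponse>
  <timeSeries siteCode="USGS:10109000" variableCode="00060">
    <value dateTime="2012-10-01T00:00:00">142.0</value>
    <value dateTime="2012-10-01T01:00:00">139.5</value>
  </timeSeries>
</timeSeriesResponse>
"""

def parse_series(xml_text):
    """Return (site, variable, [(timestamp, value), ...]) tuples."""
    root = ET.fromstring(xml_text)
    out = []
    for ts in root.findall("timeSeries"):
        values = [(v.get("dateTime"), float(v.text)) for v in ts.findall("value")]
        out.append((ts.get("siteCode"), ts.get("variableCode"), values))
    return out

series = parse_series(SAMPLE)
```

    In a real deployment the XML would arrive from a registered web service rather than a string literal; the parsing step into a uniform (site, variable, values) shape is what lets a client like HydroDesktop combine sources.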

  9. Good Practices in Model‐Informed Drug Discovery and Development: Practice, Application, and Documentation

    PubMed Central

    Burghaus, R; Cosson, V; Cheung, SYA; Chenel, M; DellaPasqua, O; Frey, N; Hamrén, B; Harnisch, L; Ivanow, F; Kerbusch, T; Lippert, J; Milligan, PA; Rohou, S; Staab, A; Steimer, JL; Tornøe, C; Visser, SAG

    2016-01-01

    This document was developed to enable greater consistency in the practice, application, and documentation of Model‐Informed Drug Discovery and Development (MID3) across the pharmaceutical industry. A collection of “good practice” recommendations are assembled here in order to minimize the heterogeneity in both the quality and content of MID3 implementation and documentation. The three major objectives of this white paper are to: i) inform company decision makers how the strategic integration of MID3 can benefit R&D efficiency; ii) provide MID3 analysts with sufficient material to enhance the planning, rigor, and consistency of the application of MID3; and iii) provide regulatory authorities with substrate to develop MID3 related and/or MID3 enabled guidelines. PMID:27069774

  10. On the Faceting and Linking of PROV for Earth Science Data Systems

    NASA Astrophysics Data System (ADS)

    Hua, H.; Manipon, G.; Wilson, B. D.; Tan, D.; Starch, M.

    2015-12-01

    Faceted search has yielded powerful capabilities for discovery of information by applying multiple filters to explore information. This is often more effective when the information is decomposed into faceted components that can be sliced and diced during faceted navigation. We apply this approach to the representation of PROV for Earth Science (PROV-ES) to facilitate more atomic units of provenance for discovery. Traditional bundles of PROV are then decomposed to enable finer-grain discovery of provenance. Linkages across provenance components can then be explored across seemingly disparate bundles. We will show how mappings into this provenance approach can be used to explore more data life-cycle relationships from observation to data to findings. We will also show examples of how this approach can be used to improve the discovery, access, and transparency of NASA datasets and the science data systems that were used to capture, manage, and produce the provenance information.
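
    A toy sketch of the faceted decomposition described above: provenance assertions flattened into records that can be sliced per facet and filtered during navigation. The facet names and values are invented for illustration and are not the PROV-ES vocabulary.

```python
# Sketch: faceted filtering over decomposed provenance records.
prov_records = [
    {"entity": "granule-001", "activity": "calibration", "agent": "pipeline-A"},
    {"entity": "granule-002", "activity": "calibration", "agent": "pipeline-B"},
    {"entity": "granule-003", "activity": "regridding",  "agent": "pipeline-A"},
]

def facet_counts(records, facet):
    """Count records per facet value, as a faceted UI would display."""
    counts = {}
    for rec in records:
        counts[rec[facet]] = counts.get(rec[facet], 0) + 1
    return counts

def drill_down(records, **filters):
    """Apply the currently selected facet filters."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]
```

    The point of decomposing bundles first is that `drill_down` can cross what were originally separate bundles, surfacing linkages a bundle-level search would hide.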

  11. Antibody-enabled small-molecule drug discovery.

    PubMed

    Lawson, Alastair D G

    2012-06-29

    Although antibody-based therapeutics have become firmly established as medicines for serious diseases, the value of antibodies as tools in the early stages of small-molecule drug discovery is only beginning to be realized. In particular, antibodies may provide information to reduce risk in small-molecule drug discovery by enabling the validation of targets and by providing insights into the design of small-molecule screening assays. Moreover, antibodies can act as guides in the quest for small molecules that have the ability to modulate protein-protein interactions, which have traditionally only been considered to be tractable targets for biological drugs. The development of small molecules that have similar therapeutic effects to current biologics has the potential to benefit a broader range of patients at earlier stages of disease.

  12. [Frontiers in Live Bone Imaging Researches. Novel drug discovery by means of intravital bone imaging technology].

    PubMed

    Ishii, Masaru

    2015-06-01

    Recent advances in intravital bone imaging technology have enabled us to grasp real cellular behaviors and functions in vivo, revolutionizing the field of drug discovery for novel therapeutics against intractable bone diseases. In this chapter, I introduce updated information on the pharmacological actions of several anti-bone-resorptive agents, which could only be derived from advanced imaging techniques, and also discuss the future perspectives of this new trend in drug discovery.

  13. Comparison: Discovery on WSMOLX and miAamics/jABC

    NASA Astrophysics Data System (ADS)

    Kubczak, Christian; Vitvar, Tomas; Winkler, Christian; Zaharia, Raluca; Zaremba, Maciej

    This chapter compares the solutions to the SWS-Challenge discovery problems provided by DERI Galway and the joint solution from the Technical University of Dortmund and the University of Potsdam. The two approaches are described in depth in Chapters 10 and 13. The discovery scenario raises problems associated with making service discovery an automated process. It requires fine-grained specifications of search requests and service functionality, including support for fetching dynamic information during the discovery process (e.g., shipment price). Both teams utilize semantics to describe services, service requests, and data models in order to enable search at the required fine-grained level of detail.
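
    A minimal sketch of the fine-grained matching step that automated discovery requires, reduced to a capability check plus a dynamically fetched attribute (the shipment price). The service names, fields, and pricing functions are invented for illustration; the real solutions use semantic service descriptions, not Python dicts.

```python
# Sketch: matching a shipment request against service descriptions.
services = [
    {"name": "Muller", "ships_to": {"EU", "US"}, "price": lambda w: 10 + 2 * w},
    {"name": "Racer",  "ships_to": {"EU"},       "price": lambda w: 5 + 4 * w},
]

def discover(request):
    """Filter by static capability, then rank by dynamic per-request price."""
    matches = [s for s in services if request["dest"] in s["ships_to"]]
    # The price is computed per request, standing in for information that a
    # discovery engine would have to fetch from the service at query time.
    return min(matches, key=lambda s: s["price"](request["weight"]))["name"]
```

    For a 2 kg parcel to the EU both services qualify but Racer is cheaper, whereas a US destination eliminates Racer at the capability-matching stage.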

  14. The EGS Data Collaboration Platform: Enabling Scientific Discovery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weers, Jonathan D; Johnston, Henry; Huggins, Jay V

    Collaboration in the digital age has been stifled in recent years. Reasonable responses to legitimate security concerns have created a virtual landscape of silos and fortified castles incapable of sharing information efficiently. This trend unfortunately opposes the geothermal scientific community's migration toward larger, more collaborative projects. To facilitate efficient sharing of information between team members from multiple national labs, universities, and private organizations, the 'EGS Collab' team has developed a universally accessible, secure data collaboration platform and has fully integrated it with the U.S. Department of Energy's (DOE) Geothermal Data Repository (GDR) and the National Geothermal Data System (NGDS). This paper will explore some of the challenges of collaboration in the modern digital age, highlight strategies for active data management, and discuss the integration of the EGS Collab data management platform with the GDR to enable scientific discovery through the timely dissemination of information.

  15. Discovery of Information Diffusion Process in Social Networks

    NASA Astrophysics Data System (ADS)

    Kim, Kwanho; Jung, Jae-Yoon; Park, Jonghun

    Information diffusion analysis in social networks is significant because it enables us to deeply understand dynamic social interactions among users. In this paper, we introduce approaches to discovering the information diffusion process in social networks based on process mining. Process mining techniques are applied from three perspectives: social network analysis, process discovery, and community recognition. We then present experimental results using real-life social network data. The proposed techniques are expected to serve as new analytical tools in online social media such as blogs and wikis for company marketers, politicians, news reporters, and online writers.
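
    A toy sketch of the process-discovery perspective applied to a diffusion log: consecutive adopters of the same item, ordered in time, yield a directly-follows style diffusion graph. The (user, item, time) log format is an assumption for illustration, not the paper's actual event schema.

```python
# Sketch: deriving diffusion edges from an adoption event log.
from collections import defaultdict

log = [
    ("alice", "post1", 1), ("bob", "post1", 2), ("carol", "post1", 4),
    ("alice", "post2", 3), ("carol", "post2", 5),
]

def diffusion_edges(events):
    """For each item, link consecutive adopters in time order and
    count how often each user-to-user edge occurs across items."""
    by_item = defaultdict(list)
    for user, item, t in events:
        by_item[item].append((t, user))
    edges = defaultdict(int)
    for adopts in by_item.values():
        adopts.sort()
        for (_, u), (_, v) in zip(adopts, adopts[1:]):
            edges[(u, v)] += 1
    return dict(edges)

edges = diffusion_edges(log)
```

    Edge weights accumulated this way feed the social-network-analysis and community-recognition perspectives the abstract mentions.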

  16. MOPED enables discoveries through consistently processed proteomics data

    PubMed Central

    Higdon, Roger; Stewart, Elizabeth; Stanberry, Larissa; Haynes, Winston; Choiniere, John; Montague, Elizabeth; Anderson, Nathaniel; Yandl, Gregory; Janko, Imre; Broomall, William; Fishilevich, Simon; Lancet, Doron; Kolker, Natali; Kolker, Eugene

    2014-01-01

    The Model Organism Protein Expression Database (MOPED, http://moped.proteinspire.org) is an expanding proteomics resource to enable biological and biomedical discoveries. MOPED aggregates simple, standardized and consistently processed summaries of protein expression and metadata from proteomics (mass spectrometry) experiments from human and model organisms (mouse, worm and yeast). The latest version of MOPED adds new estimates of protein abundance and concentration, as well as relative (differential) expression data. MOPED provides a new updated query interface that allows users to explore information by organism, tissue, localization, condition, experiment, or keyword. MOPED supports the Human Proteome Project's efforts to generate chromosome- and disease-specific proteomes by providing links from proteins to chromosome and disease information, as well as many complementary resources. MOPED supports a new omics metadata checklist in order to harmonize data integration, analysis and use. MOPED's development is driven by the user community, which spans 90 countries, guiding future development that will transform MOPED into a multi-omics resource. MOPED encourages users to submit data in a simple format; they can use the metadata checklist to generate a data publication for the submission. As a result, MOPED will provide even greater insights into complex biological processes and systems and enable deeper and more comprehensive biological and biomedical discoveries. PMID:24350770

  17. Multi-year Content Analysis of User Facility Related Publications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Patton, Robert M; Stahl, Christopher G; Hines, Jayson

    2013-01-01

    Scientific user facilities provide resources and support that enable scientists to conduct experiments or simulations pertinent to their respective research. Consequently, it is critical to have an informed understanding of the impact and contributions that these facilities have on scientific discoveries. Leveraging insight into scientific publications that acknowledge the use of these facilities enables more informed decisions by facility management and sponsors in regard to policy, resource allocation, and the direction of science, and supports a more effective understanding of a facility's impact. This work discusses preliminary results of mining scientific publications that utilized resources at the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL). These results show promise in identifying and leveraging multi-year trends and providing a higher-resolution view of the impact that a scientific user facility may have on scientific discoveries.

  18. Big, Deep, and Smart Data in Scanning Probe Microscopy

    DOE PAGES

    Kalinin, Sergei V.; Strelcov, Evgheni; Belianinov, Alex; ...

    2016-09-27

    Scanning probe microscopy techniques open the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter on nanometer and atomic scales. We analyze the discovery process in SPM in terms of the information flow from the tip-surface junction to knowledge adoption by the scientific community. Furthermore, we discuss the challenges and opportunities offered by the merging of SPM with advanced data mining, visual analytics, and knowledge discovery technologies.

  19. 10 CFR 708.2 - What are the definitions of terms used in this part?

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... calendar day. Discovery means a process used to enable the parties to learn about each other's evidence... DOE. Mediation means an informal, confidential process in which a neutral third person assists the...

  20. Use of Semantic Technology to Create Curated Data Albums

    NASA Technical Reports Server (NTRS)

    Ramachandran, Rahul; Kulkarni, Ajinkya; Li, Xiang; Sainju, Roshan; Bakare, Rohan; Basyal, Sabin; Fox, Peter (Editor); Norack, Tom (Editor)

    2014-01-01

    One of the continuing challenges in any Earth science investigation is the discovery and access of useful science content from the increasingly large volumes of Earth science data and related information available online. Current Earth science data systems are designed with the assumption that researchers access data primarily by instrument or geophysical parameter. Those who know exactly the data sets they need can obtain the specific files using these systems. However, in cases where researchers are interested in studying an event of research interest, they must manually assemble a variety of relevant data sets by searching the different distributed data systems. Consequently, there is a need to design and build specialized search and discovery tools in Earth science that can filter through large volumes of distributed online data and information and only aggregate the relevant resources needed to support climatology and case studies. This paper presents a specialized search and discovery tool that automatically creates curated Data Albums. The tool was designed to enable key elements of the search process such as dynamic interaction and sense-making. The tool supports dynamic interaction via different modes of interactivity and visual presentation of information. The compilation of information and data into a Data Album is analogous to a shoebox within the sense-making framework. This tool automates most of the tedious information/data gathering tasks for researchers. Data curation by the tool is achieved via an ontology-based, relevancy ranking algorithm that filters out non-relevant information and data. The curation enables better search results as compared to the simple keyword searches provided by existing data systems in Earth science.
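
    A toy sketch of the ontology-based relevancy ranking idea: the query term is expanded through related concepts before candidate resources are scored, and zero-scoring resources would be filtered out of the Data Album. The mini-ontology, documents, and weights are invented for illustration, not the paper's algorithm.

```python
# Sketch: ontology-assisted relevancy scoring for curating a Data Album.
ONTOLOGY = {"hurricane": {"tropical cyclone", "storm surge", "precipitation"}}

def score(doc_terms, query):
    """Direct keyword hits count double; ontology-related terms count once."""
    related = ONTOLOGY.get(query, set())
    direct = 2.0 if query in doc_terms else 0.0
    return direct + sum(1.0 for t in doc_terms if t in related)

docs = {
    "GPM rainfall granule": {"precipitation", "tropical cyclone"},
    "Hurricane case study": {"hurricane", "storm surge"},
    "Aerosol climatology":  {"aerosol optical depth"},
}

ranked = sorted(docs, key=lambda name: score(docs[name], "hurricane"), reverse=True)
```

    The aerosol record scores zero and would be dropped, which is the curation effect a plain keyword search does not provide.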

  1. Big, Deep, and Smart Data in Scanning Probe Microscopy.

    PubMed

    Kalinin, Sergei V; Strelcov, Evgheni; Belianinov, Alex; Somnath, Suhas; Vasudevan, Rama K; Lingerfelt, Eric J; Archibald, Richard K; Chen, Chaomei; Proksch, Roger; Laanait, Nouamane; Jesse, Stephen

    2016-09-27

    Scanning probe microscopy (SPM) techniques have opened the door to nanoscience and nanotechnology by enabling imaging and manipulation of the structure and functionality of matter at nanometer and atomic scales. Here, we analyze the scientific discovery process in SPM by following the information flow from the tip-surface junction, to knowledge adoption by the wider scientific community. We further discuss the challenges and opportunities offered by merging SPM with advanced data mining, visual analytics, and knowledge discovery technologies.

  2. A Context-Aware Paradigm for Information Discovery and Dissemination in Mobile Environments

    ERIC Educational Resources Information Center

    Lundquist, Doug

    2011-01-01

    The increasing power and ubiquity of mobile wireless devices is enabling real-time information delivery for many diverse applications. A crucial question is how to allocate finite network resources efficiently and fairly despite the uncertainty common in highly dynamic mobile ad hoc networks. We propose a set of routing protocols, Self-Balancing…

  3. Visual representation of scientific information.

    PubMed

    Wong, Bang

    2011-02-15

    Great technological advances have enabled researchers to generate an enormous amount of data. Data analysis is replacing data generation as the rate-limiting step in scientific research. With this wealth of information, we have an opportunity to understand the molecular causes of human diseases. However, the unprecedented scale, resolution, and variety of data pose new analytical challenges. Visual representation of data offers insights that can lead to new understanding, whether the purpose is analysis or communication. This presentation shows how art, design, and traditional illustration can enable scientific discovery. Examples will be drawn from the Broad Institute's Data Visualization Initiative, aimed at establishing processes for creating informative visualization models.

  4. Monitoring and Discovery for Self-Organized Network Management in Virtualized and Software Defined Networks

    PubMed Central

    Valdivieso Caraguay, Ángel Leonardo; García Villalba, Luis Javier

    2017-01-01

    This paper presents the Monitoring and Discovery Framework of the Self-Organized Network Management in Virtualized and Software Defined Networks SELFNET project. This design takes into account the scalability and flexibility requirements needed by 5G infrastructures. In this context, the present framework focuses on gathering and storing the information (low-level metrics) related to physical and virtual devices, cloud environments, flow metrics, SDN traffic and sensors. Similarly, it provides the monitoring data as a generic information source in order to allow the correlation and aggregation tasks. Our design enables the collection and storing of information provided by all the underlying SELFNET sublayers, including the dynamically onboarded and instantiated SDN/NFV Apps, also known as SELFNET sensors. PMID:28362346

  5. Monitoring and Discovery for Self-Organized Network Management in Virtualized and Software Defined Networks.

    PubMed

    Caraguay, Ángel Leonardo Valdivieso; Villalba, Luis Javier García

    2017-03-31

    This paper presents the Monitoring and Discovery Framework of the Self-Organized Network Management in Virtualized and Software Defined Networks SELFNET project. This design takes into account the scalability and flexibility requirements needed by 5G infrastructures. In this context, the present framework focuses on gathering and storing the information (low-level metrics) related to physical and virtual devices, cloud environments, flow metrics, SDN traffic and sensors. Similarly, it provides the monitoring data as a generic information source in order to allow the correlation and aggregation tasks. Our design enables the collection and storing of information provided by all the underlying SELFNET sublayers, including the dynamically onboarded and instantiated SDN/NFV Apps, also known as SELFNET sensors.
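
    A minimal sketch of the pattern described: heterogeneous low-level metrics normalised into one generic record shape so that correlation and aggregation layers can consume them uniformly. The field names and values are an assumption for illustration, not the SELFNET schema.

```python
# Sketch: generic metric records aggregated across monitoring sources.
from statistics import mean

raw = [
    {"source": "vswitch-3", "kind": "flow",   "metric": "throughput_mbps", "value": 840.0},
    {"source": "vswitch-3", "kind": "flow",   "metric": "throughput_mbps", "value": 910.0},
    {"source": "host-12",   "kind": "device", "metric": "cpu_load",        "value": 0.62},
]

def aggregate(records, metric):
    """Average one named metric across whichever sources reported it."""
    values = [r["value"] for r in records if r["metric"] == metric]
    return mean(values) if values else None
```

    Because every source, physical or virtual, emits the same record shape, the aggregation step never needs per-sensor code.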

  6. A geodata warehouse: Using denormalisation techniques as a tool for delivering spatially enabled integrated geological information to geologists

    NASA Astrophysics Data System (ADS)

    Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham

    2016-11-01

    New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and set of delivery tools built to meet them. At the BGS, relational database management systems (RDBMS) have facilitated effective data management using normalised, subject-based database designs with business rules in a centralised, vocabulary-controlled architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery, as they struggled to resolve the complex underlying normalised structures, giving poor data access speeds. Users developed bespoke access tools against structures they did not fully understand, sometimes obtaining incorrect results. BGS has therefore developed PropBase, a generic denormalised data structure within an RDBMS for storing property data, to facilitate rapid, standardised data discovery and access, incorporating 2D and 3D physical and chemical property data with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture is an optimised query object that delivers geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance, delivering data in multiple standardised formats through a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources, facilitating searches of related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases, incorporating over 50 property data types with a vocabulary set that includes 557 property terms.
By enabling property data searches across multiple databases PropBase has facilitated new scientific research, previously considered impractical. PropBase is easily extended to incorporate 4D data (time series) and is providing a baseline for new "big data" monitoring projects.
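
    A minimal sketch of the denormalisation idea using an in-memory SQLite table: property points from several subject databases land in one wide table, so a single generic query spans all sources. The column names and sample values are illustrative, not the BGS PropBase schema.

```python
# Sketch: a denormalised property layer queried generically across sources.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE propbase (
        source   TEXT,            -- originating subject database
        x REAL, y REAL, z REAL,   -- spatial location
        property TEXT,            -- controlled-vocabulary property name
        value REAL, units TEXT
    )
""")
con.executemany(
    "INSERT INTO propbase VALUES (?,?,?,?,?,?,?)",
    [
        ("geochem",  431200.0, 338700.0, -12.5, "porosity", 0.18, "fraction"),
        ("borehole", 431250.0, 338650.0, -40.0, "porosity", 0.11, "fraction"),
        ("geochem",  431300.0, 338900.0,  -5.0, "density",  2.65, "g/cm3"),
    ],
)
# One query spans every source database -- the point of denormalising.
rows = con.execute(
    "SELECT source, value FROM propbase WHERE property = ? ORDER BY value",
    ("porosity",),
).fetchall()
```

    The trade-off is classic data-warehouse design: redundancy and synchronisation scripts in exchange for simple, fast cross-domain reads.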

  7. FHIR Healthcare Directories: Adopting Shared Interfaces to Achieve Interoperable Medical Device Data Integration.

    PubMed

    Tyndall, Timothy; Tyndall, Ayami

    2018-01-01

    Healthcare directories are vital for interoperability among healthcare providers, researchers and patients. Past efforts at directory services have not provided the tools to allow integration of the diverse data sources. Many are overly strict, incompatible with legacy databases, and do not provide Data Provenance. A more architecture-independent system is needed to enable secure, GDPR-compatible (8) service discovery across organizational boundaries. We review our development of a portable Data Provenance Toolkit supporting provenance within Health Information Exchange (HIE) systems. The Toolkit has been integrated with client software and successfully leveraged in clinical data integration. The Toolkit validates provenance stored in a Blockchain or Directory record and creates provenance signatures, providing standardized provenance that moves with the data. This healthcare directory suite implements discovery of healthcare data by HIE and EHR systems via FHIR. Shortcomings of past directory efforts include the inability to map complex datasets and to enable interoperability via exchange endpoint discovery. By delivering data without dictating how it is stored, we improve exchange and facilitate discovery on a multi-national level through open source, fully interoperable tools. With the development of Data Provenance resources we enhance exchange and improve security and usability throughout the health data continuum.
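
    A minimal sketch of endpoint discovery from a FHIR directory response: a search for Endpoint resources returns a Bundle, from which active exchange addresses are extracted. The Bundle below is hand-written and far smaller than a real FHIR Endpoint resource; it is illustration, not the paper's implementation.

```python
# Sketch: pulling active exchange endpoints out of a FHIR search Bundle.
import json

bundle_json = """
{
  "resourceType": "Bundle",
  "entry": [
    {"resource": {"resourceType": "Endpoint",
                  "status": "active",
                  "connectionType": {"code": "hl7-fhir-rest"},
                  "address": "https://hie.example.org/fhir"}},
    {"resource": {"resourceType": "Endpoint",
                  "status": "off",
                  "connectionType": {"code": "hl7-fhir-rest"},
                  "address": "https://old.example.org/fhir"}}
  ]
}
"""

def active_fhir_endpoints(bundle):
    """Keep only active Endpoint resources and return their addresses."""
    return [
        e["resource"]["address"]
        for e in bundle.get("entry", [])
        if e["resource"]["resourceType"] == "Endpoint"
        and e["resource"]["status"] == "active"
    ]

endpoints = active_fhir_endpoints(json.loads(bundle_json))
```

    In practice the Bundle would come from a directory query such as a FHIR Endpoint search, and each address is where an HIE or EHR system would then exchange data.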

  8. [Artificial Intelligence in Drug Discovery].

    PubMed

    Fujiwara, Takeshi; Kamada, Mayumi; Okuno, Yasushi

    2018-04-01

    With the increase in data generated by analytical instruments, the application of artificial intelligence (AI) technology in the medical field is indispensable. In particular, practical application of AI technology is strongly required in "genomic medicine" and "genomic drug discovery", which conduct medical practice and novel drug development based on individual genomic information. In our laboratory, we have been developing a database to integrate genome data and clinical information obtained by clinical genome analysis, and a computational support system for the clinical interpretation of variants using AI. In addition, with the aim of creating new therapeutic targets in genomic drug discovery, we have been working on the development of a binding-affinity prediction system for mutated proteins and drugs by molecular dynamics simulation using the supercomputer "Kei". We have also tackled problems in virtual drug screening. Our AI technology has successfully generated a virtual compound library, and deep learning methods have enabled us to predict interactions between compounds and target proteins.

  9. The Influence of Big (Clinical) Data and Genomics on Precision Medicine and Drug Development.

    PubMed

    Denny, Joshua C; Van Driest, Sara L; Wei, Wei-Qi; Roden, Dan M

    2018-03-01

    Drug development continues to be costly and slow, with medications failing due to lack of efficacy or presence of toxicity. The promise of pharmacogenomic discovery includes tailoring therapeutics based on an individual's genetic makeup, rational drug development, and repurposing medications. Rapid growth of large research cohorts, linked to electronic health record (EHR) data, fuels discovery of new genetic variants predicting drug action, supports Mendelian randomization experiments to show drug efficacy, and suggests new indications for existing medications. New biomedical informatics and machine-learning approaches advance the ability to interpret clinical information, enabling identification of complex phenotypes and subpopulations of patients. We review the recent history of use of "big data" from EHR-based cohorts and biobanks supporting these activities. Future studies using EHR data, other information sources, and new methods will promote a foundation for discovery to more rapidly advance precision medicine. © 2017 American Society for Clinical Pharmacology and Therapeutics.

  10. User needs analysis and usability assessment of DataMed - a biomedical data discovery index.

    PubMed

    Dixit, Ram; Rogith, Deevakar; Narayana, Vidya; Salimi, Mandana; Gururaj, Anupama; Ohno-Machado, Lucila; Xu, Hua; Johnson, Todd R

    2017-11-30

    To present user needs and usability evaluations of DataMed, a Data Discovery Index (DDI) that allows searching for biomedical data from multiple sources. We conducted 2 phases of user studies. Phase 1 was a user needs analysis conducted before the development of DataMed, consisting of interviews with researchers. Phase 2 involved iterative usability evaluations of DataMed prototypes. We analyzed data qualitatively to document researchers' information and user interface needs. Biomedical researchers' information needs in data discovery are complex, multidimensional, and shaped by their context, domain knowledge, and technical experience. User needs analyses validate the need for a DDI, while usability evaluations of DataMed show that even though aggregating metadata into a common search engine and applying traditional information retrieval tools are promising first steps, there remain challenges for DataMed due to incomplete metadata and the complexity of data discovery. Biomedical data poses distinct problems for search when compared to websites or publications. Making data available is not enough to facilitate biomedical data discovery: new retrieval techniques and user interfaces are necessary for dataset exploration. Consistent, complete, and high-quality metadata are vital to enable this process. While available data and researchers' information needs are complex and heterogeneous, a successful DDI must meet those needs and fit into the processes of biomedical researchers. Research directions include formalizing researchers' information needs, standardizing overviews of data to facilitate relevance judgments, implementing user interfaces for concept-based searching, and developing evaluation methods for open-ended discovery systems such as DDIs. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  11. The Future of Healthcare–Information Based Medicine

    PubMed Central

    Borangiu, T; Purcărea, V

    2008-01-01

    The paper discusses how information based medicine has become an increasingly important model of healthcare. Today's patients are better informed and therefore play a more active role in their own healthcare, fuelling the drive towards personalized medicine. Information Based Medicine enables researchers to design targeted therapeutics and rapidly develop best-practice guidelines that enable healthcare providers to deliver the most complete individualized healthcare solutions. Information based medicine is realized thanks to growth in four key areas: Clinical Genomics, Medical Imaging, Targeted Pharmaceuticals, and Information Systems. Also discussed is how technological advances throughout this decade are changing the discovery, development and delivery of new treatments, with healthcare becoming increasingly personalized as a result. A glimpse into the future of personalised healthcare is presented, highlighting scenarios in development today along with the challenges and perspectives which lie ahead. PMID:20108471

  12. High Throughput Experimental Materials Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zakutayev, Andriy; Perkins, John; Schwarting, Marcus

    The mission of the High Throughput Experimental Materials Database (HTEM DB) is to enable the discovery of new materials with useful properties by releasing large amounts of high-quality experimental data to the public. The HTEM DB contains information about materials obtained from high-throughput experiments at the National Renewable Energy Laboratory (NREL).

  13. Lifelong Learning: Skills and Online Resources

    ERIC Educational Resources Information Center

    Lim, Russell F.; Hsiung, Bob C.; Hales, Deborah J.

    2006-01-01

    Objective: Advances in information technology enable the practicing psychiatrist's quest to keep up-to-date with new discoveries in psychiatry, as well as to meet recertification requirements. However, physicians' computer skills do not always keep up with technology, nor do they take advantage of online search and continuing education services.…

  14. Visualising "Junk" DNA through Bioinformatics

    ERIC Educational Resources Information Center

    Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia

    2005-01-01

    One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…

  15. Ligand-based receptor tyrosine kinase partial agonists: New paradigm for cancer drug discovery?

    PubMed

    Riese, David J

    2011-02-01

    INTRODUCTION: Receptor tyrosine kinases (RTKs) are validated targets for oncology drug discovery, and several RTK antagonists have been approved for the treatment of human malignancies. Nonetheless, the discovery and development of RTK antagonists has lagged behind that of agents targeting G-protein coupled receptors. In part, this is because it has been difficult to discover analogs of naturally occurring RTK agonists that function as antagonists. AREAS COVERED: Here we describe ligands of ErbB receptors that function as partial agonists for these receptors, thereby enabling these ligands to antagonize the activity of full agonists. We provide insights into the mechanisms by which these ligands function as antagonists. We discuss how information concerning these mechanisms can be translated into screens for novel small molecule- and antibody-based antagonists of ErbB receptors, and how such antagonists hold great potential as targeted cancer chemotherapeutics. EXPERT OPINION: While there have been a number of important findings in this field, identifying the structural basis of ligand functional specificity remains of the greatest importance. It is true that, with some notable exceptions, peptide hormones and growth factors have not proven to be good platforms for oncology drug discovery; nevertheless, addressing the fundamental issues of antagonistic partial agonists for receptor tyrosine kinases has the potential to steer oncology drug discovery in new directions. Mechanism-based approaches are now emerging to enable the discovery of RTK partial agonists that may antagonize both agonist-dependent and -independent RTK signaling and may hold tremendous promise as targeted cancer chemotherapeutics.

  16. Proactive human-computer collaboration for information discovery

    NASA Astrophysics Data System (ADS)

    DiBona, Phil; Shilliday, Andrew; Barry, Kevin

    2016-05-01

    Lockheed Martin Advanced Technology Laboratories (LM ATL) is researching methods, representations, and processes for human/autonomy collaboration to scale analysis and hypothesis substantiation for intelligence analysts. This research establishes a machine-readable hypothesis representation that is commonsensical to the human analyst. The representation unifies context between the human and computer, enabling autonomy, in the form of analytic software, to support the analyst by proactively acquiring, assessing, and organizing the high-value information needed to inform and substantiate hypotheses.

  17. Enabling information management systems in tactical network environments

    NASA Astrophysics Data System (ADS)

    Carvalho, Marco; Uszok, Andrzej; Suri, Niranjan; Bradshaw, Jeffrey M.; Ceccio, Philip J.; Hanna, James P.; Sinclair, Asher

    2009-05-01

    Net-Centric Information Management (IM) and sharing in tactical environments promises to revolutionize forward command and control capabilities by providing ubiquitous shared situational awareness to the warfighter. This vision can be realized by leveraging the tactical and Mobile Ad hoc Networks (MANET) which provide the underlying communications infrastructure, but significant technical challenges remain. Enabling information management in these highly dynamic environments will require multiple support services and protocols which are affected by, and highly dependent on, the underlying capabilities and dynamics of the tactical network infrastructure. In this paper we investigate, discuss, and evaluate the effects of realistic tactical and mobile communications network environments on mission-critical information management systems. We motivate our discussion by introducing the Advanced Information Management System (AIMS), which is targeted for deployment in tactical sensor systems. We present some operational requirements for AIMS and highlight how critical IM support services such as discovery, transport, federation, and Quality of Service (QoS) management are necessary to meet these requirements. Our goal is to provide a qualitative analysis of the impact of underlying assumptions of availability and performance of some of the critical services supporting tactical information management. We will also propose and describe a number of technologies and capabilities that have been developed to address these challenges, providing alternative approaches for transport, service discovery, and federation services for tactical networks.

  18. Leveraging model-informed approaches for drug discovery and development in the cardiovascular space.

    PubMed

    Dockendorf, Marissa F; Vargo, Ryan C; Gheyas, Ferdous; Chain, Anne S Y; Chatterjee, Manash S; Wenning, Larissa A

    2018-06-01

    Cardiovascular disease remains a significant global health burden, and development of cardiovascular drugs in the current regulatory environment often demands large and expensive cardiovascular outcome trials. Thus, the use of quantitative pharmacometric approaches which can help enable early Go/No Go decision making, ensure appropriate dose selection, and increase the likelihood of successful clinical trials, have become increasingly important to help reduce the risk of failed cardiovascular outcomes studies. In addition, cardiovascular safety is an important consideration for many drug development programs, whether or not the drug is designed to treat cardiovascular disease; modeling and simulation approaches also have utility in assessing risk in this area. Herein, examples of modeling and simulation applied at various stages of drug development, spanning from the discovery stage through late-stage clinical development, for cardiovascular programs are presented. Examples of how modeling approaches have been utilized in early development programs across various therapeutic areas to help inform strategies to mitigate the risk of cardiovascular-related adverse events, such as QTc prolongation and changes in blood pressure, are also presented. These examples demonstrate how more informed drug development decisions can be enabled by modeling and simulation approaches in the cardiovascular area.

  19. Phenotypic screening in cancer drug discovery - past, present and future.

    PubMed

    Moffat, John G; Rudolph, Joachim; Bailey, David

    2014-08-01

    There has been a resurgence of interest in the use of phenotypic screens in drug discovery as an alternative to target-focused approaches. Given that oncology is currently the most active therapeutic area, and also one in which target-focused approaches have been particularly prominent in the past two decades, we investigated the contribution of phenotypic assays to oncology drug discovery by analysing the origins of all new small-molecule cancer drugs approved by the US Food and Drug Administration (FDA) over the past 15 years and those currently in clinical development. Although the majority of these drugs originated from target-based discovery, we identified a significant number whose discovery depended on phenotypic screening approaches. We postulate that the contribution of phenotypic screening to cancer drug discovery has been hampered by a reliance on 'classical' nonspecific drug effects such as cytotoxicity and mitotic arrest, exacerbated by a paucity of mechanistically defined cellular models for therapeutically translatable cancer phenotypes. However, technical and biological advances that enable such mechanistically informed phenotypic models have the potential to empower phenotypic drug discovery in oncology.

  20. Enabling the Discovery of Gravitational Radiation

    NASA Astrophysics Data System (ADS)

    Isaacson, Richard

    2017-01-01

    The discovery of gravitational radiation was announced with the publication of the results of a physics experiment involving over a thousand participants. This was preceded by a century of theoretical work, involving a similarly large group of physicists, mathematicians, and computer scientists. This huge effort was enabled by a substantial commitment of resources, both public and private, to develop the different strands of this complex research enterprise, and to build a community of scientists to carry it out. In the excitement following the discovery, the role of key enablers of this success has not always been adequately recognized in popular accounts. In this talk, I will try to call attention to a few of the key ingredients that proved crucial to enabling the successful discovery of gravitational waves, and the opening of a new field of science.

  1. Bigger data, collaborative tools and the future of predictive drug discovery

    NASA Astrophysics Data System (ADS)

    Ekins, Sean; Clark, Alex M.; Swamidass, S. Joshua; Litterman, Nadia; Williams, Antony J.

    2014-10-01

    Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools, either as free websites or as commercial software-as-a-service offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers, either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger amounts of data as they accumulate from high-throughput screening while still enabling the user to draw insights, make predictions, and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible, and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas.

  2. The Mason Water Data Information System (MWDIS): Enabling data sharing and discovery at George Mason University

    NASA Astrophysics Data System (ADS)

    Ferreira, C.; Da Silva, A. L.; Nunes, A.; Haddad, J.; Lawler, S.

    2014-12-01

    Enabling effective data use and re-use in scientific investigations relies heavily not only on data availability but also on efficient data sharing and discovery. The CUAHSI-led Hydrological Information Systems (HIS) and supporting products have paved the way to efficient data sharing and discovery in the hydrological sciences. Building on the CUAHSI-HIS framework concepts for hydrologic data sharing, we developed a system devoted to the George Mason University scientific community that supports university-wide data sharing and discovery as well as real-time data access for situational awareness during extreme events. The internet-based system provides an interface where researchers input data collected from the measurement stations, and presents the data to the public in the form of charts, tables, maps, and documents. The system is developed in ASP.NET MVC 4, uses Microsoft SQL Server 2008 R2 as its database management system, and is hosted on Amazon Web Services. Currently the system supports the Mason Watershed Project, providing historical hydrological, atmospheric, and water-quality data for the campus watershed and real-time flood conditions on campus. The system is also a gateway to an unprecedented collection of hurricane storm-surge hydrodynamics data from coastal wetlands in the Chesapeake Bay, providing access not only to historical data but also to recent storms such as Hurricane Arthur. Future work includes coupling the system to a real-time flood alert system on campus and, beyond providing data on the World Wide Web, fostering and providing a venue for interdisciplinary collaboration among water scientists in the region.

  3. Chemical Informatics and the Drug Discovery Knowledge Pyramid

    PubMed Central

    Lushington, Gerald H.; Dong, Yinghua; Theertham, Bhargav

    2012-01-01

    The magnitude of the challenges in preclinical drug discovery is evident in the large amount of capital invested in such efforts in pursuit of a small static number of eventually successful marketable therapeutics. An explosion in the availability of potentially drug-like compounds and chemical biology data on these molecules can provide us with the means to improve the eventual success rates for compounds being considered at the preclinical level, but only if the community is able to access available information in an efficient and meaningful way. Thus, chemical database resources are critical to any serious drug discovery effort. This paper explores the basic principles underlying the development and implementation of chemical databases, and examines key issues of how molecular information may be encoded within these databases so as to enhance the likelihood that users will be able to extract meaningful information from data queries. In addition to a broad survey of conventional data representation and query strategies, key enabling technologies such as new context-sensitive chemical similarity measures and chemical cartridges are examined, with recommendations on how such resources may be integrated into a practical database environment. PMID:23782037

  4. Using key performance indicators as knowledge-management tools at a regional health-care authority level.

    PubMed

    Berler, Alexander; Pavlopoulos, Sotiris; Koutsouris, Dimitris

    2005-06-01

    The advantages of introducing information and communication technologies into the complex health-care sector are already well known and well stated. It is nevertheless paradoxical that, although the medical community has embraced with satisfaction most of the technological discoveries that improve patient care, the same has not happened with health-care informatics. With this concern in mind, our work proposes an information model for knowledge management (KM) based upon the use of key performance indicators (KPIs) in health-care systems. Building on the balanced scorecard (BSC) framework (Kaplan/Norton) and quality assurance techniques in health care (Donabedian), this paper proposes a patient-journey-centered approach that drives information flow at all levels of the day-to-day process of delivering effective and managed care, toward information assessment and knowledge discovery. In order to persuade health-care decision-makers to assess the added value of KM tools, those tools should be used to propose new performance measurement and performance management techniques at all levels of a health-care system. The proposed KPIs form a complete set of metrics that enable the performance management of a regional health-care system. In addition, the performance framework established is technically applied through state-of-the-art KM tools such as data warehouses and business intelligence information systems. In that sense, the proposed infrastructure is, technologically speaking, an important KM tool that enables knowledge sharing among various health-care stakeholders and between different health-care groups. The use of BSC is an enabling framework toward a KM strategy in health care.

  5. Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.

    2016-12-01

    Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. 
We present lessons learned from applying CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi
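
    The keyword-generation step described above can be sketched as controlled-vocabulary matching over metadata text. The terms, facets, and labels below are invented examples for illustration only; they are not entries from GeoSciGraph or the component ontologies.

```python
# Invented (term -> (facet, preferred label)) entries; a real pipeline would
# draw these from ontologies such as SWEET or ENVO via semantic annotation.
VOCABULARY = {
    "sea surface temperature": ("measured property", "Sea Surface Temperature"),
    "ctd": ("equipment", "CTD"),
    "chesapeake bay": ("geospatial feature", "Chesapeake Bay"),
}

def annotate(metadata_text):
    """Return sorted (facet, label) keywords found in a metadata record."""
    text = metadata_text.lower()
    return sorted({entry for term, entry in VOCABULARY.items() if term in text})

record = "CTD casts of sea surface temperature in the Chesapeake Bay, 2014."
print(annotate(record))
```

    The faceted output (equipment, measured property, geospatial feature, and so on) is what supports the improved discovery and navigation the abstract describes.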

  6. Translational Research 2.0: a framework for accelerating collaborative discovery.

    PubMed

    Asakiewicz, Chris

    2014-05-01

    The world wide web has revolutionized the conduct of global, cross-disciplinary research. In the life sciences, interdisciplinary approaches to problem solving and collaboration are becoming increasingly important in facilitating knowledge discovery and integration. Web 2.0 technologies promise to have a profound impact, enabling reproducibility, aiding in discovery, and accelerating and transforming medical and healthcare research across the healthcare ecosystem. However, knowledge integration and discovery require a consistent foundation upon which to operate, one capable of addressing some of the critical issues associated with how research is conducted within the ecosystem today and how it should be conducted in the future. This article discusses a framework for enhancing collaborative knowledge discovery across the medical and healthcare research ecosystem, a framework that could serve as a foundation upon which ecosystem stakeholders can enhance the way data, information and knowledge are created, shared and used to accelerate the translation of knowledge from one area of the ecosystem to another.

  7. Efficient discovery of bioactive scaffolds by activity-directed synthesis

    NASA Astrophysics Data System (ADS)

    Karageorgis, George; Warriner, Stuart; Nelson, Adam

    2014-10-01

    The structures and biological activities of natural products have often provided inspiration in drug discovery. The functional benefits of natural products to the host organism steer the evolution of their biosynthetic pathways. Here, we describe a discovery approach—which we term activity-directed synthesis—in which reactions with alternative outcomes are steered towards functional products. Arrays of catalysed reactions of α-diazo amides, whose outcome was critically dependent on the specific conditions used, were performed. The products were assayed at increasingly low concentrations, with the results informing the design of a subsequent reaction array. Finally, promising reactions were scaled up and, after purification, submicromolar ligands based on two scaffolds with no previously annotated activity against the androgen receptor were discovered. The approach enables the discovery, in tandem, of both bioactive small molecules and associated synthetic routes, analogous to the evolution of biosynthetic pathways that yield natural products.

  8. Big data: the next frontier for innovation in therapeutics and healthcare.

    PubMed

    Issa, Naiem T; Byers, Stephen W; Dakshanamurthy, Sivanesan

    2014-05-01

    Advancements in genomics and personalized medicine not only affect healthcare delivery from patient and provider standpoints, but also reshape biomedical discovery. We are in the era of the '-omics', wherein an individual's genome, transcriptome, proteome and metabolome can be scrutinized to the finest resolution to paint a personalized biochemical fingerprint that enables tailored treatments, prognoses, risk factors, etc. Digitization of this information parlays into 'big data' informatics-driven evidence-based medical practice. While individualized patient management is a key beneficiary of next-generation medical informatics, this data also harbors a wealth of novel therapeutic discoveries waiting to be uncovered. 'Big data' informatics allows for networks-driven systems pharmacodynamics whereby drug information can be coupled to cellular- and organ-level physiology for determining whole-body outcomes. Patient '-omics' data can be integrated for ontology-based data-mining for the discovery of new biological associations and drug targets. Here we highlight the potential of 'big data' informatics for clinical pharmacology.

  9. Big data: the next frontier for innovation in therapeutics and healthcare

    PubMed Central

    Issa, Naiem T; Byers, Stephen W; Dakshanamurthy, Sivanesan

    2015-01-01

    Advancements in genomics and personalized medicine not only affect healthcare delivery from patient and provider standpoints, but also reshape biomedical discovery. We are in the era of the “-omics”, wherein an individual’s genome, transcriptome, proteome and metabolome can be scrutinized to the finest resolution to paint a personalized biochemical fingerprint that enables tailored treatments, prognoses, risk factors, etc. Digitization of this information parlays into “big data” informatics-driven evidence-based medical practice. While individualized patient management is a key beneficiary of next-generation medical informatics, this data also harbors a wealth of novel therapeutic discoveries waiting to be uncovered. “Big data” informatics allows for networks-driven systems pharmacodynamics whereby drug information can be coupled to cellular- and organ-level physiology for determining whole-body outcomes. Patient “-omics” data can be integrated for ontology-based data-mining for the discovery of new biological associations and drug targets. Here we highlight the potential of “big data” informatics for clinical pharmacology. PMID:24702684

  10. Repurposing High-Throughput Image Assays Enables Biological Activity Prediction for Drug Discovery.

    PubMed

    Simm, Jaak; Klambauer, Günter; Arany, Adam; Steijaert, Marvin; Wegner, Jörg Kurt; Gustin, Emmanuel; Chupakhin, Vladimir; Chong, Yolanda T; Vialard, Jorge; Buijnsters, Peter; Velter, Ingrid; Vapirev, Alexander; Singh, Shantanu; Carpenter, Anne E; Wuyts, Roel; Hochreiter, Sepp; Moreau, Yves; Ceulemans, Hugo

    2018-05-17

    In both academia and the pharmaceutical industry, large-scale assays for drug discovery are expensive and often impractical, particularly for the increasingly important physiologically relevant model systems that require primary cells, organoids, whole organisms, or expensive or rare reagents. We hypothesized that data from a single high-throughput imaging assay can be repurposed to predict the biological activity of compounds in other assays, even those targeting alternate pathways or biological processes. Indeed, quantitative information extracted from a three-channel microscopy-based screen for glucocorticoid receptor translocation was able to predict assay-specific biological activity in two ongoing drug discovery projects. In these projects, repurposing increased hit rates by 50- to 250-fold over that of the initial project assays while increasing the chemical structure diversity of the hits. Our results suggest that data from high-content screens are a rich source of information that can be used to predict and replace customized biological assays. Copyright © 2018 Elsevier Ltd. All rights reserved.
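
    The core repurposing idea above, predicting activity in one assay from features measured in a different imaging screen, can be illustrated with a deliberately tiny sketch. The feature vectors, compound names, and nearest-neighbor rule are invented for illustration; the paper itself applies large-scale machine learning to real screening data.

```python
# Image-derived feature vectors for screened compounds (values invented) with
# activity labels measured in a *different* assay.
train = {
    "cmpd_a": ([0.9, 0.1, 0.3], 1),  # active in the target assay
    "cmpd_b": ([0.8, 0.2, 0.4], 1),  # active
    "cmpd_c": ([0.1, 0.9, 0.7], 0),  # inactive
}

def predict(features):
    """Predict activity from the nearest labeled compound in feature space."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    _, label = min(train.values(), key=lambda fv: dist(fv[0], features))
    return label

print(predict([0.85, 0.15, 0.35]))  # resembles the active compounds
```

    The point of the sketch is only that features collected once can rank untested compounds for many downstream assays, which is what drives the reported increase in hit rates.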

  11. Mathematical modeling for novel cancer drug discovery and development.

    PubMed

    Zhang, Ping; Brusic, Vladimir

    2014-10-01

    Mathematical modeling enables the in silico classification of cancers, the prediction of disease outcomes, the optimization of therapy, the identification of promising drug targets, and the prediction of resistance to anticancer drugs. In silico pre-screened drug targets can be validated by a small number of carefully selected experiments. This review discusses the basics of mathematical modeling in cancer drug discovery and development. The topics include the in silico discovery of novel molecular drug targets, optimization of immunotherapies, personalized medicine, and guiding preclinical and clinical trials. Breast cancer is used to demonstrate the applications of mathematical modeling in cancer diagnostics, the identification of high-risk populations, cancer screening strategies, prediction of tumor growth, and guiding cancer treatment. Mathematical models are key components of the toolkit used in the fight against cancer. The combinatorial complexity of new drug discovery is enormous, making systematic drug discovery by experimentation alone difficult, if not impossible. The biggest challenges include the seamless integration of growing data, information and knowledge, and making them available for a multiplicity of analyses. Mathematical models are essential for bringing cancer drug discovery into the era of Omics, Big Data and personalized medicine.
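
    As a toy illustration of the tumor-growth prediction mentioned in the abstract (not a model from the review itself; all parameter values are invented), a logistic growth law can be integrated numerically:

```python
def tumor_volume(v0, r, k, days, dt=0.1):
    """Euler integration of logistic growth dV/dt = r * V * (1 - V / K)."""
    v = v0
    for _ in range(int(days / dt)):
        v += dt * r * v * (1 - v / k)
    return v

# Illustrative parameters: 0.1 cm^3 tumor, growth rate 0.2/day, carrying
# capacity 10 cm^3; the volume rises toward the carrying capacity over time.
print(round(tumor_volume(0.1, 0.2, 10.0, days=30), 2))
```

    Even a model this simple shows how fitted parameters could be used to forecast growth between screening intervals; the models discussed in the review are considerably richer.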

  12. Designing and Developing a NASA Research Projects Knowledge Base and Implementing Knowledge Management and Discovery Techniques

    NASA Astrophysics Data System (ADS)

    Dabiru, L.; O'Hara, C. G.; Shaw, D.; Katragadda, S.; Anderson, D.; Kim, S.; Shrestha, B.; Aanstoos, J.; Frisbie, T.; Policelli, F.; Keblawi, N.

    2006-12-01

    The Research Project Knowledge Base (RPKB) is currently being designed and will be implemented in a manner that is fully compatible and interoperable with the enterprise architecture tools developed to support NASA's Applied Sciences Program. Through user needs assessment and collaboration with Stennis Space Center, Goddard Space Flight Center, and NASA's DEVELOP staff, insights into information needs for the RPKB were gathered from across NASA's scientific communities of practice. To enable efficient, consistent, standard, structured, and managed data entry and compilation of research results, a prototype RPKB has been designed and fully integrated with the existing NASA Earth Science Systems Components database. The RPKB will compile research project and keyword information of relevance to the six major science focus areas, the 12 national applications, and the Global Change Master Directory (GCMD). It will include information about projects awarded from NASA research solicitations, project investigator information, research publications, NASA data products employed, and model or decision support tools used or developed, as well as new data product information. The RPKB will be developed in a multi-tier architecture comprising a SQL Server relational database backend, middleware, and front-end client interfaces for data entry. The purpose of this project is to intelligently harvest the results of research sponsored by the NASA Applied Sciences Program and related research programs. We present various approaches for a wide spectrum of knowledge discovery of research results, publications, and projects from the NASA Systems Components database and global information systems, and show how this is implemented in a SQL Server database. The application of knowledge discovery is useful for intelligent query answering and multiple-layered database construction.
Using advanced EA tools such as the Earth Science Architecture Tool (ESAT), RPKB will enable NASA and partner agencies to efficiently identify the significant results for new experiment directions and principle investigators to formulate experiment directions for new proposals.
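    The multi-tier design described above can be sketched as a relational schema with project, publication, and keyword tables. The sketch below is illustrative only, using SQLite in place of SQL Server; the table names, columns, and sample solicitation are hypothetical, not the actual RPKB schema.

```python
import sqlite3

# Illustrative sketch in the spirit of the RPKB back end, using SQLite in
# place of SQL Server. All names and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (
    project_id   INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    focus_area   TEXT,      -- one of the six science focus areas
    solicitation TEXT
);
CREATE TABLE publication (
    pub_id     INTEGER PRIMARY KEY,
    project_id INTEGER REFERENCES project(project_id),
    citation   TEXT
);
CREATE TABLE keyword (
    project_id INTEGER REFERENCES project(project_id),
    term       TEXT        -- e.g. a GCMD keyword
);
""")
conn.execute("INSERT INTO project VALUES (1, 'Flood mapping with EO-1', "
             "'Water Resources', 'ROSES-05')")
conn.execute("INSERT INTO keyword VALUES (1, 'FLOODS')")

# A knowledge-discovery style query: all projects tagged with a keyword.
rows = conn.execute(
    "SELECT p.title FROM project p "
    "JOIN keyword k ON p.project_id = k.project_id "
    "WHERE k.term = 'FLOODS'").fetchall()
print(rows)  # [('Flood mapping with EO-1',)]
```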

  13. WordSeeker: concurrent bioinformatics software for discovering genome-wide patterns and word-based genomic signatures

    PubMed Central

    2010-01-01

    Background: An important focus of genomic science is the discovery and characterization of all functional elements within genomes. In silico methods are used in genome studies to discover putative regulatory genomic elements (called words or motifs). Although a number of methods have been developed for motif discovery, most of them lack the scalability needed to analyze large genomic data sets. Methods: This manuscript presents WordSeeker, an enumerative motif discovery toolkit that utilizes multi-core and distributed computational platforms to enable scalable analysis of genomic data. A controller task coordinates activities of worker nodes, each of which (1) enumerates a subset of the DNA word space and (2) scores words with a distributed Markov chain model. Results: A comprehensive suite of performance tests was conducted to demonstrate the performance, speedup and efficiency of WordSeeker. The scalability of the toolkit enabled the analysis of the entire genome of Arabidopsis thaliana; the results of the analysis were integrated into The Arabidopsis Gene Regulatory Information Server (AGRIS). A public version of WordSeeker was deployed on the Glenn cluster at the Ohio Supercomputer Center. Conclusion: WordSeeker effectively utilizes concurrent computing platforms to enable the identification of putative functional elements in genomic data sets. This capability facilitates the analysis of the large quantity of sequenced genomic data. PMID:21210985
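    The enumerate-and-score idea can be illustrated at toy scale: enumerate every DNA word of length k, count its occurrences, and compare the count with what a background model predicts. This is a minimal single-process sketch in the spirit of WordSeeker, not its actual algorithm; an order-0 background stands in here for the distributed Markov chain model.

```python
from collections import Counter
from itertools import product

# Toy sketch of enumerative word scoring against a background model.
# Sequences are invented; real inputs would be genomic FASTA data.
seqs = ["ACGTACGTAC", "TACGATACGA"]
k = 3

# Observed counts for every k-mer window in the input.
obs = Counter(s[i:i + k] for s in seqs for i in range(len(s) - k + 1))

# Order-0 background: single-base frequencies across all sequences.
bases = Counter(b for s in seqs for b in s)
total = sum(bases.values())
freq = {b: n / total for b, n in bases.items()}

n_windows = sum(len(s) - k + 1 for s in seqs)
scores = {}
for word in map("".join, product("ACGT", repeat=k)):
    expected = n_windows
    for b in word:
        expected *= freq[b]
    # Over-representation ratio: observed / expected under background.
    scores[word] = obs.get(word, 0) / expected

top = max(scores, key=scores.get)
print(top, round(scores[top], 2))
```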

  14. EPA Web Taxonomy

    EPA Pesticide Factsheets

    EPA's Web Taxonomy is a faceted hierarchical vocabulary used to tag web pages with terms from a controlled vocabulary. Tagging enables search and discovery of EPA's Web-based information assets. EPA's Web Taxonomy is being provided in Simple Knowledge Organization System (SKOS) format. SKOS is a standard for sharing and linking knowledge organization systems that promises to make Federal terminology resources more interoperable.
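    The effect of controlled-vocabulary tagging on discovery can be shown with a minimal inverted index from taxonomy terms to tagged pages. The terms and URLs below are invented for illustration and are not EPA's actual vocabulary.

```python
from collections import defaultdict

# Minimal sketch: tagging pages with controlled-vocabulary terms, then
# building an inverted index so tagged pages can be discovered by term.
# All terms and URLs are hypothetical.
pages = {
    "https://example.gov/air-quality": ["Air", "Monitoring"],
    "https://example.gov/lead-paint": ["Lead", "Human Health"],
    "https://example.gov/ozone": ["Air", "Ozone"],
}

index = defaultdict(set)
for url, terms in pages.items():
    for term in terms:
        index[term].add(url)

# Faceted lookup: every page tagged with the term "Air".
print(sorted(index["Air"]))
```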

  15. Information visualisation for science and policy: engaging users and avoiding bias.

    PubMed

    McInerny, Greg J; Chen, Min; Freeman, Robin; Gavaghan, David; Meyer, Miriah; Rowland, Francis; Spiegelhalter, David J; Stefaner, Moritz; Tessarolo, Geizi; Hortal, Joaquin

    2014-03-01

    Visualisations and graphics are fundamental to studying complex subject matter. However, beyond acknowledging this value, scientists and science-policy programmes rarely consider how visualisations can enable discovery, create engaging and robust reporting, or support online resources. Producing accessible and unbiased visualisations from complicated, uncertain data requires expertise and knowledge from science, policy, computing, and design. However, visualisation is rarely found in our scientific training, organisations, or collaborations. As new policy programmes develop [e.g., the Intergovernmental Platform on Biodiversity and Ecosystem Services (IPBES)], information visualisation needs to increasingly permeate both the work of scientists and science policy. The alternative is increased potential for missed discoveries, miscommunications, and, at worst, a bias towards the research that is easiest to display. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Merging Electronic Health Record Data and Genomics for Cardiovascular Research

    PubMed Central

    Hall, Jennifer L.; Ryan, John J.; Bray, Bruce E.; Brown, Candice; Lanfear, David; Newby, L. Kristin; Relling, Mary V.; Risch, Neil J.; Roden, Dan M.; Shaw, Stanley Y.; Tcheng, James E.; Tenenbaum, Jessica; Wang, Thomas N.; Weintraub, William S.

    2017-01-01

    The process of scientific discovery is rapidly evolving. The funding climate has influenced a favorable shift in scientific discovery toward the use of existing resources such as the electronic health record. The electronic health record enables long-term outlooks on human health and disease, in conjunction with multidimensional phenotypes that include laboratory data, images, vital signs, and other clinical information. Initial work has confirmed the utility of the electronic health record for understanding mechanisms and patterns of variability in disease susceptibility, disease evolution, and drug responses. The addition of biobanks and genomic data to the information contained in the electronic health record has been demonstrated. The purpose of this statement is to discuss the current challenges in and the potential for merging electronic health record data and genomics for cardiovascular research. PMID:26976545

  17. Improvements to the Ontology-based Metadata Portal for Unified Semantics (OlyMPUS)

    NASA Astrophysics Data System (ADS)

    Linsinbigler, M. A.; Gleason, J. L.; Huffer, E.

    2016-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support Earth Science data consumers and data providers, enabling the latter to register data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS complements the ODISEES data discovery system with an intelligent tool that enables data producers to auto-generate semantically enhanced metadata and upload it to the metadata repository that drives ODISEES. Like ODISEES, the OlyMPUS metadata provisioning tool leverages robust semantics, a NoSQL database and query engine, an automated reasoning engine that performs first- and second-order deductive inferencing, and a controlled vocabulary to support data interoperability and automated analytics. The ODISEES data discovery portal leverages this metadata to provide a seamless data discovery and access experience for data consumers who are interested in comparing and contrasting the multiple Earth science data products available across NASA data centers. OlyMPUS will support scientists with services and tools for performing complex analyses and identifying correlations and non-obvious relationships across all types of Earth System phenomena, using the full spectrum of NASA Earth Science data available. By providing an intelligent discovery portal that supplies users - both human users and machines - with detailed information about data products, their contents and their structure, ODISEES will reduce the level of effort required to identify and prepare large volumes of data for analysis. This poster will explain how OlyMPUS leverages deductive reasoning and other technologies to create an integrated environment for generating and exploiting semantically rich metadata.

  18. Biophysical Discovery through the Lens of a Computational Microscope

    NASA Astrophysics Data System (ADS)

    Amaro, Rommie

    With exascale computing power on the horizon, improvements in the underlying algorithms and available structural experimental data are enabling new paradigms for chemical discovery. My work has provided key insights for the systematic incorporation of structural information resulting from state-of-the-art biophysical simulations into protocols for inhibitor and drug discovery. We have shown that many disease targets have druggable pockets that are otherwise "hidden" in high resolution x-ray structures, and that this is a common theme across a wide range of targets in different disease areas. We continue to push the limits of computational biophysical modeling by expanding the time and length scales accessible to molecular simulation. My sights are set on, ultimately, the development of detailed physical models of cells, as the fundamental unit of life, and two recent achievements highlight our efforts in this arena. First is the development of a molecular and Brownian dynamics multi-scale modeling framework, which allows us to investigate drug binding kinetics in addition to thermodynamics. In parallel, we have made significant progress developing new tools to extend molecular structure to cellular environments. Collectively, these achievements are enabling the investigation of the chemical and biophysical nature of cells at unprecedented scales.

  19. Bigger Data, Collaborative Tools and the Future of Predictive Drug Discovery

    PubMed Central

    Clark, Alex M.; Swamidass, S. Joshua; Litterman, Nadia; Williams, Antony J.

    2014-01-01

    Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software as a service (SaaS) commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger amounts of data as they accumulate from high-throughput screening, and how they enable the user to draw insights, make predictions, and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas. PMID:24943138

  20. Meta Data Mining in Earth Remote Sensing Data Archives

    NASA Astrophysics Data System (ADS)

    Davis, B.; Steinwand, D.

    2014-12-01

    Modern search and discovery tools for satellite-based remote sensing data are often catalog-based and rely on query systems which use scene- (or granule-) based meta data for those queries. While these traditional catalog systems are often robust, very little has been done in the way of meta data mining to aid in the search and discovery process. The recently coined term "Big Data" can be applied in the remote sensing world's efforts to derive information from the vast data holdings of satellite-based land remote sensing data. Large catalog-based search and discovery systems such as the United States Geological Survey's Earth Explorer system and the NASA Earth Observing System Data and Information System's Reverb-ECHO system provide comprehensive access to these data holdings, but do little to expose the underlying scene-based meta data. These catalog-based systems are extremely flexible, but are manually intensive and often require a high level of user expertise. Exposing scene-based meta data to external, web-based services can enable machine-driven queries to aid in the search and discovery process. Furthermore, services which expose additional scene-based content data (such as product quality information) are now available and can provide a "deeper look" into remote sensing data archives too large for efficient manual search methods. This presentation shows examples of the mining of Landsat and ASTER scene-based meta data, and an experimental service using OPeNDAP to extract information from the quality band of multiple granules in the MODIS archive.
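    A machine-driven metadata query of the kind described above amounts to filtering granule records on scene-level fields such as cloud cover and acquisition year. The records and field names in this sketch are invented for illustration; real catalogs expose such fields through their query APIs.

```python
# Hedged sketch of a machine-driven scene-metadata query: filter granule
# records programmatically instead of browsing a catalog by hand.
# The scene IDs and field names are hypothetical examples.
scenes = [
    {"id": "LC80290302014150LGN00", "cloud_cover": 12.5, "year": 2014},
    {"id": "LC80290302014166LGN00", "cloud_cover": 71.0, "year": 2014},
    {"id": "LC80290302013140LGN01", "cloud_cover": 3.2,  "year": 2013},
]

def usable(scene, max_cloud=20.0, year=2014):
    """A scene is worth retrieving if it is recent and mostly cloud-free."""
    return scene["cloud_cover"] <= max_cloud and scene["year"] == year

hits = [s["id"] for s in scenes if usable(s)]
print(hits)  # ['LC80290302014150LGN00']
```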

  1. Big Data Transforms Discovery-Utilization Therapeutics Continuum.

    PubMed

    Waldman, S A; Terzic, A

    2016-03-01

    Enabling omic technologies adopt a holistic view to produce unprecedented insights into the molecular underpinnings of health and disease, in part, by generating massive high-dimensional biological data. Leveraging these systems-level insights as an engine driving the healthcare evolution is maximized through integration with medical, demographic, and environmental datasets from individuals to populations. Big data analytics has accordingly emerged to add value to the technical aspects of storage, transfer, and analysis required for merging vast arrays of omic-, clinical-, and eco-datasets. In turn, this new field at the interface of biology, medicine, and information science is systematically transforming modern therapeutics across discovery, development, regulation, and utilization. © 2015 ASCPT.

  2. Sensor Webs with a Service-Oriented Architecture for On-demand Science Products

    NASA Technical Reports Server (NTRS)

    Mandl, Daniel; Ungar, Stephen; Ames, Troy; Justice, Chris; Frye, Stuart; Chien, Steve; Tran, Daniel; Cappelaere, Patrice; Derezinski, Linda; Paules, Granville

    2007-01-01

    This paper describes the work being managed by the NASA Goddard Space Flight Center (GSFC) Information System Division (ISD) under a NASA Earth Science Technology Office (ESTO) Advanced Information System Technology (AIST) grant to develop a modular sensor web architecture which enables discovery of sensors and workflows that can create customized science products via a high-level service-oriented architecture based on Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) web service standards. These capabilities serve as a prototype to a user-centric architecture for the Global Earth Observing System of Systems (GEOSS). This work builds on and extends previous sensor web efforts conducted at NASA/GSFC using the Earth Observing 1 (EO-1) satellite and other low-earth orbiting satellites.

  3. Mechanistic models enable the rational use of in vitro drug-target binding kinetics for better drug effects in patients.

    PubMed

    de Witte, Wilhelmus E A; Wong, Yin Cheong; Nederpelt, Indira; Heitman, Laura H; Danhof, Meindert; van der Graaf, Piet H; Gilissen, Ron A H J; de Lange, Elizabeth C M

    2016-01-01

    Drug-target binding kinetics are major determinants of the time course of drug action for several drugs, as clearly described for the irreversible binders omeprazole and aspirin. This supports the increasing interest in incorporating newly developed high-throughput assays for drug-target binding kinetics into drug discovery. A meaningful application of in vitro drug-target binding kinetics in drug discovery requires insight into the relation between in vivo drug effect and in vitro measured drug-target binding kinetics. In this review, the authors discuss both the relation between in vitro and in vivo measured binding kinetics and the relation between in vivo binding kinetics, target occupancy and effect profiles. More scientific evidence is required for the rational selection and development of drug candidates on the basis of in vitro estimates of drug-target binding kinetics. To elucidate the value of in vitro binding kinetics measurements, it is necessary to obtain information on system-specific properties which influence the kinetics of target occupancy and drug effect. Mathematical integration of this information enables the identification of drug-specific properties which lead to optimal target occupancy and drug effect in patients.
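    The link between binding kinetics and target occupancy can be made concrete with the standard one-step binding model, d[RL]/dt = kon·C·(Rtot − [RL]) − koff·[RL], integrated here by simple Euler steps at constant free drug concentration. The parameter values are arbitrary illustrations, not data from the review.

```python
# Illustrative sketch of how binding kinetics shape target occupancy over
# time: Euler integration of d[RL]/dt = kon*C*(Rtot - [RL]) - koff*[RL]
# at constant free drug concentration C. All parameter values are arbitrary.
kon = 1.0e6    # association rate constant, 1/(M*s)
koff = 1.0e-3  # dissociation rate constant, 1/s; slow -> long residence time
C = 1.0e-8     # free drug concentration, M
Rtot = 1.0     # total target, normalized

RL, dt = 0.0, 1.0          # occupancy and time step (s)
for _ in range(100_000):   # ~28 h of simulated time, well past equilibrium
    RL += dt * (kon * C * (Rtot - RL) - koff * RL)

# At equilibrium, occupancy approaches C / (C + Kd), with Kd = koff / kon.
Kd = koff / kon
print(round(RL, 3), round(C / (C + Kd), 3))  # both ~0.909
```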

  4. An Efficient Workflow Environment to Support the Collaborative Development of Actionable Climate Information Using the NCAR Climate Risk Management Engine (CRMe)

    NASA Astrophysics Data System (ADS)

    Ammann, C. M.; Vigh, J. L.; Lee, J. A.

    2016-12-01

    Society's growing needs for robust and relevant climate information have fostered an explosion in tools and frameworks for processing climate projections. Many top-down workflows might be employed to generate sets of pre-computed data and plots, frequently served in a "loading-dock style" through a metadata-enabled search and discovery engine. Despite these increasing resources, the diverse needs of applications-driven projects often result in data processing workflow requirements that cannot be fully satisfied using past approaches. In parallel to the data processing challenges, the provision of climate information to users in a form that is also usable represents a formidable challenge of its own. Finally, many users do not have the time nor the desire to synthesize and distill massive volumes of climate information to find the relevant information for their particular application. All of these considerations call for new approaches to developing actionable climate information. CRMe seeks to bridge the gap between the diversity and richness of bottom-up needs of practitioners, with discrete, structured top-down workflows typically implemented for rapid delivery. Additionally, CRMe has implemented web-based data services capable of providing focused climate information in usable form for a given location, or as spatially aggregated information for entire regions or countries following the needs of users and sectors. Making climate data actionable also involves summarizing and presenting it in concise and approachable ways. CRMe is developing the concept of dashboards, co-developed with the users, to condense the key information into a quick summary of the most relevant, curated climate data for a given discipline, application, or location, while still enabling users to efficiently conduct deeper discovery into rich datasets on an as-needed basis.

  5. Phoenix: SOA based information management services

    NASA Astrophysics Data System (ADS)

    Grant, Rob; Combs, Vaughn; Hanna, Jim; Lipa, Brian; Reilly, Jim

    2009-05-01

    The Air Force Research Laboratory (AFRL) has developed a reference set of Information Management (IM) Services that will provide an essential piece of the envisioned final Net-Centric IM solution for the Department of Defense (DoD). These IM Services will provide mission critical functionality to enable seamless interoperability between existing and future DoD systems and services while maintaining a highly available IM capability across the wide spectrum of differing scalability and performance requirements. AFRL designed this set of IM Services for integration with other DoD and commercial SOA environments. The services developed will provide capabilities for information submission, information brokering and discovery, repository, query, type management, dissemination, session management, authorization, service brokering and event notification. In addition, the IM services support common information models that facilitate the management and dissemination of information consistent with client needs and established policy. The services support flexible and extensible definitions of session, service, and channel contexts that enable the application of Quality of Service (QoS) and security policies at many levels within the SOA.

  6. Use of Semantic Technology to Create Curated Data Albums

    NASA Technical Reports Server (NTRS)

    Ramachandran, Rahul; Kulkarni, Ajinkya; Li, Xiang; Sainju, Roshan; Bakare, Rohan; Basyal, Sabin

    2014-01-01

    One of the continuing challenges in any Earth science investigation is the discovery and access of useful science content from the increasingly large volumes of Earth science data and related information available online. Current Earth science data systems are designed with the assumption that researchers access data primarily by instrument or geophysical parameter. Those who know exactly the data sets they need can obtain the specific files using these systems. However, in cases where researchers are interested in studying an event of research interest, they must manually assemble a variety of relevant data sets by searching the different distributed data systems. Consequently, there is a need to design and build specialized search and discovery tools in Earth science that can filter through large volumes of distributed online data and information and aggregate only the relevant resources needed to support climatology and case studies. This paper presents a specialized search and discovery tool that automatically creates curated Data Albums. The tool was designed to enable key elements of the search process such as dynamic interaction and sense-making. The tool supports dynamic interaction via different modes of interactivity and visual presentation of information. The compilation of information and data into a Data Album is analogous to a shoebox within the sense-making framework. This tool automates most of the tedious information/data gathering tasks for researchers. Data curation by the tool is achieved via an ontology-based, relevancy ranking algorithm that filters out nonrelevant information and data. The curation enables better search results as compared to the simple keyword searches provided by existing data systems in Earth science.

  7. Zebra Crossing Spotter: Automatic Population of Spatial Databases for Increased Safety of Blind Travelers

    PubMed Central

    Ahmetovic, Dragan; Manduchi, Roberto; Coughlan, James M.; Mascetti, Sergio

    2016-01-01

    In this paper we propose a computer vision-based technique that mines existing spatial image databases for discovery of zebra crosswalks in urban settings. Knowing the location of crosswalks is critical for a blind person planning a trip that includes street crossing. By augmenting existing spatial databases (such as Google Maps or OpenStreetMap) with this information, a blind traveler may make more informed routing decisions, resulting in greater safety during independent travel. Our algorithm first searches for zebra crosswalks in satellite images; all candidates thus found are validated against spatially registered Google Street View images. This cascaded approach enables fast and reliable discovery and localization of zebra crosswalks in large image datasets. While fully automatic, our algorithm could also be complemented by a final crowdsourcing validation stage for increased accuracy. PMID:26824080
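    The cascaded approach reduces to a cheap, high-recall first stage whose candidates are passed to an expensive validation stage. The stand-in detectors below only illustrate this control flow; they are not the paper's actual vision algorithms, and the scores and fields are invented.

```python
# Sketch of cascaded detection: a fast first-stage screen proposes
# candidates, a costlier second stage validates each against an
# independent view. Detectors here are hypothetical stand-ins.
def cheap_satellite_stage(tiles):
    # Fast screen over many satellite tiles; high recall, low precision.
    return [t for t in tiles if t["stripe_score"] > 0.3]

def expensive_streetview_stage(candidates):
    # Costly validation against a spatially registered second view.
    return [c for c in candidates if c["street_view_match"]]

tiles = [
    {"id": 1, "stripe_score": 0.9, "street_view_match": True},
    {"id": 2, "stripe_score": 0.5, "street_view_match": False},  # false alarm
    {"id": 3, "stripe_score": 0.1, "street_view_match": True},   # not proposed
]

crosswalks = expensive_streetview_stage(cheap_satellite_stage(tiles))
print([c["id"] for c in crosswalks])  # [1]
```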

  8. Information analytics for healthcare service discovery.

    PubMed

    Sun, Lily; Yamin, Mohammad; Mushi, Cleopa; Liu, Kecheng; Alsaigh, Mohammed; Chen, Fabian

    2014-01-01

    The concept of being 'patient-centric' is a challenge to many existing healthcare service provision practices. This paper focuses on the issue of referrals, where multiple stakeholders, such as General Practitioners (GPs) and patients, are encouraged to make a consensual decision based on patients' needs. In this paper, we present an ontology-enabled healthcare service provision, which facilitates both patients and GPs in jointly deciding upon the referral decision. In the healthcare service provision model, we define three types of profiles which represent different stakeholders' requirements. This model also comprises a set of healthcare service discovery processes: articulating a service need, matching the need with the healthcare service offerings, and deciding on a best-fit service for acceptance. As a result, the healthcare service provision can carry out coherent analysis using personalised information and iterative processes that deal with requirements which change over time.

  9. EDITORIAL: Nobel Prize in Physiology or Medicine 2003 awarded to Paul Lauterbur and Peter Mansfield for discoveries concerning magnetic resonance imaging

    NASA Astrophysics Data System (ADS)

    Leach, Martin O.

    2004-02-01

    The award of the Nobel Prize in Physiology or Medicine recognizes discoveries concerning the use of magnetic resonance to visualize different structures. The Assembly's decision to recognize the discoveries underpinning efficient spatial mapping of biological properties reflects the singular importance of imaging to the medical application of this technique. Without this, abnormalities in morphology cannot be recognized. Equally, the wealth of physiological information that can be obtained by manipulation of the magnetic resonance signal is of little value unless localized to identified organs, pathology or areas of tissue. Based on these early discoveries, a wide range of imaging and measurement techniques, together with enabling instrumentation, have been developed over the last 30 years. Commercial equipment became available in the early 1980s, and some 60 million MRI examinations are now performed each year. The power of the technique, and the range of applications, continues to develop rapidly. The full text of this editorial is given in the PDF file below.

  10. Object-graphs for context-aware visual category discovery.

    PubMed

    Lee, Yong Jae; Grauman, Kristen

    2012-02-01

    How can knowing about some categories help us to discover new ones in unlabeled images? Unsupervised visual category discovery is useful to mine for recurring objects without human supervision, but existing methods assume no prior information and thus tend to perform poorly for cluttered scenes with multiple objects. We propose to leverage knowledge about previously learned categories to enable more accurate discovery, and address challenges in estimating their familiarity in unsegmented, unlabeled images. We introduce two variants of a novel object-graph descriptor to encode the 2D and 3D spatial layout of object-level co-occurrence patterns relative to an unfamiliar region and show that by using them to model the interaction between an image’s known and unknown objects, we can better detect new visual categories. Rather than mine for all categories from scratch, our method identifies new objects while drawing on useful cues from familiar ones. We evaluate our approach on several benchmark data sets and demonstrate clear improvements in discovery over conventional purely appearance-based baselines.

  11. Current status and future prospects for enabling chemistry technology in the drug discovery process.

    PubMed

    Djuric, Stevan W; Hutchins, Charles W; Talaty, Nari N

    2016-01-01

    This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of "dangerous" reagents. Also featured are advances in the "computer-assisted drug design" area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities.

  12. Fusing Sensor Paradigms to Acquire Chemical Information: An Integrative Role for Smart Biopolymeric Hydrogels

    PubMed Central

    Kim, Eunkyoung; Liu, Yi; Ben-Yoav, Hadar; Winkler, Thomas E.; Yan, Kun; Shi, Xiaowen; Shen, Jana; Kelly, Deanna L.; Ghodssi, Reza; Bentley, William E.

    2017-01-01

    The Information Age transformed our lives but it has had surprisingly little impact on the way chemical information (e.g., from our biological world) is acquired, analyzed and communicated. Sensor systems are poised to change this situation by providing rapid access to chemical information. This access will be enabled by technological advances from various fields: biology enables the synthesis, design and discovery of molecular recognition elements as well as the generation of cell-based signal processors; physics and chemistry are providing nano-components that facilitate the transmission and transduction of signals rich with chemical information; microfabrication is yielding sensors capable of receiving these signals through various modalities; and signal processing analysis enhances the extraction of chemical information. The authors contend that integral to the development of functional sensor systems will be materials that (i) enable the integrative and hierarchical assembly of various sensing components (for chemical recognition and signal transduction) and (ii) facilitate meaningful communication across modalities. It is suggested that stimuli-responsive self-assembling biopolymers can perform such integrative functions, and redox provides modality-spanning communication capabilities. Recent progress toward the development of electrochemical sensors to manage schizophrenia is used to illustrate the opportunities and challenges for enlisting sensors for chemical information processing. PMID:27616350

  13. Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruebel, Oliver

    2009-11-20

    Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the growing number of data dimensions and data objects presents tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data, and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework has been integrated with MATLAB and the visualization, making advanced analysis tools accessible to biologists and enabling bioinformatics researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics.
    To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges, this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams, enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.

  14. A Fully Automated High-Throughput Flow Cytometry Screening System Enabling Phenotypic Drug Discovery.

    PubMed

    Joslin, John; Gilligan, James; Anderson, Paul; Garcia, Catherine; Sharif, Orzala; Hampton, Janice; Cohen, Steven; King, Miranda; Zhou, Bin; Jiang, Shumei; Trussell, Christopher; Dunn, Robert; Fathman, John W; Snead, Jennifer L; Boitano, Anthony E; Nguyen, Tommy; Conner, Michael; Cooke, Mike; Harris, Jennifer; Ainscow, Ed; Zhou, Yingyao; Shaw, Chris; Sipes, Dan; Mainquist, James; Lesley, Scott

    2018-05-01

    The goal of high-throughput screening is to evaluate compound libraries in an automated manner and identify quality starting points for optimization. This often involves screening a large diversity of compounds in an assay that preserves a connection to the disease pathology. Phenotypic screening is a powerful tool for drug identification, in that assays can be run without prior understanding of the target and with primary cells that closely mimic the therapeutic setting. Advanced automation and high-content imaging have enabled many complex assays, but these are still relatively slow and low throughput. To address this limitation, we have developed an automated workflow that is dedicated to processing complex phenotypic assays for flow cytometry. The system can achieve a throughput of 50,000 wells per day, resulting in a fully automated platform that enables robust phenotypic drug discovery. Over the past 5 years, this screening system has been used for a variety of drug discovery programs, across many disease areas, with many molecules advancing quickly into preclinical development and into the clinic. This report will highlight a diversity of approaches that automated flow cytometry has enabled for phenotypic drug discovery.

  15. Generation of a novel next-generation sequencing-based method for the isolation of new human papillomavirus types.

    PubMed

    Brancaccio, Rosario N; Robitaille, Alexis; Dutta, Sankhadeep; Cuenin, Cyrille; Santare, Daiga; Skenders, Girts; Leja, Marcis; Fischer, Nicole; Giuliano, Anna R; Rollison, Dana E; Grundhoff, Adam; Tommasino, Massimo; Gheit, Tarik

    2018-05-07

    With the advent of new molecular tools, the discovery of new papillomaviruses (PVs) has accelerated during the past decade, enabling the expansion of knowledge about the viral populations that inhabit the human body. Human PVs (HPVs) are etiologically linked to benign or malignant lesions of the skin and mucosa. The detection of HPV types can vary widely, depending mainly on the methodology and the quality of the biological sample. Next-generation sequencing is one of the most powerful tools, enabling the discovery of novel viruses in a wide range of biological material. Here, we report a novel protocol for the detection of known and unknown HPV types in human skin and oral gargle samples using improved PCR protocols combined with next-generation sequencing. We identified 105 putative new PV types in addition to 296 known types, thus providing important information about the viral distribution in the oral cavity and skin. Copyright © 2018. Published by Elsevier Inc.

  16. 2018 Cyber Enabled Emerging Technologies Symposium

    DTIC Science & Technology

    2018-03-08

    Principles: better data = better outcomes; training > programming; AI anxiety? Think IA (Intelligent Assistant); ingest much more information. Usage ("local"/specific AI): healthcare (oncology), data mining/discovery, chat bots, personnel, finance, sourcing, local marketing. Priorities for AI adoption and ethics: purpose (human augmentation versus replacement); human decision-making.

  17. Relating the "mirrorness" of mirror neurons to their origins.

    PubMed

    Kilner, James M; Friston, Karl J

    2014-04-01

    Ever since their discovery, mirror neurons have generated much interest and debate. A commonly held view of mirror neuron function is that they transform "visual information into knowledge," thus enabling action understanding and non-verbal social communication between conspecifics (Rizzolatti & Craighero 2004). This functionality is thought to be so important that it has been argued that mirror neurons must be a result of selective pressure.

  18. Disaster Response Tools for Decision Support and Data Discovery - E-DECIDER and GeoGateway

    NASA Astrophysics Data System (ADS)

    Glasscoe, M. T.; Donnellan, A.; Parker, J. W.; Granat, R. A.; Lyzenga, G. A.; Pierce, M. E.; Wang, J.; Grant Ludwig, L.; Eguchi, R. T.; Huyck, C. K.; Hu, Z.; Chen, Z.; Yoder, M. R.; Rundle, J. B.; Rosinski, A.

    2015-12-01

    Providing actionable data for situational awareness following an earthquake or other disaster is critical to decision makers in order to improve their ability to anticipate requirements and provide appropriate resources for response. E-DECIDER (Emergency Data Enhanced Cyber-Infrastructure for Disaster Evaluation and Response) is a decision support system producing remote sensing and geophysical modeling products that are relevant to the emergency preparedness and response communities and serves as a gateway to enable the delivery of actionable information to these communities. GeoGateway is a data product search and analysis gateway for scientific discovery, field use, and disaster response focused on NASA UAVSAR and GPS data that integrates with fault data, seismicity and models. Key information on the nature, magnitude and scope of damage, or Essential Elements of Information (EEI), necessary to achieve situational awareness are often generated from a wide array of organizations and disciplines, using any number of geospatial and non-geospatial technologies. We have worked in partnership with the California Earthquake Clearinghouse to develop actionable data products for use in their response efforts, particularly in regularly scheduled, statewide exercises like the recent May 2015 Capstone/SoCal NLE/Ardent Sentry Exercises and in the August 2014 South Napa earthquake activation. We also provided a number of products, services, and consultation to the NASA agency-wide response to the April 2015 Gorkha, Nepal earthquake. We will present perspectives on developing tools for decision support and data discovery in partnership with the Clearinghouse and for the Nepal earthquake. Products delivered included map layers as part of the common operational data plan for the Clearinghouse, delivered through XchangeCore Web Service Data Orchestration, enabling users to create merged datasets from multiple providers. 
For the Nepal response effort, products included models, damage and loss estimates, and aftershock forecasts that were posted to a NASA information site and delivered directly to end-users such as USAID, OFDA, World Bank, and UNICEF.

  19. Current status and future prospects for enabling chemistry technology in the drug discovery process

    PubMed Central

    Djuric, Stevan W.; Hutchins, Charles W.; Talaty, Nari N.

    2016-01-01

    This review covers recent advances in the implementation of enabling chemistry technologies into the drug discovery process. Areas covered include parallel synthesis chemistry, high-throughput experimentation, automated synthesis and purification methods, flow chemistry methodology including photochemistry, electrochemistry, and the handling of “dangerous” reagents. Also featured are advances in the “computer-assisted drug design” area and the expanding application of novel mass spectrometry-based techniques to a wide range of drug discovery activities. PMID:27781094

  20. Three-Component Reaction Discovery Enabled by Mass Spectrometry of Self-Assembled Monolayers

    PubMed Central

    Montavon, Timothy J.; Li, Jing; Cabrera-Pardo, Jaime R.; Mrksich, Milan; Kozmin, Sergey A.

    2011-01-01

    Multi-component reactions have been extensively employed in many areas of organic chemistry. Despite significant progress, the discovery of such enabling transformations remains challenging. Here, we present the development of a parallel, label-free reaction-discovery platform, which can be used for identification of new multi-component transformations. Our approach is based on the parallel mass spectrometric screening of interfacial chemical reactions on arrays of self-assembled monolayers. This strategy enabled the identification of a simple organic phosphine that can catalyze a previously unknown condensation of siloxy alkynes, aldehydes and amines to produce 3-hydroxy amides with high efficiency and diastereoselectivity. The reaction was further optimized using solution phase methods. PMID:22169871

  1. Fragment-based hit discovery and structure-based optimization of aminotriazoloquinazolines as novel Hsp90 inhibitors.

    PubMed

    Casale, Elena; Amboldi, Nadia; Brasca, Maria Gabriella; Caronni, Dannica; Colombo, Nicoletta; Dalvit, Claudio; Felder, Eduard R; Fogliatto, Gianpaolo; Galvani, Arturo; Isacchi, Antonella; Polucci, Paolo; Riceputi, Laura; Sola, Francesco; Visco, Carlo; Zuccotto, Fabio; Casuscelli, Francesco

    2014-08-01

    In the last decade the heat shock protein 90 (Hsp90) has emerged as a major therapeutic target and many efforts have been dedicated to the discovery of Hsp90 inhibitors as new potent anticancer agents. Here we report the identification of a novel class of Hsp90 inhibitors by means of a biophysical FAXS-NMR based screening of a library of fragments. The use of X-ray structure information combined with modeling studies enabled the fragment evolution of the initial triazoloquinazoline hit to a class of compounds with nanomolar potency and drug-like properties suited for further lead optimization. Copyright © 2014 Elsevier Ltd. All rights reserved.

  2. Consolidating a Distributed Compound Management Capability into a Single Installation: The Application of Overall Equipment Effectiveness to Determine Capacity Utilization.

    PubMed

    Green, Clive; Taylor, Daniel

    2016-12-01

    Compound management (CM) is a critical discipline enabling hit discovery through the production of assay-ready compound plates for screening. CM in pharma requires significant investments in manpower, capital equipment, repairs and maintenance, and information technology. These investments are at risk from external factors, for example, new technology rendering existing equipment obsolete and strategic site closures. At AstraZeneca, we faced the challenge of evaluating the number of CM sites required to support hit discovery in response to site closures and pressure on our operating budget. We reasoned that overall equipment effectiveness, a tool used extensively in the manufacturing sector, could determine the equipment capacity and appropriate number of sites. We identified automation downtime as the critical component governing capacity, and a connection between automation downtime and the availability of skilled staff. We demonstrated that sufficient production capacity existed in two sites to meet hit discovery demand without the requirement for an additional investment of $7 million in new facilities. In addition, we developed an automated capacity model that incorporated an extended working-day pattern as a solution for reducing automation downtime. The application of this solution enabled the transition to a single site, with an annual cost saving of $2.1 million. © 2015 Society for Laboratory Automation and Screening.
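    Overall equipment effectiveness, the manufacturing metric the authors applied to compound-management automation, is conventionally computed as the product of three factors: availability, performance, and quality. A minimal sketch follows; the factor values are invented for illustration, since the abstract does not report AstraZeneca's actual figures.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Overall equipment effectiveness: the standard three-factor product."""
    return availability * performance * quality

# Example: a liquid-handling robot that is up for 85% of scheduled time,
# runs at 90% of its ideal rate, and produces 99% usable assay plates.
score = oee(0.85, 0.90, 0.99)
print(f"OEE = {score:.3f}")  # → OEE = 0.757
```

    Automation downtime enters through the availability factor, which is consistent with the authors' finding that downtime was the critical component governing capacity.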

  3. Pituitary Medicine From Discovery to Patient-Focused Outcomes

    PubMed Central

    2016-01-01

    Context: This perspective traces a pipeline of discovery in pituitary medicine over the past 75 years. Objective: To place past advances in context and predict future changes in understanding pituitary pathophysiology and clinical care. Design: Author's perspective on reports of pituitary advances in the published literature. Setting: Clinical and translational endocrinology. Outcomes: Discovery of the hypothalamic-pituitary axis and mechanisms for pituitary control has culminated in an exquisite understanding of anterior pituitary cell function and dysfunction. Challenges facing the discipline include a fundamental understanding of pituitary adenoma pathogenesis, leading to more effective treatments of inexorably growing and debilitating hormone-secreting pituitary tumors, as well as medical management of non-secreting pituitary adenomas. Newly emerging pituitary syndromes include those associated with immune-targeted cancer therapies and head trauma. Conclusions: Novel diagnostic techniques, including imaging and genomic, proteomic, and biochemical analyses, will yield further knowledge to enable diagnosis of heretofore cryptic syndromes, as well as subclassification of pituitary syndromes for personalized treatment approaches. Cost-effective personalized approaches to precision therapy must demonstrate value and will be empowered by multidisciplinary approaches to integrating complex subcellular information to identify therapeutic targets for enabling maximal outcomes. These goals will be challenging to attain given the rarity of pituitary disorders and the difficulty in conducting appropriately powered prospective trials. PMID:26908107

  4. Merging Electronic Health Record Data and Genomics for Cardiovascular Research: A Science Advisory From the American Heart Association.

    PubMed

    Hall, Jennifer L; Ryan, John J; Bray, Bruce E; Brown, Candice; Lanfear, David; Newby, L Kristin; Relling, Mary V; Risch, Neil J; Roden, Dan M; Shaw, Stanley Y; Tcheng, James E; Tenenbaum, Jessica; Wang, Thomas N; Weintraub, William S

    2016-04-01

    The process of scientific discovery is rapidly evolving. The funding climate has influenced a favorable shift in scientific discovery toward the use of existing resources such as the electronic health record. The electronic health record enables long-term outlooks on human health and disease, in conjunction with multidimensional phenotypes that include laboratory data, images, vital signs, and other clinical information. Initial work has confirmed the utility of the electronic health record for understanding mechanisms and patterns of variability in disease susceptibility, disease evolution, and drug responses. The addition of biobanks and genomic data to the information contained in the electronic health record has been demonstrated. The purpose of this statement is to discuss the current challenges in and the potential for merging electronic health record data and genomics for cardiovascular research. © 2016 American Heart Association, Inc.

  5. The Application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to Support Drug Discovery Research

    PubMed Central

    Ratnam, Joseline; Zdrazil, Barbara; Digles, Daniela; Cuadrado-Rodriguez, Emiliano; Neefs, Jean-Marc; Tipney, Hannah; Siebes, Ronald; Waagmeester, Andra; Bradley, Glyn; Chau, Chau Han; Richter, Lars; Brea, Jose; Evelo, Chris T.; Jacoby, Edgar; Senger, Stefan; Loza, Maria Isabel; Ecker, Gerhard F.; Chichester, Christine

    2014-01-01

    Integration of open-access, curated, high-quality information from multiple disciplines in the life and biomedical sciences provides a holistic understanding of the domain. Additionally, the effective linking of diverse data sources can unearth hidden relationships and guide potential research strategies. However, given the lack of consistency between descriptors and identifiers used in different resources and the absence of a simple mechanism to link them, gathering and combining relevant, comprehensive information from diverse databases remains a challenge. The Open Pharmacological Concepts Triple Store (Open PHACTS) is an Innovative Medicines Initiative project that uses semantic web technology approaches to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems. The project draws together sources of publicly available pharmacological, physicochemical, and biomolecular data, represents them in a stable infrastructure, and provides well-defined information exploration and retrieval methods. Here, we highlight the utility of this platform in conjunction with workflow tools to solve pharmacological research questions that require interoperability between target, compound, and pathway data. Use cases presented herein cover (1) the comprehensive identification of chemical matter for a dopamine receptor drug discovery program, (2) the identification of compounds active against all targets in the epidermal growth factor receptor (ErbB) signaling pathway that have a relevance to disease, and (3) the evaluation of established targets in the vitamin D metabolism pathway to aid novel vitamin D analogue design. The example workflows presented illustrate how the Open PHACTS Discovery Platform can be used to exploit existing knowledge and generate new hypotheses in the process of drug discovery. PMID:25522365

  6. Discovery-2: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria

    PubMed Central

    2013-01-01

    Background: Drug resistance to anti-malarial compounds remains a serious problem, with resistance to newer pharmaceuticals developing at an alarming rate. The development of new anti-malarials remains a priority, and the rational selection of putative targets is a key element of this process. Discovery-2 is an update of the original Discovery in silico resource for the rational selection of putative drug target proteins, enabling researchers to obtain information about a protein that may be useful in target selection, and to perform advanced filtering of proteins encoded by the malaria genome based on a series of molecular properties. Methods: An updated in silico resource has been developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein properties used include domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, and protein-ligand interactions. Newly added features include druggability measures from ChEMBL, automated literature relations, and links to clinical trial information. Searching by chemical structure is also available. Results: The updated functionality of the Discovery-2 resource is presented, together with a detailed case study of the Plasmodium falciparum S-adenosyl-L-homocysteine hydrolase (PfSAHH) protein. A short example of a chemical search with pyrimethamine is also illustrated. Conclusion: The updated Discovery-2 resource allows researchers to obtain detailed properties of proteins from the malaria genome, which may be of interest in the target selection process, and to perform advanced filtering and selection of proteins based on a relevant range of molecular characteristics. PMID:23537208

  7. The Biomedical Resource Ontology (BRO) to Enable Resource Discovery in Clinical and Translational Research

    PubMed Central

    Tenenbaum, Jessica D.; Whetzel, Patricia L.; Anderson, Kent; Borromeo, Charles D.; Dinov, Ivo D.; Gabriel, Davera; Kirschner, Beth; Mirel, Barbara; Morris, Tim; Noy, Natasha; Nyulas, Csongor; Rubenson, David; Saxman, Paul R.; Singh, Harpreet; Whelan, Nancy; Wright, Zach; Athey, Brian D.; Becich, Michael J.; Ginsburg, Geoffrey S.; Musen, Mark A.; Smith, Kevin A.; Tarantal, Alice F.; Rubin, Daniel L; Lyster, Peter

    2010-01-01

    The biomedical research community relies on a diverse set of resources, both within their own institutions and at other research centers. In addition, an increasing number of shared electronic resources have been developed. Without effective means to locate and query these resources, it is challenging, if not impossible, for investigators to be aware of the myriad resources available, or to effectively perform resource discovery when the need arises. In this paper, we describe the development and use of the Biomedical Resource Ontology (BRO) to enable semantic annotation and discovery of biomedical resources. We also describe the Resource Discovery System (RDS) which is a federated, inter-institutional pilot project that uses the BRO to facilitate resource discovery on the Internet. Through the RDS framework and its associated Biositemaps infrastructure, the BRO facilitates semantic search and discovery of biomedical resources, breaking down barriers and streamlining scientific research that will improve human health. PMID:20955817

  8. NASA's Universe of Learning: Engaging Learners in Discovery

    NASA Astrophysics Data System (ADS)

    Cominsky, L.; Smith, D. A.; Lestition, K.; Greene, M.; Squires, G.

    2016-12-01

    NASA's Universe of Learning is one of 27 competitively awarded education programs selected by NASA's Science Mission Directorate (SMD) to enable scientists and engineers to more effectively engage with learners of all ages. The NASA's Universe of Learning program is created through a partnership between the Space Telescope Science Institute, Chandra X-ray Center, IPAC at Caltech, Jet Propulsion Laboratory Exoplanet Exploration Program, and Sonoma State University. The program will connect the scientists, engineers, science, technology and adventure of NASA Astrophysics with audience needs, proven infrastructure, and a network of over 500 partners to advance the objectives of SMD's newly restructured education program. The multi-institutional team will develop and deliver a unified, consolidated suite of education products, programs, and professional development offerings that spans the full spectrum of NASA Astrophysics, including the Exoplanet Exploration theme. Program elements include enabling educational use of Astrophysics mission data and offering participatory experiences; creating multimedia and immersive experiences; designing exhibits and community programs; providing professional development for pre-service educators, undergraduate instructors, and informal educators; and, producing resources for special needs and underserved/underrepresented audiences. This presentation will provide an overview of the program and process for mapping discoveries to products and programs for informal, lifelong, and self-directed learning environments.

  9. To ontologise or not to ontologise: An information model for a geospatial knowledge infrastructure

    NASA Astrophysics Data System (ADS)

    Stock, Kristin; Stojanovic, Tim; Reitsma, Femke; Ou, Yang; Bishr, Mohamed; Ortmann, Jens; Robertson, Anne

    2012-08-01

    A geospatial knowledge infrastructure consists of a set of interoperable components, including software, information, hardware, procedures and standards, that work together to support advanced discovery and creation of geoscientific resources, including publications, data sets and web services. The focus of the work presented is the development of such an infrastructure for resource discovery. Advanced resource discovery is intended to support scientists in finding resources that meet their needs, and focuses on representing the semantic details of the scientific resources, including the detailed aspects of the science that led to the resource being created. This paper describes an information model for a geospatial knowledge infrastructure that uses ontologies to represent these semantic details, including knowledge about domain concepts, the scientific elements of the resource (analysis methods, theories and scientific processes) and web services. This semantic information can be used to enable more intelligent search over scientific resources, and to support new ways to infer and visualise scientific knowledge. The work describes the requirements for semantic support of a knowledge infrastructure, and analyses the different options for information storage based on the twin goals of semantic richness and syntactic interoperability to allow communication between different infrastructures. Such interoperability is achieved by the use of open standards, and the architecture of the knowledge infrastructure adopts such standards, particularly from the geospatial community. The paper then describes an information model that uses a range of different types of ontologies, explaining those ontologies and their content. The information model was successfully implemented in a working geospatial knowledge infrastructure, but the evaluation identified some issues in creating the ontologies.

  10. Semantics-enabled service discovery framework in the SIMDAT pharma grid.

    PubMed

    Qu, Cangtao; Zimmermann, Falk; Kumpf, Kai; Kamuzinzi, Richard; Ledent, Valérie; Herzog, Robert

    2008-03-01

    We present the design and implementation of a semantics-enabled service discovery framework in the SIMDAT (data Grids for process and product development using numerical simulation and knowledge discovery) Pharma Grid, an industry-oriented Grid environment for integrating thousands of Grid-enabled biological data services and analysis services. The framework consists of three major components: the Web Ontology Language (OWL)-description logic (DL)-based biological domain ontology, the OWL Web service ontology (OWL-S)-based service annotation, and the semantic matchmaker based on ontology reasoning. Built upon the framework, workflow technologies are extensively exploited in SIMDAT to assist biologists in (semi)automatically performing in silico experiments. We present a typical usage scenario through the case study of a biological workflow: IXodus.

  11. Enabling drug discovery project decisions with integrated computational chemistry and informatics

    NASA Astrophysics Data System (ADS)

    Tsui, Vickie; Ortwine, Daniel F.; Blaney, Jeffrey M.

    2017-03-01

    Computational chemistry/informatics scientists and software engineers in Genentech Small Molecule Drug Discovery collaborate with experimental scientists in a therapeutic project-centric environment. Our mission is to enable and improve pre-clinical drug discovery design and decisions. Our goal is to deliver timely data, analysis, and modeling to our therapeutic project teams using best-in-class software tools. We describe our strategy, the organization of our group, and our approaches to reach this goal. We conclude with a summary of the interdisciplinary skills required for computational scientists and recommendations for their training.

  12. Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics

    PubMed Central

    Brusniak, Mi-Youn; Bodenmiller, Bernd; Campbell, David; Cooke, Kelly; Eddes, James; Garbutt, Andrew; Lau, Hollis; Letarte, Simon; Mueller, Lukas N; Sharma, Vagisha; Vitek, Olga; Zhang, Ning; Aebersold, Ruedi; Watts, Julian D

    2008-01-01

    Background: Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, and information input/output, and they do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics. Results: We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, and statistical algorithms, originally developed for microarray data analyses, appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g., Google Web Toolkit, distributed processing) so that computationally intense data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling. 
Conclusion: The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze, and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open-source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field. PMID:19087345
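    Controlled false-discovery rates of the kind Corra applies to differential peptide features are commonly achieved with the Benjamini-Hochberg step-up procedure. The abstract does not state which FDR method Corra uses, so the sketch below is a generic illustration with invented p-values.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return indices of hypotheses rejected at FDR level alpha
    (Benjamini-Hochberg step-up procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value clears its step-up threshold
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])

# Invented p-values for eight hypothetical peptide features.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(pvals))  # → [0, 1]
```

    Unlike a per-test cutoff at alpha, this procedure bounds the expected fraction of false positives among all features it reports, which is the appropriate guarantee when thousands of peptide features are tested at once.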

  13. The Energy Industry Profile of ISO/DIS 19115-1: Facilitating Discovery and Evaluation of, and Access to Distributed Information Resources

    NASA Astrophysics Data System (ADS)

    Hills, S. J.; Richard, S. M.; Doniger, A.; Danko, D. M.; Derenthal, L.; Energistics Metadata Work Group

    2011-12-01

    A diverse group of organizations representative of the international community involved in disciplines relevant to the upstream petroleum industry (energy companies; suppliers and publishers of information to the energy industry; vendors of software applications used by the industry; and partner government and academic organizations) has engaged in the Energy Industry Metadata Standards Initiative. This Initiative envisions the use of standard metadata within the community to enable significant improvements in the efficiency with which users discover, evaluate, and access distributed information resources. The metadata standard needed to realize this vision is the initiative's primary deliverable. In addition to developing the metadata standard, the initiative is promoting its adoption to accelerate realization of the vision, and publishing metadata exemplars conformant with the standard. Implementation of the standard by community members, in the form of published metadata which document the information resources each organization manages, will allow use of tools requiring consistent metadata for efficient discovery and evaluation of, and access to, information resources. While metadata are expected to be widely accessible, access to associated information resources may be more constrained. The initiative is being conducted by Energistics' Metadata Work Group, in collaboration with the USGIN Project. Energistics is a global standards group in the oil and natural gas industry. The Work Group determined early in the initiative, based on input solicited from 40+ organizations and on an assessment of existing metadata standards, to develop the target metadata standard as a profile of a revised version of ISO 19115, formally the "Energy Industry Profile of ISO/DIS 19115-1 v1.0" (EIP). The Work Group is participating on the ISO/TC 211 project team responsible for the revision of ISO 19115, now ready for "Draft International Standard" (DIS) status. 
With ISO 19115 an established, capability-rich, open standard for geographic metadata, EIP v1 is expected to be widely acceptable within the community and readily sustainable over the long term. The EIP design, also per community requirements, will enable discovery and evaluation of, and access to, the types of information resources considered important to the community, including structured and unstructured digital resources, and physical assets such as hardcopy documents and material samples. This presentation will briefly review the development of this initiative as well as the current and planned Work Group activities. More time will be spent providing an overview of EIP v1, including the requirements it prescribes, design efforts made to enable automated metadata capture and processing, and the structure and content of its documentation, which was written to minimize ambiguity and facilitate implementation. The Work Group considers EIP v1 a solid initial design for interoperable metadata, and a first step toward the vision of the Initiative.

  14. Transforming fragments into candidates: small becomes big in medicinal chemistry.

    PubMed

    de Kloe, Gerdien E; Bailey, David; Leurs, Rob; de Esch, Iwan J P

    2009-07-01

    Fragment-based drug discovery (FBDD) represents a logical and efficient approach to lead discovery and optimisation. It can draw on structural, biophysical and biochemical data, incorporating a wide range of inputs, from precise mode-of-binding information on specific fragments to wider ranging pharmacophoric screening surveys using traditional HTS approaches. It is truly an enabling technology for the imaginative medicinal chemist. In this review, we analyse a representative set of 23 published FBDD studies that describe how low molecular weight fragments are being identified and efficiently transformed into higher molecular weight drug candidates. FBDD is now becoming warmly endorsed by industry as well as academia and the focus on small interacting molecules is making a big scientific impact.

  15. Breaking free from chemical spreadsheets.

    PubMed

    Segall, Matthew; Champness, Ed; Leeding, Chris; Chisholm, James; Hunt, Peter; Elliott, Alex; Garcia-Martinez, Hector; Foster, Nick; Dowling, Samuel

    2015-09-01

    Drug discovery scientists often consider compounds and data in terms of groups, such as chemical series, and relationships, representing similarity or structural transformations, to aid compound optimisation. This is often supported by chemoinformatics algorithms, for example clustering and matched molecular pair analysis. However, chemistry software packages commonly present these data as spreadsheets or form views that make it hard to find relevant patterns or compare related compounds conveniently. Here, we review common data visualisation and analysis methods used to extract information from chemistry data. We introduce a new framework that enables scientists to work flexibly with drug discovery data to reflect their thought processes and interact with the output of algorithms to identify key structure-activity relationships and guide further optimisation intuitively.

  16. Advancements in Aptamer Discovery Technologies.

    PubMed

    Gotrik, Michael R; Feagin, Trevor A; Csordas, Andrew T; Nakamoto, Margaret A; Soh, H Tom

    2016-09-20

    Affinity reagents that specifically bind to their target molecules are invaluable tools in nearly every field of modern biomedicine. Nucleic acid-based aptamers offer many advantages in this domain, because they are chemically synthesized, stable, and economical. Despite these compelling features, aptamers are currently not widely used in comparison to antibodies. This is primarily because conventional aptamer-discovery techniques such as SELEX are time-consuming and labor-intensive and often fail to produce aptamers with comparable binding performance to antibodies. This Account describes a body of work from our laboratory in developing advanced methods for consistently producing high-performance aptamers with higher efficiency, fewer resources, and, most importantly, a greater probability of success. We describe our efforts in systematically transforming each major step of the aptamer discovery process: selection, analysis, and characterization. To improve selection, we have developed microfluidic devices (M-SELEX) that enable discovery of high-affinity aptamers after a minimal number of selection rounds by precisely controlling the target concentration and washing stringency. In terms of improving aptamer pool analysis, our group was the first to use high-throughput sequencing (HTS) for the discovery of new aptamers. We showed that tracking the enrichment trajectory of individual aptamer sequences enables the identification of high-performing aptamers without requiring full convergence of the selected aptamer pool. HTS is now widely used for aptamer discovery, and open-source software has become available to facilitate analysis. To improve binding characterization, we used HTS data to design custom aptamer arrays to measure the affinity and specificity of up to ∼10⁴ DNA aptamers in parallel as a means to rapidly discover high-quality aptamers. 
Most recently, our efforts have culminated in the invention of the "particle display" (PD) screening system, which transforms solution-phase aptamers into "aptamer particles" that can be individually screened at high-throughput via fluorescence-activated cell sorting. Using PD, we have shown the feasibility of rapidly generating aptamers with exceptional affinities, even for proteins that have previously proven intractable to aptamer discovery. We are confident that these advanced aptamer-discovery methods will accelerate the discovery of aptamer reagents with excellent affinities and specificities, perhaps even exceeding those of the best monoclonal antibodies. Since aptamers are reproducible, renewable, stable, and can be distributed as sequence information, we anticipate that these affinity reagents will become even more valuable tools for both research and clinical applications.
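The enrichment-trajectory analysis described above can be illustrated with a toy sketch: given per-round read counts for each sequence, a rising pool frequency flags candidate binders before the pool converges. Counts are invented for illustration; real HTS pools contain millions of reads:

```python
from collections import Counter

# Toy read counts per selection round (sequence -> count).
rounds = [
    Counter({"ACGTAC": 50, "GGCTAA": 40, "TTAGCC": 10}),  # round 1
    Counter({"ACGTAC": 55, "GGCTAA": 30, "TTAGCC": 15}),  # round 2
    Counter({"ACGTAC": 40, "GGCTAA": 20, "TTAGCC": 40}),  # round 3
]

def enrichment_trajectory(seq, rounds):
    """Frequency of a sequence in each round's pool."""
    return [r[seq] / sum(r.values()) for r in rounds]

def fold_enrichment(seq, rounds):
    """Ratio of final to initial pool frequency."""
    traj = enrichment_trajectory(seq, rounds)
    return traj[-1] / traj[0]

# Rank sequences by fold enrichment between first and last round;
# a rising trajectory flags candidate binders even though the most
# abundant sequence (ACGTAC) is actually de-enriching.
candidates = sorted(rounds[0], key=lambda s: fold_enrichment(s, rounds),
                    reverse=True)
print(candidates[0])
```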

  17. Panacea, a semantic-enabled drug recommendations discovery framework.

    PubMed

    Doulaverakis, Charalampos; Nikolaidis, George; Kleontas, Athanasios; Kompatsiaris, Ioannis

    2014-03-06

    Personalized drug prescription can benefit from intelligent information management and sharing. International standard classifications and terminologies have been developed to provide unique and unambiguous information representation. Such standards can serve as the basis of automated decision support systems for drug-drug and drug-disease interaction discovery, and Semantic Web technologies have been proposed in earlier works to support such systems. The paper presents Panacea, a semantic framework capable of offering drug-drug and drug-disease interaction discovery. For enabling this kind of service, medical information and terminology had to be translated to ontological terms and appropriately coupled with medical knowledge of the field. International standard classifications and terminologies provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base which makes use of the aforementioned standards. Representation is based on a lightweight ontology. A layered reasoning approach is implemented: at the first layer, ontological inference is used to discover underlying knowledge, while at the second layer a two-step rule selection strategy is followed, resulting in a computationally efficient reasoning approach. Details of the system architecture are presented, along with an outline of the difficulties that had to be overcome. Panacea is evaluated both in terms of quality of recommendations against real clinical data and in terms of performance. The recommendation-quality evaluation gave useful insights regarding requirements for real-world deployment and revealed several parameters that affected the recommendation results. 
Performance-wise, Panacea is compared to previously published work by the authors, a drug recommendation service named GalenOWL; the paper presents the differences in modeling and approach between the two, while also pinpointing the advantages of Panacea. Overall, the paper presents a framework for an efficient drug recommendation service in which Semantic Web technologies are coupled with traditional business rule engines.
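A minimal sketch of the two-layer reasoning pattern the abstract describes, with invented drug classes and a single invented rule (not Panacea's actual ontology or rule base): layer 1 expands drug codes along a class hierarchy, and layer 2 fires only the rules whose classes appear in the expanded set.

```python
# Child class -> parent class (stand-in for ontological subsumption).
PARENTS = {
    "warfarin": "anticoagulant",
    "aspirin": "nsaid",
    "ibuprofen": "nsaid",
}

# (class_a, class_b, warning) -- a single illustrative interaction rule.
RULES = [
    ("anticoagulant", "nsaid", "increased bleeding risk"),
]

def closure(code):
    """Layer 1: the code plus all its ancestors (ontological inference)."""
    out = {code}
    while code in PARENTS:
        code = PARENTS[code]
        out.add(code)
    return out

def interactions(prescription):
    """Layer 2: select and fire only the rules relevant to this prescription."""
    classes = set().union(*(closure(d) for d in prescription))
    return [w for a, b, w in RULES if a in classes and b in classes]

print(interactions(["warfarin", "ibuprofen"]))
```

Restricting layer 2 to rules whose classes actually occur in the inferred set is what keeps the reasoning computationally cheap as the rule base grows.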

  18. The Frictionless Data Package: Data Containerization for Automated Scientific Workflows

    NASA Astrophysics Data System (ADS)

    Shepherd, A.; Fils, D.; Kinkade, D.; Saito, M. A.

    2017-12-01

    As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical questions facing data repositories is how data and supporting materials should be packaged for consumption. Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and documentation. In attempts to shorten the time to science and to access data resources across many disciplines, expectations for machines to mediate the process of discovery and access are challenging data repository infrastructure. The challenge is to find ways to deliver data and information that enable machines to make better decisions, by enabling them to understand the data and metadata of many data types. Additionally, once machines have recommended a data resource as relevant to an investigator's needs, the data resource should be easy to integrate into that investigator's toolkits for analysis and visualization. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) supports NSF-funded OCE and PLR investigators with their projects' data management needs. These needs involve a number of varying data types, some of which require multiple files with differing formats. Presently, BCO-DMO has described these data types and the important relationships between a type's data files through human-readable documentation on web pages. For machines directly accessing data files from BCO-DMO, this documentation could be overlooked, leading to misinterpretation of the data. Instead, BCO-DMO is exploring the idea of data containerization, or packaging data and related information for easier transport, interpretation, and use. 
In researching the landscape of data containerization, the Frictionless Data Package (http://frictionlessdata.io/) provides a number of valuable advantages over similar solutions. This presentation will focus on these advantages and on how the Frictionless Data Package addresses a number of real-world use cases for data discovery, access, analysis, and visualization.
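The core of a Frictionless Data Package is a datapackage.json descriptor that couples data files with the table schemas a machine needs to interpret them. A minimal descriptor for a hypothetical BCO-DMO-style dataset might look like the following (the name, path, and field names are illustrative, not an actual BCO-DMO product):

```python
import json

# Minimal datapackage.json descriptor in the Frictionless style: one
# document that tells a machine both where the files are and how to
# read them. All names, paths, and fields here are invented examples.
descriptor = {
    "name": "example-cruise-bottle-data",
    "resources": [
        {
            "name": "bottle-samples",
            "path": "data/bottle_samples.csv",
            "format": "csv",
            "schema": {
                "fields": [
                    {"name": "station", "type": "string"},
                    {"name": "depth_m", "type": "number"},
                    {"name": "nitrate_umol_kg", "type": "number"},
                ]
            },
        }
    ],
}

with open("datapackage.json", "w") as f:
    json.dump(descriptor, f, indent=2)
```

Because the schema travels with the data, a consuming tool can type-check and load the CSV without a human reading the documentation pages first.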

  19. A Community Roadmap for Discovery of Geosciences Data

    NASA Astrophysics Data System (ADS)

    Baru, C.

    2012-12-01

    This talk will summarize on-going discussions and deliberations related to data discovery undertaken as part of the EarthCube initiative and in the context of current trends and technologies in search and discovery of scientific data and information. The goal of the EarthCube initiative is to transform the conduct of research by supporting the development of community-guided cyberinfrastructure to integrate data and information for knowledge management across the Geosciences. The vision of EarthCube is to provide a coherent framework for finding and using information about the Earth system across the entire research enterprise that will allow for substantially improved collaboration between specialties using each other's data (e.g. subdomains of geo- and biological sciences). Indeed, data discovery is an essential prerequisite to any action that an EarthCube user would undertake. The community roadmap activity addresses challenges in data discovery, beginning with an assessment of the state-of-the-art, and then identifying issues, challenges, and risks in reaching the data discovery vision. Many of the lessons learned are general and applicable not only to the geosciences but also to a variety of other science communities. The roadmap considers data discovery issues in Geoscience that include but are not limited to metadata-based discovery and the use of semantic information and ontologies; content-based discovery and integration with data mining activities; integration with data access services; and policy and governance issues. Furthermore, many geoscience use cases require access to heterogeneous data from multiple disciplinary sources in order to analyze and make intelligent connections between data to advance research frontiers. Examples include assessing the rise of sea surface temperatures; modeling geodynamical earth systems from deep time to the present; and examining in detail the causes and consequences of global climate change. 
It has taken the past one to two decades for the community to arrive at a few commonly understood and commonly agreed upon standards for metadata and services. There have been significant advancements in the development of prototype systems in the area of metadata-based data discovery, including efforts such as OPeNDAP and THREDDS catalogs, the GEON Portal and Catalog Services (www.geongrid.org), OGC standards, and development of systems like OneGeology (onegeology.org), USGIN (usgin.org), the Earth System Grid, and EOSDIS. Such efforts have now set the stage for the development of next-generation, production-quality, advanced discovery services. The next challenge is in converting these into robust, sustained services for the community, developing capabilities such as content-based search and ontology-enabled search, and ensuring that the long tail of geoscience data is fully included in any future discovery services. As EarthCube attempts to pursue these challenges, the key question is whether we will be able to establish a cultural environment that can sustain, extend, and manage an infrastructure lasting 50 or 100 years.

  20. CICT Computing, Information, and Communications Technology Program

    NASA Technical Reports Server (NTRS)

    Laufenberg, Lawrence; Tu, Eugene (Technical Monitor)

    2002-01-01

    The CICT Program is part of the NASA Aerospace Technology Enterprise's fundamental technology thrust to develop tools, processes, and technologies that enable new aerospace system capabilities and missions. The CICT Program's four key objectives are: to provide seamless access to NASA resources, including ground-, air-, and space-based distributed information technology resources, so that NASA scientists and engineers can more easily control missions, make new scientific discoveries, and design next-generation space vehicles; to provide high-rate data delivery from these assets directly to users for missions; to develop goal-oriented, human-centered systems; and to research, develop, and evaluate revolutionary technology.

  1. Discovering Knowledge from AIS Database for Application in VTS

    NASA Astrophysics Data System (ADS)

    Tsou, Ming-Cheng

    The widespread use of the Automatic Identification System (AIS) has had a significant impact on maritime technology. AIS enables the Vessel Traffic Service (VTS) not only to offer commonly known functions such as identification, tracking and monitoring of vessels, but also to provide rich real-time information that is useful for marine traffic investigation, statistical analysis and theoretical research. However, due to the rapid accumulation of AIS observation data, the VTS platform is often unable to absorb and analyze it quickly and effectively. Traditional observation and analysis methods are becoming less suitable for the modern AIS generation of VTS. In view of this, we applied the same data mining technique used for business intelligence discovery (in Customer Relation Management (CRM) business marketing) to the analysis of AIS observation data. This recasts the marine traffic problem as a business-marketing problem and integrates technologies such as Geographic Information Systems (GIS), database management systems, data warehousing and data mining to facilitate the discovery of hidden and valuable information in a huge amount of observation data. Consequently, this provides marine traffic managers with a useful strategic planning resource.

  2. Healthcare applications of knowledge discovery in databases.

    PubMed

    DeGruy, K B

    2000-01-01

    Many healthcare leaders find themselves overwhelmed with data, but lack the information they need to make informed decisions. Knowledge discovery in databases (KDD) can help organizations turn their data into information. KDD is the process of finding complex patterns and relationships in data. The tools and techniques of KDD have achieved impressive results in other industries, and healthcare needs to take advantage of advances in this exciting field. Recent advances in the KDD field have brought it from the realm of research institutions and large corporations to many smaller companies. Software and hardware advances enable small organizations to tap the power of KDD using desktop PCs. KDD has been used extensively for fraud detection and focused marketing. There is a wealth of data available within the healthcare industry that would benefit from the application of KDD tools and techniques. Providers and payers have a vast quantity of data (such as charges and claims), but no effective way to analyze the data to accurately determine relationships and trends. Organizations that take advantage of KDD techniques will find that they offer valuable assistance in the quest to lower healthcare costs while improving healthcare quality.
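A minimal illustration of KDD-style pattern discovery on claims data: counting which procedure codes are billed together. The claims and codes are invented, and production KDD tools would use full association-rule mining at a much larger scale:

```python
from itertools import combinations
from collections import Counter

# Toy claims data: each claim is the set of procedure codes billed together.
claims = [
    {"office_visit", "blood_panel"},
    {"office_visit", "blood_panel", "xray"},
    {"office_visit", "blood_panel"},
    {"xray", "cast"},
]

# Count how often each pair of codes co-occurs across claims; frequent
# pairs are a simple example of the relationships a KDD pass can
# surface from charge/claim data without any prior hypothesis.
pair_counts = Counter()
for codes in claims:
    pair_counts.update(combinations(sorted(codes), 2))

top_pair, freq = pair_counts.most_common(1)[0]
print(top_pair, freq)
```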

  3. Open cyberGIS software for geospatial research and education in the big data era

    NASA Astrophysics Data System (ADS)

    Wang, Shaowen; Liu, Yan; Padmanabhan, Anand

    CyberGIS represents an interdisciplinary field combining advanced cyberinfrastructure, geographic information science and systems (GIS), spatial analysis and modeling, and a number of geospatial domains to improve research productivity and enable scientific breakthroughs. It has emerged as a new-generation GIS that enables unprecedented advances in data-driven knowledge discovery, visualization and visual analytics, and collaborative problem solving and decision-making. This paper describes three open software strategies (open access, open source, and open integration) to serve various research and education purposes of diverse geospatial communities. These strategies have been implemented in a leading-edge cyberGIS software environment through three corresponding software modalities, CyberGIS Gateway, Toolkit, and Middleware, and have achieved broad and significant impacts.

  4. Text mining resources for the life sciences.

    PubMed

    Przybyła, Piotr; Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable: those that have the crucial ability to share information, enabling smooth integration and reusability.

  5. Text mining resources for the life sciences

    PubMed Central

    Shardlow, Matthew; Aubin, Sophie; Bossy, Robert; Eckart de Castilho, Richard; Piperidis, Stelios; McNaught, John; Ananiadou, Sophia

    2016-01-01

    Text mining is a powerful technology for quickly distilling key information from vast quantities of biomedical literature. However, to harness this power the researcher must be well versed in the availability, suitability, adaptability, interoperability and comparative accuracy of current text mining resources. In this survey, we give an overview of the text mining resources that exist in the life sciences to help researchers, especially those employed in biocuration, to engage with text mining in their own work. We categorize the various resources under three sections: Content Discovery looks at where and how to find biomedical publications for text mining; Knowledge Encoding describes the formats used to represent the different levels of information associated with content that enable text mining, including those formats used to carry such information between processes; Tools and Services gives an overview of workflow management systems that can be used to rapidly configure and compare domain- and task-specific processes, via access to a wide range of pre-built tools. We also provide links to relevant repositories in each section to enable the reader to find resources relevant to their own area of interest. Throughout this work we give a special focus to resources that are interoperable—those that have the crucial ability to share information, enabling smooth integration and reusability. PMID:27888231

  6. Development and use of Ontologies Inside the Neuroscience Information Framework: A Practical Approach

    PubMed Central

    Imam, Fahim T.; Larson, Stephen D.; Bandrowski, Anita; Grethe, Jeffery S.; Gupta, Amarnath; Martone, Maryann E.

    2012-01-01

    An initiative of the NIH Blueprint for neuroscience research, the Neuroscience Information Framework (NIF) project advances neuroscience by enabling discovery and access to public research data and tools worldwide through an open source, semantically enhanced search portal. One of the critical components for the overall NIF system, the NIF Standardized Ontologies (NIFSTD), provides an extensive collection of standard neuroscience concepts along with their synonyms and relationships. The knowledge models defined in the NIFSTD ontologies enable an effective concept-based search over heterogeneous types of web-accessible information entities in NIF’s production system. NIFSTD covers major domains in neuroscience, including diseases, brain anatomy, cell types, sub-cellular anatomy, small molecules, techniques, and resource descriptors. Since the first production release in 2008, NIF has grown significantly in content and functionality, particularly with respect to the ontologies and ontology-based services that drive the NIF system. We present here the structure, design principles, community engagement, and current state of the NIFSTD ontologies. PMID:22737162

  7. Leveraging the Thousands of Known Planets to Inform TESS Follow-Up

    NASA Astrophysics Data System (ADS)

    Ballard, Sarah

    2017-10-01

    The Solar System furnishes our most familiar planetary architecture: many planets, orbiting nearly coplanar to one another. However, a typical system of planets in the Milky Way orbits a much smaller M dwarf star, and these stars furnish a different blueprint in key ways than the conditions that nourished evolution of life on Earth. With ensemble studies of hundreds-to-thousands of exoplanets, I will describe the emerging links between planet formation from disks, orbital dynamics of planets, and the content and observability of planetary atmospheres. These quantities can be tied to observables even in discovery light curves, to enable judicious selection of follow-up targets from the ground and from space. After TESS exoplanet discoveries start in earnest, the studies of individual planets with large, space-based platforms comprise the clear next step toward understanding the hospitability of the Milky Way to life. Our success hinges upon leveraging the many thousands of planet discoveries in hand to determine how to use these precious and limited resources.

  8. Enabling Genomic-Phenomic Association Discovery without Sacrificing Anonymity

    PubMed Central

    Heatherly, Raymond D.; Loukides, Grigorios; Denny, Joshua C.; Haines, Jonathan L.; Roden, Dan M.; Malin, Bradley A.

    2013-01-01

    Health information technologies facilitate the collection of massive quantities of patient-level data. A growing body of research demonstrates that such information can support novel, large-scale biomedical investigations at a fraction of the cost of traditional prospective studies. While healthcare organizations are being encouraged to share these data in a de-identified form, there is hesitation over concerns that it will allow corresponding patients to be re-identified. Currently proposed technologies to anonymize clinical data may make unrealistic assumptions with respect to the capabilities of a recipient to ascertain a patient's identity. We show that more pragmatic assumptions enable the design of anonymization algorithms that permit the dissemination of detailed clinical profiles with provable guarantees of protection. We demonstrate this strategy with a dataset of over one million medical records and show that 192 genotype-phenotype associations can be discovered with fidelity equivalent to non-anonymized clinical data. PMID:23405076
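The basic guarantee that generalization-based anonymizers aim to prove can be sketched as a k-anonymity check: every combination of quasi-identifiers must occur at least k times. The records below are invented, and the abstract's actual algorithms go well beyond this simple property:

```python
from collections import Counter

# Toy de-identified records: (birth decade, sex, generalized ZIP) are
# quasi-identifiers that could, in combination, re-identify a patient.
records = [
    ("1950-1959", "M", "372**"),
    ("1950-1959", "M", "372**"),
    ("1960-1969", "F", "372**"),
    ("1960-1969", "F", "372**"),
]

def is_k_anonymous(records, k):
    """True if every quasi-identifier combination occurs at least k
    times, so each record hides among at least k-1 others."""
    return min(Counter(records).values()) >= k

print(is_k_anonymous(records, 2))
```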

  9. A platform for discovery: The University of Pennsylvania Integrated Neurodegenerative Disease Biobank

    PubMed Central

    Toledo, Jon B.; Van Deerlin, Vivianna M.; Lee, Edward B.; Suh, EunRan; Baek, Young; Robinson, John L.; Xie, Sharon X.; McBride, Jennifer; Wood, Elisabeth M.; Schuck, Theresa; Irwin, David J.; Gross, Rachel G.; Hurtig, Howard; McCluskey, Leo; Elman, Lauren; Karlawish, Jason; Schellenberg, Gerard; Chen-Plotkin, Alice; Wolk, David; Grossman, Murray; Arnold, Steven E.; Shaw, Leslie M.; Lee, Virginia M.-Y.; Trojanowski, John Q.

    2014-01-01

    Neurodegenerative diseases (NDs) are defined by the accumulation of abnormal protein deposits in the central nervous system (CNS), and only neuropathological examination enables a definitive diagnosis. Brain banks and their associated scientific programs have shaped the current knowledge of NDs, identifying and characterizing the CNS deposits that define new diseases, formulating staging schemes, and establishing correlations between neuropathological changes and clinical features. However, brain banks have evolved to accommodate the banking of biofluids as well as DNA and RNA samples. Moreover, the value of biobanks is greatly enhanced if they link all the multidimensional clinical and laboratory information of each case, which is accomplished, optimally, using systematic and standardized operating procedures, and in the framework of multidisciplinary teams with the support of a flexible and user-friendly database system that facilitates the sharing of information among all the teams in the network. We describe a biobanking system that is a platform for discovery research at the Center for Neurodegenerative Disease Research at the University of Pennsylvania. PMID:23978324

  10. Composition and applications of focus libraries to phenotypic assays

    PubMed Central

    Wassermann, Anne Mai; Camargo, Luiz M.; Auld, Douglas S.

    2014-01-01

    The wealth of bioactivity information now available on low-molecular-weight compounds has enabled a paradigm shift in chemical biology and early phase drug discovery efforts. Traditionally, chemical libraries have been most commonly employed in screening approaches where a bioassay is used to characterize a chemical library in a random search for active samples. However, robust curating of bioassay data, establishment of ontologies enabling mining of large chemical biology datasets, and a wealth of public chemical biology information have made possible the establishment of highly annotated compound collections. Such annotated chemical libraries can now be used to build a pathway/target hypothesis and have led to a new view where chemical libraries are used to characterize a bioassay. In this article we discuss the types of compounds in these annotated libraries composed of tools, probes, and drugs. As well, we provide rationale and a few examples for how such libraries can enable phenotypic/forward chemical genomic approaches. As with any approach, there are several pitfalls that need to be considered, and we also outline some strategies to avoid these. PMID:25104937

  11. Featured Article: Genotation: Actionable knowledge for the scientific reader

    PubMed Central

    Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L

    2016-01-01

    We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers with a knowledge discovery challenge. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy shown by an F-score of 0.65. GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug–gene relationships, 5981 gene–disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader is able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge of human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement the content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. PMID:26900164

  12. Featured Article: Genotation: Actionable knowledge for the scientific reader.

    PubMed

    Nagahawatte, Panduka; Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L

    2016-06-01

    We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers with a knowledge discovery challenge. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy shown by an F-score of 0.65. GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug-gene relationships, 5981 gene-disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader is able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypotheses to further our knowledge of human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement the content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. 
© 2016 by the Society for Experimental Biology and Medicine.
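The gene-symbol identification step described above can be pictured with a minimal dictionary-lookup tagger. This is a hypothetical sketch, not the actual Genomine algorithm (whose matching rules are not given here); the symbol vocabulary is a toy stand-in for GenoDB's 59,905 symbols.

```python
# Toy sketch of gene-symbol tagging: find symbol-like tokens, then filter
# against a known vocabulary. Vocabulary and pattern are illustrative only.
import re

KNOWN_SYMBOLS = {"BRCA1", "TP53", "EGFR", "KRAS"}  # assumed toy vocabulary

def tag_gene_symbols(text):
    """Return known gene symbols appearing as whole uppercase tokens in text."""
    tokens = re.findall(r"\b[A-Z][A-Z0-9]{1,9}\b", text)
    return sorted({t for t in tokens if t in KNOWN_SYMBOLS})

found = tag_gene_symbols("Mutations in TP53 and KRAS were enriched; DNA was extracted.")
# 'DNA' matches the token pattern but is rejected by the vocabulary filter.
```

A real implementation would also need disambiguation, since short all-caps tokens collide with symbol-like patterns; here the vocabulary filter plays that role.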

  13. Accelerating target discovery using pre-competitive open science-patients need faster innovation more than anyone else.

    PubMed

    Low, Eric; Bountra, Chas; Lee, Wen Hwa

    2016-01-01

We are experiencing a new era enabled by unencumbered access to high-quality data through the emergence of open science initiatives in the historically challenging area of early-stage drug discovery. At the same time, many patient-centric organisations are taking matters into their own hands by participating in, enabling and funding research. Here we present the rationale behind the innovative partnership between the Structural Genomics Consortium (SGC), an open, pre-competitive pre-clinical research consortium, and the research-focused patient organisation Myeloma UK to create a new, comprehensive platform to accelerate the discovery and development of new treatments for multiple myeloma.

  14. GalenOWL: Ontology-based drug recommendations discovery

    PubMed Central

    2012-01-01

Background Identification of drug-drug and drug-disease interactions can be difficult to cope with, as the increasingly large number of available drugs, coupled with the ongoing research activities in the pharmaceutical domain, makes the task of discovering relevant information hard. Although international standards, such as the ICD-10 classification and the UNII registration, have been developed to enable efficient knowledge sharing, medical staff need to be constantly updated in order to effectively discover drug interactions before prescription. The use of Semantic Web technologies has been proposed in earlier works to tackle this problem. Results This work presents a semantic-enabled online service, named GalenOWL, capable of offering real-time drug-drug and drug-disease interaction discovery. To enable this kind of service, medical information and terminology had to be translated into ontological terms and appropriately coupled with medical knowledge of the field. International standards such as the aforementioned ICD-10 and UNII provide the backbone of the common representation of medical data, while the medical knowledge of drug interactions is represented by a rule base which makes use of the aforementioned standards. Details of the system architecture are presented, along with an outline of the difficulties that had to be overcome. A comparison of the developed ontology-based system with a similar system developed using a traditional business logic rule engine is performed, giving insights into the advantages and drawbacks of both implementations. Conclusions The use of Semantic Web technologies has been found to be a good match for developing drug recommendation systems. Ontologies can effectively encapsulate medical knowledge, and rule-based reasoning can capture and encode the drug interaction knowledge. PMID:23256945
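The idea of a rule base over ICD-10 and UNII codes can be pictured with a small stand-in in plain Python. The codes, rules, and function below are invented for illustration only; GalenOWL itself encodes this knowledge in an ontology and evaluates it with a semantic reasoner rather than hand-written loops.

```python
# Hypothetical rule base: pairs of identifiers that must not co-occur in a
# prescription context. Identifiers mimic UNII / ICD-10 codes but are invented.
DRUG_DRUG_RULES = {frozenset({"UNII:DRUG_A", "UNII:DRUG_B"})}
DRUG_DISEASE_RULES = {("UNII:DRUG_A", "ICD10:I50")}  # e.g. contraindicated in heart failure

def find_interactions(drugs, diseases):
    """Flag drug-drug and drug-disease pairs that violate a rule."""
    drugs = list(drugs)
    hits = []
    for i, a in enumerate(drugs):
        for b in drugs[i + 1:]:
            if frozenset({a, b}) in DRUG_DRUG_RULES:
                hits.append(("drug-drug", a, b))
    for d in drugs:
        for dz in diseases:
            if (d, dz) in DRUG_DISEASE_RULES:
                hits.append(("drug-disease", d, dz))
    return hits

alerts = find_interactions(["UNII:DRUG_A", "UNII:DRUG_B"], ["ICD10:I50"])
```

The ontology-based approach the paper compares against a business-logic rule engine differs mainly in where this knowledge lives: in shareable ontological terms rather than in application code like the above.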

  15. Discovering Drugs with DNA-Encoded Library Technology: From Concept to Clinic with an Inhibitor of Soluble Epoxide Hydrolase.

    PubMed

    Belyanskaya, Svetlana L; Ding, Yun; Callahan, James F; Lazaar, Aili L; Israel, David I

    2017-05-04

DNA-encoded chemical library technology was developed with the vision of its becoming a transformational platform for drug discovery. The hope was that a new paradigm for the discovery of low-molecular-weight drugs would be enabled by combining the vast molecular diversity achievable with combinatorial chemistry, the information-encoding attributes of DNA, the power of molecular biology, and a streamlined selection-based discovery process. Here, we describe the discovery and early clinical development of GSK2256294, an inhibitor of soluble epoxide hydrolase (sEH, EPHX2), by using encoded-library technology (ELT). GSK2256294 is an orally bioavailable, potent and selective inhibitor of sEH that has a long half-life and produced no serious adverse events in a first-time-in-human clinical study. To our knowledge, GSK2256294 is the first molecule discovered from this technology to enter human clinical testing and represents a realization of the vision that DNA-encoded chemical library technology can efficiently yield molecules with favorable properties that can be readily progressed into high-quality drugs. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Enlisting User Community Perspectives to Inform Development of a Semantic Web Application for Discovery of Cross-Institutional Research Information and Data

    NASA Astrophysics Data System (ADS)

    Johns, E. M.; Mayernik, M. S.; Boler, F. M.; Corson-Rikert, J.; Daniels, M. D.; Gross, M. B.; Khan, H.; Maull, K. E.; Rowan, L. R.; Stott, D.; Williams, S.; Krafft, D. B.

    2015-12-01

Researchers seek information and data through a variety of avenues: published literature, search engines, repositories, colleagues, etc. In order to build a web application that leverages linked open data to enable multiple paths for information discovery, the EarthCollab project has surveyed two geoscience user communities to consider how researchers find and share scholarly output. EarthCollab, a cross-institutional, EarthCube-funded project partnering UCAR, Cornell University, and UNAVCO, is employing the open-source semantic web software, VIVO, as the underlying technology to connect the people and resources of virtual research communities. This study will present an analysis of survey responses from members of the two case study communities: (1) the Bering Sea Project, an interdisciplinary field program whose data archive is hosted by NCAR's Earth Observing Laboratory (EOL), and (2) UNAVCO, a geodetic facility and consortium that supports diverse research projects informed by geodesy. The survey results illustrate the types of research products that respondents indicate should be discoverable within a digital platform and the current methods used to find publications, data, personnel, tools, and instrumentation. The responses showed that scientists rely heavily on general-purpose search engines, such as Google, to find information, but that data center websites and the published literature were also critical sources for finding collaborators, data, and research tools. The survey participants also identified additional features of interest for an information platform, such as search engine indexing, connection to institutional web pages, generation of bibliographies and CVs, and outward linking to social media. Through the survey, the user communities prioritized the types of information that are most important to display to describe their work within a research profile. The analysis of this survey will inform our further development of a platform that will facilitate different types of information discovery strategies and help researchers find and use the associated resources of a research project.

  17. Sharing Service Resource Information for Application Integration in a Virtual Enterprise - Modeling the Communication Protocol for Exchanging Service Resource Information

    NASA Astrophysics Data System (ADS)

    Yamada, Hiroshi; Kawaguchi, Akira

Grid computing and web service technologies enable us to use networked resources in a coordinated manner. An integrated service is made of individual services running on coordinated resources. In order to achieve such coordinated services autonomously, the initiator of a coordinated service needs to know detailed service resource information. This information ranges from static attributes, like the IP address of the application server, to highly dynamic ones, like the CPU load. The most famous wide-area name-based service discovery mechanism is DNS. Its hierarchical tree organization and caching methods take advantage of the static nature of the information managed. However, in order to integrate business applications in a virtual enterprise, we need a discovery mechanism to search for the optimal resources based on a given set of criteria (search keys). In this paper, we propose a communication protocol for exchanging service resource information among wide-area systems. We introduce the concept of the service domain, which consists of service providers managed under the same management policy. This concept of the service domain is similar to that of autonomous systems (ASs). In each service domain, the service resource information provider manages the service resource information of the service providers that exist in that domain, and exchanges this information with service resource information providers belonging to different service domains. We also verified the protocol's behavior and effectiveness using a simulation model developed for the proposed protocol.
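The criteria-based resource selection the authors motivate (static attributes plus dynamic ones such as CPU load) can be sketched as a simple filter-then-rank step. Field names and records below are hypothetical; the paper's actual protocol exchanges this information between service domains rather than querying one local list.

```python
# Toy registry of service resource records: static attributes (service name,
# domain, host) plus a dynamic attribute (cpu_load). Values are invented.
resources = [
    {"service": "invoicing", "domain": "domainA", "cpu_load": 0.7, "host": "10.0.0.1"},
    {"service": "invoicing", "domain": "domainB", "cpu_load": 0.2, "host": "10.0.1.5"},
    {"service": "ordering",  "domain": "domainA", "cpu_load": 0.1, "host": "10.0.0.2"},
]

def select_resource(resources, **criteria):
    """Filter by static search keys, then prefer the least-loaded provider."""
    matches = [r for r in resources
               if all(r.get(k) == v for k, v in criteria.items())]
    return min(matches, key=lambda r: r["cpu_load"]) if matches else None

best = select_resource(resources, service="invoicing")
# best is the lower-load invoicing provider, the one on host "10.0.1.5"
```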

  18. Information Mining Technologies to Enable Discovery of Actionable Intelligence to Facilitate Maritime Situational Awareness: I-MINE

    DTIC Science & Technology

    2013-01-01

website). Data mining tools are in-house code developed in Python, C++ and Java. • NGA: The National Geospatial-Intelligence Agency (NGA) performs data... such as PostgreSQL (with PostGIS), MySQL, Microsoft SQL Server, SQLite, etc., using the appropriate JDBC driver. The documentation and ease to learn are... written in Java that is able to perform various types of regressions, classifications, and other data mining tasks. There is also a commercial version

  19. Comparative mass spectrometry-based metabolomics strategies for the investigation of microbial secondary metabolites.

    PubMed

    Covington, Brett C; McLean, John A; Bachmann, Brian O

    2017-01-04

Covering: 2000 to 2016. The labor-intensive process of microbial natural product discovery is contingent upon identifying discrete secondary metabolites of interest within complex biological extracts, which contain inventories of all extractable small molecules produced by an organism or consortium. Historically, compound isolation prioritization has been driven by observed biological activity and/or relative metabolite abundance and followed by dereplication via accurate mass analysis. Decades of discovery using variants of these methods have generated the natural pharmacopeia but also contribute to recent high rediscovery rates. However, genomic sequencing reveals substantial untapped potential in previously mined organisms and can provide useful prescience of potentially new secondary metabolites that ultimately enables isolation. Recently, advances in comparative metabolomics analyses have been coupled to secondary metabolic predictions to accelerate bioactivity- and abundance-independent discovery workflows. In this review we discuss the various analytical and computational techniques that enable MS-based metabolomics applications in natural product discovery and discuss the future prospects for comparative metabolomics in natural product discovery.
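The dereplication-by-accurate-mass step mentioned above reduces, at its simplest, to matching an observed mass against a library of known metabolite masses within a ppm tolerance. A minimal sketch, with illustrative (not curated) library values:

```python
# Toy compound library keyed by monoisotopic mass. Masses are illustrative
# placeholders, not authoritative reference values.
KNOWN_MASSES = {
    "erythromycin": 733.4612,
    "vancomycin": 1447.4302,
}

def dereplicate(observed_mass, tolerance_ppm=5.0):
    """Return library compounds whose mass is within tolerance_ppm of observed."""
    hits = []
    for name, mass in KNOWN_MASSES.items():
        ppm_error = abs(observed_mass - mass) / mass * 1e6
        if ppm_error <= tolerance_ppm:
            hits.append(name)
    return hits
```

Real workflows additionally account for adducts and isotope patterns; this sketch shows only the core tolerance comparison.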

  20. A Cloud-enabled Service-oriented Spatial Web Portal for Facilitating Arctic Data Discovery, Integration, and Utilization

    NASA Astrophysics Data System (ADS)

    dias, S. B.; Yang, C.; Li, Z.; XIA, J.; Liu, K.; Gui, Z.; Li, W.

    2013-12-01

Global climate change has become one of the biggest concerns for humankind in the 21st century due to its broad impacts on society and ecosystems across the world. The Arctic has been observed to be one of the regions most vulnerable to climate change. In order to understand the impacts of climate change on the natural environment, ecosystems, biodiversity and more in the Arctic region, and thus to better support the planning and decision-making process, cross-disciplinary research is required to monitor and analyze changes in the Arctic, such as in water, sea level, and biodiversity. Conducting such research demands the efficient utilization of various geospatially referenced data, web services and information related to the Arctic region. In this paper, we propose a cloud-enabled and service-oriented Spatial Web Portal (SWP) to support the discovery, integration and utilization of Arctic-related geospatial resources, serving as a building block of polar cyberinfrastructure (CI). This SWP leverages the following techniques: 1) a hybrid search mechanism combining centralized local search, distributed catalogue search and specialized Internet search for effectively discovering Arctic data and web services from multiple sources; 2) a service-oriented, quality-enabled framework for seamless integration and utilization of various geospatial resources; and 3) a cloud-enabled parallel spatial index building approach to facilitate near-real-time resource indexing and searching. A proof-of-concept prototype is developed to demonstrate the feasibility of the proposed SWP, using an example of analyzing Arctic snow cover change over the past 50 years.

  1. search.bioPreprint: a discovery tool for cutting edge, preprint biomedical research articles

    PubMed Central

    Iwema, Carrie L.; LaDue, John; Zack, Angela; Chattopadhyay, Ansuman

    2016-01-01

The time it takes for a completed manuscript to be published traditionally can be extremely lengthy. Article publication delay, which occurs in part due to constraints associated with peer review, can prevent the timely dissemination of critical and actionable data associated with new information on rare diseases or developing health concerns such as Zika virus. Preprint servers are open access online repositories housing preprint research articles that enable authors (1) to make their research immediately and freely available and (2) to receive commentary and peer review prior to journal submission. There is a growing movement of preprint advocates aiming to change the current journal publication and peer review system, proposing that preprints catalyze biomedical discovery, support career advancement, and improve scientific communication. While the number of articles submitted to and hosted by preprint servers is gradually increasing, there has been no simple way to identify biomedical research published in a preprint format, as preprints are not typically indexed and are only discoverable by directly searching the specific preprint server websites. To address this issue, we created a search engine that quickly compiles preprints from disparate host repositories and provides a one-stop search solution. Additionally, we developed a web application that bolsters the discovery of preprints by enabling any word or phrase appearing on any website to be integrated with articles from preprint servers. This tool, search.bioPreprint, is publicly available at http://www.hsls.pitt.edu/resources/preprint. PMID:27508060

  2. New Solutions for Enabling Discovery of User-Centric Virtual Data Products in NASA's Common Metadata Repository

    NASA Astrophysics Data System (ADS)

    Pilone, D.; Gilman, J.; Baynes, K.; Shum, D.

    2015-12-01

This talk introduces a new NASA Earth Observing System Data and Information System (EOSDIS) capability to automatically generate and maintain derived, Virtual Product information allowing DAACs and Data Providers to create tailored and more discoverable variations of their products. After this talk the audience will be aware of the new EOSDIS Virtual Product capability, applications of it, and how to take advantage of it. Much of the data made available in the EOSDIS are organized for generation and archival rather than for discovery and use. The EOSDIS Common Metadata Repository (CMR) is launching a new capability providing automated generation and maintenance of user-oriented Virtual Product information. DAACs can easily surface variations on established data products tailored to specific use cases and users, leveraging DAAC-exposed services such as custom ordering or access services like OPeNDAP for on-demand product generation and distribution. Virtual Data Products enjoy support for spatial and temporal information, keyword discovery, association with imagery, and are fully discoverable by tools such as NASA Earthdata Search, Worldview, and Reverb. Virtual Product generation has applicability across many use cases: - Describing derived products such as Surface Kinetic Temperature information (AST_08) from source products (ASTER L1A) - Providing streamlined access to data products (e.g. AIRS) containing many (>800) data variables covering an enormous variety of physical measurements - Attaching additional EOSDIS offerings such as Visual Metadata, external services, and documentation metadata - Publishing alternate formats for a product (e.g. netCDF for HDF products) with the actual conversion happening on request - Publishing granules to be modified by on-the-fly services, like GES-DISC's Data Quality Screening Service - Publishing "bundled" products where granules from one product correspond to granules from one or more other related products

  3. A Web Accessible Framework for Discovery, Visualization and Dissemination of Polar Data

    NASA Astrophysics Data System (ADS)

    Kirsch, P. J.; Breen, P.; Barnes, T. D.

    2007-12-01

A web-accessible information framework, currently under development within the Physical Sciences Division of the British Antarctic Survey, is described. The datasets accessed are generally heterogeneous in nature, from fields including space physics, meteorology, atmospheric chemistry, ice physics, and oceanography. Many of these are returned in near real time over a 24/7 limited-bandwidth link from remote Antarctic stations and ships. The requirement is to provide various user groups, each with disparate interests and demands, a system incorporating a browsable and searchable catalogue, bespoke data summary visualization, metadata access facilities and download utilities. The system allows timely access to raw and processed datasets through an easily navigable discovery interface. Once discovered, a summary of the dataset can be visualized in a manner prescribed by the particular projects and user communities, or the dataset may be downloaded, subject to any accessibility restrictions that may exist. In addition, access to related ancillary information, including software, documentation, related URLs and information concerning non-electronic media (of particular relevance to some legacy datasets), is made directly available, having automatically been associated with a dataset during the discovery phase. Major components of the framework include the relational database containing the catalogue, the organizational structure of the systems holding the data (enabling automatic updates of the system catalogue and real-time access to data), the user interface design, and administrative and data management scripts allowing straightforward incorporation of utilities, datasets and system maintenance.

  4. Data Discovery of Big and Diverse Climate Change Datasets - Options, Practices and Challenges

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Boden, T.; McCord, R. A.; Frame, M. T.

    2013-12-01

Developing data search tools is a very common, but often confusing, task for most data-intensive scientific projects. These search interfaces need to be continually improved to handle the ever-increasing diversity and volume of data collections. Many aspects determine the type of search tool a project needs to provide to its user community. These include: the number of datasets, the amount and consistency of discovery metadata, ancillary information such as the availability of quality information and provenance, and the availability of similar datasets from other distributed sources. The Environmental Data Science and Systems (EDSS) group within the Environmental Science Division at the Oak Ridge National Laboratory has a long history of successfully managing diverse and big observational datasets for various scientific programs via data centers such as DOE's Atmospheric Radiation Measurement Program (ARM), DOE's Carbon Dioxide Information and Analysis Center (CDIAC), USGS's Core Science Analytics and Synthesis (CSAS) metadata Clearinghouse and NASA's Distributed Active Archive Center (ORNL DAAC). This talk will showcase some of the recent developments for improving data discovery within these centers. The DOE ARM program recently developed a data discovery tool which allows users to search and discover over 4000 observational datasets. These datasets are key to research efforts related to global climate change. The ARM discovery tool features many new functions such as filtered and faceted search logic, multi-pass data selection, filtering data based on data quality, graphical views of data quality and availability, direct access to data quality reports, and data plots. The ARM Archive also provides discovery metadata to other, broader metadata clearinghouses such as ESGF, IASOA, and GOS. In addition to the new interface, ARM is also currently working on providing DOI metadata records to publishers such as Thomson Reuters and Elsevier. The ARM program also provides a standards-based online metadata editor (OME) for PIs to submit their data to the ARM Data Archive. The USGS CSAS metadata Clearinghouse aggregates metadata records from several USGS projects and other partner organizations. The Clearinghouse allows users to search and discover over 100,000 biological and ecological datasets from a single web portal. The Clearinghouse has also enabled some new data discovery functions such as enhanced geospatial searches based on land and ocean classifications, metadata completeness rankings, data linkage via digital object identifiers (DOIs), and semantically enhanced keyword searches. The Clearinghouse is also currently working on a dashboard which allows data providers to view various statistics, such as the number of their records accessed via the Clearinghouse, the most popular keywords, a metadata quality report and a DOI creation service. The Clearinghouse also publishes metadata records to broader portals such as NSF DataONE and Data.gov. The author will also present how these capabilities are currently being reused by recent and upcoming data centers such as DOE's NGEE-Arctic project. References: [1] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L., Killeffer, T., Krassovski, M., ... & Frame, M. (2014, October). OME: Tool for generating and managing metadata to handle BigData. In BigData Conference (pp. 8-10).

  5. New classes of piezoelectrics, ferroelectrics, and antiferroelectrics by first-principles high-throughput materials design

    NASA Astrophysics Data System (ADS)

    Bennett, Joseph

    2013-03-01

    Functional materials, such as piezoelectrics, ferroelectrics, and antiferroelectrics, exhibit large changes with applied fields and stresses. This behavior enables their incorporation into a wide variety of devices in technological fields such as energy conversion/storage and information processing/storage. Discovery of functional materials with improved performance or even new types of responses is thus not only a scientific challenge, but can have major impacts on society. In this talk I will review our efforts to uncover new families of functional materials using a combined crystallographic database/high-throughput first-principles approach. I will describe our work on the design and discovery of thousands of new functional materials, specifically the LiAlSi family as piezoelectrics, the LiGaGe family as ferroelectrics, and the MgSrSi family as antiferroelectrics.

  6. Strategies for the follow-up of gravitational wave transients with the Cherenkov Telescope Array

    NASA Astrophysics Data System (ADS)

    Bartos, I.; Di Girolamo, T.; Gair, J. R.; Hendry, M.; Heng, I. S.; Humensky, T. B.; Márka, S.; Márka, Z.; Messenger, C.; Mukherjee, R.; Nieto, D.; O'Brien, P.; Santander, M.

    2018-06-01

    The observation of the electromagnetic counterpart of gravitational-wave (GW) transient GW170817 demonstrated the potential in extracting astrophysical information from multimessenger discoveries. The forthcoming deployment of the first telescopes of the Cherenkov Telescope Array (CTA) observatory will coincide with Advanced LIGO/Virgo's next observing run, O3, enabling the monitoring of gamma-ray emission at E > 20 GeV, and thus particle acceleration, from GW sources. CTA will not be greatly limited by the precision of GW localization as it will be capable of rapidly covering the GW error region with sufficient sensitivity. We examine the current status of GW searches and their follow-up effort, as well as the status of CTA, in order to identify some of the general strategies that will enhance CTA's contribution to multimessenger discoveries.

  7. Gifts from Exoplanetary Transits

    NASA Astrophysics Data System (ADS)

    Narita, Norio

    2009-08-01

    The discovery of transiting extrasolar planets has enabled us to do a number of interesting studies. Transit photometry reveals the radius and the orbital inclination of transiting planets, which allows us to learn the true mass and density of the respective planets by the combined information from radial velocity (RV) measurements. In addition, follow-up observations of transiting planets, looking at such things as secondary eclipses, transit timing variations, transmission spectroscopy, and the Rossiter-McLaughlin effect, provide us information about their dayside temperatures, unseen bodies in systems, planetary atmospheres, and the obliquity of planetary orbits. Such observational information, which will provide us a greater understanding of extrasolar planets, is available only for transiting planets. Here, I briefly summarize what we can learn from transiting planets and introduce previous studies.

  8. Enhancing Undergraduate Education with NASA Resources

    NASA Astrophysics Data System (ADS)

    Manning, James G.; Meinke, Bonnie; Schultz, Gregory; Smith, Denise Anne; Lawton, Brandon L.; Gurton, Suzanne; Astrophysics Community, NASA

    2015-08-01

The NASA Astrophysics Science Education and Public Outreach Forum (SEPOF) coordinates the work of NASA Science Mission Directorate (SMD) Astrophysics EPO projects and their teams to bring cutting-edge discoveries of NASA missions to the introductory astronomy college classroom. Uniquely poised to foster collaboration between scientists with content expertise and educators with pedagogical expertise, the Forum has coordinated the development of several resources that provide new opportunities for college and university instructors to bring the latest NASA discoveries in astrophysics into their classrooms. To address the needs of the higher education community, the Astrophysics Forum collaborated with the astrophysics E/PO community, researchers, and introductory astronomy instructors to place individual science discoveries and learning resources into context for higher education audiences. The resulting products include two “Resource Guides” on cosmology and exoplanets, each including a variety of accessible resources. The Astrophysics Forum also coordinates the development of the “Astro 101” slide set series. The sets are five- to seven-slide presentations on new discoveries from NASA astrophysics missions relevant to topics in introductory astronomy courses. These sets enable Astronomy 101 instructors to include new discoveries not yet in their textbooks in their courses, and may be found at: https://www.astrosociety.org/education/resources-for-the-higher-education-audience/. The Astrophysics Forum also coordinated the development of 12 monthly “Universe Discovery Guides,” each featuring a theme and a representative object well placed for viewing, with an accompanying interpretive story, strategies for conveying the topics, and supporting NASA-approved education activities and background information from a spectrum of NASA missions and programs. These resources are adaptable for use by instructors and may be found at: http://nightsky.jpl.nasa.gov/news-display.cfm?News_ID=611. These resources help enhance the Science, Technology, Engineering, and Mathematics (STEM) experiences of undergraduates, and will be described with access information provided.

  9. KNODWAT: A scientific framework application for testing knowledge discovery methods for the biomedical domain

    PubMed Central

    2013-01-01

Background Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that, for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. Results A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed using Java on the Spring Framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, as well as jQuery for interactive user interface operations. Conclusions The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework. PMID:23763826

  10. KNODWAT: a scientific framework application for testing knowledge discovery methods for the biomedical domain.

    PubMed

    Holzinger, Andreas; Zupan, Mario

    2013-06-13

Professionals in the biomedical domain are confronted with an increasing mass of data. Developing methods to assist professional end users in the field of Knowledge Discovery to identify, extract, visualize and understand useful information from these huge amounts of data is a major challenge. However, there are so many diverse methods and methodologies available that, for biomedical researchers who are inexperienced in the use of even relatively popular knowledge discovery methods, it can be very difficult to select the most appropriate method for their particular research problem. A web application, called KNODWAT (KNOwledge Discovery With Advanced Techniques), has been developed using Java on the Spring Framework 3.1 and following a user-centered approach. The software runs on Java 1.6 and above and requires a web server such as Apache Tomcat and a database server such as MySQL Server. For frontend functionality and styling, Twitter Bootstrap was used, as well as jQuery for interactive user interface operations. The framework presented is user-centric, highly extensible and flexible. Since it enables methods to be tested on existing data to assess their suitability and performance, it is especially suitable for inexperienced biomedical researchers new to the field of knowledge discovery and data mining. For testing purposes, two algorithms, CART and C4.5, were implemented using the WEKA data mining framework.
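Of the two test algorithms mentioned, CART builds decision trees by greedily choosing the split that minimizes Gini impurity. KNODWAT itself uses the WEKA framework in Java; purely as a language-neutral illustration of the split criterion, here is a Python sketch of a single split search on a toy one-feature dataset:

```python
# Gini impurity of a label multiset, and a brute-force search for the
# threshold on one numeric feature that minimizes the weighted impurity.
def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels)) if n else 0.0

def best_split(xs, ys):
    """Return (threshold, weighted Gini) for the best split x <= threshold."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

threshold, impurity = best_split([1, 2, 8, 9], ["a", "a", "b", "b"])
# threshold 2 separates the two classes perfectly (weighted Gini 0.0)
```

A full CART implementation applies this search recursively over all features and then prunes the resulting tree; C4.5 differs mainly in using information gain ratio instead of Gini.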

  11. Representation of Serendipitous Scientific Data

    NASA Technical Reports Server (NTRS)

    James, Mark

    2006-01-01

A computer program defines and implements an innovative kind of data structure that can be used for representing information derived from serendipitous discoveries made via the collection of scientific data on long exploratory spacecraft missions. Data structures capable of collecting any kind of data can easily be implemented in advance, but the task of designing a fixed and efficient data structure suitable for processing raw data into useful information while taking advantage of serendipitous scientific discovery is becoming increasingly difficult as missions go deeper into space. The present software eases the task by enabling the definition of arbitrarily complex data structures that can adapt at run time as raw data are transformed into other types of information. This software runs on a variety of computers and can be distributed in either source code or binary code form. It must be run in conjunction with any one of a number of Lisp compilers that are available commercially or as shareware. It has no specific memory requirements and depends upon the other software with which it is used. The program is implemented as a library that is called by, and becomes folded into, the other software with which it is used.

  12. Three-Year College Discovery Master Plan, Bronx Community College, 1998-2001, Parts I-III.

    ERIC Educational Resources Information Center

    Smith, Shirley; Santa Rita, Emilio

    Bronx Community College created a three-year College Discovery (CD) master plan for 1998-2001 to help restructure its counseling programs and support services and enable CD students to acquire an associate's degree level of education. The first area of restructuring is in the role of the director of College Discovery and Counseling. General…

  13. 37 CFR 1.71 - Detailed description and specification of the invention.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... enable any person skilled in the art or science to which the invention or discovery appertains, or with... specification must include a written description of the invention or discovery and of the manner and process of...

  14. 37 CFR 1.71 - Detailed description and specification of the invention.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... enable any person skilled in the art or science to which the invention or discovery appertains, or with... specification must include a written description of the invention or discovery and of the manner and process of...

  15. Concept of operations for knowledge discovery from Big Data across enterprise data warehouses

    NASA Astrophysics Data System (ADS)

    Sukumar, Sreenivas R.; Olama, Mohammed M.; McNair, Allen W.; Nutaro, James J.

    2013-05-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to the physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts face several challenges in the knowledge extraction process. To address these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage in making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.

  16. A Virtual Bioinformatics Knowledge Environment for Early Cancer Detection

    NASA Technical Reports Server (NTRS)

    Crichton, Daniel; Srivastava, Sudhir; Johnsey, Donald

    2003-01-01

    Discovery of disease biomarkers for cancer is a leading focus of early detection. The National Cancer Institute created a network of collaborating institutions focused on the discovery and validation of cancer biomarkers, called the Early Detection Research Network (EDRN). Informatics plays a key role in enabling a virtual knowledge environment that provides scientists real-time access to distributed data sets located at research institutions across the nation. The distributed and heterogeneous nature of the collaboration makes data sharing across institutions very difficult. EDRN has developed a comprehensive informatics effort focused on building a national infrastructure that enables seamless access, sharing and discovery of science data resources across all EDRN sites. This paper will discuss the EDRN knowledge system architecture, its objectives and its accomplishments.

  17. Organs-on-chips at the frontiers of drug discovery

    PubMed Central

    Esch, Eric W.; Bahinski, Anthony; Huh, Dongeun

    2016-01-01

    Improving the effectiveness of preclinical predictions of human drug responses is critical to reducing costly failures in clinical trials. Recent advances in cell biology, microfabrication and microfluidics have enabled the development of microengineered models of the functional units of human organs — known as organs-on-chips — that could provide the basis for preclinical assays with greater predictive power. Here, we examine the new opportunities for the application of organ-on-chip technologies in a range of areas in preclinical drug discovery, such as target identification and validation, target-based screening, and phenotypic screening. We also discuss emerging drug discovery opportunities enabled by organs-on-chips, as well as important challenges in realizing the full potential of this technology. PMID:25792263

  18. Microscale High-Throughput Experimentation as an Enabling Technology in Drug Discovery: Application in the Discovery of (Piperidinyl)pyridinyl-1H-benzimidazole Diacylglycerol Acyltransferase 1 Inhibitors.

    PubMed

    Cernak, Tim; Gesmundo, Nathan J; Dykstra, Kevin; Yu, Yang; Wu, Zhicai; Shi, Zhi-Cai; Vachal, Petr; Sperbeck, Donald; He, Shuwen; Murphy, Beth Ann; Sonatore, Lisa; Williams, Steven; Madeira, Maria; Verras, Andreas; Reiter, Maud; Lee, Claire Heechoon; Cuff, James; Sherer, Edward C; Kuethe, Jeffrey; Goble, Stephen; Perrotto, Nicholas; Pinto, Shirly; Shen, Dong-Ming; Nargund, Ravi; Balkovec, James; DeVita, Robert J; Dreher, Spencer D

    2017-05-11

    Miniaturization and parallel processing play an important role in the evolution of many technologies. We demonstrate the application of miniaturized high-throughput experimentation methods to resolve synthetic chemistry challenges on the frontlines of a lead optimization effort to develop diacylglycerol acyltransferase (DGAT1) inhibitors. Reactions were performed on ∼1 mg scale using glass microvials, providing a miniaturized high-throughput experimentation capability that was used to study a challenging SNAr reaction. The availability of robust synthetic chemistry conditions discovered in these miniaturized investigations enabled the development of structure-activity relationships that ultimately led to the discovery of soluble, selective, and potent inhibitors of DGAT1.

  19. An interactive web application for the dissemination of human systems immunology data.

    PubMed

    Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien

    2015-06-19

    Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies that facilitate investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State-of-the-art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. The resource is publicly available online at the Gene Expression Browser landing page ( https://gxb.benaroyaresearch.org/dm3/landing.gsp ), and the source code is openly available in the Gene Expression Browser repository ( https://github.com/BenaroyaResearch/gxbrowser ). We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.

  20. Introduction to biological complexity as a missing link in drug discovery.

    PubMed

    Gintant, Gary A; George, Christopher H

    2018-06-06

    Despite a burgeoning knowledge of the intricacies and mechanisms responsible for human disease, technological advances in medicinal chemistry, and more efficient assays used for drug screening, it remains difficult to discover novel and effective pharmacologic therapies. Areas covered: By reference to the primary literature and concepts emerging from academic and industrial drug screening landscapes, the authors propose that this disconnect arises from the inability to scale and integrate responses from simpler model systems to outcomes from more complex and human-based biological systems. Expert opinion: Further collaborative efforts combining target-based and phenotypic-based screening along with systems-based pharmacology and informatics will be necessary to harness the technological breakthroughs of today to derive the novel drug candidates of tomorrow. New questions must be asked of enabling technologies, while recognizing their inherent limitations, in a way that moves drug development forward. Attempts to integrate mechanistic and observational information acquired across multiple scales frequently expose the gap between our knowledge and our understanding as the level of complexity increases. We hope that the thoughts and actionable items highlighted will help to inform the directed evolution of the drug discovery process.

  1. A new Lower Pleistocene archeological site in Europe (Vallparadís, Barcelona, Spain)

    PubMed Central

    Martínez, Kenneth; Garcia, Joan; Carbonell, Eudald; Agustí, Jordi; Bahain, Jean-Jaques; Blain, Hugues-Alexandre; Burjachs, Francesc; Cáceres, Isabel; Duval, Mathieu; Falguères, Christophe; Gómez, Manuel; Huguet, Rosa

    2010-01-01

    Here we report the discovery of a new late Lower Pleistocene site named Vallparadís (Barcelona, Spain) that produced a rich archeological and paleontological sequence dated from the upper boundary of the Jaramillo subchron to the early Middle Pleistocene. This deposit contained a main archeological layer with numerous artifacts and a rich macromammalian assemblage, some of which bore cut marks, that could indicate that hominins had access to carcasses. Paleomagnetic analysis, electron spin resonance-uranium series (ESR-US), and the biostratigraphic chronological position of the macro- and micromammal and lithic assemblages of this layer reinforce the proposal that hominins inhabited Europe during the Lower Pleistocene. The archeological sequence provides key information on the successful adaptation of European hominins that preceded the well-known fossil population from Atapuerca and succeeded the finds from Orce basin. Hence, this discovery enables us to close a major chronological gap in the early prehistory of Iberia. According to the information in this paper and the available data from these other sites, we propose that Mediterranean Western Europe was repeatedly and perhaps continuously occupied during the late Matuyama chron. PMID:20231433

  2. X-ray free electron laser: opportunities for drug discovery.

    PubMed

    Cheng, Robert K Y; Abela, Rafael; Hennig, Michael

    2017-11-08

    Past decades have shown the impact of structural information derived from complexes of drug candidates with their protein targets in facilitating the discovery of safe and effective medicines. Despite recent developments in single-particle cryo-electron microscopy, X-ray crystallography has been the main method used to derive structural information. The unique properties of the X-ray free electron laser (XFEL), with unmatched peak brilliance and beam focus, allow X-ray diffraction data recording and successful structure determination from smaller and more weakly diffracting crystals, shortening timelines in crystal optimization. To further capitalize on the XFEL advantage, innovations in crystal sample delivery for the X-ray experiment, data collection and processing methods are required. This development was a key contributor to serial crystallography, allowing structure determination at room temperature and yielding physiologically more relevant structures. Adding the time resolution provided by the femtosecond X-ray pulse will enable monitoring and capturing of dynamic processes of ligand binding and associated conformational changes, with great impact on the design of candidate drug compounds. © 2017 The Author(s). Published by Portland Press Limited on behalf of the Biochemical Society.

  3. discovery toolset for Emulytics v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fritz, David; Crussell, Jonathan

    The discovery toolset for Emulytics enables the construction of high-fidelity emulation models of systems. The toolset consists of a set of tools and techniques to automatically go from network discovery of operational systems to emulating those complex systems. Our toolset combines data from host discovery and network mapping tools into an intermediate representation that can then be further refined. Once the intermediate representation reaches the desired state, our toolset supports emitting the Emulytics models with varying levels of specificity based on experiment needs.

  4. Hooks and Shifts: A Dialectical Study of Mediated Discovery

    ERIC Educational Resources Information Center

    Abrahamson, Dor; Trninic, Dragan; Gutierrez, Jose F.; Huth, Jacob; Lee, Rosa G.

    2011-01-01

    Radical constructivists advocate discovery-based pedagogical regimes that enable students to incrementally and continuously adapt their cognitive structures to the instrumented cultural environment. Some sociocultural theorists, however, maintain that learning implies discontinuity in conceptual development, because novices must appropriate expert…

  5. Cancer and genetics: what we need to know now.

    PubMed

    Ruccione, K

    1999-07-01

    Profound changes brought about by discoveries in molecular biology may enable us in the future to treat cancer without causing late effects or to prevent cancer altogether. Even before that happens, the age of molecular medicine has arrived. Molecular biology is the study of biological processes at the level of the molecule. A major aspect of molecular biology is molecular genetics--the science that deals with DNA and RNA. Most of the progress in molecular biology has been made in the second half of the 20th century. Each discovery or technological innovation has built on previous discoveries and paved the way for the next, culminating in the current effort to map, sequence, and understand the functions of the entire human genome. In the past 20 years, many pieces of the cancer puzzle have been found, showing us how the normal cellular control mechanisms go awry to cause cancer and setting the stage for genetic testing and disease treatment. These new discoveries bring both promise and peril. To provide comprehensive care for survivors of childhood cancer and care in other settings as well, health care providers must now be familiar with the concepts and language of molecular biology, understand its applications to cancer care, and be fully informed about its implications for clinical practice, research, and education.

  6. Combining NMR and X-ray crystallography in fragment-based drug discovery: discovery of highly potent and selective BACE-1 inhibitors.

    PubMed

    Wyss, Daniel F; Wang, Yu-Sen; Eaton, Hugh L; Strickland, Corey; Voigt, Johannes H; Zhu, Zhaoning; Stamford, Andrew W

    2012-01-01

    Fragment-based drug discovery (FBDD) has become increasingly popular over the last decade. We review here how we have used highly structure-driven fragment-based approaches to complement more traditional lead discovery to tackle high priority targets and those struggling for leads. Combining biomolecular nuclear magnetic resonance (NMR), X-ray crystallography, and molecular modeling with structure-assisted chemistry and innovative biology as an integrated approach for FBDD can solve very difficult problems, as illustrated in this chapter. Here, a successful FBDD campaign is described that has allowed the development of a clinical candidate for BACE-1, a challenging CNS drug target. Crucial to this achievement were the initial identification of a ligand-efficient isothiourea fragment through target-based NMR screening and the determination of its X-ray crystal structure in complex with BACE-1, which revealed an extensive H-bond network with the two active site aspartate residues. This detailed 3D structural information then enabled the design and validation of novel, chemically stable and accessible heterocyclic acylguanidines as aspartic acid protease inhibitor cores. Structure-assisted fragment hit-to-lead optimization yielded iminoheterocyclic BACE-1 inhibitors that possess desirable molecular properties as potential therapeutic agents to test the amyloid hypothesis of Alzheimer's disease in a clinical setting.

  7. Benefits and challenges of a QSP approach through case study: Evaluation of a hypothetical GLP-1/GIP dual agonist therapy.

    PubMed

    Rieger, Theodore R; Musante, Cynthia J

    2016-10-30

    Quantitative Systems Pharmacology (QSP) is an emerging science with increasing application to pharmaceutical research and development paradigms. Through case study we provide an overview of the benefits and challenges of applying QSP approaches to inform program decisions in the early stages of drug discovery and development. Specifically, we describe the use of a type 2 diabetes systems model to inform a No-Go decision prior to lead development for a potential GLP-1/GIP dual agonist program, enabling prioritization of exploratory programs with higher probability of clinical success. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.

  8. From IHE Audit Trails to XES Event Logs Facilitating Process Mining.

    PubMed

    Paster, Ferdinand; Helm, Emmanuel

    2015-01-01

    Recently, business intelligence approaches such as process mining have been applied to the healthcare domain. The goal of process mining is to gain process knowledge, verify compliance and identify room for improvement by investigating recorded event data. Previous approaches focused on process discovery using event data from various specific systems. IHE, as a globally recognized basis for healthcare information systems, defines in its ATNA profile how real-world events must be recorded in centralized event logs. The following approach presents how audit trails collected by means of ATNA can be transformed to enable process mining. Using the standardized audit trails provides the ability to apply these methods to all IHE-based information systems.
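
    The paper's actual transformation rules are not reproduced in the abstract; the sketch below only illustrates the general idea of turning simplified audit records into a bare-bones XES document. The field names and the grouping of events into per-patient traces are illustrative assumptions, not the authors' mapping:

```python
import xml.etree.ElementTree as ET

def audit_to_xes(audit_events):
    """Convert a list of simplified audit dicts (patient_id, action,
    timestamp) into a minimal XES document, grouping the events into
    one trace per patient identifier."""
    log = ET.Element("log", {"xes.version": "1.0"})
    traces = {}
    for ev in audit_events:
        pid = ev["patient_id"]
        if pid not in traces:
            # Start a new trace named after the case (patient) identifier.
            trace = ET.SubElement(log, "trace")
            ET.SubElement(trace, "string",
                          {"key": "concept:name", "value": pid})
            traces[pid] = trace
        event = ET.SubElement(traces[pid], "event")
        ET.SubElement(event, "string",
                      {"key": "concept:name", "value": ev["action"]})
        ET.SubElement(event, "date",
                      {"key": "time:timestamp", "value": ev["timestamp"]})
    return ET.tostring(log, encoding="unicode")
```

    A real implementation would additionally declare the XES extensions and map the full ATNA audit message schema, but the trace/event shape above is the form process mining tools expect.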

  9. NASA EOSDIS: Enabling Science by Improving User Knowledge

    NASA Technical Reports Server (NTRS)

    Lindsay, Francis; Brennan, Jennifer; Blumenfeld, Joshua

    2016-01-01

    Lessons learned and the impact of applying these newer methods are explained, with several examples from our current efforts: interactive online webinars focusing on data discovery and access (including tool usage), informal and informative data chats with data experts across our EOSDIS community, data user profile interviews with scientists actively using EOSDIS data in their research, and improved conference and meeting interactions via EOSDIS data used interactively during hyper-wall talks and the Worldview application. This suite of internet-based, interactive capabilities and technologies has allowed our project to expand our user community by making the data and applications from numerous Earth science missions more engaging, approachable and meaningful.

  10. Distributed data discovery, access and visualization services to Improve Data Interoperability across different data holdings

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Krassovski, M.; Devarakonda, R.; Santhana Vannan, S.

    2012-12-01

    The current climate debate is highlighting the importance of free, open, and authoritative sources of high-quality climate data that are available for peer review and for collaborative purposes. It is increasingly important to allow organizations around the world to share climate data openly and to enable them to perform dynamic processing of climate data. This advanced access can be enabled via Web-based services, using common "community agreed" standards, without requiring changes to the internal structure used to describe the data. The modern scientific community has become diverse and increasingly complex in nature. To meet the demands of such a diverse user community, the modern data supplier has to provide data and related information through searchable, data- and process-oriented tools. This can be accomplished by setting up an on-line, Web-based system with a relational database as a back end. The following common features of web data access/search systems will be outlined in the proposed presentation:
    - Flexible data discovery
    - Data in commonly used formats (e.g., CSV, NetCDF)
    - Metadata prepared in standard formats (FGDC, ISO 19115, EML, DIF, etc.)
    - Data subsetting capabilities, down to individual data elements
    - Standards-based data access protocols and mechanisms (SOAP, REST, OPeNDAP, OGC, etc.)
    - Integration of services across different data systems (discovery to access, visualization and subsetting)
    The presentation will also include specific examples of integration among data systems developed by Oak Ridge National Laboratory's Climate Change Science Institute, and their ability to communicate with each other to enable better data interoperability and integration. References: [1] Devarakonda, Ranjeet, and Harold Shanafield. "Drupal: Collaborative framework for science research." Collaboration Technologies and Systems (CTS), 2011 International Conference on. IEEE, 2011. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L. A., Killeffer, T. S., Boden, T. A., ... & Lazer, K. (2014). The new online metadata editor for generating structured metadata. Oak Ridge National Laboratory (ORNL).

  11. Facilitating Stewardship of scientific data through standards based workflows

    NASA Astrophysics Data System (ADS)

    Bastrakova, I.; Kemp, C.; Potter, A. K.

    2013-12-01

    There are three main suites of standards that can be used to define the fundamental scientific methodology behind data, methods and results: firstly, metadata standards that enable discovery of the data (ISO 19115); secondly, the Sensor Web Enablement (SWE) suite, which includes the O&M and SensorML standards; and thirdly, ontologies that provide vocabularies to define scientific concepts and the relationships between them. All three types of standards have to be utilised by the practicing scientist so that those who ultimately steward the data can preserve, curate, reuse and repurpose it. Additional benefits of this approach include transparency of scientific processes from data acquisition to the creation of scientific concepts and models, and provision of context to inform data use. Collecting and recording metadata is the first step in the scientific data flow. The primary role of metadata is to provide details of geographic extent, availability and a high-level description of data suitable for initial discovery through common search engines. The SWE suite provides standardised patterns to describe the observations and measurements taken for these data, capture detailed information about observational or analytical methods and the instruments used, and define quality determinations. This information standardises browsing capability over discrete data types. The standardised patterns of the SWE standards simplify aggregation of observation and measurement data, enabling scientists to move from disaggregated data to scientific concepts. The first two steps provide a necessary basis for reasoning about concepts of 'pure' science, building relationships between concepts of different domains (linked data), and identifying domain classifications and vocabularies.
    Geoscience Australia is re-examining its marine data flows, including metadata requirements and business processes, to achieve a clearer link between scientific data acquisition and analysis requirements and effective, interoperable data management and delivery. This includes participating in national and international dialogue on the development of standards, embedding data management activities in business processes, and developing scientific staff as effective data stewards. A similar approach is applied to geophysical data. By ensuring that the geophysical datasets at GA strictly follow metadata and industry standards, we are able to implement a provenance-based workflow in which the data is easily discoverable, geophysical processing can be applied to it, and results can be stored. The provenance-based workflow enables metadata records for the results to be produced automatically from the input dataset metadata.
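
    As a toy illustration of the first step (discovery-level metadata), the sketch below emits a minimal record containing the elements the abstract mentions: title, description and geographic extent. It is deliberately not a conformant ISO 19115/19139 document, which would use the gmd/gco XML namespaces and a far richer structure:

```python
import xml.etree.ElementTree as ET

def discovery_record(title, abstract, west, east, south, north):
    """Build a minimal discovery-level metadata record as an XML string.

    Illustrative only: element names here are simplified stand-ins for
    the real ISO 19115 structure."""
    md = ET.Element("metadata")
    ET.SubElement(md, "title").text = title
    ET.SubElement(md, "abstract").text = abstract
    bbox = ET.SubElement(md, "boundingBox")
    for name, value in [("west", west), ("east", east),
                        ("south", south), ("north", north)]:
        ET.SubElement(bbox, name).text = str(value)
    return ET.tostring(md, encoding="unicode")
```

    Even this stripped-down record supports the discovery use case described above: a harvester can index the title, abstract and bounding box, which is exactly the information common search engines need.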

  12. American Military Barrier War Paint, Camp Buehring, Kuwait: A Discovery of Troop Identity, Values, and Warfighting Attributes as They Deployed into Combat for Operation Iraqi Freedom

    DTIC Science & Technology

    2014-05-21

    Hutsell, Loren B., Chaplain Major, USA

    …source of information, humor, and motivation to see the project through. Special recognition is given to my lovely wife, Heather, for her daily support…spouse is a cornerstone for their military member’s service, my wife is the force that enabled me to complete this project. Together we share in its

  13. GeneLab: NASA's Open Access, Collaborative Platform for Systems Biology and Space Medicine

    NASA Technical Reports Server (NTRS)

    Berrios, Daniel C.; Thompson, Terri G.; Fogle, Homer W.; Rask, Jon C.; Coughlan, Joseph C.

    2015-01-01

    NASA is investing in GeneLab (http://genelab.nasa.gov), a multi-year effort to maximize utilization of the limited resources available to conduct biological and medical research in space, principally aboard the International Space Station (ISS). High-throughput genomic, transcriptomic, proteomic or other omics analyses from experiments conducted on the ISS will be stored in the GeneLab Data Systems (GLDS), an open-science information system that will also include a biocomputation platform with collaborative science capabilities, to enable the discovery and validation of molecular networks.

  14. Discovering Mendeleev's Model.

    ERIC Educational Resources Information Center

    Sterling, Donna

    1996-01-01

    Presents an activity that introduces the historical developments in science that led to the discovery of the periodic table and lets students experience scientific discovery firsthand. Enables students to learn about patterns among the elements and experience how scientists analyze data to discover patterns and build models. (JRH)

  15. Big Data Transforms Discovery-Utilization Therapeutics Continuum

    PubMed Central

    Waldman, SA; Terzic, A

    2015-01-01

    Enabling omic technologies adopt a holistic view to produce unprecedented insights into the molecular underpinnings of health and disease, in part, by generating massive high-dimensional biological data. Leveraging these systems-level insights as an engine driving the healthcare evolution is maximized through integration with medical, demographic, and environmental datasets from individuals to populations. Big data analytics has accordingly emerged to add value to the technical aspects of storage, transfer, and analysis required for merging vast arrays of omic-, clinical- and eco-datasets. In turn, this new field at the interface of biology, medicine, and information science is systematically transforming modern therapeutics across discovery, development, regulation, and utilization. “…a man's discourse was like to a rich Persian carpet, the beautiful figures and patterns of which can be shown only by spreading and extending it out; when it is contracted and folded up, they are obscured and lost” Themistocles quoted by Plutarch AD 46 – AD 120 PMID:26888297

  16. DEGAS: sharing and tracking target compound ideas with external collaborators.

    PubMed

    Lee, Man-Ling; Aliagas, Ignacio; Dotson, Jennafer; Feng, Jianwen A; Gobbi, Alberto; Heffron, Timothy

    2012-02-27

    To minimize the risk of failure in clinical trials, drug discovery teams must propose active and selective clinical candidates with good physicochemical properties. An additional challenge is that today drug discovery is often conducted by teams at different geographical locations. To improve the collaborative decision making on which compounds to synthesize, we have implemented DEGAS, an application which enables scientists from Genentech and from collaborating external partners to instantly access the same data. DEGAS was implemented to ensure that only the best target compounds are made and that they are made without duplicate effort. Physicochemical properties and DMPK model predictions are computed for each compound to allow the team to make informed decisions when prioritizing. The synthesis progress can be easily tracked. While developing DEGAS, ease of use was a particular goal in order to minimize the difficulty of training and supporting remote users.

  17. HCV versus HIV drug discovery: Déjà vu all over again?

    PubMed

    Watkins, William J; Desai, Manoj C

    2013-04-15

    Efforts to address HIV infection have been highly successful, enabling chronic suppression of viral replication with once-daily regimens. More recent research into HCV therapeutics has also resulted in very promising clinical candidates. This Digest explores similarities and differences between the two fields and compares the chronology of drug discovery relative to the availability of enabling tools, concluding that safe and convenient once-daily regimens are likely to reach approval much more rapidly for HCV than was the case for HIV. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Label-assisted mass spectrometry for the acceleration of reaction discovery and optimization

    NASA Astrophysics Data System (ADS)

    Cabrera-Pardo, Jaime R.; Chai, David I.; Liu, Song; Mrksich, Milan; Kozmin, Sergey A.

    2013-05-01

    The identification of new reactions expands our knowledge of chemical reactivity and enables new synthetic applications. Accelerating the pace of this discovery process remains challenging. We describe a highly effective and simple platform for screening a large number of potential chemical reactions in order to discover and optimize previously unknown catalytic transformations, thereby revealing new chemical reactivity. Our strategy is based on labelling one of the reactants with a polyaromatic chemical tag, which selectively undergoes a photoionization/desorption process upon laser irradiation, without the assistance of an external matrix, and enables rapid mass spectrometric detection of any products originating from such labelled reactants in complex reaction mixtures without any chromatographic separation. This method was successfully used for high-throughput discovery and subsequent optimization of two previously unknown benzannulation reactions.
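
    The screening logic implied by the abstract can be caricatured as a mass-matching step: because every product retaining the polyaromatic tag is selectively ionized, candidate product masses (tagged reactant plus coupling partner, minus any by-product loss) can be compared against the observed peak list. The function and all numeric values below are hypothetical illustrations, not taken from the paper:

```python
def find_tagged_products(peaks, tagged_reactant_mass, partner_masses,
                         byproduct_losses=(0.0,), tol=0.01):
    """Screen an observed peak list for masses consistent with products
    that retain the labelled reactant.

    Candidate masses are computed as tagged reactant + partner - loss.
    Returns a list of (observed_mz, partner_mass, loss) matches within
    the mass tolerance `tol`."""
    hits = []
    for partner in partner_masses:
        for loss in byproduct_losses:
            expected = tagged_reactant_mass + partner - loss
            for mz in peaks:
                if abs(mz - expected) <= tol:
                    hits.append((mz, partner, loss))
    return hits
```

    In a high-throughput setting this check would run over thousands of reaction wells, flagging only those whose spectra contain a tag-bearing product mass, which is what removes the need for chromatographic separation.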

  19. Discovery Mechanisms for the Sensor Web

    PubMed Central

    Jirka, Simon; Bröring, Arne; Stasch, Christoph

    2009-01-01

    This paper addresses the discovery of sensors within the OGC Sensor Web Enablement framework. Whereas services like the OGC Web Map Service or Web Coverage Service are already well supported through catalogue services, the field of sensor networks and the corresponding discovery mechanisms remains a challenge. The focus of this article is on the use of existing OGC Sensor Web components for realizing a discovery solution. After discussing the requirements for a Sensor Web discovery mechanism, an approach is presented that was developed within the EU-funded project “OSIRIS”. This solution offers mechanisms to search for sensors, exploit basic semantic relationships, harvest sensor metadata and integrate sensor discovery into already existing catalogues. PMID:22574038

  20. 40 CFR 22.19 - Prehearing information exchange; prehearing conference; other discovery.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... method of discovery sought, provide the proposed discovery instruments, and describe in detail the nature... finding that: (i) The information sought cannot reasonably be obtained by alternative methods of discovery... promptly supplement or correct the exchange when the party learns that the information exchanged or...

  1. 10 CFR 590.305 - Informal discovery.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 4 2012-01-01 2012-01-01 false Informal discovery. 590.305 Section 590.305 Energy... WITH RESPECT TO THE IMPORT AND EXPORT OF NATURAL GAS Procedures § 590.305 Informal discovery. The parties to a proceeding may conduct discovery through use of procedures such as written interrogatories or...

  2. 10 CFR 590.305 - Informal discovery.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 4 2011-01-01 2011-01-01 false Informal discovery. 590.305 Section 590.305 Energy... WITH RESPECT TO THE IMPORT AND EXPORT OF NATURAL GAS Procedures § 590.305 Informal discovery. The parties to a proceeding may conduct discovery through use of procedures such as written interrogatories or...

  3. Grid Enabled Geospatial Catalogue Web Service

    NASA Technical Reports Server (NTRS)

    Chen, Ai-Jun; Di, Li-Ping; Wei, Ya-Xing; Liu, Yang; Bui, Yu-Qi; Hu, Chau-Min; Mehrotra, Piyush

    2004-01-01

    Geospatial Catalogue Web Service is a vital service for sharing and interoperating volumes of distributed heterogeneous geospatial resources, such as data, services, applications, and their replicas over the web. Based on Grid technology and the Open Geospatial Consortium (OGC) Catalogue Service - Web Information Model, this paper proposes a new information model for Geospatial Catalogue Web Service, named GCWS, which can securely provide Grid-based publishing, managing and querying of geospatial data and services, and transparent access to replica data and related services under the Grid environment. This information model integrates the information model of the Grid Replica Location Service (RLS)/Monitoring & Discovery Service (MDS) with the information model of the OGC Catalogue Service (CSW), and draws on the geospatial data metadata standards from ISO 19115, FGDC and the NASA EOS Core System, and the service metadata standards from ISO 19119, to extend itself for expressing geospatial resources. Using GCWS, any valid geospatial user who belongs to an authorized Virtual Organization (VO) can securely publish and manage geospatial resources, and especially query on-demand data in the virtual community and get it back through data-related services which provide functions such as subsetting, reformatting, reprojection, etc. This work facilitates geospatial resource sharing and interoperation under the Grid environment, making geospatial resources Grid-enabled and Grid technologies geospatially enabled. It also allows researchers to focus on science, and not on issues with computing capability, data location, processing and management. GCWS is also a key component for workflow-based virtual geospatial data production.

  4. Trusted Data Sharing and Imagery Workflow for Disaster Response in Partnership with the State of California

    NASA Astrophysics Data System (ADS)

    Glasscoe, M. T.; Aubrey, A. D.; Rosinski, A.; Morentz, J.; Beilin, P.; Jones, D.

    2016-12-01

    Providing actionable data for situational awareness following an earthquake or other disaster is critical to decision makers in order to improve their ability to anticipate requirements and provide appropriate resources for response. Key information on the nature, magnitude and scope of damage, or Essential Elements of Information (EEI), necessary to achieve situational awareness are often generated from a wide array of organizations and disciplines, using any number of geospatial and non-geospatial technologies. We have worked in partnership with the California Earthquake Clearinghouse to develop actionable data products for use in their response efforts, particularly in regularly scheduled, statewide exercises like the recent 2016 Cascadia Rising NLE, the May 2015 Capstone/SoCal NLE/Ardent Sentry Exercises and in the August 2014 South Napa earthquake activation and plan to participate in upcoming exercises with the National Guard (Vigilant Guard 17) and the USGS (Haywired). Our efforts over the past several years have been to aid in enabling coordination between research scientists, applied scientists and decision makers in order to reduce duplication of effort, maximize information sharing, translate scientific results into actionable information for decision-makers, and increase situational awareness. We will present perspectives on developing tools for decision support and data discovery in partnership with the Clearinghouse. Products delivered include map layers as part of the common operational data plan for the Clearinghouse delivered through XchangeCore Web Service Data Orchestration and the SpotOnResponse field analysis application. We are exploring new capabilities for real-time collaboration using GeoCollaborate®. 
XchangeCore allows real-time, two-way information sharing, enabling users to create merged datasets from multiple providers. SpotOnResponse provides web-enabled secure information exchange, collaboration, and field analysis for responders. GeoCollaborate® enables users to access, share, manipulate, and interact with data across disparate platforms, connecting public- and private-sector agencies and organizations rapidly on the same map at the same time, and allowing improved collaborative decision making on the same datasets simultaneously.

  5. mHealth Visual Discovery Dashboard.

    PubMed

    Fang, Dezhi; Hohman, Fred; Polack, Peter; Sarker, Hillol; Kahng, Minsuk; Sharmin, Moushumi; al'Absi, Mustafa; Chau, Duen Horng

    2017-09-01

    We present Discovery Dashboard, a visual analytics system for exploring large volumes of time series data from mobile medical field studies. Discovery Dashboard offers interactive exploration tools and a data mining motif discovery algorithm to help researchers formulate hypotheses, discover trends and patterns, and ultimately gain a deeper understanding of their data. Discovery Dashboard emphasizes user freedom and flexibility during the data exploration process and enables researchers to do, in the web browser and in real time, things that were previously challenging or impossible. We demonstrate our system visualizing data from a mobile sensor study conducted at the University of Minnesota that included 52 participants who were trying to quit smoking.
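The motif-discovery component described above can be illustrated with a minimal brute-force sketch (an illustration under simplifying assumptions, not the dashboard's actual algorithm, which is built for large data volumes): slide a fixed-length window across the series and report the pair of non-overlapping windows with the smallest Euclidean distance.

```python
import math

def find_motif(series, w):
    """Brute-force motif discovery: return the start indices of the two
    non-overlapping windows of length w with the smallest Euclidean distance."""
    best = (float("inf"), None, None)
    n = len(series)
    for i in range(n - w + 1):
        for j in range(i + w, n - w + 1):  # non-overlapping pairs only
            d = math.sqrt(sum((series[i + k] - series[j + k]) ** 2 for k in range(w)))
            if d < best[0]:
                best = (d, i, j)
    return best[1], best[2]

# The shape repeated at positions 0 and 8 is reported as the motif.
data = [0, 5, 6, 5, 0, 1, 0, 2, 0, 5, 6, 5, 0, 1]
print(find_motif(data, 4))  # -> (0, 8)
```

This pairwise scan is quadratic in the number of windows; production systems use much faster techniques (e.g., matrix-profile methods) for the data volumes the dashboard targets.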

  6. mHealth Visual Discovery Dashboard

    PubMed Central

    Fang, Dezhi; Hohman, Fred; Polack, Peter; Sarker, Hillol; Kahng, Minsuk; Sharmin, Moushumi; al'Absi, Mustafa; Chau, Duen Horng

    2018-01-01

    We present Discovery Dashboard, a visual analytics system for exploring large volumes of time series data from mobile medical field studies. Discovery Dashboard offers interactive exploration tools and a data mining motif discovery algorithm to help researchers formulate hypotheses, discover trends and patterns, and ultimately gain a deeper understanding of their data. Discovery Dashboard emphasizes user freedom and flexibility during the data exploration process and enables researchers to do, in the web browser and in real time, things that were previously challenging or impossible. We demonstrate our system visualizing data from a mobile sensor study conducted at the University of Minnesota that included 52 participants who were trying to quit smoking. PMID:29354812

  7. Data mining for better material synthesis: The case of pulsed laser deposition of complex oxides

    NASA Astrophysics Data System (ADS)

    Young, Steven R.; Maksov, Artem; Ziatdinov, Maxim; Cao, Ye; Burch, Matthew; Balachandran, Janakiraman; Li, Linglong; Somnath, Suhas; Patton, Robert M.; Kalinin, Sergei V.; Vasudevan, Rama K.

    2018-03-01

    The pursuit of more advanced electronics and of solutions to energy needs often hinges upon the discovery and optimization of new functional materials. However, the discovery rate of these materials is alarmingly low. Much of the information that could drive this rate higher is scattered across tens of thousands of papers in the extant literature published over several decades, but it is not in an indexed form and cannot be used in its entirety without substantial effort. Many of these limitations can be circumvented if the experimentalist has access to systematized collections of prior experimental procedures and results. Here, we investigate the property-processing relationship during the growth of oxide films by pulsed laser deposition. To do so, we develop an enabling software tool to (1) mine the literature of relevant papers for synthesis parameters and functional properties of previously studied materials, (2) enhance the accuracy of this mining through crowd-sourcing approaches, (3) create a searchable repository that will be a community-wide resource enabling materials scientists to leverage this information, and (4) provide, through the Jupyter notebook platform, simple machine-learning-based analysis to learn the complex interactions between growth parameters and functional properties (all data/codes available at https://github.com/ORNL-DataMatls). The results allow visualization of growth windows, trends and outliers, which can serve as a template for analyzing the distribution of growth conditions, provide starting points for related compounds and act as feedback for first-principles calculations. Such tools will comprise an integral part of the materials design schema in the coming decade.
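As a toy illustration of the literature-mining step (1), a regular expression can pull candidate growth parameters, such as substrate temperature and background pressure, out of free-text synthesis descriptions (the patterns below are hypothetical, not those of the authors' tool):

```python
import re

# Hypothetical patterns for two common pulsed-laser-deposition growth parameters.
TEMP_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:°C|deg(?:rees)? C)", re.IGNORECASE)
PRESSURE_RE = re.compile(r"(\d+(?:\.\d+)?(?:e-?\d+)?)\s*(?:mTorr|Torr|mbar)", re.IGNORECASE)

def extract_growth_params(text):
    """Return candidate (temperatures, pressures) found in a synthesis description."""
    return TEMP_RE.findall(text), PRESSURE_RE.findall(text)

abstract = ("Films were grown at 750 °C under an oxygen pressure of "
            "100 mTorr and annealed at 600 °C.")
print(extract_growth_params(abstract))  # -> (['750', '600'], ['100'])
```

In practice such extraction is noisy, which is why the authors add a crowd-sourcing pass to improve accuracy before the data enter the searchable repository.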

  8. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    PubMed

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity, resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge into the management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms, in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  9. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    PubMed Central

    Irizarry, Kristopher J. L.; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L.; Barrett, Gini; Barr, Margaret C.

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity, resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge into the management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms, in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management. PMID:27376076

  10. NASA Wavelength: A Full Spectrum of NASA Resources for Earth and Space Science Education

    NASA Astrophysics Data System (ADS)

    Smith, D. A.; Schwerin, T. G.; Peticolas, L. M.; Porcello, D.; Kansa, E.; Shipp, S. S.; Bartolone, L.

    2013-12-01

    The NASA Science Education and Public Outreach Forums have developed a digital library, NASAWavelength.org, that enables easy discovery and retrieval of thousands of resources from the NASA Earth and space science education portfolio. The system has been developed based on best practices in the architecture and design of web-based information systems. The design style and philosophy emphasize simple, reusable data and services that facilitate the free flow of data across systems. The primary audiences for NASA Wavelength are STEM educators (K-12, higher education and informal education) as well as scientists, education and public outreach professionals who work with K-12, higher education, and informal education. A NASA Wavelength strandmap service features the 19 AAAS strandmaps that are most relevant to NASA science; the service also generates all of the 103 AAAS strandmaps with content from the Wavelength collection. These maps graphically and interactively provide connections between concepts as well as illustrate how concepts build upon one another across grade levels. New features have been developed for this site based on user feedback, including list-building so that users can create and share individual collections within Wavelength. We will also discuss potential methods for integrating the Next Generation Science Standards (NGSS) into the search and discovery tools on NASA Wavelength.

  11. Quantitative proteomics in cardiovascular research: global and targeted strategies

    PubMed Central

    Shen, Xiaomeng; Young, Rebeccah; Canty, John M.; Qu, Jun

    2014-01-01

    Extensive technical advances in the past decade have substantially expanded quantitative proteomics in cardiovascular research. This holds great promise for elucidating the mechanisms of cardiovascular diseases (CVD) and for the discovery of cardiac biomarkers used for diagnosis and treatment evaluation. Global and targeted proteomics are the two major avenues of quantitative proteomics. While global approaches enable unbiased discovery of altered proteins via relative quantification at the proteome level, targeted techniques provide higher sensitivity and accuracy, and are capable of multiplexed absolute quantification in numerous clinical/biological samples. While these techniques are promising, technical challenges need to be overcome to enable their full utilization in cardiovascular medicine. Here we discuss recent advances in quantitative proteomics and summarize applications in cardiovascular research with an emphasis on biomarker discovery and elucidating molecular mechanisms of disease. We propose the integration of global and targeted strategies as a high-throughput pipeline for cardiovascular proteomics. Targeted approaches enable rapid, extensive validation of biomarker candidates discovered by global proteomics. These approaches provide a promising alternative to immunoassays and other low-throughput means currently used for limited validation. PMID:24920501

  12. 40 CFR 22.52 - Information exchange and discovery.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Procedure Act § 22.52 Information exchange and discovery. Respondent's information exchange pursuant to § 22.19(a) shall include information on any economic benefit resulting from any activity or failure to act... 40 Protection of Environment 1 2010-07-01 2010-07-01 false Information exchange and discovery. 22...

  13. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    PubMed Central

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in its metadata make discovery difficult for human users, and end-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining a Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  14. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    PubMed

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in its metadata make discovery difficult for human users, and end-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining a Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.
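The user-relevance-feedback half of such a query process can be sketched with a classic Rocchio-style update standing in for the paper's SVM (a simplified illustration; the layer names and feature vectors below are invented): move the query vector toward layers the user marked relevant, away from layers marked irrelevant, and re-rank by cosine similarity.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: pull the query vector toward feature vectors the user
    marked relevant and push it away from those marked irrelevant."""
    dim = len(query)
    def centroid(vs):
        return [sum(v[i] for v in vs) / len(vs) if vs else 0.0 for i in range(dim)]
    r, ir = centroid(relevant), centroid(irrelevant)
    return [alpha * q + beta * r[i] - gamma * ir[i] for i, q in enumerate(query)]

# Toy graphic-content feature vectors for three map layers (e.g., colour histograms).
layers = {"A": [1.0, 0.0, 0.0], "B": [0.0, 1.0, 0.0], "C": [0.5, 0.5, 0.0]}
query = [1.0, 0.0, 0.0]
# Heavy feedback weights so the effect is visible in this tiny example.
updated = rocchio(query, relevant=[layers["B"]], irrelevant=[layers["A"]],
                  alpha=1.0, beta=1.0, gamma=1.0)
ranking = sorted(layers, key=lambda k: cosine(updated, layers[k]), reverse=True)
print(ranking)  # -> ['B', 'C', 'A']
```

After one round of feedback the layer the user marked relevant rises to the top of the ranking; the paper's approach instead learns a discriminative boundary from the same feedback labels.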

  15. Assessment of microbiota:host interactions at the vaginal mucosa interface.

    PubMed

    Pruski, Pamela; Lewis, Holly V; Lee, Yun S; Marchesi, Julian R; Bennett, Phillip R; Takats, Zoltan; MacIntyre, David A

    2018-04-27

    There is increasing appreciation of the role that vaginal microbiota play in health and disease throughout a woman's lifespan. This has been driven partly by molecular techniques that enable detailed identification and characterisation of microbial community structures. However, these methods do not enable assessment of the biochemical and immunological interactions between host and vaginal microbiota involved in pathophysiology. This review examines our current knowledge of the relationships that exist between vaginal microbiota and the host at the level of the vaginal mucosal interface. We also consider methodological approaches to microbiomic, immunologic and metabolic profiling that permit assessment of these interactions. Integration of information derived from these platforms brings the potential for biomarker discovery, disease risk stratification and improved understanding of the mechanisms regulating vaginal microbial community dynamics in health and disease. Copyright © 2018 Elsevier Inc. All rights reserved.

  16. Spin transport in epitaxial graphene

    NASA Astrophysics Data System (ADS)

    Tbd, -

    2014-03-01

    Spintronics is a paradigm focusing on spin as the information vector in fast and ultra-low-power non-volatile devices such as the new STT-MRAM. Beyond its widely distributed application in data storage, it aims at providing more complex architectures and a powerful beyond-CMOS solution for information processing. The recent discovery of graphene has opened novel exciting opportunities in terms of functionalities and performances for spintronics devices. We will present experimental results allowing us to assess the potential of graphene for spintronics. We will show that unprecedented, highly efficient spin information transport can occur in epitaxial graphene, leading to large spin signals and macroscopic spin diffusion lengths (~100 microns), a key enabler for the advent of envisioned beyond-CMOS spin-based logic architectures. We will also show how the device behavior is well explained within the framework of the Valet-Fert drift-diffusion equations. Furthermore, we will show that a thin graphene passivation layer can prevent the oxidation of a ferromagnet, enabling its use in novel humid/ambient low-cost processes for spintronics devices, while keeping its highly surface-sensitive spin current polarizer/analyzer behavior and adding a new enhanced spin-filtering property. These different experiments unveil promising uses of graphene for spintronics.

  17. System and Method for Providing a Climate Data Persistence Service

    NASA Technical Reports Server (NTRS)

    Schnase, John L. (Inventor); Ripley, III, William David (Inventor); Duffy, Daniel Q. (Inventor); Thompson, John H. (Inventor); Strong, Savannah L. (Inventor); McInerney, Mark (Inventor); Sinno, Scott (Inventor); Tamkin, Glenn S. (Inventor); Nadeau, Denis (Inventor)

    2018-01-01

    A system, method and computer-readable storage devices for providing a climate data persistence service. A system configured to provide the service can include a climate data server that performs data and metadata storage and management functions for climate data objects, a compute-storage platform that provides the resources needed to support a climate data server, provisioning software that allows climate data server instances to be deployed as virtual climate data servers in a cloud computing environment, and a service interface, wherein persistence service capabilities are invoked by software applications running on a client device. The climate data objects can be in various formats, such as International Organization for Standards (ISO) Open Archival Information System (OAIS) Reference Model Submission Information Packages, Archive Information Packages, and Dissemination Information Packages. The climate data server can enable scalable, federated storage, management, discovery, and access, and can be tailored for particular use cases.

  18. Harnessing Disordered-Ensemble Quantum Dynamics for Machine Learning

    NASA Astrophysics Data System (ADS)

    Fujii, Keisuke; Nakajima, Kohei

    2017-08-01

    Quantum computers have amazing potential for fast information processing. However, the realization of a digital quantum computer is still a challenging problem requiring highly accurate controls and key application strategies. Here we propose a platform, quantum reservoir computing, to solve these issues by exploiting the natural quantum dynamics of ensemble systems, which are ubiquitous in laboratories nowadays, for machine learning. This framework enables ensemble quantum systems to universally emulate nonlinear dynamical systems, including classical chaos. A number of numerical experiments show that quantum systems consisting of 5-7 qubits possess computational capabilities comparable to conventional recurrent neural networks of 100-500 nodes. This discovery opens up a new paradigm for information processing with artificial intelligence powered by quantum physics.

  19. 2015 Army Science Planning and Strategy Meeting Series: Outcomes and Conclusions

    DTIC Science & Technology

    2017-12-21

    modeling and nanoscale characterization tools to enable efficient design of hybridized manufacturing; realtime, multiscale computational capability...to enable predictive analytics for expeditionary on-demand manufacturing • Discovery of design principles to enable programming advanced genetic...goals, significant research is needed to mature the fundamental materials science, processing and manufacturing sciences, design methodologies, data

  20. Concept of Operations for Collaboration and Discovery from Big Data Across Enterprise Data Warehouses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Olama, Mohammed M; Nutaro, James J; Sukumar, Sreenivas R

    2013-01-01

    The success of data-driven business in government, science, and private industry is driving the need for seamless integration of intra- and inter-enterprise data sources to extract knowledge nuggets in the form of correlations, trends, patterns and behaviors previously not discovered due to the physical and logical separation of datasets. Today, as the volume, velocity, variety and complexity of enterprise data keep increasing, next-generation analysts are facing several challenges in the knowledge extraction process. Towards addressing these challenges, data-driven organizations that rely on the success of their analysts have to make investment decisions for sustainable data/information systems and knowledge discovery. Options that organizations are considering include newer storage/analysis architectures, better analysis machines, redesigned analysis algorithms, collaborative knowledge management tools, and query builders, amongst many others. In this paper, we present a concept of operations for enabling knowledge discovery that data-driven organizations can leverage towards making their investment decisions. We base our recommendations on the experience gained from integrating multi-agency enterprise data warehouses at the Oak Ridge National Laboratory to design the foundation of future knowledge-nurturing data-system architectures.

  1. MorphDB: Prioritizing Genes for Specialized Metabolism Pathways and Gene Ontology Categories in Plants.

    PubMed

    Zwaenepoel, Arthur; Diels, Tim; Amar, David; Van Parys, Thomas; Shamir, Ron; Van de Peer, Yves; Tzfadia, Oren

    2018-01-01

    Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). MORPH is a previously developed guilt-by-association method that has proven to be an efficient algorithm, performing particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene-centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
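The guilt-by-association idea behind such methods can be illustrated with a minimal sketch (not the MORPH algorithm itself; the gene names and expression profiles are invented): score each candidate gene by its mean Pearson correlation with the known pathway genes across expression samples, then rank candidates by that score.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two expression profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_candidates(expr, pathway, candidates):
    """Rank candidate genes by mean correlation with the known pathway genes.

    expr maps gene name -> expression profile (one value per sample)."""
    def score(gene):
        return sum(pearson(expr[gene], expr[p]) for p in pathway) / len(pathway)
    return sorted(candidates, key=score, reverse=True)

expr = {
    "P1": [1, 2, 3, 4, 5],   # known pathway gene
    "P2": [2, 4, 6, 8, 10],  # known pathway gene, co-expressed with P1
    "G1": [1, 2, 3, 4, 6],   # candidate tracking the pathway profile
    "G2": [5, 1, 4, 2, 3],   # candidate with unrelated expression
}
print(rank_candidates(expr, ["P1", "P2"], ["G1", "G2"]))  # -> ['G1', 'G2']
```

MORPH goes well beyond this sketch, choosing among clustering solutions and expression datasets and cross-validating its rankings, but the underlying association signal is the same.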

  2. Sea Level Rise Data Discovery

    NASA Astrophysics Data System (ADS)

    Quach, N.; Huang, T.; Boening, C.; Gill, K. M.

    2016-12-01

    Research related to sea level rise crosses multiple disciplines from sea ice to land hydrology. The NASA Sea Level Change Portal (SLCP) is a one-stop source for current sea level change information and data, including interactive tools for accessing and viewing regional data, a virtual dashboard of sea level indicators, and ongoing updates through a suite of editorial products that include content articles, graphics, videos, and animations. The architecture behind the SLCP makes it possible to integrate web content and data relevant to sea level change that are archived across various data centers as well as new data generated by sea level change principal investigators. The Extensible Data Gateway Environment (EDGE) is incorporated into the SLCP architecture to provide a unified platform for web content and science data discovery. EDGE is a data integration platform designed to facilitate high-performance geospatial data discovery and access with the ability to support multi-metadata standard specifications. EDGE has the capability to retrieve data from one or more sources and package the resulting sets into a single response to the requestor. With this unified endpoint, the Data Analysis Tool that is available on the SLCP can retrieve dataset and granule level metadata as well as perform geospatial search on the data. This talk focuses on the architecture that makes it possible to seamlessly integrate and enable discovery of disparate data relevant to sea level rise.

  3. Biomechanics of Early Cardiac Development

    PubMed Central

    Goenezen, Sevan; Rennie, Monique Y.

    2012-01-01

    Biomechanics affect early cardiac development, from looping to the development of chambers and valves. Hemodynamic forces are essential for proper cardiac development, and their disruption leads to congenital heart defects. A wealth of information already exists on early cardiac adaptations to hemodynamic loading, and new technologies, including high resolution imaging modalities and computational modeling, are enabling a more thorough understanding of relationships between hemodynamics and cardiac development. Imaging and modeling approaches, used in combination with biological data on cell behavior and adaptation, are paving the road for new discoveries on links between biomechanics and biology and their effect on cardiac development and fetal programming. PMID:22760547

  4. Co-fuse: a new class discovery analysis tool to identify and prioritize recurrent fusion genes from RNA-sequencing data.

    PubMed

    Paisitkriangkrai, Sakrapee; Quek, Kelly; Nievergall, Eva; Jabbour, Anissa; Zannettino, Andrew; Kok, Chung Hoow

    2018-06-07

    Recurrent oncogenic fusion genes play a critical role in the development of various cancers and diseases and provide, in some cases, excellent therapeutic targets. To date, analysis tools that can identify and compare recurrent fusion genes across multiple samples have not been available to researchers. To address this deficiency, we developed Co-occurrence Fusion (Co-fuse), a new, easy-to-use software tool that enables biologists to merge RNA-seq information, allowing them to identify recurrent fusion genes without the need for exhaustive data processing. Notably, Co-fuse is based on pattern mining and statistical analysis, which enables the identification of hidden patterns of recurrent fusion genes. In this report, we show that Co-fuse can be used to identify two distinct groups within a set of 49 leukemic cell lines based on their recurrent fusion genes: a multiple myeloma (MM)-enriched cluster and an acute myeloid leukemia (AML)-enriched cluster. Our experimental results further demonstrate that Co-fuse can identify known driver fusion genes (e.g., IGH-MYC, IGH-WHSC1) in MM when compared to AML samples, indicating the potential of Co-fuse to aid the discovery of yet unknown driver fusion genes through cohort comparisons. Additionally, using a 272-sample primary glioma RNA-seq dataset, Co-fuse was able to validate recurrent fusion genes, further demonstrating the power of this analysis tool. Taken together, Co-fuse is a powerful new analysis tool that can be readily applied to large RNA-seq datasets and may lead to the discovery of new disease subgroups and potentially new driver genes for which targeted therapies could be developed. The Co-fuse R source code is publicly available at https://github.com/sakrapee/co-fuse.
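The core recurrence idea, counting how many samples in each cohort carry a given fusion and comparing cohorts, can be sketched as follows. This is a hypothetical Python illustration, not Co-fuse's R implementation; the sample names and cohort sizes are invented.

```python
# Sketch of recurrent-fusion tabulation across samples: a fusion is
# "recurrent" if seen in at least min_recurrence samples, and cohort
# enrichment is read off the per-cohort carrier fractions.

from collections import Counter

def recurrent_fusions(samples, min_recurrence=2):
    """samples: {sample_id: set of (gene5p, gene3p) fusion pairs}."""
    counts = Counter(f for fusions in samples.values() for f in fusions)
    return {f: n for f, n in counts.items() if n >= min_recurrence}

def enrichment(group_a, group_b):
    """Fraction of samples carrying each recurrent fusion in two cohorts."""
    rec = set(recurrent_fusions(group_a)) | set(recurrent_fusions(group_b))
    table = {}
    for fusion in rec:
        fa = sum(fusion in s for s in group_a.values()) / len(group_a)
        fb = sum(fusion in s for s in group_b.values()) / len(group_b)
        table[fusion] = (fa, fb)
    return table

mm = {"MM1": {("IGH", "MYC")},
      "MM2": {("IGH", "MYC"), ("IGH", "WHSC1")},
      "MM3": {("IGH", "WHSC1")}}
aml = {"AML1": {("RUNX1", "RUNX1T1")}, "AML2": {("RUNX1", "RUNX1T1")}}
table = enrichment(mm, aml)
print(table[("IGH", "MYC")])  # high carrier fraction in MM, absent in AML
```

A real tool would add a statistical test (e.g., Fisher's exact test) on these counts; the table above is only the tabulation step.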

  5. Directional genomic hybridization for chromosomal inversion discovery and detection.

    PubMed

    Ray, F Andrew; Zimmerman, Erin; Robinson, Bruce; Cornforth, Michael N; Bedford, Joel S; Goodwin, Edwin H; Bailey, Susan M

    2013-04-01

    Chromosomal rearrangements are a source of structural variation within the genome that figure prominently in human disease, where the importance of translocations and deletions is well recognized. In principle, inversions-reversals in the orientation of DNA sequences within a chromosome-should have similar detrimental potential. However, the study of inversions has been hampered by traditional approaches used for their detection, which are not particularly robust. Even with significant advances in whole genome approaches, changes in the absolute orientation of DNA remain difficult to detect routinely. Consequently, our understanding of inversions is still surprisingly limited, as is our appreciation for their frequency and involvement in human disease. Here, we introduce the directional genomic hybridization methodology of chromatid painting-a whole new way of looking at structural features of the genome-that can be employed with high resolution on a cell-by-cell basis, and demonstrate its basic capabilities for genome-wide discovery and targeted detection of inversions. Bioinformatics enabled development of sequence- and strand-specific directional probe sets, which when coupled with single-stranded hybridization, greatly improved the resolution and ease of inversion detection. We highlight examples of the far-ranging applicability of this cytogenomics-based approach, which include confirmation of the alignment of the human genome database and evidence that individuals themselves share similar sequence directionality, as well as use in comparative and evolutionary studies for any species whose genome has been sequenced. In addition to applications related to basic mechanistic studies, the information obtainable with strand-specific hybridization strategies may ultimately enable novel gene discovery, thereby benefitting the diagnosis and treatment of a variety of human disease states and disorders including cancer, autism, and idiopathic infertility.

  6. Online Metadata Directories: A way of preserving, sharing and discovering scientific information

    NASA Technical Reports Server (NTRS)

    Meaux, M.

    2005-01-01

    The Global Change Master Directory (GCMD) assists the scientific community in the discovery of and linkage to Earth Science data and provides data holders a means to advertise their data to the community through its portals, i.e., online customized subset metadata directories. These directories effectively serve communities such as the Joint Committee on Antarctic Data Management (JCADM), the Global Observing System Information Center (GOSIC), and the Global Ocean Ecosystems Dynamic Program (GLOBEC) by increasing the visibility of their data holdings. The purpose of the Gulf of Maine Ocean Data Partnership (GoMODP) is to "promote and coordinate the sharing, linking, electronic dissemination, and use of data on the Gulf of Maine region". The participants have decided that a "coordinated effort is needed to enable users throughout the Gulf of Maine region and beyond to discover and put to use the vast and growing quantities of data in their respective databases". GoMODP members have invited the GCMD to discuss potential collaborations associated with this effort. The presentation will focus on the use of the GCMD's metadata directory as a powerful tool for data discovery and sharing. An overview of the directory and its metadata authoring tools will be given.

  7. KSC-2012-2128

    NASA Image and Video Library

    2012-04-14

    CAPE CANAVERAL, Fla. – At the Shuttle Landing Facility at NASA’s Kennedy Space Center in Florida, workers secure a sling to space shuttle Discovery to enable the mate-demate device to lift it onto a Shuttle Carrier Aircraft. The device, known as the MDD, is a large gantry-like steel structure used to hoist a shuttle off the ground and position it onto the back of the aircraft, or SCA. The SCA is a Boeing 747 jet, originally manufactured for commercial use, which was modified by NASA to transport the shuttles between destinations on Earth. The SCA designated NASA 905 is assigned to the remaining ferry missions, delivering the shuttles to their permanent public display sites. NASA 905 is scheduled to ferry Discovery to the Washington Dulles International Airport in Virginia on April 17, after which the shuttle will be placed on display in the Smithsonian's National Air and Space Museum Steven F. Udvar-Hazy Center. For more information on the SCA, visit http://www.nasa.gov/centers/dryden/news/FactSheets/FS-013-DFRC.html. For more information on shuttle transition and retirement activities, visit http://www.nasa.gov/transition. Photo credit: NASA/Dimitri Gerondidakis

  8. KSC-2012-2129

    NASA Image and Video Library

    2012-04-14

    CAPE CANAVERAL, Fla. – At the Shuttle Landing Facility at NASA’s Kennedy Space Center in Florida, workers secure a sling to space shuttle Discovery to enable the mate-demate device to lift it onto a Shuttle Carrier Aircraft. The device, known as the MDD, is a large gantry-like steel structure used to hoist a shuttle off the ground and position it onto the back of the aircraft, or SCA. The SCA is a Boeing 747 jet, originally manufactured for commercial use, which was modified by NASA to transport the shuttles between destinations on Earth. The SCA designated NASA 905 is assigned to the remaining ferry missions, delivering the shuttles to their permanent public display sites. NASA 905 is scheduled to ferry Discovery to the Washington Dulles International Airport in Virginia on April 17, after which the shuttle will be placed on display in the Smithsonian's National Air and Space Museum Steven F. Udvar-Hazy Center. For more information on the SCA, visit http://www.nasa.gov/centers/dryden/news/FactSheets/FS-013-DFRC.html. For more information on shuttle transition and retirement activities, visit http://www.nasa.gov/transition. Photo credit: NASA/Dimitri Gerondidakis

  9. MetaCoMET: a web platform for discovery and visualization of the core microbiome

    USDA-ARS?s Scientific Manuscript database

    A key component of the analysis of microbiome datasets is the identification of OTUs shared between multiple experimental conditions, commonly referred to as the core microbiome. Results: We present a web platform named MetaCoMET that enables the discovery and visualization of the core microbiome an...
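The core-microbiome notion described above, OTUs shared between multiple experimental conditions, reduces to simple set operations. The sketch below is illustrative (OTU and condition names invented) and adds a prevalence threshold so the "core" need not require presence in every condition.

```python
# Minimal core-microbiome sketch: the core is the set of OTUs present in at
# least a threshold fraction of the experimental conditions.

def core_microbiome(conditions, threshold=1.0):
    """conditions: {condition_name: set of OTU ids}.
    Returns OTUs present in >= threshold fraction of conditions."""
    all_otus = set().union(*conditions.values())
    n = len(conditions)
    return {otu for otu in all_otus
            if sum(otu in otus for otus in conditions.values()) / n >= threshold}

conditions = {
    "soil": {"OTU1", "OTU2", "OTU3"},
    "root": {"OTU1", "OTU2", "OTU4"},
    "leaf": {"OTU1", "OTU5"},
}
print(sorted(core_microbiome(conditions)))          # strict core: all conditions
print(sorted(core_microbiome(conditions, 2 / 3)))   # present in at least 2 of 3
```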

  10. Service-based analysis of biological pathways

    PubMed Central

    Zheng, George; Bouguettaya, Athman

    2009-01-01

    Background Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled by leveraging a Web service-based modeling and mining approach. Results Inspired by molecular recognition and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and on service-based simulation that verifies the user's hypotheses. Conclusion Web service modeling of biological processes allows these processes to be easily accessed and invoked on the Web. Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. Algorithms presented in this paper for automatically highlighting interesting subgraphs within an identified pathway network enable the user to formulate hypotheses, which can be tested using our simulation algorithm, also described in this paper. PMID:19796403
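Pathway identification over service models can be pictured as graph search: processes are nodes and a pathway is a chain linking a source process to a target. The sketch below uses a plain breadth-first search with invented node names; PathExplorer's actual mining relies on graph-based hints and simulation beyond this.

```python
# Illustrative pathway identification as graph search: find one shortest
# chain of biological-process nodes linking a source to a target.

from collections import deque

def find_pathway(graph, source, target):
    """Return one shortest chain from source to target, or None."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical process service models and their links:
graph = {
    "ligand_binding": ["receptor_activation"],
    "receptor_activation": ["kinase_cascade", "receptor_internalization"],
    "kinase_cascade": ["transcription"],
}
print(find_pathway(graph, "ligand_binding", "transcription"))
```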

  11. An EarthCube Roadmap for Cross-Domain Interoperability in the Geosciences: Governance Aspects

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Couch, A.; Richard, S. M.; Valentine, D. W.; Stocks, K.; Murphy, P.; Lehnert, K. A.

    2012-12-01

    The goal of cross-domain interoperability is to enable reuse of data and models outside the original context in which these data and models are collected and used and to facilitate analysis and modeling of physical processes that are not confined to disciplinary or jurisdictional boundaries. A new research initiative of the U.S. National Science Foundation, called EarthCube, is developing a roadmap to address challenges of interoperability in the earth sciences and create a blueprint for community-guided cyberinfrastructure accessible to a broad range of geoscience researchers and students. Infrastructure readiness for cross-domain interoperability encompasses the capabilities that need to be in place for such secondary or derivative-use of information to be both scientifically sound and technically feasible. In this initial assessment we consider the following four basic infrastructure components that need to be present to enable cross-domain interoperability in the geosciences: metadata catalogs (at the appropriate community defined granularity) that provide standard discovery services over datasets, data access services, models and other resources of the domain; vocabularies that support unambiguous interpretation of domain resources and metadata; services used to access data repositories and other resources including models, visualizations and workflows; and formal information models that define structure and semantics of the information returned on service requests. General standards for these components have been proposed; they form the backbone of large scale integration activities in the geosciences. By utilizing these standards, EarthCube research designs can take advantage of data discovery across disciplines using the commonality in key data characteristics related to shared models of spatial features, time measurements, and observations. 
Data can be discovered via federated catalogs and linked nomenclatures from neighboring domains, while standard data services can be used to transparently compile composite data products. Key questions addressed in this presentation are: (1) How to define and assess the readiness of existing domain information systems for cross-domain re-use? (2) How to determine EarthCube development priorities given a multitude of use cases that involve cross-domain data flows? and (3) How to involve a wider community of geoscientists in the development and curation of cross-domain resources and incorporate community feedback in the CI design? Answering these questions involves consideration of governance mechanisms for cross-domain interoperability: while individual domain information systems and projects have developed their own governance mechanisms, managing cross-domain CI resources and supporting cross-domain information re-use has not been a development focus at the scale of the geosciences. We present a cross-domain readiness model as a means of enabling effective communication among scientists, governance bodies, and information providers. We also present an initial readiness assessment and a cross-domain connectivity map for the geosciences, and outline processes for eliciting user requirements, setting priorities, and obtaining community consensus.

  12. The Localized Discovery and Recovery for Query Packet Losses in Wireless Sensor Networks with Distributed Detector Clusters

    PubMed Central

    Teng, Rui; Leibnitz, Kenji; Miura, Ryu

    2013-01-01

    An essential application of wireless sensor networks is to successfully respond to user queries. Query packet losses occur during query dissemination due to wireless communication problems such as interference, multipath fading, packet collisions, etc. The loss of query messages at sensor nodes results in those nodes failing to report the requested data. Hence, the reliable and successful dissemination of query messages to sensor nodes is a non-trivial problem. The target of this paper is to enable highly successful query delivery to sensor nodes through localized, energy-efficient discovery and recovery of query losses. We adopt local and collective cooperation among sensor nodes to increase the success rate of distributed discoveries and recoveries. To enable scalability in the discovery and recovery operations, we employ a distributed name resolution mechanism at each sensor node that allows sensor nodes to self-detect correlated queries and query losses, and then respond to the query losses locally and efficiently. We prove that the collective discovery of query losses has a high impact on the success of query dissemination and show that scalability can be achieved using the proposed approach. We further study the novel features of cooperation and competition in collective recovery at the PHY and MAC layers, and show that an appropriate number of detectors can achieve an optimal recovery success rate. We evaluate the proposed approach with both mathematical analyses and computer simulations. The proposed approach enables a high rate of successful delivery of query messages and results in short route lengths when recovering from query losses. It is scalable and operates in a fully distributed manner. PMID:23748172

  13. Collaborative Web-Enabled GeoAnalytics Applied to OECD Regional Data

    NASA Astrophysics Data System (ADS)

    Jern, Mikael

    Recent advances in web-enabled graphics technologies have the potential to make a dramatic impact on the development of collaborative geovisual analytics (GeoAnalytics). In this paper, tools are introduced that help establish progress initiatives at international and sub-national levels aimed at measuring, through statistical indicators, economic, social and environmental developments, and at engaging both statisticians and the public in such activities. Given the global dimension of such a task, the “dream” of building a repository of progress indicators, where experts and public users can use collaborative GeoAnalytics tools to compare situations for two or more countries, regions or local communities, could be accomplished. While the benefits of GeoAnalytics tools are many, it remains a challenge to adapt these dynamic visual tools to the Internet. For example, dynamic web-enabled animation enables statisticians to explore temporal, spatial and multivariate demographic data from multiple perspectives, discover interesting relationships, share their incremental discoveries with colleagues and finally communicate selected relevant knowledge to the public. These discoveries often emerge through the diverse backgrounds and experiences of domain experts and are precious in a creative analytic reasoning process. In this context, we introduce a demonstrator, “OECD eXplorer”, a customized tool for interactively analyzing and sharing gained insights and discoveries, based on a novel story mechanism that captures, re-uses and shares task-related explorative events.

  14. How to succeed in science: a concise guide for young biomedical scientists. Part II: making discoveries

    PubMed Central

    Yewdell, Jonathan W.

    2009-01-01

    Making discoveries is the most important part of being a scientist, and also the most fun. Young scientists need to develop the experimental and mental skill sets that enable them to make discoveries, including how to recognize and exploit serendipity when it strikes. Here, I provide practical advice to young scientists on choosing a research topic, designing, performing and interpreting experiments and, last but not least, on maintaining your sanity in the process. PMID:18401347

  15. Photoreactive Stapled BH3 Peptides to Dissect the BCL-2 Family Interactome

    PubMed Central

    Braun, Craig R.; Mintseris, Julian; Gavathiotis, Evripidis; Bird, Gregory H.; Gygi, Steven P.; Walensky, Loren D.

    2010-01-01

    Defining protein interactions forms the basis for discovery of biological pathways, disease mechanisms, and opportunities for therapeutic intervention. To harness the robust binding affinity and selectivity of structured peptides for interactome discovery, we engineered photoreactive stapled BH3 peptide helices that covalently capture their physiologic BCL-2 family targets. The crosslinking α-helices covalently trap both static and dynamic protein interactors, and enable rapid identification of interaction sites, providing a critical link between interactome discovery and targeted drug design. PMID:21168768

  16. How to succeed in science: a concise guide for young biomedical scientists. Part II: making discoveries.

    PubMed

    Yewdell, Jonathan W

    2008-06-01

    Making discoveries is the most important part of being a scientist, and also the most fun. Young scientists need to develop the experimental and mental skill sets that enable them to make discoveries, including how to recognize and exploit serendipity when it strikes. Here, I provide practical advice to young scientists on choosing a research topic, designing, performing and interpreting experiments and, last but not least, on maintaining your sanity in the process.

  17. Supporting Information Linking and Discovery Across Organizations Using the VIVO Semantic Web Software Suite

    NASA Astrophysics Data System (ADS)

    Mayernik, M. S.; Daniels, M. D.; Maull, K. E.; Khan, H.; Krafft, D. B.; Gross, M. B.; Rowan, L. R.

    2016-12-01

    Geosciences research is often conducted using distributed networks of researchers and resources. To better enable the discovery of research output from the scientists and resources within these organizations, UCAR, Cornell University, and UNAVCO are collaborating on the EarthCollab (http://earthcube.org/group/earthcollab) project, which seeks to leverage semantic technologies to manage and link scientific data. As part of this effort, we have been exploring how to leverage information distributed across multiple research organizations. EarthCollab is using the VIVO semantic software suite to look up and display Semantic Web information across our project partners. Our presentation will include a demonstration of linking between VIVO instances, discussing how to create linkages between entities in different VIVO instances where both entities describe the same person or resource. This discussion will explore how we designate the equivalence of these entities using "same as" assertions between the identifiers representing them, including URIs and ORCID iDs, and how we have extended the base VIVO architecture to support looking up which entities in separate VIVO instances may be equivalent and then displaying information from the external linked entities. We will also discuss how these extensions can support other linked data lookups and sources of information. This VIVO cross-linking mechanism helps bring information from multiple VIVO instances together and helps users navigate information spread out across multiple VIVO instances. Challenges and open questions for this approach relate to how to display the information obtained from an external VIVO instance, both to preserve the brands of the internal and external systems and to handle discrepancies between ontologies, content, and/or VIVO versions.
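The effect of "same as" assertions can be sketched with a small union-find structure: asserting equivalence between identifiers (URIs, ORCID iDs) merges them into one group, so a lookup on any identifier reaches its equivalents. The identifiers below are illustrative; real VIVO instances express such links as RDF (e.g., owl:sameAs).

```python
# Sketch of identity resolution across systems via "same as" assertions,
# implemented as a union-find over identifier strings.

class SameAs:
    def __init__(self):
        self.parent = {}

    def _find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def assert_same(self, a, b):
        """Declare that identifiers a and b denote the same entity."""
        self.parent[self._find(a)] = self._find(b)

    def equivalents(self, x):
        """All known identifiers for the entity that x denotes."""
        root = self._find(x)
        return {k for k in self.parent if self._find(k) == root}

links = SameAs()
links.assert_same("https://vivo.ucar.edu/individual/n1234",      # hypothetical URIs
                  "https://vivo.cornell.edu/individual/n5678")
links.assert_same("https://vivo.cornell.edu/individual/n5678",
                  "orcid:0000-0002-1825-0097")                   # hypothetical ORCID iD
print(len(links.equivalents("orcid:0000-0002-1825-0097")))  # 3
```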

  18. 10 CFR 590.305 - Informal discovery.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 4 2010-01-01 2010-01-01 false Informal discovery. 590.305 Section 590.305 Energy DEPARTMENT OF ENERGY (CONTINUED) NATURAL GAS (ECONOMIC REGULATORY ADMINISTRATION) ADMINISTRATIVE PROCEDURES WITH RESPECT TO THE IMPORT AND EXPORT OF NATURAL GAS Procedures § 590.305 Informal discovery. The...

  19. NIPTE: a multi-university partnership supporting academic drug development.

    PubMed

    Gurvich, Vadim J; Byrn, Stephen R

    2013-10-01

    The strategic goal of academic translational research is to accelerate translational science through the improvement and development of resources for moving discoveries across translational barriers through 'first in humans' studies. To achieve this goal, access to drug discovery resources and preclinical IND-enabling infrastructure is crucial. One potential approach of research institutions for coordinating preclinical development, based on a model from the National Institute for Pharmaceutical Technology and Education (NIPTE), can provide academic translational and medical centers with access to a wide variety of enabling infrastructure for developing small molecule clinical candidates in an efficient, cost-effective manner. Copyright © 2013 Elsevier Ltd. All rights reserved.

  20. Investigation of the pathogenesis of autoimmune diseases by iPS cells.

    PubMed

    Natsumoto, Bunki; Shoda, Hirofumi; Fujio, Keishi; Otsu, Makoto; Yamamoto, Kazuhiko

    2017-01-01

    Pluripotent stem cells have the ability to self-renew and can, in theory, differentiate into all cell types. Induced pluripotent stem (iPS) cells overcome the ethical problems associated with human embryonic stem (ES) cells and enable pathologic analysis of intractable diseases and drug discovery. In vitro disease models using disease-specific iPS cells enable repeated analyses of human cells without the influence of environmental factors. Even though autoimmune diseases are polygenic, autoimmune disease-specific iPS cells are thought to be a promising tool for analyzing the pathogenesis of these diseases and for future drug discovery.

  1. Automatic discovery of cell types and microcircuitry from neural connectomics

    PubMed Central

    Jonas, Eric; Kording, Konrad

    2015-01-01

    Neural connectomics has begun producing massive amounts of data, necessitating new analysis methods to discover the biological and computational structure. It has long been assumed that discovering neuron types and their relation to microcircuitry is crucial to understanding neural function. Here we developed a non-parametric Bayesian technique that identifies neuron types and microcircuitry patterns in connectomics data. It combines the information traditionally used by biologists in a principled and probabilistically coherent manner, including connectivity, cell body location, and the spatial distribution of synapses. We show that the approach recovers known neuron types in the retina and enables predictions of connectivity, better than simpler algorithms. It also can reveal interesting structure in the nervous system of Caenorhabditis elegans and an old man-made microprocessor. Our approach extracts structural meaning from connectomics, enabling new approaches of automatically deriving anatomical insights from these emerging datasets. DOI: http://dx.doi.org/10.7554/eLife.04250.001 PMID:25928186
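The modeling idea, neuron types plus per-type-pair connection probabilities, is essentially that of a stochastic block model. Below is a toy sketch of the prediction step only (not the paper's non-parametric Bayesian method, which infers the types themselves), with invented neurons and types.

```python
# Toy stochastic-block-model-style summary: given type assignments and
# observed directed edges, estimate P(connection | source type, target type),
# which can then be used to predict unobserved connections.

from itertools import product

def type_connection_probs(types, edges, neurons):
    """Estimate the connection probability for every ordered type pair."""
    probs = {}
    for ta, tb in product(set(types.values()), repeat=2):
        pairs = [(a, b) for a in neurons for b in neurons
                 if a != b and types[a] == ta and types[b] == tb]
        if pairs:
            probs[(ta, tb)] = sum((a, b) in edges for a, b in pairs) / len(pairs)
    return probs

neurons = ["n1", "n2", "n3", "n4"]
types = {"n1": "A", "n2": "A", "n3": "B", "n4": "B"}
edges = {("n1", "n3"), ("n1", "n4"), ("n2", "n3")}  # type-A cells project to type-B cells
probs = type_connection_probs(types, edges, neurons)
print(probs[("A", "B")])  # 0.75: 3 of the 4 possible A->B connections exist
```

In the paper the neuron types are latent and inferred jointly with these block probabilities (plus cell-body location and synapse distributions); here they are given, to show only what the fitted model summarizes.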

  2. Automatic discovery of cell types and microcircuitry from neural connectomics

    DOE PAGES

    Jonas, Eric; Kording, Konrad

    2015-04-30

    Neural connectomics has begun producing massive amounts of data, necessitating new analysis methods to discover the biological and computational structure. It has long been assumed that discovering neuron types and their relation to microcircuitry is crucial to understanding neural function. Here we developed a non-parametric Bayesian technique that identifies neuron types and microcircuitry patterns in connectomics data. It combines the information traditionally used by biologists in a principled and probabilistically coherent manner, including connectivity, cell body location, and the spatial distribution of synapses. We show that the approach recovers known neuron types in the retina and enables predictions of connectivity, better than simpler algorithms. It also can reveal interesting structure in the nervous system of Caenorhabditis elegans and an old man-made microprocessor. Our approach extracts structural meaning from connectomics, enabling new approaches of automatically deriving anatomical insights from these emerging datasets.

  3. Functional Properties of the Mitochondrial Carrier System.

    PubMed

    Taylor, Eric B

    2017-09-01

    The mitochondrial carrier system (MCS) transports small molecules between mitochondria and the cytoplasm. It is integral to the core mitochondrial function to regulate cellular chemistry by metabolism. The mammalian MCS comprises the transporters of the 53-member canonical SLC25A family and a lesser number of identified noncanonical transporters. The recent discovery and investigations of the mitochondrial pyruvate carrier (MPC) illustrate the diverse effects a single mitochondrial carrier may exert on cellular function. However, the transport selectivities of many carriers remain unknown, and most have not been functionally investigated in mammalian cells. The mechanisms coordinating their function as a unified system remain undefined. Increased accessibility to molecular genetic and metabolomic technologies now greatly enables investigation of the MCS. Continued investigation of the MCS may reveal how mitochondria encode complex regulatory information within chemical thermodynamic gradients. This understanding may enable precision modulation of cellular chemistry to counteract the dysmetabolism inherent in disease. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Automatic discovery of cell types and microcircuitry from neural connectomics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jonas, Eric; Kording, Konrad

    Neural connectomics has begun producing massive amounts of data, necessitating new analysis methods to discover the biological and computational structure. It has long been assumed that discovering neuron types and their relation to microcircuitry is crucial to understanding neural function. Here we developed a non-parametric Bayesian technique that identifies neuron types and microcircuitry patterns in connectomics data. It combines the information traditionally used by biologists in a principled and probabilistically coherent manner, including connectivity, cell body location, and the spatial distribution of synapses. We show that the approach recovers known neuron types in the retina and enables predictions of connectivity, better than simpler algorithms. It also can reveal interesting structure in the nervous system of Caenorhabditis elegans and an old man-made microprocessor. Our approach extracts structural meaning from connectomics, enabling new approaches of automatically deriving anatomical insights from these emerging datasets.

  5. Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS): Architecture

    PubMed Central

    Mandl, Kenneth D; Kohane, Isaac S; McFadden, Douglas; Weber, Griffin M; Natter, Marc; Mandel, Joshua; Schneeweiss, Sebastian; Weiler, Sarah; Klann, Jeffrey G; Bickel, Jonathan; Adams, William G; Ge, Yaorong; Zhou, Xiaobo; Perkins, James; Marsolo, Keith; Bernstam, Elmer; Showalter, John; Quarshie, Alexander; Ofili, Elizabeth; Hripcsak, George; Murphy, Shawn N

    2014-01-01

    We describe the architecture of the Patient Centered Outcomes Research Institute (PCORI) funded Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS, http://www.SCILHS.org) clinical data research network, which leverages the $48 billion dollar federal investment in health information technology (IT) to enable a queryable semantic data model across 10 health systems covering more than 8 million patients, plugging universally into the point of care, generating evidence and discovery, and thereby enabling clinician and patient participation in research during the patient encounter. Central to the success of SCILHS is development of innovative ‘apps’ to improve PCOR research methods and capacitate point of care functions such as consent, enrollment, randomization, and outreach for patient-reported outcomes. SCILHS adapts and extends an existing national research network formed on an advanced IT infrastructure built with open source, free, modular components. PMID:24821734

  6. Discovery Systems

    NASA Technical Reports Server (NTRS)

    Pell, Barney

    2003-01-01

    A viewgraph presentation on NASA's Discovery Systems Project is given. The topics of discussion include: 1) NASA's Computing Information and Communications Technology Program; 2) Discovery Systems Program; and 3) Ideas for Information Integration Using the Web.

  7. Working Group Proposed to Preserve Archival Records

    NASA Astrophysics Data System (ADS)

    Bartlett, Jennifer L.

    2013-01-01

    The AAS and AIP co-hosted a workshop in April 2012 with NSF support (AST-1110231) that recommends establishing a Working Group on Time Domain Astronomy (WGTDA) to encourage and advise on preserving historical observations in a form meaningful for future scientific analysis. Participants specifically considered archival observations that could describe how astronomical objects change over time. Modern techniques and increased storage capacity enable extracting additional information from older media. Although participants focused on photographic plates, they were also concerned with other formats. To prioritize preservation efforts, participants recommended considering the information density, the amount of previously published data, the format and associated materials, the current condition, and the expected deterioration rate. Because even the best digitization still produces an observation of an observation, the originals should be retained. For accessibility, participants recommended that observations and their metadata be available digitally and online. Standardized systems for classifying, organizing, and listing holdings should enable discovery of historical observations through the Virtual Astronomical Observatory. Participants recommended pilot projects that produce scientific results, demonstrate the dependence of some advances on heritage data, and open new avenues of exploration. Surveying a broad region of the sky with a long time base and high cadence should reveal new phenomena and improve statistics for rare events. Adequate financial support is essential. While their capacity to produce new science is the primary motivation for preserving astronomical records, their potential for historical research and citizen science allows targeting cultural institutions and other private sources. A committee was elected to prepare the WGTDA proposal.
The WGTDA executive committee should be composed of ~10 members representing modern surveys, heritage materials, data management, data standardization and integration, follow-up of time-domain discoveries, and virtual observatories. The Working Group on the Preservation of Astronomical Heritage Web page includes a full report.

  8. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.

    PubMed

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K

    2014-01-01

Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of a command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline, Integrated SNP Mining and Utilization (ISMU), has been developed by integrating several open source next generation sequencing (NGS) tools behind a graphical user interface, for SNP discovery and for their utilization in genotyping assays. The pipeline features pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of the genotypes analyzed, in addition to SNPs against the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and of errors, if any. The pipeline also provides a confidence score or polymorphism information content value, with flanking sequences, for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data, at high speed. It is particularly useful to the plant genetics and breeding community, enabling researchers without computational expertise to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process very large next generation sequencing datasets. 
It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone free software.
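
The polymorphism information content (PIC) value mentioned above has a standard definition that can be computed from allele frequencies. A minimal sketch in Python (illustrating the formula only, not the ISMU implementation, which is written in Java):

```python
def pic(allele_freqs):
    """Polymorphism information content for a marker.

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    where p_i are the allele frequencies at the locus.
    """
    n = len(allele_freqs)
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n) for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - correction

# A biallelic SNP with equal allele frequencies reaches the maximum
# biallelic PIC of 0.375; a monomorphic site is uninformative (0.0).
print(pic([0.5, 0.5]))  # 0.375
print(pic([1.0]))       # 0.0
```

Markers with higher PIC discriminate better between genotypes, which is why the pipeline reports it alongside flanking sequences for assay design.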

  9. Process-driven information management system at a biotech company: concept and implementation.

    PubMed

    Gobbi, Alberto; Funeriu, Sandra; Ioannou, John; Wang, Jinyi; Lee, Man-Ling; Palmer, Chris; Bamford, Bob; Hewitt, Robin

    2004-01-01

    While established pharmaceutical companies have chemical information systems in place to manage their compounds and the associated data, new startup companies need to implement these systems from scratch. Decisions made early in the design phase usually have long lasting effects on the expandability, maintenance effort, and costs associated with the information management system. Careful analysis of work and data flows, both inter- and intradepartmental, and identification of existing dependencies between activities are important. This knowledge is required to implement an information management system, which enables the research community to work efficiently by avoiding redundant registration and processing of data and by timely provision of the data whenever needed. This paper first presents the workflows existing at Anadys, then ARISE, the research information management system developed in-house at Anadys. ARISE was designed to support the preclinical drug discovery process and covers compound registration, analytical quality control, inventory management, high-throughput screening, lower throughput screening, and data reporting.

  10. Closed-Loop Multitarget Optimization for Discovery of New Emulsion Polymerization Recipes

    PubMed Central

    2015-01-01

Self-optimization of chemical reactions enables faster optimization of reaction conditions or discovery of molecules with required target properties. Here, the technology of self-optimization has been extended to the discovery of new process recipes for the manufacture of complex functional products. A new machine-learning algorithm, specifically designed for multiobjective target optimization with an explicit aim to minimize the number of “expensive” experiments, guides the discovery process. This “black-box” approach assumes no a priori knowledge of the chemical system and is hence particularly suited to rapid development of processes to manufacture specialist low-volume, high-value products. The approach was demonstrated in the discovery of process recipes for a semibatch emulsion copolymerization, targeting a specific particle size and full conversion. PMID:26435638

  11. Terminology for Neuroscience Data Discovery: Multi-tree Syntax and Investigator-Derived Semantics

    PubMed Central

    Goldberg, David H.; Grafstein, Bernice; Robert, Adrian; Gardner, Esther P.

    2009-01-01

The Neuroscience Information Framework (NIF), developed for the NIH Blueprint for Neuroscience Research and available at http://nif.nih.gov and http://neurogateway.org, is built upon a set of coordinated terminology components enabling data and web-resource description and selection. Core NIF terminologies use a straightforward syntax designed for ease of use and for navigation by familiar web interfaces, and are readily exportable to aid development of relational-model databases for neuroscience data sharing. Datasets, data analysis tools, web resources, and other entities are characterized by multiple descriptors, each addressing core concepts, including data type, acquisition technique, neuroanatomy, and cell class. Terms for each concept are organized in a tree structure, providing is-a and has-a relations. Broad general terms near each root span the category or concept and spawn more detailed entries for specificity. Related but distinct concepts (e.g., brain area and depth) are specified by separate trees, for easier navigation than would be required by a graph representation. Semantics enabling NIF data discovery were selected at one or more workshops by investigators expert in particular systems (vision, olfaction, behavioral neuroscience, neurodevelopment), brain areas (cerebellum, thalamus, hippocampus), preparations (molluscs, fly), diseases (neurodegenerative disease), or techniques (microscopy, computation and modeling, neurogenetics). Workshop-derived integrated term lists are available Open Source at http://brainml.org; a complete list of participants is at http://brainml.org/workshops. PMID:18958630
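
The multi-tree organization described above (one tree per concept, with is-a links from specific terms up to a broad root) can be sketched as a set of parent maps. The terms below are hypothetical placeholders, not actual NIF vocabulary:

```python
# One parent map per concept tree; related but distinct concepts live
# in separate trees. Terms here are illustrative, not NIF's own.
TREES = {
    "brain_area": {
        "hippocampus": "forebrain",
        "forebrain": "brain",
        "brain": None,          # broad root term spans the concept
    },
    "data_type": {
        "spike_train": "time_series",
        "time_series": "data",
        "data": None,
    },
}

def is_a_chain(concept, term):
    """Walk is-a links from a term up to the root of its concept tree."""
    parents = TREES[concept]
    chain = [term]
    while parents[chain[-1]] is not None:
        chain.append(parents[chain[-1]])
    return chain

print(is_a_chain("brain_area", "hippocampus"))
# ['hippocampus', 'forebrain', 'brain']
```

Keeping each concept in its own tree means a query interface can offer one simple drill-down per descriptor, rather than forcing users to navigate a single large graph.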

  12. Time-gated detection of protein-protein interactions with transcriptional readout

    PubMed Central

    Sanchez, Mateo I; Coukos, Robert; von Zastrow, Mark

    2017-01-01

    Transcriptional assays, such as yeast two-hybrid and TANGO, that convert transient protein-protein interactions (PPIs) into stable expression of transgenes are powerful tools for PPI discovery, screens, and analysis of cell populations. However, such assays often have high background and lose information about PPI dynamics. We have developed SPARK (Specific Protein Association tool giving transcriptional Readout with rapid Kinetics), in which proteolytic release of a membrane-tethered transcription factor (TF) requires both a PPI to deliver a protease proximal to its cleavage peptide and blue light to uncage the cleavage site. SPARK was used to detect 12 different PPIs in mammalian cells, with 5 min temporal resolution and signal ratios up to 37. By shifting the light window, we could reconstruct PPI time-courses. Combined with FACS, SPARK enabled 51 fold enrichment of PPI-positive over PPI-negative cells. Due to its high specificity and sensitivity, SPARK has the potential to advance PPI analysis and discovery. PMID:29189201

  13. Synthetic biology to access and expand nature’s chemical diversity

    PubMed Central

    Smanski, Michael J.; Zhou, Hui; Claesen, Jan; Shen, Ben; Fischbach, Michael; Voigt, Christopher A.

    2016-01-01

Bacterial genomes encode the biosynthetic potential to produce hundreds of thousands of complex molecules with diverse applications, from medicine to agriculture and materials. Economically accessing the potential encoded within sequenced genomes promises to reinvigorate waning drug discovery pipelines and provide novel routes to intricate chemicals. This is a tremendous undertaking, as the pathways often comprise dozens of genes spanning more than 100 kilobases of DNA and controlled by complex regulatory networks, and the most interesting molecules are made by non-model organisms. Advances in synthetic biology address these issues, including DNA construction technologies, genetic parts for precision expression control, synthetic regulatory circuits, computer-aided design, and multiplexed genome engineering. Collectively, these technologies are moving towards an era when chemicals can be accessed en masse based on sequence information alone. This will enable the harnessing of metagenomic data and massive strain banks for high-throughput molecular discovery and, ultimately, the ability to forward design pathways to complex chemicals not found in nature. PMID:26876034

  14. Essential Skills and Knowledge for Troubleshooting E-Resources Access Issues in a Web-Scale Discovery Environment

    ERIC Educational Resources Information Center

    Carter, Sunshine; Traill, Stacie

    2017-01-01

    Electronic resource access troubleshooting is familiar work in most libraries. The added complexity introduced when a library implements a web-scale discovery service, however, creates a strong need for well-organized, rigorous training to enable troubleshooting staff to provide the best service possible. This article outlines strategies, tools,…

  15. Deploying the ODISEES Ontology-guided Search in the NASA Earth Exchange (NEX)

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Gleason, J. L.; Cotnoir, M.; Spaulding, R.; Deardorff, G.

    2016-12-01

    Robust, semantically rich metadata can support data discovery and access, and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Despite this, for users not already familiar with the data in a given archive, most metadata is insufficient to help them find appropriate data for their projects. With this in mind, the Ontology-driven Interactive Search Environment (ODISEES) Data Discovery Portal was developed to enable users to find and download data variables that satisfy precise, parameter-level criteria, even when they know little or nothing about the naming conventions employed by data providers, or where suitable data might be archived. ODISEES relies on an Earth science ontology and metadata repository that provide an ontological framework for describing NASA data holdings with enough detail and fidelity to enable researchers to find, compare and evaluate individual data variables. Users can search for data by indicating the specific parameters desired, and comparing the results in a table that lets them quickly determine which data is most suitable. ODISEES and OLYMPUS, a tool for generating the semantically enhanced metadata used by ODISEES, are being developed in collaboration with the NASA Earth Exchange (NEX) project at the NASA Ames Research Center to prototype a robust data discovery and access service that could be made available to NEX users. NEX is a collaborative platform that provides researchers with access to TB to PB-scale datasets and analysis tools to operate on those data. By integrating ODISEES into the NEX Web Portal we hope to enable NEX users to locate datasets relevant to their research and download them directly into the NAS environment, where they can run applications using those datasets on the NAS supercomputers. 
This poster will describe the prototype integration of ODISEES into the NEX portal development environment, the mechanism implemented to use NASA APIs to retrieve data, and the approach to transfer data into the NAS supercomputing environment. Finally, we will describe the end-to-end demonstration of the capabilities implemented. This work was funded by the Advanced Information Systems Technology Program of NASA's Research Opportunities in Space and Earth Science.

  16. Bioinformatics in translational drug discovery.

    PubMed

    Wooller, Sarah K; Benstead-Hume, Graeme; Chen, Xiangrong; Ali, Yusuf; Pearl, Frances M G

    2017-08-31

    Bioinformatics approaches are becoming ever more essential in translational drug discovery, both in academia and within the pharmaceutical industry. Computational exploitation of the increasing volumes of data generated during all phases of drug discovery is enabling key challenges of the process to be addressed. Here, we highlight some of the areas in which bioinformatics resources and methods are being developed to support the drug discovery pipeline. These include the creation of large data warehouses, bioinformatics algorithms to analyse 'big data' and identify novel drug targets and/or biomarkers, programs to assess the tractability of targets, and prediction of repositioning opportunities that use licensed drugs to treat additional indications. © 2017 The Author(s).

  17. Human Exploration and Development of Space: Strategic Plan

    NASA Technical Reports Server (NTRS)

    Branscome, Darrell (Editor); Allen, Marc (Editor); Bihner, William (Editor); Craig, Mark (Editor); Crouch, Matthew (Editor); Crouch, Roger (Editor); Flaherty, Chris (Editor); Haynes, Norman (Editor); Horowitz, Steven (Editor)

    2000-01-01

    The five goals of the Human Exploration and Development of Space include: 1) Explore the Space Frontier; 2) Expand Scientific Knowledge; 3) Enable Humans to Live and Work Permanently in Space; 4) Enable the Commercial Development of Space; and 5) Share the Experience and Benefits of Discovery.

  18. An Ecological Framework of the Human Virome Provides Classification of Current Knowledge and Identifies Areas of Forthcoming Discovery

    PubMed Central

    Parker, Michael T.

    2016-01-01

    Recent advances in sequencing technologies have opened the door for the classification of the human virome. While taxonomic classification can be applied to the viruses identified in such studies, this gives no information as to the type of interaction the virus has with the host. As follow-up studies are performed to address these questions, the description of these virus-host interactions would be greatly enriched by applying a standard set of definitions that typify them. This paper describes a framework with which all members of the human virome can be classified based on principles of ecology. The scaffold not only enables categorization of the human virome, but can also inform research aimed at identifying novel virus-host interactions. PMID:27698618

  19. GeoViQua: quality-aware geospatial data discovery and evaluation

    NASA Astrophysics Data System (ADS)

    Bigagli, L.; Papeschi, F.; Mazzetti, P.; Nativi, S.

    2012-04-01

    GeoViQua (QUAlity aware VIsualization for the Global Earth Observation System of Systems) is a recently started FP7 project aiming to complement the Global Earth Observation System of Systems (GEOSS) with rigorous data quality specifications and quality-aware capabilities, in order to improve reliability in scientific studies and policy decision-making. GeoViQua's main scientific and technical objective is to enhance the GEOSS Common Infrastructure (GCI) by providing the user community with innovative quality-aware search and evaluation tools, which will be integrated in the GEO-Portal as well as made available to other end-user interfaces. To this end, GeoViQua will promote the extension of the current standard metadata for geographic information with accurate and expressive quality indicators, also contributing to the definition of a quality label (GEOLabel). GeoViQua's proposed solutions will be assessed in several pilot case studies covering the whole Earth Observation chain, from remote sensing acquisition to data processing, to applications in the main GEOSS Societal Benefit Areas. This work presents the preliminary results of GeoViQua Work Package 4 "Enhanced geo-search tools" (WP4), started in January 2012. Its major anticipated technical innovations are search and evaluation tools that communicate and exploit data quality information from the GCI. In particular, GeoViQua will investigate a graphical search interface featuring a coherent and meaningful aggregation of statistics and metadata summaries (e.g. in the form of tables, charts), thus enabling end users to leverage quality constraints for data discovery and evaluation. Preparatory work on WP4 requirements indicated that users need the "best" data for their purpose, implying a high degree of subjectivity in judgment. 
This suggests that the GeoViQua system should exploit a combination of provider-generated metadata (objective indicators such as summary statistics), system-generated metadata (contextual/tracking information such as provenance of data and metadata), and user-generated metadata (informal user comments, usage information, ratings, etc.). Moreover, metadata should include sufficiently complete access information to allow rich data visualization and propagation. The following main enabling components are currently identified within WP4: - Quality-aware access services, e.g. a quality-aware extension of the OGC Sensor Observation Service (SOS-Q) specification, to support quality constraints for sensor data publishing and access; - Quality-aware discovery services, namely a quality-aware extension of the OGC Catalog Service for the Web (CSW-Q), to cope with quality-constrained search; - Quality-augmentation broker (GeoViQua Broker), to support the linking and combination of the existing GCI metadata with GeoViQua- and user-generated metadata required to support the users in selecting the "best" data for their intended use. We are currently developing prototypes of the above quality-enabled geo-search components, which will be assessed in a sensor-based pilot case study in the coming months. In particular, the GeoViQua Broker will be integrated with the EuroGEOSS Broker to implement CSW-Q and federate (either via distribution or harvesting schemes) quality-aware data sources. GeoViQua will constitute a valuable test-bed for advancing the current best practices and standards in geospatial quality representation and exploitation. The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under Grant Agreement n° 265178.
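
The core idea of quality-constrained discovery (CSW-Q) can be sketched as filtering catalog records by user-supplied quality thresholds. The records and indicator names below are hypothetical, chosen only to illustrate the mechanism:

```python
# Hypothetical catalog records carrying quality indicators.
RECORDS = [
    {"id": "ds1", "rmse": 0.8, "completeness": 0.99, "user_rating": 4.5},
    {"id": "ds2", "rmse": 2.1, "completeness": 0.90, "user_rating": 3.0},
    {"id": "ds3", "rmse": 0.5, "completeness": 0.70, "user_rating": 4.8},
]

def quality_search(records, max_rmse=None, min_completeness=None):
    """Return ids of records satisfying the user's quality constraints.

    Constraints left as None are not applied, mimicking an optional
    quality filter layered on top of an ordinary catalog query.
    """
    hits = []
    for r in records:
        if max_rmse is not None and r["rmse"] > max_rmse:
            continue
        if min_completeness is not None and r["completeness"] < min_completeness:
            continue
        hits.append(r["id"])
    return hits

print(quality_search(RECORDS, max_rmse=1.0, min_completeness=0.95))
# ['ds1']
```

In the real system such constraints would travel as query parameters to a CSW-Q endpoint rather than being evaluated client-side, but the filtering semantics are the same.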

  20. DCO-VIVO: A Collaborative Data Platform for the Deep Carbon Science Communities

    NASA Astrophysics Data System (ADS)

    Wang, H.; Chen, Y.; West, P.; Erickson, J. S.; Ma, X.; Fox, P. A.

    2014-12-01

    Deep Carbon Observatory (DCO) is a decade-long scientific endeavor to understand carbon in the complex deep Earth system. Thousands of DCO scientists from institutions across the globe are organized into communities representing four domains of exploration: Extreme Physics and Chemistry, Reservoirs and Fluxes, Deep Energy, and Deep Life. Cross-community and cross-disciplinary collaboration is one of the most distinctive features of DCO's flexible research framework. VIVO is an open-source Semantic Web platform that facilitates cross-institutional researcher and research discovery. It includes a number of standard ontologies that interconnect people, organizations, publications, activities, locations, and other entities of research interest to enable browsing, searching, visualizing, and generating Linked Open (research) Data. The DCO-VIVO solution expedites research collaboration between DCO scientists and communities. Based on DCO's specific requirements, the DCO Data Science team developed a series of extensions to the VIVO platform, including extensions to the VIVO information model, extended query over the semantic information within VIVO, integration with other open source collaborative environments and data management systems, single sign-on support, assignment of unique Handles to DCO objects, and publication- and dataset-ingest extensions using existing publication systems. We present here the iterative development of these requirements, which are now in daily use by the DCO community of scientists for research reporting, information sharing, and resource discovery in support of research activities and program management.

  1. ESIP's Earth Science Knowledge Graph (ESKG) Testbed Project: An Automatic Approach to Building Interdisciplinary Earth Science Knowledge Graphs to Improve Data Discovery

    NASA Astrophysics Data System (ADS)

    McGibbney, L. J.; Jiang, Y.; Burgess, A. B.

    2017-12-01

    Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in the information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information from across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing, to name but a few; however, building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project, an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent knowledge present across several U.S. Federal Agencies, e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources (e.g. Web pages, Web services) via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project advances the ESIP collaboration areas of both Discovery and Semantic Technologies by putting graph information right at our fingertips in an interactive, modern manner and by reducing the effort of constructing ontologies. 
To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.
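
The knowledge-graph interaction described above (following entity linkages across agency holdings to surface related datasets) can be sketched as a breadth-first traversal over an adjacency list. The entities and dataset names below are hypothetical illustrations, not actual ESKG content:

```python
from collections import deque

# Hypothetical entity links extracted from agency web pages and services.
GRAPH = {
    "sea surface temperature": ["MODIS", "El Nino"],
    "MODIS": ["NASA PO.DAAC dataset A"],
    "El Nino": ["NOAA buoy dataset B"],
    "NASA PO.DAAC dataset A": [],
    "NOAA buoy dataset B": [],
}

def related_entities(start, max_hops=2):
    """Breadth-first traversal, following up to max_hops link steps."""
    seen, frontier = {start}, deque([(start, 0)])
    found = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue          # do not expand beyond the hop limit
        for nbr in GRAPH[node]:
            if nbr not in seen:
                seen.add(nbr)
                found.append(nbr)
                frontier.append((nbr, depth + 1))
    return found

print(related_entities("sea surface temperature"))
# ['MODIS', 'El Nino', 'NASA PO.DAAC dataset A', 'NOAA buoy dataset B']
```

A user query for one concept thus surfaces datasets from multiple agencies in a couple of hops, which is the cross-agency discovery benefit the abstract describes.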

  2. Application of an automated natural language processing (NLP) workflow to enable federated search of external biomedical content in drug discovery and development.

    PubMed

    McEntire, Robin; Szalkowski, Debbie; Butler, James; Kuo, Michelle S; Chang, Meiping; Chang, Man; Freeman, Darren; McQuay, Sarah; Patel, Jagruti; McGlashen, Michael; Cornell, Wendy D; Xu, Jinghai James

    2016-05-01

    External content sources such as MEDLINE(®), National Institutes of Health (NIH) grants and conference websites provide access to the latest breaking biomedical information, which can inform pharmaceutical and biotechnology company pipeline decisions. The value of these sites for industry, however, is limited by the use of the public internet, limited synonym support, the rarity of batch searching capability and the disconnected nature of the sites. Fortunately, many sites now offer their content for download, and we have developed an automated internal workflow that uses text mining and tailored ontologies for programmatic search and knowledge extraction. We believe such an efficient and secure approach provides a competitive advantage to companies needing access to the latest information for a range of use cases, and complements manually curated commercial sources. Copyright © 2016. Published by Elsevier Ltd.
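
The ontology-driven, batch-friendly search the workflow provides can be sketched as synonym-based query expansion. The ontology fragment below is a hypothetical illustration (though STI-571 and Gleevec are genuine synonyms for imatinib), not the system's actual vocabulary:

```python
# Hypothetical ontology fragment: preferred term -> known synonyms.
ONTOLOGY = {
    "acetaminophen": ["paracetamol", "APAP"],
    "imatinib": ["STI-571", "Gleevec"],
}

def expand_query(term):
    """Expand one search term with its ontology synonyms."""
    return [term] + ONTOLOGY.get(term, [])

def batch_expand(terms):
    """Expand many terms at once, the batch capability public sites lack."""
    return {t: expand_query(t) for t in terms}

print(batch_expand(["imatinib", "warfarin"]))
# {'imatinib': ['imatinib', 'STI-571', 'Gleevec'],
#  'warfarin': ['warfarin']}
```

Each expanded term list would then be issued against the downloaded content, so a single pipeline query covers every naming variant of a compound.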

  3. 49 CFR 511.31 - General provisions governing discovery.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... discovery. Discovery may commence at any time after filing of the answer. Unless otherwise provided in these... established in this subpart are applicable to the discovery of information among the parties to a proceeding. Parties seeking information from persons not parties may do so by subpoena in accordance with § 511.38. (b...

  4. From Visual Exploration to Storytelling and Back Again.

    PubMed

    Gratzl, S; Lex, A; Gehlenborg, N; Cosgrove, N; Streit, M

    2016-06-01

    The primary goal of visual data exploration tools is to enable the discovery of new insights. To justify and reproduce insights, the discovery process needs to be documented and communicated. A common approach to documenting and presenting findings is to capture visualizations as images or videos. Images, however, are insufficient for telling the story of a visual discovery, as they lack full provenance information and context. Videos are difficult to produce and edit, particularly due to the non-linear nature of the exploratory process. Most importantly, however, neither approach provides the opportunity to return to any point in the exploration in order to review the state of the visualization in detail or to conduct additional analyses. In this paper we present CLUE (Capture, Label, Understand, Explain), a model that tightly integrates data exploration and presentation of discoveries. Based on provenance data captured during the exploration process, users can extract key steps, add annotations, and author "Vistories", visual stories based on the history of the exploration. These Vistories can be shared for others to view, but also to retrace and extend the original analysis. We discuss how the CLUE approach can be integrated into visualization tools and provide a prototype implementation. Finally, we demonstrate the general applicability of the model in two usage scenarios: a Gapminder-inspired visualization to explore public health data and an example from molecular biology that illustrates how Vistories could be used in scientific journals (see Figure 1 for a visual abstract).
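
The core CLUE mechanism (capture every exploration state non-destructively, so any point can be revisited and annotated states can be assembled into a story) can be sketched in a few lines. The class and method names below are illustrative assumptions, not the paper's API:

```python
class ProvenanceLog:
    """Minimal sketch of CLUE-style capture: record every exploration
    state, allow revisiting any of them, and build a "Vistory" from the
    annotated key steps."""

    def __init__(self):
        self._states = []        # full history, in exploration order
        self._annotations = {}   # state index -> author's label

    def capture(self, state):
        self._states.append(state)
        return len(self._states) - 1   # handle for returning later

    def annotate(self, index, label):
        self._annotations[index] = label

    def revisit(self, index):
        return self._states[index]     # non-destructive: history is kept

    def story(self):
        """Only the annotated key steps, in order: the 'Vistory'."""
        return [(self._states[i], self._annotations[i])
                for i in sorted(self._annotations)]

log = ProvenanceLog()
i0 = log.capture({"filter": "none"})
i1 = log.capture({"filter": "year >= 2000"})
log.annotate(i1, "Restrict to recent data")
print(log.revisit(i0))   # earlier state recoverable: {'filter': 'none'}
print(log.story())
```

Because the full state list is retained, a reader of the story can jump back to any captured state and branch off a new analysis, which is what distinguishes this approach from static images or video.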

  5. From Visual Exploration to Storytelling and Back Again

    PubMed Central

    Gratzl, S.; Lex, A.; Gehlenborg, N.; Cosgrove, N.; Streit, M.

    2016-01-01

    The primary goal of visual data exploration tools is to enable the discovery of new insights. To justify and reproduce insights, the discovery process needs to be documented and communicated. A common approach to documenting and presenting findings is to capture visualizations as images or videos. Images, however, are insufficient for telling the story of a visual discovery, as they lack full provenance information and context. Videos are difficult to produce and edit, particularly due to the non-linear nature of the exploratory process. Most importantly, however, neither approach provides the opportunity to return to any point in the exploration in order to review the state of the visualization in detail or to conduct additional analyses. In this paper we present CLUE (Capture, Label, Understand, Explain), a model that tightly integrates data exploration and presentation of discoveries. Based on provenance data captured during the exploration process, users can extract key steps, add annotations, and author “Vistories”, visual stories based on the history of the exploration. These Vistories can be shared for others to view, but also to retrace and extend the original analysis. We discuss how the CLUE approach can be integrated into visualization tools and provide a prototype implementation. Finally, we demonstrate the general applicability of the model in two usage scenarios: a Gapminder-inspired visualization to explore public health data and an example from molecular biology that illustrates how Vistories could be used in scientific journals. (see Figure 1 for visual abstract) PMID:27942091

  6. Open data in drug discovery and development: lessons from malaria.

    PubMed

    Wells, Timothy N C; Willis, Paul; Burrows, Jeremy N; Hooft van Huijsduijnen, Rob

    2016-10-01

    There is a growing consensus that drug discovery thrives in an open environment. Here, we describe how the malaria community has embraced four levels of open data - open science, open innovation, open access and open source - to catalyse the development of new medicines, and consider principles that could enable open data approaches to be applied to other disease areas.

  7. Discovery Lab in the Chemistry Lecture Room: Design and Evaluation of Audio-Visual Constructivist Methodology of Teaching Descriptive Inorganic Chemistry.

    ERIC Educational Resources Information Center

    Young, Barbara N.; Hoffman, Lyubov

    Demonstration of chemical reactions is a tool used in the teaching of inorganic descriptive chemistry to enable students to understand the fundamental concepts of chemistry through the use of concrete examples. For maximum benefit, students need to learn through discovery to observe, interpret, hypothesize, and draw conclusions; however, chemical…

  8. Mapping the Stacks: Sustainability and User Experience of Animated Maps in Library Discovery Interfaces

    ERIC Educational Resources Information Center

    McMillin, Bill; Gibson, Sally; MacDonald, Jean

    2016-01-01

    Animated maps of the library stacks were integrated into the catalog interface at Pratt Institute and into the EBSCO Discovery Service interface at Illinois State University. The mapping feature was developed for optimal automation of the update process to enable a range of library personnel to update maps and call-number ranges. The development…

  9. KSC-06pd0316

    NASA Image and Video Library

    2006-02-18

    KENNEDY SPACE CENTER, FLA. - In NASA Kennedy Space Center's Orbiter Processing Facility bay 3, United Space Alliance shuttle technicians remove the hard cover from a window on Space Shuttle Discovery to enable STS-121 crew members to inspect the window from the cockpit. Launch of Space Shuttle Discovery on mission STS-121, the second return-to-flight mission, is scheduled no earlier than May.

  10. [From the discovery of antibiotics to emerging highly drug-resistant bacteria].

    PubMed

    Meunier, Olivier

    2015-01-01

    The discovery of antibiotics has enabled serious infections to be treated. However, bacteria resistant to several families of antibiotics and the emergence of new highly drug-resistant bacteria constitute a public health issue in France and across the world. Actions to prevent their transmission are being put in place. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  11. Genetics of coronary artery disease: discovery, biology and clinical translation

    PubMed Central

    Khera, Amit V.; Kathiresan, Sekar

    2018-01-01

    Coronary artery disease is the leading global cause of mortality. Long recognized to be heritable, recent advances have started to unravel the genetic architecture of the disease. Common variant association studies have linked about 60 genetic loci to coronary risk. Large-scale gene sequencing efforts and functional studies have facilitated a better understanding of causal risk factors, elucidated underlying biology and informed the development of new therapeutics. Moving forward, genetic testing could enable precision medicine approaches, by identifying subgroups of patients at increased risk of CAD or those with a specific driving pathophysiology in whom a therapeutic or preventive approach is most useful. PMID:28286336

  12. Two Models for Semi-Supervised Terrorist Group Detection

    NASA Astrophysics Data System (ADS)

    Ozgul, Fatih; Erdem, Zeki; Bowerman, Chris

    Since the discovery of the organizational structure of offender groups can lead investigators to terrorist cells or organized crime groups, detecting covert networks from crime data is important to crime investigation. Two models, GDM and OGDM, both based on another representation model, OGRM, were developed and tested on nine terrorist groups. GDM, which relies primarily on police arrest data and “caught together” information, and OGDM, which uses feature matching on year-wise offender components from arrest and demographic data, both performed well on terrorist groups, although OGDM produced high precision with low recall. OGDM uses a terror crime modus operandi ontology that enabled matching of similar crimes.

  13. Online Analysis Enhances Use of NASA Earth Science Data

    NASA Technical Reports Server (NTRS)

    Acker, James G.; Leptoukh, Gregory

    2007-01-01

    Giovanni, the Goddard Earth Sciences Data and Information Services Center (GES DISC) Interactive Online Visualization and Analysis Infrastructure, has provided researchers with advanced capabilities to perform data exploration and analysis with observational data from NASA Earth observation satellites. In the past 5-10 years, examining geophysical events and processes with remote-sensing data required a multistep process of data discovery, data acquisition, data management, and ultimately data analysis. Giovanni accelerates this process by enabling basic visualization and analysis directly on the World Wide Web. In the last two years, Giovanni has added new data acquisition functions and expanded analysis options to increase its usefulness to the Earth science research community.

  14. Cryptococcus: from environmental saprophyte to global pathogen

    PubMed Central

    May, Robin C.; Stone, Neil R.H.; Wiesner, Darin L.; Bicanic, Tihana; Nielsen, Kirsten

    2016-01-01

    Cryptococcosis is a globally distributed invasive fungal infection that is caused by species within the genus Cryptococcus and that presents substantial therapeutic challenges. Although natural human-to-human transmission has never been observed, recent work has identified multiple virulence mechanisms that enable cryptococci to infect, disseminate within, and ultimately kill their human host. In this Review, we describe these recent discoveries that illustrate the intricacy of host-pathogen interactions and reveal new details about the host immune responses that either help to protect against disease or increase host susceptibility. In addition, we discuss how this improved understanding of both the host and the pathogen informs potential new avenues for therapeutic development. PMID:26685750

  15. Cryptococcus: from environmental saprophyte to global pathogen.

    PubMed

    May, Robin C; Stone, Neil R H; Wiesner, Darin L; Bicanic, Tihana; Nielsen, Kirsten

    2016-02-01

    Cryptococcosis is a globally distributed invasive fungal infection that is caused by species within the genus Cryptococcus and that presents substantial therapeutic challenges. Although natural human-to-human transmission has never been observed, recent work has identified multiple virulence mechanisms that enable cryptococci to infect, disseminate within, and ultimately kill their human host. In this Review, we describe these recent discoveries that illustrate the intricacy of host-pathogen interactions and reveal new details about the host immune responses that either help to protect against disease or increase host susceptibility. In addition, we discuss how this improved understanding of both the host and the pathogen informs potential new avenues for therapeutic development.

  16. How Formal Methods Impels Discovery: A Short History of an Air Traffic Management Project

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Hagen, George; Maddalon, Jeffrey M.; Munoz, Cesar A.; Narkawicz, Anthony; Dowek, Gilles

    2010-01-01

    In this paper we describe a process of algorithmic discovery that was driven by our goal of achieving complete, mechanically verified algorithms that compute conflict prevention bands for use in en route air traffic management. The algorithms were originally defined in the PVS specification language and have subsequently been implemented in Java and C++. We do not present the proofs in this paper; instead, we describe the process of discovery and the key ideas that enabled the final formal proof of correctness.

  17. Ontology for Transforming Geo-Spatial Data for Discovery and Integration of Scientific Data

    NASA Astrophysics Data System (ADS)

    Nguyen, L.; Chee, T.; Minnis, P.

    2013-12-01

    Discovery of and access to geo-spatial scientific data across heterogeneous repositories and multi-discipline datasets can present challenges for scientists. We propose to build a workflow for transforming geo-spatial datasets into a semantic environment by using relationships to describe each resource with the OWL Web Ontology Language, RDF, and a proposed geo-spatial vocabulary. We will present methods for transforming traditional scientific datasets, the use of a semantic repository, and querying with SPARQL to integrate and access datasets. This unique repository will enable discovery of scientific data by geospatial bounds or other criteria.
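
    The SPARQL-style discovery query the abstract describes can be illustrated with a minimal in-memory triple store; the dataset names, predicates, and regions below are invented for illustration and are not drawn from the actual repository.

```python
# Minimal in-memory triple store sketching how a semantic repository
# answers discovery queries. All identifiers are hypothetical.
TRIPLES = {
    ("ds:CERES_SSF", "rdf:type", "geo:Dataset"),
    ("ds:CERES_SSF", "geo:observedProperty", "prop:CloudFraction"),
    ("ds:CERES_SSF", "geo:boundingRegion", "region:Tropics"),
    ("ds:MODIS_L2", "rdf:type", "geo:Dataset"),
    ("ds:MODIS_L2", "geo:observedProperty", "prop:AerosolDepth"),
    ("ds:MODIS_L2", "geo:boundingRegion", "region:Arctic"),
}

def match(pattern):
    """Match one (s, p, o) pattern; None acts as a wildcard,
    playing the role of a SPARQL variable."""
    s, p, o = pattern
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def datasets_in_region(region):
    """Analogue of: SELECT ?ds WHERE { ?ds geo:boundingRegion <region> }"""
    return sorted(t[0] for t in match((None, "geo:boundingRegion", region)))
```

    A real deployment would hold such triples in an RDF store and issue the query text through a SPARQL endpoint rather than a Python pattern matcher.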

  18. Lung tumor diagnosis and subtype discovery by gene expression profiling.

    PubMed

    Wang, Lu-yong; Tu, Zhuowen

    2006-01-01

    The optimal treatment of patients with complex diseases, such as cancers, depends on accurate diagnosis using a combination of clinical and histopathological data. In many scenarios, diagnosis becomes tremendously difficult because of the limitations of clinical presentation and histopathology. To accurately diagnose complex diseases, molecular classification based on gene or protein expression profiles is indispensable for modern medicine. Moreover, many heterogeneous diseases consist of various potential subtypes at the molecular level and differ remarkably in their response to therapies. It is therefore critical to accurately predict subgroups from disease gene expression profiles. More fundamental knowledge of the molecular basis and classification of disease could aid in the prediction of patient outcome, the informed selection of therapies, and the identification of novel molecular targets for therapy. In this paper, we propose a new disease diagnostic method, the probabilistic boosting tree (PB tree) method, applied to gene expression profiles of lung tumors. It enables accurate disease classification and subtype discovery. The method automatically constructs a tree in which each node combines a number of weak classifiers into a strong classifier, and subtype discovery is naturally embedded in the learning process. Our algorithm achieves excellent diagnostic performance and is capable of detecting disease subtypes based on gene expression profiles.
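
    The core mechanism the abstract describes, combining weak classifiers into a strong classifier at each tree node, can be sketched with plain AdaBoost over decision stumps on toy one-dimensional data. This is a generic illustration of boosting, not the authors' PB tree implementation, and the data are invented.

```python
import math

def stump_predict(threshold, polarity, x):
    # weak classifier: a one-feature decision stump
    return polarity if x >= threshold else -polarity

def train_adaboost(xs, ys, rounds=3):
    """Combine decision stumps into a strong classifier (AdaBoost)."""
    n = len(xs)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        # pick the stump with the lowest weighted error
        best = None
        for t in xs:
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if stump_predict(t, pol, x) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = max(err, 1e-10)  # avoid division by zero on perfect stumps
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, t, pol))
        # up-weight the samples this stump got wrong
        w = [wi * math.exp(-alpha * y * stump_predict(t, pol, x))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * stump_predict(t, pol, x) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1
```

    A PB tree then splits the samples by the node classifier's predicted probability and recurses, which is how subtype discovery falls out of the learning process.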

  19. Molecular dynamics-driven drug discovery: leaping forward with confidence.

    PubMed

    Ganesan, Aravindhan; Coote, Michelle L; Barakat, Khaled

    2017-02-01

    Given the significant time and financial costs of developing a commercial drug, it remains important to constantly reform the drug discovery pipeline with novel technologies that can narrow the candidates down to the most promising lead compounds for clinical testing. The past decade has witnessed tremendous growth in computational capabilities that enable in silico approaches to expedite drug discovery processes. Molecular dynamics (MD) has become a particularly important tool in drug design and discovery. From classical MD methods to more sophisticated hybrid classical/quantum mechanical (QM) approaches, MD simulations are now able to offer extraordinary insights into ligand-receptor interactions. In this review, we discuss how the applications of MD approaches are significantly transforming current drug discovery and development efforts. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. The Emory Chemical Biology Discovery Center: leveraging academic innovation to advance novel targets through HTS and beyond.

    PubMed

    Johns, Margaret A; Meyerkord-Belton, Cheryl L; Du, Yuhong; Fu, Haian

    2014-03-01

    The Emory Chemical Biology Discovery Center (ECBDC) aims to accelerate high throughput biology and translation of biomedical research discoveries into therapeutic targets and future medicines by providing high throughput research platforms to scientific collaborators worldwide. ECBDC research is focused at the interface of chemistry and biology, seeking to fundamentally advance understanding of disease-related biology with its HTS/HCS platforms and chemical tools, ultimately supporting drug discovery. Established HTS/HCS capabilities, university setting, and expertise in diverse assay formats, including protein-protein interaction interrogation, have enabled the ECBDC to contribute to national chemical biology efforts, empower translational research, and serve as a training ground for young scientists. With these resources, the ECBDC is poised to leverage academic innovation to advance biology and therapeutic discovery.

  1. Data Type Registry - Cross Road Between Catalogs, Data And Semantics

    NASA Astrophysics Data System (ADS)

    Richard, S. M.; Zaslavsky, I.; Bristol, S.

    2017-12-01

    As more data become accessible online, the opportunity increases to improve search for information within datasets and to automate some levels of data integration. A prerequisite for these advances is indexing the kinds of information present in datasets and providing machine-actionable descriptions of data structures. We are exploring approaches to enabling these capabilities in the EarthCube DigitalCrust and Data Discovery Hub Building Block projects, building on the Data Type Registry (DTR) working group activity in the Research Data Alliance. We are prototyping a registry implementation using the CNRI Cordra platform and API to enable 'deep registration' of datasets for building hydrogeologic models of the Earth's crust and for executing complex science scenarios for river chemistry and coral bleaching data. These use cases require the ability to respond to queries such as: what are the properties of entity X; what entities include property Y (or L, M, N…); and what DataTypes are about entity X and include property Y. Developing the registry to enable these capabilities requires more in-depth metadata than is commonly available, so we are also exploring approaches to analyzing simple tabular data to automate the recognition of entities and properties and to assist users with establishing semantic mappings to data integration vocabularies. This poster will review the current capabilities and implementation of a data type registry.
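
    The three registry queries quoted in the abstract can be sketched against a toy in-memory registry; the data type names, entities, and properties below are hypothetical, not drawn from the actual Cordra deployment.

```python
# Hypothetical registry: data type -> the entity it describes and its properties.
REGISTRY = {
    "WellLog":     {"entity": "Borehole",   "properties": {"depth", "temperature"}},
    "RiverChem":   {"entity": "RiverReach", "properties": {"nitrate", "discharge"}},
    "CoralSurvey": {"entity": "Reef",       "properties": {"bleaching_index", "temperature"}},
}

def properties_of(entity):
    """What are the properties of entity X?"""
    props = set()
    for record in REGISTRY.values():
        if record["entity"] == entity:
            props |= record["properties"]
    return props

def entities_with(prop):
    """What entities include property Y?"""
    return {r["entity"] for r in REGISTRY.values() if prop in r["properties"]}

def datatypes_about(entity, prop):
    """What DataTypes are about entity X and include property Y?"""
    return {name for name, r in REGISTRY.items()
            if r["entity"] == entity and prop in r["properties"]}
```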

  2. Modular, Antibody-free Time-Resolved LRET Kinase Assay Enabled by Quantum Dots and Tb3+-sensitizing Peptides

    NASA Astrophysics Data System (ADS)

    Cui, Wei; Parker, Laurie L.

    2016-07-01

    Fluorescent drug screening assays are essential for tyrosine kinase inhibitor discovery. Here we demonstrate a flexible, antibody-free TR-LRET kinase assay strategy that is enabled by the combination of streptavidin-coated quantum dot (QD) acceptors and biotinylated, Tb3+ sensitizing peptide donors. By exploiting the spectral features of Tb3+ and QD, and the high binding affinity of the streptavidin-biotin interaction, we achieved multiplexed detection of kinase activity in a modular fashion without requiring additional covalent labeling of each peptide substrate. This strategy is compatible with high-throughput screening, and should be adaptable to the rapidly changing workflows and targets involved in kinase inhibitor discovery.

  3. Entity Linking Leveraging the GeoDeepDive Cyberinfrastructure and Managing Uncertainty with Provenance.

    NASA Astrophysics Data System (ADS)

    Maio, R.; Arko, R. A.; Lehnert, K.; Ji, P.

    2017-12-01

    Unlocking the full, rich network of links between the scientific literature and the real-world entities to which data correspond - such as field expeditions (cruises) on oceanographic research vessels and physical samples collected during those expeditions - remains a challenge for the geoscience community. Doing so would enable data reuse and integration on a broad scale, making it possible to inspect the network and discover, for example, all rock samples reported in the scientific literature within 10 kilometers of an undersea volcano, along with associated geochemical analyses. Such a capability could facilitate new scientific discoveries. The GeoDeepDive project provides negotiated access to 4.2+ million documents from scientific publishers, enabling text and document mining via a public API and cyberinfrastructure. We mined this corpus using entity linking techniques, which are inherently uncertain, and recorded provenance information about each link. This opens the entity linking methodology to scrutiny and enables downstream applications to make informed assessments about the suitability of an entity link for consumption. A major challenge is how to model and disseminate the provenance information. We present results from entity linking between journal articles, research vessels and cruises, and physical samples from the Petrological Database (PetDB), and incorporate Linked Data resources such as cruises in the Rolling Deck to Repository (R2R) catalog where possible. Our work demonstrates the value and potential of the GeoDeepDive cyberinfrastructure in combination with Linked Data infrastructure provided by the EarthCube GeoLink project. We present a research workflow to capture provenance information that leverages the World Wide Web Consortium (W3C) recommendation PROV Ontology.
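
    The idea of attaching provenance to each inferred entity link, so that downstream consumers can judge a link's suitability, might be sketched as follows. The link records, method names, and confidence scores are invented, and the field names only loosely echo PROV terms rather than forming a full PROV-O serialization.

```python
def make_link(subject, target, method, confidence):
    """One inferred entity link plus the provenance of how it was derived."""
    return {"subject": subject, "target": target,
            "prov": {"wasGeneratedBy": method, "confidence": confidence}}

# Invented examples: articles linked to a cruise and to a physical sample.
LINKS = [
    make_link("doi:10.x/article1", "cruise:KN210-04",  "exact-id-match",   0.98),
    make_link("doi:10.x/article2", "sample:PETDB-123", "fuzzy-name-match", 0.55),
]

def links_above(links, threshold):
    # downstream applications use the recorded provenance to decide
    # which links are trustworthy enough to consume
    return [link for link in links if link["prov"]["confidence"] >= threshold]
```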

  4. Automated Quantification of Arbitrary Arm-Segment Structure in Spiral Galaxies

    NASA Astrophysics Data System (ADS)

    Davis, Darren Robert

    This thesis describes a system that, given approximately-centered images of spiral galaxies, produces quantitative descriptions of spiral galaxy structure without the need for per-image human input. This structure information consists of a list of spiral arm segments, each associated with a fitted logarithmic spiral arc and a pixel region. This list-of-arcs representation allows description of arbitrary spiral galaxy structure: the arms do not need to be symmetric, may have forks or bends, and, more generally, may be arranged in any manner with a consistent spiral-pattern center (non-merging galaxies have a sufficiently well-defined center). Such flexibility is important in order to accommodate the myriad structure variations observed in spiral galaxies. From the arcs produced from our method it is possible to calculate measures of spiral galaxy structure such as winding direction, winding tightness, arm counts, asymmetry, or other values of interest (including user-defined measures). In addition to providing information about the spiral arm "skeleton" of each galaxy, our method can enable analyses of brightness within individual spiral arms, since we provide the pixel regions associated with each spiral arm segment. For winding direction, arm tightness, and arm count, comparable information is available (to various extents) from previous efforts; to the extent that such information is available, we find strong correspondence with our output. We also characterize the changes to (and invariances in) our output as a function of modifications to important algorithm parameters. By enabling generation of extensive data about spiral galaxy structure from large-scale sky surveys, our method will enable new discoveries and tests regarding the nature of galaxies and the universe, and will facilitate subsequent work to automatically fit detailed brightness models of spiral galaxies.
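
    Fitting a logarithmic spiral arc r = r0·exp(b·θ) to arm-segment pixels reduces to linear least squares in (θ, ln r) space, since ln r = ln r0 + b·θ; the winding tightness discussed above is governed by b. This sketch assumes points spanning less than one turn (no θ unwrapping), whereas a real pipeline such as the one described must handle multi-turn arms.

```python
import math

def fit_log_spiral(points):
    """Least-squares fit of r = r0 * exp(b * theta) to (x, y) points,
    done linearly in (theta, ln r) space: ln r = ln r0 + b * theta.
    Assumes the arc spans less than one turn, so atan2 needs no unwrapping."""
    thetas = [math.atan2(y, x) for x, y in points]
    log_rs = [math.log(math.hypot(x, y)) for x, y in points]
    n = len(points)
    mean_t = sum(thetas) / n
    mean_r = sum(log_rs) / n
    # ordinary least-squares slope (b) and intercept (ln r0)
    b = (sum((t - mean_t) * (lr - mean_r) for t, lr in zip(thetas, log_rs))
         / sum((t - mean_t) ** 2 for t in thetas))
    r0 = math.exp(mean_r - b * mean_t)
    return r0, b
```

    The sign of b gives the winding direction, and the pitch angle follows as atan(b).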

  5. Preference vs. Authority: A Comparison of Student Searching in a Subject-Specific Indexing and Abstracting Database and a Customized Discovery Layer

    ERIC Educational Resources Information Center

    Dahlen, Sarah P. C.; Hanson, Kathlene

    2017-01-01

    Discovery layers provide a simplified interface for searching library resources. Libraries with limited finances make decisions about retaining indexing and abstracting databases when similar information is available in discovery layers. These decisions should be informed by student success at finding quality information as well as satisfaction…

  6. Discovery of Empirical Components by Information Theory

    DTIC Science & Technology

    2016-08-10

    This report (AFRL-AFOSR-VA-TR-2016-0289), "Discovery of Empirical Components by Information Theory" by Amit Singer, Trustees of Princeton University, covers work performed from 15 February 2013 to 14 February 2016. The methods developed draw not only from traditional linear-algebra-based numerical analysis and approximation theory, but also from information theory and graph theory.

  7. MyDas, an Extensible Java DAS Server

    PubMed Central

    Jimenez, Rafael C.; Quinn, Antony F.; Jenkinson, Andrew M.; Mulder, Nicola; Martin, Maria; Hunter, Sarah; Hermjakob, Henning

    2012-01-01

    A large number of diverse, complex, and distributed data resources are currently available in the Bioinformatics domain. The pace of discovery and the diversity of information mean that centralised reference databases like UniProt and Ensembl cannot integrate all potentially relevant information sources. From a user perspective, however, centralised access to all relevant information concerning a specific query is essential. The Distributed Annotation System (DAS) defines a communication protocol to exchange annotations on genomic and protein sequences; this standardisation enables clients to retrieve data from a myriad of sources, thus offering centralised access to end-users. We introduce MyDas, a web server that facilitates the publishing of biological annotations according to the DAS specification. It deals with the common functionality requirements of making data available, while also providing an extension mechanism in order to implement the specifics of data store interaction. MyDas allows the user to define where the required information is located along with its structure, and is then responsible for the communication protocol details. PMID:23028496

  8. MyDas, an extensible Java DAS server.

    PubMed

    Salazar, Gustavo A; García, Leyla J; Jones, Philip; Jimenez, Rafael C; Quinn, Antony F; Jenkinson, Andrew M; Mulder, Nicola; Martin, Maria; Hunter, Sarah; Hermjakob, Henning

    2012-01-01

    A large number of diverse, complex, and distributed data resources are currently available in the Bioinformatics domain. The pace of discovery and the diversity of information mean that centralised reference databases like UniProt and Ensembl cannot integrate all potentially relevant information sources. From a user perspective, however, centralised access to all relevant information concerning a specific query is essential. The Distributed Annotation System (DAS) defines a communication protocol to exchange annotations on genomic and protein sequences; this standardisation enables clients to retrieve data from a myriad of sources, thus offering centralised access to end-users. We introduce MyDas, a web server that facilitates the publishing of biological annotations according to the DAS specification. It deals with the common functionality requirements of making data available, while also providing an extension mechanism in order to implement the specifics of data store interaction. MyDas allows the user to define where the required information is located along with its structure, and is then responsible for the communication protocol details.
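
    A DAS client retrieves annotations with a plain HTTP GET (for example, .../das/<source>/features?segment=P12345:1,120) and parses the XML reply. The sketch below parses a sample document shaped after the DAS 1.53 DASGFF response format; the source name, segment, and feature values are invented.

```python
import xml.etree.ElementTree as ET

# Illustrative response shaped after the DAS "features" command output
# (DASGFF); the segment and feature values are invented.
SAMPLE = """<?xml version="1.0"?>
<DASGFF>
  <GFF version="1.0" href="http://example.org/das/mysource/features">
    <SEGMENT id="P12345" start="1" stop="120">
      <FEATURE id="domain_1" label="Kinase domain">
        <TYPE id="domain">domain</TYPE>
        <START>10</START>
        <END>95</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""

def parse_das_features(xml_text):
    """Flatten a DASGFF document into simple per-feature dicts."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            features.append({
                "segment": segment.get("id"),
                "id": feature.get("id"),
                "start": int(feature.findtext("START")),
                "end": int(feature.findtext("END")),
            })
    return features
```

    Because every DAS server speaks this same format, a client can merge features from many independent sources into one centralised view, which is the interoperability the abstract describes.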

  9. CUAHSI Data Services: Tools and Cyberinfrastructure for Water Data Discovery, Research and Collaboration

    NASA Astrophysics Data System (ADS)

    Seul, M.; Brazil, L.; Castronova, A. M.

    2017-12-01

    Enabling research surrounding interdisciplinary topics often requires a combination of finding, managing, and analyzing large data sets and models from multiple sources. This challenge has led the National Science Foundation to make strategic investments in developing community data tools and cyberinfrastructure focused on water data, a central need for many of these research topics. CUAHSI (the Consortium of Universities for the Advancement of Hydrologic Science, Inc.) is a non-profit organization funded by the National Science Foundation to aid students, researchers, and educators in using and managing data and models to support research and education in the water sciences. This presentation will focus on open-source CUAHSI-supported tools that enable enhanced online data discovery through advanced search capabilities and computational analysis run in virtual environments pre-designed for educators and scientists, so they can focus their efforts on data analysis rather than IT set-up.

  10. Interoperability and information discovery

    USGS Publications Warehouse

    Christian, E.

    2001-01-01

    In the context of information systems, there is interoperability when the distinctions between separate information systems are not a barrier to accomplishing a task that spans those systems. Interoperability so defined implies that there are commonalities among the systems involved and that one can exploit such commonalities to achieve interoperability. The challenge of a particular interoperability task is to identify relevant commonalities among the systems involved and to devise mechanisms that exploit those commonalities. The present paper focuses on the particular interoperability task of information discovery. The Global Information Locator Service (GILS) is described as a policy, standards, and technology framework for addressing interoperable information discovery on a global and long-term basis. While there are many mechanisms for people to discover and use all manner of data and information resources, GILS initiatives exploit certain key commonalities that seem to be sufficient to realize useful information discovery interoperability at a global, long-term scale. This paper describes ten of the specific commonalities that are key to GILS initiatives. It presents some of the practical implications for organizations in various roles: content provider, system engineer, intermediary, and searcher. The paper also provides examples of interoperable information discovery as deployed using GILS in four types of information communities: bibliographic, geographic, environmental, and government.

  11. Mining SNPs from EST sequences using filters and ensemble classifiers.

    PubMed

    Wang, J; Zou, Q; Guo, M Z

    2010-05-04

    Abundant single nucleotide polymorphisms (SNPs) provide the most complete information for genome-wide association studies. However, due to the bottleneck of manual discovery of putative SNPs and the inaccessibility of the original sequencing reads, it is essential to develop a more efficient and accurate computational method for automated SNP detection. We propose a novel computational method, implemented as SNPDigger, to rapidly find true SNPs in publicly available EST (expressed sequence tag) databases. EST sequences are clustered and aligned, and SNP candidates are then obtained according to a measure of redundant frequency. Several new informative biological features, such as structural neighbor profiles and the physical position of the SNP, were extracted from EST sequences, and the effectiveness of these features was demonstrated. An ensemble classifier, which employs a carefully selected feature set, was included to handle the imbalanced training data. The sensitivity and specificity of our method both exceeded 80% for human genetic data in cross-validation. Our method enables the detection of SNPs from a user's own EST dataset and can be used for species with no genome data. Our tests showed that this method can effectively guide SNP discovery in ESTs and will be useful for reducing the cost of biological analyses.
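
    The "redundant frequency" step can be sketched as a column scan over aligned EST reads: a position becomes a SNP candidate only when the minor allele appears in enough independent reads to be unlikely to be a sequencing error. The thresholds and sequences below are illustrative, not SNPDigger's actual parameters.

```python
from collections import Counter

def snp_candidates(alignment, min_minor=2, min_freq=0.2):
    """Scan columns of aligned EST reads and flag positions where the
    second-most-common base is frequent enough to suggest a true SNP
    rather than a one-off sequencing error."""
    candidates = []
    for pos in range(len(alignment[0])):
        column = [read[pos] for read in alignment if read[pos] != "-"]
        counts = Counter(column).most_common()
        if len(counts) >= 2:
            (major, _), (minor, minor_count) = counts[0], counts[1]
            if minor_count >= min_minor and minor_count / len(column) >= min_freq:
                candidates.append((pos, major, minor))
    return candidates
```

    In the published method, candidates surviving this filter are then scored by the ensemble classifier using the additional biological features.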

  12. Going beyond the NASA Earthdata website: Reaching out to new audiences via social media and webinars

    NASA Astrophysics Data System (ADS)

    Bagwell, R.; Wong, M. M.; Brennan, J.; Murphy, K. J.; Behnke, J.

    2014-12-01

    This poster will introduce and explore the various social media efforts and monthly webinar series recently established by the National Aeronautics and Space Administration (NASA) Earth Observing System Data and Information System (EOSDIS) project. EOSDIS is a key core capability in NASA's Earth Science Data Systems Program. It provides end-to-end capabilities for managing NASA's Earth science data from various sources - satellites, aircraft, field measurements, and various other programs. Some of the capabilities include twelve Distributed Active Archive Centers (DAACs), Science Computing Facilities (SCFs), a data discovery and service access client (Reverb), dataset directory (Global Change Master Directory - GCMD), near real-time data (Land Atmosphere Near real-time Capability for EOS - LANCE), Worldview (an imagery visualization interface), Global Imagery Browse Services, the Earthdata Code Collaborative, and a host of other discipline specific data discovery, data access, data subsetting and visualization tools and services. We have embarked on these efforts to reach out to new audiences and potential new users and to engage our diverse end user communities world-wide. One of the key objectives is to increase awareness of the breadth of Earth science data information, services, and tools that are publicly available while also highlighting how these data and technologies enable scientific research.

  13. Who uses NASA Earth Science Data? Connecting with Users through the Earthdata website and Social Media

    NASA Astrophysics Data System (ADS)

    Wong, M. M.; Brennan, J.; Bagwell, R.; Behnke, J.

    2015-12-01

    This poster will introduce and explore the various social media efforts, monthly webinar series and a redesigned website (https://earthdata.nasa.gov) established by the National Aeronautics and Space Administration's (NASA) Earth Observing System Data and Information System (EOSDIS) project. EOSDIS is a key core capability in NASA's Earth Science Data Systems Program. It provides end-to-end capabilities for managing NASA's Earth science data from various sources - satellites, aircraft, field measurements, and various other programs. It comprises twelve Distributed Active Archive Centers (DAACs), Science Computing Facilities (SCFs), data discovery and service access clients (Reverb and Earthdata Search), a dataset directory (Global Change Master Directory - GCMD), near real-time data (Land Atmosphere Near real-time Capability for EOS - LANCE), Worldview (an imagery visualization interface), Global Imagery Browse Services, the Earthdata Code Collaborative and a host of other discipline-specific data discovery, data access, data subsetting and visualization tools. We have embarked on these efforts to reach out to new audiences and potential new users and to engage our diverse end user communities world-wide. One of the key objectives is to increase awareness of the breadth of Earth science data information, services, and tools that are publicly available while also highlighting how these data and technologies enable scientific research.

  14. Herbarium data: Global biodiversity and societal botanical needs for novel research.

    PubMed

    James, Shelley A; Soltis, Pamela S; Belbin, Lee; Chapman, Arthur D; Nelson, Gil; Paul, Deborah L; Collins, Matthew

    2018-02-01

    Building on centuries of research based on herbarium specimens gathered through time and around the globe, a new era of discovery, synthesis, and prediction using digitized collections data has begun. This paper provides an overview of how aggregated, open access botanical and associated biological, environmental, and ecological data sets, from genes to the ecosystem, can be used to document the impacts of global change on communities, organisms, and society; predict future impacts; and help to drive the remediation of change. Advocacy for botanical collections and their expansion is needed, including ongoing digitization and online publishing. The addition of non-traditional digitized data fields, user annotation capability, and born-digital field data collection enables the rapid access of rich, digitally available data sets for research, education, informed decision-making, and other scholarly and creative activities. Researchers are receiving enormous benefits from data aggregators including the Global Biodiversity Information Facility (GBIF), Integrated Digitized Biocollections (iDigBio), the Atlas of Living Australia (ALA), and the Biodiversity Heritage Library (BHL), but effective collaboration around data infrastructures is needed when working with large and disparate data sets. Tools for data discovery, visualization, analysis, and skills training are increasingly important for inspiring novel research that improves the intrinsic value of physical and digital botanical collections.

  15. 78 FR 55775 - Pipeline Safety: Information Collection Activities

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-11

    ...'' is open to wide interpretation and suggests that ``awareness'' be replaced with ``discovery'', which... conditions characterize ``discovery'' as ``when an operator's representative has adequate information from... ``adequate'' and ``probable'' in the definition of ``discovery'' provides additional clarity. Part A18 of the...

  16. Enabling Service Discovery in a Federation of Systems: WS-Discovery Case Study

    DTIC Science & Technology

    2014-06-01

    found that Pastry [3] coupled with SCRIBE [4] provides everything we require from the overlay network: Pastry nodes form a decentralized, self…application-independent manner. Furthermore, Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault…recovery. Add SCRIBE to Pastry, and you get a generic, scalable and efficient group communication and event notification system providing

  17. A rhetorical approach to environmental information sharing

    NASA Astrophysics Data System (ADS)

    Woolf, Andrew

    2014-05-01

    'Faceted search' has recently been widely adopted as a powerful information discovery framework, enabling users to navigate a complex landscape of information by successive refinement along key dimensions. The compelling user experience that results has seen adoption of faceted search by online retailers, media outlets, and encyclopedic publishers. A key challenge with faceted browse is the choice of suitable search dimensions, or facets. Conventional facet analysis adopts principles of exclusivity and exhaustiveness, identifying facets by their relevance to the subject and their discriminating power (Spiteri, 1998). The rhetoricians of ancient Greece defined seven dimensions ('circumstances') of analytical enquiry: who, what, when, where, why, in what way, by what means. These provide a broadly applicable framework that may be seen in Ranganathan's classic ('PMEST') scheme for facet analysis. The utility of the 'Five Ws' is also manifest in their adoption in daily discourse and pedagogical frameworks. If we apply the 'Five Ws' to environmental information, we arrive at a model very close to the 'O&M' (ISO 19156) conceptual model for standardised exchange of environmental observations and measurements data:
    * who: metadata
    * what: observed property
    * when: time of observation
    * where: feature of interest
    * why: metadata
    * how: procedure
    Thus, we adopt an approach for distributed environmental information sharing which factors the architecture into components aligned with the 'Five Ws' (or O&M). We give an overview of this architecture and its information classes, components, interfaces, and standards. We also describe how it extends the classic SDI architecture to provide additional specific benefit for environmental information. Finally, we offer a perspective on the architecture as a 'brokering' overlay to environmental information resources, enabling an O&M-conformant view.
The approach to be presented is being adopted by the Australian Bureau of Meteorology as the basis for a National Environmental Information Infrastructure.
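The 'Five Ws' facet model can be made concrete with a small sketch. All names, records, and field choices below are invented for illustration; the paper itself defines the architecture at the level of O&M information classes, not code.

```python
from dataclasses import dataclass

# Sketch: map the 'Five Ws' facets onto an O&M-style observation record and
# support successive refinement, where each facet selection narrows the set.

@dataclass(frozen=True)
class Observation:
    who: str    # provider metadata
    what: str   # observed property
    when: str   # time of observation (ISO 8601 date, for simplicity)
    where: str  # feature of interest
    how: str    # procedure / sensor

def refine(observations, **facets):
    """Keep only observations matching every selected facet value."""
    return [o for o in observations
            if all(getattr(o, k) == v for k, v in facets.items())]

catalog = [
    Observation("BoM", "air_temperature", "2014-01-01", "Sydney", "thermometer"),
    Observation("BoM", "rainfall", "2014-01-01", "Sydney", "rain_gauge"),
    Observation("CSIRO", "air_temperature", "2014-01-02", "Darwin", "thermometer"),
]

# Successive refinement: first by observed property, then by feature of interest.
temps = refine(catalog, what="air_temperature")
sydney_temps = refine(temps, where="Sydney")
```

Each refinement step is independent of the others, which is what lets a faceted interface offer the remaining facet values as navigation choices at every stage.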

  18. Engaging Scientists in Meaningful E/PO: How the NASA SMD E/PO Community Addresses the Needs of the Higher Ed Community

    NASA Astrophysics Data System (ADS)

    Manning, James; Meinke, Bonnie K.; Schultz, Gregory R.; Smith, Denise A.; Lawton, Brandon L.; Gurton, Suzanne; NASA Astrophysics E/PO Community

    2015-01-01

    The NASA Astrophysics Science Education and Public Outreach Forum (SEPOF) coordinates the work of NASA Science Mission Directorate (SMD) Astrophysics E/PO projects and their teams to bring cutting-edge discoveries of NASA missions to the introductory astronomy college classroom. The Astrophysics Forum assists scientist and educator involvement in SMD E/PO (uniquely poised to foster collaboration between scientists with content expertise and educators with pedagogy expertise) and makes SMD E/PO resources and expertise accessible to the science and education communities. We present three new opportunities for college instructors to bring the latest NASA discoveries in astrophysics into their classrooms. To address the expressed needs of the higher education community, the Astrophysics Forum collaborated with the Astrophysics E/PO community, researchers, and Astronomy 101 instructors to place individual science discoveries and learning resources into context for higher education audiences. Among these resources are two Resource Guides on the topics of cosmology and exoplanets, each including a variety of accessible sources. The Astrophysics Forum also coordinates the development of the Astro 101 slide set series--5- to 7-slide presentations on new discoveries from NASA Astrophysics missions relevant to topics in introductory astronomy courses.
    These sets enable Astronomy 101 instructors to include new discoveries not yet in their textbooks in the broader context of the course: http://www.astrosociety.org/education/astronomy-resource-guides/. The Astrophysics Forum also coordinated the development of 12 monthly Universe Discovery Guides, each featuring a theme and a representative object well placed for viewing, with an accompanying interpretive story, strategies for conveying the topics, and supporting NASA-approved education activities and background information from a spectrum of NASA missions and programs: http://nightsky.jpl.nasa.gov/news-display.cfm?News_ID=611. These resources help enhance the Science, Technology, Engineering, and Mathematics (STEM) experiences of undergraduates.

  19. [Challenges and strategies of drug innovation].

    PubMed

    Guo, Zong-Ru; Zhao, Hong-Yu

    2013-07-01

    Drug research involves scientific discovery, technological invention and product development. This multi-dimensional effort embodies both high risk and high reward and is considered one of the most complicated human activities. Prior to the initiation of a program, an in-depth analysis of "what to do" and "how to do it" must be conducted. On the macro level, market prospects, capital required, risk assessment, necessary human resources, etc. need to be evaluated critically. For execution, drug candidates need to be optimized across multiple properties such as potency, selectivity, pharmacokinetics, safety and formulation, all within a finite amount of time and resources, to maximize the probability of success in clinical development. Drug discovery is enormously complicated, both in terms of technological innovation and of organizing capital and other resources. A deep understanding of the complexity of drug research and of our competitive edge is critical for success. Our unique government-enterprise-academia system represents a distinct advantage. As a new player, we have not heavily invested in any particular discovery paradigm, which allows us to select the optimal approach with little organizational burden. The virtual R&D model using CROs has gained momentum lately, and China is a global leader in the CRO market. Essentially all technological support for drug discovery can be found in China, which greatly enables domestic R&D efforts. The information technology revolution ensures the globalization of drug discovery knowledge, which has bridged much of the gap between China and the developed countries. The blockbuster model and the target-centric drug discovery paradigm have overlooked research in several important fields such as injectable drugs, orphan drugs, and the follow-up of high-quality therapeutic leads. Prejudice against covalent ligands, prodrugs and non-drug-like ligands can also be exploited to find novel medicines.
This article will discuss the current challenges and future opportunities for drug innovation in China.

  20. 77 FR 6820 - Proposed Information Collection; Comment Request: Creating Stewardship Through Biodiversity...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-09

    ... Information Collection; Comment Request: Creating Stewardship Through Biodiversity Discovery in National Parks... collection (IC) described below. This collection will survey participants of Biodiversity Discovery efforts... Biodiversity Discovery refers to a variety of efforts to discover living organisms through public involvement...

  1. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 29 Labor 9 2011-07-01 2011-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  2. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 29 Labor 9 2012-07-01 2012-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  3. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 29 Labor 9 2014-07-01 2014-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  4. 29 CFR 2700.56 - Discovery; general.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 29 Labor 9 2010-07-01 2010-07-01 false Discovery; general. 2700.56 Section 2700.56 Labor... Hearings § 2700.56 Discovery; general. (a) Discovery methods. Parties may obtain discovery by one or more... upon property for inspecting, copying, photographing, and gathering information. (b) Scope of discovery...

  5. PeptideNavigator: An interactive tool for exploring large and complex data sets generated during peptide-based drug design projects.

    PubMed

    Diller, Kyle I; Bayden, Alexander S; Audie, Joseph; Diller, David J

    2018-01-01

    There is growing interest in peptide-based drug design and discovery. Due to their relatively large size, polymeric nature, and chemical complexity, the design of peptide-based drugs presents an interesting "big data" challenge. Here, we describe an interactive computational environment, PeptideNavigator, for naturally exploring the tremendous amount of information generated during a peptide drug design project. The purpose of PeptideNavigator is the presentation of large and complex experimental and computational data sets, particularly 3D data, so as to enable multidisciplinary scientists to make optimal decisions during a peptide drug discovery project. PeptideNavigator provides users with numerous viewing options, such as scatter plots, sequence views, and sequence frequency diagrams. These views allow for the collective visualization and exploration of many peptides and their properties, ultimately enabling the user to focus on a small number of peptides of interest. To drill down into the details of individual peptides, PeptideNavigator provides users with a Ramachandran plot viewer and a fully featured 3D visualization tool. Each view is linked, allowing the user to seamlessly navigate from collective views of large peptide data sets to the details of individual peptides with promising property profiles. Two case studies, based on MHC-1A activating peptides and MDM2 scaffold design, are presented to demonstrate the utility of PeptideNavigator in the context of disparate peptide-design projects. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. XGR software for enhanced interpretation of genomic summary data, illustrated by application to immunological traits.

    PubMed

    Fang, Hai; Knezevic, Bogdan; Burnham, Katie L; Knight, Julian C

    2016-12-13

    Biological interpretation of genomic summary data such as those resulting from genome-wide association studies (GWAS) and expression quantitative trait loci (eQTL) studies is one of the major bottlenecks in medical genomics research, calling for efficient and integrative tools to resolve this problem. We introduce eXploring Genomic Relations (XGR), an open source tool designed for enhanced interpretation of genomic summary data enabling downstream knowledge discovery. Targeting users of varying computational skills, XGR utilises prior biological knowledge and relationships in a highly integrated but easily accessible way to make user-input genomic summary datasets more interpretable. We show how by incorporating ontology, annotation, and systems biology network-driven approaches, XGR generates more informative results than conventional analyses. We apply XGR to GWAS and eQTL summary data to explore the genomic landscape of the activated innate immune response and common immunological diseases. We provide genomic evidence for a disease taxonomy supporting the concept of a disease spectrum from autoimmune to autoinflammatory disorders. We also show how XGR can define SNP-modulated gene networks and pathways that are shared and distinct between diseases, how it achieves functional, phenotypic and epigenomic annotations of genes and variants, and how it enables exploring annotation-based relationships between genetic variants. XGR provides a single integrated solution to enhance interpretation of genomic summary data for downstream biological discovery. XGR is released as both an R package and a web-app, freely available at http://galahad.well.ox.ac.uk/XGR .
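Ontology and annotation enrichment analyses of the kind XGR performs are, in most such tools, built on over-representation statistics; the abstract does not give the formula, but the standard hypergeometric tail test can be sketched with the standard library alone (all gene counts below are invented):

```python
from math import comb

# Hypergeometric tail test for over-representation: the probability of
# observing >= k annotated genes in a sample of n genes drawn from a
# background of N genes, of which K carry the annotation.
def enrichment_pvalue(N, K, n, k):
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Invented example: 5 of 20 user-supplied genes carry a term that annotates
# only 50 of 1000 background genes (expected count = 1), so p is small.
p = enrichment_pvalue(N=1000, K=50, n=20, k=5)
```

The same test, repeated over every term of an ontology and corrected for multiple testing, is the core of a typical term-enrichment pipeline.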

  7. 100 years of Drosophila research and its impact on vertebrate neuroscience: a history lesson for the future.

    PubMed

    Bellen, Hugo J; Tong, Chao; Tsuda, Hiroshi

    2010-07-01

    Discoveries in fruit flies have greatly contributed to our understanding of neuroscience. The use of an unparalleled wealth of tools, many of which originated between 1910 and 1960, has enabled milestone discoveries in nervous system development and function. Such findings have triggered and guided many research efforts in vertebrate neuroscience. After 100 years, fruit flies continue to be the model system of choice for many neuroscientists. The combined use of powerful research tools will ensure that this model organism will continue to lead to key discoveries that will impact vertebrate neuroscience.

  8. 100 years of Drosophila research and its impact on vertebrate neuroscience: a history lesson for the future

    PubMed Central

    Bellen, Hugo J; Tong, Chao; Tsuda, Hiroshi

    2014-01-01

    Discoveries in fruit flies have greatly contributed to our understanding of neuroscience. The use of an unparalleled wealth of tools, many of which originated between 1910 and 1960, has enabled milestone discoveries in nervous system development and function. Such findings have triggered and guided many research efforts in vertebrate neuroscience. After 100 years, fruit flies continue to be the model system of choice for many neuroscientists. The combined use of powerful research tools will ensure that this model organism will continue to lead to key discoveries that will impact vertebrate neuroscience. PMID:20383202

  9. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gerber, Richard; Allcock, William; Beggio, Chris

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014, representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Computing Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  10. Selective transformations of complex molecules are enabled by aptameric protective groups

    NASA Astrophysics Data System (ADS)

    Bastian, Andreas A.; Marcozzi, Alessio; Herrmann, Andreas

    2012-10-01

    Emerging trends in drug discovery are prompting a renewed interest in natural products as a source of chemical diversity and lead structures. However, owing to the structural complexity of many natural compounds, the synthesis of derivatives is not easily realized. Here, we demonstrate a conceptually new approach using oligonucleotides as aptameric protective groups. These block several functionalities by non-covalent interactions in a complex molecule and enable the highly chemo- and regioselective derivatization (>99%) of natural antibiotics in a single synthetic step with excellent conversions of up to 83%. This technique reveals an important structure-activity relationship in neamine-based antibiotics and should help both to accelerate the discovery of new biologically active structures and to avoid potentially costly and cumbersome synthetic routes.

  11. Laboratory Astrophysics: Enabling Scientific Discovery and Understanding

    NASA Technical Reports Server (NTRS)

    Kirby, K.

    2006-01-01

    NASA's Science Strategic Roadmap for Universe Exploration lays out a series of science objectives on a grand scale and discusses the various missions, over a wide range of wavelengths, which will enable discovery. Astronomical spectroscopy is arguably the most powerful tool we have for exploring the Universe. Experimental and theoretical studies in Laboratory Astrophysics convert "hard-won data into scientific understanding". However, the development of instruments with increasingly high spectroscopic resolution demands atomic and molecular data of unprecedented accuracy and completeness. How to meet these needs, in a time of severe budgetary constraints, poses a significant challenge both to NASA, the astronomical observers and model-builders, and the laboratory astrophysics community. I will discuss these issues, together with some recent examples of productive astronomy/lab astro collaborations.

  12. Revisiting lab-on-a-chip technology for drug discovery.

    PubMed

    Neužil, Pavel; Giselbrecht, Stefan; Länge, Kerstin; Huang, Tony Jun; Manz, Andreas

    2012-08-01

    The field of microfluidics or lab-on-a-chip technology aims to improve and extend the possibilities of bioassays, cell biology and biomedical research based on the idea of miniaturization. Microfluidic systems allow more accurate modelling of physiological situations for both fundamental research and drug development, and enable systematic high-volume testing for various aspects of drug discovery. Microfluidic systems are in development that not only model biological environments but also physically mimic biological tissues and organs; such 'organs on a chip' could have an important role in expediting early stages of drug discovery and help reduce reliance on animal testing. This Review highlights the latest lab-on-a-chip technologies for drug discovery and discusses the potential for future developments in this field.

  13. 28 CFR 2.30 - False information or new criminal conduct: Discovery after release.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 28 Judicial Administration 1 2010-07-01 2010-07-01 false False information or new criminal conduct... Prisoners and Parolees § 2.30 False information or new criminal conduct: Discovery after release. If... willfully provided false information or misrepresented information deemed significant to his application for...

  14. Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS): architecture.

    PubMed

    Mandl, Kenneth D; Kohane, Isaac S; McFadden, Douglas; Weber, Griffin M; Natter, Marc; Mandel, Joshua; Schneeweiss, Sebastian; Weiler, Sarah; Klann, Jeffrey G; Bickel, Jonathan; Adams, William G; Ge, Yaorong; Zhou, Xiaobo; Perkins, James; Marsolo, Keith; Bernstam, Elmer; Showalter, John; Quarshie, Alexander; Ofili, Elizabeth; Hripcsak, George; Murphy, Shawn N

    2014-01-01

    We describe the architecture of the Patient Centered Outcomes Research Institute (PCORI) funded Scalable Collaborative Infrastructure for a Learning Healthcare System (SCILHS, http://www.SCILHS.org) clinical data research network, which leverages the $48 billion federal investment in health information technology (IT) to enable a queryable semantic data model across 10 health systems covering more than 8 million patients, plugging universally into the point of care, generating evidence and discovery, and thereby enabling clinician and patient participation in research during the patient encounter. Central to the success of SCILHS is the development of innovative 'apps' to improve PCOR research methods and capacitate point-of-care functions such as consent, enrollment, randomization, and outreach for patient-reported outcomes. SCILHS adapts and extends an existing national research network formed on an advanced IT infrastructure built with open source, free, modular components.

  15. The UCSC Genome Browser: What Every Molecular Biologist Should Know

    PubMed Central

    Mangan, Mary E.; Williams, Jennifer M.; Kuhn, Robert M.; Lathe, Warren C.

    2014-01-01

    Electronic data resources can enable molecular biologists to quickly get information from around the world that a decade ago would have been buried in papers scattered throughout the library. The ability to access, query, and display these data make benchwork much more efficient and drive new discoveries. Increasingly, mastery of software resources and corresponding data repositories is required to fully explore the volume of data generated in biomedical and agricultural research, because only small amounts of data are actually found in traditional publications. The UCSC Genome Browser provides a wealth of data and tools that advance understanding of genomic context for many species, enable detailed analysis of data, and provide the ability to interrogate regions of interest across disparate data sets from a wide variety of sources. Researchers can also supplement the standard display with their own data to query and share this with others. Effective use of these resources has become crucial to biological research today, and this unit describes some practical applications of the UCSC Genome Browser. PMID:24984850

  16. Peroxidase gene discovery from the horseradish transcriptome.

    PubMed

    Näätsaari, Laura; Krainer, Florian W; Schubert, Michael; Glieder, Anton; Thallinger, Gerhard G

    2014-03-24

    Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Due to lacking sequence information of natural isoenzymes and the low levels of HRP expression in heterologous hosts, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organisms. Identification and manual verification of the sequences of 28 HRP isoenzymes do not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes.

  17. Peroxidase gene discovery from the horseradish transcriptome

    PubMed Central

    2014-01-01

    Background Horseradish peroxidases (HRPs) from Armoracia rusticana have long been utilized as reporters in various diagnostic assays and histochemical stainings. Regardless of their increasing importance in the field of life sciences and suggested uses in medical applications, chemical synthesis and other industrial applications, the HRP isoenzymes, their substrate specificities and enzymatic properties are poorly characterized. Due to lacking sequence information of natural isoenzymes and the low levels of HRP expression in heterologous hosts, commercially available HRP is still extracted as a mixture of isoenzymes from the roots of A. rusticana. Results In this study, a normalized, size-selected A. rusticana transcriptome library was sequenced using 454 Titanium technology. The resulting reads were assembled into 14871 isotigs with an average length of 1133 bp. Sequence databases, ORF finding and ORF characterization were utilized to identify peroxidase genes from the 14871 isotigs generated by de novo assembly. The sequences were manually reviewed and verified with Sanger sequencing of PCR amplified genomic fragments, resulting in the discovery of 28 secretory peroxidases, 23 of them previously unknown. A total of 22 isoenzymes including allelic variants were successfully expressed in Pichia pastoris and showed peroxidase activity with at least one of the substrates tested, thus enabling their development into commercial pure isoenzymes. Conclusions This study demonstrates that transcriptome sequencing combined with sequence motif search is a powerful concept for the discovery and quick supply of new enzymes and isoenzymes from any plant or other eukaryotic organisms. 
Identification and manual verification of the sequences of 28 HRP isoenzymes do not only contribute a set of peroxidases for industrial, biological and biomedical applications, but also provide valuable information on the reliability of the approach in identifying and characterizing a large group of isoenzymes. PMID:24666710

  18. An Ontology for the Discovery of Time-series Data

    NASA Astrophysics Data System (ADS)

    Hooper, R. P.; Choi, Y.; Piasecki, M.; Zaslavsky, I.; Valentine, D. W.; Whitenack, T.

    2010-12-01

    An ontology was developed to enable a single-dimensional keyword search of time-series data collected at fixed points, such as stream gage records, water quality observations, or repeated biological measurements collected at fixed stations. The hierarchical levels were developed to allow navigation from general concepts to more specific ones, terminating in a leaf concept, which is the specific property measured. For example, the concept “nutrient” has child concepts of “nitrogen”, “phosphorus”, and “carbon”; each of these child concepts is then broken into the actual constituents measured (e.g., “total Kjeldahl nitrogen” or “nitrate + nitrite”). In this way, a non-expert user can find all nutrients containing nitrogen without knowing all the species measured, while an expert user can go immediately to the compound of interest. In addition, a property, such as dissolved silica, can appear as a leaf concept under both nutrients and weathering products. This flexibility allows users from various disciplines to find properties of interest. The ontology can be viewed at http://water.sdsc.edu/hiscentral/startree.aspx. Properties measured by various data publishers (e.g., universities and government agencies) are tagged with leaf concepts from this ontology. A discovery client, HydroDesktop, creates a search request by defining the spatial and temporal extent of interest and a keyword taken from the discovery ontology. Metadata returned from the catalog describes the time series that meet the specified search criteria. This ontology is considered an initial description of the physical, chemical and biological properties measured in water and suspended sediment. Future plans call for creating a moderated forum for the scientific community to add to and modify this ontology. Further information on the Hydrologic Information Systems project, of which this is a part, is available at http://his.cuahsi.org.
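The navigation described above, from a general concept such as “nutrient” down to the leaf concepts actually measured, amounts to a walk over a concept tree. A toy sketch follows, using only the concepts named in the abstract; the real discovery ontology is far larger:

```python
# Toy discovery ontology: nested dicts, where an empty dict marks a leaf
# concept (a specific measured property that tags published time series).
ontology = {
    "nutrient": {
        "nitrogen": {"total Kjeldahl nitrogen": {}, "nitrate + nitrite": {}},
        "phosphorus": {"total phosphorus": {}},
        "carbon": {"dissolved organic carbon": {}},
    },
}

def leaf_concepts(node):
    """Collect all leaf concepts (measured properties) under a node."""
    leaves = []
    for name, child in node.items():
        leaves.extend([name] if not child else leaf_concepts(child))
    return leaves

# A non-expert can search under "nitrogen" without knowing every species
# measured; an expert can address a leaf concept directly.
nitrogen_leaves = leaf_concepts(ontology["nutrient"]["nitrogen"])
```

Because a catalog search expands a selected concept to all of its leaves, tagging data with leaf concepts is enough to make every intermediate level of the hierarchy searchable.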

  19. BioTextQuest(+): a knowledge integration platform for literature mining and concept discovery.

    PubMed

    Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Pafilis, Evangelos; Theodosiou, Theodosios; Schneider, Reinhard; Satagopam, Venkata P; Ouzounis, Christos A; Eliopoulos, Aristides G; Promponas, Vasilis J; Iliopoulos, Ioannis

    2014-11-15

    The iterative process of finding relevant information in biomedical literature and performing bioinformatics analyses might result in an endless loop for an inexperienced user, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed(®) and related biological databases. Herein, we describe BioTextQuest(+), a web-based interactive knowledge exploration platform with significant advances to its predecessor (BioTextQuest), aiming to bridge processes such as bioentity recognition, functional annotation, document clustering and data integration towards literature mining and concept discovery. BioTextQuest(+) enables PubMed and OMIM querying, retrieval of abstracts related to a targeted request and optimal detection of genes, proteins, molecular functions, pathways and biological processes within the retrieved documents. The front-end interface facilitates the browsing of document clustering per subject, the analysis of term co-occurrence, the generation of tag clouds containing highly represented terms per cluster and at-a-glance popup windows with information about relevant genes and proteins. Moreover, to support experimental research, BioTextQuest(+) addresses integration of its primary functionality with biological repositories and software tools able to deliver further bioinformatics services. The Google-like interface extends beyond simple use by offering a range of advanced parameterization for expert users. We demonstrate the functionality of BioTextQuest(+) through several exemplary research scenarios including author disambiguation, functional term enrichment, knowledge acquisition and concept discovery linking major human diseases, such as obesity and ageing. The service is accessible at http://bioinformatics.med.uoc.gr/biotextquest. g.pavlopoulos@gmail.com or georgios.pavlopoulos@esat.kuleuven.be Supplementary data are available at Bioinformatics online. © The Author 2014. 
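The term co-occurrence analysis that BioTextQuest(+) exposes in its front end can be illustrated with a minimal standard-library sketch; the documents and bioentity terms below are invented for illustration:

```python
from collections import Counter
from itertools import combinations

# Each "document" is the set of recognised bioentity terms found in one
# retrieved abstract (invented examples).
documents = [
    {"obesity", "leptin", "ageing"},
    {"obesity", "leptin"},
    {"ageing", "telomere"},
]

# Count how often each pair of terms appears in the same document; sorting
# the terms gives each pair a canonical order so (a, b) == (b, a).
cooccurrence = Counter()
for terms in documents:
    for pair in combinations(sorted(terms), 2):
        cooccurrence[pair] += 1

top_pair, top_count = cooccurrence.most_common(1)[0]
```

Pairs with high counts relative to their individual term frequencies are the candidate associations a concept-discovery interface would surface.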

  20. Autonomy enables new science missions

    NASA Astrophysics Data System (ADS)

    Doyle, Richard J.; Gor, Victoria; Man, Guy K.; Stolorz, Paul E.; Chapman, Clark; Merline, William J.; Stern, Alan

    1997-01-01

    The challenge of space flight in NASA's future is to enable smaller, more frequent and more intensive space exploration at much lower total cost without substantially decreasing mission reliability, capability, or the scientific return on investment. The most effective way to achieve this goal is to build intelligent capabilities into the spacecraft themselves. Our technological vision for meeting the challenge of returning quality science through limited communication bandwidth will actually put scientists in a more direct link with the spacecraft than they have enjoyed to date. Technologies such as pattern recognition and machine learning can place a part of the scientist's awareness onboard the spacecraft to prioritize downlink or to autonomously trigger time-critical follow-up observations (particularly important in flyby missions) without ground interaction. Onboard knowledge discovery methods can be used to include candidate discoveries in each downlink for scientists' scrutiny. Such capabilities will allow scientists to quickly reprioritize missions in a much more intimate and efficient manner than is possible today. Ultimately, new classes of exploration missions will be enabled.

  1. Microfluidic-based mini-metagenomics enables discovery of novel microbial lineages from complex environmental samples.

    PubMed

    Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R

    2017-07-05

    Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell.

  2. Web-based Tool Suite for Plasmasphere Information Discovery

    NASA Astrophysics Data System (ADS)

    Newman, T. S.; Wang, C.; Gallagher, D. L.

    2005-12-01

    A suite of tools that enable discovery of terrestrial plasmasphere characteristics from NASA IMAGE Extreme Ultra Violet (EUV) images is described. The tool suite is web-accessible, allowing easy remote access without the need for any software installation on the user's computer. The features supported by the tool include reconstruction of the plasmasphere plasma density distribution from a short sequence of EUV images, semi-automated selection of the plasmapause boundary in an EUV image, and mapping of the selected boundary to the geomagnetic equatorial plane. EUV image upload and result download is also supported. The tool suite's plasmapause mapping feature is achieved via the Roelof and Skinner (2000) Edge Algorithm. The plasma density reconstruction is achieved through a tomographic technique that exploits physical constraints to allow for a moderate resolution result. The tool suite's software architecture uses Java Server Pages (JSP) and Java Applets on the front side for user-software interaction and Java Servlets on the server side for task execution. The compute-intensive components of the tool suite are implemented in C++ and invoked by the server via Java Native Interface (JNI).

  3. The development of health care data warehouses to support data mining.

    PubMed

    Lyman, Jason A; Scully, Kenneth; Harrison, James H

    2008-03-01

    Clinical data warehouses offer tremendous benefits as a foundation for data mining. By serving as a source for comprehensive clinical and demographic information on large patient populations, they streamline knowledge discovery efforts by providing standard and efficient mechanisms to replace time-consuming and expensive original data collection, organization, and processing. Building effective data warehouses requires knowledge of and attention to key issues in database design, data acquisition and processing, and data access and security. In this article, the authors provide an operational and technical definition of data warehouses, present examples of data mining projects enabled by existing data warehouses, and describe key issues and challenges related to warehouse development and implementation.
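    The warehouse pattern the authors describe can be sketched in a few lines: once clinical data are consolidated into a single queryable store, a data-mining question becomes one query rather than a fresh data-collection effort. This is a minimal illustration with hypothetical table and column names, not the article's actual schema.

```python
import sqlite3

# Toy in-memory "warehouse" of lab results (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labs (patient_id TEXT, test TEXT, value REAL)")
conn.executemany(
    "INSERT INTO labs VALUES (?, ?, ?)",
    [("p1", "glucose", 110.0), ("p2", "glucose", 145.0), ("p1", "hba1c", 5.9)],
)

# Cohort-level aggregation typical of mining against a warehouse:
row = conn.execute(
    "SELECT COUNT(*), AVG(value) FROM labs WHERE test = 'glucose'"
).fetchone()
print(row)  # (2, 127.5)
```

A production warehouse would of course add the design, acquisition, and security layers the authors discuss, but the query-side economy is the same.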

  4. Finding the Lost City

    NASA Technical Reports Server (NTRS)

    1993-01-01

    Nicholas Clapp, a filmmaker and archeology enthusiast, had accumulated extensive information concerning Ubar, the fabled lost city of ancient Arabia. When he was unable to identify its exact location, however, he turned to the Jet Propulsion Laboratory (JPL) for assistance in applying orbital remote sensing techniques. JPL scientists searched NASA's shuttle imaging radar, as well as Landsat and SPOT images, and discovered ancient caravan tracks. This enabled them to prepare a map of the trails, which converged at a place known as Ash Shisr. An expedition was formed, which found structures and artifacts from a city that predates previously known civilization in the area by a thousand years. Although it will take time to validate the city as Ubar, the discovery is a monumental archeological triumph.

  5. Recommendations for adaptation and validation of commercial kits for biomarker quantification in drug development.

    PubMed

    Khan, Masood U; Bowsher, Ronald R; Cameron, Mark; Devanarayan, Viswanath; Keller, Steve; King, Lindsay; Lee, Jean; Morimoto, Alyssa; Rhyne, Paul; Stephen, Laurie; Wu, Yuling; Wyant, Timothy; Lachno, D Richard

    2015-01-01

    Increasingly, commercial immunoassay kits are used to support drug discovery and development. Longitudinally consistent kit performance is crucial, but the degree to which kits and reagents are characterized by manufacturers is not standardized, nor are the approaches by users to adapt them and evaluate their performance through validation prior to use. These factors can negatively impact data quality. This paper offers a systematic approach to assessment, method adaptation and validation of commercial immunoassay kits for quantification of biomarkers in drug development, expanding upon previous publications and guidance. These recommendations aim to standardize and harmonize user practices, contributing to reliable biomarker data from commercial immunoassays, thus enabling properly informed decisions during drug development.

  6. Structural and Computational Biology in the Design of Immunogenic Vaccine Antigens

    PubMed Central

    Liljeroos, Lassi; Malito, Enrico; Ferlenghi, Ilaria; Bottomley, Matthew James

    2015-01-01

    Vaccination is historically one of the most important medical interventions for the prevention of infectious disease. Previously, vaccines were typically made of rather crude mixtures of inactivated or attenuated causative agents. However, over the last 10–20 years, several important technological and computational advances have enabled major progress in the discovery and design of potently immunogenic recombinant protein vaccine antigens. Here we discuss three key breakthrough approaches that have potentiated structural and computational vaccine design. Firstly, genomic sciences gave birth to the field of reverse vaccinology, which has enabled the rapid computational identification of potential vaccine antigens. Secondly, major advances in structural biology, experimental epitope mapping, and computational epitope prediction have yielded molecular insights into the immunogenic determinants defining protective antigens, enabling their rational optimization. Thirdly, and most recently, computational approaches have been used to convert this wealth of structural and immunological information into the design of improved vaccine antigens. This review aims to illustrate the growing power of combining sequencing, structural and computational approaches, and we discuss how this may drive the design of novel immunogens suitable for future vaccines urgently needed to increase the global prevention of infectious disease. PMID:26526043

  7. The IBD interactome: an integrated view of aetiology, pathogenesis and therapy.

    PubMed

    de Souza, Heitor S P; Fiocchi, Claudio; Iliopoulos, Dimitrios

    2017-12-01

    Crohn's disease and ulcerative colitis are prototypical complex diseases characterized by chronic and heterogeneous manifestations, induced by interacting environmental, genomic, microbial and immunological factors. These interactions result in an overwhelming complexity that cannot be tackled by studying the totality of each pathological component (an '-ome') in isolation without consideration of the interaction among all relevant -omes that yield an overall 'network effect'. The outcome of this effect is the 'IBD interactome', defined as a disease network in which dysregulation of individual -omes causes intestinal inflammation mediated by dysfunctional molecular modules. To define the IBD interactome, new concepts and tools are needed to implement a systems approach; an unbiased data-driven integration strategy that reveals key players of the system, pinpoints the central drivers of inflammation and enables development of targeted therapies. Powerful bioinformatics tools able to query and integrate multiple -omes are available, enabling the integration of genomic, epigenomic, transcriptomic, proteomic, metabolomic and microbiome information to build a comprehensive molecular map of IBD. This approach will enable identification of IBD molecular subtypes, correlations with clinical phenotypes and elucidation of the central hubs of the IBD interactome that will aid discovery of compounds that can specifically target the hubs that control the disease.

  8. Direct UV/Optical Imaging of Stellar Surfaces: The Stellar Imager (SI) Vision Mission

    NASA Technical Reports Server (NTRS)

    Carpenter, Kenneth G.; Lyon, Richard G.; Schrijver, Carolus; Karovska, Margarita; Mozurkewich, David

    2007-01-01

    The Stellar Imager (SI) is a UV/optical, space-based interferometer designed to enable 0.1 milli-arcsecond (mas) spectral imaging of stellar surfaces and, via asteroseismology, of stellar interiors, as well as of the Universe in general. SI's science focuses on the role of magnetism in the Universe, particularly on magnetic activity on the surfaces of stars like the Sun. SI's prime goal is to enable long-term forecasting of solar activity and the space weather that it drives, in support of the Living with a Star program in the Exploration Era. SI will also revolutionize our understanding of the formation of planetary systems, of the habitability and climatology of distant planets, and of many magneto-hydrodynamically controlled processes in the Universe. SI is a "Flagship and Landmark Discovery Mission" in the 2005 Sun Solar System Connection (SSSC) Roadmap and a candidate for a "Pathways to Life Observatory" in the Exploration of the Universe Division (EUD) Roadmap. We discuss herein the science goals of the SI Mission, a mission architecture that could meet those goals, and the technologies needed to enable this mission. Additional information on SI can be found at: http://hires.gsfc.nasa.gov/si/.

  9. The role of 3-D interactive visualization in blind surveys of H I in galaxies

    NASA Astrophysics Data System (ADS)

    Punzo, D.; van der Hulst, J. M.; Roerdink, J. B. T. M.; Oosterloo, T. A.; Ramatsoku, M.; Verheijen, M. A. W.

    2015-09-01

    Upcoming H I surveys will deliver large datasets, and automated processing using the full 3-D information (two positional dimensions and one spectral dimension) to find and characterize H I objects is imperative. In this context, visualization is an essential tool for enabling qualitative and quantitative human control on an automated source finding and analysis pipeline. We discuss how Visual Analytics, the combination of automated data processing and human reasoning, creativity and intuition, supported by interactive visualization, enables flexible and fast interaction with the 3-D data, helping the astronomer to deal with the analysis of complex sources. 3-D visualization, coupled to modeling, provides additional capabilities aiding the discovery and analysis of subtle structures in the 3-D domain. The requirements for a fully interactive visualization tool are: coupled 1-D/2-D/3-D visualization, quantitative and comparative capabilities, combined with supervised semi-automated analysis. Moreover, the source code must have the following characteristics for enabling collaborative work: open, modular, well documented, and well maintained. We review four state-of-the-art 3-D visualization packages, assessing their capabilities and feasibility for use with 3-D astronomical data.

  10. Structural and Computational Biology in the Design of Immunogenic Vaccine Antigens.

    PubMed

    Liljeroos, Lassi; Malito, Enrico; Ferlenghi, Ilaria; Bottomley, Matthew James

    2015-01-01

    Vaccination is historically one of the most important medical interventions for the prevention of infectious disease. Previously, vaccines were typically made of rather crude mixtures of inactivated or attenuated causative agents. However, over the last 10-20 years, several important technological and computational advances have enabled major progress in the discovery and design of potently immunogenic recombinant protein vaccine antigens. Here we discuss three key breakthrough approaches that have potentiated structural and computational vaccine design. Firstly, genomic sciences gave birth to the field of reverse vaccinology, which has enabled the rapid computational identification of potential vaccine antigens. Secondly, major advances in structural biology, experimental epitope mapping, and computational epitope prediction have yielded molecular insights into the immunogenic determinants defining protective antigens, enabling their rational optimization. Thirdly, and most recently, computational approaches have been used to convert this wealth of structural and immunological information into the design of improved vaccine antigens. This review aims to illustrate the growing power of combining sequencing, structural and computational approaches, and we discuss how this may drive the design of novel immunogens suitable for future vaccines urgently needed to increase the global prevention of infectious disease.

  11. A New Era of Multidisciplinary Expeditions: Recent Opportunities and Progress to Advance the Telepresence Paradigm

    NASA Astrophysics Data System (ADS)

    Cantwell, K. L.; Kennedy, B. R.; Malik, M.; Gray, L. M.; Elliott, K.; Lobecker, E.; Drewniak, J.; Reser, B.; Crum, E.; Lovalvo, D.

    2016-02-01

    Since its commissioning in 2008, NOAA Ship Okeanos Explorer has used telepresence technology both as an outreach tool and as a new way to conduct interdisciplinary science expeditions. NOAA's Office of Ocean Exploration and Research (OER) has developed a set of collaboration tools and protocols to enable extensive shore-based participation. Telepresence offers unique advantages including access to a large pool of expertise on shore and flexibility to react to new discoveries as they occur. During the early years, the telepresence experience was limited to Internet2-enabled Exploration Command Centers, but with the advent of improved bandwidth and new video transcoders, scientists from anywhere with an internet connection can participate in a telepresence expedition. Scientists have also capitalized on social media (Twitter, Facebook, Reddit, etc.) by sharing discoveries to leverage the intellectual capital of scientists worldwide and engaging the general public in real-time. Aside from using telepresence to stream video off the ship, the high-bandwidth satellite connection allows for the transfer of large quantities of data in near real-time. This enables not only ship - shore data transfers, but can also support ship - ship collaborations, as demonstrated during the 2015 and 2014 seasons when Okeanos worked directly with science teams onboard other vessels to share data and immediately follow up on features of interest, leading to additional discoveries. OER continues to expand its use of telepresence by experimenting with procedures to offload roles previously tied to the ship, such as data acquisition watchstanders; prototyping tools for distributed user data analysis and video annotation; and incorporating in-situ sampling devices. OER has also developed improved tools to provide access to archived data to increase data distribution and facilitate additional discoveries post-expedition.

  12. Genomics-Based Discovery of Plant Genes for Synthetic Biology of Terpenoid Fragrances: A Case Study in Sandalwood oil Biosynthesis.

    PubMed

    Celedon, J M; Bohlmann, J

    2016-01-01

    Terpenoid fragrances are powerful mediators of ecological interactions in nature and have a long history of traditional and modern industrial applications. Plants produce a great diversity of fragrant terpenoid metabolites, which make them a superb source of biosynthetic genes and enzymes. Advances in fragrance gene discovery have enabled new approaches in synthetic biology of high-value speciality molecules toward applications in the fragrance and flavor, food and beverage, cosmetics, and other industries. Rapid developments in transcriptome and genome sequencing of nonmodel plant species have accelerated the discovery of fragrance biosynthetic pathways. In parallel, advances in metabolic engineering of microbial and plant systems have established platforms for synthetic biology applications of some of the thousands of plant genes that underlie fragrance diversity. While many fragrance molecules (e.g., simple monoterpenes) are abundant in readily renewable plant materials, some highly valuable fragrant terpenoids (e.g., santalols, ambroxides) are rare in nature and interesting targets for synthetic biology. As a representative example of genomics/transcriptomics-enabled gene and enzyme discovery, we describe a strategy used successfully for elucidation of a complete fragrance biosynthetic pathway in sandalwood (Santalum album) and its reconstruction in yeast (Saccharomyces cerevisiae). We address questions related to the discovery of specific genes within large gene families and recovery of rare gene transcripts that are selectively expressed in recalcitrant tissues. To substantiate the validity of the approaches, we describe the combination of methods used in the gene and enzyme discovery of a cytochrome P450 in the fragrant heartwood of tropical sandalwood, responsible for the fragrance-defining final step in the biosynthesis of (Z)-santalols. © 2016 Elsevier Inc. All rights reserved.

  13. NASA's Discovery Program

    NASA Astrophysics Data System (ADS)

    Kicza, Mary; Bruegge, Richard Vorder

    1995-01-01

    NASA's Discovery Program represents a new era in planetary exploration. Discovery's primary goal: to maintain U.S. scientific leadership in planetary research by conducting a series of highly focused, cost-effective missions to answer critical questions in solar system science. The Program will stimulate the development of innovative management approaches by encouraging new teaming arrangements among industry, universities and the government. The program encourages the prudent use of new technologies to enable/enhance science return and to reduce life cycle cost, and it supports the transfer of these technologies to the private sector for secondary applications. The Near-Earth Asteroid Rendezvous and Mars Pathfinder missions have been selected as the first two Discovery missions. Both will be launched in 1996. Subsequent, competitively selected missions will be conceived and proposed to NASA by teams of scientists and engineers from industry, academia, and government organizations. This paper summarizes the status of Discovery Program planning.

  14. Simple animal models for amyotrophic lateral sclerosis drug discovery.

    PubMed

    Patten, Shunmoogum A; Parker, J Alex; Wen, Xiao-Yan; Drapeau, Pierre

    2016-08-01

    Simple animal models have enabled great progress in uncovering the disease mechanisms of amyotrophic lateral sclerosis (ALS) and are helping in the selection of therapeutic compounds through chemical genetic approaches. Within this article, the authors provide a concise overview of simple model organisms, C. elegans, Drosophila and zebrafish, which have been employed to study ALS and discuss their value to ALS drug discovery. In particular, the authors focus on innovative chemical screens that have established simple organisms as important models for ALS drug discovery. There are several advantages of using simple animal model organisms to accelerate drug discovery for ALS. It is the authors' particular belief that the amenability of simple animal models to various genetic manipulations, the availability of a wide range of transgenic strains for labelling motoneurons and other cell types, combined with live imaging and chemical screens should allow for new detailed studies elucidating early pathological processes in ALS and subsequent drug and target discovery.

  15. Discovery of novel drugs for promising targets.

    PubMed

    Martell, Robert E; Brooks, David G; Wang, Yan; Wilcoxen, Keith

    2013-09-01

    Once a promising drug target is identified, the steps to actually discover and optimize a drug are diverse and challenging. The goal of this study was to provide a road map for navigating drug discovery. We review the general steps of drug discovery and provide illustrative references. A number of approaches are available to enhance and accelerate target identification and validation. Consideration of a variety of potential mechanisms of action of potential drugs can guide discovery efforts. The hit-to-lead stage may involve techniques such as high-throughput screening, fragment-based screening, and structure-based design, with informatics playing an ever-increasing role. Biologically relevant screening models are discussed, including cell lines, 3-dimensional culture, and in vivo screening. The process of enabling human studies for an investigational drug is also discussed. Drug discovery is a complex process that has significantly evolved in recent years. © 2013 Elsevier HS Journals, Inc. All rights reserved.

  16. Advanced Information Technology Investments at the NASA Earth Science Technology Office

    NASA Astrophysics Data System (ADS)

    Clune, T.; Seablom, M. S.; Moe, K.

    2012-12-01

    The NASA Earth Science Technology Office (ESTO) regularly makes investments for nurturing advanced concepts in information technology to enable rapid, low-cost acquisition, processing and visualization of Earth science data in support of future NASA missions and climate change research. In 2012, the National Research Council published a mid-term assessment of the 2007 decadal survey for future space missions supporting Earth science and applications [1]. The report stated, "Earth sciences have advanced significantly because of existing observational capabilities and the fruit of past investments, along with advances in data and information systems, computer science, and enabling technologies." The report found that NASA had responded favorably and aggressively to the decadal survey and noted the role of the recent ESTO solicitation for information systems technologies that partnered with the NASA Applied Sciences Program to support the transition into operations. NASA's future missions are key stakeholders for the ESTO technology investments. Also driving these investments is the need for the Agency to properly address questions regarding the prediction, adaptation, and eventual mitigation of climate change. The Earth Science Division has championed interdisciplinary research, recognizing that the Earth must be studied as a complete system in order to address key science questions [2]. Information technology investments in the low-to-mid technology readiness level (TRL) range play a key role in meeting these challenges. ESTO's Advanced Information Systems Technology (AIST) program invests in higher-risk/higher-reward technologies that solve the most challenging problems of the information processing chain. This spans the space segment, where the information pipeline begins, to the end user, where knowledge is ultimately advanced.
The objectives of the program are to reduce the risk, cost, size, and development time of Earth Science space-based and ground-based systems, increase the accessibility and utility of science data, and to enable new observation measurements and information products. We will discuss the ESTO investment strategy for information technology development, the methods used to assess stakeholder needs and technology advancements, and technology partnerships to enhance the infusion for the resulting technology. We also describe specific investments and their potential impact on enabling NASA missions and scientific discovery. [1] "Earth Science and Applications from Space: A Midterm Assessment of NASA's Implementation of the Decadal Survey", 2012: National Academies Press, http://www.nap.edu/catalog.php?record_id=13405 [2] "Responding to the Challenge of Climate and Environmental Change: NASA's Plan for a Climate-Centric Architecture for Earth Observations and Applications from Space", 2010: NASA Tech Memo, http://science.nasa.gov/media/medialibrary/2010/07/01/Climate_Architecture_Final.pdf

  17. The CUAHSI Water Data Center: Enabling Data Publication, Discovery and Re-use

    NASA Astrophysics Data System (ADS)

    Seul, M.; Pollak, J.

    2014-12-01

    The CUAHSI Water Data Center (WDC) supports a standards-based, services-oriented architecture for time-series data and provides a separate service to publish spatial data layers as shape files. Two new services that the WDC offers are a cloud-based server (Cloud HydroServer) for publishing data and a web-based client for data discovery. The Cloud HydroServer greatly simplifies data publication by eliminating the need for scientists to set up an SQL-server database, a requirement that has proven to be a significant barrier, and ensures greater reliability and continuity of service. Uploaders have been developed to simplify the metadata documentation process. The web-based data client eliminates the need for installing a program to be used as a client and works across all computer operating systems. The services provided by the WDC are a foundation for big data use, re-use, and meta-analyses. Using data transmission standards enables far more effective data sharing and discovery; the standards used by the WDC are part of a global set of standards that should enable scientists to access unprecedented amounts of data to address larger-scale research questions than was previously possible. A central mission of the WDC is to ensure these services meet the needs of the water science community and are effective at advancing water science.

  18. The Outer Solar System Origins Survey. I. Design and First-Quarter Discoveries

    NASA Technical Reports Server (NTRS)

    Bannister, Michele T.; Kavelaars, J. J.; Petit, Jean-Marc; Gladman, Brett J.; Gwyn, Stephen D. J.; Chen, Ying-Tung; Volk, Kathryn; Alexandersen, Mike; Benecchi, Susan D.; Delsanti, Audrey; et al.

    2016-01-01

    We report the discovery, tracking, and detection circumstances for 85 trans-Neptunian objects (TNOs) from the first 42 square degrees of the Outer Solar System Origins Survey. This ongoing r-band solar system survey uses the 0.9 square degree field of view MegaPrime camera on the 3.6 meter Canada-France-Hawaii Telescope. Our orbital elements for these TNOs are precise to a fractional semimajor axis uncertainty of less than 0.1 percent. We achieve this precision in just two oppositions, as compared to the normal three to five oppositions, via a dense observing cadence and innovative astrometric technique. These discoveries are free of ephemeris bias, a first for large trans-Neptunian surveys. We also provide the necessary information to enable models of TNO orbital distributions to be tested against our TNO sample. We confirm the existence of a cold "kernel" of objects within the main cold classical Kuiper Belt and infer the existence of an extension of the "stirred" cold classical Kuiper Belt to at least several au beyond the 2:1 mean motion resonance with Neptune. We find that the population model of Petit et al. remains a plausible representation of the Kuiper Belt. The full survey, to be completed in 2017, will provide an exquisitely characterized sample of important resonant TNO populations, ideal for testing models of giant planet migration during the early history of the solar system.
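    The quoted precision is simple to make concrete. As a minimal numeric illustration (the specific values below are hypothetical, not from the survey), a fractional semimajor-axis uncertainty is just the absolute uncertainty divided by the semimajor axis:

```python
# Fractional semimajor-axis uncertainty sigma_a / a (illustrative values).
def fractional_uncertainty(a_au, sigma_a_au):
    """Return sigma_a / a, the fractional semimajor-axis uncertainty."""
    return sigma_a_au / a_au

# e.g. a hypothetical classical-belt object near 44 au measured to +/- 0.02 au:
frac = fractional_uncertainty(44.0, 0.02)
print(frac < 0.001)  # True: below the survey's stated 0.1% threshold
```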

  19. The Outer Solar System Origins Survey. I. Design and First-quarter Discoveries

    NASA Astrophysics Data System (ADS)

    Bannister, Michele T.; Kavelaars, J. J.; Petit, Jean-Marc; Gladman, Brett J.; Gwyn, Stephen D. J.; Chen, Ying-Tung; Volk, Kathryn; Alexandersen, Mike; Benecchi, Susan D.; Delsanti, Audrey; Fraser, Wesley C.; Granvik, Mikael; Grundy, Will M.; Guilbert-Lepoutre, Aurélie; Hestroffer, Daniel; Ip, Wing-Huen; Jakubik, Marian; Jones, R. Lynne; Kaib, Nathan; Kavelaars, Catherine F.; Lacerda, Pedro; Lawler, Samantha; Lehner, Matthew J.; Lin, Hsing Wen; Lister, Tim; Lykawka, Patryk Sofia; Monty, Stephanie; Marsset, Michael; Murray-Clay, Ruth; Noll, Keith S.; Parker, Alex; Pike, Rosemary E.; Rousselot, Philippe; Rusk, David; Schwamb, Megan E.; Shankman, Cory; Sicardy, Bruno; Vernazza, Pierre; Wang, Shiang-Yu

    2016-09-01

    We report the discovery, tracking, and detection circumstances for 85 trans-Neptunian objects (TNOs) from the first 42 deg² of the Outer Solar System Origins Survey. This ongoing r-band solar system survey uses the 0.9 deg² field of view MegaPrime camera on the 3.6 m Canada-France-Hawaii Telescope. Our orbital elements for these TNOs are precise to a fractional semimajor axis uncertainty <0.1%. We achieve this precision in just two oppositions, as compared to the normal three to five oppositions, via a dense observing cadence and innovative astrometric technique. These discoveries are free of ephemeris bias, a first for large trans-Neptunian surveys. We also provide the necessary information to enable models of TNO orbital distributions to be tested against our TNO sample. We confirm the existence of a cold “kernel” of objects within the main cold classical Kuiper Belt and infer the existence of an extension of the “stirred” cold classical Kuiper Belt to at least several au beyond the 2:1 mean motion resonance with Neptune. We find that the population model of Petit et al. remains a plausible representation of the Kuiper Belt. The full survey, to be completed in 2017, will provide an exquisitely characterized sample of important resonant TNO populations, ideal for testing models of giant planet migration during the early history of the solar system.

  20. Discovery: an interactive resource for the rational selection and comparison of putative drug target proteins in malaria

    PubMed Central

    Joubert, Fourie; Harrison, Claudia M; Koegelenberg, Riaan J; Odendaal, Christiaan J; de Beer, Tjaart AP

    2009-01-01

    Background Up to half a billion human clinical cases of malaria are reported each year, resulting in about 2.7 million deaths, most of which occur in sub-Saharan Africa. Due to the over- and misuse of anti-malarials, widespread resistance to all the known drugs is increasing at an alarming rate. Rational methods to select new drug target proteins and lead compounds are urgently needed. The Discovery system provides data mining functionality on extensive annotations of five malaria species together with the human and mosquito hosts, enabling the selection of new targets based on multiple protein and ligand properties. Methods A web-based system was developed where researchers are able to mine information on malaria proteins and predicted ligands, as well as perform comparisons to the human and mosquito host characteristics. Protein features used include: domains, motifs, EC numbers, GO terms, orthologs, protein-protein interactions, protein-ligand interactions and host-pathogen interactions among others. Searching by chemical structure is also available. Results An in silico system for the selection of putative drug targets and lead compounds is presented, together with an example study on the bifunctional DHFR-TS from Plasmodium falciparum. Conclusion The Discovery system allows for the identification of putative drug targets and lead compounds in Plasmodium species based on the filtering of protein and chemical properties. PMID:19642978
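    The target-selection idea described above, filtering candidates on several protein properties at once, can be sketched in a few lines. This is a toy example with entirely hypothetical records and property names, not the Discovery system's actual data model: keep parasite proteins that carry a required GO term and lack a human ortholog.

```python
# Hypothetical protein records (illustrative only).
proteins = [
    {"id": "prot_A", "go_terms": {"GO:0004146"}, "human_ortholog": True},
    {"id": "prot_B", "go_terms": {"GO:0004146"}, "human_ortholog": False},
    {"id": "prot_C", "go_terms": {"GO:0003824"}, "human_ortholog": False},
]

def candidate_targets(records, required_go):
    """Filter on two properties; a real query would chain many more criteria."""
    return [p["id"] for p in records
            if required_go in p["go_terms"] and not p["human_ortholog"]]

print(candidate_targets(proteins, "GO:0004146"))  # ['prot_B']
```

Excluding proteins with close human orthologs is one common selectivity heuristic; the system described combines this kind of filter with domain, ligand, and interaction properties.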

  1. The Heliophysics Data Environment: Open Source, Open Systems and Open Data.

    NASA Astrophysics Data System (ADS)

    King, Todd; Roberts, Aaron; Walker, Raymond; Thieman, James

    2012-07-01

    The Heliophysics Data Environment (HPDE) is a place for scientific discovery. Today, the Heliophysics Data Environment is a framework of technologies, standards and services which enables the international community to collaborate more effectively in space physics research. Crafting a framework for a data environment begins with defining a model of the tasks to be performed, then defining the functional aspects and the workflow. The foundation of any data environment is an information model which defines the structure and content of the metadata necessary to perform the tasks. In the Heliophysics Data Environment the information model is the Space Physics Archive Search and Extract (SPASE) model, and available resources are described by using this model. A described resource can reside anywhere on the internet, which makes it possible for a national archive, mission, data center or individual researcher to be a provider. The generated metadata is shared, reviewed and harvested to enable services. Virtual Observatories use the metadata to provide community-based portals. Through unique identifiers and registry services, tools can quickly discover and access data available anywhere on the internet. This enables a researcher to quickly view and analyze data in a variety of settings and enhances the Heliophysics Data Environment. To illustrate the current Heliophysics Data Environment we present the design, architecture and operation of the Heliophysics framework. We then walk through a real example of using available tools to investigate the effects of the solar wind on Earth's magnetosphere.
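    The identifier-and-registry mechanism above can be illustrated with a toy resolver. The SPASE-style resource IDs and URLs below are invented for illustration; the point is that a globally unique identifier decouples discovery from physical location:

```python
# Toy registry mapping SPASE-style resource IDs to access URLs
# (IDs and URLs are made up for this sketch).
registry = {
    "spase://Example/NumericalData/DemoMission/MAG/PT16S":
        "https://data.example.org/demomission/mag/16s",
}

def resolve(resource_id):
    """Return the registered access URL for a resource ID, or None if unknown."""
    return registry.get(resource_id)

print(resolve("spase://Example/NumericalData/DemoMission/MAG/PT16S"))
```

In the real environment, registries are harvested and queried by Virtual Observatory services rather than held in a local dictionary, but the lookup contract is the same.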

  2. 14 CFR 16.213 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 14 Aeronautics and Space 1 2011-01-01 2011-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  3. 14 CFR 16.213 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 1 2012-01-01 2012-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  4. 14 CFR 16.213 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 1 2013-01-01 2013-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  5. 14 CFR 16.213 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 14 Aeronautics and Space 1 2014-01-01 2014-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  6. 14 CFR 16.213 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 14 Aeronautics and Space 1 2010-01-01 2010-01-01 false Discovery. 16.213 Section 16.213... PRACTICE FOR FEDERALLY-ASSISTED AIRPORT ENFORCEMENT PROCEEDINGS Hearings § 16.213 Discovery. (a) Discovery... discovery permitted by this section if a party shows that— (1) The information requested is cumulative or...

  7. Explorations of Psyche and Callisto Enabled by Ion Propulsion

    NASA Technical Reports Server (NTRS)

    Wenkert, Daniel D.; Landau, Damon F.; Bills, Bruce G.; Elkins-Tanton, Linda T.

    2013-01-01

    Recent developments in ion propulsion (specifically solar electric propulsion - SEP) have the potential for dramatically reducing the transportation cost of planetary missions. We examine two representative cases, where these new developments enable missions which, until recently, would have required resources well beyond those allocated to the Discovery program. The two cases of interest address differentiation of asteroids and large icy satellites.

  8. NHS-Esters As Versatile Reactivity-Based Probes for Mapping Proteome-Wide Ligandable Hotspots.

    PubMed

    Ward, Carl C; Kleinman, Jordan I; Nomura, Daniel K

    2017-06-16

    Most of the proteome is considered undruggable, oftentimes hindering translational efforts for drug discovery. Identifying previously unknown druggable hotspots in proteins would enable strategies for pharmacologically interrogating these sites with small molecules. Activity-based protein profiling (ABPP) has arisen as a powerful chemoproteomic strategy that uses reactivity-based chemical probes to map reactive, functional, and ligandable hotspots in complex proteomes, which has enabled inhibitor discovery against various therapeutic protein targets. Here, we report an alkyne-functionalized N-hydroxysuccinimide-ester (NHS-ester) as a versatile reactivity-based probe for mapping the reactivity of a wide range of nucleophilic ligandable hotspots, including lysines, serines, threonines, and tyrosines, encompassing active sites, allosteric sites, post-translational modification sites, protein interaction sites, and previously uncharacterized potential binding sites. Surprisingly, we also show that fragment-based NHS-ester ligands can be made to confer selectivity for specific lysine hotspots on specific targets including Dpyd, Aldh2, and Gstt1. We thus put forth NHS-esters as promising reactivity-based probes and chemical scaffolds for covalent ligand discovery.

  9. "New Skills and Abilities to Enable Me to Support My Pupils in a Forward Thinking Positive Way:" A Self-Discovery Programme for Teachers in Mainstream School

    ERIC Educational Resources Information Center

    Powell, Lesley; Cheshire, Anna

    2008-01-01

    The purpose of this study is to adapt, deliver, and pilot test the Self-discovery Programme (SDP) for teachers in mainstream school. The study used a pre-test post-test design. Quantitative data were collected by self-administered questionnaires given to teachers at two points in time: baseline (immediately pre-SDP) and immediately post-SDP.…

  10. Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB)

    PubMed Central

    Ekins, Sean; Spektor, Anna Coulon; Clark, Alex M.; Dole, Krishna; Bunin, Barry A.

    2016-01-01

    Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public–private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project. PMID:27884746

  11. Federated Tensor Factorization for Computational Phenotyping

    PubMed Central

    Kim, Yejin; Sun, Jimeng; Yu, Hwanjo; Jiang, Xiaoqian

    2017-01-01

    Tensor factorization models offer an effective approach to converting massive electronic health records into meaningful clinical concepts (phenotypes) for data analysis. These models need a large amount of diverse samples to avoid population bias. An open challenge is how to derive phenotypes jointly across multiple hospitals when direct patient-level data sharing is not possible (e.g., due to institutional policies). In this paper, we developed a novel solution to enable federated tensor factorization for computational phenotyping without sharing patient-level data. We developed secure data harmonization and federated computation procedures based on the alternating direction method of multipliers (ADMM). Using this method, the multiple hospitals iteratively update tensors and transfer secure summarized information to a central server, and the server aggregates the information to generate phenotypes. We demonstrated with real medical datasets that our method resembles the centralized training model (based on combined datasets) in terms of accuracy and phenotype discovery while respecting privacy. PMID:29071165
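    The federated pattern described above — local updates at each hospital, with only summarized information sent to a central server that aggregates it — can be illustrated with an ADMM consensus-averaging sketch. This is not the paper's tensor factorization algorithm; it is a minimal example of the ADMM consensus pattern, with invented local factor matrices and a simple quadratic local loss at each site.

```python
import numpy as np

# Hypothetical local "summaries": each hospital's local factor estimate.
rng = np.random.default_rng(0)
local_factors = [rng.normal(size=(4, 2)) for _ in range(3)]  # 3 hospitals

rho = 1.0
z = np.zeros((4, 2))                           # server's consensus factor
u = [np.zeros((4, 2)) for _ in local_factors]  # per-site dual variables

for _ in range(200):
    # Local step at each hospital i:
    # minimize ||x - a_i||^2 + (rho/2)||x - z + u_i||^2 in closed form.
    xs = [(2 * a + rho * (z - ui)) / (2 + rho)
          for a, ui in zip(local_factors, u)]
    # Server step: aggregate only the summaries x_i + u_i, never raw data.
    z = np.mean([x + ui for x, ui in zip(xs, u)], axis=0)
    # Dual update, back at each hospital.
    u = [ui + x - z for ui, x in zip(u, xs)]

# Consensus converges to the average of the local factors.
print(np.allclose(z, np.mean(local_factors, axis=0), atol=1e-6))  # True
```

    The privacy property mirrors the abstract: each site transmits only its current estimate plus a dual term, and the server never sees patient-level records.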

  12. Informatics Enabled Behavioral Medicine in Oncology

    PubMed Central

    Hesse, Bradford W.; Suls, Jerry M.

    2011-01-01

    For the practicing physician, the behavioral implications of preventing, diagnosing, and treating cancer are many and varied. Fortunately, an enhanced capacity in informatics may help create a redesigned ecosystem in which applying evidence-based principles from behavioral medicine will become a routine part of care. Innovation to support this evolution will be spurred by the “meaningful use” criteria stipulated by the Health Information Technology for Economic and Clinical Health (HITECH) Act of 2009, and by focused research and development efforts within the broader health information ecosystem. The implications for how to better integrate evidence-based principles in behavioral medicine into oncology care through both spheres of development are discussed within the framework of the cancer control continuum. The promise of using the data collected through these tools to accelerate discovery in psycho-oncology is also discussed. If nurtured appropriately, these developments should help accelerate successes against cancer by altering the behavioral milieu. PMID:21799329

  13. 1. On note taking.

    PubMed

    Plaut, Alfred B J

    2005-02-01

    In this paper the author explores the theoretical and technical issues relating to taking notes of analytic sessions, using an introspective approach. The paper discusses the lack of a consistent approach to note taking amongst analysts and sets out to demonstrate that systematic note taking can be helpful to the analyst. The author describes his discovery that an initial phase where as much data was recorded as possible did not prove to be reliably helpful in clinical work and initially actively interfered with recall in subsequent sessions. The impact of the nature of the analytic session itself and the focus of the analyst's interest on recall is discussed. The author then describes how he modified his note taking technique to classify information from sessions into four categories which enabled the analyst to select which information to record in notes. The characteristics of memory and its constructive nature are discussed in relation to the problems that arise in making accurate notes of analytic sessions.

  14. Semantic Web Service Delivery in Healthcare Based on Functional and Non-Functional Properties.

    PubMed

    Schweitzer, Marco; Gorfer, Thilo; Hörbst, Alexander

    2017-01-01

    In the past decades, much effort has been devoted to the trans-institutional exchange of healthcare data through electronic health records (EHR) in order to obtain a lifelong, shared, accessible health record of a patient. Besides basic information exchange, there is a growing need for Information and Communication Technology (ICT) to support the use of the collected health data in an individual, case-specific, workflow-based manner. This paper presents results on how workflows can be used to process data from electronic health records, following a semantic web service approach that enables automatic discovery, composition, and invocation of suitable web services. Based on this solution, the user (physician) can define his or her needs from a domain-specific perspective, whereas the ICT system fulfills those needs with modular web services. By also involving non-functional properties in the service selection, this approach is even more suitable for the dynamic medical domain.
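    The two-stage selection described above — match services on functional properties first, then rank candidates by non-functional properties — can be sketched as a filter-and-sort. The registry entries, property names, and scoring rule below are invented for illustration and are not the paper's actual service model.

```python
# Hypothetical service registry entries: one functional capability plus
# non-functional properties (names and values are illustrative).
services = [
    {"name": "LabResultFetcher", "capability": "fetch_lab_results",
     "latency_ms": 120, "availability": 0.999},
    {"name": "LegacyLabGateway", "capability": "fetch_lab_results",
     "latency_ms": 900, "availability": 0.95},
    {"name": "ImagingArchive", "capability": "fetch_dicom_images",
     "latency_ms": 300, "availability": 0.99},
]

def discover(capability, max_latency_ms):
    """Functional match first, then rank by non-functional properties."""
    candidates = [s for s in services
                  if s["capability"] == capability
                  and s["latency_ms"] <= max_latency_ms]
    # Higher availability and lower latency score better.
    return sorted(candidates,
                  key=lambda s: (-s["availability"], s["latency_ms"]))

best = discover("fetch_lab_results", max_latency_ms=1000)
print([s["name"] for s in best])  # LabResultFetcher ranks first
```

    In a real semantic web service stack the functional match would be an ontology-based capability match rather than a string comparison, but the selection logic keeps this shape.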

  15. Improving life sciences information retrieval using semantic web technology.

    PubMed

    Quan, Dennis

    2007-05-01

    The ability to retrieve relevant information is at the heart of every aspect of research and development in the life sciences industry. Information is often distributed across multiple systems and recorded in a way that makes it difficult to piece together the complete picture. Differences in data formats, naming schemes and network protocols amongst information sources, both public and private, must be overcome, and user interfaces not only need to be able to tap into these diverse information sources but must also assist users in filtering out extraneous information and highlighting the key relationships hidden within an aggregated set of information. The Semantic Web community has made great strides in proposing solutions to these problems, and many efforts are underway to apply Semantic Web techniques to the problem of information retrieval in the life sciences space. This article gives an overview of the principles underlying a Semantic Web-enabled information retrieval system: creating a unified abstraction for knowledge using the RDF semantic network model; designing semantic lenses that extract contextually relevant subsets of information; and assembling semantic lenses into powerful information displays. Furthermore, concrete examples of how these principles can be applied to life science problems including a scenario involving a drug discovery dashboard prototype called BioDash are provided.
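    The "semantic lens" idea from the abstract — extracting a contextually relevant subset of an RDF semantic network — can be sketched over a toy triple store. The resources and predicates below are invented; a real system would use RDF tooling and SPARQL queries rather than plain Python sets.

```python
# A tiny RDF-style semantic network: (subject, predicate, object) triples.
# Resource names are invented for illustration.
triples = {
    ("drug:X17", "targets", "protein:EGFR"),
    ("drug:X17", "hasStatus", "preclinical"),
    ("protein:EGFR", "implicatedIn", "disease:NSCLC"),
    ("disease:NSCLC", "label", "non-small cell lung cancer"),
    ("drug:X17", "citedBy", "paper:12345"),
}

def lens(focus, predicates):
    """A 'semantic lens': keep only statements about `focus` whose
    predicate is contextually relevant, following one hop outward."""
    direct = {t for t in triples if t[0] == focus and t[1] in predicates}
    neighbors = {o for _, _, o in direct}
    onward = {t for t in triples if t[0] in neighbors and t[1] in predicates}
    return direct | onward

# A drug-discovery lens cares about targets and disease links,
# filtering out extraneous statements such as citations.
view = lens("drug:X17", {"targets", "implicatedIn"})
for s, p, o in sorted(view):
    print(s, p, o)
```

    Assembling several such lenses side by side is what turns an aggregated semantic network into a focused display like the BioDash prototype described above.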

  16. Collection and Retention Procedures for Electronically Stored Information (ESI) Collected Using E-Discovery Tools

    EPA Pesticide Factsheets

    This procedure is designed to support the collection of potentially responsive information using automated E-Discovery tools that rely on keywords, key phrases, index queries, or other technological assistance to retrieve Electronically Stored Information.

  17. Integrating semantic web technologies and geospatial catalog services for geospatial information discovery and processing in cyberinfrastructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yue, Peng; Gong, Jianya; Di, Liping

    A geospatial catalogue service provides a network-based meta-information repository and interface for advertising and discovering shared geospatial data and services. Descriptive information (i.e., metadata) for geospatial data and services is structured and organized in catalogue services. The approaches currently available for searching and using that information are often inadequate. Semantic Web technologies show promise for better discovery methods by exploiting the underlying semantics. Such development needs special attention from the Cyberinfrastructure perspective, so that the traditional focus on discovery of and access to geospatial data can be expanded to support the increased demand for processing of geospatial information and discovery of knowledge. Semantic descriptions for geospatial data, services, and geoprocessing service chains are structured, organized, and registered through extending elements in the ebXML Registry Information Model (ebRIM) of a geospatial catalogue service, which follows the interface specifications of the Open Geospatial Consortium (OGC) Catalogue Services for the Web (CSW). The process models for geoprocessing service chains, as a type of geospatial knowledge, are captured, registered, and discoverable. Semantics-enhanced discovery for geospatial data, services/service chains, and process models is described. Semantic search middleware that can support virtual data product materialization is developed for the geospatial catalogue service. The creation of such a semantics-enhanced geospatial catalogue service is important in meeting the demands for geospatial information discovery and analysis in Cyberinfrastructure.

  18. A Knowledge Discovery framework for Planetary Defense

    NASA Astrophysics Data System (ADS)

    Jiang, Y.; Yang, C. P.; Li, Y.; Yu, M.; Bambacus, M.; Seery, B.; Barbee, B.

    2016-12-01

    Planetary Defense, a project funded by NASA Goddard and the NSF, is a multi-faceted effort focused on the mitigation of Near Earth Object (NEO) threats to our planet. Currently, information concerning NEOs is dispersed among different organizations and scientists, leaving no coherent system of information to be used for efficient NEO mitigation. In this paper, a planetary defense knowledge discovery engine is proposed to better assist the development and integration of a NEO response system. Specifically, we have implemented an organized information framework by two means: 1) the development of a semantic knowledge base, which provides a structure for relevant information. It has been developed through web crawling and natural language processing techniques, which allow us to collect and store the most relevant structured information on a regular basis; 2) the development of a knowledge discovery engine, which allows for the efficient retrieval of information from our knowledge base. The knowledge discovery engine has been built on top of Elasticsearch, an open source full-text search engine, along with cutting-edge machine learning ranking and recommendation algorithms. This proposed framework is expected to advance knowledge discovery and innovation in the planetary science domain.
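    The retrieval core of such an engine (Elasticsearch, in the abstract) rests on an inverted index with term-based relevance scoring. The following is a toy stand-in for that idea, not Elasticsearch itself: a miniature index over invented NEO-related documents with a simple term-frequency score.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for crawled NEO documents (contents invented).
docs = {
    "neo-1": "asteroid deflection via kinetic impactor mission design",
    "neo-2": "near earth object survey and orbit determination",
    "neo-3": "kinetic impactor and nuclear deflection trade study",
}

# Build an inverted index: term -> {doc id: term frequency}.
index = defaultdict(Counter)
for doc_id, text in docs.items():
    for term in text.split():
        index[term][doc_id] += 1

def search(query):
    """Score each document by summed term frequency over query terms."""
    scores = Counter()
    for term in query.split():
        for doc_id, tf in index[term].items():
            scores[doc_id] += tf
    return [doc_id for doc_id, _ in scores.most_common()]

print(search("nuclear deflection"))  # neo-3 matches both terms, ranks first
```

    A production engine layers tokenization, TF-IDF/BM25 weighting, and learned ranking on top of exactly this index structure.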

  19. Modeling human neurological disorders with induced pluripotent stem cells.

    PubMed

    Imaizumi, Yoichi; Okano, Hideyuki

    2014-05-01

    Human induced pluripotent stem (iPS) cells obtained by reprogramming technology are a source of great hope, not only in terms of applications in regenerative medicine, such as cell transplantation therapy, but also for modeling human diseases and new drug development. In particular, the production of iPS cells from the somatic cells of patients with intractable diseases and their subsequent differentiation into cells at affected sites (e.g., neurons, cardiomyocytes, hepatocytes, and myocytes) has permitted the in vitro construction of disease models that contain patient-specific genetic information. For example, disease-specific iPS cells have been established from patients with neuropsychiatric disorders, including schizophrenia and autism, as well as from those with neurodegenerative diseases, including Parkinson's disease and Alzheimer's disease. A multi-omics analysis of neural cells originating from patient-derived iPS cells may thus enable investigators to elucidate the pathogenic mechanisms of neurological diseases that have heretofore been unknown. In addition, large-scale screening of chemical libraries with disease-specific iPS cells is currently underway and is expected to lead to new drug discovery. Accordingly, this review outlines the progress made via the use of patient-derived iPS cells toward the modeling of neurological disorders, the testing of existing drugs, and the discovery of new drugs. The production of human induced pluripotent stem (iPS) cells from patients' somatic cells and their subsequent differentiation into specific cells has permitted the in vitro construction of disease models that contain patient-specific genetic information. Furthermore, innovations in gene-editing technologies for iPS cells are enabling new approaches for illuminating the pathogenic mechanisms of human diseases. In this review article, we outline the current status of neurological disease-specific iPS cell research and describe recently obtained knowledge in the form of actual examples. © 2013 International Society for Neurochemistry.

  20. A Drupal-Based Collaborative Framework for Science Workflows

    NASA Astrophysics Data System (ADS)

    Pinheiro da Silva, P.; Gandara, A.

    2010-12-01

    Cyber-infrastructure is built from utilizing technical infrastructure to support organizational practices and social norms to provide support for scientific teams working together or dependent on each other to conduct scientific research. Such cyber-infrastructure enables the sharing of information and data so that scientists can leverage knowledge and expertise through automation. Scientific workflow systems have been used to build automated scientific systems used by scientists to conduct scientific research and, as a result, create artifacts in support of scientific discoveries. These complex systems are often developed by teams of scientists who are located in different places, e.g., scientists working in distinct buildings, and sometimes in different time zones, e.g., scientists working in distinct national laboratories. The sharing of these specifications is currently supported by the use of version control systems such as CVS or Subversion. Discussions about the design, improvement, and testing of these specifications, however, often happen elsewhere, e.g., through the exchange of email messages and IM chatting. Carrying on a discussion about these specifications is challenging because comments and specifications are not necessarily connected. For instance, the person reading a comment about a given workflow specification may not be able to see the workflow, and even if they can, they may not know to which part of the workflow a given comment applies. In this paper, we discuss the design, implementation, and use of CI-Server, a Drupal-based infrastructure, to support the collaboration of both local and distributed teams of scientists using scientific workflows. CI-Server has three primary goals: to enable information sharing by providing tools that scientists can use within their scientific research to process data, publish, and share artifacts; to build community by providing tools that support discussions between scientists about artifacts used or created through scientific processes; and to leverage the knowledge collected within the artifacts and scientific collaborations to support scientific discoveries.

  1. An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

    PubMed Central

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M.; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A. V. S. K.; Varshney, Rajeev K.

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of a command line interface, massive computational resources, and expertise, which is a daunting task for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface, called Integrated SNP Mining and Utilization (ISMU), for SNP discovery and their utilization in developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC), and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing, and transcriptome sequencing data, at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise, enabling them to discover SNPs and use them in genomics, genetics, and breeding studies. The pipeline has been parallelized to process huge next generation sequencing datasets. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as standalone free software. PMID:25003610
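    The pairwise SNP reporting with flanking sequences that the pipeline produces for assay design can be illustrated naively. This sketch, with invented pre-aligned sequences, only marks single-base mismatches and reports their flanking context; it omits the alignment, quality filtering, and scoring steps a real pipeline such as ISMU performs.

```python
def call_snps(ref, sample, flank=3):
    """Naively report single-base differences between two aligned
    sequences, with the flanking context genotyping assays require."""
    snps = []
    for i, (r, s) in enumerate(zip(ref, sample)):
        if r != s:
            left = ref[max(0, i - flank):i]
            right = ref[i + 1:i + 1 + flank]
            snps.append({"pos": i, "ref": r, "alt": s,
                         "context": f"{left}[{r}/{s}]{right}"})
    return snps

# Invented aligned sequences for illustration.
reference = "ACGTTGCAGATTACA"
genotype  = "ACGTTGCTGATTACA"

for snp in call_snps(reference, genotype):
    print(snp["pos"], snp["context"])  # 7 TGC[A/T]GAT
```

    The `left[ref/alt]right` context string mirrors the flanking-sequence format commonly used when designing marker genotyping assays.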

  2. Advances and Directions for the Intelligent Systems for Geosciences Research Community: Updates and Opportunities from the NSF EarthCube IS-GEO RCN

    NASA Astrophysics Data System (ADS)

    Pierce, S. A.

    2017-12-01

    The EarthCube Intelligent Systems for Geosciences Research Collaboration Network (IS-GEO RCN) represents an emerging community of interdisciplinary researchers aiming to create fundamentally new capabilities for understanding Earth systems. Collaborative efforts across IS-GEO fields of study offer opportunities to accelerate scientific discovery and understanding. The IS-GEO community has an active membership of approximately 65 researchers and includes researchers from across the US, international members, and an early career committee. Current working groups are open to new participants and are focused on four thematic areas with regular coordination meetings and upcoming sessions at professional conferences. (1) The Sensor-based Data Collection and Integration working group looks at techniques for analyzing and integrating information from heterogeneous sources, with a possible application to early warning systems. (2) The Geoscience Case Studies working group is creating benchmark data sets to enable new collaborations between geoscientists and data scientists. (3) The Geo-Simulations working group is evaluating the state of the art in practices for parametrizations, scales, and model integration. (4) The Education working group is gathering and organizing the materials from the different IS-GEO courses. Innovative IS-GEO applications will help researchers overcome common challenges while redefining the frontiers of discovery across fields and disciplines. (Visit IS-GEO.org for more information or to sign up for any of the working groups.)

  3. Customer Discovery as the First Essential Step for Successful Health Information Technology System Development.

    PubMed

    Thamjamrassri, Punyotai; Song, YuJin; Tak, JaeHyun; Kang, HoYong; Kong, Hyoun-Joong; Hong, Jeeyoung

    2018-01-01

    Customer discovery (CD) is a method to determine whether there are actual customers for a product/service and what they would want before actually developing the product/service. This concept, however, is rather new to health information technology (IT) systems. Therefore, the aim of this paper was to demonstrate how to use the CD method in developing a comprehensive health IT service for patients with knee/leg pain. We participated in a 6-week I-Corps program to perform CD, in which we interviewed 55 people in person, by phone, or by video conference within 6 weeks: 4 weeks in the United States and 2 weeks in Korea. The interviewees included orthopedic doctors, physical therapists, physical trainers, physicians, researchers, pharmacists, vendors, and patients. We then analyzed the interview data to revise our business model accordingly. Using the CD approach enabled us to understand the customer segments and identify value propositions. We concluded that a facilitating tele-rehabilitation system is needed most and that the most suitable customer segment is early-stage arthritis patients. We identified a new design concept for this customer segment. Furthermore, CD is required to identify value propositions in detail. CD is crucial to determining a more desirable direction in developing health IT systems, and it can be a powerful tool to increase the potential for successful commercialization in the health IT field.

  4. A Planetary Defense Gateway for Smart Discovery of relevant Information for Decision Support

    NASA Technical Reports Server (NTRS)

    Bambacus, Myra; Yang, Chaowei Phil; Leung, Ronald Y.; Barbee, Brent; Nuth, Joseph A.; Seery, Bernard; Jiang, Yongyao; Qin, Han; Li, Yun; Yu, Manzhu; et al.

    2017-01-01

    Presentation discussing the background, framework architecture, current results, ongoing research, and conclusions of a Planetary Defense Gateway for smart discovery of relevant information for decision support.

  5. Discovery and problem solving: Triangulation as a weak heuristic

    NASA Technical Reports Server (NTRS)

    Rochowiak, Daniel

    1987-01-01

    Recently the artificial intelligence community has turned its attention to the process of discovery and found that the history of science is a fertile source for what Darden has called compiled hindsight. Such hindsight generates weak heuristics for discovery that do not guarantee that discoveries will be made but do have proven worth in leading to discoveries. Triangulation is one such heuristic that is grounded in historical hindsight. This heuristic is explored within the general framework of the BACON, GLAUBER, STAHL, DALTON, and SUTTON programs. In triangulation different bases of information are compared in an effort to identify gaps between the bases. Thus, assuming that the bases of information are relevantly related, the gaps that are identified should be good locations for discovery and robust analysis.
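    Triangulation as described — comparing relevantly related bases of information and treating the gaps between them as promising sites for discovery — reduces, in its simplest form, to a set difference. The "reactions" below are invented placeholders standing in for the kinds of knowledge bases programs like BACON or STAHL operated over.

```python
# Two relevantly related bases of information (contents invented):
# phenomena that have been observed versus phenomena the current
# theory can derive.
observed = {"A + B -> C", "C + D -> E", "A + D -> F"}
explained_by_theory = {"A + B -> C", "C + D -> E"}

# Triangulation heuristic: gaps between the bases mark candidate
# locations for discovery and closer analysis.
gaps = observed - explained_by_theory
print(gaps)  # the one observation the theory cannot yet account for
```

    As a weak heuristic, this does not guarantee a discovery at each gap; it only directs attention to where the two bases fail to line up.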

  6. 5 CFR 1201.72 - Explanation and scope of discovery.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... obtain relevant information, including the identification of potential witnesses, from another person or a party, that the other person or party has not otherwise provided. Relevant information includes information that appears reasonably calculated to lead to the discovery of admissible evidence. This...

  7. Visualization portal for genetic variation (VizGVar): a tool for interactive visualization of SNPs and somatic mutations in exons, genes and protein domains.

    PubMed

    Solano-Román, Antonio; Alfaro-Arias, Verónica; Cruz-Castillo, Carlos; Orozco-Solano, Allan

    2018-03-15

    VizGVar was designed to meet the growing need of the research community for improved genomic and proteomic data viewers that benefit from better information visualization. We implemented a new information architecture and applied user centered design principles to provide a new improved way of visualizing genetic information and protein data related to human disease. VizGVar connects the entire database of Ensembl protein motifs, domains, genes and exons with annotated SNPs and somatic variations from PharmGKB and COSMIC. VizGVar precisely represents genetic variations and their respective location by colored curves to designate different types of variations. The structured hierarchy of biological data is reflected in aggregated patterns through different levels, integrating several layers of information at once. VizGVar provides a new interactive, web-based JavaScript visualization of somatic mutations and protein variation, enabling fast and easy discovery of clinically relevant variation patterns. VizGVar is accessible at http://vizport.io/vizgvar; http://vizport.io/vizgvar/doc/. asolano@broadinstitute.org or allan.orozcosolano@ucr.ac.cr.

  8. Pattern Activity Clustering and Evaluation (PACE)

    NASA Astrophysics Data System (ADS)

    Blasch, Erik; Banas, Christopher; Paul, Michael; Bussjager, Becky; Seetharaman, Guna

    2012-06-01

    With the vast amount of network information available on the activities of people (e.g., motions, transportation routes, and site visits), there is a need to explore the salient properties of data that detect and discriminate the behavior of individuals. Recent machine learning approaches include methods of data mining, statistical analysis, clustering, and estimation that support activity-based intelligence. We seek to explore contemporary methods in activity analysis using machine learning techniques that discover and characterize behaviors that enable grouping, anomaly detection, and adversarial intent prediction. To evaluate these methods, we describe the mathematics and potential information theory metrics to characterize behavior. A scenario is presented to demonstrate the concept and metrics that could be useful for layered sensing behavior pattern learning and analysis. We leverage work on group tracking, learning, and clustering approaches, as well as utilize information theoretic metrics for classification, behavioral and event pattern recognition, and activity and entity analysis. The performance evaluation of activity analysis supports high-level information fusion of user alerts, data queries, and sensor management for data extraction, relations discovery, and situation analysis of existing data.
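    One of the information-theoretic metrics alluded to above can be made concrete: the Shannon entropy of an individual's site-visit distribution, where low entropy suggests routine behavior and high entropy suggests varied or erratic behavior. The activity traces are invented, and this illustrates only the kind of metric involved, not the PACE system's actual measures.

```python
import math
from collections import Counter

def visit_entropy(visits):
    """Shannon entropy (bits) of a sequence of visited sites: a simple
    information-theoretic characterization of behavior."""
    counts = Counter(visits)
    total = len(visits)
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values())

# Invented activity traces: one routine individual, one erratic one.
routine = ["home", "work", "home", "work", "home", "work", "home", "work"]
erratic = ["home", "cafe", "park", "work", "mall", "gym", "dock", "lab"]

print(round(visit_entropy(routine), 3))  # 1.0 bit: highly regular pattern
print(round(visit_entropy(erratic), 3))  # 3.0 bits: maximally varied here
```

    Metrics like this can feed clustering (grouping individuals with similar entropy profiles) or anomaly detection (flagging a sudden entropy shift in one individual's pattern).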

  9. Experiments for Modern Introductory Chemistry.

    ERIC Educational Resources Information Center

    Kildahl, Nicholas; Berka, Ladislav H.

    1995-01-01

    Presents a headspace gas chromatography experiment that enables discovery of the temperature dependence of the vapor pressure of a pure liquid. Illustrates liquid-vapor phase equilibrium of pure liquids. Contains 22 references. (JRH)

  10. Knowledge-based public health situation awareness

    NASA Astrophysics Data System (ADS)

    Mirhaji, Parsa; Zhang, Jiajie; Srinivasan, Arunkumar; Richesson, Rachel L.; Smith, Jack W.

    2004-09-01

    There have been numerous efforts to create comprehensive databases from multiple sources to monitor the dynamics of public health and, most specifically, to detect potential bioterrorism threats before widespread dissemination. But there is little evidence for the assertion that these systems are timely and dependable, or that they can reliably distinguish man-made from natural incidents. One must weigh the value of so-called 'syndromic surveillance systems' against the costs involved in the design, development, implementation, and maintenance of such systems and the costs involved in investigating the inevitable false alarms [1]. In this article we introduce a new perspective on the problem domain, with a shift in paradigm from 'surveillance' toward 'awareness'. As we conceptualize a rather different approach to tackle the problem, we introduce a different methodology in the application of information science, computer science, cognitive science, and human-computer interaction concepts to the design and development of so-called 'public health situation awareness systems'. We share some of our design and implementation concepts for the prototype system that is under development in the Center for Biosecurity and Public Health Informatics Research at the University of Texas Health Science Center at Houston. The system is based on a knowledgebase containing ontologies, with different layers of abstraction, from multiple domains, that provide the context for information integration, knowledge discovery, interactive data mining, information visualization, information sharing, and communications. The modular design of the knowledgebase and its knowledge representation formalism enables incremental evolution of the system from a partial system to a comprehensive knowledgebase of 'public health situation awareness' as it acquires new knowledge through interactions with domain experts or automatic discovery of new knowledge.

  11. False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing.

    PubMed

    Xiao, Jian; Cao, Hongyuan; Chen, Jun

    2017-09-15

    Next generation sequencing technologies have enabled the study of the human microbiome through direct sequencing of microbial DNA, resulting in an enormous amount of microbiome sequencing data. One unique characteristic of microbiome data is the phylogenetic tree that relates all the bacterial species. Closely related bacterial species have a tendency to exhibit a similar relationship with the environment or disease. Thus, incorporating the phylogenetic tree information can potentially improve the detection power for microbiome-wide association studies, where hundreds or thousands of tests are conducted simultaneously to identify bacterial species associated with a phenotype of interest. Despite much progress in multiple testing procedures such as false discovery rate (FDR) control, methods that take into account the phylogenetic tree are largely limited. We propose a new FDR control procedure that incorporates the prior structure information and apply it to microbiome data. The proposed procedure is based on a hierarchical model, where a structure-based prior distribution is designed to utilize the phylogenetic tree. By borrowing information from neighboring bacterial species, we are able to improve the statistical power of detecting associated bacterial species while controlling the FDR at desired levels. When the phylogenetic tree is mis-specified or non-informative, our procedure achieves a similar power as traditional procedures that do not take into account the tree structure. We demonstrate the performance of our method through extensive simulations and real microbiome datasets. We identified far more alcohol-drinking associated bacterial species than traditional methods. R package StructFDR is available from CRAN. chen.jun2@mayo.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
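The StructFDR procedure itself is built on a hierarchical model with a tree-based prior; as a rough illustration of how prior structure can raise detection power, here is a minimal weighted Benjamini-Hochberg sketch in Python, where a test whose phylogenetic neighbors also show signal receives a larger weight. The p-values and weights are invented for illustration and this is not the paper's exact algorithm.

```python
def weighted_bh(pvals, weights, alpha=0.05):
    """Weighted Benjamini-Hochberg step-up: reject H_i when the weighted
    p-value p_i / w_i falls under the usual BH threshold. Weights are
    normalized to average 1 so the overall FDR budget is preserved."""
    m = len(pvals)
    mean_w = sum(weights) / m
    wp = [(p / (w / mean_w), i) for i, (p, w) in enumerate(zip(pvals, weights))]
    wp.sort()
    k = 0
    for rank, (q, _) in enumerate(wp, start=1):
        if q <= alpha * rank / m:
            k = rank  # largest rank passing the step-up criterion
    return sorted(i for _, i in wp[:k])

# Toy example: test 1 gets a large weight (e.g. its phylogenetic
# neighbors also show signal), boosting its chance of rejection.
pvals   = [0.30, 0.012, 0.04, 0.90, 0.70]
weights = [1.0, 3.0, 1.0, 0.5, 0.5]
print(weighted_bh(pvals, weights))            # → [1]
print(weighted_bh(pvals, [1.0] * 5))          # unweighted BH rejects nothing → []
```

With flat weights nothing passes at alpha = 0.05, while the tree-informed weight lets the second test through, mirroring the paper's point that borrowing strength from neighboring species improves power.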

  12. Infrared heterodyne spectroscopy. [for observation of thermal emission from astrophysical objects

    NASA Technical Reports Server (NTRS)

    Mumma, M. J.; Kostiuk, T.; Buhl, D.; Chin, G.; Zipoy, D.

    1982-01-01

    Infrared heterodyne spectroscopy is an extremely useful tool for Doppler-limited studies of atomic and molecular lines in diverse astrophysical regions. The current state of the art is reviewed, and the analysis of CO2 lines in the atmosphere of Mars is outlined. Doppler-limited observations have enabled the discovery of natural laser emission in the mesosphere of Mars and the discovery of failure of local thermodynamic equilibrium near the surface of Mars.

  13. Applying flow chemistry: methods, materials, and multistep synthesis.

    PubMed

    McQuade, D Tyler; Seeberger, Peter H

    2013-07-05

    The synthesis of complex molecules requires control over both chemical reactivity and reaction conditions. While reactivity drives the majority of chemical discovery, advances in reaction condition control have accelerated method development/discovery. Recent tools include automated synthesizers and flow reactors. In this Synopsis, we describe how flow reactors have enabled chemical advances in our groups in the areas of single-stage reactions, materials synthesis, and multistep reactions. In each section, we detail the lessons learned and propose future directions.

  14. Automated phase mapping with AgileFD and its application to light absorber discovery in the V–Mn–Nb oxide system

    DOE PAGES

    Suram, Santosh K.; Xue, Yexiang; Bai, Junwen; ...

    2016-11-21

Rapid construction of phase diagrams is a central tenet of combinatorial materials science, and accelerated materials discovery efforts are often hampered by challenges in interpreting combinatorial X-ray diffraction data sets. We address this challenge by developing AgileFD, an artificial intelligence algorithm that enables rapid phase mapping from a combinatorial library of X-ray diffraction patterns. AgileFD models alloying-based peak shifting through a novel expansion of convolutional nonnegative matrix factorization, which not only improves the identification of constituent phases but also maps their concentration and lattice parameter as a function of composition. By incorporating Gibbs’ phase rule into the algorithm, physically meaningful phase maps are obtained with unsupervised operation, and more refined solutions are attained by injecting expert knowledge of the system. The algorithm is demonstrated through investigation of the V–Mn–Nb oxide system, where decomposition of eight oxide phases, including two with substantial alloying, provides the first phase map for this pseudoternary system. This phase map enables interpretation of high-throughput band gap data, leading to the discovery of new solar light absorbers and the alloying-based tuning of the direct-allowed band gap energy of MnV2O6. Lastly, the open-source family of AgileFD algorithms can be implemented into a broad range of high-throughput workflows to accelerate materials discovery.
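The factorization at AgileFD's core can be sketched with plain multiplicative-update nonnegative matrix factorization (Lee-Seung style): each observed diffraction pattern is approximated as a nonnegative mixture of a few basis "phase" patterns. The convolutional shift terms that model alloying, and the Gibbs phase-rule constraint, are omitted here; the matrices are toy data.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, iters=500, eps=1e-9, seed=0):
    """Multiplicative-update NMF: V ≈ W H with all entries nonnegative.
    W holds per-sample phase concentrations, H the basis patterns."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        WH, Wt = matmul(W, H), transpose(W)
        num, den = matmul(Wt, V), matmul(Wt, WH)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        WH, Ht = matmul(W, H), transpose(H)
        num, den = matmul(V, Ht), matmul(WH, Ht)
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

# Two hidden "phase" basis patterns mixed into four observed patterns.
basis = [[1, 0, 2, 0], [0, 3, 0, 1]]
mix   = [[1, 0], [0, 1], [1, 1], [2, 1]]
V = matmul(mix, basis)
W, H = nmf(V, k=2)
R = matmul(W, H)
err = sum((V[i][j] - R[i][j]) ** 2 for i in range(4) for j in range(4))
print(round(err, 4))  # small residual: the two phases are recovered
```

The nonnegativity constraint is what makes the recovered basis vectors interpretable as physical phases, which is why NMF variants are popular for phase mapping.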

  15. Automated phase mapping with AgileFD and its application to light absorber discovery in the V–Mn–Nb oxide system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Suram, Santosh K.; Xue, Yexiang; Bai, Junwen

Rapid construction of phase diagrams is a central tenet of combinatorial materials science, and accelerated materials discovery efforts are often hampered by challenges in interpreting combinatorial X-ray diffraction data sets. We address this challenge by developing AgileFD, an artificial intelligence algorithm that enables rapid phase mapping from a combinatorial library of X-ray diffraction patterns. AgileFD models alloying-based peak shifting through a novel expansion of convolutional nonnegative matrix factorization, which not only improves the identification of constituent phases but also maps their concentration and lattice parameter as a function of composition. By incorporating Gibbs’ phase rule into the algorithm, physically meaningful phase maps are obtained with unsupervised operation, and more refined solutions are attained by injecting expert knowledge of the system. The algorithm is demonstrated through investigation of the V–Mn–Nb oxide system, where decomposition of eight oxide phases, including two with substantial alloying, provides the first phase map for this pseudoternary system. This phase map enables interpretation of high-throughput band gap data, leading to the discovery of new solar light absorbers and the alloying-based tuning of the direct-allowed band gap energy of MnV2O6. Lastly, the open-source family of AgileFD algorithms can be implemented into a broad range of high-throughput workflows to accelerate materials discovery.

  16. An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening

    PubMed Central

    2017-01-01

    DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing. PMID:28199790

  17. An Integrated Microfluidic Processor for DNA-Encoded Combinatorial Library Functional Screening.

    PubMed

    MacConnell, Andrew B; Price, Alexander K; Paegel, Brian M

    2017-03-13

    DNA-encoded synthesis is rekindling interest in combinatorial compound libraries for drug discovery and in technology for automated and quantitative library screening. Here, we disclose a microfluidic circuit that enables functional screens of DNA-encoded compound beads. The device carries out library bead distribution into picoliter-scale assay reagent droplets, photochemical cleavage of compound from the bead, assay incubation, laser-induced fluorescence-based assay detection, and fluorescence-activated droplet sorting to isolate hits. DNA-encoded compound beads (10-μm diameter) displaying a photocleavable positive control inhibitor pepstatin A were mixed (1920 beads, 729 encoding sequences) with negative control beads (58 000 beads, 1728 encoding sequences) and screened for cathepsin D inhibition using a biochemical enzyme activity assay. The circuit sorted 1518 hit droplets for collection following 18 min incubation over a 240 min analysis. Visual inspection of a subset of droplets (1188 droplets) yielded a 24% false discovery rate (1166 pepstatin A beads; 366 negative control beads). Using template barcoding strategies, it was possible to count hit collection beads (1863) using next-generation sequencing data. Bead-specific barcodes enabled replicate counting, and the false discovery rate was reduced to 2.6% by only considering hit-encoding sequences that were observed on >2 beads. This work represents a complete distributable small molecule discovery platform, from microfluidic miniaturized automation to ultrahigh-throughput hit deconvolution by sequencing.
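The bead-replicate filter that cut the false discovery rate from 24% to 2.6% can be sketched directly: each sequencing read pairs a compound-encoding sequence with a bead-specific barcode, and a compound is called a hit only if it was observed on more than two distinct beads. The read format and identifiers below are invented for illustration.

```python
from collections import defaultdict

def call_hits(reads, min_beads=3):
    """Collapse sequencing reads from sorted hit droplets into hit calls.
    A compound is kept only if its encoding sequence was observed on at
    least `min_beads` distinct beads (the paper keeps sequences seen on
    >2 beads), so singleton false discoveries are filtered out."""
    beads_per_compound = defaultdict(set)
    for compound_seq, bead_barcode in reads:
        beads_per_compound[compound_seq].add(bead_barcode)
    return sorted(c for c, beads in beads_per_compound.items()
                  if len(beads) >= min_beads)

reads = [
    ("pepA-017", "b1"), ("pepA-017", "b2"), ("pepA-017", "b3"),  # real hit
    ("pepA-017", "b1"),            # duplicate read of the same bead
    ("neg-4402", "b9"),            # singleton: likely a false discovery
    ("neg-5511", "b4"), ("neg-5511", "b5"),  # two beads: still filtered
]
print(call_hits(reads))  # → ['pepA-017']
```

Using a set of bead barcodes per compound is what distinguishes replicate beads from repeated reads of the same bead, which is the crux of the replicate-counting strategy.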

  18. OlyMPUS - The Ontology-based Metadata Portal for Unified Semantics

    NASA Astrophysics Data System (ADS)

    Huffer, E.; Gleason, J. L.

    2015-12-01

    The Ontology-based Metadata Portal for Unified Semantics (OlyMPUS), funded by the NASA Earth Science Technology Office Advanced Information Systems Technology program, is an end-to-end system designed to support data consumers and data providers, enabling the latter to register their data sets and provision them with the semantically rich metadata that drives the Ontology-Driven Interactive Search Environment for Earth Sciences (ODISEES). OlyMPUS leverages the semantics and reasoning capabilities of ODISEES to provide data producers with a semi-automated interface for producing the semantically rich metadata needed to support ODISEES' data discovery and access services. It integrates the ODISEES metadata search system with multiple NASA data delivery tools to enable data consumers to create customized data sets for download to their computers, or for NASA Advanced Supercomputing (NAS) facility registered users, directly to NAS storage resources for access by applications running on NAS supercomputers. A core function of NASA's Earth Science Division is research and analysis that uses the full spectrum of data products available in NASA archives. Scientists need to perform complex analyses that identify correlations and non-obvious relationships across all types of Earth System phenomena. Comprehensive analytics are hindered, however, by the fact that many Earth science data products are disparate and hard to synthesize. Variations in how data are collected, processed, gridded, and stored, create challenges for data interoperability and synthesis, which are exacerbated by the sheer volume of available data. Robust, semantically rich metadata can support tools for data discovery and facilitate machine-to-machine transactions with services such as data subsetting, regridding, and reformatting. Such capabilities are critical to enabling the research activities integral to NASA's strategic plans. 
However, as metadata requirements increase and competing standards emerge, metadata provisioning becomes increasingly burdensome to data producers. The OlyMPUS system helps data providers produce semantically rich metadata, making their data more accessible to data consumers, and helps data consumers quickly discover and download the right data for their research.

  19. New Catalog of Resources Enables Paleogeosciences Research

    NASA Astrophysics Data System (ADS)

    Lingo, R. C.; Horlick, K. A.; Anderson, D. M.

    2014-12-01

The 21st century promises a new era for scientists of all disciplines: the age in which cyberinfrastructure enables research and education and fuels discovery. EarthCube is a working community of over 2,500 scientists and students of many Earth Science disciplines who are looking to build bridges between disciplines. The EarthCube initiative will create a digital infrastructure that connects databases, software, and repositories. A catalog of resources (databases, software, repositories) has been produced by the Research Coordination Network for Paleogeosciences to improve the discoverability of resources. The Catalog is currently made available within the larger-scope CINERGI geosciences portal (http://hydro10.sdsc.edu/geoportal/catalog/main/home.page). Other distribution points and web services are planned, using linked data, content services for the web, and XML descriptions that can be harvested using metadata protocols. The databases provide searchable interfaces to find data sets that would otherwise remain dark data, hidden in drawers and on personal computers. The software will be described in catalog entries so that just one click will lead users to methods and analytical tools that many geoscientists were unaware of. The repositories listed in the Paleogeosciences Catalog contain physical samples found all across the globe, from natural history museums to the basements of university buildings. The Catalog includes over 250 databases, 300 software systems, and 200 repositories, and these numbers will grow in the coming year. When completed, geoscientists across the world will be connected in a productive workflow for managing, sharing, and exploring geoscience data and information that expedites collaboration and innovation within the paleogeosciences, potentially bringing about new interdisciplinary discoveries.

  20. ConTour: Data-Driven Exploration of Multi-Relational Datasets for Drug Discovery.

    PubMed

    Partl, Christian; Lex, Alexander; Streit, Marc; Strobelt, Hendrik; Wassermann, Anne-Mai; Pfister, Hanspeter; Schmalstieg, Dieter

    2014-12-01

    Large scale data analysis is nowadays a crucial part of drug discovery. Biologists and chemists need to quickly explore and evaluate potentially effective yet safe compounds based on many datasets that are in relationship with each other. However, there is a lack of tools that support them in these processes. To remedy this, we developed ConTour, an interactive visual analytics technique that enables the exploration of these complex, multi-relational datasets. At its core ConTour lists all items of each dataset in a column. Relationships between the columns are revealed through interaction: selecting one or multiple items in one column highlights and re-sorts the items in other columns. Filters based on relationships enable drilling down into the large data space. To identify interesting items in the first place, ConTour employs advanced sorting strategies, including strategies based on connectivity strength and uniqueness, as well as sorting based on item attributes. ConTour also introduces interactive nesting of columns, a powerful method to show the related items of a child column for each item in the parent column. Within the columns, ConTour shows rich attribute data about the items as well as information about the connection strengths to other datasets. Finally, ConTour provides a number of detail views, which can show items from multiple datasets and their associated data at the same time. We demonstrate the utility of our system in case studies conducted with a team of chemical biologists, who investigate the effects of chemical compounds on cells and need to understand the underlying mechanisms.

  1. The value of banked samples for oncology drug discovery and development.

    PubMed

    Shaw, Peter M; Patterson, Scott D

    2011-01-01

To gain insights into human biology and pathobiology, ready access to banked human tissue samples that encompass a representative cross section of the population is required. For optimal use, the banked human tissue needs to be appropriately consented, collected, annotated, and stored. If any of these elements are missing, the studies using these samples are compromised. These elements are critical whether the research is for academic or pharmaceutical industry purposes. An additional temporal element that adds enormous value to such banked samples is treatment and outcome information from the people who donated the tissue. To achieve these aims, many different groups have to work effectively together, not least of which are the individuals who donate their tissue with appropriate consent. Such research is unlikely to benefit the donors themselves, but rather others who succumb to the same disease. The development of a large accessible human tissue bank resource (National Cancer Institute's Cancer HUman Biobank [caHUB]) that provides an ongoing supply of human tissue for all working toward the common goal of understanding human health and disease has a number of advantages. These include, but are not limited to, access to a broad cross section of healthy and diseased populations, beyond what individual collections may achieve, for understanding disease pathobiology and for therapeutic target discovery, as well as a source of material for diagnostic assay validation. Models will need to be developed to enable fair access to caHUB under terms that enable appropriate intellectual property protection and ultimate data sharing to ensure that the biobank successfully distributes samples to a broad range of researchers.

  2. Robustness of disaggregate oil and gas discovery forecasting models

    USGS Publications Warehouse

    Attanasi, E.D.; Schuenemeyer, J.H.

    1989-01-01

The trend in forecasting oil and gas discoveries has been to develop and use models that allow forecasts of the size distribution of future discoveries. From such forecasts, exploration and development costs can more readily be computed. Two classes of these forecasting models are the Arps-Roberts type models and the 'creaming method' models. This paper examines the robustness of the forecasts made by these models when the historical data on which the models are based have been subject to economic upheavals or when historical discovery data are aggregated from areas having widely differing economic structures. Model performance is examined in the context of forecasting discoveries for offshore Texas State and Federal areas. The analysis shows how the model forecasts are limited by information contained in the historical discovery data. Because the Arps-Roberts type models require more regularity in discovery sequence than the creaming models, prior information had to be introduced into the Arps-Roberts models to accommodate the influence of economic changes. The creaming methods captured the overall decline in discovery size but did not easily allow introduction of exogenous information to compensate for incomplete historical data. Moreover, the predictive log normal distribution associated with the creaming model methods appears to understate the importance of the potential contribution of small fields. © 1989.
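The "creaming" idea, that successive discoveries in a play tend to shrink because the largest fields are found first, can be illustrated with a crude trend fit: regress log discovery size on discovery order and extrapolate. Real creaming models fit a predictive lognormal distribution rather than a point trend, and the field sizes below are invented.

```python
import math

def fit_creaming_trend(sizes):
    """Ordinary least squares of log(discovery size) against discovery
    order. Returns (intercept, slope) on the log scale; a negative
    slope reflects the declining-size 'creaming' pattern."""
    n = len(sizes)
    xs = list(range(1, n + 1))
    ys = [math.log(s) for s in sizes]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return ybar - slope * xbar, slope

def forecast_next(sizes, steps=1):
    """Extrapolate the fitted log-linear trend to the next discoveries."""
    a, b = fit_creaming_trend(sizes)
    n = len(sizes)
    return [math.exp(a + b * (n + k)) for k in range(1, steps + 1)]

# Toy sequence of discovered field sizes declining roughly geometrically.
sizes = [500, 320, 260, 150, 110, 80]
a, b = fit_creaming_trend(sizes)
print(b < 0)  # → True: sizes decline with discovery order
print(round(forecast_next(sizes)[0], 1))
```

The paper's caveat shows up even in this sketch: a fitted trend summarizes the historical sequence but offers no natural slot for exogenous information such as economic upheavals, which is why prior information had to be injected into the Arps-Roberts models instead.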

  3. Developing integrated crop knowledge networks to advance candidate gene discovery.

    PubMed

    Hassani-Pak, Keywan; Castellote, Martin; Esch, Maria; Hindle, Matthew; Lysenko, Artem; Taubert, Jan; Rawlings, Christopher

    2016-12-01

    The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpinned traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer having the basic information, at the gene-level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage. In this paper we present a general approach for building genome-scale knowledge networks that provide a unified representation of heterogeneous but interconnected datasets to enable effective knowledge mining and gene discovery. We describe the datasets and outline the methods, workflows and tools that we have developed for creating and visualising these networks for the major crop species, wheat and barley. We present the global characteristics of such knowledge networks and with an example linking a seed size phenotype to a barley WRKY transcription factor orthologous to TTG2 from Arabidopsis, we illustrate the value of integrated data in biological knowledge discovery. The software we have developed (www.ondex.org) and the knowledge resources (http://knetminer.rothamsted.ac.uk) we have created are all open-source and provide a first step towards systematic and evidence-based gene discovery in order to facilitate crop improvement.
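A genome-scale knowledge network of this kind is, at bottom, a typed graph, and gene discovery amounts to finding evidence paths from a phenotype to candidate genes. The sketch below loosely mirrors the paper's barley WRKY / Arabidopsis TTG2 seed-size example, but every node name, edge type, and link is invented for illustration.

```python
from collections import deque

# A miniature knowledge network as a typed edge list (all hypothetical).
edges = [
    ("trait:seed_size", "qtl:QSd.1H", "colocalizes_with"),
    ("qtl:QSd.1H", "gene:HvWRKY", "contains"),
    ("gene:HvWRKY", "gene:AtTTG2", "ortholog_of"),
    ("gene:AtTTG2", "pub:PMID123", "described_in"),
]

graph = {}
for a, b, rel in edges:
    graph.setdefault(a, []).append((b, rel))
    graph.setdefault(b, []).append((a, rel))

def evidence_path(start, goal):
    """Breadth-first search returning the chain of typed links that
    connects a phenotype to a candidate gene: the kind of 'evidence
    path' a knowledge-mining tool would display to a biologist."""
    queue, seen = deque([(start, [start])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, rel in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [f"--{rel}--", nxt]))
    return None  # no connecting evidence in the network

print(evidence_path("trait:seed_size", "gene:AtTTG2"))
```

Integrating heterogeneous sources into one graph is what makes such traversals possible: each hop may come from a different database, but the unified representation lets a single query cross them all.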

  4. A green protocol for efficient discovery of novel natural compounds: characterization of new ginsenosides from the stems and leaves of Panax ginseng as a case study.

    PubMed

    Qiu, Shi; Yang, Wen-Zhi; Shi, Xiao-Jian; Yao, Chang-Liang; Yang, Min; Liu, Xuan; Jiang, Bao-Hong; Wu, Wan-Ying; Guo, De-An

    2015-09-17

Exploration of new natural compounds is of vital significance for drug discovery and development. The conventional approaches based on systematic phytochemical isolation are inefficient and consume large volumes of organic solvent. This study presents an integrated strategy that combines offline comprehensive two-dimensional liquid chromatography, hybrid linear ion-trap/Orbitrap mass spectrometry, and NMR analysis (2D LC/LTQ-Orbitrap-MS/NMR), aiming to establish a green protocol for the efficient discovery of new natural molecules. A comprehensive chemical analysis of the total ginsenosides of stems and leaves of Panax ginseng (SLP), a cardiovascular disease medicine, was performed following this strategy. An offline 2D LC system was constructed with an orthogonality of 0.79 and a practical peak capacity of 11,000. The much greener UHPLC separation and LTQ-Orbitrap-MS detection by data-dependent high-energy C-trap dissociation (HCD)/dynamic exclusion were employed for separation and characterization of ginsenosides from thirteen fractionated SLP samples. Consequently, a total of 646 ginsenosides were characterized, 427 of which had not previously been isolated from the genus Panax L. The ginsenosides identified from SLP exhibited distinct sapogenin diversity and molecular isomerism. NMR analysis was finally employed to verify and offer complementary structural information to the MS-oriented characterization. The established 2D LC/LTQ-Orbitrap-MS/NMR approach outperforms the conventional approaches in terms of significantly improved efficiency and much lower consumption of drug material and organic solvent. The integrated strategy enables a deep investigation of the therapeutic basis of an herbal medicine, and facilitates new compound discovery in an efficient and environmentally friendly manner as well. Copyright © 2015 Elsevier B.V. All rights reserved.

  5. Scientific workflows as productivity tools for drug discovery.

    PubMed

    Shon, John; Ohkawa, Hitomi; Hammer, Juergen

    2008-05-01

    Large pharmaceutical companies annually invest tens to hundreds of millions of US dollars in research informatics to support their early drug discovery processes. Traditionally, most of these investments are designed to increase the efficiency of drug discovery. The introduction of do-it-yourself scientific workflow platforms has enabled research informatics organizations to shift their efforts toward scientific innovation, ultimately resulting in a possible increase in return on their investments. Unlike the handling of most scientific data and application integration approaches, researchers apply scientific workflows to in silico experimentation and exploration, leading to scientific discoveries that lie beyond automation and integration. This review highlights some key requirements for scientific workflow environments in the pharmaceutical industry that are necessary for increasing research productivity. Examples of the application of scientific workflows in research and a summary of recent platform advances are also provided.
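The essence of a do-it-yourself scientific workflow platform is composing named steps into a pipeline that a researcher can rearrange without writing integration code. The sketch below shows that composition pattern in Python; the step names, compound data, and property cutoff are illustrative, not any particular platform's API.

```python
from functools import reduce

def make_pipeline(*steps):
    """Chain workflow steps: each step takes the previous step's output."""
    def run(data):
        return reduce(lambda d, step: step(d), steps, data)
    return run

def load_compounds(_):
    # Stand-in for a data-source node; values are invented.
    return [{"id": "cpd1", "logp": 1.2}, {"id": "cpd2", "logp": 6.8},
            {"id": "cpd3", "logp": 3.1}]

def filter_druglike(compounds, logp_max=5.0):
    # Stand-in for a Lipinski-style property filter node.
    return [c for c in compounds if c["logp"] <= logp_max]

def rank_by_logp(compounds):
    # Stand-in for a sorting/ranking node.
    return sorted(compounds, key=lambda c: c["logp"])

screen = make_pipeline(load_compounds, filter_druglike, rank_by_logp)
print([c["id"] for c in screen(None)])  # → ['cpd1', 'cpd3']
```

Swapping `rank_by_logp` for a different scoring node, without touching the other steps, is the kind of in silico experimentation the review argues moves informatics effort from integration toward innovation.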

  6. A Location Aware Middleware Framework for Collaborative Visual Information Discovery and Retrieval

    DTIC Science & Technology

    2017-09-14

Compton, Andrew J.M. Thesis, Air Force Institute of Technology; available through AFIT Scholar (https://scholar.afit.edu/etd). For more information, please contact richard.mansfield@afit.edu.

  7. Advances in microfluidics for drug discovery.

    PubMed

    Lombardi, Dario; Dittrich, Petra S

    2010-11-01

Microfluidics is considered an enabling technology for the development of unconventional and innovative methods in the drug discovery process. The concept of micrometer-sized reaction systems in the form of continuous flow reactors, microdroplets or microchambers is intriguing, and the versatility of the technology perfectly fits the requirements of drug synthesis, drug screening and drug testing. In this review article, we introduce key microfluidic approaches to the drug discovery process, highlighting the latest and most promising achievements in this field, mainly from the years 2007 - 2010. Despite high expectations of microfluidic approaches to several stages of the drug discovery process, up to now microfluidic technology has not been able to significantly replace conventional drug discovery platforms. Our aim is to identify the bottlenecks that have impeded the transfer of microfluidics into routine platforms for drug discovery and to show some recent solutions to overcome these hurdles. Although most microfluidic approaches are still applied only in proof-of-concept studies, creative microfluidic research in the past years has demonstrated unprecedented novel capabilities of microdevices, and generally applicable, robust and reliable microfluidic platforms seem to be within reach.

  8. Recent advances in inkjet dispensing technologies: applications in drug discovery.

    PubMed

    Zhu, Xiangcheng; Zheng, Qiang; Yang, Hu; Cai, Jin; Huang, Lei; Duan, Yanwen; Xu, Zhinan; Cen, Peilin

    2012-09-01

Inkjet dispensing technology is a promising fabrication methodology widely applied in drug discovery. Its automated programmable characteristics and high-throughput efficiency make this approach potentially very useful for miniaturizing the design patterns for assays and drug screening. Various custom-made inkjet dispensing systems as well as specialized bio-inks and substrates have been developed and applied to fulfill the increasing demands of basic drug discovery studies. The incorporation of other modern technologies has further exploited the potential of inkjet dispensing technology in drug discovery and development. This paper reviews and discusses the recent developments and practical applications of inkjet dispensing technology in several areas of drug discovery and development, including fundamental assays of cells and proteins, microarrays, biosensors, tissue engineering, and basic biological and pharmaceutical studies. Progress in a number of research areas, including biomaterials, inkjet mechanical systems and modern analytical techniques, together with the exploration and accumulation of profound biological knowledge, has enabled different inkjet dispensing technologies to be developed and adapted for high-throughput pattern fabrication and miniaturization. This in turn presents a great opportunity to propel inkjet dispensing technology into drug discovery.

  9. An integrative model for in-silico clinical-genomics discovery science.

    PubMed

    Lussier, Yves A; Sarkar, Indra Nell; Cantor, Michael

    2002-01-01

Human Genome discovery research has set the pace for Post-Genomic Discovery Research. While post-genomic fields focused at the molecular level are intensively pursued, little effort is being deployed in the later stages of molecular medicine discovery research, such as clinical-genomics. The objective of this study is to demonstrate the relevance and significance of integrating mainstream clinical informatics decision support systems with current bioinformatics genomic discovery science. This paper presents an original model enabling novel "in-silico" clinical-genomic discovery science and demonstrates its feasibility. The model is designed to mediate queries among clinical and genomic knowledge bases with relevant bioinformatic analytic tools (e.g. gene clustering). Briefly, trait-disease-gene relationships were successfully illustrated using QMR, OMIM, SNOMED-RT, GeneCluster and TreeView. The analyses were visualized as two-dimensional dendrograms of clinical observations clustered around genes. To our knowledge, this is the first study using knowledge bases of clinical decision support systems for genomic discovery. Although this study is a proof of principle, it provides a framework for the development of clinical decision-support-system driven, high-throughput clinical-genomic technologies which could potentially unveil significant high-level functions of genes.
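The clustering step, grouping genes by the similarity of the clinical observations linked to them, can be sketched with a simple set-overlap measure. The study used hierarchical clustering via GeneCluster/TreeView; plain Jaccard similarity stands in for that here, and the trait-gene links below are invented for illustration.

```python
# Each gene is described by the set of clinical observations linked to
# it in the knowledge bases (all associations here are hypothetical).
gene_traits = {
    "geneA": {"retinopathy", "deafness", "ataxia"},
    "geneB": {"retinopathy", "deafness"},
    "geneC": {"hyperglycemia", "polyuria"},
}

def jaccard(s, t):
    """Overlap of two clinical profiles: |intersection| / |union|."""
    return len(s & t) / len(s | t)

def most_similar_pair(profiles):
    """Return the pair of genes with the most similar clinical
    profiles, the pair a dendrogram would merge first."""
    genes = sorted(profiles)
    pairs = [(jaccard(profiles[a], profiles[b]), a, b)
             for i, a in enumerate(genes) for b in genes[i + 1:]]
    best = max(pairs)
    return best[1], best[2], round(best[0], 3)

print(most_similar_pair(gene_traits))  # → ('geneA', 'geneB', 0.667)
```

Iterating this merge step, and recomputing similarities against merged groups, yields exactly the kind of dendrogram of clinical observations clustered around genes that the study visualized.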

  10. Strengthening the Cancer Research Enterprise - Annual Plan

    Cancer.gov

NCI's expanding infrastructure, support for scientists at every career stage, and funding of small business innovation enable discoveries that advance cancer research. Read more about how NCI is strengthening the cancer research enterprise.

  11. Science Discoveries Enabled by Hosting Optical Imagers on Commercial Satellite Constellations

    NASA Astrophysics Data System (ADS)

    Erlandson, R. E.; Kelly, M. A.; Hibbitts, C.; Kumar, C.; Dyrud, L. P.

    2012-12-01

The advent of commercial space activities that utilize large space-based constellations provides a new and cost-effective opportunity to acquire multi-point observations. Previously, a custom-designed space-based constellation, while technically feasible, would have required a substantial monetary investment. However, commercial industry has now been entertaining the concept of hosting payloads on its space-based constellations, resulting in low-cost access to space. Examples include the low Earth orbit Iridium Next constellation as well as communication satellites in geostationary orbit. In some of these constellations, data distribution can be provided in real time, a feature relevant to applications in space weather and disaster monitoring. From the perspective of new scientific discoveries enabled by low-cost access to space, the cost, and thus the value proposition, is dramatically changed. For example, a constellation of sixty-six satellites (Iridium Next), each hosting a single-band or multi-spectral imager, can now provide observations of the aurora with a spatial resolution of a few hundred meters at all local times and in both hemispheres simultaneously. Remote sensing of clouds is another example, where it is now possible to acquire global imagery at resolutions between 100-1000 m. Finally, land-use imagery is an example where either imaging or spectrographic imagers can be used to solve a multitude of problems. In this work, we discuss measurement architectures and the multi-disciplinary scientific discoveries that are enabled by large space-based constellations.

  12. Enabling a new Paradigm to Address Big Data and Open Science Challenges

    NASA Astrophysics Data System (ADS)

    Ramamurthy, Mohan; Fisher, Ward

    2017-04-01

    Data are not only the lifeblood of the geosciences; they have become the currency of the modern world in science and society. Rapid advances in computing, communications, and observational technologies, along with concomitant advances in high-resolution modeling and ensemble and coupled-systems predictions of the Earth system, are revolutionizing nearly every aspect of our field. Modern data volumes from high-resolution ensemble prediction/projection/simulation systems and next-generation remote-sensing systems like hyperspectral satellite sensors and phased-array radars are staggering. For example, CMIP efforts alone will generate many petabytes of climate projection data for use in assessments of climate change, and NOAA's National Climatic Data Center projects that it will archive over 350 petabytes by 2030. For researchers and educators, this deluge and the increasing complexity of data bring challenges along with opportunities for discovery and scientific breakthroughs. The potential for big data to transform the geosciences is enormous, but realizing the next frontier depends on effectively managing, analyzing, and exploiting these heterogeneous data sources, extracting knowledge and useful information in ways that were previously impossible, to enable discoveries and gain new insights. At the same time, there is a growing focus on "Reproducibility or Replicability in Science" that has implications for Open Science. The advent of cloud computing has opened new avenues for addressing both big data and Open Science challenges and for accelerating scientific discoveries. However, to successfully leverage the enormous potential of cloud technologies, data providers and the scientific communities will need to develop new paradigms to enable next-generation workflows and transform the conduct of science. Making data readily available is a necessary but not a sufficient condition.
Data providers also need to give scientists an ecosystem that includes the data, tools, workflows, and other services needed to perform analytics, integration, interpretation, and synthesis, all in the same environment or platform. Instead of moving data to processing systems near users, as is the tradition, the cloud permits one to bring processing, computing, analysis, and visualization to the data: so-called data-proximate workbench capabilities, also known as server-side processing. In this talk, I will present the ongoing work at Unidata to facilitate a new paradigm for doing science by offering a suite of tools, resources, and platforms that leverage cloud technologies to address both big data and Open Science/reproducibility challenges. That work includes the development and deployment of new protocols for data access and server-side operations, Docker container images of key applications, JupyterHub Python notebook tools, and cloud-based analysis and visualization capability via the CloudIDV tool to enable reproducible workflows and effective use of the accessed data.
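The server-side, data-proximate idea above can be sketched minimally: rather than downloading a whole array, a client requests only the hyperslab it needs. The endpoint and variable name below are hypothetical, and the bracketed index ranges merely mimic OPeNDAP-style constraint expressions.

```python
# Sketch of data-proximate access: ask the server for a subset instead
# of downloading the full dataset. Endpoint and variable name are
# hypothetical; the bracketed ranges mimic OPeNDAP constraint syntax.

def subset_url(base, var, t, y, x):
    """Build a request URL for one hyperslab of `var` (index ranges)."""
    return f"{base}?{var}[{t[0]}:{t[1]}][{y[0]}:{y[1]}][{x[0]}:{x[1]}]"

url = subset_url("https://example.org/dap/airtemp", "T",
                 (0, 0), (100, 200), (300, 400))
print(url)  # https://example.org/dap/airtemp?T[0:0][100:200][300:400]
```

The point of the sketch is only that the subsetting happens on the server, next to the data, so the client moves kilobytes instead of petabytes.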

  13. Hypothesis driven drug design: improving quality and effectiveness of the design-make-test-analyse cycle.

    PubMed

    Plowright, Alleyn T; Johnstone, Craig; Kihlberg, Jan; Pettersson, Jonas; Robb, Graeme; Thompson, Richard A

    2012-01-01

    In drug discovery, the central process of constructing and testing hypotheses, carefully conducting experiments and analysing the associated data for new findings and information is known as the design-make-test-analyse cycle. Each step relies heavily on the inputs and outputs of the other three components. In this article we report our efforts to improve and integrate all parts to enable smooth and rapid flow of high quality ideas. Key improvements include enhancing multi-disciplinary input into 'Design', increasing the use of knowledge and reducing cycle times in 'Make', providing parallel sets of relevant data within ten working days in 'Test' and maximising the learning in 'Analyse'. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. Mitochondrial transcription in mammalian cells

    PubMed Central

    Shokolenko, Inna N.; Alexeyev, Mikhail F.

    2017-01-01

    As a consequence of recent discoveries of intimate involvement of mitochondria with key cellular processes, there has been a resurgence of interest in all aspects of mitochondrial biology, including the intricate mechanisms of mitochondrial DNA maintenance and expression. Despite four decades of research, there remains a lot to be learned about the processes that enable transcription of genetic information from mitochondrial DNA to RNA, as well as their regulation. These processes are vitally important, as evidenced by the lethality of inactivating the central components of mitochondrial transcription machinery. Here, we review the current understanding of mitochondrial transcription and its regulation in mammalian cells. We also discuss key theories in the field and highlight controversial subjects and future directions as we see them. PMID:27814650

  15. Giovanni - The Bridge Between Data and Science

    NASA Technical Reports Server (NTRS)

    Liu, Zhong; Acker, James

    2017-01-01

    This article describes new features in the Geospatial Interactive Online Visualization ANd aNalysis Infrastructure (Giovanni), a user-friendly online tool that enables visualization, analysis, and assessment of NASA Earth science data sets without downloading data and software. Since the satellite era began, data collected from Earth-observing satellites have been widely used in research and applications; however, using satellite-based data sets can still be a challenge to many. To facilitate data access and evaluation, as well as scientific exploration and discovery, the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) has developed Giovanni for a wide range of users around the world. This article describes the latest capabilities of Giovanni with examples, and discusses future plans for this innovative system.

  16. Enabling outsourcing XDS for imaging on the public cloud.

    PubMed

    Ribeiro, Luís S; Rodrigues, Renato P; Costa, Carlos; Oliveira, José Luís

    2013-01-01

    Picture Archiving and Communication System (PACS) has been the main paradigm supporting medical imaging workflows during the last decades. Despite its consolidation, the appearance of Cross-Enterprise Document Sharing for imaging (XDS-I), within the IHE initiative, constitutes a great opportunity to readapt PACS workflows for inter-institutional data exchange. XDS-I provides centralized discovery of medical imaging and associated reports. However, the centralized XDS-I actors (document registry and repository) must be deployed in a trustworthy node in order to safeguard patient privacy, data confidentiality, and integrity. This paper presents XDS for Protected Imaging (XDS-p), a new approach to XDS-I that is capable of being outsourced (e.g., to cloud computing) while addressing privacy, confidentiality, integrity, and legal concerns about patients' medical information.

  17. Using Controlled Vocabularies and Semantics to Improve Ocean Data Discovery (Invited)

    NASA Astrophysics Data System (ADS)

    Chandler, C. L.; Groman, R. C.; Shepherd, A.; Allison, M. D.; Kinkade, D.; Rauch, S.; Wiebe, P. H.; Glover, D. M.

    2013-12-01

    The Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created in late 2006 by combining the formerly independent data management offices for the U.S. GLOBal Ocean ECosystems Dynamics (GLOBEC) and U.S. Joint Global Ocean Flux Study (JGOFS) programs. BCO-DMO staff members work with investigators to publish data from research projects funded by the NSF Geosciences Directorate (GEO) Division of Ocean Sciences (OCE) Biological and Chemical Oceanography Sections and Polar Programs (PLR) Antarctic Sciences Organisms & Ecosystems Program (ANT). Since 2006, researchers have been contributing new data to the BCO-DMO data system. As the data from new research efforts have been added to the data previously shared by U.S. GLOBEC and U.S. JGOFS researchers, the BCO-DMO system has developed into a rich repository of data from ocean, coastal, and Great Lakes research programs. The metadata records for the original research program data (prior to 2006) were stored in human-readable flat files of text, translated on demand to Web-retrievable files. Beginning in 2006, the metadata records from multiple data systems managed by BCO-DMO were ingested into a relational database (MySQL). Since that time, efforts have been made to incorporate lists of controlled vocabulary terms for key information concepts stored in the MySQL database (e.g. names of research programs, deployments, instruments and measurements). This presents a challenge for a data system that includes legacy data and is continually expanding with the addition of new contributions. Over the years, BCO-DMO has developed a series of data delivery systems driven by the supporting metadata. Improved access to research data, a primary goal of the BCO-DMO project, is achieved through geospatial and text-based data access systems that support data discovery, access, display, assessment, integration, and export of data resources.
The addition of a semantically-enabled search capability improves data discovery options particularly for those investigators whose research interests are cross-domain and multi-disciplinary. Current efforts by BCO-DMO staff members are focused on identifying globally unique, persistent identifiers to unambiguously identify resources of interest curated by and available from BCO-DMO. The process involves several essential components: (1) identifying a trusted authoritative source of complementary content and the appropriate contact; (2) determining the globally unique, persistent identifier system for resources of interest and (3) negotiating the requisite syntactic and semantic exchange systems. A variety of technologies have been deployed including: (1) controlled vocabulary term lists for some of the essential concepts/classes; (2) the Ocean Data Ontology; (3) publishing content as Linked Open Data and (4) SPARQL queries and inference. The final results are emerging as a semantic layer comprising domain-specific controlled vocabularies typed to community standard definitions, an ontology with the concepts and relationships needed to describe ocean data, a semantically-enabled faceted search, and inferencing services. We are exploring use of these technologies to improve the accuracy of the BCO-DMO data collection and to facilitate exchange of information with complementary ocean data repositories. Integrating a semantic layer into the BCO-DMO data system architecture improves data and information resource discovery, access and integration.
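The faceted search over controlled-vocabulary terms described above can be sketched in a few lines. The facet names and dataset records here are invented for illustration and are not BCO-DMO's actual schema.

```python
# Minimal faceted-search sketch: filter dataset records by
# controlled-vocabulary facet values (hypothetical facets/records).

def faceted_search(records, **facets):
    """Return records matching every requested facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in facets.items())]

datasets = [
    {"id": "ds1", "instrument": "CTD", "program": "GLOBEC"},
    {"id": "ds2", "instrument": "CTD", "program": "JGOFS"},
    {"id": "ds3", "instrument": "fluorometer", "program": "GLOBEC"},
]

hits = faceted_search(datasets, instrument="CTD", program="GLOBEC")
print([r["id"] for r in hits])  # ['ds1']
```

The semantic layer in the abstract goes much further (typed vocabularies, an ontology, inference over Linked Open Data), but the user-facing behavior is this kind of conjunctive filtering over shared terms.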

  18. Microfluidic-based mini-metagenomics enables discovery of novel microbial lineages from complex environmental samples

    PubMed Central

    Yu, Feiqiao Brian; Blainey, Paul C; Schulz, Frederik; Woyke, Tanja; Horowitz, Mark A; Quake, Stephen R

    2017-01-01

    Metagenomics and single-cell genomics have enabled genome discovery from unknown branches of life. However, extracting novel genomes from complex mixtures of metagenomic data can still be challenging and represents an ill-posed problem which is generally approached with ad hoc methods. Here we present a microfluidic-based mini-metagenomic method which offers a statistically rigorous approach to extract novel microbial genomes while preserving single-cell resolution. We used this approach to analyze two hot spring samples from Yellowstone National Park and extracted 29 new genomes, including three deeply branching lineages. The single-cell resolution enabled accurate quantification of genome function and abundance, down to 1% in relative abundance. Our analyses of genome level SNP distributions also revealed low to moderate environmental selection. The scale, resolution, and statistical power of microfluidic-based mini-metagenomics make it a powerful tool to dissect the genomic structure of microbial communities while effectively preserving the fundamental unit of biology, the single cell. DOI: http://dx.doi.org/10.7554/eLife.26580.001 PMID:28678007
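The grouping idea behind the method can be caricatured as follows: contigs that co-occur across the same microfluidic sub-samples are binned together, on the assumption that they derive from the same genome. This toy sketch ignores the paper's statistical treatment entirely.

```python
# Toy sketch of mini-metagenomic binning (illustrative only): group
# contigs by their presence/absence pattern across sub-samples.
from collections import defaultdict

def bin_contigs(presence):
    """presence: {contig: tuple of 0/1 across sub-samples} -> bins."""
    bins = defaultdict(list)
    for contig, pattern in presence.items():
        bins[pattern].append(contig)
    return list(bins.values())

presence = {
    "c1": (1, 0, 1, 0), "c2": (1, 0, 1, 0),   # same pattern -> same bin
    "c3": (0, 1, 0, 1),
}
print(bin_contigs(presence))  # [['c1', 'c2'], ['c3']]
```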

  19. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions.

    PubMed

    Evangelista, Wilfredo; Weir, Rebecca L; Ellingson, Sally R; Harris, Jason B; Kapoor, Karan; Smith, Jeremy C; Baudry, Jerome

    2016-10-15

    This paper describes and illustrates the use of ensemble-based docking, i.e., using a collection of protein structures in docking calculations for hit discovery, the exploration of biochemical pathways and toxicity prediction of drug candidates. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials. Copyright © 2016 Elsevier Ltd. All rights reserved.
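The core aggregation step of ensemble-based docking can be sketched as follows. The toy scoring table stands in for a real docking engine, and the "best = minimum score" rule is one common convention rather than necessarily the authors' exact protocol.

```python
# Sketch of ensemble-based scoring (illustrative, not the authors'
# code): dock each ligand against every structure in a conformational
# ensemble and keep the best (lowest) score, so hits missed by one
# conformation can still be recovered by another.

def ensemble_score(dock, ligand, ensemble):
    """Best (minimum) docking score of `ligand` over all structures."""
    return min(dock(ligand, s) for s in ensemble)

# Toy scoring function standing in for a real docking engine.
toy_scores = {("ligA", "s1"): -5.0, ("ligA", "s2"): -8.2,
              ("ligB", "s1"): -6.1, ("ligB", "s2"): -4.3}
dock = lambda lig, s: toy_scores[(lig, s)]

print(ensemble_score(dock, "ligA", ["s1", "s2"]))  # -8.2
```

This is why the ensemble increases hit diversity: ligA would look mediocre against structure s1 alone but is recovered by s2.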

  20. Epigenome-wide association studies for cancer biomarker discovery in circulating cell-free DNA: technical advances and challenges.

    PubMed

    Tanić, Miljana; Beck, Stephan

    2017-02-01

    Since the concept of epigenome-wide association studies (EWAS) was introduced in 2011, there has been a vast increase in the number of published EWAS in common diseases, including cancer. These studies have increased our understanding of the epigenetic events underlying carcinogenesis and have enabled the discovery of cancer-specific methylation biomarkers. In this mini-review, we focus on the state of the art in EWAS applied to cell-free circulating DNA for epigenetic biomarker discovery in cancer, discuss the associated technical advances and challenges, and outline our expectations for the future of the field. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.

  1. Semantically Enabling Knowledge Representation of Metamorphic Petrology Data

    NASA Astrophysics Data System (ADS)

    West, P.; Fox, P. A.; Spear, F. S.; Adali, S.; Nguyen, C.; Hallett, B. W.; Horkley, L. K.

    2012-12-01

    More and more metamorphic petrology data is being collected around the world, and is now being organized together into different virtual data portals by means of virtual organizations. For example, there is the virtual data portal Petrological Database (PetDB, http://www.petdb.org) of the Ocean Floor that is organizing scientific information about geochemical data of ocean floor igneous and metamorphic rocks; and also The Metamorphic Petrology Database (MetPetDB, http://metpetdb.rpi.edu) that is being created by a global community of metamorphic petrologists in collaboration with software engineers and data managers at Rensselaer Polytechnic Institute. The current focus is to provide the ability for scientists and researchers to register their data and search the databases for information regarding sample collections. What we present here is the next step in evolution of the MetPetDB portal, utilizing semantically enabled features such as discovery, data casting, faceted search, knowledge representation, and linked data as well as organizing information about the community and collaboration within the virtual community itself. We take the information that is currently represented in a relational database and make it available through web services, SPARQL endpoints, semantic and triple-stores where inferencing is enabled. We will be leveraging research that has taken place in virtual observatories, such as the Virtual Solar Terrestrial Observatory (VSTO) and the Biological and Chemical Oceanography Data Management Office (BCO-DMO); vocabulary work done in various communities such as Observations and Measurements (ISO 19156), FOAF (Friend of a Friend), Bibo (Bibliography Ontology), and domain specific ontologies; enabling provenance traces of samples and subsamples using the different provenance ontologies; and providing the much needed linking of data from the various research organizations into a common, collaborative virtual observatory. 
In addition to better representing and presenting the actual data, we also look to organize and represent the knowledge and expertise behind the data. Domain experts hold a great deal of knowledge in their minds, in their presentations and publications, and elsewhere. This is not only a technical issue but also a social one, in that we need to encourage domain experts to share their knowledge in a way that can be searched and queried. With this additional focus, MetPetDB can be used more efficiently by other domain experts, and it can also be utilized by non-specialists, both to educate people about the importance of the work being done and to help develop future domain experts.

  2. UCSF Small Molecule Discovery Center: innovation, collaboration and chemical biology in the Bay Area.

    PubMed

    Arkin, Michelle R; Ang, Kenny K H; Chen, Steven; Davies, Julia; Merron, Connie; Tang, Yinyan; Wilson, Christopher G M; Renslo, Adam R

    2014-05-01

    The Small Molecule Discovery Center (SMDC) at the University of California, San Francisco, works collaboratively with the scientific community to solve challenging problems in chemical biology and drug discovery. The SMDC includes a high-throughput screening facility, medicinal chemistry, and research labs focused on fundamental problems in biochemistry and targeted drug delivery. Here, we outline our HTS program and provide examples of chemical tools developed through SMDC collaborations. We have an active research program in developing quantitative cell-based screens for primary cells and whole organisms; here, we describe whole-organism screens to find drugs against parasites that cause neglected tropical diseases. We are also very interested in target-based approaches for so-called "undruggable" protein classes and in fragment-based lead discovery. This expertise has led to several pharmaceutical collaborations; additionally, the SMDC works with start-up companies to enable their early-stage research. The SMDC, located in the biotech-focused Mission Bay neighborhood of San Francisco, is a hub for innovative small-molecule discovery research at UCSF.

  3. Refractive Index of Alkali Halides and Its Wavelength and Temperature Derivatives.

    DTIC Science & Technology

    1975-05-01

    ... of CoBr ... Comparison of Dispersion Equations Proposed for CsBr ... Recommended Values on the Refractive Index and Its ... discovery of empirical relationships which enable us to calculate dn/dT data at 293 K for some materials on which no data are available. In the data ... or in handbooks. In the present work, however, this problem was solved by our empirical discoveries, by which the unknown parameters of Eq. (19) for ...

  4. Functional Metagenomics: Construction and High-Throughput Screening of Fosmid Libraries for Discovery of Novel Carbohydrate-Active Enzymes.

    PubMed

    Ufarté, Lisa; Bozonnet, Sophie; Laville, Elisabeth; Cecchini, Davide A; Pizzut-Serin, Sandra; Jacquiod, Samuel; Demanèche, Sandrine; Simonet, Pascal; Franqueville, Laure; Veronese, Gabrielle Potocki

    2016-01-01

    Activity-based metagenomics is one of the most efficient approaches to boost the discovery of novel biocatalysts from the huge reservoir of uncultivated bacteria. In this chapter, we describe a highly generic procedure of metagenomic library construction and high-throughput screening for carbohydrate-active enzymes. Applicable to any bacterial ecosystem, it enables the swift identification of functional enzymes that are highly efficient, alone or acting in synergy, to break down polysaccharides and oligosaccharides.

  5. Discovery of a diazo-forming enzyme in cremeomycin biosynthesis.

    PubMed

    Waldman, Abraham J; Balskus, Emily P

    2018-05-17

    The molecular architectures and potent bioactivities of diazo-containing natural products have attracted the interest of synthetic and biological chemists. Despite this attention, the biosynthetic enzymes involved in diazo group construction have not been identified. Here, we show that the ATP-dependent enzyme CreM installs the diazo group in cremeomycin via late-stage N-N bond formation using nitrite. This finding should inspire efforts to use diazo-forming enzymes in biocatalysis and synthetic biology and enable genome-based discovery of new diazo-containing metabolites.

  6. On the Discovery of Evolving Truth

    PubMed Central

    Li, Yaliang; Li, Qi; Gao, Jing; Su, Lu; Zhao, Bo; Fan, Wei; Han, Jiawei

    2015-01-01

    In the era of big data, information regarding the same objects can be collected from increasingly more sources. Unfortunately, there usually exist conflicts among the information coming from different sources. To tackle this challenge, truth discovery, i.e., to integrate multi-source noisy information by estimating the reliability of each source, has emerged as a hot topic. In many real world applications, however, the information may come sequentially, and as a consequence, the truth of objects as well as the reliability of sources may be dynamically evolving. Existing truth discovery methods, unfortunately, cannot handle such scenarios. To address this problem, we investigate the temporal relations among both object truths and source reliability, and propose an incremental truth discovery framework that can dynamically update object truths and source weights upon the arrival of new data. Theoretical analysis is provided to show that the proposed method is guaranteed to converge at a fast rate. The experiments on three real world applications and a set of synthetic data demonstrate the advantages of the proposed method over state-of-the-art truth discovery methods. PMID:26705502
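A minimal sketch of the incremental idea, not the paper's actual algorithm: keep a reliability weight per source, take the weighted majority as the current truth, and nudge each source's weight toward its observed agreement as new data arrive.

```python
# Minimal incremental truth-discovery sketch (illustrative only):
# weighted-majority truth estimation with online source-weight updates.
from collections import defaultdict

def update(claims, weights, rate=0.3):
    """claims: {source: value} for one object at one time step.
    Returns the estimated truth; mutates `weights` in place."""
    votes = defaultdict(float)
    for src, val in claims.items():
        votes[val] += weights.get(src, 1.0)      # unseen sources start at 1.0
    truth = max(votes, key=votes.get)            # weighted majority
    for src, val in claims.items():              # move weight toward agreement
        agree = 1.0 if val == truth else 0.0
        weights[src] = (1 - rate) * weights.get(src, 1.0) + rate * agree
    return truth

weights = {}
print(update({"s1": "A", "s2": "A", "s3": "B"}, weights))  # A
print(weights["s3"] < weights["s1"])                       # True
```

Because weights persist across calls, a source that is repeatedly wrong loses influence over later truth estimates, which is the evolving behavior the abstract describes.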

  7. Toward Routine Automatic Pathway Discovery from On-line Scientific Text Abstracts.

    PubMed

    Ng; Wong

    1999-01-01

    We are entering a new era of research where the latest scientific discoveries are often first reported online and are readily accessible by scientists worldwide. This rapid electronic dissemination of research breakthroughs has greatly accelerated the current pace of genomics and proteomics research. The race to the discovery of a gene or a drug has become increasingly dependent on how quickly a scientist can scan through the voluminous amount of information available online to construct the relevant picture (such as protein-protein interaction pathways) as it takes shape amongst the rapidly expanding pool of globally accessible biological data (e.g. GENBANK) and scientific literature (e.g. MEDLINE). We describe a prototype system for automatic pathway discovery from online text abstracts, combining technologies that (1) retrieve research abstracts from online sources, (2) extract relevant information from the free texts, and (3) present the extracted information graphically and intuitively. Our work demonstrates that this framework allows us to routinely scan online scientific literature for automatic discovery of knowledge, giving modern scientists the necessary competitive edge in managing the information explosion in this electronic age.
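Step (2), information extraction from free text, can be caricatured with a single regular expression; real systems use far richer NLP, and the pattern and gene names below are purely illustrative.

```python
# Toy sketch of the extraction step: pull "X interacts with Y" /
# "X binds Y" assertions out of abstract text with a regex.
import re

PATTERN = re.compile(r"(\w+) (?:interacts with|binds) (\w+)")

def extract_interactions(text):
    """Return a list of (protein, protein) pairs found in `text`."""
    return PATTERN.findall(text)

abstract = "RAD51 interacts with BRCA2. TP53 binds MDM2."
print(extract_interactions(abstract))
# [('RAD51', 'BRCA2'), ('TP53', 'MDM2')]
```

Accumulating such pairs across many retrieved abstracts yields the interaction graph that step (3) would then render.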

  8. 34 CFR 81.16 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 34 Education 1 2014-07-01 2014-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  9. 7 CFR 283.12 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 4 2011-01-01 2011-01-01 false Discovery. 283.12 Section 283.12 Agriculture... of $50,000 or More § 283.12 Discovery. (a) Dispositions—(1) Motion for taking deposition. Only upon a... exist if the information sought appears reasonably calculated to lead to the discovery of admissible...

  10. 7 CFR 283.12 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 4 2012-01-01 2012-01-01 false Discovery. 283.12 Section 283.12 Agriculture... of $50,000 or More § 283.12 Discovery. (a) Dispositions—(1) Motion for taking deposition. Only upon a... exist if the information sought appears reasonably calculated to lead to the discovery of admissible...

  11. 7 CFR 283.12 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 4 2013-01-01 2013-01-01 false Discovery. 283.12 Section 283.12 Agriculture... of $50,000 or More § 283.12 Discovery. (a) Dispositions—(1) Motion for taking deposition. Only upon a... exist if the information sought appears reasonably calculated to lead to the discovery of admissible...

  12. 34 CFR 81.16 - Discovery.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 34 Education 1 2012-07-01 2012-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  13. 7 CFR 283.12 - Discovery.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 4 2014-01-01 2014-01-01 false Discovery. 283.12 Section 283.12 Agriculture... of $50,000 or More § 283.12 Discovery. (a) Dispositions—(1) Motion for taking deposition. Only upon a... exist if the information sought appears reasonably calculated to lead to the discovery of admissible...

  14. 34 CFR 81.16 - Discovery.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 34 Education 1 2013-07-01 2013-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  15. 34 CFR 81.16 - Discovery.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 34 Education 1 2011-07-01 2011-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  16. 34 CFR 81.16 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 34 Education 1 2010-07-01 2010-07-01 false Discovery. 81.16 Section 81.16 Education Office of the... Discovery. (a) The parties to a case are encouraged to exchange relevant documents and information voluntarily. (b) The ALJ, at a party's request, may order compulsory discovery described in paragraph (c) of...

  17. Cheaper faster drug development validated by the repositioning of drugs against neglected tropical diseases.

    PubMed

    Williams, Kevin; Bilsland, Elizabeth; Sparkes, Andrew; Aubrey, Wayne; Young, Michael; Soldatova, Larisa N; De Grave, Kurt; Ramon, Jan; de Clare, Michaela; Sirawaraporn, Worachart; Oliver, Stephen G; King, Ross D

    2015-03-06

    There is an urgent need to make drug discovery cheaper and faster. This will enable the development of treatments for diseases currently neglected for economic reasons, such as tropical and orphan diseases, and generally increase the supply of new drugs. Here, we report the Robot Scientist 'Eve', designed to make drug discovery more economical. A Robot Scientist is a laboratory automation system that uses artificial intelligence (AI) techniques to discover scientific knowledge through cycles of experimentation. Eve integrates and automates library screening, hit confirmation, and lead generation through cycles of quantitative structure-activity relationship (QSAR) learning and testing. Using econometric modelling, we demonstrate that using AI to select compounds economically outperforms standard drug screening. For further efficiency, Eve uses a standardized form of assay to compute Boolean functions of compound properties. These assays can be quickly and cheaply engineered using synthetic biology, enabling more targets to be assayed for a given budget. Eve has repositioned several drugs against specific targets in parasites that cause tropical diseases. One validated discovery is that the anti-cancer compound TNP-470 is a potent inhibitor of dihydrofolate reductase from the malaria-causing parasite Plasmodium vivax.

  18. The Unseen Companion of HD 114762

    NASA Astrophysics Data System (ADS)

    Latham, David W.

    2014-01-01

    I have told the story of the discovery of the unseen companion of HD 114762 (Latham et al. 1989, Nature, 339, 38-40) in a recent publication (Latham 2012, New Astronomy Reviews, 56, 16-18). The discovery was enabled by a happy combination of some thinking outside the box by Tsevi Mazeh at Tel Aviv University and the development of new technology for measuring stellar spectra at the Harvard-Smithsonian Center for Astrophysics. Tsevi's unconventional idea was that giant exoplanets might be found much closer to their host stars than Jupiter and Saturn are to the Sun, well inside the snow line. Our instrument was a high-resolution echelle spectrograph optimized for measuring radial velocities of stars similar to the Sun. The key technological developments were an intensified Reticon photon-counting detector under computer control combined with sophisticated analysis of the digital spectra. The detector signal-processing electronics eliminated persistence, which had plagued other intensified systems. This allowed bright Th-Ar calibration exposures before and after every stellar observation, which in turn enabled careful correction for spectrograph drifts. We built three of these systems for telescopes in Massachusetts and Arizona and christened them the "CfA Digital Speedometers". The discovery of HD 114762 b was serendipitous, but not accidental.

  19. Cheaper faster drug development validated by the repositioning of drugs against neglected tropical diseases

    PubMed Central

    Williams, Kevin; Bilsland, Elizabeth; Sparkes, Andrew; Aubrey, Wayne; Young, Michael; Soldatova, Larisa N.; De Grave, Kurt; Ramon, Jan; de Clare, Michaela; Sirawaraporn, Worachart; Oliver, Stephen G.; King, Ross D.

    2015-01-01

    There is an urgent need to make drug discovery cheaper and faster. This will enable the development of treatments for diseases currently neglected for economic reasons, such as tropical and orphan diseases, and generally increase the supply of new drugs. Here, we report the Robot Scientist ‘Eve’, designed to make drug discovery more economical. A Robot Scientist is a laboratory automation system that uses artificial intelligence (AI) techniques to discover scientific knowledge through cycles of experimentation. Eve integrates and automates library screening, hit confirmation, and lead generation through cycles of quantitative structure-activity relationship (QSAR) learning and testing. Using econometric modelling, we demonstrate that using AI to select compounds economically outperforms standard drug screening. For further efficiency, Eve uses a standardized form of assay to compute Boolean functions of compound properties. These assays can be quickly and cheaply engineered using synthetic biology, enabling more targets to be assayed for a given budget. Eve has repositioned several drugs against specific targets in parasites that cause tropical diseases. One validated discovery is that the anti-cancer compound TNP-470 is a potent inhibitor of dihydrofolate reductase from the malaria-causing parasite Plasmodium vivax. PMID:25652463

  20. The Stellar Imager (SI) - A Mission to Resolve Stellar Surfaces, Interiors, and Magnetic Activity

    NASA Astrophysics Data System (ADS)

    Christensen-Dalsgaard, Jørgen; Carpenter, Kenneth G.; Schrijver, Carolus J.; Karovska, Margarita; Si Team

    2011-01-01

    The Stellar Imager (SI) is a space-based, UV/Optical Interferometer (UVOI) designed to enable 0.1 milli-arcsecond (mas) spectral imaging of stellar surfaces and of the Universe in general. It will also probe flows and structures in stellar interiors via asteroseismology. SI will enable the development and testing of a predictive dynamo model for the Sun, by observing patterns of surface activity and by imaging the structure and differential rotation of stellar interiors in a population study of Sun-like stars to determine the dependence of dynamo action on mass, internal structure and flows, and time. SI's science focuses on the role of magnetism in the Universe and will revolutionize our understanding of the formation of planetary systems, of the habitability and climatology of distant planets, and of many magneto-hydrodynamically controlled processes in the Universe. SI is a "Landmark/Discovery Mission" in the 2005 Heliophysics Roadmap, an implementation of the UVOI in the 2006 Astrophysics Strategic Plan, and a NASA Vision Mission ("NASA Space Science Vision Missions" (2008), ed. M. Allen). We present here the science goals of the SI Mission, a mission architecture that could meet those goals, and the technology development needed to enable this mission. Additional information on SI can be found at: http://hires.gsfc.nasa.gov/si/.

  1. Resident Space Object Characterization and Behavior Understanding via Machine Learning and Ontology-based Bayesian Networks

    NASA Astrophysics Data System (ADS)

    Furfaro, R.; Linares, R.; Gaylor, D.; Jah, M.; Walls, R.

    2016-09-01

    In this paper, we present an end-to-end approach that employs machine learning techniques and Ontology-based Bayesian Networks (BN) to characterize the behavior of resident space objects. State-of-the-art machine learning architectures (e.g. Extreme Learning Machines, Convolutional Deep Networks) are trained on physical models to learn the Resident Space Object (RSO) features in the vectorized energy and momentum states and parameters. The mapping from measurements to vectorized energy and momentum states and parameters enables behavior characterization via clustering in the feature space and subsequent RSO classification. Additionally, Space Object Behavioral Ontologies (SOBO) are employed to define and capture the domain knowledge-base (KB), and BNs are constructed from the SOBO in a semi-automatic fashion to execute probabilistic reasoning over conclusions drawn from trained classifiers and/or directly from processed data. Such an approach enables integrating machine learning classifiers and probabilistic reasoning to support higher-level decision making for space domain awareness applications. The innovation here is to use these methods (which have enjoyed great success in other domains) in synergy to enable a "from data to discovery" paradigm by facilitating the linkage and fusion of large and disparate sources of information via a Big Data Science and Analytics framework.
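
    The pipeline summarized above (vectorized energy/momentum features, then clustering in feature space to characterize behavior classes) can be illustrated with a minimal, stdlib-only 2-cluster k-means. The two synthetic behaviour classes and all feature values below are invented for illustration; they are not taken from the paper.

```python
import math

def kmeans2(points, iters=20):
    """Minimal 2-cluster k-means; returns a label (0 or 1) per point.
    Centers are seeded at the first point and at the point farthest
    from it, which keeps this sketch deterministic."""
    centers = [points[0], max(points, key=lambda p: math.dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        for i, p in enumerate(points):
            labels[i] = min((0, 1), key=lambda c: math.dist(p, centers[c]))
        # Update step: move each center to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Synthetic vectorized "energy/momentum" features for two hypothetical
# behaviour classes (values illustrative only).
stable    = [(1.0 + 0.01 * i, 1.0 - 0.01 * i) for i in range(20)]
manoeuvre = [(5.0 + 0.01 * i, 5.0 - 0.01 * i) for i in range(20)]
labels = kmeans2(stable + manoeuvre)
```

    In the paper's setting the cluster labels would then feed the ontology-driven Bayesian reasoning layer; here they simply recover the two behaviour groups.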

  2. Omics-Based Strategies in Precision Medicine: Toward a Paradigm Shift in Inborn Errors of Metabolism Investigations

    PubMed Central

    Tebani, Abdellah; Afonso, Carlos; Marret, Stéphane; Bekri, Soumeya

    2016-01-01

    The rise of technologies that simultaneously measure thousands of data points represents the heart of systems biology. These technologies have had a huge impact on the discovery of next-generation diagnostics, biomarkers, and drugs in the precision medicine era. Systems biology aims to achieve systemic exploration of complex interactions in biological systems. Driven by high-throughput omics technologies and the computational surge, it enables multi-scale and insightful overviews of cells, organisms, and populations. Precision medicine capitalizes on these conceptual and technological advancements and stands on two main pillars: data generation and data modeling. High-throughput omics technologies allow the retrieval of comprehensive and holistic biological information, whereas computational capabilities enable high-dimensional data modeling and, therefore, accessible and user-friendly visualization. Furthermore, bioinformatics has enabled comprehensive multi-omics and clinical data integration for insightful interpretation. Despite their promise, the translation of these technologies into clinically actionable tools has been slow. In this review, we present state-of-the-art multi-omics data analysis strategies in a clinical context. The challenges of omics-based biomarker translation are discussed. Perspectives regarding the use of multi-omics approaches for inborn errors of metabolism (IEM) are presented by introducing a new paradigm shift in addressing IEM investigations in the post-genomic era. PMID:27649151

  3. Ion channel drug discovery and research: the automated Nano-Patch-Clamp technology.

    PubMed

    Brueggemann, A; George, M; Klau, M; Beckler, M; Steindl, J; Behrends, J C; Fertig, N

    2004-01-01

    Unlike the genomics revolution, which was largely enabled by a single technological advance (high throughput sequencing), rapid advancement in proteomics will require a broader effort to increase the throughput of a number of key tools for functional analysis of different types of proteins. In the case of ion channels, a class of membrane proteins of great physiological importance and potential as drug targets, the lack of adequate assay technologies is felt particularly strongly. The available, indirect, high throughput screening methods for ion channels clearly generate insufficient information. The best technology to study ion channel function and screen for compound interaction is the patch clamp technique, but patch clamping suffers from low throughput, which is not acceptable for drug screening. A first step towards a solution is presented here. The nano-patch-clamp technology, which is based on a planar, microstructured glass chip, enables automatic whole cell patch clamp measurements. The Port-a-Patch is an automated electrophysiology workstation, which uses planar patch clamp chips. This approach enables high quality and high content ion channel and compound evaluation on a one-cell-at-a-time basis. The presented automation of the patch process and its scalability to an array format are the prerequisites for any higher throughput electrophysiology instruments.

  4. Omics-Based Strategies in Precision Medicine: Toward a Paradigm Shift in Inborn Errors of Metabolism Investigations.

    PubMed

    Tebani, Abdellah; Afonso, Carlos; Marret, Stéphane; Bekri, Soumeya

    2016-09-14

    The rise of technologies that simultaneously measure thousands of data points represents the heart of systems biology. These technologies have had a huge impact on the discovery of next-generation diagnostics, biomarkers, and drugs in the precision medicine era. Systems biology aims to achieve systemic exploration of complex interactions in biological systems. Driven by high-throughput omics technologies and the computational surge, it enables multi-scale and insightful overviews of cells, organisms, and populations. Precision medicine capitalizes on these conceptual and technological advancements and stands on two main pillars: data generation and data modeling. High-throughput omics technologies allow the retrieval of comprehensive and holistic biological information, whereas computational capabilities enable high-dimensional data modeling and, therefore, accessible and user-friendly visualization. Furthermore, bioinformatics has enabled comprehensive multi-omics and clinical data integration for insightful interpretation. Despite their promise, the translation of these technologies into clinically actionable tools has been slow. In this review, we present state-of-the-art multi-omics data analysis strategies in a clinical context. The challenges of omics-based biomarker translation are discussed. Perspectives regarding the use of multi-omics approaches for inborn errors of metabolism (IEM) are presented by introducing a new paradigm shift in addressing IEM investigations in the post-genomic era.

  5. Learning and Relevance in Information Retrieval: A Study in the Application of Exploration and User Knowledge to Enhance Performance

    ERIC Educational Resources Information Center

    Hyman, Harvey

    2012-01-01

    This dissertation examines the impact of exploration and learning upon eDiscovery information retrieval; it is written in three parts. Part I contains foundational concepts and background on the topics of information retrieval and eDiscovery. This part informs the reader about the research frameworks, methodologies, data collection, and…

  6. Physical Characterization of the Near-Earth Object Population

    NASA Technical Reports Server (NTRS)

    Binzel, Richard P.

    2004-01-01

    Many pieces of the puzzle must be brought together in order to have a clear picture of the near-Earth object (NEO) population. Four of the pieces that can be described include: i) the taxonomic distribution of the population as measured by observational sampling, ii) the determination of albedos that can be associated with the taxonomic distribution, iii) discovery statistics for the NEO population, and iv) the debiasing of the discovery statistics using the taxonomic and albedo information. Support from this grant enables us to address three of these four pieces. Binzel et al. (2004, submitted) presents the first piece, detailing the observations and observed characteristics of the NEO and Mars-crossing (MC) population. For the second piece, a complementary program of albedo measurements is pursued at the Keck Observatory (Binzel, P. I.) with first results published in Delbo et al. (2003). For the third piece, the most extensive NEO discovery statistics are provided by the LINEAR survey. Binzel has supervised the MIT Ph.D. thesis work of Stuart (2003) to bring the fourth piece, submitted for publication by Stuart and Binzel (2004). Our results provide new constraints for the NEO population and progress for the Spaceguard Survey, illuminate asteroid and comet source regions for the NEOs, and provide new evidence for space weathering processes linking asteroids and meteorites. Further, we are identifying top priority near-Earth spacecraft mission candidates based on their spectral properties and inferred compositions.

  7. Purposive discovery of operations

    NASA Technical Reports Server (NTRS)

    Sims, Michael H.; Bresina, John L.

    1992-01-01

    The Generate, Prune & Prove (GPP) methodology for discovering definitions of mathematical operators is introduced. GPP is a task within the IL exploration discovery system. We developed GPP for use in the discovery of mathematical operators with a wider class of representations than was possible with the previous methods by Lenat and by Shen. GPP utilizes the purpose for which an operator is created to prune the possible definitions. The relevant search spaces are immense and there exists insufficient information for a complete evaluation of the purpose constraint, so it is necessary to perform a partial evaluation of the purpose (i.e., pruning) constraint. The constraint is first transformed so that it is operational with respect to the partial information, and then it is applied to examples in order to test the generated candidates for an operator's definition. In the GPP process, once a candidate definition survives this empirical prune, it is passed on to a theorem prover for formal verification. We describe the application of this methodology to the (re)discovery of the definition of multiplication for Conway numbers, a discovery which is difficult for human mathematicians. We successfully model this discovery process utilizing information which was reasonably available at the time of Conway's original discovery. As part of this discovery process, we reduce the size of the search space from a computationally intractable size to 3468 elements.
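
    The Generate and Prune stages described above can be caricatured in a few lines: generate candidate operator definitions, then partially evaluate a purpose constraint on examples to prune them (the Prove stage, formal verification by a theorem prover, is beyond a sketch). The candidate set and the purpose constraint below are illustrative, chosen so that multiplication is the sole survivor; they are not the constraints used for Conway numbers in the paper.

```python
import operator

# Generate: candidate definitions for an unknown binary operator.
candidates = {
    "a+b": operator.add,
    "a*b": operator.mul,
    "max(a,b)": max,
    "min(a,b)": min,
    "a-b": operator.sub,
}

# Prune: purpose constraint, partially evaluated on examples. The
# sought operator should distribute over addition and have 1 as a
# right identity, as multiplication must.
examples = [(2, 3, 4), (5, 1, 1), (1, 5, 2)]

def satisfies_purpose(op):
    return all(
        op(a, b + c) == op(a, b) + op(a, c) and op(a, 1) == a
        for a, b, c in examples
    )

survivors = [name for name, op in candidates.items() if satisfies_purpose(op)]
# Survivors would next be handed to a theorem prover for formal proof.
```

    The empirical prune cuts the candidate space cheaply; only definitions that survive it incur the expense of formal verification, mirroring GPP's division of labor.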

  8. Computational tools for comparative phenomics; the role and promise of ontologies

    PubMed Central

    Gkoutos, Georgios V.; Schofield, Paul N.; Hoehndorf, Robert

    2012-01-01

    A major aim of the biological sciences is to gain an understanding of human physiology and disease. One important step towards such a goal is the discovery of the function of genes, which will lead to a better understanding of the physiology and pathophysiology of organisms and ultimately to improved understanding, diagnosis, and therapy. Our increasing ability to phenotypically characterise genetic variants of model organisms coupled with systematic and hypothesis-driven mutagenesis is resulting in a wealth of information that could potentially provide insight to the functions of all genes in an organism. The challenge we are now facing is to develop computational methods that can integrate and analyse such data. The introduction of formal ontologies that make their semantics explicit and accessible to automated reasoning promises the tantalizing possibility of standardizing biomedical knowledge allowing for novel, powerful queries that bridge multiple domains, disciplines, species and levels of granularity. We review recent computational approaches that facilitate the integration of experimental data from model organisms with clinical observations in humans. These methods foster novel cross species analysis approaches, thereby enabling comparative phenomics and leading to the potential of translating basic discoveries from the model systems into diagnostic and therapeutic advances at the clinical level. PMID:22814867

  9. Opportunities for Epidemiologists in Implementation Science: A Primer.

    PubMed

    Neta, Gila; Brownson, Ross C; Chambers, David A

    2018-05-01

    The field of epidemiology has been defined as the study of the spread and control of disease. However, epidemiology frequently focuses on studies of etiology and distribution of disease at the cost of understanding the best ways to control disease. Moreover, only a small fraction of scientific discoveries are translated into public health practice, and the process from discovery to translation is exceedingly slow. Given the importance of translational science, the future of epidemiologic training should include competency in implementation science, whose goal is to rapidly move evidence into practice. Our purpose in this paper is to provide epidemiologists with a primer in implementation science, which includes dissemination research and implementation research as defined by the National Institutes of Health. We describe the basic principles of implementation science, highlight key components for conducting research, provide examples of implementation studies that encompass epidemiology, and offer resources and opportunities for continued learning. There is a clear need for greater speed, relevance, and application of evidence into practice, programs, and policies and an opportunity to enable epidemiologists to conduct research that not only will inform practitioners and policy-makers of risk but also will enhance the likelihood that evidence will be implemented.

  10. Chemistry of berkelium: A review

    NASA Astrophysics Data System (ADS)

    Hobart, D. E.; Peterson, J. R.

    Element 97 was first produced in December 1949, by the bombardment of americium-241 with accelerated alpha particles. This new element was named berkelium (Bk) after Berkeley, California, the city of its discovery. In the 36 years since the discovery of Bk, a substantial amount of knowledge concerning the physicochemical properties of this relatively scarce transplutonium element has been acquired. All of the Bk isotopes of mass numbers 240 and 242 through 251 are presently known, but only berkelium-249 is available in sufficient quantities for bulk chemical studies. About 0.7 gram of this isotope has been isolated at the HFIR/TRU Complex in Oak Ridge, Tennessee over the last 18 years. Over the same time period, the scale of experimental work using berkelium-249 has increased from the tracer level to bulk studies at the microgram level to solution and solid state investigations with milligram quantities. Extended knowledge of the physicochemical behavior of berkelium is important in its own right, because Bk is the first member of the second half of the actinide series. In addition, such information should enable more accurate extrapolations to the predicted behavior of heavier elements for which experimental studies are severely limited by lack of material and/or by intense radioactivity.

  11. Anonymization of electronic medical records for validating genome-wide association studies

    PubMed Central

    Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

    2010-01-01

    Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806
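
    The linkage threat can be illustrated with a much simpler scheme than the paper's: a k-anonymity-style suppression that withholds any record whose clinical-feature combination is shared by fewer than k patients. The feature values and the choice of k below are illustrative; the paper's actual approach modifies features so as to preserve GWAS-relevant associations rather than dropping records.

```python
from collections import Counter

def suppress_rare(records, k=2):
    """Release only records whose quasi-identifier combination is
    shared by at least k patients; withhold (suppress) the rest, so
    no released combination pins a genome to fewer than k people."""
    counts = Counter(records)
    released = [r for r in records if counts[r] >= k]
    suppressed = [r for r in records if counts[r] < k]
    return released, suppressed

# Each tuple is a combination of standardized clinical features
# (illustrative values, not drawn from the paper's data).
records = [
    ("hypertension", "male"),
    ("hypertension", "male"),
    ("diabetes", "female"),
    ("diabetes", "female"),
    ("rare-syndrome", "male"),   # unique combination: linkable
]
released, suppressed = suppress_rare(records, k=2)
```

    After suppression, every released combination matches at least k patients, so the linkage attack described in the abstract no longer isolates an individual.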

  12. Cell type discovery using single-cell transcriptomics: implications for ontological representation.

    PubMed

    Aevermann, Brian D; Novotny, Mark; Bakken, Trygve; Miller, Jeremy A; Diehl, Alexander D; Osumi-Sutherland, David; Lasken, Roger S; Lein, Ed S; Scheuermann, Richard H

    2018-05-01

    Cells are the fundamental functional units of multicellular organisms, with different cell types playing distinct physiological roles in the body. The recent advent of single-cell transcriptional profiling using RNA sequencing is producing 'big data', enabling the identification of novel human cell types at an unprecedented rate. In this review, we summarize recent work characterizing cell types in the human central nervous and immune systems using single-cell and single-nuclei RNA sequencing, and discuss the implications that these discoveries are having on the representation of cell types in the reference Cell Ontology (CL). We propose a method, based on random forest machine learning, for identifying sets of necessary and sufficient marker genes, which can be used to assemble consistent and reproducible cell type definitions for incorporation into the CL. The representation of defined cell type classes and their relationships in the CL using this strategy will make the cell type classes being identified by high-throughput/high-content technologies findable, accessible, interoperable and reusable (FAIR), allowing the CL to serve as a reference knowledgebase of information about the role that distinct cellular phenotypes play in human health and disease.
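
    The marker-gene idea can be sketched without a full random forest: score each gene by the best single-threshold split it affords between two cell types, and keep the genes that separate them perfectly. The gene names and expression values below are toy data chosen for illustration; the paper's method derives such markers from random forest models over real single-cell profiles.

```python
def stump_accuracy(expr, labels, gene):
    """Best single-threshold split on one gene's expression; returns
    the fraction of cells that split classifies correctly."""
    thresholds = sorted(set(row[gene] for row in expr))
    best = 0.0
    for t in thresholds:
        pred = ["A" if row[gene] >= t else "B" for row in expr]
        acc = sum(p == lab for p, lab in zip(pred, labels)) / len(labels)
        best = max(best, acc, 1 - acc)  # allow either polarity
    return best

# Toy expression matrix: rows are cells, keys are genes (illustrative
# values; GAD1/AQP4 are merely example gene symbols here).
cells = [
    {"GAD1": 9.1, "AQP4": 0.2},  # type A
    {"GAD1": 8.7, "AQP4": 0.1},  # type A
    {"GAD1": 0.3, "AQP4": 7.9},  # type B
    {"GAD1": 0.2, "AQP4": 8.4},  # type B
]
labels = ["A", "A", "B", "B"]

markers = [g for g in ("GAD1", "AQP4")
           if stump_accuracy(cells, labels, g) == 1.0]
```

    Genes that pass such a test are candidates for the "necessary and sufficient" marker sets that the CL definitions are built from.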

  13. Accidental Discovery of Information on the User-Defined Social Web: A Mixed-Method Study

    ERIC Educational Resources Information Center

    Lu, Chi-Jung

    2012-01-01

    Frequently interacting with other people or working in an information-rich environment can foster the "accidental discovery of information" (ADI) (Erdelez, 2000; McCay-Peet & Toms, 2010). With the increasing adoption of social web technologies, online user-participation communities and user-generated content have provided users the…

  14. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    PubMed

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. 
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world. Copyright © 2015 Hadjithomas et al.

  15. Modeling Emergence in Neuroprotective Regulatory Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanfilippo, Antonio P.; Haack, Jereme N.; McDermott, Jason E.

    2013-01-05

    The use of predictive modeling in the analysis of gene expression data can greatly accelerate the pace of scientific discovery in biomedical research by enabling in silico experimentation to test disease triggers and potential drug therapies. Techniques that focus on modeling emergence, such as agent-based modeling and multi-agent simulations, are of particular interest as they support the discovery of pathways that may have never been observed in the past. Thus far, these techniques have been primarily applied at the multi-cellular level, or have focused on signaling and metabolic networks. We present an approach where emergence modeling is extended to regulatory networks and demonstrate its application to the discovery of neuroprotective pathways. An initial evaluation of the approach indicates that emergence modeling provides novel insights for the analysis of regulatory networks that can advance the discovery of acute treatments for stroke and other diseases.

  16. Conceptual Design For Interplanetary Spaceship Discovery

    NASA Astrophysics Data System (ADS)

    Benton, Mark G.

    2006-01-01

    With the recently revived national interest in Lunar and Mars missions, this design study was undertaken by the author in an attempt to satisfy the long-term space exploration vision of human travel "to the Moon, Mars, and beyond" with a single design or family of vehicles. This paper describes a conceptual design for an interplanetary spaceship of the not-too-distant future. It is a design that is outwardly similar to the spaceship Discovery depicted in the novel "2001: A Space Odyssey" and film of the same name. Like its namesake, this spaceship could one day transport a human expedition to explore the moons of Jupiter. This spaceship Discovery is a real engineering design that is capable of being implemented using technologies that are currently at or near the state-of-the-art. The ship's main propulsion and electrical power are provided by bi-modal nuclear thermal rocket engines. Configurations are presented to satisfy four basic Design Reference Missions: (1) a high-energy mission to Jupiter's moon Callisto, (2) a high-energy mission to Mars, (3) a low-energy mission to Mars, and (4) a high-energy mission to the Moon. The spaceship design includes dual, strap-on boosters to enable the high-energy Mars and Jupiter missions. Three conceptual lander designs are presented: (1) two types of Mars landers that utilize atmospheric and propulsive braking, and (2) a lander for Callisto or Earth's Moon that utilizes only propulsive braking. Spaceship Discovery offers many advantages for human exploration of the Solar System: (1) Nuclear propulsion enables propulsive capture and escape maneuvers at Earth and target planets, eliminating risky aero-capture maneuvers. (2) Strap-on boosters provide robust propulsive energy, enabling flexibility in mission planning, shorter transit times, expanded launch windows, and free-return abort trajectories from Mars. (3) A backup abort propulsion system enables crew aborts at multiple points in the mission. (4) Clustered NTR engines provide "engine out" redundancy. (5) The design efficiently implements galactic cosmic ray shielding using main propellant liquid hydrogen. (6) The design provides artificial gravity to mitigate crew physiological problems on long-duration missions. (7) The design is modular and can be launched using the proposed upgrades to the Evolved Expendable Launch Vehicles or Shuttle-derived heavy lift launch vehicles. (8) High value modules are reusable for Mars and Lunar missions. (9) The design has inherent growth capability, and can be tailored to satisfy expanding mission requirements to enable an in-family progression "to the Moon, Mars, and beyond."

  17. Systems modelling methodology for the analysis of apoptosis signal transduction and cell death decisions.

    PubMed

    Rehm, Markus; Prehn, Jochen H M

    2013-06-01

    Systems biology and systems medicine, i.e. the application of systems biology in a clinical context, is becoming of increasing importance in biology, drug discovery and health care. Systems biology incorporates knowledge and methods that are applied in mathematics, physics and engineering, but may not be part of classical training in biology. We here provide an introduction to basic concepts and methods relevant to the construction and application of systems models for apoptosis research. We present the key methods relevant to the representation of biochemical processes in signal transduction models, with a particular reference to apoptotic processes. We demonstrate how such models enable a quantitative and temporal analysis of changes in molecular entities in response to an apoptosis-inducing stimulus, and provide information on cell survival and cell death decisions. We introduce methods for analyzing the spatial propagation of cell death signals, and discuss the concepts of sensitivity analyses that enable a prediction of network responses to disturbances of single or multiple parameters. Copyright © 2013 Elsevier Inc. All rights reserved.
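
    A minimal instance of the kind of model described above is a one-variable ODE for an active caspase species, integrated by forward Euler, with a threshold standing in for the cell death decision and a one-at-a-time parameter perturbation standing in for sensitivity analysis. All rate constants, the threshold, and the stimulus values are invented for illustration; real apoptosis models couple many such equations.

```python
def simulate(stimulus, k_act=0.5, k_deg=0.1, dt=0.01, t_end=50.0):
    """Forward-Euler integration of dA/dt = k_act*stimulus - k_deg*A,
    where A is active caspase. Returns the first time A crosses the
    death threshold, or None if the cell survives until t_end."""
    threshold = 1.0
    A, t = 0.0, 0.0
    while t < t_end:
        A += dt * (k_act * stimulus - k_deg * A)
        t += dt
        if A >= threshold:
            return t  # death decision: threshold crossed at time t
    return None       # survival: signal never reached the threshold

t_strong = simulate(stimulus=1.0)    # strong stimulus: cell dies
t_weak = simulate(stimulus=0.15)     # weak stimulus: steady state
                                     # stays below threshold, survival

# One-at-a-time sensitivity check: raising the degradation rate
# delays the threshold crossing for the same stimulus.
t_perturbed = simulate(stimulus=1.0, k_deg=0.2)
```

    Even this toy captures the abstract's two analyses: a temporal readout of a molecular entity after an apoptosis-inducing stimulus, and a prediction of how the death decision shifts when a single parameter is disturbed.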

  18. Genomics, Telomere Length, Epigenetics, and Metabolomics in the Nurses’ Health Studies

    PubMed Central

    Aschard, Hugues; De Vivo, Immaculata; Michels, Karin B.; Kraft, Peter

    2016-01-01

    Objectives. To review the contribution of the Nurses’ Health Study (NHS) and NHS II to genomics, epigenetics, and metabolomics research. Methods. We performed a narrative review of the publications of the NHS and NHS II between 1990 and 2016 based on biospecimens, including blood and tumor tissue, collected from participants. Results. The NHS has contributed to the discovery of genetic loci influencing more than 45 complex human phenotypes, including cancers, diabetes, cardiovascular disease, reproductive characteristics, and anthropometric traits. The combination of genomewide genotype data with extensive exposure and lifestyle data has enabled the evaluation of gene–environment interactions. Furthermore, data suggest that longer telomere length increases risk of cancers not related to smoking, and that modifiable factors (e.g., diet) may have an impact on telomere length. “Omics” research in the NHS continues to expand, with epigenetics and metabolomics becoming greater areas of focus. Conclusions. The combination of prospective biomarker data and broad exposure information has enabled the NHS to participate in a variety of “omics” research, contributing to understanding of the epidemiology and biology of multiple complex diseases. PMID:27459442

  19. Genomics, Telomere Length, Epigenetics, and Metabolomics in the Nurses' Health Studies.

    PubMed

    Townsend, Mary K; Aschard, Hugues; De Vivo, Immaculata; Michels, Karin B; Kraft, Peter

    2016-09-01

    To review the contribution of the Nurses' Health Study (NHS) and NHS II to genomics, epigenetics, and metabolomics research. We performed a narrative review of the publications of the NHS and NHS II between 1990 and 2016 based on biospecimens, including blood and tumor tissue, collected from participants. The NHS has contributed to the discovery of genetic loci influencing more than 45 complex human phenotypes, including cancers, diabetes, cardiovascular disease, reproductive characteristics, and anthropometric traits. The combination of genomewide genotype data with extensive exposure and lifestyle data has enabled the evaluation of gene-environment interactions. Furthermore, data suggest that longer telomere length increases risk of cancers not related to smoking, and that modifiable factors (e.g., diet) may have an impact on telomere length. "Omics" research in the NHS continues to expand, with epigenetics and metabolomics becoming greater areas of focus. The combination of prospective biomarker data and broad exposure information has enabled the NHS to participate in a variety of "omics" research, contributing to understanding of the epidemiology and biology of multiple complex diseases.

  20. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources

    PubMed Central

    Jonquet, Clement; LePendu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F.; Musen, Mark A.; Shah, Nigam H.

    2011-01-01

    The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to facilitate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index—a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics “under the hood.” PMID:21918645
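
    The hierarchy-aware search described above can be sketched with a toy is-a graph: a query term is expanded to its descendants before matching annotations, so a search for a general class retrieves elements annotated with more specific terms, the semantics working "under the hood". All terms, element identifiers, and links below are illustrative, not taken from the NCBO index.

```python
# Toy is-a hierarchy: child -> parent (illustrative terms).
parents = {
    "melanoma": "skin cancer",
    "skin cancer": "cancer",
    "glioma": "cancer",
    "cancer": "disease",
}

# Annotations attach ontology terms to resource elements
# (hypothetical element IDs for illustration).
annotations = {
    "GEO:sample42": {"melanoma"},
    "ClinicalTrials:NCT001": {"glioma"},
    "PharmGKB:rs123": {"hypertension"},
}

def descendants(term):
    """All terms whose is-a closure reaches `term`, including itself."""
    out = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in parents.items():
            if parent in out and child not in out:
                out.add(child)
                changed = True
    return out

def search(term):
    """Elements annotated with the query term or any descendant of it."""
    hits = descendants(term)
    return sorted(e for e, terms in annotations.items() if terms & hits)
```

    A query for "cancer" thus retrieves elements annotated only with "melanoma" or "glioma", which a flat keyword match would miss.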

  1. SensorDB: a virtual laboratory for the integration, visualization and analysis of varied biological sensor data.

    PubMed

    Salehi, Ali; Jimenez-Berni, Jose; Deery, David M; Palmer, Doug; Holland, Edward; Rozas-Larraondo, Pablo; Chapman, Scott C; Georgakopoulos, Dimitrios; Furbank, Robert T

    2015-01-01

    To our knowledge, there is no software or database solution that supports large volumes of biological time series sensor data efficiently and enables data visualization and analysis in real time. Existing solutions for managing data typically use unstructured file systems or relational databases. These systems are not designed to provide instantaneous responses to user queries. Furthermore, they do not support rapid data analysis and visualization to enable interactive experiments. In large-scale experiments, this behaviour slows research discovery and discourages the widespread sharing and reuse of data that could otherwise inform critical decisions in a timely manner and encourage effective collaboration between groups. In this paper we present SensorDB, a web-based virtual laboratory that can manage large volumes of biological time series sensor data while supporting rapid data queries and real-time user interaction. SensorDB is sensor agnostic and uses web-based, state-of-the-art cloud and storage technologies to efficiently gather, analyse and visualize data. Collaboration and data sharing between different agencies and groups is thereby facilitated. SensorDB is available online at http://sensordb.csiro.au.
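
The abstract's core requirement, instantaneous responses over large time series, is usually met by pre-aggregation rather than scanning raw readings on every query. A minimal, generic sketch of that idea follows; SensorDB's actual API is not described in the abstract, so all names and data here are illustrative:

```python
# Generic sketch of the pre-aggregation that makes interactive queries
# over large sensor time series fast: readings are rolled up per hour,
# so a dashboard query touches the small summary, not the raw stream.
# (Illustrative only; this is not SensorDB's real interface.)
from collections import defaultdict


def hourly_summary(readings):
    """readings: iterable of (epoch_seconds, value) -> {hour_start: (count, mean)}."""
    acc = defaultdict(lambda: [0, 0.0])        # hour_start -> [count, sum]
    for ts, value in readings:
        hour = ts - ts % 3600                  # truncate to the hour boundary
        acc[hour][0] += 1
        acc[hour][1] += value
    return {h: (n, s / n) for h, (n, s) in acc.items()}


raw = [(0, 1.0), (600, 3.0), (3700, 10.0)]     # readings spanning two hours
print(hourly_summary(raw))
```

A real store would maintain such rollups incrementally as data arrives, which is what keeps user interaction responsive at scale.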

  2. NCBO Resource Index: Ontology-Based Search and Mining of Biomedical Resources.

    PubMed

    Jonquet, Clement; Lependu, Paea; Falconer, Sean; Coulet, Adrien; Noy, Natalya F; Musen, Mark A; Shah, Nigam H

    2011-09-01

    The volume of publicly available data in biomedicine is constantly increasing. However, these data are stored in different formats and on different platforms. Integrating these data will enable us to accelerate the pace of medical discoveries by providing scientists with a unified view of this diverse information. Under the auspices of the National Center for Biomedical Ontology (NCBO), we have developed the Resource Index, a growing, large-scale ontology-based index of more than twenty heterogeneous biomedical resources. The resources come from a variety of repositories maintained by organizations from around the world. We use a set of over 200 publicly available ontologies contributed by researchers in various domains to annotate the elements in these resources. We use the semantics that the ontologies encode, such as different properties of classes, the class hierarchies, and the mappings between ontologies, in order to improve the search experience for the Resource Index user. Our user interface enables scientists to search the multiple resources quickly and efficiently using domain terms, without even being aware that there is semantics "under the hood."
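
The hierarchy-aware matching this abstract describes (a query on a class also retrieves elements annotated with its subclasses) can be sketched minimally. The tiny ontology, class names, and resource identifiers below are hypothetical stand-ins, not NCBO data:

```python
# Minimal sketch of ontology-aware search: a query on a class also
# matches elements annotated with any of its transitive subclasses.
# Ontology and resource records are hypothetical.

SUBCLASSES = {                     # parent class -> direct subclasses
    "Neoplasm": ["Carcinoma", "Sarcoma"],
    "Carcinoma": ["Adenocarcinoma"],
}

RESOURCES = {                      # resource element id -> annotated classes
    "GEO:GSE1": {"Adenocarcinoma"},
    "GEO:GSE2": {"Sarcoma"},
    "CT:NCT1": {"Diabetes"},
}


def closure(term):
    """Return the term plus all of its transitive subclasses."""
    seen, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(SUBCLASSES.get(t, []))
    return seen


def search(term):
    """Find resource elements annotated with the term or any subclass of it."""
    wanted = closure(term)
    return sorted(rid for rid, anns in RESOURCES.items() if anns & wanted)


print(search("Neoplasm"))   # both cancer datasets match, the trial does not
```

Class mappings between ontologies can be folded in the same way, by adding mapped classes to the expansion set before matching.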

  3. The UCSC Genome Browser: What Every Molecular Biologist Should Know.

    PubMed

    Mangan, Mary E; Williams, Jennifer M; Kuhn, Robert M; Lathe, Warren C

    2014-07-01

    Electronic data resources can enable molecular biologists to quickly get information from around the world that a decade ago would have been buried in papers scattered throughout the library. The ability to access, query, and display these data makes benchwork much more efficient and drives new discoveries. Increasingly, mastery of software resources and corresponding data repositories is required to fully explore the volume of data generated in biomedical and agricultural research, because only small amounts of data are actually found in traditional publications. The UCSC Genome Browser provides a wealth of data and tools that advance understanding of genomic context for many species, enable detailed analysis of data, and provide the ability to interrogate regions of interest across disparate data sets from a wide variety of sources. Researchers can also supplement the standard display with their own data to query and share this with others. Effective use of these resources has become crucial to biological research today, and this unit describes some practical applications of the UCSC Genome Browser. Copyright © 2014 John Wiley & Sons, Inc.
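
The core operation behind interrogating "regions of interest across disparate data sets" is interval overlap between a query region and annotation tracks. A minimal sketch of that operation, using hypothetical track contents rather than the UCSC Genome Browser's actual data or API:

```python
# Minimal sketch of a region query across annotation tracks:
# return every item whose interval overlaps the query region.
# Track names, genes and SNP ids below are hypothetical.

TRACKS = {
    "genes": [("chr1", 1000, 5000, "GENE_A"), ("chr2", 200, 900, "GENE_B")],
    "snps":  [("chr1", 1500, 1501, "rs_demo")],
}


def overlaps(chrom, start, end):
    """Items from every track overlapping the half-open region [start, end)."""
    hits = []
    for track, items in TRACKS.items():
        for c, s, e, name in items:
            if c == chrom and s < end and e > start:   # standard interval overlap test
                hits.append((track, name))
    return sorted(hits)


print(overlaps("chr1", 1200, 2000))
```

Browsers index tracks (e.g. with binning or interval trees) so this lookup stays fast over millions of items, but the overlap predicate is the same.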

  4. Brokered virtual hubs for facilitating access and use of geospatial Open Data

    NASA Astrophysics Data System (ADS)

    Mazzetti, Paolo; Latre, Miguel; Kamali, Nargess; Brumana, Raffaella; Braumann, Stefan; Nativi, Stefano

    2016-04-01

    Open Data is a major trend in the current information technology scenario and is often publicised as one of the pillars of the information society in the near future. In particular, geospatial Open Data also have huge potential for the Earth Sciences, through the enablement of innovative applications and services integrating heterogeneous information. However, open does not mean usable. As was recognized at the very beginning of the Web revolution, many different degrees of openness exist: from simple sharing in a proprietary format to advanced sharing in standard formats including semantic information. Therefore, to fully unleash the potential of geospatial Open Data, advanced infrastructures are needed to increase the degree of data openness, enhancing their usability. In October 2014, the ENERGIC OD (European NEtwork for Redistributing Geospatial Information to user Communities - Open Data) project, funded by the European Union under the Competitiveness and Innovation Framework Programme (CIP), started. In response to the EU call, the general objective of the project is to "facilitate the use of open (freely available) geographic data from different sources for the creation of innovative applications and services through the creation of Virtual Hubs". The ENERGIC OD Virtual Hubs aim to facilitate the use of geospatial Open Data by lowering and possibly removing the main barriers that hamper geo-information (GI) usage by end-users and application developers. Data and service heterogeneity is recognized as one of the major barriers to Open Data (re-)use. It forces end-users and developers to spend considerable effort accessing different infrastructures and harmonizing datasets. Such heterogeneity cannot be completely removed through the adoption of standard specifications for service interfaces, metadata and data models, since different infrastructures adopt different standards to address specific challenges and use-cases.
Thus, beyond a certain extent, heterogeneity is irreducible, especially in interdisciplinary contexts. ENERGIC OD Virtual Hubs address heterogeneity by adopting a mediation and brokering approach: specific components (brokers) are dedicated to harmonizing service interfaces, metadata and data models, enabling seamless discovery of and access to heterogeneous infrastructures and datasets. As an innovation project, ENERGIC OD integrates several existing technologies to implement Virtual Hubs as single points of access to geospatial datasets provided by new or existing platforms and infrastructures, including INSPIRE-compliant systems and Copernicus services. A first version of the ENERGIC OD brokers has been implemented based on the GI-Suite Brokering Framework developed by CNR-IIA, complemented with other tools under integration and development. It already enables mediated discovery of and harmonized access to different geospatial Open Data sources. It is accessible to users as Software-as-a-Service through a browser. Moreover, open APIs and a JavaScript library are available for application developers. Six ENERGIC OD Virtual Hubs are currently deployed: one at regional level (the Berlin metropolitan area) and five at national level (in France, Germany, Italy, Poland and Spain). Each Virtual Hub manager decided the deployment strategy (local infrastructure or a commercial Infrastructure-as-a-Service cloud) and the list of connected Open Data sources. The ENERGIC OD Virtual Hubs are under test and validation through the development of ten different mobile and Web applications.
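
The mediation and brokering approach described above can be sketched generically: one harmonized discovery query is translated to each source's native interface and the results are merged into a common record shape. The two "sources" and their field names below are hypothetical, not the real ENERGIC OD connectors:

```python
# Minimal sketch of the broker pattern: per-source adapters translate a
# single harmonized query into each source's native call and map the
# native records to one common shape. Sources and fields are hypothetical.

def query_csw(bbox):       # stand-in for a catalogue with a CSW-like interface
    return [{"title": "Land cover", "box": bbox}]


def query_rest(bbox):      # stand-in for a custom REST API with different fields
    return [{"name": "River gauges", "extent": bbox}]


ADAPTERS = [
    lambda bbox: [{"title": r["title"]} for r in query_csw(bbox)],
    lambda bbox: [{"title": r["name"]} for r in query_rest(bbox)],
]


def broker_search(bbox):
    """Fan one discovery query out to every adapter and merge the results."""
    results = []
    for adapter in ADAPTERS:
        results.extend(adapter(bbox))
    return results


print(broker_search((6.6, 36.6, 18.5, 47.1)))   # rough bounding box over Italy
```

Adding a new data source then means writing one adapter, with clients and the harmonized record shape left untouched, which is what makes the approach scale across heterogeneous infrastructures.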

  5. Discovery in a World of Mashups

    NASA Astrophysics Data System (ADS)

    King, T. A.; Ritschel, B.; Hourcle, J. A.; Moon, I. S.

    2014-12-01

    When the first digital information was stored electronically, discovery of what existed was through file names and the organization of the file system. With the advent of networks, digital information was shared on a wider scale, but discovery remained based on file and folder names. With a growing number of information sources, name-based discovery quickly became ineffective. The keyword-based search engine was one of the first types of mashup in the world of Web 1.0. Embedding links from one document to another created prescribed relationships between files, and the world of Web 2.0 was formed. Search engines like Google used the links to improve search results, and a worldwide mashup was formed. While a vast improvement, the need for semantic (meaning-rich) discovery was clear, especially for the discovery of scientific data. In response, every science discipline defined schemas to describe their type of data. Some core schemas were shared, but most schemas are custom tailored even though they share many common concepts. As with the networking of information sources, science increasingly relies on data from multiple disciplines. So there is a need to bring together multiple sources of semantically rich information. We explore how harvesting, conceptual mapping, facet-based search engines, search term promotion, and style sheets can be combined to create the next generation of mashups in the emerging world of Web 3.0. We use NASA's Planetary Data System and NASA's Heliophysics Data Environment to illustrate how to create a multi-discipline mashup.
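
Facet-based search, one of the techniques named above, reduces to counting attribute values over harvested metadata and filtering on user-chosen values. A minimal sketch with hypothetical records and facet names (not the actual PDS or Heliophysics metadata model):

```python
# Minimal sketch of facet-based discovery over harvested metadata:
# facet counts drive the drill-down UI, and chosen facet values filter
# the record set. Records and facet names are hypothetical.
from collections import Counter

RECORDS = [
    {"discipline": "planetary", "format": "PDS4"},
    {"discipline": "heliophysics", "format": "CDF"},
    {"discipline": "heliophysics", "format": "FITS"},
]


def facet_counts(records, facet):
    """How many records carry each value of the given facet."""
    return Counter(r[facet] for r in records)


def filter_by(records, **facets):
    """Keep only records matching every selected facet value."""
    return [r for r in records
            if all(r.get(k) == v for k, v in facets.items())]


print(facet_counts(RECORDS, "discipline"))
print(filter_by(RECORDS, discipline="heliophysics", format="CDF"))
```

In a multi-discipline mashup, the harvesting and conceptual-mapping steps exist precisely to populate such records with comparable facet values across schemas.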

  6. Energy Analysis News | Energy Analysis | NREL

    Science.gov Websites

    26, 2018: "Ten Years of Analyzing the Duck Chart: How an NREL Discovery in 2008 Is Helping Enable More ..." (on an early version of what was later named the "duck curve" by the California Independent System Operator).

  7. Advantages and application of label-free detection assays in drug screening.

    PubMed

    Cunningham, Brian T; Laing, Lance G

    2008-08-01

    Adoption is accelerating for a new family of label-free optical biosensors incorporated into standard format microplates owing to their ability to enable highly sensitive detection of small molecules, proteins and cells for high-throughput drug discovery applications. Label-free approaches are displacing other detection technologies owing to their ability to provide simple assay procedures for hit finding/validation, accessing difficult target classes, screening the interaction of cells with drugs and analyzing the affinity of small molecule inhibitors to target proteins. This review describes several new drug discovery applications that are under development for microplate-based photonic crystal optical biosensors and the key issues that will drive adoption of the technology. Microplate-based optical biosensors are enabling a variety of cell-based assays, inhibition assays, protein-protein binding assays and protein-small molecule binding assays to be performed with high-throughput and high sensitivity.

  8. Automated DBS microsampling, microscale automation and microflow LC-MS for therapeutic protein PK.

    PubMed

    Zhang, Qian; Tomazela, Daniela; Vasicek, Lisa A; Spellman, Daniel S; Beaumont, Maribel; Shyong, BaoJen; Kenny, Jacqueline; Fauty, Scott; Fillgrove, Kerry; Harrelson, Jane; Bateman, Kevin P

    2016-04-01

    We aimed to reduce animal usage in discovery-stage PK studies for biologics programs using microsampling-based approaches and microscale LC-MS. We report the development of an automated DBS-based serial microsampling approach for studying the PK of therapeutic proteins in mice. Automated sample preparation and microflow LC-MS were used to enable assay miniaturization and improve overall assay throughput. Serial sampling of mice was possible over the full 21-day study period, with the first six time points over 24 h collected using automated DBS sample collection. Overall, this approach demonstrated data comparable to a previous study that used liquid samples from single mice per time point, while reducing animal and compound requirements by 14-fold. The reduction in animals and drug material is enabled by the use of automated serial DBS microsampling in discovery-stage mouse studies of protein therapeutics.

  9. Covalent inhibitors: an opportunity for rational target selectivity.

    PubMed

    Lagoutte, Roman; Patouret, Remi; Winssinger, Nicolas

    2017-08-01

    There is a resurging interest in compounds that engage their target through covalent interactions. Cysteine's thiol is endowed with enhanced reactivity, making it the nucleophile of choice for covalent engagement with a ligand aligning an electrophilic trap with a cysteine residue in a target of interest. The paucity of cysteine in the proteome, coupled with the fact that closely related proteins do not necessarily share a given cysteine residue, enables an unprecedented level of rational target selectivity. The recent demonstration that a lysine's amine can also be engaged covalently with a mild electrophile extends the potential of covalent inhibitors. The growing database of protein structures facilitates the discovery of covalent inhibitors, while the advent of proteomic technologies enables a finer resolution in the selectivity of covalently engaged proteins. Here, we discuss recent examples of discovery and design of covalent inhibitors. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Symmetry as Bias: Rediscovering Special Relativity

    NASA Technical Reports Server (NTRS)

    Lowry, Michael R.

    1992-01-01

    This paper describes a rational reconstruction of Einstein's discovery of special relativity, validated through an implementation: the Erlanger program. Einstein's discovery of special relativity revolutionized both the content of physics and the research strategy used by theoretical physicists. This research strategy entails a mutual bootstrapping process between a hypothesis space for biases, defined through different postulated symmetries of the universe, and a hypothesis space for physical theories. The invariance principle mutually constrains these two spaces. The invariance principle enables detecting when an evolving physical theory becomes inconsistent with its bias, and also when the biases for theories describing different phenomena are inconsistent. Structural properties of the invariance principle facilitate generating a new bias when an inconsistency is detected. After a new bias is generated, this principle facilitates reformulating the old, inconsistent theory by treating the latter as a limiting approximation. The structural properties of the invariance principle can be suitably generalized to other types of biases to enable primal-dual learning.

  11. Towards a Conceptual Design of a Cross-Domain Integrative Information System for the Geosciences

    NASA Astrophysics Data System (ADS)

    Zaslavsky, I.; Richard, S. M.; Valentine, D. W.; Malik, T.; Gupta, A.

    2013-12-01

    As geoscientists increasingly focus on studying processes that span multiple research domains, there is an increased need for cross-domain interoperability solutions that can scale to the entire geosciences, bridging information and knowledge systems, models, and software tools, as well as connecting researchers and organizations. Creating a community-driven cyberinfrastructure (CI) to address the grand challenges of integrative Earth science research and education is the focus of EarthCube, a new research initiative of the U.S. National Science Foundation. We are approaching EarthCube design as a complex socio-technical system of systems, in which communication between various domain subsystems, people and organizations enables more comprehensive, data-intensive research designs and knowledge sharing. In particular, we focus on integrating 'traditional' layered CI components - including information sources, catalogs, vocabularies, services, analysis and modeling tools - with CI components supporting scholarly communication, self-organization and social networking (e.g. research profiles, Q&A systems, annotations), in a manner that follows and enhances existing patterns of data, information and knowledge exchange within and across geoscience domains.
We describe an initial architecture design focused on enabling the CI to (a) provide an environment for scientifically sound information and software discovery and reuse; (b) evolve by factoring in the impact of maturing movements like linked data, 'big data', and social collaborations, as well as experience from work on large information systems in other domains; (c) handle the ever increasing volume, complexity and diversity of geoscience information; (d) incorporate new information and analytical requirements, tools, and techniques, and emerging types of earth observations and models; (e) accommodate different ideas and approaches to research and data stewardship; (f) be responsive to the existing and anticipated needs of researchers and organizations representing both established and emerging CI users; and (g) make best use of NSF's current investment in the geoscience CI. The presentation will focus on the challenges and methodology of EarthCube CI design, in particular on supporting social engagement and interaction between geoscientists and computer scientists as a core function of EarthCube architecture. This capability must include mechanisms to not only locate and integrate available geoscience resources, but also engage individuals and projects, research products and publications, and enable efficient communication across many EarthCube stakeholders leading to long-term institutional alignment and trusted collaborations.

  12. Knowledge Discovery and Data Mining: An Overview

    NASA Technical Reports Server (NTRS)

    Fayyad, U.

    1995-01-01

    The process of knowledge discovery and data mining is the process of information extraction from very large databases. Its importance is described along with several techniques and considerations for selecting the most appropriate technique for extracting information from a particular data set.

  13. 76 FR 69241 - Proposed Information Collection; Comment Request; Papahanaumokuakea Marine National Monument...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-11-08

    ... Collection; Comment Request; Papahanaumokuakea Marine National Monument Mokupapapa Discovery Center Exhibit... collection. Mokupapapa Discovery Center (Center) is an outreach arm of Papahanaumokuakea Marine National... of automated collection techniques or other forms of information technology. Comments submitted in...

  14. Allchin's Shoehorn, or Why Science Is Hypothetico-Deductive.

    ERIC Educational Resources Information Center

    Lawson, Anton E.

    2003-01-01

    Criticizes Allchin's article about Lawson's analysis of Galileo's discovery of Jupiter's moons. Suggests that a careful analysis of the way humans spontaneously process information and reason supports a general hypothetico-deductive theory of human information processing, reasoning, and scientific discovery. (SOE)

  15. The MED-SUV Multidisciplinary Interoperability Infrastructure

    NASA Astrophysics Data System (ADS)

    Mazzetti, Paolo; D'Auria, Luca; Reitano, Danilo; Papeschi, Fabrizio; Roncella, Roberto; Puglisi, Giuseppe; Nativi, Stefano

    2016-04-01

    In accordance with the international Supersite initiative concept, the MED-SUV (MEDiterranean SUpersite Volcanoes) European project (http://med-suv.eu/) aims to enable long-term monitoring experiments in two relevant geologically active regions of Europe prone to natural hazards: Mt. Vesuvio/Campi Flegrei and Mt. Etna. This objective requires the integration of existing components, such as monitoring systems and databases, and of novel sensors for the measurement of volcanic parameters. Moreover, MED-SUV is also a direct contribution to the Global Earth Observation System of Systems (GEOSS), as one of the volcano Supersites recognized by the Group on Earth Observations (GEO). To achieve its goal, MED-SUV set up an advanced e-infrastructure allowing the discovery of and access to heterogeneous data for multidisciplinary applications, and the integration with external systems like GEOSS. The MED-SUV overall infrastructure is conceived as a three-layer architecture, with the lower layer (Data level) including the identified relevant data sources, the mid-tier (Supersite level) including components for mediation and harmonization, and the upper tier (Global level) composed of the systems that MED-SUV must serve, such as GEOSS and possibly other global/community systems. The Data level is mostly composed of existing data sources, such as space agencies' satellite data archives, the UNAVCO system, and the INGV-Rome data service. They share data according to different specifications for metadata, data and service interfaces, and cannot be changed. Thus, the only relevant MED-SUV activity at this level was the creation of a MED-SUV local repository based on Web Accessible Folder (WAF) technology, deployed at the INGV site in Catania, hosting in-situ data and products collected and generated during the project.
The Supersite level is at the core of the MED-SUV architecture, since it must mediate between the disparate data sources in the layer below and provide a harmonized view to the layer above. In order to address data and service heterogeneity, the MED-SUV infrastructure is based on the brokered architecture approach, implemented using the GI-Suite Brokering Framework for discovery and access. The GI-Suite Brokering Framework has been extended and configured to broker all the identified relevant data sources. It is also able to publish data according to several de jure and de facto standards, including OGC CSW and OpenSearch, facilitating the interconnection with external systems. At the Global level, MED-SUV identified the interconnection with GEOSS as the main requirement. Since the MED-SUV Supersite level is implemented based on the same technology adopted in the current GEOSS Common Infrastructure (GCI) by the GEO Discovery and Access Broker (GEO DAB), no major interoperability problem is foreseen. The MED-SUV Multidisciplinary Interoperability Infrastructure is complemented by a user portal providing human-to-machine interaction and enabling data discovery and access. The GI-Suite Brokering Framework APIs and JavaScript library support machine-to-machine interaction, enabling the creation of mobile and Web applications using information available through the MED-SUV Supersite.

  16. Geoscience Information Network (USGIN) Solutions for Interoperable Open Data Access Requirements

    NASA Astrophysics Data System (ADS)

    Allison, M. L.; Richard, S. M.; Patten, K.

    2014-12-01

    The geosciences are leading the development of free, interoperable open access to data. The US Geoscience Information Network (USGIN) is a freely available data integration framework, jointly developed by the USGS and the Association of American State Geologists (AASG) in compliance with international standards and protocols, to provide easy discovery, access, and interoperability for geoscience data. USGIN standards include the geologic exchange language GeoSciML (v3.2, which enables instant interoperability of geologic formation data), which is also the base standard used by the 117-nation OneGeology consortium. The USGIN deployment of NGDS serves as a continent-scale operational demonstration of the expanded OneGeology vision to provide access to all geoscience data worldwide. USGIN is developed to accommodate a variety of applications; for example, the International Renewable Energy Agency streams data live to the Global Atlas of Renewable Energy. Alternatively, users without robust data sharing systems can download and implement a free software package, "GINstack", to easily deploy web services for exposing data online for discovery and access. The White House Open Data Access Initiative requires all federally funded research projects and federal agencies to make their data publicly accessible in an open source, interoperable format, with metadata. USGIN currently incorporates all aspects of the Initiative, as it emphasizes interoperability. The system is successfully deployed as the National Geothermal Data System (NGDS), officially launched at the White House Energy Datapalooza in May 2014. The USGIN Foundation has been established to ensure this technology continues to be accessible and available.

  17. Customer Discovery as the First Essential Step for Successful Health Information Technology System Development

    PubMed Central

    Thamjamrassri, Punyotai; Song, YuJin; Tak, JaeHyun; Kang, HoYong; Hong, Jeeyoung

    2018-01-01

    Objectives Customer discovery (CD) is a method to determine if there are actual customers for a product/service and what they would want before actually developing the product/service. This concept, however, is rather new to health information technology (IT) systems. Therefore, the aim of this paper was to demonstrate how to use the CD method in developing a comprehensive health IT service for patients with knee/leg pain. Methods We participated in a 6-week I-Corps program to perform CD, in which we interviewed 55 people in person, by phone, or by video conference within 6 weeks: 4 weeks in the United States and 2 weeks in Korea. The interviewees included orthopedic doctors, physical therapists, physical trainers, physicians, researchers, pharmacists, vendors, and patients. By analyzing the interview data, the aim was to revise our business model accordingly. Results Using the CD approach enabled us to understand the customer segments and identify value propositions. We concluded that a facilitating tele-rehabilitation system is needed the most and that the most suitable customer segment is early stage arthritis patients. We identified a new design concept for the customer segment. Furthermore, CD is required to identify value propositions in detail. Conclusions CD is crucial to determine a more desirable direction in developing health IT systems, and it can be a powerful tool to increase the potential for successful commercialization in the health IT field. PMID:29503756

  18. Improving the quality of biomarker discovery research: the right samples and enough of them.

    PubMed

    Pepe, Margaret S; Li, Christopher I; Feng, Ziding

    2015-06-01

    Biomarker discovery research has yielded few biomarkers that validate for clinical use. A contributing factor may be poor study designs. The goal in discovery research is to identify a subset of potentially useful markers from a large set of candidates assayed on case and control samples. We recommend the PRoBE design for selecting samples. We propose sample size calculations that require specifying: (i) a definition for biomarker performance; (ii) the proportion of useful markers the study should identify (Discovery Power); and (iii) the tolerable number of useless markers amongst those identified (False Leads Expected, FLE). We apply the methodology to a study of 9,000 candidate biomarkers for risk of colon cancer recurrence where a useful biomarker has positive predictive value ≥ 30%. We find that 40 patients with recurrence and 160 without recurrence suffice to filter out 98% of useless markers (2% FLE) while identifying 95% of useful biomarkers (95% Discovery Power). Alternative methods for sample size calculation required more assumptions. Biomarker discovery research should utilize quality biospecimen repositories and include sample sizes that enable markers meeting prespecified performance characteristics for well-defined clinical applications to be identified. The scientific rigor of discovery research should be improved. ©2015 American Association for Cancer Research.
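
The trade-off the authors quantify can be illustrated with simple expected-counts arithmetic: even a small pass rate for useless markers yields many false leads when thousands of candidates are screened. The 9,000-candidate figure is from the abstract; the number of truly useful markers below is an assumption chosen for illustration:

```python
# Worked sketch of the expected-false-leads arithmetic: with a per-marker
# pass rate for useless markers (alpha) and a pass rate for useful ones
# (discovery power), count the expected composition of the identified set.
# The 9,000 candidates come from the abstract; the useful/useless split,
# alpha, and power values are illustrative assumptions.

def expected_leads(n_candidates, n_useful, alpha, power):
    """Expected (useful hits, false leads) among markers that pass the filter."""
    n_useless = n_candidates - n_useful
    return n_useful * power, n_useless * alpha


useful_hits, false_leads = expected_leads(
    n_candidates=9000,   # candidate markers screened (from the abstract)
    n_useful=20,         # assumed number of truly useful markers
    alpha=0.02,          # 2% of useless markers slip through the filter
    power=0.95,          # 95% of useful markers are identified
)
print(useful_hits, false_leads)   # ~19 useful hits vs ~179.6 false leads
```

Under these assumed numbers the identified set is still dominated by false leads, which is why the paper ties sample size to an explicit tolerance for false leads rather than to per-marker significance alone.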

  19. Analysis of the Waggle Dance Motion of Honeybees for the Design of a Biomimetic Honeybee Robot

    PubMed Central

    Landgraf, Tim; Rojas, Raúl; Nguyen, Hai; Kriegel, Fabian; Stettin, Katja

    2011-01-01

    The honeybee dance “language” is one of the most popular examples of information transfer in the animal world. Today, more than 60 years after its discovery, it still remains unknown how follower bees decode the information contained in the dance. In order to build a robotic honeybee that allows a deeper investigation of the communication process, we have recorded hundreds of videos of waggle dances. In this paper we analyze the statistics of visually captured high-precision dance trajectories of European honeybees (Apis mellifera carnica). The trajectories were produced using a novel automatic tracking system and represent the most detailed honeybee dance motion information available. Although honeybee dances seem very variable, some properties turned out to be invariant. We use these properties as a minimal set of parameters that enables us to model the honeybee dance motion. We provide a detailed statistical description of various dance properties that have not been characterized before and discuss the role of particular dance components in the communication process. PMID:21857906

  20. A Geospatial Semantic Enrichment and Query Service for Geotagged Photographs

    PubMed Central

    Ennis, Andrew; Nugent, Chris; Morrow, Philip; Chen, Liming; Ioannidis, George; Stan, Alexandru; Rachev, Preslav

    2015-01-01

    With the increasing abundance of technologies and smart devices equipped with a multitude of sensors for sensing the environment around them, information creation and consumption have now become effortless. This, in particular, is the case for photographs, with vast amounts being created and shared every day. For example, at the time of this writing, Instagram users upload 70 million photographs a day. Nevertheless, it still remains a challenge to discover the “right” information for the appropriate purpose. This paper describes an approach to create semantic geospatial metadata for photographs, which can facilitate photograph search and discovery. To achieve this we have developed and implemented a semantic geospatial data model by which a photograph can be enriched with geospatial metadata extracted from several geospatial data sources, based on the raw low-level geo-metadata from a smartphone photograph. We present the details of our method and implementation for searching and querying the semantic geospatial metadata repository to enable a user or third-party system to find the information they are looking for. PMID:26205265
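
The enrichment step described above, turning a photograph's raw latitude/longitude into queryable place metadata, can be sketched with a nearest-neighbour lookup against a gazetteer. The tiny gazetteer and photo record below are hypothetical, and real systems would use proper geodesic distance and richer semantic sources:

```python
# Minimal sketch of geospatial enrichment: match a photo's raw lat/lon
# against a gazetteer and attach the nearest place name as metadata.
# Gazetteer entries and the photo record are hypothetical; planar
# distance is a simplification adequate only at very small scales.
import math

GAZETTEER = [
    {"name": "City Park", "lat": 54.595, "lon": -5.930},
    {"name": "Harbour",   "lat": 54.617, "lon": -5.900},
]


def nearest_place(lat, lon):
    """Return the name of the gazetteer entry closest to the coordinates."""
    def dist(p):
        return math.hypot(p["lat"] - lat, p["lon"] - lon)
    return min(GAZETTEER, key=dist)["name"]


photo = {"file": "IMG_001.jpg", "lat": 54.596, "lon": -5.929}
photo["place"] = nearest_place(photo["lat"], photo["lon"])   # enrichment step
print(photo["place"])
```

Once every photo carries such derived attributes, search and discovery can run over place names and other semantics instead of raw coordinates.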

  1. Strengthening protections for human subjects: proposed restrictions on the publication of transplant research involving prisoners.

    PubMed

    Valapour, Maryam; Paulson, Kristin M; Hilde, Alisha

    2013-04-01

    Publication is one of the primary rewards in the academic research community and is the first step in the dissemination of a new discovery that could lead to recognition and opportunity. Because of this, the publication of research can serve as a tacit endorsement of the methodology behind the science. This becomes a problem when vulnerable populations that are incapable of giving legitimate informed consent, such as prisoners, are used in research. The problem is especially critical in the field of transplant research, in which unverified consent can enable research that exploits the vulnerabilities of prisoners, especially those awaiting execution. Because the doctrine of informed consent is central to the protection of vulnerable populations, we have performed a historical analysis of the standards of informed consent in codes of international human subject protections to form the foundation for our limit and ban recommendations: (1) limit the publication of transplant research involving prisoners in general and (2) ban the publication of transplant research involving executed prisoners in particular. Copyright © 2013 American Association for the Study of Liver Diseases.

  2. Australia's TERN: Advancing Ecosystem Data Management in Australia

    NASA Astrophysics Data System (ADS)

    Phinn, S. R.; Christensen, R.; Guru, S.

    2013-12-01

    Globally, there is a consistent movement towards more open, collaborative and transparent science, where the publication and citation of data is considered standard practice. Australia's Terrestrial Ecosystem Research Network (TERN) is a national research infrastructure investment designed to support the ecosystem science community through all stages of the data lifecycle. TERN has developed and implemented a comprehensive network of 'hard' and 'soft' infrastructure that enables Australia's ecosystem scientists to collect, publish, store, share, discover and re-use data in ways not previously possible. The aim of this poster is to demonstrate how TERN has successfully delivered infrastructure that is enabling a significant cultural and practical shift in Australia's ecosystem science community towards consistent approaches for data collection, metadata, data licensing and data publishing. TERN enables multiple disciplines within the ecosystem sciences to collect, store and publish their data more effectively and efficiently. A critical part of TERN's approach has been to build on existing data collection activities, networks and skilled people, enabling further coordination and collaboration to build each data collection facility and coordinate data publishing. Data collection in TERN is through discipline-based facilities covering long-term collection of: (1) systematic plot-based measurements of vegetation structure, composition and faunal biodiversity; (2) instrumented towers making systematic measurements of solar, water and gas fluxes; and (3) satellite and airborne maps of biophysical properties of vegetation, soils and the atmosphere. Several other facilities collect and integrate environmental data to produce national products for fauna and vegetation surveys, soils and coastal data, as well as integrated or synthesised products for modelling applications.
Data management, publishing and sharing in TERN are implemented through a tailored data licensing framework suitable for ecosystem data, national standards for metadata, a DOI-minting service, and context-appropriate data repositories and portals. The TERN data infrastructure is based on a loosely coupled 'network of networks'. The data formats used across the TERN facilities range from NetCDF and comma-separated values to descriptive documents. Metadata standards include ISO 19115, Ecological Metadata Language and rich, semantically enabled contextual information. Data services range from Web Map Service, Web Feature Service and OPeNDAP to file servers and KNB Metacat. These approaches enable each data collection facility to maintain its discipline-based data collection and storage protocols. TERN facility metadata are harvested regularly by the central TERN Data Discovery Portal and converted to a national standard format. This approach enables centralised discovery, access and re-use of data simply and effectively, while maintaining disciplinary diversity. Effort is still required to support the cultural shift towards acceptance of effective data management, publication, sharing and re-use as standard practice. To this end, TERN's future activities will be directed to supporting this transformation, undertaking 'education' to enable ecosystem scientists to take full advantage of TERN's infrastructure, and providing training and guidance for best-practice data management.
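The harvest-and-convert pattern described above (facility records in different metadata dialects mapped to one common portal schema) can be sketched as follows. This is an illustrative toy, not TERN's actual code; the dialect names, field names and sample records are assumptions for the example:

```python
# Hedged sketch of a metadata crosswalk: records documented in different
# metadata dialects (e.g. ISO 19115-like, EML-like) are normalized into a
# single national-standard schema for a central discovery portal.

# Per-dialect mapping from source field name to the common portal field name
# (field names here are simplified placeholders, not real schema elements).
FIELD_MAP = {
    "iso19115": {"title": "title", "abstract": "abstract"},
    "eml": {"dataset_title": "title", "summary": "abstract"},
}

def harvest(records):
    """Normalize heterogeneous facility records to one discovery-portal schema."""
    portal = []
    for rec in records:
        mapping = FIELD_MAP[rec["dialect"]]
        portal.append({common: rec[src] for src, common in mapping.items()})
    return portal

portal = harvest([
    {"dialect": "iso19115", "title": "Flux tower data", "abstract": "CO2 fluxes"},
    {"dialect": "eml", "dataset_title": "Plot survey", "summary": "Vegetation plots"},
])
```

The design point this illustrates is the one the abstract makes: each facility keeps its own discipline-based protocols, and only the harvester needs to know the crosswalk.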

  3. Building Knowledge Graphs for NASA's Earth Science Enterprise

    NASA Astrophysics Data System (ADS)

    Zhang, J.; Lee, T. J.; Ramachandran, R.; Shi, R.; Bao, Q.; Gatlin, P. N.; Weigel, A. M.; Maskey, M.; Miller, J. J.

    2016-12-01

    Inspired by the Google Knowledge Graph, we have been building a prototype knowledge graph for Earth scientists, connecting information and data in NASA's Earth science enterprise. Our primary goal is to advance the state of the art in NASA knowledge extraction capability by going beyond traditional catalog search and linking different distributed information (such as data, publications, services, tools and people). This will enable a more efficient pathway to knowledge discovery. While the Google Knowledge Graph provides impressive semantic-search and aggregation capabilities, it is limited to search topics for the general public. We use a similar knowledge graph approach to semantically link information gathered from a wide variety of sources within the NASA Earth science enterprise. Our prototype serves as a proof of concept of the viability of building an operational "knowledge base" system for NASA Earth science. Information is pulled from structured sources (such as the NASA CMR catalog, GCMD, and Climate and Forecast Conventions) and unstructured sources (such as research papers). Leveraging modern techniques of machine learning, information retrieval and deep learning, we provide an integrated data mining and information discovery environment to help Earth scientists use the best data, tools, methodologies and models available to answer a hypothesis. Our knowledge graph would be able to answer questions like: Which articles discuss topics investigating similar hypotheses? How have these methods been tested for accuracy? Which approaches have been highly cited within the scientific community? What variables were used for this method, and what datasets were used to represent them? What processing was necessary to use this data? These questions then lead researchers and citizen scientists to investigate the sources where data can be found, available user guides, information on how the data was acquired, and available tools and models to use with this data.
As a proof of concept, we focus on a well-defined domain, hurricane science, linking research articles and their findings, data, people and tools/services. Modern information retrieval, natural language processing, machine learning and deep learning techniques are applied to build the knowledge network.
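The linking idea behind such a knowledge graph can be illustrated with a tiny in-memory triple store, over which questions like "which articles discuss hurricane intensity?" become pattern queries. All entities and predicates below are invented for the example; a real system would use an RDF store and extracted relations:

```python
# Illustrative sketch of knowledge-graph-style linking: facts are stored as
# (subject, predicate, object) triples, and questions are wildcard patterns.

triples = [
    ("paper:A", "discusses", "topic:hurricane-intensity"),
    ("paper:A", "uses", "dataset:TRMM"),
    ("paper:B", "discusses", "topic:hurricane-intensity"),
    ("paper:B", "uses", "dataset:GPM"),
]

def query(pattern):
    """Match a (subject, predicate, object) pattern; None acts as a wildcard."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# "Which articles discuss hurricane intensity?"
papers = [s for s, _, _ in query((None, "discusses", "topic:hurricane-intensity"))]
```

Chaining such queries (papers for a topic, then datasets for those papers) is what makes the linked representation more powerful than a flat catalog search.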

  4. Empowering Accelerated Personal, Professional and Scholarly Discovery among Information Seekers: An Educational Vision

    ERIC Educational Resources Information Center

    Harmon, Glynn

    2013-01-01

    The term discovery applies herein to the successful outcome of inquiry in which a significant personal, professional or scholarly breakthrough or insight occurs, and which is individually or socially acknowledged as a key contribution to knowledge. Since discoveries culminate at fixed points in time, discoveries can serve as an outcome metric for…

  5. Application of Ontologies for Big Earth Data

    NASA Astrophysics Data System (ADS)

    Huang, T.; Chang, G.; Armstrong, E. M.; Boening, C.

    2014-12-01

    Connected data is smarter data! Earth science research infrastructure must do more than support temporal and geospatial discovery of satellite data. As the Earth science data archives continue to expand across NASA data centers, the research communities are demanding smarter data services. A successful research infrastructure must be able to present researchers with the complete picture, that is, datasets with linked citations, related interdisciplinary data, imagery, current events, social media discussions, and scientific data tools relevant to the particular dataset. The popular Semantic Web for Earth and Environmental Terminology (SWEET) is a collection of ontologies and concepts designed to improve discovery and application of Earth science data. The SWEET ontology collection was initially developed to capture the relationships between keywords in the NASA Global Change Master Directory (GCMD). Over the years this popular collection has expanded to cover over 200 ontologies and 6000 concepts, enabling scalable classification of Earth system science and space science concepts. This presentation discusses semantic web technologies as the enabling technology for data-intensive science. We will discuss the application of the SWEET ontologies as a critical component in knowledge-driven research infrastructure for several recent projects, including the DARPA Ontological System for Context Artifact and Resources (OSCAR), the 2013 NASA ACCESS Virtual Quality Screening Service (VQSS), and the 2013 NASA Sea Level Change Portal (SLCP). The presentation will also discuss the benefits of using semantic web technologies in developing research infrastructure for big Earth science data, in an attempt to "accommodate all domains and provide the necessary glue for information to be cross-linked, correlated, and discovered in a semantically rich manner" [1].
[1] Savas Parastatidis: A platform for all that we know: creating a knowledge-driven research infrastructure. The Fourth Paradigm, 2009: 165-172.

  6. A diversity-oriented synthesis strategy enabling the combinatorial-type variation of macrocyclic peptidomimetic scaffolds

    PubMed Central

    Isidro-Llobet, Albert; Hadje Georgiou, Kathy; Galloway, Warren R. J. D.; Giacomini, Elisa; Hansen, Mette R.; Méndez-Abt, Gabriela; Tan, Yaw Sing; Carro, Laura; Sore, Hannah F.

    2015-01-01

    Macrocyclic peptidomimetics are associated with a broad range of biological activities. However, despite such potentially valuable properties, the macrocyclic peptidomimetic structural class is generally considered to be poorly explored within drug discovery. This has been attributed to the lack of general methods for producing collections of macrocyclic peptidomimetics with high levels of structural, and thus shape, diversity. In particular, there is a lack of scaffold diversity in current macrocyclic peptidomimetic libraries; indeed, the efficient construction of diverse molecular scaffolds presents a formidable general challenge to the synthetic chemist. Herein we describe a new, advanced strategy for the diversity-oriented synthesis (DOS) of macrocyclic peptidomimetics that enables the combinatorial variation of molecular scaffolds (core macrocyclic ring architectures). The generality and robustness of this DOS strategy is demonstrated by the step-efficient synthesis of a structurally diverse library of over 200 macrocyclic peptidomimetic compounds, each based around a distinct molecular scaffold and isolated in milligram quantities from readily available building blocks. To the best of our knowledge, this represents an unprecedented level of scaffold diversity in a synthetically derived library of macrocyclic peptidomimetics. Cheminformatic analysis indicated that the library compounds access regions of chemical space distinct from those addressed by top-selling brand-name drugs and macrocyclic natural products, illustrating the value of our DOS approach in sampling regions of chemical space underexploited in current drug discovery efforts. An analysis of three-dimensional molecular shapes illustrated that the DOS library has a relatively high level of shape diversity. PMID:25778821

  7. Understanding drug targets: no such thing as bad news.

    PubMed

    Roberts, Ruth A

    2018-05-24

    How can small-to-medium pharma and biotech companies enhance the chances of running a successful drug project and maximise the return on a limited number of assets? Having a full appreciation of the safety risks associated with proposed drug targets is a crucial element in understanding the unwanted side-effects that might stop a project in its tracks. Having this information is necessary to complement knowledge about the probable efficacy of a future drug. However, the lack of data-rich insight into drug-target safety is one of the major causes of drug-project failure today. Conducting comprehensive target-safety reviews early in the drug discovery process enables project teams to make the right decisions about which drug targets to take forward. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Improved Access to NSF Funded Ocean Research Data

    NASA Astrophysics Data System (ADS)

    Chandler, C. L.; Groman, R. C.; Kinkade, D.; Shepherd, A.; Rauch, S.; Allison, M. D.; Gegg, S. R.; Wiebe, P. H.; Glover, D. M.

    2015-12-01

    Data from NSF-funded, hypothesis-driven research comprise an essential part of the research results upon which we base our knowledge and improved understanding of the impacts of climate change. Initially funded in 2006, the Biological and Chemical Oceanography Data Management Office (BCO-DMO) works with marine scientists to ensure that data from NSF-funded ocean research programs are fully documented and freely available for future use. BCO-DMO works in partnership with information technology professionals, other marine data repositories and national data archive centers to ensure long-term preservation of these valuable environmental research data. Data contributed to BCO-DMO by the original investigators are enhanced with sufficient discipline-specific documentation and published in a variety of standards-compliant forms designed to enable discovery and support accurate re-use.

  9. How to Quickly Import CAD Geometry into Thermal Desktop

    NASA Technical Reports Server (NTRS)

    Wright, Shonte; Beltran, Emilio

    2002-01-01

    There are several groups at JPL (Jet Propulsion Laboratory) that are committed to concurrent design efforts, two are featured here. Center for Space Mission Architecture and Design (CSMAD) enables the practical application of advanced process technologies in JPL's mission architecture process. Team I functions as an incubator for projects that are in the Discovery, and even pre-Discovery proposal stages. JPL's concurrent design environment is to a large extent centered on the CAD (Computer Aided Design) file. During concurrent design sessions CAD geometry is ported to other more specialized engineering design packages.

  10. Pulsar discovery by global volunteer computing.

    PubMed

    Knispel, B; Allen, B; Cordes, J M; Deneva, J S; Anderson, D; Aulbert, C; Bhat, N D R; Bock, O; Bogdanov, S; Brazier, A; Camilo, F; Champion, D J; Chatterjee, S; Crawford, F; Demorest, P B; Fehrmann, H; Freire, P C C; Gonzalez, M E; Hammer, D; Hessels, J W T; Jenet, F A; Kasian, L; Kaspi, V M; Kramer, M; Lazarus, P; van Leeuwen, J; Lorimer, D R; Lyne, A G; Machenschalk, B; McLaughlin, M A; Messenger, C; Nice, D J; Papa, M A; Pletsch, H J; Prix, R; Ransom, S M; Siemens, X; Stairs, I H; Stappers, B W; Stovall, K; Venkataraman, A

    2010-09-10

    Einstein@Home aggregates the computer power of hundreds of thousands of volunteers from 192 countries to mine large data sets. It has now found a 40.8-hertz isolated pulsar in radio survey data from the Arecibo Observatory taken in February 2007. Additional timing observations indicate that this pulsar is likely a disrupted recycled pulsar. PSR J2007+2722's pulse profile is remarkably wide with emission over almost the entire spin period; the pulsar likely has closely aligned magnetic and spin axes. The massive computing power provided by volunteers should enable many more such discoveries.

  11. Ensemble-based docking: From hit discovery to metabolism and toxicity predictions

    DOE PAGES

    Evangelista, Wilfredo; Weir, Rebecca; Ellingson, Sally; ...

    2016-07-29

    The use of ensemble-based docking for the exploration of biochemical pathways and toxicity prediction of drug candidates is described. We describe the computational engineering work necessary to enable large ensemble docking campaigns on supercomputers. We show examples where ensemble-based docking has significantly increased the number and the diversity of validated drug candidates. Finally, we illustrate how ensemble-based docking can be extended beyond hit discovery and toward providing a structural basis for the prediction of metabolism and off-target binding relevant to pre-clinical and clinical trials.

  12. Borderless Geospatial Web (bolegweb)

    NASA Astrophysics Data System (ADS)

    Cetl, V.; Kliment, T.; Kliment, M.

    2016-06-01

    The effective access and use of geospatial information (GI) resources is of critical importance in the modern knowledge-based society. Standard web services defined by the Open Geospatial Consortium (OGC) are frequently used within implementations of spatial data infrastructures (SDIs) to facilitate the discovery and use of geospatial data. These data are stored in databases located in a layer called the invisible web and are thus ignored by search engines. An SDI uses a catalogue (discovery) service for the web as a gateway to the GI world through metadata defined by ISO standards, which are structurally different from OGC metadata. Therefore, a crosswalk needs to be implemented to bridge the OGC resources discovered on the mainstream web with those documented by metadata in an SDI, to enrich its information extent. A public, global and user-friendly portal of OGC resources available on the web ensures and enhances the use of GI within a multidisciplinary context and bridges the geospatial web from the end-user perspective, thus opening its borders to everybody. The project "Crosswalking the layers of geospatial information resources to enable a borderless geospatial web", with the acronym BOLEGWEB, is ongoing as a postdoctoral research project at the Faculty of Geodesy, University of Zagreb, Croatia (http://bolegweb.geof.unizg.hr/). The research leading to the results of the project has received funding from the European Union Seventh Framework Programme (FP7 2007-2013) under Marie Curie FP7-PEOPLE-2011-COFUND. The project started in November 2014 and is planned to be finished by the end of 2016. This paper provides an overview of the project, its research questions and methodology, the results achieved so far, and future steps.

  13. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel

    PubMed Central

    Grapov, Dmitry; Newman, John W.

    2012-01-01

    Summary: Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet-embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enable interactive and dynamic analyses of large datasets by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Availability and implementation: Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010). Contact: John.Newman@ars.usda.gov Supplementary Information: Installation instructions, tutorials and a users' manual are available at http://sourceforge.net/projects/imdev/. PMID:22815358
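Among the analyses listed, "multiple comparisons with false discovery correction" is a standard procedure that can be sketched concisely. The following is a plain-Python illustration of the Benjamini-Hochberg false discovery rate procedure (a textbook algorithm, not imDEV's code, which wraps R routines); the p-values in the example are made up:

```python
# Benjamini-Hochberg FDR correction: sort p-values, find the largest rank k
# with p_(k) <= (k/m) * alpha, and reject the k smallest p-values.

def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a boolean list: True where the hypothesis is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k = rank  # largest rank passing the step-up criterion
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected

flags = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205])
```

With these eight p-values at alpha = 0.05, only the two smallest survive the step-up criterion, which is less conservative than a Bonferroni correction on the same data.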

  14. Some experiences and opportunities for big data in translational research.

    PubMed

    Chute, Christopher G; Ullman-Cullere, Mollie; Wood, Grant M; Lin, Simon M; He, Min; Pathak, Jyotishman

    2013-10-01

    Health care has become increasingly information intensive. The advent of genomic data, integrated into patient care, significantly accelerates the complexity and amount of clinical data. Translational research in the present day increasingly embraces new biomedical discovery in this data-intensive world, thus entering the domain of "big data." The Electronic Medical Records and Genomics consortium has taught us many lessons; simultaneously, advances in commodity computing methods have enabled the academic community to affordably manage and process big data. Although great promise can emerge from the adoption of big data methods and philosophy, the heterogeneity and complexity of clinical data, in particular, pose additional challenges for big data inferencing and clinical application. However, the ultimate comparability and consistency of heterogeneous clinical information sources can be enhanced by existing and emerging data standards, which promise to bring order to clinical data chaos. Meaningful Use data standards in particular have already simplified the task of identifying clinical phenotyping patterns in electronic health records.

  15. AB022. Harnessing big data to transform clinical care of cardiovascular diseases

    PubMed Central

    Cutiongco-de la Paz, Eva Maria

    2015-01-01

    Diseases of the heart and vascular system are the leading causes of mortality worldwide. A number of risk factors have already been identified, such as obesity, diabetes and smoking; in recent years, research has shifted its focus to genetic risk factors. Discoveries on the role of genes, partnered with technological developments, have enabled advances in the understanding of human genetics and its influence on disease and treatment. There are now initiatives to combine medical records with genetic and other molecular data into a single “knowledge network” to achieve what is aptly known as precision medicine. With next-generation sequencing readily available at a more affordable cost, it is expected that genetic information on patients will be increasingly available and can be used to guide clinical decisions. The big data generated and stored necessitate broad and extensive interpretation to be valuable in clinical care. Accumulating evidence on the use of such genetic information in cardiovascular clinics will be presented.

  16. Microsaccadic sampling of moving image information provides Drosophila hyperacute vision

    PubMed Central

    Solanki, Narendra; Rien, Diana; Jaciuch, David; Dongre, Sidhartha Anil; Blanchard, Florence; de Polavieja, Gonzalo G; Hardie, Roger C; Takalo, Jouni

    2017-01-01

    Small fly eyes should not see fine image details. Because flies exhibit saccadic visual behaviors and their compound eyes have relatively few ommatidia (sampling points), their photoreceptors would be expected to generate blurry and coarse retinal images of the world. Here we demonstrate that Drosophila see the world far better than predicted from the classic theories. By using electrophysiological, optical and behavioral assays, we found that R1-R6 photoreceptors’ encoding capacity in time is maximized for fast high-contrast bursts, which resemble their light input during saccadic behaviors. Over space, meanwhile, R1-R6s resolve moving objects at saccadic speeds beyond the predicted motion-blur limit. Our results show how refractory phototransduction and rapid photomechanical photoreceptor contractions jointly sharpen retinal images of moving objects in space-time, enabling hyperacute vision, and explain how such microsaccadic information sampling exceeds the compound eyes’ optical limits. These discoveries elucidate how acuity depends upon photoreceptor function and eye movements. PMID:28870284

  17. IBS: an illustrator for the presentation and visualization of biological sequences

    PubMed Central

    Liu, Wenzhong; Xie, Yubin; Ma, Jiyong; Luo, Xiaotong; Nie, Peng; Zuo, Zhixiang; Lahrmann, Urs; Zhao, Qi; Zheng, Yueyuan; Zhao, Yong; Xue, Yu; Ren, Jian

    2015-01-01

    Summary: Biological sequence diagrams are fundamental for visualizing various functional elements in protein or nucleotide sequences, enabling both a summarization and presentation of existing information and a means of making intuitive new discoveries. Here, we present a software package called Illustrator of Biological Sequences (IBS) that can be used for representing the organization of either protein or nucleotide sequences in a convenient, efficient and precise manner. Multiple options are provided in IBS, and biological sequences can be manipulated, recolored or rescaled in a user-defined mode. Also, the final representational artwork can be directly exported as a publication-quality figure. Availability and implementation: The standalone package of IBS was implemented in JAVA, while the online service was implemented in HTML5 and JavaScript. Both the standalone package and the online service are freely available at http://ibs.biocuckoo.org. Contact: renjian.sysu@gmail.com or xueyu@hust.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26069263

  18. CORUM: the comprehensive resource of mammalian protein complexes

    PubMed Central

    Ruepp, Andreas; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Stransky, Michael; Waegele, Brigitte; Schmidt, Thorsten; Doudieu, Octave Noubibou; Stümpflen, Volker; Mewes, H. Werner

    2008-01-01

    Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by critical reading of the scientific literature by expert annotators. Information about protein complexes includes protein complex names, subunits, literature references and the function of the complexes. For functional annotation, we use the FunCat catalogue, which enables organization of the protein complex space into biologically meaningful subsets. The database contains more than 1750 protein complexes built from 2400 different genes, representing 12% of the protein-coding genes in humans. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology and analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes. PMID:17965090

  19. Some experiences and opportunities for big data in translational research

    PubMed Central

    Chute, Christopher G.; Ullman-Cullere, Mollie; Wood, Grant M.; Lin, Simon M.; He, Min; Pathak, Jyotishman

    2014-01-01

    Health care has become increasingly information intensive. The advent of genomic data, integrated into patient care, significantly accelerates the complexity and amount of clinical data. Translational research in the present day increasingly embraces new biomedical discovery in this data-intensive world, thus entering the domain of “big data.” The Electronic Medical Records and Genomics consortium has taught us many lessons; simultaneously, advances in commodity computing methods have enabled the academic community to affordably manage and process big data. Although great promise can emerge from the adoption of big data methods and philosophy, the heterogeneity and complexity of clinical data, in particular, pose additional challenges for big data inferencing and clinical application. However, the ultimate comparability and consistency of heterogeneous clinical information sources can be enhanced by existing and emerging data standards, which promise to bring order to clinical data chaos. Meaningful Use data standards in particular have already simplified the task of identifying clinical phenotyping patterns in electronic health records. PMID:24008998

  20. Nanoinformatics knowledge infrastructures: bringing efficient information management to nanomedical research.

    PubMed

    de la Iglesia, D; Cachau, R E; García-Remesal, M; Maojo, V

    2013-11-27

    Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.

  1. Nanoinformatics knowledge infrastructures: bringing efficient information management to nanomedical research

    NASA Astrophysics Data System (ADS)

    de la Iglesia, D.; Cachau, R. E.; García-Remesal, M.; Maojo, V.

    2013-01-01

    Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.

  2. Cryo-EM in drug discovery: achievements, limitations and prospects.

    PubMed

    Renaud, Jean-Paul; Chari, Ashwin; Ciferri, Claudio; Liu, Wen-Ti; Rémigy, Hervé-William; Stark, Holger; Wiesmann, Christian

    2018-06-08

    Cryo-electron microscopy (cryo-EM) of non-crystalline single particles is a biophysical technique that can be used to determine the structure of biological macromolecules and assemblies. Historically, its potential for application in drug discovery has been heavily limited by two issues: the minimum size of the structures it can be used to study and the resolution of the images. However, recent technological advances - including the development of direct electron detectors and more effective computational image analysis techniques - are revolutionizing the utility of cryo-EM, leading to a burst of high-resolution structures of large macromolecular assemblies. These advances have raised hopes that single-particle cryo-EM might soon become an important tool for drug discovery, particularly if they could enable structural determination for 'intractable' targets that are still not accessible to X-ray crystallographic analysis. This article describes the recent advances in the field and critically assesses their relevance for drug discovery as well as discussing at what stages of the drug discovery pipeline cryo-EM can be useful today and what to expect in the near future.

  3. Scientific Prediction and Prophetic Patenting in Drug Discovery.

    PubMed

    Curry, Stephen H; Schneiderman, Anne M

    2015-01-01

Pharmaceutical patenting involves writing claims based both on discoveries already made and on prophecy of future developments in an ongoing project. This is necessitated by the very different timelines of the drug discovery and product development process on the one hand, and successful patenting on the other. If patents are sought too early, there is a risk that patent examiners will disallow claims for lack of enablement. If patenting is delayed, claims are at risk of being denied on the basis of prior art, because the body of relevant known science will have developed significantly while the project was being pursued. This review examines the role of prophetic patenting in relation to the essential predictability of many aspects of drug discovery science, promoting the concepts of discipline-related and project-related prediction. It is especially directed towards patenting activities that support commercialization of academia-based discoveries, where project timelines are long and where experience, and resources to pay for patenting, are limited. The need for improved collaborative understanding among project scientists, technology transfer professionals (in, for example, universities), patent attorneys, and patent examiners is emphasized.

  4. Fragment-Based Phenotypic Lead Discovery: Cell-Based Assay to Target Leishmaniasis.

    PubMed

    Ayotte, Yann; Bilodeau, François; Descoteaux, Albert; LaPlante, Steven R

    2018-05-02

    A rapid and practical approach for the discovery of new chemical matter for targeting pathogens and diseases is described. Fragment-based phenotypic lead discovery (FPLD) combines aspects of traditional fragment-based lead discovery (FBLD), which involves the screening of small-molecule fragment libraries to target specific proteins, with phenotypic lead discovery (PLD), which typically involves the screening of drug-like compounds in cell-based assays. To enable FPLD, a diverse library of fragments was first designed, assembled, and curated. This library of soluble, low-molecular-weight compounds was then pooled to expedite screening. Axenic cultures of Leishmania promastigotes were screened, and single hits were then tested for leishmanicidal activity against intracellular amastigote forms in infected murine bone-marrow-derived macrophages without evidence of toxicity toward mammalian cells. These studies demonstrate that FPLD can be a rapid and effective means to discover hits that can serve as leads for further medicinal chemistry purposes or as tool compounds for identifying known or novel targets. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  5. 7 CFR 1.642 - When must a party supplement or amend information it has previously provided?

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 1 2010-01-01 2010-01-01 false When must a party supplement or amend information it... When must a party supplement or amend information it has previously provided? (a) Discovery. A party must promptly supplement or amend any prior response to a discovery request if it learns that the...

  6. The lessons of Varsovian's reconnaissance

    NASA Technical Reports Server (NTRS)

    Bents, D. J.

    1990-01-01

The role played by advanced technology is illustrated with respect to the anticipated era of discovery and exploration (in space): how bold new exploration initiatives may or may not be enabled. Enabling technology makes the mission feasible. To be truly enabling, however, the technology must not only render the proposed mission technically feasible but also make it economically viable; that is, low enough in cost (relative to the economy supporting it) that urgent national need is not required for justification, and low enough that risks can be programmatically tolerated. An allegorical parallel is drawn to the Roman Empire of the second century AD, shown to have possessed by that time the necessary knowledge, motivation, means, and technical capability to mount, through innovative mission planning, an initiative similar to Columbus' voyage. The Romans failed to do so not because they lacked the vision, but because their technology was not advanced enough to make such a venture economically acceptable. Based on this historical perspective, the author speculates on the outcome of contemporary plans for future exploration, showing how they will be subject to the same historical forces, within limits imposed by the state of technology development, that shaped the timing of that previous era of discovery and exploration.

  7. Demonstration and Science Experiment (DSX) Space Weather Experiment (SWx)

    DTIC Science & Technology

    2009-01-01

environment encountered by medium Earth orbits (MEO), at an altitude range from 6,000 to 15,000 km. The discovery of the earth’s radiation...forecast models that enable future space missions in the medium Earth orbit regime to enable better spacecraft designed to withstand the harsh environment...the size of the sensor and to exploit a compact layout. The inside spherical section has an attraction voltage and the outside section has the

  8. Learning to Love Your Discovery Tool: Strategies for Integrating a Discovery Tool in Face-to-Face, Synchronous, and Asynchronous Instruction

    ERIC Educational Resources Information Center

    Fawley, Nancy; Krysak, Nikki

    2014-01-01

    Some librarians embrace discovery tools while others refuse to use them. This lack of consensus can have consequences for student learning when there is inconsistent use, especially in large-scale instruction programs. The authors surveyed academic librarians whose institutions have a discovery tool and who teach information literacy classes in…

  9. Working with Data: Discovering Knowledge through Mining and Analysis; Systematic Knowledge Management and Knowledge Discovery; Text Mining; Methodological Approach in Discovering User Search Patterns through Web Log Analysis; Knowledge Discovery in Databases Using Formal Concept Analysis; Knowledge Discovery with a Little Perspective.

    ERIC Educational Resources Information Center

    Qin, Jian; Jurisica, Igor; Liddy, Elizabeth D.; Jansen, Bernard J; Spink, Amanda; Priss, Uta; Norton, Melanie J.

    2000-01-01

    These six articles discuss knowledge discovery in databases (KDD). Topics include data mining; knowledge management systems; applications of knowledge discovery; text and Web mining; text mining and information retrieval; user search patterns through Web log analysis; concept analysis; data collection; and data structure inconsistency. (LRW)

  10. Cafe Variome: general-purpose software for making genotype-phenotype data discoverable in restricted or open access contexts.

    PubMed

    Lancaster, Owen; Beck, Tim; Atlan, David; Swertz, Morris; Thangavelu, Dhiwagaran; Veal, Colin; Dalgleish, Raymond; Brookes, Anthony J

    2015-10-01

    Biomedical data sharing is desirable, but problematic. Data "discovery" approaches-which establish the existence rather than the substance of data-precisely connect data owners with data seekers, and thereby promote data sharing. Cafe Variome (http://www.cafevariome.org) was therefore designed to provide a general-purpose, Web-based, data discovery tool that can be quickly installed by any genotype-phenotype data owner, or network of data owners, to make safe or sensitive content appropriately discoverable. Data fields or content of any type can be accommodated, from simple ID and label fields through to extensive genotype and phenotype details based on ontologies. The system provides a "shop window" in front of data, with main interfaces being a simple search box and a powerful "query-builder" that enable very elaborate queries to be formulated. After a successful search, counts of records are reported grouped by "openAccess" (data may be directly accessed), "linkedAccess" (a source link is provided), and "restrictedAccess" (facilitated data requests and subsequent provision of approved records). An administrator interface provides a wide range of options for system configuration, enabling highly customized single-site or federated networks to be established. Current uses include rare disease data discovery, patient matchmaking, and a Beacon Web service. © 2015 WILEY PERIODICALS, INC.
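The three-tier result reporting described above amounts to grouping matched records by access category before any substance is disclosed. A minimal sketch (the record schema and `access` field name are assumptions for illustration, not Cafe Variome's actual API):

```python
from collections import Counter

def access_summary(matching_records):
    """Group a search result set into the three reported access tiers:
    openAccess, linkedAccess, restrictedAccess (field name assumed)."""
    counts = Counter(r["access"] for r in matching_records)
    # Report a count for every tier, even when zero records match it.
    return {tier: counts.get(tier, 0)
            for tier in ("openAccess", "linkedAccess", "restrictedAccess")}
```

Reporting only counts per tier is what makes the approach "discovery": a seeker learns that relevant records exist, and under which access regime, without the data owner exposing record content.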

  11. Recommendations and Ongoing Efforts within the NASA Data Quality Working Group

    NASA Astrophysics Data System (ADS)

    Moroni, D. F.; Ramapriyan, H.; Bagwell, R.; Downs, R. R.

    2015-12-01

Since its inception in March 2014, the NASA Data Quality Working Group (DQWG) has produced a set of 12 high-level recommendations, sifted and aggregated from a prioritized subset of nearly 100 unique recommendations spanning four data quality management phases and distributed between two actionable categories. The four data quality management phases identified by the DQWG are: 1. Capturing (i.e., deriving, collecting, and organizing the information); 2. Describing (i.e., documenting and procuring the information for public consumption); 3. Facilitating Discovery (i.e., publishing and providing access to the information); and 4. Enabling Use (i.e., enhancing the utility of the information). Mapping each recommendation to one or more of these management phases is intended to enable improved assessment of cost, feasibility, and relevancy by the entities responsible for implementing the recommendations. The DQWG further defined two distinct actionable categories: 1) Data Systems and 2) Science. The purpose of these actionable categories is to define who, specifically, is responsible for implementing and adhering to the recommendations; we refer to the responsible entities as the "actionees". Here we summarize each of the high-level recommendations along with their corresponding management phases and actionees. We present what has recently been identified as our set of "low-hanging fruit" recommendations, which are intended for near-term implementation. Finally, we present the status of and motivation for continuing and planned activities, which include but are not limited to: engaging inter-agency and international communities, soliciting more direct feedback from Earth observation missions, and mapping "low-hanging fruit" recommendations to existing solutions.

  12. Mining large heterogeneous data sets in drug discovery.

    PubMed

    Wild, David J

    2009-10-01

Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology, amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. This paper provides a review of the publicly available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them, and discusses recent work in integrative data mining that looks for associations that span multiple sources, including the use of Semantic Web techniques. The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.

  13. Modeling & Informatics at Vertex Pharmaceuticals Incorporated: our philosophy for sustained impact

    NASA Astrophysics Data System (ADS)

    McGaughey, Georgia; Patrick Walters, W.

    2017-03-01

Molecular modelers and informaticians have the unique opportunity to integrate cross-functional data using a myriad of tools, methods and visuals to generate information. Using their drug discovery expertise, they transform information into knowledge that impacts drug discovery. These insights are often formulated locally and then applied more broadly, influencing the discovery of new medicines. This is particularly true in an organization whose members are exposed to projects across the organization, as in the case of the global Modeling & Informatics group at Vertex Pharmaceuticals. From its inception, Vertex has been a leader in the development and use of computational methods for drug discovery. In this paper, we describe the Modeling & Informatics group at Vertex and the underlying philosophy that has driven this team to sustain impact on the discovery of first-in-class transformative medicines.

  14. Using directed information for influence discovery in interconnected dynamical systems

    NASA Astrophysics Data System (ADS)

    Rao, Arvind; Hero, Alfred O.; States, David J.; Engel, James Douglas

    2008-08-01

    Structure discovery in non-linear dynamical systems is an important and challenging problem that arises in various applications such as computational neuroscience, econometrics, and biological network discovery. Each of these systems have multiple interacting variables and the key problem is the inference of the underlying structure of the systems (which variables are connected to which others) based on the output observations (such as multiple time trajectories of the variables). Since such applications demand the inference of directed relationships among variables in these non-linear systems, current methods that have a linear assumption on structure or yield undirected variable dependencies are insufficient. Hence, in this work, we present a methodology for structure discovery using an information-theoretic metric called directed time information (DTI). Using both synthetic dynamical systems as well as true biological datasets (kidney development and T-cell data), we demonstrate the utility of DTI in such problems.
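The directional idea behind DTI can be illustrated with a minimal plug-in estimator of directed influence between two symbol sequences: how much does the past of X tell us about the present of Y, beyond what Y's own past already tells us? This is a simplified one-lag sketch in the spirit of directed time information (and of transfer entropy), not the authors' estimator; the function names are hypothetical.

```python
from collections import Counter
from math import log2

def _entropy(samples):
    # Plug-in Shannon entropy (bits) of a list of hashable symbols.
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def directed_score(x, y, lag=1):
    """Directed influence x -> y estimated as I(X_past; Y_now | Y_past),
    i.e. the conditional mutual information
    H(A,C) + H(B,C) - H(A,B,C) - H(C) with A = X_past, B = Y_now, C = Y_past."""
    triples = [(x[t - lag], y[t], y[t - lag]) for t in range(lag, len(y))]
    ac = [(a, c) for a, _, c in triples]   # joint (X_past, Y_past)
    bc = [(b, c) for _, b, c in triples]   # joint (Y_now, Y_past)
    c_ = [c for _, _, c in triples]        # Y_past alone
    return _entropy(ac) + _entropy(bc) - _entropy(triples) - _entropy(c_)
```

On a pair of trajectories where y simply copies x with a one-step delay, `directed_score(x, y)` approaches the entropy rate of x (about 1 bit for fair coin flips), while `directed_score(y, x)` stays near zero, recovering the directed relationship that undirected dependency measures would miss.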

  15. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, I-Min; Chu, Ken; Ratner, Anna

    2014-10-28

In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes, linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  16. Future Mission Proposal Opportunities: Discovery, New Frontiers, and Project Prometheus

    NASA Technical Reports Server (NTRS)

    Niebur, S. M.; Morgan, T. H.; Niebur, C. S.

    2003-01-01

    The NASA Office of Space Science is expanding opportunities to propose missions to comets, asteroids, and other solar system targets. The Discovery Program continues to be popular, with two sample return missions, Stardust and Genesis, currently in operation. The New Frontiers Program, a new proposal opportunity modeled on the successful Discovery Program, begins this year with the release of its first Announcement of Opportunity. Project Prometheus, a program to develop nuclear electric power and propulsion technology intended to enable a new class of high-power, high-capability investigations, is a third opportunity to propose solar system exploration. All three classes of mission include a commitment to provide data to the Planetary Data System, any samples to the NASA Curatorial Facility at Johnson Space Center, and programs for education and public outreach.

  17. Cyberinfrastructure for Atmospheric Discovery

    NASA Astrophysics Data System (ADS)

    Wilhelmson, R.; Moore, C. W.

    2004-12-01

Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development and adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high performance computing environments including the TeraGrid. Portal, grid, and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Initial coupling of the ROMS and WRF codes has been completed, and parallel I/O is being implemented for these models. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed.
LEAD is addressing the fundamental information technology (IT) research challenges needed to create an integrated, scalable framework for identifying, accessing, preparing, assimilating, predicting, managing, analyzing, mining, and visualizing a broad array of meteorological data and model output, independent of format and physical location. A transforming element of LEAD is Workflow Orchestration for On-Demand, Real-Time, Dynamically-Adaptive Systems (WOORDS), which allows the use of analysis tools, forecast models, and data repositories as dynamically adaptive, on-demand, Grid-enabled systems that can a) change configuration rapidly and automatically in response to weather; b) continually be steered by new data; c) respond to decision-driven inputs from users; d) initiate other processes automatically; and e) steer remote observing technologies to optimize data collection for the problem at hand. Although LEAD efforts are primarily directed at mesoscale meteorology, the IT services being developed have general applicability to other geoscience and environmental sciences. Integration of traditional and new data sources is a crucial component in LEAD for data analysis and assimilation, for integration (ensemble mining) of data from sets of simulations, and for comparing results to observational data. As part of the integration effort, LEAD is creating a myLEAD metadata catalog service: a personal metacatalog that extends the Globus MCS system and is built on top of the OGSA-DAI system developed at the National e-Science Centre in Edinburgh, Scotland.

  18. Formal and Informal Context Factors as Contributors to Student Engagement in a Guided Discovery-Based Program of Game Design Learning

    ERIC Educational Resources Information Center

    Reynolds, Rebecca; Chiu, Ming Ming

    2013-01-01

    This paper explored informal (after-school) and formal (elective course in-school) learning contexts as contributors to middle-school student attitudinal changes in a guided discovery-based and blended e-learning program in which students designed web games and used social media and information resources for a full school year. Formality of the…

  19. Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy.

    PubMed

    Bekhuis, Tanja

    2006-04-03

    Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. Report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.
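Swanson's ABC model of literature-based discovery rests on a simple structure: topic A and topic C are never studied together, but both co-occur in the literature with intermediate B terms, suggesting a hidden hypothesis linking A and C. That structure can be sketched as a small graph traversal; the `cooccur` map and the term names in the test are toy data echoing Swanson's fish-oil/Raynaud example, not anything mined from MEDLINE.

```python
def abc_candidates(cooccur, a_term):
    """Swanson-style ABC discovery: return C terms that share an
    intermediate B term with A but never co-occur with A directly,
    mapped to the B terms that link them.

    `cooccur` maps each term to the set of terms it co-occurs with
    (assumed symmetric, e.g. built from title/abstract co-mentions)."""
    b_terms = cooccur.get(a_term, set())
    candidates = {}
    for b in b_terms:
        for c in cooccur.get(b, set()):
            # Keep only terms with no direct A-C co-occurrence.
            if c != a_term and c not in b_terms:
                candidates.setdefault(c, set()).add(b)
    return candidates
```

In a real system the candidate C terms would then be ranked (e.g. by the number of distinct B links) and handed to domain experts; the "testing" step the abstract mentions corresponds to searching the literature for evidence supporting each proposed A-C relationship.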

  20. Rembrandt: Helping Personalized Medicine Become a Reality Through Integrative Translational Research

    PubMed Central

    Madhavan, Subha; Zenklusen, Jean-Claude; Kotliarov, Yuri; Sahni, Himanso; Fine, Howard A.; Buetow, Kenneth

    2009-01-01

Finding better therapies for the treatment of brain tumors is hampered by the lack of consistently obtained molecular data in a large sample set, and by the inability to integrate biomedical data from disparate sources to enable translation of therapies from bench to bedside. Hence, a critical factor in the advancement of biomedical research and clinical translation is the ease with which data can be integrated, redistributed and analyzed both within and across functional domains. Novel biomedical informatics infrastructure and tools are essential for developing individualized patient treatment based on the specific genomic signatures in each patient’s tumor. Here we present Rembrandt, the Repository of Molecular BRAin Neoplasia DaTa, a cancer clinical genomics database and a web-based data mining and analysis platform aimed at facilitating discovery by connecting the dots between clinical information and genomic characterization data. To date, Rembrandt contains data generated through the Glioma Molecular Diagnostic Initiative from 874 glioma specimens, comprising nearly 566 gene expression arrays, 834 copy number arrays and 13,472 clinical phenotype data points. Data can be queried and visualized for a selected gene across all data platforms or for multiple genes in a selected platform. Additionally, gene sets can be limited to clinically important annotations including secreted, kinase, membrane, and known gene-anomaly pairs to facilitate the discovery of novel biomarkers and therapeutic targets. We believe that REMBRANDT represents a prototype of how high throughput genomic and clinical data can be integrated in a way that will allow expeditious and efficient translation of laboratory discoveries to the clinic. PMID:19208739

  1. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: a genome-wide catalog of Drosophila conserved DNA sequence clusters; cis-Decoder discovers functionally related enhancers; functionally related enhancers share balanced sequence element copy numbers; many enhancers function during multiple phases of development. PMID:22174086
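The scoring principle described above, shared repeated elements weighted by the balance of their copy numbers, can be illustrated with a toy k-mer comparison. This is a hypothetical simplification for intuition only, not the published cis-Decoder algorithm or its actual scoring terms.

```python
from collections import Counter

def kmer_counts(seq, k=6):
    # Copy number of every k-mer in a sequence.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def shared_repeat_score(seq_a, seq_b, k=6):
    """Toy CSC comparison: for each k-mer that is repeated (copy number
    >= 2) in BOTH sequences, add a balance weight in (0, 1] -- equal
    copy numbers contribute 1.0, lopsided copy numbers contribute less."""
    a, b = kmer_counts(seq_a, k), kmer_counts(seq_b, k)
    score = 0.0
    for kmer in a.keys() & b.keys():
        if a[kmer] >= 2 and b[kmer] >= 2:
            score += min(a[kmer], b[kmer]) / max(a[kmer], b[kmer])
    return score
```

Requiring repeats on both sides and rewarding balanced copy numbers mirrors the abstract's diagnostic: candidate co-regulating enhancers are those whose shared elements appear with comparable multiplicity, not merely those that share any sequence at all.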

  2. Future technology insight: mass spectrometry imaging as a tool in drug research and development

    PubMed Central

    Cobice, D F; Goodwin, R J A; Andren, P E; Nilsson, A; Mackay, C L; Andrew, R

    2015-01-01

    In pharmaceutical research, understanding the biodistribution, accumulation and metabolism of drugs in tissue plays a key role during drug discovery and development. In particular, information regarding pharmacokinetics, pharmacodynamics and transport properties of compounds in tissues is crucial during early screening. Historically, the abundance and distribution of drugs have been assessed by well-established techniques such as quantitative whole-body autoradiography (WBA) or tissue homogenization with LC/MS analysis. However, WBA does not distinguish active drug from its metabolites and LC/MS, while highly sensitive, does not report spatial distribution. Mass spectrometry imaging (MSI) can discriminate drug and its metabolites and endogenous compounds, while simultaneously reporting their distribution. MSI data are influencing drug development and currently used in investigational studies in areas such as compound toxicity. In in vivo studies MSI results may soon be used to support new drug regulatory applications, although clinical trial MSI data will take longer to be validated for incorporation into submissions. We review the current and future applications of MSI, focussing on applications for drug discovery and development, with examples to highlight the impact of this promising technique in early drug screening. Recent sample preparation and analysis methods that enable effective MSI, including quantitative analysis of drugs from tissue sections will be summarized and key aspects of methodological protocols to increase the effectiveness of MSI analysis for previously undetectable targets addressed. These examples highlight how MSI has become a powerful tool in drug research and development and offers great potential in streamlining the drug discovery process. PMID:25766375

  3. Commentary: Why Pharmaceutical Scientists in Early Drug Discovery Are Critical for Influencing the Design and Selection of Optimal Drug Candidates.

    PubMed

    Landis, Margaret S; Bhattachar, Shobha; Yazdanian, Mehran; Morrison, John

    2018-01-01

    This commentary reflects the collective view of pharmaceutical scientists from four different organizations with extensive experience in the field of drug discovery support. Herein, engaging discussion is presented on the current and future approaches for the selection of the most optimal and developable drug candidates. Over the past two decades, developability assessment programs have been implemented with the intention of improving physicochemical and metabolic properties. However, the complexity of both new drug targets and non-traditional drug candidates provides continuing challenges for developing formulations for optimal drug delivery. The need for more enabled technologies to deliver drug candidates has necessitated an even more active role for pharmaceutical scientists to influence many key molecular parameters during compound optimization and selection. This enhanced role begins at the early in vitro screening stages, where key learnings regarding the interplay of molecular structure and pharmaceutical property relationships can be derived. Performance of the drug candidates in formulations intended to support key in vivo studies provides important information on chemotype-formulation compatibility relationships. Structure modifications to support the selection of the solid form are also important to consider, and predictive in silico models are being rapidly developed in this area. Ultimately, the role of pharmaceutical scientists in drug discovery now extends beyond rapid solubility screening, early form assessment, and data delivery. This multidisciplinary role has evolved to include the practice of proactively taking part in the molecular design to better align solid form and formulation requirements to enhance developability potential.

  4. Generation of Multiple Metadata Formats from a Geospatial Data Repository

    NASA Astrophysics Data System (ADS)

    Hudspeth, W. B.; Benedict, K. K.; Scott, S.

    2012-12-01

The Earth Data Analysis Center (EDAC) at the University of New Mexico is partnering with the CYBERShARE and Environmental Health Group from the Center for Environmental Resource Management (CERM), located at the University of Texas, El Paso (UTEP), the Biodiversity Institute at the University of Kansas (KU), and the New Mexico Geo-Epidemiology Research Network (GERN) to provide a technical infrastructure that enables investigation of a variety of climate-driven human/environmental systems. Two significant goals of this NASA-funded project are: a) to increase the use of NASA Earth observational data at EDAC by various modeling communities through enabling better discovery, access, and use of relevant information, and b) to expose these communities to the benefits of provenance for improving understanding and usability of heterogeneous data sources and derived model products. To realize these goals, EDAC has leveraged the core capabilities of its Geographic Storage, Transformation, and Retrieval Engine (Gstore) platform, developed with support of the NSF EPSCoR Program. The Gstore geospatial services platform provides general purpose web services based upon the REST service model, and is capable of data discovery, access, and publication functions, metadata delivery functions, data transformation, and auto-generated OGC services for those data products that can support those services. Central to the NASA ACCESS project is the delivery of geospatial metadata in a variety of formats, including ISO 19115-2/19139, FGDC CSDGM, and the Proof Markup Language (PML). This presentation details the extraction and persistence of relevant metadata in the Gstore data store, and their transformation into multiple metadata formats that are increasingly utilized by the geospatial community to document not only core library catalog elements (e.g.
title, abstract, publication date, geographic extent, projection information, and database elements), but also the processing steps used to generate derived modeling products. In particular, we discuss the generation and service delivery of provenance, or a trace of the data sources and analytical methods used in a scientific analysis, for archived data. We discuss the workflows developed by EDAC to capture end-to-end provenance, the storage model for those data in a delivery format independent data structure, and delivery of PML, ISO, and FGDC documents to clients requesting those products.
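The one-record, many-serializations pattern described above can be sketched as a single internal metadata record rendered into more than one XML output. This is a minimal illustration only: the record fields, element subsets, and function names below are hypothetical, not the actual Gstore schema or the full FGDC CSDGM / ISO 19139 element sets (both of which also require namespaces and many mandatory elements omitted here).

```python
# Sketch: render one internally stored metadata record into two target
# XML formats. Field names and element subsets are illustrative only.
import xml.etree.ElementTree as ET

record = {
    "title": "Sample raster product",
    "abstract": "Derived model output.",
    "pubdate": "2012-12-01",
    "west": "-109.05", "east": "-103.00", "north": "37.00", "south": "31.33",
}

def to_fgdc(rec):
    """Minimal FGDC CSDGM-style fragment (identification section only)."""
    root = ET.Element("metadata")
    idinfo = ET.SubElement(root, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "title").text = rec["title"]
    ET.SubElement(citeinfo, "pubdate").text = rec["pubdate"]
    ET.SubElement(ET.SubElement(idinfo, "descript"), "abstract").text = rec["abstract"]
    bounding = ET.SubElement(ET.SubElement(idinfo, "spdom"), "bounding")
    for tag, key in [("westbc", "west"), ("eastbc", "east"),
                     ("northbc", "north"), ("southbc", "south")]:
        ET.SubElement(bounding, tag).text = rec[key]
    return ET.tostring(root, encoding="unicode")

def to_iso(rec):
    """Minimal ISO 19139-style fragment (namespaces omitted for brevity)."""
    root = ET.Element("MD_Metadata")
    ident = ET.SubElement(root, "identificationInfo")
    ET.SubElement(ident, "title").text = rec["title"]
    ET.SubElement(ident, "abstract").text = rec["abstract"]
    ET.SubElement(ident, "date").text = rec["pubdate"]
    return ET.tostring(root, encoding="unicode")
```

Because every serializer reads from the same stored record, supporting another format (e.g. PML) becomes a matter of adding one more render function rather than re-extracting metadata.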

  5. Information Fusion for Natural and Man-Made Disasters

    DTIC Science & Technology

    2007-01-31

comprehensively large, and metaphysically accurate model of situations, through which specific tasks such as situation assessment, knowledge discovery, or the...significance” is always context specific. Event discovery is a very important element of the HLF process, which can lead to knowledge discovery about...expected, given the current state of knowledge. Examples of such behavior may include discovery of a new aggregate or situation, a specific pattern of

  6. Genomics for the identification of novel antimicrobials

    USDA-ARS?s Scientific Manuscript database

    There is a critical need in animal agriculture for developing novel antimicrobials and alternative strategies to reduce the use of antibiotics and address the challenges of antimicrobial resistance. High-throughput gene expression analysis is providing new tools that are enabling the discovery of h...

  7. Discovery of 100K SNP array and its utilization in sugarcane

    USDA-ARS?s Scientific Manuscript database

Next-generation sequencing (NGS) enables us to identify thousands of single nucleotide polymorphism (SNP) markers for genotyping and fingerprinting. However, the process requires very precise bioinformatics analysis and filtering. High throughput SNP array with predefined genomic location co...

  8. The iPlant collaborative: cyberinfrastructure for enabling data to discovery for the life sciences

    USDA-ARS?s Scientific Manuscript database

The iPlant Collaborative provides life science research communities access to comprehensive, scalable, and cohesive computational infrastructure for data management; identity management; collaboration tools; and cloud, high-performance, high-throughput computing. iPlant provides training, learning m...

  9. Simultaneous Proteomic Discovery and Targeted Monitoring using Liquid Chromatography, Ion Mobility Spectrometry, and Mass Spectrometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burnum-Johnson, Kristin E.; Nie, Song; Casey, Cameron P.

Current proteomics approaches comprise both broad discovery measurements and more quantitative targeted measurements. These two different measurement types are used to initially identify potentially important proteins (e.g., candidate biomarkers) and then enable improved quantification for a limited number of selected proteins. However, both approaches suffer from limitations, particularly the lower sensitivity, accuracy, and quantitation precision of discovery approaches compared to targeted approaches, and the limited proteome coverage provided by targeted approaches. Herein, we describe a new proteomics approach that allows both discovery and targeted monitoring (DTM) in a single analysis using liquid chromatography, ion mobility spectrometry and mass spectrometry (LC-IMS-MS). In DTM, heavy labeled peptides for target ions are spiked into tryptic digests and both the labeled and unlabeled peptides are broadly detected using LC-IMS-MS instrumentation, combining the benefits of discovery and targeted approaches. To understand the possible improvement of the DTM approach, it was compared to LC-MS broad measurements using an accurate mass and time tag database and selected reaction monitoring (SRM) targeted measurements. The DTM results yielded greater peptide/protein coverage and a significant improvement in the detection of lower abundance species compared to LC-MS discovery measurements. DTM was also observed to have detection limits similar to SRM for the targeted measurements, indicating its potential for combining the discovery and targeted approaches.

  10. Discovery radiomics via evolutionary deep radiomic sequencer discovery for pathologically proven lung cancer detection.

    PubMed

    Shafiee, Mohammad Javad; Chung, Audrey G; Khalvati, Farzad; Haider, Masoom A; Wong, Alexander

    2017-10-01

    While lung cancer is the second most diagnosed form of cancer in men and women, a sufficiently early diagnosis can be pivotal in patient survival rates. Imaging-based, or radiomics-driven, detection methods have been developed to aid diagnosticians, but largely rely on hand-crafted features that may not fully encapsulate the differences between cancerous and healthy tissue. Recently, the concept of discovery radiomics was introduced, where custom abstract features are discovered from readily available imaging data. We propose an evolutionary deep radiomic sequencer discovery approach based on evolutionary deep intelligence. Motivated by patient privacy concerns and the idea of operational artificial intelligence, the evolutionary deep radiomic sequencer discovery approach organically evolves increasingly more efficient deep radiomic sequencers that produce significantly more compact yet similarly descriptive radiomic sequences over multiple generations. As a result, this framework improves operational efficiency and enables diagnosis to be run locally at the radiologist's computer while maintaining detection accuracy. We evaluated the evolved deep radiomic sequencer (EDRS) discovered via the proposed evolutionary deep radiomic sequencer discovery framework against state-of-the-art radiomics-driven and discovery radiomics methods using clinical lung CT data with pathologically proven diagnostic data from the LIDC-IDRI dataset. The EDRS shows improved sensitivity (93.42%), specificity (82.39%), and diagnostic accuracy (88.78%) relative to previous radiomics approaches.

  11. Peptoid architectures: elaboration, actuation, and application.

    PubMed

    Yoo, Barney; Kirshenbaum, Kent

    2008-12-01

    Peptoids are peptidomimetic oligomers composed of N-substituted glycine units. Their convenient synthesis enables strict control over the sequence of highly diverse monomers and is capable of generating extensive compound libraries. Recent studies are beginning to explore the relationship between peptoid sequence, structure and function. We describe new approaches to direct the conformation of the peptoid backbone, leading to secondary structures such as helices, loops, and turns. These advances are enabling the discovery of bioactive peptoids and will establish modules for the design and assembly of protein mimetics.

  12. A New System To Support Knowledge Discovery: Telemakus.

    ERIC Educational Resources Information Center

    Revere, Debra; Fuller, Sherrilynne S.; Bugni, Paul F.; Martin, George M.

    2003-01-01

    The Telemakus System builds on the areas of concept representation, schema theory, and information visualization to enhance knowledge discovery from scientific literature. This article describes the underlying theories and an overview of a working implementation designed to enhance the knowledge discovery process through retrieval, visual and…

  13. Knowledge Discovery from Databases: An Introductory Review.

    ERIC Educational Resources Information Center

    Vickery, Brian

    1997-01-01

    Introduces new procedures being used to extract knowledge from databases and discusses rationales for developing knowledge discovery methods. Methods are described for such techniques as classification, clustering, and the detection of deviations from pre-established norms. Examines potential uses of knowledge discovery in the information field.…

  14. Anatomy of an Extensible Open Source PACS.

    PubMed

    Valente, Frederico; Silva, Luís A Bastião; Godinho, Tiago Marques; Costa, Carlos

    2016-06-01

    The conception and deployment of cost effective Picture Archiving and Communication Systems (PACS) is a concern for small to medium medical imaging facilities, research environments, and developing countries' healthcare institutions. Financial constraints and the specificity of these scenarios contribute to a low adoption rate of PACS in those environments. Furthermore, with the advent of ubiquitous computing and new initiatives to improve healthcare information technologies and data sharing, such as IHE and XDS-i, a PACS must adapt quickly to changes. This paper describes Dicoogle, a software framework that enables developers and researchers to quickly prototype and deploy new functionality taking advantage of the embedded Digital Imaging and Communications in Medicine (DICOM) services. This full-fledged implementation of a PACS archive is very amenable to extension due to its plugin-based architecture and out-of-the-box functionality, which enables the exploration of large DICOM datasets and associated metadata. These characteristics make the proposed solution very interesting for prototyping, experimentation, and bridging functionality with deployed applications. Besides being an advanced mechanism for data discovery and retrieval based on DICOM object indexing, it enables the detection of inconsistencies in an institution's data and processes. Several use cases have benefited from this approach such as radiation dosage monitoring, Content-Based Image Retrieval (CBIR), and the use of the framework as support for classes targeting software engineering for clinical contexts.
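The plugin-based architecture described above can be reduced to a small registry sketch. Dicoogle itself is a Java framework, so the class names, methods, and toy query below are hypothetical and serve only to show the pattern: an archive core that delegates indexing and querying to whichever plugins are registered.

```python
# Sketch of a plugin-based archive core: plugins register with the core,
# which fans stored objects out to every index plugin and merges query hits.
class IndexPlugin:
    """Interface a plugin must implement (illustrative, not Dicoogle's API)."""
    name = "base"

    def index(self, dicom_object: dict) -> None:
        raise NotImplementedError

    def query(self, term: str) -> list:
        raise NotImplementedError

class MetadataIndex(IndexPlugin):
    """Toy plugin that keeps objects in memory and matches on Modality."""
    name = "metadata"

    def __init__(self):
        self.docs = []

    def index(self, dicom_object):
        self.docs.append(dicom_object)

    def query(self, term):
        return [d for d in self.docs if term in d.get("Modality", "")]

class ArchiveCore:
    def __init__(self):
        self.plugins = {}

    def register(self, plugin: IndexPlugin):
        self.plugins[plugin.name] = plugin

    def store(self, obj):
        for plugin in self.plugins.values():  # fan out to all index plugins
            plugin.index(obj)

    def search(self, term):
        return [hit for plugin in self.plugins.values()
                for hit in plugin.query(term)]

core = ArchiveCore()
core.register(MetadataIndex())
core.store({"SOPInstanceUID": "1.2.3", "Modality": "CT"})
```

Under this pattern, new capabilities such as CBIR, dose monitoring, or consistency checks become additional plugins rather than changes to the archive core.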

  15. Recent Advances in the Chemistry and Biology of Naturally Occurring Antibiotics

    PubMed Central

    Chen, Jason S.; Edmonds, David J.; Estrada, Anthony A.

    2009-01-01

Ever since the world-shaping discovery of penicillin, nature’s molecular diversity has been extensively screened for new medications and lead compounds in drug discovery. The search for anti-infective agents intended to combat infectious diseases has been of particular interest and has enjoyed a high degree of success. Indeed, the history of antibiotics is marked with impressive discoveries and drug development stories, the overwhelming majority of which have their origins in nature. Chemistry, and in particular chemical synthesis, has played a major role in bringing naturally occurring antibiotics and their derivatives to the clinic, and no doubt these disciplines will continue to be key enabling technologies for future developments in the field. In this review article, we highlight a number of recent discoveries and advances in the chemistry, biology, and medicine of naturally occurring antibiotics, with particular emphasis on the total synthesis, analog design, and biological evaluation of molecules with novel mechanisms of action. PMID:19130444

  16. K2 Citizen Science Discovery of a Four-Planet System in a Chain of 3:2 Resonances

    NASA Astrophysics Data System (ADS)

    Barentsen, Geert; Christiansen, Jessie; Crossfield, Ian; Barclay, Thomas; Lintott, Chris; Cox, Brian; Zemiro, Julia; Simmons, Brooke; Miller, Grant; NASA K2, Zooniverse, BBC, ABC

    2017-06-01

    We report on the discovery of a compact system of four transiting super-Earth-sized planets around a moderately bright K-type star (V=12) using data from Campaign 12 of NASA's K2 mission. Uniquely, the periods of the planets are 3.6d, 5.4d, 8.3d, and 12.8d, forming an unbroken chain of near 3:2 resonances. It is the first discovery made by citizen scientists participating in the Exoplanet Explorers project on the Zooniverse platform, and was discovered with the help of 15,000 volunteers recruited via the "Stargazing Live" show on Australia's ABC TV channel. K2's open data policy, combined with the unique format of a BBC TV production that does not shy away from including advanced scientific content, enabled the process of a genuine scientific discovery to be executed and witnessed live on air by nearly a million viewers.

  17. Evolution of the NASA/IPAC Extragalactic Database (NED) into a Data Mining Discovery Engine

    NASA Astrophysics Data System (ADS)

    Mazzarella, Joseph M.; NED Team

    2017-06-01

    We review recent advances and ongoing work in evolving the NASA/IPAC Extragalactic Database (NED) beyond an object reference database into a data mining discovery engine. Updates to the infrastructure and data integration techniques are enabling more than a 10-fold expansion; NED will soon contain over a billion objects with their fundamental attributes fused across the spectrum via cross-identifications among the largest sky surveys (e.g., GALEX, SDSS, 2MASS, AllWISE, EMU), and over 100,000 smaller but scientifically important catalogs and journal articles. The recent discovery of super-luminous spiral galaxies exemplifies the opportunities for data mining and science discovery directly from NED's rich data synthesis. Enhancements to the user interface, including new APIs, VO protocols, and queries involving derived physical quantities, are opening new pathways for panchromatic studies of large galaxy samples. Examples are shown of graphics characterizing the content of NED, as well as initial steps in exploring the database via interactive statistical visualizations.

  18. Discovery and History of Amino Acid Fermentation.

    PubMed

    Hashimoto, Shin-Ichi

    There has been a strong demand in Japan and East Asia for L-glutamic acid as a seasoning since monosodium glutamate was found to present umami taste in 1907. The discovery of glutamate fermentation by Corynebacterium glutamicum in 1956 enabled abundant and low-cost production of the amino acid, creating a large market. The discovery also prompted researchers to develop fermentative production processes for other L-amino acids, such as lysine. Currently, the amino acid fermentation industry is so huge that more than 5 million metric tons of amino acids are manufactured annually all over the world, and this number continues to grow. Research on amino acid fermentation fostered the notion and skills of metabolic engineering which has been applied for the production of other compounds from renewable resources. The discovery of glutamate fermentation has had revolutionary impacts on both the industry and science. In this chapter, the history and development of glutamate fermentation, including the very early stage of fermentation of other amino acids, are reviewed.

  19. Organic synthesis provides opportunities to transform drug discovery

    NASA Astrophysics Data System (ADS)

    Blakemore, David C.; Castro, Luis; Churcher, Ian; Rees, David C.; Thomas, Andrew W.; Wilson, David M.; Wood, Anthony

    2018-04-01

    Despite decades of ground-breaking research in academia, organic synthesis is still a rate-limiting factor in drug-discovery projects. Here we present some current challenges in synthetic organic chemistry from the perspective of the pharmaceutical industry and highlight problematic steps that, if overcome, would find extensive application in the discovery of transformational medicines. Significant synthesis challenges arise from the fact that drug molecules typically contain amines and N-heterocycles, as well as unprotected polar groups. There is also a need for new reactions that enable non-traditional disconnections, more C-H bond activation and late-stage functionalization, as well as stereoselectively substituted aliphatic heterocyclic ring synthesis, C-X or C-C bond formation. We also emphasize that syntheses compatible with biomacromolecules will find increasing use, while new technologies such as machine-assisted approaches and artificial intelligence for synthesis planning have the potential to dramatically accelerate the drug-discovery process. We believe that increasing collaboration between academic and industrial chemists is crucial to address the challenges outlined here.

  20. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns

    DOE PAGES

    Senin, Pavel; Lin, Jessica; Wang, Xing; ...

    2018-02-23

The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure, but of different lengths, may co-exist in a time series. In order to address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference—two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0—a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the time series recurrent and anomalous pattern discovery.
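The symbolic discretization half of this approach can be sketched with SAX: z-normalize, reduce with piecewise aggregate approximation (PAA), then map segment means to letters at standard Gaussian breakpoints. Grammar inference such as Sequitur would then run over the resulting symbol strings; the window and segment parameters below are arbitrary illustration values, not GrammarViz defaults.

```python
# Sketch of SAX discretization: z-normalize, PAA, then symbolize at the
# standard N(0,1) breakpoints for a 4-letter alphabet. Recurring SAX words
# across sliding windows are the motif candidates a grammar would compress.
import math
from collections import Counter

BREAKPOINTS = [-0.6745, 0.0, 0.6745]  # N(0,1) quartile breakpoints

def znorm(series):
    m = sum(series) / len(series)
    sd = math.sqrt(sum((x - m) ** 2 for x in series) / len(series)) or 1.0
    return [(x - m) / sd for x in series]

def paa(series, segments):
    """Mean of each of `segments` roughly equal chunks."""
    n = len(series)
    chunks = [series[i * n // segments:(i + 1) * n // segments]
              for i in range(segments)]
    return [sum(c) / len(c) for c in chunks]

def sax_word(series, segments=4, alphabet="abcd"):
    """Count how many breakpoints each PAA mean exceeds to pick its letter."""
    return "".join(alphabet[sum(v > b for b in BREAKPOINTS)]
                   for v in paa(znorm(series), segments))

def candidate_motifs(series, window=8, segments=4):
    """SAX words that occur in more than one sliding window."""
    words = Counter(sax_word(series[i:i + window], segments)
                    for i in range(len(series) - window + 1))
    return sorted(w for w, c in words.items() if c > 1)
```

For example, `sax_word([0, 0, 0, 0, 1, 1, 1, 1])` yields `"aadd"`, and repeated words returned by `candidate_motifs` mark subsequences worth examining as motifs.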

  1. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Senin, Pavel; Lin, Jessica; Wang, Xing

The problems of recurrent and anomalous pattern discovery in time series, e.g., motifs and discords, respectively, have received a lot of attention from researchers in the past decade. However, since the pattern search space is usually intractable, most existing detection algorithms require that the patterns have discriminative characteristics and that their length be known in advance and provided as input, which is an unreasonable requirement for many real-world problems. In addition, patterns of similar structure, but of different lengths, may co-exist in a time series. In order to address these issues, we have developed algorithms for variable-length time series pattern discovery that are based on symbolic discretization and grammar inference—two techniques whose combination enables the structured reduction of the search space and discovery of the candidate patterns in linear time. In this work, we present GrammarViz 3.0—a software package that provides implementations of the proposed algorithms and a graphical user interface for interactive variable-length time series pattern discovery. The current version of the software provides an alternative grammar inference algorithm that improves the time series motif discovery workflow, and introduces an experimental procedure for automated discretization parameter selection that builds upon the minimum cardinality maximum cover principle and aids the time series recurrent and anomalous pattern discovery.

  2. Structure-based discovery and binding site analysis of histamine receptor ligands.

    PubMed

    Kiss, Róbert; Keserű, György M

    2016-12-01

    The application of structure-based drug discovery in histamine receptor projects was previously hampered by the lack of experimental structures. The publication of the first X-ray structure of the histamine H1 receptor has been followed by several successful virtual screens and binding site analysis studies of H1-antihistamines. This structure together with several other recently solved aminergic G-protein coupled receptors (GPCRs) enabled the development of more realistic homology models for H2, H3 and H4 receptors. Areas covered: In this paper, the authors review the development of histamine receptor models and their application in drug discovery. Expert opinion: In the authors' opinion, the application of atomistic histamine receptor models has played a significant role in understanding key ligand-receptor interactions as well as in the discovery of novel chemical starting points. The recently solved H1 receptor structure is a major milestone in structure-based drug discovery; however, our analysis also demonstrates that for building H3 and H4 receptor homology models, other GPCRs may be more suitable as templates. For these receptors, the authors envisage that the development of higher quality homology models will significantly contribute to the discovery and optimization of novel H3 and H4 ligands.

  3. Understanding price discovery in interconnected markets: Generalized Langevin process approach and simulation

    NASA Astrophysics Data System (ADS)

    Schenck, Natalya A.; Horvath, Philip A.; Sinha, Amit K.

    2018-02-01

While the literature on the price discovery process and information flow between dominant and satellite markets is exhaustive, most studies have applied an approach that can be traced back to Hasbrouck (1995) or Gonzalo and Granger (1995). In this paper, however, we propose a Generalized Langevin process with an asymmetric double-well potential function, with co-integrated time series and interconnected diffusion processes, to model the information flow and price discovery process in two interconnected markets, a dominant and a satellite. A simulated illustration of the model is also provided.

  4. Advancements in Large-Scale Data/Metadata Management for Scientific Data.

    NASA Astrophysics Data System (ADS)

    Guntupally, K.; Devarakonda, R.; Palanisamy, G.; Frame, M. T.

    2017-12-01

Scientific data often comes with complex and diverse metadata which are critical for data discovery and for users. The Online Metadata Editor (OME) tool, which was developed by an Oak Ridge National Laboratory team, effectively manages diverse scientific datasets across several federal data centers, such as DOE's Atmospheric Radiation Measurement (ARM) Data Center and USGS's Core Science Analytics, Synthesis, and Libraries (CSAS&L) project. This presentation will focus mainly on recent developments and future strategies for refining the OME tool within these centers. The ARM OME is a standards-based tool (https://www.archive.arm.gov/armome) that allows scientists to create and maintain metadata about their data products. The tool has been improved with new workflows that help metadata coordinators and submitting investigators submit and review their data more efficiently. The ARM Data Center's newly upgraded Data Discovery Tool (http://www.archive.arm.gov/discovery) uses rich metadata generated by the OME to enable search and discovery of thousands of datasets, while also providing a citation generator and modern order-delivery techniques like Globus (using GridFTP), Dropbox and THREDDS. The Data Discovery Tool also supports incremental indexing, which allows users to find new data as and when they are added. The USGS CSAS&L search catalog employs a custom version of the OME (https://www1.usgs.gov/csas/ome), which has been upgraded with high-level Federal Geographic Data Committee (FGDC) validations and the ability to reserve and mint Digital Object Identifiers (DOIs). The USGS's Science Data Catalog (SDC) (https://data.usgs.gov/datacatalog) allows users to discover a myriad of science data holdings through a web portal. Recent major upgrades to the SDC and ARM Data Discovery Tool include improved harvesting performance and migration using new search software, such as Apache Solr 6.0 for serving up data/metadata to scientific communities.
Our presentation will highlight the future enhancements of these tools which enable users to retrieve fast search results, along with parallelizing the retrieval process from online and High Performance Storage Systems. In addition, these improvements to the tools will support additional metadata formats like the Large-Eddy Simulation (LES) ARM Symbiotic and Observation (LASSO) bundle data.

  5. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    ERIC Educational Resources Information Center

    Huang, Zhong

    2014-01-01

Disease-associated gene discovery is a critical step to realize the future of personalized medicine. However, empirical and clinical validation of disease-associated genes is time consuming and expensive. In silico discovery of disease-associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  6. Self Assessment and Discovery Learning

    ERIC Educational Resources Information Center

    McDonald, Betty

    2011-01-01

    Discovery learning in higher education has been reported to be effective in assisting learners to understand difficult concepts and retain long term information. This paper seeks to illustrate how one self assessment model may be used to demonstrate discovery learning in a collaborative atmosphere of students sharing and getting to know each…

  7. Cross-Layer Service Discovery Mechanism for OLSRv2 Mobile Ad Hoc Networks.

    PubMed

    Vara, M Isabel; Campo, Celeste

    2015-07-20

Service discovery plays an important role in mobile ad hoc networks (MANETs). The lack of central infrastructure, limited resources and high mobility make service discovery a challenging issue for this kind of network. This article proposes a new service discovery mechanism for discovering and advertising services integrated into the Optimized Link State Routing Protocol Version 2 (OLSRv2). In previous studies, we demonstrated the validity of a similar service discovery mechanism integrated into the previous version of OLSR (OLSRv1). In order to advertise services, we have added a new type-length-value structure (TLV) to the OLSRv2 protocol, called service discovery message (SDM), according to the Generalized MANET Packet/Message Format defined in Request For Comments (RFC) 5444. Each node in the ad hoc network only advertises its own services. The advertisement frequency is a user-configurable parameter, so that it can be modified depending on the user requirements. Each node maintains two service tables, one to store information about its own services and another one to store information about the services it discovers in the network. We present simulation results that compare our service discovery mechanism integrated into OLSRv2 with the one defined for OLSRv1 and with the integration of service discovery into the Ad hoc On-Demand Distance Vector (AODV) protocol, in terms of service discovery ratio, service latency and network overhead.
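The type-length-value idea behind the SDM can be illustrated with a deliberately simplified encoder. Real RFC 5444 TLVs additionally carry a flags octet and optional type extension and sit inside message and TLV blocks; the type number and service payload below are hypothetical, chosen only to show the core framing.

```python
# Minimal type-length-value framing: 1-byte type, 1-byte length, payload.
# (RFC 5444's actual TLV format is richer; this shows only the core idea.)
import struct

SDM_TLV_TYPE = 224  # hypothetical type number for a service discovery TLV

def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    if len(value) > 255:
        raise ValueError("value too long for a single-octet length field")
    return struct.pack("!BB", tlv_type, len(value)) + value

def decode_tlv(buf: bytes):
    """Return (type, value, remaining bytes) for the TLV at the buffer head."""
    tlv_type, length = struct.unpack("!BB", buf[:2])
    return tlv_type, buf[2:2 + length], buf[2 + length:]

# A node advertising one of its own services:
wire = encode_tlv(SDM_TLV_TYPE, b"printer://192.168.1.7:631")
```

Because each TLV is self-delimiting, a receiver can walk a buffer of concatenated TLVs, recording recognized service entries in its remote-service table and skipping unknown types.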

  8. Cross-Layer Service Discovery Mechanism for OLSRv2 Mobile Ad Hoc Networks

    PubMed Central

    Vara, M. Isabel; Campo, Celeste

    2015-01-01

Service discovery plays an important role in mobile ad hoc networks (MANETs). The lack of central infrastructure, limited resources and high mobility make service discovery a challenging issue for this kind of network. This article proposes a new service discovery mechanism for discovering and advertising services integrated into the Optimized Link State Routing Protocol Version 2 (OLSRv2). In previous studies, we demonstrated the validity of a similar service discovery mechanism integrated into the previous version of OLSR (OLSRv1). In order to advertise services, we have added a new type-length-value structure (TLV) to the OLSRv2 protocol, called service discovery message (SDM), according to the Generalized MANET Packet/Message Format defined in Request For Comments (RFC) 5444. Each node in the ad hoc network only advertises its own services. The advertisement frequency is a user-configurable parameter, so that it can be modified depending on the user requirements. Each node maintains two service tables, one to store information about its own services and another one to store information about the services it discovers in the network. We present simulation results that compare our service discovery mechanism integrated into OLSRv2 with the one defined for OLSRv1 and with the integration of service discovery into the Ad hoc On-Demand Distance Vector (AODV) protocol, in terms of service discovery ratio, service latency and network overhead. PMID:26205272

  9. 47 CFR 1.729 - Discovery.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Discovery. 1.729 Section 1.729... witnesses in the information designation required by § 1.721(a)(10)(i). In its § 1.721(a)(10)(i) information... limitations applicable to fact witnesses. [63 FR 1038, Jan. 7, 1998, as amended at 63 FR 41447, Aug. 4, 1998] ...

  10. A pleiotropy-informed Bayesian false discovery rate adapted to a shared control design finds new disease associations from GWAS summary statistics.

    PubMed

    Liley, James; Wallace, Chris

    2015-02-01

Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
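The quantity being bounded can be sketched with the basic empirical cFDR estimator, cFDR(p, q) ≈ p · #{q_j ≤ q} / #{p_j ≤ p, q_j ≤ q}. The sketch below is only that unadjusted estimator; it omits the shared-control correction and the bound over unions of rejection regions that are the paper's actual contribution.

```python
# Basic empirical conditional FDR: rescale each SNP's primary p-value by
# how strongly small q-values (related trait) enrich for small p-values.
import numpy as np

def cfdr(p, q):
    """p, q: per-SNP p-values for the primary and conditional traits."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    est = np.empty_like(p)
    for i in range(len(p)):
        n_q = np.sum(q <= q[i])                   # SNPs passing the q threshold
        n_pq = np.sum((p <= p[i]) & (q <= q[i]))  # ...that also pass the p threshold
        est[i] = p[i] * n_q / n_pq                # n_pq >= 1 (SNP i itself counts)
    return np.minimum(est, 1.0)
```

SNPs whose small p-values co-occur with small q-values for the related trait receive a lower cFDR, so they can be declared at thresholds they would miss in a single-trait analysis.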

  11. Precision medicine in the age of big data: The present and future role of large-scale unbiased sequencing in drug discovery and development.

    PubMed

    Vicini, P; Fields, O; Lai, E; Litwack, E D; Martin, A-M; Morgan, T M; Pacanowski, M A; Papaluca, M; Perez, O D; Ringel, M S; Robson, M; Sakul, H; Vockley, J; Zaks, T; Dolsten, M; Søgaard, M

    2016-02-01

High throughput molecular and functional profiling of patients is a key driver of precision medicine. DNA and RNA characterization has been enabled at unprecedented cost and scale through rapid, disruptive progress in sequencing technology, but challenges persist in data management and interpretation. We analyze the state-of-the-art of large-scale unbiased sequencing (LUS) in drug discovery and development, including technology, application, ethical, regulatory, policy and commercial considerations, and discuss issues of LUS implementation in clinical and regulatory practice. © 2015 American Society for Clinical Pharmacology and Therapeutics.

  12. Opportunities and challenges provided by cloud repositories for bioinformatics-enabled drug discovery.

    PubMed

    Dalpé, Gratien; Joly, Yann

    2014-09-01

    Healthcare-related bioinformatics databases are increasingly offering the possibility to maintain, organize, and distribute DNA sequencing data. Various national and international institutions currently host such databases, offering researchers website platforms where they can obtain sequencing data on which to perform different types of analysis. Until recently, this process remained mostly one-dimensional, with most analyses concentrated on a limited amount of data. However, newer genome sequencing technology is producing data in volumes that current computer facilities are unable to handle. An alternative approach has been to adopt cloud computing services for combining the information embedded in genomic and model-system biology data, patient healthcare records, and clinical trial data. In this new technological paradigm, researchers use virtual space and computing power from existing commercial or not-for-profit cloud service providers to access, store, and analyze data via different application programming interfaces. Cloud services are an alternative to the need for ever-larger local data storage; however, they raise distinct ethical, legal, and social issues. The purpose of this Commentary is to summarize how cloud computing can contribute to bioinformatics-based drug discovery and to highlight some of the outstanding legal, ethical, and social issues inherent in the use of cloud services. © 2014 Wiley Periodicals, Inc.

  13. Global Diversity and Review of Siphonophorae (Cnidaria: Hydrozoa)

    PubMed Central

    Mapstone, Gillian M.

    2014-01-01

    In this review the history of discovery of siphonophores, from the first formal description by Carl Linnaeus in 1758 to the present, is summarized, and species richness, together with the world-wide distribution of this pelagic group within the clade Hydrozoa, is discussed. Siphonophores exhibit three basic body plans, which are briefly explained and figured, whilst other atypical body plans are also noted. Currently, 175 valid siphonophore species are recognized in the latest WoRMS world list, across 16 families and 65 genera. Much new information since the last review in 1987 has come from the first molecular analysis of the group, enabling identification of some new morphological characters diagnostic for physonect siphonophores. Ten types of nematocysts (stinging cells) are identified in siphonophores, more than in any other cnidarian; these are incorporated into batteries in the side branches of the tentacles (here termed tentilla) in most species, and tentilla are reviewed in the last section of this paper. Their discharge mechanisms are explained, as is how the tentilla of several physonect siphonophores are modified into lures. Of particular interest is the recent discovery of a previously unknown red fluorescent lure in the tentilla of the deep-sea physonect Erenna, the first described example of emission of red light by an invertebrate to attract prey. PMID:24516560

  14. Enhancing knowledge discovery from cancer genomics data with Galaxy

    PubMed Central

    Albuquerque, Marco A.; Grande, Bruno M.; Ritch, Elie J.; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K.; Shah, Sohrab P.; Boutros, Paul C.

    2017-01-01

    The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinements of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutations. Workflows that combine these tools to achieve data integration and visualization are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. PMID:28327945

  15. Enhancing knowledge discovery from cancer genomics data with Galaxy.

    PubMed

    Albuquerque, Marco A; Grande, Bruno M; Ritch, Elie J; Pararajalingam, Prasath; Jessa, Selin; Krzywinski, Martin; Grewal, Jasleen K; Shah, Sohrab P; Boutros, Paul C; Morin, Ryan D

    2017-05-01

    The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large comprehensive sequence data sets continue to be made increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelization of these tools within Galaxy to accelerate runtime and have demonstrated their usability and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinements of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutations. Workflows that combine these tools to achieve data integration and visualization are demonstrated on a cohort of 96 diffuse large B-cell lymphomas and enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms including Docker. © The Author 2017. Published by Oxford University Press.

  16. Global diversity and review of Siphonophorae (Cnidaria: Hydrozoa).

    PubMed

    Mapstone, Gillian M

    2014-01-01

    In this review the history of discovery of siphonophores, from the first formal description by Carl Linnaeus in 1758 to the present, is summarized, and species richness, together with the world-wide distribution of this pelagic group within the clade Hydrozoa, is discussed. Siphonophores exhibit three basic body plans, which are briefly explained and figured, whilst other atypical body plans are also noted. Currently, 175 valid siphonophore species are recognized in the latest WoRMS world list, across 16 families and 65 genera. Much new information since the last review in 1987 has come from the first molecular analysis of the group, enabling identification of some new morphological characters diagnostic for physonect siphonophores. Ten types of nematocysts (stinging cells) are identified in siphonophores, more than in any other cnidarian; these are incorporated into batteries in the side branches of the tentacles (here termed tentilla) in most species, and tentilla are reviewed in the last section of this paper. Their discharge mechanisms are explained, as is how the tentilla of several physonect siphonophores are modified into lures. Of particular interest is the recent discovery of a previously unknown red fluorescent lure in the tentilla of the deep-sea physonect Erenna, the first described example of emission of red light by an invertebrate to attract prey.

  17. Semantically-enabled Knowledge Discovery in the Deep Carbon Observatory

    NASA Astrophysics Data System (ADS)

    Wang, H.; Chen, Y.; Ma, X.; Erickson, J. S.; West, P.; Fox, P. A.

    2013-12-01

    The Deep Carbon Observatory (DCO) is a decadal effort aimed at transforming scientific and public understanding of carbon in the complex deep earth system from the perspectives of Deep Energy, Deep Life, Extreme Physics and Chemistry, and Reservoirs and Fluxes. Over the course of the decade DCO scientific activities will generate a massive volume of data across a variety of disciplines, presenting significant challenges in terms of data integration, management, analysis, and visualization, and ultimately limiting the ability of scientists across disciplines to gain insights and unlock new knowledge. The DCO Data Science Team (DCO-DS) is applying Semantic Web methodologies to construct a knowledge representation focused on the DCO Earth science disciplines, using it together with other technologies (e.g. natural language processing and data mining) to create a more expressive representation of the distributed corpus of DCO artifacts, including datasets, metadata, instruments, sensors, platforms, deployments, researchers, organizations, funding agencies, grants, and various awards. The embodiment of this knowledge representation is the DCO Data Science Infrastructure, in which unique entities within the DCO domain and the relations between them are recognized and explicitly identified. The DCO-DS Infrastructure will serve as a platform for more efficient and reliable searching, discovery, access, and publication of information and knowledge for the DCO scientific community and beyond.

  18. IMPPAT: A curated database of Indian Medicinal Plants, Phytochemistry And Therapeutics.

    PubMed

    Mohanraj, Karthikeyan; Karthikeyan, Bagavathy Shanmugam; Vivek-Ananth, R P; Chand, R P Bharath; Aparna, S R; Mangalapandi, Pattulingam; Samal, Areejit

    2018-03-12

    Phytochemicals of medicinal plants encompass a diverse chemical space for drug discovery. India is rich in indigenous medicinal plants that have been used for centuries in traditional Indian medicine to treat human maladies. A comprehensive online database on the phytochemistry of Indian medicinal plants will enable computational approaches towards natural-product-based drug discovery. In this direction, we present IMPPAT, a manually curated database of 1742 Indian Medicinal Plants, 9596 Phytochemicals, And 1124 Therapeutic uses spanning 27074 plant-phytochemical associations and 11514 plant-therapeutic associations. Notably, the curation effort led to a non-redundant in silico library of 9596 phytochemicals with standard chemical identifiers and structure information. Using cheminformatic approaches, we have computed the physicochemical, ADMET (absorption, distribution, metabolism, excretion, toxicity), and drug-likeness properties of the IMPPAT phytochemicals. We show that the stereochemical complexity and shape complexity of IMPPAT phytochemicals differ from libraries of commercial compounds or diversity-oriented synthesis compounds while being similar to other libraries of natural products. Within IMPPAT, we have filtered a subset of 960 potentially druggable phytochemicals, the majority of which have no significant similarity to existing FDA-approved drugs, rendering them good candidates for prospective drugs. The IMPPAT database is openly accessible at: https://cb.imsc.res.in/imppat
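    The abstract does not spell out the drug-likeness criteria behind the 960-compound druggable subset. As a minimal sketch of the kind of filter commonly used in such screens, here is a Lipinski rule-of-five check; the function name and the compound property values are illustrative, not taken from IMPPAT:

```python
def lipinski_pass(mol_weight, logp, h_donors, h_acceptors):
    """Return True if a compound has at most one Lipinski rule-of-five
    violation — a simple, widely used drug-likeness screen.

    Thresholds: MW <= 500 Da, logP <= 5, H-bond donors <= 5,
    H-bond acceptors <= 10.
    """
    violations = sum([
        mol_weight > 500,
        logp > 5,
        h_donors > 5,
        h_acceptors > 10,
    ])
    return violations <= 1

# Hypothetical phytochemical property records (illustrative values only).
compounds = {
    "compound_A": dict(mol_weight=354.4, logp=2.1, h_donors=2, h_acceptors=5),
    "compound_B": dict(mol_weight=812.9, logp=6.3, h_donors=7, h_acceptors=14),
}
druggable = [name for name, props in compounds.items() if lipinski_pass(**props)]
print(druggable)  # compound_B violates all four rules and is filtered out
```

A real screen would compute these descriptors from the curated structures with a cheminformatics toolkit and typically combine several such filters with similarity checks against approved drugs, as the abstract describes.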

  19. Mars Exploration Rover: surface operations

    NASA Technical Reports Server (NTRS)

    Erickson, J. K.; Adler, M.; Crisp, J.; Mishkin, A.; Welch, R.

    2002-01-01

    This paper provides an overview of the planned mission and focuses on the distinct operations challenges inherent in operating these two (very) off-road vehicles, and on the solutions adopted to enable the best utilization of their capabilities for high science return and responsiveness to scientific discovery.

  20. Enabling knowledge discovery: taxonomy development for NASA

    NASA Technical Reports Server (NTRS)

    Dutra, J.; Busch, J.

    2003-01-01

    This white paper provides the background for why it is important to take the next steps with the NASA taxonomy, including testing and validation, XML schema development, and integration with the FirstGov federal search engine, the OneNASA portal, and its supporting web content management system.
