Science.gov

Sample records for model organism database

  1. Model organism databases in behavioral neuroscience.

    PubMed

    Shimoyama, Mary; Smith, Jennifer R; Hayman, G Thomas; Petri, Victoria; Nigam, Rajni

    2012-01-01

    Model Organism Databases (MODs) are an important informatics tool for researchers. They provide comprehensive organism-specific genetic, genomic, and phenotype datasets. MODs ensure accurate data identification and integrity and provide official nomenclature for genes, Quantitative Trait Loci, and strains. Most importantly, MODs provide professionally curated data drawn from the literature for function, phenotype and disease associations, and pathway involvement. These data, along with nomenclature and data identity, are incorporated into larger-scale genomic databases and research publications. MODs also offer a number of software tools, ranging from reports to genome browsers, that allow researchers to access, display, and analyze data. Copyright © 2012 Elsevier Inc. All rights reserved.

  2. Curation accuracy of model organism databases

    PubMed Central

    Keseler, Ingrid M.; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y.; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C.; Mladinich, Katherine M.; Chow, Edmond D.; Sherlock, Gavin; Karp, Peter D.

    2014-01-01

    Manual extraction of information from the biomedical literature—or biocuration—is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as gold standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by PhD-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org/ PMID:24923819
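
    As a quick check of the arithmetic, the overall rate follows directly from the counts quoted above; the per-database fact counts are not stated, so this minimal sketch recomputes only the combined rate:

        # 10 erroneous assertions out of 633 validated facts.
        errors, facts = 10, 633
        print(f"{errors / facts:.2%}")  # -> 1.58%, the reported overall error rate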

  3. dictyBase, the model organism database for Dictyostelium discoideum.

    PubMed

    Chisholm, Rex L; Gaudet, Pascale; Just, Eric M; Pilcher, Karen E; Fey, Petra; Merchant, Sohel N; Kibbe, Warren A

    2006-01-01

    dictyBase (http://dictybase.org) is the model organism database (MOD) for the social amoeba Dictyostelium discoideum. The unique biology and phylogenetic position of Dictyostelium offer a great opportunity to gain knowledge of processes not characterized in other organisms. The recent completion of the 34 Mb genome sequence, together with the sizable scientific literature using Dictyostelium as a research organism, provided the necessary tools to create a well-annotated genome. dictyBase has leveraged software developed by the Saccharomyces Genome Database and the Generic Model Organism Database project. This has reduced the time required to develop a full-featured MOD and greatly facilitated our ability to focus on annotation and on providing new functionality. We hope that manual curation of the Dictyostelium genome will facilitate the annotation of other genomes.

  4. Xanthusbase: adapting wikipedia principles to a model organism database.

    PubMed

    Arshinoff, Bradley I; Suen, Garret; Just, Eric M; Merchant, Sohel M; Kibbe, Warren A; Chisholm, Rex L; Welch, Roy D

    2007-01-01

    xanthusBase (http://www.xanthusbase.org) is the official model organism database (MOD) for the social bacterium Myxococcus xanthus. In many respects, M. xanthus represents the pioneer model organism (MO) for studying the genetic, biochemical, and mechanistic basis of prokaryotic multicellularity, a topic that has garnered considerable attention due to the significance of biofilms in both basic and applied microbiology research. To facilitate its utility, the design of xanthusBase incorporates open-source software, leveraging the cumulative experience made available through the Generic Model Organism Database (GMOD) project, MediaWiki (http://www.mediawiki.org), and dictyBase (http://www.dictybase.org), to create a MOD that is both highly useful and easily navigable. In addition, we have incorporated a unique Wikipedia-style curation model that exploits the internet's inherent interactivity, enabling M. xanthus and other myxobacterial researchers to contribute directly to the ongoing genome annotation.

  5. Re-thinking organisms: The impact of databases on model organism biology.

    PubMed

    Leonelli, Sabina; Ankeny, Rachel A

    2012-03-01

    Community databases have become crucial to the collection, ordering and retrieval of data gathered on model organisms, as well as to the ways in which these data are interpreted and used across a range of research contexts. This paper analyses the impact of community databases on research practices in model organism biology by focusing on the history and current use of four community databases: FlyBase, Mouse Genome Informatics, WormBase and The Arabidopsis Information Resource. We discuss the standards used by the curators of these databases for what counts as reliable evidence, acceptable terminology, appropriate experimental set-ups and adequate materials (e.g., specimens). On the one hand, these choices are informed by the collaborative research ethos characterising most model organism communities. On the other hand, the deployment of these standards in databases reinforces this ethos and gives it concrete and precise instantiations by shaping the skills, practices, values and background knowledge required of the database users. We conclude that the increasing reliance on community databases as vehicles to circulate data is having a major impact on how researchers conduct and communicate their research, which affects how they understand the biology of model organisms and its relation to the biology of other species. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. ZFIN, the Zebrafish Model Organism Database: updates and new directions

    PubMed Central

    Ruzicka, Leyla; Bradford, Yvonne M.; Frazer, Ken; Howe, Douglas G.; Paddock, Holly; Ramachandran, Sridhar; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri E.; Eagle, Anne E.; Fashena, David; Kalita, Patrick; Knight, Jonathan; Mani, Prita; Martin, Ryan; Moxon, Sierra A. T.; Pich, Christian; Schaper, Kevin; Shao, Xiang; Westerfield, Monte

    2015-01-01

    The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for genetic and genomic data from zebrafish (Danio rerio) research. ZFIN staff curate detailed information about genes, mutants, genotypes, reporter lines, sequences, constructs, antibodies, knockdown reagents, expression patterns, phenotypes, gene product function, and orthology from publications. Researchers can submit mutant, transgenic, expression, and phenotype data directly to ZFIN and use the ZFIN Community Wiki to share antibody and protocol information. Data can be accessed through topic-specific searches, a new site-wide search, and the data-mining resource ZebrafishMine (http://zebrafishmine.org). Data download and web service options are also available. ZFIN collaborates with major bioinformatics organizations to verify and integrate genomic sequence data, provide nomenclature support, establish reciprocal links and participate in the development of standardized structured vocabularies (ontologies) used for data annotation and searching. ZFIN-curated gene, function, expression, and phenotype data are available for comparative exploration at several multi-species resources. The use of zebrafish as a model for human disease is increasing. ZFIN is supporting this growing area with three major projects: adding easy access to computed orthology data from gene pages, curating details of the gene expression pattern changes in mutant fish, and curating zebrafish models of human diseases. PMID:26097180

  7. Choosing a Genome Browser for a Model Organism Database (MOD): Surveying the Maize Community

    USDA-ARS's Scientific Manuscript database

    As maize genome sequencing nears completion, the Maize Genetics and Genomics Database (MaizeGDB), the Model Organism Database for maize, has integrated a genome browser into its existing Web interface and database. The addition of the MaizeGDB Genome Browser to MaizeGDB will allow it ...

  8. Model organism databases: essential resources that need the support of both funders and users.

    PubMed

    Oliver, Stephen G; Lock, Antonia; Harris, Midori A; Nurse, Paul; Wood, Valerie

    2016-06-22

    Modern biomedical research depends critically on access to databases that house and disseminate genetic, genomic, molecular, and cell biological knowledge. Even as the explosion of available genome sequences and associated genome-scale data continues apace, the sustainability of professionally maintained biological databases is under threat due to policy changes by major funding agencies. Here, we focus on model organism databases to demonstrate the myriad ways in which biological databases not only act as repositories but actively facilitate advances in research. We present data that show that reducing financial support to model organism databases could prove to be not just scientifically, but also economically, unsound.

  9. Gene name identification and normalization using a model organism database.

    PubMed

    Morgan, Alexander A; Hirschman, Lynette; Colosimo, Marc; Yeh, Alexander S; Colombe, Jeff B

    2004-12-01

    Biology has now become an information science, and researchers are increasingly dependent on expert-curated biological databases to organize the findings from the published literature. We report here on a series of experiments related to the application of natural language processing to aid in the curation process for FlyBase. We focused on listing the normalized form of genes and gene products discussed in an article. We broke this into two steps: gene mention tagging in text, followed by normalization of gene names. For gene mention tagging, we adopted a statistical approach. To provide training data, we were able to reverse engineer the gene lists from the associated articles and abstracts, to generate text labeled (imperfectly) with gene mentions. We then evaluated the quality of the noisy training data (precision of 78%, recall 88%) and the quality of the HMM tagger output trained on this noisy data (precision 78%, recall 71%). In order to generate normalized gene lists, we explored two approaches. First, we explored simple pattern matching based on synonym lists to obtain a high recall/low precision system (recall 95%, precision 2%). Using a series of filters, we were able to improve precision to 50% with a recall of 72% (balanced F-measure of 0.59). Our second approach combined the HMM gene mention tagger with various filters to remove ambiguous mentions; this approach achieved an F-measure of 0.72 (precision 88%, recall 61%). These experiments indicate that the lexical resources provided by FlyBase are complete enough to achieve high recall on the gene list task, and that normalization requires accurate disambiguation; different strategies for tagging and normalization trade off recall for precision.
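
    The two-stage approach described above (high-recall matching against synonym lists, followed by precision-raising filters) can be sketched roughly as follows. This is a minimal Python illustration: the synonym table, the ambiguity stop-list and the token-level matching are invented assumptions, not the authors' actual FlyBase pipeline.

        # Sketch of synonym-list gene tagging with an optional filter step.
        # Gene ids, names and the ambiguity list are illustrative placeholders.
        SYNONYMS = {
            "FBgn0000490": {"dpp", "decapentaplegic"},
            "FBgn0004647": {"N", "Notch"},
        }
        AMBIGUOUS = {"N"}  # mentions dropped by the precision-raising filter

        def tag_genes(text, drop_ambiguous=False):
            """Return normalized gene ids for every synonym found in the text."""
            tokens = set(text.replace(",", " ").split())
            if drop_ambiguous:
                tokens -= AMBIGUOUS
            return {gid for gid, names in SYNONYMS.items() if names & tokens}

        def f_measure(precision, recall):
            """Balanced F-measure used above to score the gene list task."""
            return 2 * precision * recall / (precision + recall)

        print(tag_genes("dpp interacts with N"))                       # high recall: both ids
        print(tag_genes("dpp interacts with N", drop_ambiguous=True))  # ambiguous 'N' dropped
        print(round(f_measure(0.50, 0.72), 2))                         # 0.59, as reported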

  10. MaizeGDB update: New tools, data, and interface for the maize model organism database

    USDA-ARS's Scientific Manuscript database

    MaizeGDB is a highly curated, community-oriented database and informatics service to researchers focused on the crop plant and model organism Zea mays ssp. mays. Although some form of the maize community database has existed over the last 25 years, there have only been two major releases. In 1991, ...

  11. MaizeGDB, the maize model organism database

    USDA-ARS's Scientific Manuscript database

    MaizeGDB is the maize research community's database for maize genetic and genomic information. In this seminar I will outline our current endeavors including a full website redesign, the status of maize genome assembly and annotation projects, and work toward genome functional annotation. Mechanis...

  12. Using semantic data modeling techniques to organize an object-oriented database for extending the mass storage model

    NASA Technical Reports Server (NTRS)

    Campbell, William J.; Short, Nicholas M., Jr.; Roelofs, Larry H.; Dorfman, Erik

    1991-01-01

    A methodology for optimizing organization of data obtained by NASA earth and space missions is discussed. The methodology uses a concept based on semantic data modeling techniques implemented in a hierarchical storage model. The modeling is used to organize objects in mass storage devices, relational database systems, and object-oriented databases. The semantic data modeling at the metadata record level is examined, including the simulation of a knowledge base and semantic metadata storage issues. The semantic data model hierarchy and its application for efficient data storage is addressed, as is the mapping of the application structure to the mass storage.

  13. Integrated interactions database: tissue-specific view of the human and model organism interactomes.

    PubMed

    Kotlyar, Max; Pastrello, Chiara; Sheahan, Nicholas; Jurisica, Igor

    2016-01-04

    IID (Integrated Interactions Database) is the first database providing tissue-specific protein-protein interactions (PPIs) for model organisms and human. IID covers six species (S. cerevisiae (yeast), C. elegans (worm), D. melanogaster (fly), R. norvegicus (rat), M. musculus (mouse) and H. sapiens (human)) and up to 30 tissues per species. Users query IID by providing a set of proteins or PPIs from any of these organisms and specifying the species and tissues where IID should search for interactions. If query proteins are not from the selected species, IID enables searches across species and tissues automatically by using their orthologs; for example, retrieving interactions in a given tissue that are conserved in human and mouse. Interaction data in IID comprise three types of PPI networks: experimentally detected PPIs from major databases, orthologous PPIs and high-confidence computationally predicted PPIs. Interactions are assigned to tissues where their protein pairs or encoding genes are expressed. IID is a major replacement for the I2D interaction database, with larger PPI networks (a total of 1,566,043 PPIs among 68,831 proteins), tissue annotations for interactions, and new query, analysis and data visualization capabilities. IID is available at http://ophid.utoronto.ca/iid.
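
    IID itself is queried through the website above, but the tissue-assignment rule the abstract describes (an interaction belongs to a tissue when both partners are expressed there) reduces to a simple filter. A minimal Python sketch; the protein identifiers and expression calls are invented for illustration:

        # Keep only interactions whose partners are both expressed in a tissue.
        ppis = [("P1", "P2"), ("P2", "P3")]  # detected or predicted interactions
        expressed = {                        # tissue -> proteins expressed in it
            "brain": {"P1", "P2"},
            "liver": {"P2", "P3"},
        }

        def tissue_ppis(tissue):
            present = expressed[tissue]
            return [(a, b) for a, b in ppis if a in present and b in present]

        print(tissue_ppis("brain"))  # [('P1', 'P2')]
        print(tissue_ppis("liver"))  # [('P2', 'P3')]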

  14. IntPath--an integrated pathway gene relationship database for model organisms and important pathogens

    PubMed Central

    2012-01-01

    Background: Pathway data are important for understanding the relationships between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases (e.g., KEGG, WikiPathways, and BioCyc) are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete coverage in any single database. Results: In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc.). We built a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms (S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus) are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure that no data are deleted and no noise is introduced in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by the average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation is involved to remove errors and noise from source data (e.g., the gene ID errors
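
    The two enrichment metrics quoted above can be computed directly from a pathway's gene-pair list. A minimal Python sketch with toy pathway contents (not IntPath data):

        # Average node degree per pathway and average gene pairs per pathway.
        pathways = {  # pathway -> set of gene pairs
            "pathway_a": {("g1", "g2"), ("g2", "g3"), ("g3", "g4")},
            "pathway_b": {("g5", "g6"), ("g6", "g7")},
        }

        def average_node_degree(pairs):
            """Average number of interaction partners per gene in one pathway."""
            degree = {}
            for a, b in pairs:
                degree[a] = degree.get(a, 0) + 1
                degree[b] = degree.get(b, 0) + 1
            return sum(degree.values()) / len(degree)

        for name, pairs in pathways.items():
            print(name, average_node_degree(pairs), len(pairs))

        print(sum(len(p) for p in pathways.values()) / len(pathways))  # 2.5 pairs/pathway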

  15. MaizeGDB update: new tools, data and interface for the maize model organism database.

    PubMed

    Andorf, Carson M; Cannon, Ethalinda K; Portwood, John L; Gardiner, Jack M; Harper, Lisa C; Schaeffer, Mary L; Braun, Bremen L; Campbell, Darwin A; Vinnakota, Abhinav G; Sribalusu, Venktanaga V; Huerta, Miranda; Cho, Kyoung Tak; Wimalanathan, Kokulapalan; Richter, Jacqueline D; Mauch, Emily D; Rao, Bhavani S; Birkett, Scott M; Sen, Taner Z; Lawrence-Dill, Carolyn J

    2016-01-04

    MaizeGDB is a highly curated, community-oriented database and informatics service for researchers focused on the crop plant and model organism Zea mays ssp. mays. Although some form of the maize community database has existed over the last 25 years, there have only been two major releases. In 1991, the original maize genetics database MaizeDB was created. In 2003, the combined contents of MaizeDB and the sequence data from ZmDB were made accessible as a single resource named MaizeGDB. Over the next decade, MaizeGDB became more sequence-driven while still maintaining traditional maize genetics datasets. This enabled the project to meet the continually growing and evolving needs of the maize research community, yet the interface and underlying infrastructure remained unchanged. In 2015, the MaizeGDB team completed a multi-year effort to update the MaizeGDB resource by reorganizing existing data, upgrading hardware and infrastructure, creating new tools, incorporating new data types (including diversity data, expression data, gene models, and metabolic pathways), and developing and deploying a modern interface. In addition to coordinating a data resource, the MaizeGDB team coordinates activities and provides technical support to the maize research community. MaizeGDB is accessible online at http://www.maizegdb.org. © Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  16. Protein Model Database

    SciTech Connect

    Fidelis, K; Adzhubej, A; Kryshtafovych, A; Daniluk, P

    2005-02-23

    The phenomenal success of the genome sequencing projects reveals the power of completeness in revolutionizing biological science. Currently it is possible to sequence entire organisms at a time, allowing for a systemic rather than fractional view of their organization and the various genome-encoded functions. There is an international plan to move towards a similar goal in the area of protein structure. This will not be achieved by experiment alone, but rather by a combination of efforts in crystallography, NMR spectroscopy, and computational modeling. Only a small fraction of structures are expected to be determined experimentally; the remainder are to be modeled. Presently there is no organized infrastructure to critically evaluate and present these data to the biological community. The goal of the Protein Model Database project is to create such an infrastructure, including (1) a public database of theoretically derived protein structures; (2) reliable annotation of protein model quality; (3) novel structure analysis tools; and (4) access to the highest-quality modeling techniques available.

  17. MyMpn: a database for the systems biology model organism Mycoplasma pneumoniae.

    PubMed

    Wodke, Judith A H; Alibés, Andreu; Cozzuto, Luca; Hermoso, Antonio; Yus, Eva; Lluch-Senar, Maria; Serrano, Luis; Roma, Guglielmo

    2015-01-01

    MyMpn (http://mympn.crg.eu) is an online resource devoted to studying the human pathogen Mycoplasma pneumoniae, a minimal bacterium causing lower respiratory tract infections. Due to its small size, its ability to grow in vitro, and the amount of data produced over the past decades, M. pneumoniae is an interesting model organism for the development of systems biology approaches for unicellular organisms. Our database hosts a wealth of omics-scale datasets generated by hundreds of experimental and computational analyses. These include data obtained from gene expression profiling experiments, gene essentiality studies, protein abundance profiling, protein complex analysis, metabolic reactions and network modeling, cell growth experiments, comparative genomics and 3D tomography. In addition, the intuitive web interface provides access to several visualization and analysis tools as well as to different data search options. The availability and, even more importantly, the accessibility of properly structured and organized data are of the utmost importance when aiming to understand the biology of an organism on a global scale. Therefore, MyMpn constitutes a unique and valuable new resource for the large systems biology and microbiology community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. MyMpn: a database for the systems biology model organism Mycoplasma pneumoniae

    PubMed Central

    Wodke, Judith A. H.; Alibés, Andreu; Cozzuto, Luca; Hermoso, Antonio; Yus, Eva; Lluch-Senar, Maria; Serrano, Luis; Roma, Guglielmo

    2015-01-01

    MyMpn (http://mympn.crg.eu) is an online resource devoted to studying the human pathogen Mycoplasma pneumoniae, a minimal bacterium causing lower respiratory tract infections. Due to its small size, its ability to grow in vitro, and the amount of data produced over the past decades, M. pneumoniae is an interesting model organism for the development of systems biology approaches for unicellular organisms. Our database hosts a wealth of omics-scale datasets generated by hundreds of experimental and computational analyses. These include data obtained from gene expression profiling experiments, gene essentiality studies, protein abundance profiling, protein complex analysis, metabolic reactions and network modeling, cell growth experiments, comparative genomics and 3D tomography. In addition, the intuitive web interface provides access to several visualization and analysis tools as well as to different data search options. The availability and, even more importantly, the accessibility of properly structured and organized data are of the utmost importance when aiming to understand the biology of an organism on a global scale. Therefore, MyMpn constitutes a unique and valuable new resource for the large systems biology and microbiology community. PMID:25378328

  19. SubtiWiki 2.0--an integrated database for the model organism Bacillus subtilis.

    PubMed

    Michna, Raphael H; Zhu, Bingyao; Mäder, Ulrike; Stülke, Jörg

    2016-01-04

    To understand living cells, we need knowledge of each of their parts as well as of the interactions between these parts. To gain rapid and comprehensive access to this information, annotation databases are required. Here, we present SubtiWiki 2.0, the integrated database for the model bacterium Bacillus subtilis (http://subtiwiki.uni-goettingen.de/). SubtiWiki provides text-based access to published information about the genes and proteins of B. subtilis as well as presentations of metabolic and regulatory pathways. Moreover, manually curated protein-protein interaction diagrams are linked to the protein pages. Finally, expression data are shown for gene expression under 104 different conditions, as well as absolute protein quantification for cytoplasmic proteins. To facilitate mobile use of SubtiWiki, we have now extended it with apps available for iOS and Android devices. Importantly, the apps allow users to link private notes and pictures to the gene/protein pages. Today, SubtiWiki has become one of the most complete collections of knowledge on a living organism in a single resource.

  1. ZFIN, the Zebrafish Model Organism Database: increased support for mutants and transgenics.

    PubMed

    Howe, Douglas G; Bradford, Yvonne M; Conlin, Tom; Eagle, Anne E; Fashena, David; Frazer, Ken; Knight, Jonathan; Mani, Prita; Martin, Ryan; Moxon, Sierra A Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruef, Barbara J; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Sprunger, Brock; Van Slyke, Ceri E; Westerfield, Monte

    2013-01-01

    ZFIN, the Zebrafish Model Organism Database (http://zfin.org), is the central resource for zebrafish genetic, genomic, phenotypic and developmental data. ZFIN curators manually curate and integrate comprehensive data involving zebrafish genes, mutants, transgenics, phenotypes, genotypes, gene expression, morpholinos, antibodies, anatomical structures and publications. Integrated views of these data, as well as data gathered through collaborations and data exchanges, are provided through a wide selection of web-based search forms. Among the vertebrate model organisms, zebrafish are uniquely well suited for rapid and targeted generation of mutant lines. The recent rapid production of mutant and transgenic zebrafish is making management of data associated with these resources particularly important to the research community. Here, we describe recent enhancements to ZFIN aimed at improving our support for mutant and transgenic lines, including (i) enhanced mutant/transgenic search functionality; (ii) more expressive phenotype curation methods; (iii) new download files and archival data access; (iv) incorporation of new data loads from laboratories undertaking large-scale generation of mutant or transgenic lines and (v) new GBrowse tracks for transgenic insertions, genes with antibodies and morpholinos.

  2. ZFIN, the Zebrafish Model Organism Database: increased support for mutants and transgenics

    PubMed Central

    Howe, Douglas G.; Bradford, Yvonne M.; Conlin, Tom; Eagle, Anne E.; Fashena, David; Frazer, Ken; Knight, Jonathan; Mani, Prita; Martin, Ryan; Moxon, Sierra A. Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruef, Barbara J.; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Sprunger, Brock; Van Slyke, Ceri E.; Westerfield, Monte

    2013-01-01

    ZFIN, the Zebrafish Model Organism Database (http://zfin.org), is the central resource for zebrafish genetic, genomic, phenotypic and developmental data. ZFIN curators manually curate and integrate comprehensive data involving zebrafish genes, mutants, transgenics, phenotypes, genotypes, gene expression, morpholinos, antibodies, anatomical structures and publications. Integrated views of these data, as well as data gathered through collaborations and data exchanges, are provided through a wide selection of web-based search forms. Among the vertebrate model organisms, zebrafish are uniquely well suited for rapid and targeted generation of mutant lines. The recent rapid production of mutant and transgenic zebrafish is making management of data associated with these resources particularly important to the research community. Here, we describe recent enhancements to ZFIN aimed at improving our support for mutant and transgenic lines, including (i) enhanced mutant/transgenic search functionality; (ii) more expressive phenotype curation methods; (iii) new download files and archival data access; (iv) incorporation of new data loads from laboratories undertaking large-scale generation of mutant or transgenic lines and (v) new GBrowse tracks for transgenic insertions, genes with antibodies and morpholinos. PMID:23074187

  3. The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching

    PubMed Central

    Howe, Douglas G.; Bradford, Yvonne M.; Eagle, Anne; Fashena, David; Frazer, Ken; Kalita, Patrick; Mani, Prita; Martin, Ryan; Moxon, Sierra Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri; Westerfield, Monte

    2017-01-01

    The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expression, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search. PMID:27899582

  4. The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching.

    PubMed

    Howe, Douglas G; Bradford, Yvonne M; Eagle, Anne; Fashena, David; Frazer, Ken; Kalita, Patrick; Mani, Prita; Martin, Ryan; Moxon, Sierra Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri; Westerfield, Monte

    2017-01-04

    The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expression, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, 'Fish' records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search.

  5. Xenbase, the Xenopus model organism database; new virtualized system, data types and genomes

    PubMed Central

    Karpinka, J. Brad; Fortriede, Joshua D.; Burns, Kevin A.; James-Zorn, Christina; Ponferrada, Virgilio G.; Lee, Jacqueline; Karimi, Kamran; Zorn, Aaron M.; Vize, Peter D.

    2015-01-01

    Xenbase (http://www.xenbase.org), the Xenopus frog model organism database, integrates a wide variety of data from this biomedical model genus. Two closely related species are represented: the allotetraploid Xenopus laevis, which is widely used for microinjection and tissue explant-based protocols, and the diploid Xenopus tropicalis, which is used for genetics and gene targeting. The two species are extremely similar, and protocols, reagents and results from each species are often interchangeable. Xenbase imports, indexes, curates and manages data from both species, all of which are mapped via unique IDs and can be queried in either a species-specific or species-agnostic manner. All our services have now migrated to a private cloud to achieve better performance and reliability. We have added new content, including full support for morpholino reagents, which are used to inhibit mRNA translation or splicing or to bind regulatory microRNAs. New genomes assembled by the JGI for both species are displayed in GBrowse and are also available for BLAST searches. Researchers can easily navigate from genome content to gene page reports, literature, experimental reagents and many other features using hyperlinks. Xenbase has also greatly expanded image content for figures published in papers describing Xenopus research via PubMed Central. PMID:25313157

  6. Design, Implementation and Maintenance of a Model Organism Database for Arabidopsis thaliana

    PubMed Central

    Weems, Danforth; Miller, Neil; Garcia-Hernandez, Margarita; Huala, Eva

    2004-01-01

    The Arabidopsis Information Resource (TAIR) is a web-based community database for the model plant Arabidopsis thaliana. It provides an integrated view of genes, sequences, proteins, germplasms, clones, metabolic pathways, gene expression, ecotypes, polymorphisms, publications, maps and community information. TAIR is developed and maintained through a collaboration between software developers and biologists. Biologists provide specifications and use cases for the system; acquire, analyse and curate data; interact with users; and test the software. Software developers design, implement and test the database and software. In this review, we briefly describe how TAIR was built and is being maintained. PMID:18629167

  7. Developing a biocuration workflow for AgBase, a non-model organism database.

    PubMed

    Pillai, Lakshmi; Chouvarine, Philippe; Tudor, Catalina O; Schmidt, Carl J; Vijay-Shanker, K; McCarthy, Fiona M

    2012-01-01

    AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface, which ranks gene products for annotation. Biocurators select the top-ranked gene and mark annotation for these genes as 'in progress' or 'completed'; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface for adding or modifying AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based text-mining tool that associates ranked, informative terms (iTerms), and the articles and sentences containing them, with genes. Moreover, iTerms are linked to GO terms where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to the gene in question, and filtering is applied to remove abstracts that mention a gene only in passing. The BI is linked to our Journal Database (JDB), where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from Electronic Annotation (IEA) of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in AgBase. Database URL
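
    The iTerm-to-GO linking step described above (an informative term matches either a GO term name or one of its synonyms) amounts to a dictionary lookup. A minimal Python sketch; the two GO entries are real terms, but the tiny table and exact-match rule are only illustrative assumptions about the matching:

        # Link an eGIFT iTerm to GO terms by exact name/synonym match.
        go_terms = {
            "GO:0006096": {"name": "glycolytic process", "synonyms": {"glycolysis"}},
            "GO:0008152": {"name": "metabolic process", "synonyms": {"metabolism"}},
        }

        def link_iterm(iterm):
            term = iterm.lower()
            return [gid for gid, entry in go_terms.items()
                    if term == entry["name"] or term in entry["synonyms"]]

        print(link_iterm("glycolysis"))  # ['GO:0006096']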

  8. Developing a biocuration workflow for AgBase, a non-model organism database

    PubMed Central

    Pillai, Lakshmi; Chouvarine, Philippe; Tudor, Catalina O.; Schmidt, Carl J.; Vijay-Shanker, K.; McCarthy, Fiona M.

    2012-01-01

    AgBase provides annotation for agricultural gene products using the Gene Ontology (GO) and Plant Ontology, as appropriate. Unlike model organism species, agricultural species have a body of literature that does not just focus on gene function; to improve efficiency, we use text mining to identify literature for curation. The first component of our annotation interface is the gene prioritization interface, which ranks gene products for annotation. Biocurators select the top-ranked gene and mark annotation for these genes as ‘in progress’ or ‘completed’; links enable biocurators to move directly to our biocuration interface (BI). Our BI includes all current GO annotation for gene products and is the main interface for adding or modifying AgBase curation data. The BI also displays Extracting Genic Information from Text (eGIFT) results for each gene product. eGIFT is a web-based text-mining tool that associates ranked, informative terms (iTerms), and the articles and sentences containing them, with genes. Moreover, iTerms are linked to GO terms where they match either a GO term name or a synonym. This enables AgBase biocurators to rapidly identify literature for further curation based on possible GO terms. Because most agricultural species do not have standardized literature, eGIFT searches all gene names and synonyms to associate articles with genes. As many gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to the gene in question, and filtering is applied to remove abstracts that mention a gene only in passing. The BI is linked to our Journal Database (JDB), where corresponding journal citations are stored. Just as importantly, biocurators also add to the JDB citations that have no GO annotation. The AgBase BI also supports bulk annotation upload to facilitate our Inferred from Electronic Annotation (IEA) of agricultural gene products. All annotations must pass standard GO Consortium quality checking before release in Ag

  9. Pancreatic Expression database: a generic model for the organization, integration and mining of complex cancer datasets

    PubMed Central

    Chelala, Claude; Hahn, Stephan A; Whiteman, Hannah J; Barry, Sayka; Hariharan, Deepak; Radon, Tomasz P; Lemoine, Nicholas R; Crnogorac-Jurcevic, Tatjana

    2007-01-01

    the progression of cancer, cross-platform meta-analysis, SNP selection for pancreatic cancer association studies, cancer gene promoter analysis as well as mining cancer ontology information. The data model is generic and can be easily extended and applied to other types of cancer. The database is available online with no restrictions for the scientific community at . PMID:18045474

  10. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms

    PubMed Central

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-01-01

    Next-generation sequencing (NGS) is revolutionising marker development, and the rapidly increasing number of transcriptomes published across a wide variety of taxa provides valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (ESTs), for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and two distinct ecotypes. PMID:24296905

  11. Combining next-generation sequencing and online databases for microsatellite development in non-model organisms.

    PubMed

    Rico, Ciro; Normandeau, Eric; Dion-Côté, Anne-Marie; Rico, María Inés; Côté, Guillaume; Bernatchez, Louis

    2013-12-03

    Next-generation sequencing (NGS) is revolutionising marker development, and the rapidly increasing number of transcriptomes published across a wide variety of taxa provides valuable sequence databases for the identification of genetic markers without the need to generate new sequences. Microsatellites are still the most important source of polymorphic markers in ecology and evolution. Motivated by our long-term interest in the adaptive radiation of a non-model species complex of whitefishes (Coregonus spp.), in this study we focus on microsatellite characterisation and multiplex optimisation using transcriptome sequences generated by Illumina® and Roche-454, as well as online databases of Expressed Sequence Tags (ESTs), for the study of whitefish evolution and demographic history. We identified and optimised 40 polymorphic loci in multiplex PCR reactions and validated the robustness of our analyses by testing several population genetics and phylogeographic predictions using 494 fish from five lakes and two distinct ecotypes.
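
    At its core, microsatellite characterisation from transcriptome or EST sequences is a scan for short tandem repeats, which can be sketched with a regular expression. The 2-4 bp motif range and the five-unit minimum below are arbitrary thresholds chosen for illustration, not the criteria used in the study:

        import re

        # A 2-4 bp motif repeated at least five times in total (motif + 4 copies).
        STR_PATTERN = re.compile(r"([ACGT]{2,4}?)\1{4,}")

        def find_microsatellites(seq):
            """Return (position, repeat, motif) for each tandem repeat found."""
            return [(m.start(), m.group(0), m.group(1))
                    for m in STR_PATTERN.finditer(seq)]

        # Finds the AC dinucleotide and ATG trinucleotide repeats in this toy read.
        print(find_microsatellites("TTACACACACACACGGCATGATGATGATGATGA"))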

  12. Immediate dissemination of student discoveries to a model organism database enhances classroom-based research experiences.

    PubMed

    Wiley, Emily A; Stover, Nicholas A

    2014-01-01

    Use of inquiry-based research modules in the classroom has soared over recent years, largely in response to national calls for teaching that provides experience with scientific processes and methodologies. To increase the visibility of in-class studies among interested researchers and to strengthen their impact on student learning, we have extended the typical model of inquiry-based labs to include a means for targeted dissemination of student-generated discoveries. This initiative required: 1) creating a set of research-based lab activities with the potential to yield results that a particular scientific community would find useful and 2) developing a means for immediate sharing of student-generated results. Working toward these goals, we designed guides for course-based research aimed to fulfill the need for functional annotation of the Tetrahymena thermophila genome, and developed an interactive Web database that links directly to the official Tetrahymena Genome Database for immediate, targeted dissemination of student discoveries. This combination of research via the course modules and the opportunity for students to immediately "publish" their novel results on a Web database actively used by outside scientists culminated in a motivational tool that enhanced students' efforts to engage the scientific process and pursue additional research opportunities beyond the course.

  13. Immediate Dissemination of Student Discoveries to a Model Organism Database Enhances Classroom-Based Research Experiences

    PubMed Central

    Wiley, Emily A.; Stover, Nicholas A.

    2014-01-01

    Use of inquiry-based research modules in the classroom has soared over recent years, largely in response to national calls for teaching that provides experience with scientific processes and methodologies. To increase the visibility of in-class studies among interested researchers and to strengthen their impact on student learning, we have extended the typical model of inquiry-based labs to include a means for targeted dissemination of student-generated discoveries. This initiative required: 1) creating a set of research-based lab activities with the potential to yield results that a particular scientific community would find useful and 2) developing a means for immediate sharing of student-generated results. Working toward these goals, we designed guides for course-based research aimed to fulfill the need for functional annotation of the Tetrahymena thermophila genome, and developed an interactive Web database that links directly to the official Tetrahymena Genome Database for immediate, targeted dissemination of student discoveries. This combination of research via the course modules and the opportunity for students to immediately “publish” their novel results on a Web database actively used by outside scientists culminated in a motivational tool that enhanced students’ efforts to engage the scientific process and pursue additional research opportunities beyond the course. PMID:24591511

  14. MaizeGDB: The Maize Model Organism Database for Basic, Translational, and Applied Research

    PubMed Central

    Lawrence, Carolyn J.; Harper, Lisa C.; Schaeffer, Mary L.; Sen, Taner Z.; Seigfried, Trent E.; Campbell, Darwin A.

    2008-01-01

    In 2001 maize became the number one production crop in the world with the Food and Agriculture Organization of the United Nations reporting over 614 million tonnes produced. Its success is due to the high productivity per acre in tandem with a wide variety of commercial uses. Not only is maize an excellent source of food, feed, and fuel, but also its by-products are used in the production of various commercial products. Maize's unparalleled success in agriculture stems from basic research, the outcomes of which drive breeding and product development. In order for basic, translational, and applied researchers to benefit from others' investigations, newly generated data must be made freely and easily accessible. MaizeGDB is the maize research community's central repository for genetics and genomics information. The overall goals of MaizeGDB are to facilitate access to the outcomes of maize research by integrating new maize data into the database and to support the maize research community by coordinating group activities. PMID:18769488

  15. Use of Model Organism and Disease Databases to Support Matchmaking for Human Disease Gene Discovery

    PubMed Central

    Mungall, Christopher J.; Washington, Nicole L.; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N.; Lewis, Suzanna E.; Haendel, Melissa A.

    2017-01-01

    The Matchmaker Exchange API allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant-disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enables matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. PMID:26269093

  16. Use of model organism and disease databases to support matchmaking for human disease gene discovery.

    PubMed

    Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A

    2015-10-01

    The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant-disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enables matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.

  17. Database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.

    1991-01-01

    A propagation researcher or a systems engineer who intends to use the results of a propagation experiment is generally faced with various database tasks, such as selecting the computer software and hardware and writing the programs to pass the data through the models of interest. This task is repeated every time a new experiment is conducted or the same experiment is carried out at a different location, generating different data. Thus the users of these data have to spend a considerable portion of their time learning how to implement the computer hardware and software toward the desired end. This situation could be improved considerably if an easily accessible propagation database were created containing all the accepted (standardized) propagation phenomena models approved by the propagation research community. The handling of data would also become easier for the user. Such a database can only stimulate the growth of propagation research if it is available to all researchers, so that the results of an experiment conducted by one researcher can be examined independently by another, without different hardware and software being used. The database may be made flexible so that researchers need not be confined to its contents. The database can also help researchers in that they will not have to document the software and hardware tools used in their research, since the propagation research community will already know the database. The following sections present a possible database design, as well as the properties such a database should have for propagation research.
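
    In code, the envisioned resource is essentially a registry of user-callable model functions through which experiment data can be passed. A minimal Python sketch under that reading; the registry API and the power-law attenuation placeholder (with invented constants) are assumptions, not any standardized propagation model:

        MODELS = {}  # model name -> callable implementing it

        def register(name):
            """Decorator that files a model function in the shared registry."""
            def wrap(fn):
                MODELS[name] = fn
                return fn
            return wrap

        @register("rain_attenuation")
        def rain_attenuation(rain_rate_mm_h, k=0.02, alpha=1.2):
            """Placeholder power-law model k * R**alpha (dB/km); constants invented."""
            return k * rain_rate_mm_h ** alpha

        # Pass measured rain rates from an experiment through a registered model.
        for rate in (10.0, 25.0, 50.0):
            print(rate, round(MODELS["rain_attenuation"](rate), 3))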

  18. SubtiWiki 2.0—an integrated database for the model organism Bacillus subtilis

    PubMed Central

    Michna, Raphael H.; Zhu, Bingyao; Mäder, Ulrike; Stülke, Jörg

    2016-01-01

    To understand living cells, we need knowledge of each of their parts as well as of the interactions between these parts. To gain rapid and comprehensive access to this information, annotation databases are required. Here, we present SubtiWiki 2.0, the integrated database for the model bacterium Bacillus subtilis (http://subtiwiki.uni-goettingen.de/). SubtiWiki provides text-based access to published information about the genes and proteins of B. subtilis as well as presentations of metabolic and regulatory pathways. Moreover, manually curated protein-protein interaction diagrams are linked to the protein pages. Finally, expression data are shown for gene expression under 104 different conditions, as well as absolute protein quantification for cytoplasmic proteins. To facilitate mobile use of SubtiWiki, we have now extended it with apps available for iOS and Android devices. Importantly, the apps allow users to link private notes and pictures to the gene/protein pages. Today, SubtiWiki has become one of the most complete collections of knowledge on a living organism in a single resource. PMID:26433225

  19. A Database for Propagation Models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Rucker, James

    1997-01-01

    The Propagation Models Database is designed to allow scientists and experimenters in the propagation field to process their data through many known and accepted propagation models. The database is Excel 5.0-based software that houses user-callable models of propagation phenomena. It does not contain a database of propagation data generated by the experiments. The database not only provides a powerful software tool to process the data generated by experiments, but is also a time- and energy-saving tool for plotting results, generating tables and producing impressive and crisp hard copy for presentation and filing.

  20. Integration of an Evidence Base into a Probabilistic Risk Assessment Model. The Integrated Medical Model Database: An Organized Evidence Base for Assessing In-Flight Crew Health Risk and System Design

    NASA Technical Reports Server (NTRS)

    Saile, Lynn; Lopez, Vilma; Bickham, Grandin; FreiredeCarvalho, Mary; Kerstman, Eric; Byrne, Vicky; Butler, Douglas; Myers, Jerry; Walton, Marlei

    2011-01-01

    This slide presentation reviews the Integrated Medical Model (IMM) database, an organized evidence base for assessing in-flight crew health risk. The database is a relational database accessible to multiple users. It quantifies model inputs with a Level of Evidence (LOE) ranking, based on the highest value of the data, and a Quality of Evidence (QOE) score that assesses the evidence base for each medical condition. The IMM evidence base has already provided invaluable information for designers, and for other uses.

  1. A database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Suwitra, Krisjani S.

    1992-01-01

    In June 1991, a paper outlining the development of a database for propagation models was presented at the fifteenth NASA Propagation Experimenters Meeting (NAPEX 15). The database is designed to allow scientists and experimenters in the propagation field to process their data through any known and accepted propagation model. The architecture of the database also incorporates the possibility of changing the standard models in the database to fit the scientist's or the experimenter's needs. The database not only provides powerful software to process the data generated by experiments, but is also a time- and energy-saving tool for plotting results, generating tables, and producing impressive and crisp hard copy for presentation and filing.

  2. Organizing a breast cancer database: data management.

    PubMed

    Yi, Min; Hunt, Kelly K

    2016-06-01

    Developing and organizing a breast cancer database can provide data and serve as a valuable research tool for those interested in the etiology, diagnosis, and treatment of cancer. Depending on the research setting, the quality of the data can be a major issue. Ensuring that the data collection process does not introduce inaccuracies helps assure the overall quality of subsequent analyses. Data management involves the planning, development, implementation, and administration of systems for the acquisition, storage, and retrieval of data while protecting the data with high security levels. A properly designed database provides access to up-to-date, accurate information. Database design is an important component of application design: if you take the time to design your databases properly, you will be rewarded with a solid application foundation on which you can build the rest of your application.

  3. Building a Database for a Quantitative Model

    NASA Technical Reports Server (NTRS)

    Kahn, C. Joseph; Kleinhammer, Roger

    2014-01-01

    A database can greatly benefit a quantitative analysis. The defining characteristic of a quantitative risk, or reliability, model is the use of failure-estimate data. Models can easily contain a thousand Basic Events, relying on hundreds of individual data sources. Obviously, entering so much data by hand will eventually lead to errors. Less obviously, entering data this way does nothing to link the Basic Events to their data sources. The best way to organize large amounts of data on a computer is with a database, but a model does not require a large, enterprise-level database with dedicated developers and administrators. A database built in Excel can be quite sufficient. A simple spreadsheet database can link every Basic Event to the individual data source selected for it. This database can also contain the manipulations appropriate to how the data are used in the model, including stressing factors based on use and maintenance cycles, dormancy, unique failure modes, the modeling of multiple items as a single "super component" Basic Event, and Bayesian updating based on flight and testing experience. A simple, unique metadata field in both the model and the database provides a link from any Basic Event in the model to its data source and all relevant calculations. The credibility of the entire model often rests on the credibility and traceability of the data.
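    A minimal sketch of the linking idea described above: a unique metadata key shared by the model and the spreadsheet database ties every Basic Event to its data source and its rate manipulations. All identifiers, fields, and factor values below are hypothetical illustrations, not the authors' actual schema.

      # Hedged sketch: linking Basic Events to data sources via a shared key.
      # The "database" side: one row per Basic Event, keyed by a unique ID,
      # carrying the cited source and a manipulation applied to the raw rate.
      failure_data = {
          "BE-0001": {"source": "MIL-HDBK-217F", "base_rate": 2.0e-6, "stress_factor": 1.5},
          "BE-0002": {"source": "Vendor test report 42", "base_rate": 5.0e-7, "stress_factor": 1.0},
      }

      # The "model" side: Basic Events reference the database only via the key.
      model_basic_events = ["BE-0001", "BE-0002"]

      for be_id in model_basic_events:
          rec = failure_data[be_id]
          # The manipulated value actually used by the model, traceable to its source.
          effective_rate = rec["base_rate"] * rec["stress_factor"]
          print(f"{be_id}: {effective_rate:.2e}/h (traceable to: {rec['source']})")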

  4. A database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Suwitra, Krisjani; Le, Choung

    1994-01-01

    A database of propagation phenomena models that telecommunications systems engineers can use to obtain parameter values for systems design is presented. It is an easy-to-use tool, currently available for a PC running Excel under Windows or for a Macintosh running Excel. All the steps necessary to use the software are easy and often self-explanatory; a sample run of the CCIR rain attenuation model is also presented.
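    For flavor, a hedged sketch of the kind of computation such a database wraps: the CCIR/ITU-R power law for specific rain attenuation, gamma = k * R**alpha in dB/km. The k and alpha values below are placeholders; real coefficients depend on frequency and polarization and come from the model's published coefficient tables.

      # Hedged sketch of the CCIR-style rain attenuation power law.
      def specific_attenuation(rain_rate_mm_per_h: float, k: float, alpha: float) -> float:
          """Specific attenuation in dB/km for a given rain rate (mm/h)."""
          return k * rain_rate_mm_per_h ** alpha

      # Assumed, illustrative coefficients only (not values from the model tables).
      k, alpha = 0.03, 1.1
      for rain_rate in (5, 25, 50):  # mm/h
          gamma = specific_attenuation(rain_rate, k, alpha)
          print(f"R = {rain_rate:3d} mm/h -> gamma = {gamma:.2f} dB/km")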

  5. A database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Suwitra, Krisjani; Le, Chuong

    1995-01-01

    A database of propagation phenomena models that telecommunications systems engineers can use to obtain parameter values for systems design is presented. It is an easy-to-use tool, currently available for a PC running Excel under Windows or for a Macintosh running Excel. All the steps necessary to use the software are easy and often self-explanatory.

  6. Nonbibliographic Databases in a Corporate Health, Safety, and Environment Organization.

    ERIC Educational Resources Information Center

    Cubillas, Mary M.

    1981-01-01

    Summarizes the characteristics of TOXIN, CHEMFILE, and the Product Profile Information System (PPIS), nonbibliographic databases used by Shell Oil Company's Health, Safety, and Environment Organization. (FM)

  7. Software Engineering Laboratory (SEL) database organization and user's guide

    NASA Technical Reports Server (NTRS)

    So, Maria; Heller, Gerard; Steinberg, Sandra; Spiegel, Douglas

    1989-01-01

    The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base tables is described. In addition, techniques for accessing the database, through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL), are discussed.

  8. Community Organizing for Database Trial Buy-In by Patrons

    ERIC Educational Resources Information Center

    Pionke, J. J.

    2015-01-01

    Database trials do not often garner a lot of feedback. Using community-organizing techniques can not only potentially increase the amount of feedback received but also deepen the relationship between the librarian and his or her constituent group. This is a case study of the use of community-organizing techniques in a series of database trials for…

  10. DEPOT: A Database of Environmental Parameters, Organizations and Tools

    SciTech Connect

    CARSON,SUSAN D.; HUNTER,REGINA LEE; MALCZYNSKI,LEONARD A.; POHL,PHILLIP I.; QUINTANA,ENRICO; SOUZA,CAROLINE A.; HIGLEY,KATHRYN; MURPHIE,WILLIAM

    2000-12-19

    The Database of Environmental Parameters, Organizations, and Tools (DEPOT) has been developed by the Department of Energy (DOE) as a central warehouse for access to data essential for environmental risk assessment analyses. Initial efforts have concentrated on groundwater and vadose zone transport data and bioaccumulation factors. DEPOT seeks to provide a source of referenced data that, wherever possible, includes the level of uncertainty associated with these parameters. Based on the amount of data available for a particular parameter, uncertainty is expressed as a standard deviation or a distribution function. DEPOT also provides DOE site-specific performance assessment data, pathway-specific transport data, and links to environmental regulations, disposal site waste acceptance criteria, other environmental parameter databases, and environmental risk assessment models.

  11. Building an organ-specific carcinogenic database for SAR analyses.

    PubMed

    Young, John; Tong, Weida; Fang, Hong; Xie, Qian; Pearce, Bruce; Hashemi, Ray; Beger, Richard; Cheeseman, Mitchell; Chen, James; Chang, Yuan-Chin; Kodell, Ralph

    2004-09-10

    FDA reviewers need a means to rapidly predict organ-specific carcinogenicity to aid in evaluating new chemicals submitted for approval. This research addressed the building of a database to use in developing a predictive model for such an application based on structure-activity relationships (SAR). The Internet availability of the Carcinogenic Potency Database (CPDB) provided a solid foundation on which to base such a model. The addition of molecular structures to the CPDB provided the extra ingredient necessary for SAR analyses. However, the CPDB had to be compressed from a multirecord to a single-record-per-chemical database; multiple records representing each gender, species, route of administration, and organ-specific toxicity had to be summarized into a single record for each study. Multiple studies on a single chemical had to be further reduced using a hierarchical scheme. Structural cleanup involved removing all chemicals that would impede the accurate generation of SAR-type descriptors by commercial software programs; that is, inorganic chemicals, mixtures, and organometallics were removed. Counterions such as Na, K, sulfates, hydrates, and salts were also removed for structural consistency. Structural modification sometimes resulted in duplicate records, which also had to be reduced to a single record based on the hierarchical scheme. The modified database containing 999 chemicals was evaluated for liver-specific carcinogenicity using a variety of analysis techniques. These preliminary analyses all yielded approximately the same results, with an overall predictability of about 63%, comprising a sensitivity of about 30% and a specificity of about 77%. Copyright Taylor & Francis Inc.
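    A minimal sketch of the compression step described above, using pandas: multiple study records per chemical are collapsed to a single record by a hierarchical preference. The column names and the ranking are hypothetical stand-ins for the authors' scheme.

      # Hedged sketch: collapse a multirecord table to one record per chemical.
      import pandas as pd

      records = pd.DataFrame({
          "chemical": ["A", "A", "B", "B", "B"],
          "species":  ["rat", "mouse", "rat", "rat", "mouse"],
          "liver_carcinogen": [1, 0, 0, 1, 0],
          "study_rank": [2, 1, 3, 1, 2],  # 1 = most preferred study in the hierarchy
      })

      # Keep the single most-preferred study per chemical.
      collapsed = (records.sort_values("study_rank")
                          .drop_duplicates(subset="chemical", keep="first")
                          .reset_index(drop=True))
      print(collapsed)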

  12. Conceptual and logical level of database modeling

    NASA Astrophysics Data System (ADS)

    Hunka, Frantisek; Matula, Jiri

    2016-06-01

    Conceptual and logical levels form the topmost levels of database modeling. Usually, ORM (Object Role Modeling) and ER diagrams are utilized to capture the corresponding schema. The final aim of business process modeling is to store its results in the form of a database solution. For this reason, value-oriented business process modeling, which utilizes ER diagrams to express the modeled entities and the relationships between them, is used. However, ER diagrams form the logical level of the database schema. To extend the possibilities of different business process modeling methodologies, the conceptual level of database modeling is needed. The paper deals with the REA value modeling approach to business process modeling using ER diagrams, and derives a conceptual model utilizing the ORM modeling approach. The conceptual model extends the possibilities of value modeling to other business modeling approaches.

  13. Organ system heterogeneity DB: a database for the visualization of phenotypes at the organ system level.

    PubMed

    Mannil, Deepthi; Vogt, Ingo; Prinz, Jeanette; Campillos, Monica

    2015-01-01

    Perturbations of mammalian organisms including diseases, drug treatments and gene perturbations in mice affect organ systems differently. Some perturbations impair relatively few organ systems while others lead to highly heterogeneous or systemic effects. Organ System Heterogeneity DB (http://mips.helmholtz-muenchen.de/Organ_System_Heterogeneity/) provides information on the phenotypic effects of 4865 human diseases, 1667 drugs and 5361 genetically modified mouse models on 26 different organ systems. Disease symptoms, drug side effects and mouse phenotypes are mapped to the System Organ Class (SOC) level of the Medical Dictionary of Regulatory Activities (MedDRA). Then, the organ system heterogeneity value, a measurement of the systemic impact of a perturbation, is calculated from the relative frequency of phenotypic features across all SOCs. For perturbations of interest, the database displays the distribution of phenotypic effects across organ systems along with the heterogeneity value and the distance between organ system distributions. In this way, it allows, in an easy and comprehensible fashion, the comparison of the phenotypic organ system distributions of diseases, drugs and their corresponding genetically modified mouse models of associated disease genes and drug targets. The Organ System Heterogeneity DB is thus a platform for the visualization and comparison of organ system level phenotypic effects of drugs, diseases and genes.

  14. Organ system heterogeneity DB: a database for the visualization of phenotypes at the organ system level

    PubMed Central

    Mannil, Deepthi; Vogt, Ingo; Prinz, Jeanette; Campillos, Monica

    2015-01-01

    Perturbations of mammalian organisms including diseases, drug treatments and gene perturbations in mice affect organ systems differently. Some perturbations impair relatively few organ systems while others lead to highly heterogeneous or systemic effects. Organ System Heterogeneity DB (http://mips.helmholtz-muenchen.de/Organ_System_Heterogeneity/) provides information on the phenotypic effects of 4865 human diseases, 1667 drugs and 5361 genetically modified mouse models on 26 different organ systems. Disease symptoms, drug side effects and mouse phenotypes are mapped to the System Organ Class (SOC) level of the Medical Dictionary of Regulatory Activities (MedDRA). Then, the organ system heterogeneity value, a measurement of the systemic impact of a perturbation, is calculated from the relative frequency of phenotypic features across all SOCs. For perturbations of interest, the database displays the distribution of phenotypic effects across organ systems along with the heterogeneity value and the distance between organ system distributions. In this way, it allows, in an easy and comprehensible fashion, the comparison of the phenotypic organ system distributions of diseases, drugs and their corresponding genetically modified mouse models of associated disease genes and drug targets. The Organ System Heterogeneity DB is thus a platform for the visualization and comparison of organ system level phenotypic effects of drugs, diseases and genes. PMID:25313158
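    The abstract defines the heterogeneity value only in outline: it is calculated from the relative frequency of phenotypic features across all SOCs. A sketch of one plausible such measure, normalized Shannon entropy over the SOC frequency distribution, purely for illustration and not the database's actual formula:

      # Hedged sketch: a normalized-entropy heterogeneity measure over 26 SOCs.
      import math

      def heterogeneity(counts_per_soc):
          """0.0 = all features in one organ system; 1.0 = spread evenly over all."""
          total = sum(counts_per_soc)
          probs = [c / total for c in counts_per_soc if c > 0]
          entropy = -sum(p * math.log(p) for p in probs)
          return entropy / math.log(len(counts_per_soc))  # normalize by max entropy

      focal = [40, 2, 1] + [0] * 23   # phenotypes concentrated in one of 26 SOCs
      systemic = [3] * 26             # phenotypes spread across all 26 SOCs
      print(f"focal: {heterogeneity(focal):.2f}, systemic: {heterogeneity(systemic):.2f}")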

  15. Data-based mechanistic modeling of dissolved organic carbon load through storms using continuous 15-minute resolution observations within UK upland watersheds

    NASA Astrophysics Data System (ADS)

    Jones, T.; Chappell, N. A.

    2013-12-01

    Few watershed modeling studies have addressed DOC dynamics through storm hydrographs (notable exceptions include Boyer et al., 1997 Hydrol Process; Jutras et al., 2011 Ecol Model; Xu et al., 2012 Water Resour Res). In part this has been a consequence of an incomplete understanding of the biogeochemical processes leading to DOC export to streams (Neff & Asner, 2001, Ecosystems) & an insufficient frequency of DOC monitoring to capture sometimes complex time-varying relationships between DOC & storm hydrographs (Kirchner et al., 2004, Hydrol Process). We present the results of a new & ongoing UK study that integrates two components - 1/ New observations of DOC concentrations (& derived load) continuously monitored at 15 minute intervals through multiple seasons for replicated watersheds; & 2/ A dynamic modeling technique that is able to quantify storage-decay effects, plus hysteretic, nonlinear, lagged & non-stationary relationships between DOC & controlling variables (including rainfall, streamflow, temperature & specific biogeochemical variables e.g., pH, nitrate). DOC concentration is being monitored continuously using the latest generation of UV spectrophotometers (i.e. S::CAN spectro::lysers) with in situ calibrations to laboratory analyzed DOC. The controlling variables are recorded simultaneously at the same stream stations. The watersheds selected for study are among the most intensively studied basins in the UK uplands, namely the Plynlimon & Llyn Brianne experimental basins. All contain areas of organic soils, with three having improved grasslands & three conifer afforested. The dynamic response characteristics (DRCs) that describe detailed DOC behaviour through sequences of storms are simulated using the latest identification routines for continuous time transfer function (CT-TF) models within the Matlab-based CAPTAIN toolbox (some incorporating nonlinear components). To our knowledge this is the first application of CT-TFs to modelling DOC processes

  16. Dynamic publication model for neurophysiology databases.

    PubMed

    Gardner, D; Abato, M; Knuth, K H; DeBellis, R; Erde, S M

    2001-08-29

    We have implemented a pair of database projects, one serving cortical electrophysiology and the other invertebrate neurones and recordings. The design for each combines aspects of two proven schemes for information interchange. The journal article metaphor determined the type, scope, organization and quantity of data to comprise each submission. Sequence databases encouraged intuitive tools for data viewing, capture, and direct submission by authors. Neurophysiology required transcending these models with new datatypes. Time-series, histogram and bivariate datatypes, including illustration-like wrappers, were selected by their utility to the community of investigators. As interpretation of neurophysiological recordings depends on context supplied by metadata attributes, searches are via visual interfaces to sets of controlled-vocabulary metadata trees. Neurones, for example, can be specified by metadata describing functional and anatomical characteristics. Permanence is advanced by data model and data formats largely independent of contemporary technology or implementation, including Java and the XML standard. All user tools, including dynamic data viewers that serve as a virtual oscilloscope, are Java-based, free, multiplatform, and distributed by our application servers to any contemporary networked computer. Copyright is retained by submitters; viewer displays are dynamic and do not violate copyright of related journal figures. Panels of neurophysiologists view and test schemas and tools, enhancing community support.
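    A hedged sketch of the kind of datatype the abstract names: a time-series record carrying controlled-vocabulary metadata so that searches can filter on functional and anatomical attributes. The field names are invented for illustration and are not the projects' actual schema.

      # Hedged sketch of a metadata-wrapped neurophysiology time-series datatype.
      from dataclasses import dataclass, field

      @dataclass
      class TimeSeriesRecord:
          data: list                  # sampled trace
          sampling_rate_hz: float
          units: str
          metadata: dict = field(default_factory=dict)  # controlled-vocabulary terms

      rec = TimeSeriesRecord(
          data=[0.1, 0.4, -0.2],
          sampling_rate_hz=10_000.0,
          units="mV",
          metadata={"neuron_type": "pyramidal", "region": "V1", "preparation": "in vivo"},
      )
      print(rec.metadata["region"], len(rec.data) / rec.sampling_rate_hz, "s")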

  17. Database design using NIAM (Nijssen Information Analysis Method) modeling

    SciTech Connect

    Stevens, N.H.

    1989-01-01

    The Nijssen Information Analysis Method (NIAM) is an information modeling technique based on semantics and founded in set theory. A NIAM information model is a graphical representation of the information requirements for some universe of discourse. Information models facilitate data integration and communication within an organization about data semantics. An information model is sometimes referred to as the semantic model or the conceptual schema. It helps in the logical and physical design and implementation of databases. NIAM information modeling is used at Sandia National Laboratories to design and implement relational databases containing engineering information which meet the users' information requirements. The paper focuses on the design of one database that satisfied the data needs of four disjoint but closely related applications. The applications as they existed before did not talk to each other, even though they stored much of the same data redundantly. NIAM was used to determine the information requirements and design the integrated database. 6 refs., 7 figs.

  18. The methodology of database design in organization management systems

    NASA Astrophysics Data System (ADS)

    Chudinov, I. L.; Osipova, V. V.; Bobrova, Y. V.

    2017-01-01

    The paper describes a unified methodology of database design for management information systems. Designing the conceptual information model for the domain area is the most important and labor-intensive stage in database design. Based on the proposed integrated approach to designing the conceptual information model, the main principles of developing relational databases are provided and users' information needs are considered. According to the methodology, the process of designing the conceptual information model includes three basic stages, which are defined in detail. Finally, the article describes the process of formalizing the results of analyzing users' information needs and the rationale for the use of classifiers.

  19. Spatial Database Organization for Multi-attribute Sensor Data Representation

    NASA Astrophysics Data System (ADS)

    Gouveia, Feliz R.; Barthes, Jean-Paul A.

    1990-03-01

    This paper surveys spatial database organization and modelling, which is becoming a crucial issue for an ever-increasing number of geometric data manipulation systems. We are interested in efficient representation and storage structures for rapid processing of large sets of geometric data, as required by robotics applications, Very Large Scale Integration (VLSI) layout design, cartography, Computer Aided Design (CAD), and geographic information systems (GIS), where frequent operations involve spatial reasoning over the data. Existing database systems lack the expressiveness to store some kinds of information that are inherently present in a geometric reasoning process, such as metric information, e.g. proximity, parallelism; or topological information, e.g. inclusion, intersection, contiguity, crossing. Geometric databases (GDB) alleviate this problem by providing an explicit representation of the spatial layout of the world in terms of empty and occupied space, together with a complete description of each object in it. Access to the data is done in an associative manner, that is, by specifying values over some usually small (sub)set of attributes, e.g. the coordinates of physical space. Manipulating data in GDB systems often involves spatially localized operations: locations, and consequently objects, that are accessed in the present are likely to be accessed again in the near future. This locality of reference, which Hegron [24] calls temporal coherence, is due mainly to real-world physical constraints. Indeed, if accesses are caused, for example, by a sensor module that inspects its surroundings, then it is reasonable to suppose that successively scanned territories are not very far apart.
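    A minimal sketch of associative, spatially localized access as discussed above: a uniform grid index maps coordinates to buckets, so a neighborhood query touches only nearby cells. The cell size, names, and data are illustrative assumptions.

      # Hedged sketch: a uniform grid index for associative access by coordinates.
      CELL = 10.0  # grid cell size in world units (assumed)

      index = {}   # (cell_x, cell_y) -> list of (object id, x, y)

      def insert(obj_id, x, y):
          index.setdefault((int(x // CELL), int(y // CELL)), []).append((obj_id, x, y))

      def query(x, y, radius):
          """Return objects within radius of (x, y), scanning only nearby cells."""
          hits, r_cells = [], int(radius // CELL) + 1
          cx, cy = int(x // CELL), int(y // CELL)
          for i in range(cx - r_cells, cx + r_cells + 1):
              for j in range(cy - r_cells, cy + r_cells + 1):
                  for obj_id, ox, oy in index.get((i, j), ()):
                      if (ox - x) ** 2 + (oy - y) ** 2 <= radius ** 2:
                          hits.append(obj_id)
          return hits

      insert("sensor-A", 12.0, 7.5)
      insert("sensor-B", 55.0, 43.0)
      print(query(10.0, 9.0, 5.0))  # -> ['sensor-A']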

  20. Hierarchical clustering techniques for image database organization and summarization

    NASA Astrophysics Data System (ADS)

    Vellaikal, Asha; Kuo, C.-C. Jay

    1998-10-01

    This paper investigates clustering techniques as a method of organizing image databases to support popular visual management functions such as searching, browsing, and navigation. Different types of hierarchical agglomerative clustering techniques are studied as a method of organizing the feature space as well as summarizing image groups by the selection of a few appropriate representatives. Retrieval performance using both single- and multiple-level hierarchies is evaluated experimentally, and the algorithms show an interesting relationship between the top k correct retrievals and the number of comparisons required. Some arguments are given to support the use of such cluster-based techniques for managing distributed image databases.
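    A hedged sketch of the approach, assuming SciPy: agglomerative clustering over image feature vectors, with each cluster summarized by the member closest to its centroid. The random features stand in for real image descriptors, and the linkage choice is illustrative.

      # Hedged sketch: hierarchical clustering of image features plus
      # per-cluster representative selection.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster

      rng = np.random.default_rng(0)
      features = rng.random((60, 8))           # 60 "images", 8-D feature vectors

      Z = linkage(features, method="average")  # hierarchical agglomerative clustering
      labels = fcluster(Z, t=4, criterion="maxclust")  # cut the tree into 4 clusters

      # One representative per cluster: the member nearest its cluster centroid.
      for c in np.unique(labels):
          members = np.where(labels == c)[0]
          centroid = features[members].mean(axis=0)
          rep = members[np.argmin(np.linalg.norm(features[members] - centroid, axis=1))]
          print(f"cluster {c}: {len(members)} images, representative index {rep}")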

  1. Web resources for model organism studies.

    PubMed

    Tang, Bixia; Wang, Yanqing; Zhu, Junwei; Zhao, Wenming

    2015-02-01

    An ever-growing number of resources on model organisms have emerged with the continued development of sequencing technologies. In this paper, we review 13 databases of model organisms, most of which are reported by the National Institutes of Health of the United States (NIH; http://www.nih.gov/science/models/). We provide a brief description for each database, as well as detail its data source and types, functions, tools, and availability of access. In addition, we also provide a quality assessment of these databases. Significantly, the organism databases instituted in the early 1990s--such as the Mouse Genome Database (MGD), Saccharomyces Genome Database (SGD), and FlyBase--have developed into what are now comprehensive, core authority resources. Furthermore, all of the databases mentioned here are updated continually according to user feedback and with advancing technologies. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd. All rights reserved.

  2. Organic materials database: An open-access online database for data mining

    PubMed Central

    Geilhufe, R. Matthias; Balatsky, Alexander V.

    2017-01-01

    We present an organic materials database (OMDB) hosting thousands of Kohn-Sham electronic band structures, which is freely accessible online at http://omdb.diracmaterials.org. The OMDB focus lies on electronic structure, density of states and other properties for purely organic and organometallic compounds that are known to date. The electronic band structures are calculated using density functional theory for the crystal structures contained in the Crystallography Open Database. The OMDB web interface allows users to retrieve materials with specified target properties using non-trivial queries about their electronic structure. We illustrate the use of the OMDB and how it can become an organic part of search and prediction of novel functional materials via data mining techniques. As a specific example, we provide data mining results for metals and semiconductors, which are known to be rare in the class of organic materials. PMID:28182744

  3. Organic materials database: An open-access online database for data mining.

    PubMed

    Borysov, Stanislav S; Geilhufe, R Matthias; Balatsky, Alexander V

    2017-01-01

    We present an organic materials database (OMDB) hosting thousands of Kohn-Sham electronic band structures, which is freely accessible online at http://omdb.diracmaterials.org. The OMDB focus lies on electronic structure, density of states and other properties for purely organic and organometallic compounds that are known to date. The electronic band structures are calculated using density functional theory for the crystal structures contained in the Crystallography Open Database. The OMDB web interface allows users to retrieve materials with specified target properties using non-trivial queries about their electronic structure. We illustrate the use of the OMDB and how it can become an organic part of search and prediction of novel functional materials via data mining techniques. As a specific example, we provide data mining results for metals and semiconductors, which are known to be rare in the class of organic materials.

  4. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community

    PubMed Central

    Yon Rhee, Seung; Beavis, William; Berardini, Tanya Z.; Chen, Guanghong; Dixon, David; Doyle, Aisling; Garcia-Hernandez, Margarita; Huala, Eva; Lander, Gabriel; Montoya, Mary; Miller, Neil; Mueller, Lukas A.; Mundodi, Suparna; Reiser, Leonore; Tacklind, Julie; Weems, Dan C.; Wu, Yihe; Xu, Iris; Yoo, Daniel; Yoon, Jungwon; Zhang, Peifen

    2003-01-01

    Arabidopsis thaliana is the most widely studied plant today. The concerted efforts of over 11 000 researchers and 4000 organizations around the world are generating a rich diversity and quantity of information and materials. This information is made available through a comprehensive on-line resource called the Arabidopsis Information Resource (TAIR) (http://arabidopsis.org), which is accessible via commonly used web browsers and can be searched and downloaded in a number of ways. In the last two years, efforts have been focused on increasing data content and diversity, functionally annotating genes and gene products with controlled vocabularies, and improving data retrieval, analysis and visualization tools. New information includes sequence polymorphisms including alleles, germplasms and phenotypes, Gene Ontology annotations, gene families, protein information, metabolic pathways, gene expression data from microarray experiments, and seed and DNA stocks. New data visualization and analysis tools include SeqViewer, which interactively displays the genome from the whole chromosome down to 10 kb of nucleotide sequence, and AraCyc, a metabolic pathway database and map tool that allows overlaying expression data onto pathway diagrams. Finally, we have recently incorporated seed and DNA stock information from the Arabidopsis Biological Resource Center (ABRC) and implemented a shopping-cart style on-line ordering system. PMID:12519987

  5. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community.

    PubMed

    Rhee, Seung Yon; Beavis, William; Berardini, Tanya Z; Chen, Guanghong; Dixon, David; Doyle, Aisling; Garcia-Hernandez, Margarita; Huala, Eva; Lander, Gabriel; Montoya, Mary; Miller, Neil; Mueller, Lukas A; Mundodi, Suparna; Reiser, Leonore; Tacklind, Julie; Weems, Dan C; Wu, Yihe; Xu, Iris; Yoo, Daniel; Yoon, Jungwon; Zhang, Peifen

    2003-01-01

    Arabidopsis thaliana is the most widely studied plant today. The concerted efforts of over 11 000 researchers and 4000 organizations around the world are generating a rich diversity and quantity of information and materials. This information is made available through a comprehensive on-line resource called the Arabidopsis Information Resource (TAIR) (http://arabidopsis.org), which is accessible via commonly used web browsers and can be searched and downloaded in a number of ways. In the last two years, efforts have been focused on increasing data content and diversity, functionally annotating genes and gene products with controlled vocabularies, and improving data retrieval, analysis and visualization tools. New information includes sequence polymorphisms including alleles, germplasms and phenotypes, Gene Ontology annotations, gene families, protein information, metabolic pathways, gene expression data from microarray experiments, and seed and DNA stocks. New data visualization and analysis tools include SeqViewer, which interactively displays the genome from the whole chromosome down to 10 kb of nucleotide sequence, and AraCyc, a metabolic pathway database and map tool that allows overlaying expression data onto pathway diagrams. Finally, we have recently incorporated seed and DNA stock information from the Arabidopsis Biological Resource Center (ABRC) and implemented a shopping-cart style on-line ordering system.

  6. Development and mining of a volatile organic compound database.

    PubMed

    Abdullah, Azian Azamimi; Altaf-Ul-Amin, Md; Ono, Naoaki; Sato, Tetsuo; Sugiura, Tadao; Morita, Aki Hirai; Katsuragi, Tetsuo; Muto, Ai; Nishioka, Takaaki; Kanaya, Shigehiko

    2015-01-01

    Volatile organic compounds (VOCs) are small molecules that exhibit high vapor pressure under ambient conditions and have low boiling points. Although VOCs contribute only a small proportion of the total metabolites produced by living organisms, they play an important role in chemical ecology, specifically in the biological interactions between organisms and ecosystems. VOCs are also important in the health care field, as they are presently used as biomarkers to detect various human diseases. Until now, information on VOCs has been scattered across the literature, and no database describing VOCs and their biological activities has been available. To fill this gap, we have developed the KNApSAcK Metabolite Ecology Database, which contains information on the relationships between VOCs and their emitting organisms. The KNApSAcK Metabolite Ecology database is also linked with the KNApSAcK Core and KNApSAcK Metabolite Activity databases to provide further information on the metabolites and their biological activities. The VOC database can be accessed online.

  7. Development and Mining of a Volatile Organic Compound Database

    PubMed Central

    Abdullah, Azian Azamimi; Altaf-Ul-Amin, Md.; Ono, Naoaki; Sato, Tetsuo; Sugiura, Tadao; Morita, Aki Hirai; Katsuragi, Tetsuo; Muto, Ai; Nishioka, Takaaki; Kanaya, Shigehiko

    2015-01-01

    Volatile organic compounds (VOCs) are small molecules that exhibit high vapor pressure under ambient conditions and have low boiling points. Although VOCs contribute only a small proportion of the total metabolites produced by living organisms, they play an important role in chemical ecology, specifically in the biological interactions between organisms and ecosystems. VOCs are also important in the health care field, as they are presently used as biomarkers to detect various human diseases. Until now, information on VOCs has been scattered across the literature, and no database describing VOCs and their biological activities has been available. To fill this gap, we have developed the KNApSAcK Metabolite Ecology Database, which contains information on the relationships between VOCs and their emitting organisms. The KNApSAcK Metabolite Ecology database is also linked with the KNApSAcK Core and KNApSAcK Metabolite Activity databases to provide further information on the metabolites and their biological activities. The VOC database can be accessed online. PMID:26495281

  8. MODBASE, a database of annotated comparative protein structure models.

    PubMed

    Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C; Ilyin, Valentin A; Sali, Andrej

    2002-01-01

    MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server.

  9. An information model based weld schedule database

    SciTech Connect

    Kleban, S.D.; Knorovsky, G.A.; Hicken, G.K.; Gershanok, G.A.

    1997-08-01

    As part of a computerized system (SmartWeld) developed at Sandia National Laboratories to facilitate agile manufacturing of welded assemblies, a weld schedule database (WSDB) was also developed. SmartWeld's overall goals are to shorten the design-to-product time frame and to promote right-the-first-time weldment design and manufacture by providing welding process selection guidance to component designers. The associated WSDB evolved into a substantial subproject by itself. At first, it was thought that the database would store perhaps 50 parameters about a weld schedule. This was a woeful underestimate: the current WSDB has over 500 parameters defined in 73 tables. This includes data about the weld, the piece parts involved, the piece part geometry, and great detail about the schedule and intervals involved in performing the weld. This complex database was built using information modeling techniques. Information modeling is a process that creates a model of objects and their roles for a given domain (i.e., welding). The Natural-Language Information Analysis Methodology (NIAM) technique was used, which is characterized by: (1) elementary facts being stated in natural language by the welding expert, (2) determinism (the resulting model is provably repeatable, i.e., it gives the same answer every time), and (3) extensibility (the model can be added to without changing existing structure). The information model produced a highly normalized relational schema that was translated to the Oracle Relational Database Management System for implementation.

  10. The database for reaching experiments and models.

    PubMed

    Walker, Ben; Kording, Konrad

    2013-01-01

    Reaching is one of the central experimental paradigms in the field of motor control, and many computational models of reaching have been published. While most of these models try to explain subject data (such as movement kinematics, reaching performance, forces, etc.) from only a single experiment, distinct experiments often share experimental conditions and record similar kinematics. This suggests that reaching models could be applied to (and falsified by) multiple experiments. However, using multiple datasets is difficult because experimental data formats vary widely. Standardizing data formats promises to enable scientists to test model predictions against many experiments and to compare experimental results across labs. Here we report on the development of a new resource available to scientists: a database of reaching called the Database for Reaching Experiments And Models (DREAM). DREAM collects both experimental datasets and models and facilitates their comparison by standardizing formats. The DREAM project promises to be useful for experimentalists who want to understand how their data relates to models, for modelers who want to test their theories, and for educators who want to help students better understand reaching experiments, models, and data analysis.

  11. Hydroacoustic forcing function modeling using DNS database

    NASA Technical Reports Server (NTRS)

    Zawadzki, I.; Gershfield, J. L.; Na, Y.; Wang, M.

    1996-01-01

    A wall pressure frequency spectrum model (Blake 1971) has been evaluated using databases from Direct Numerical Simulations (DNS) of a turbulent boundary layer (Na & Moin 1996). Good agreement is found for moderate to strong adverse pressure gradient flows in the absence of separation. In the separated flow region, the model underpredicts the directly calculated spectra by an order of magnitude. The discrepancy is attributed to the violation of the model assumptions in that part of the flow domain. DNS-computed coherence length scales and the normalized wall pressure cross-spectra are compared with experimental data. The DNS results are consistent with experimental observations.

  12. Combining Soil Databases for Topsoil Organic Carbon Mapping in Europe.

    PubMed

    Aksoy, Ece; Yigini, Yusuf; Montanarella, Luca

    2016-01-01

    Accuracy in assessing the distribution of soil organic carbon (SOC) is an important issue because SOC plays key roles in the functions of both natural ecosystems and agricultural systems. Several studies in the literature aim to find the best method to assess and map the distribution of SOC content for Europe. This study therefore examines another aspect of the issue: the performance obtained by using aggregated soil samples coming from different studies and land uses. The total number of soil samples in this study was 23,835, collected from the "Land Use/Cover Area frame Statistical Survey" (LUCAS) Project (samples from agricultural soil), the BioSoil Project (samples from forest soil), and the "Soil Transformations in European Catchments" (SoilTrEC) Project (local soil data from six different critical zone observatories (CZOs) in Europe). Moreover, 15 spatial indicators (slope, aspect, elevation, compound topographic index (CTI), CORINE land-cover classification, parent material, texture, world reference base (WRB) soil classification, geological formations, annual average temperature, min-max temperature, total precipitation and average precipitation (for the years 1960-1990 and 2000-2010)) were used as auxiliary variables in this prediction. One of the most popular geostatistical techniques, Regression-Kriging (RK), was applied to build the model and assess the distribution of SOC. This study showed that, even though the RK method was appropriate for successful SOC mapping, using combined databases did not increase the statistical significance of the results for assessing the SOC distribution. According to our results, SOC variation was mainly affected by the elevation, slope, CTI, average temperature, average and total precipitation, texture, WRB and CORINE variables at the European scale in our model. Moreover, the highest average SOC contents were found in the wetland areas; agricultural
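    A minimal sketch of Regression-Kriging in its generic form, not the authors' exact setup: a linear regression of SOC on environmental covariates plus spatial interpolation of the regression residuals, with a scikit-learn Gaussian process standing in for the kriging step. All data are synthetic.

      # Hedged sketch: regression trend + "kriged" residuals (GP as a stand-in).
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF

      rng = np.random.default_rng(1)
      coords = rng.random((100, 2)) * 100.0   # sample locations (km)
      covars = rng.random((100, 3))           # e.g. elevation, slope, temperature
      soc = covars @ np.array([5.0, 2.0, -3.0]) + rng.normal(0, 0.5, 100)

      trend = LinearRegression().fit(covars, soc)   # regression part
      residuals = soc - trend.predict(covars)

      krige = GaussianProcessRegressor(kernel=RBF(length_scale=20.0)).fit(coords, residuals)

      # Prediction at a new location = regression trend + interpolated residual.
      new_coords, new_covars = rng.random((1, 2)) * 100.0, rng.random((1, 3))
      pred = trend.predict(new_covars) + krige.predict(new_coords)
      print(f"predicted SOC: {pred[0]:.2f}")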

  13. Combining Soil Databases for Topsoil Organic Carbon Mapping in Europe

    PubMed Central

    Aksoy, Ece

    2016-01-01

    Accuracy in assessing the distribution of soil organic carbon (SOC) is an important issue because SOC plays key roles in the functions of both natural ecosystems and agricultural systems. Several studies in the literature aim to find the best method to assess and map the distribution of SOC content for Europe. This study therefore examines another aspect of the issue: the performance obtained by using aggregated soil samples coming from different studies and land uses. The total number of soil samples in this study was 23,835, collected from the “Land Use/Cover Area frame Statistical Survey” (LUCAS) Project (samples from agricultural soil), the BioSoil Project (samples from forest soil), and the “Soil Transformations in European Catchments” (SoilTrEC) Project (local soil data from six different critical zone observatories (CZOs) in Europe). Moreover, 15 spatial indicators (slope, aspect, elevation, compound topographic index (CTI), CORINE land-cover classification, parent material, texture, world reference base (WRB) soil classification, geological formations, annual average temperature, min-max temperature, total precipitation and average precipitation (for the years 1960–1990 and 2000–2010)) were used as auxiliary variables in this prediction. One of the most popular geostatistical techniques, Regression-Kriging (RK), was applied to build the model and assess the distribution of SOC. This study showed that, even though the RK method was appropriate for successful SOC mapping, using combined databases did not increase the statistical significance of the results for assessing the SOC distribution. According to our results, SOC variation was mainly affected by the elevation, slope, CTI, average temperature, average and total precipitation, texture, WRB and CORINE variables at the European scale in our model. Moreover, the highest average SOC contents were found in the wetland areas

  14. Spatial Database Modeling for Indoor Navigation Systems

    NASA Astrophysics Data System (ADS)

    Gotlib, Dariusz; Gnat, Miłosz

    2013-12-01

    For many years, cartographers have been involved in designing GIS and navigation systems. Most GIS applications use outdoor data, but increasingly, similar applications are used inside buildings. It is therefore important to find a proper model for indoor spatial databases. The development of indoor navigation systems should draw on advanced teleinformation, geoinformatics, geodetic and cartographical knowledge. The authors present the fundamental requirements for an indoor data model for navigation purposes. Presenting some of the solutions adopted around the world, they emphasize that navigation applications require specific data to present navigation routes in the right way. An original solution for an indoor data model, created by the authors on the basis of the BISDM model, is presented. Its purpose is to expand the opportunities for use in indoor navigation.

  15. Assessment of the SFC database for analysis and modeling

    NASA Technical Reports Server (NTRS)

    Centeno, Martha A.

    1994-01-01

    SFC is one of the four clusters that make up the Integrated Work Control System (IWCS), which will integrate the shuttle processing databases at Kennedy Space Center (KSC). The IWCS framework will enable communication among the four clusters and add new data collection protocols. The Shop Floor Control (SFC) module has been operational for two and a half years; however, at this stage, automatic links to the other three modules have not been implemented yet, except for a partial link to IOS (CASPR). SFC revolves around a DB/2 database with PFORMS acting as the database management system (DBMS). PFORMS is an off-the-shelf DB/2 application that provides a set of data entry screens and query forms. The main dynamic entity in the SFC and IOS databases is a task; thus, the physical storage location and update privileges are driven by the status of the WAD. As we explored the SFC values, we realized that there was much to do before actually engaging in continuous analysis of the SFC data. Halfway into this effort, it was realized that full-scale analysis would have to be a future, third phase of this effort. So we concentrated on getting to know the contents of the database and on establishing an initial set of tools to start the continuous analysis process. Specifically, we set out to: (1) provide specific procedures for statistical models, so as to enhance the TP-OAO office analysis and modeling capabilities; (2) design a data exchange interface; (3) prototype the interface to provide inputs to SCRAM; and (4) design a modeling database. These objectives were set with the expectation that, if met, they would provide former TP-OAO engineers with tools that would help them demonstrate the importance of process-based analyses. The latter, in turn, will help them obtain the cooperation of various organizations in charting out their individual processes.

  16. Asteroid models from the Lowell photometric database

    NASA Astrophysics Data System (ADS)

    Ďurech, J.; Hanuš, J.; Oszkiewicz, D.; Vančo, R.

    2016-03-01

    Context. Information about shapes and spin states of individual asteroids is important for the study of the whole asteroid population. For asteroids from the main belt, most of the shape models available now have been reconstructed from disk-integrated photometry by the lightcurve inversion method. Aims: We want to significantly enlarge the current sample (~350) of available asteroid models. Methods: We use the lightcurve inversion method to derive new shape models and spin states of asteroids from the sparse-in-time photometry compiled in the Lowell Photometric Database. To speed up the time-consuming process of scanning the period parameter space through the use of convex shape models, we use the distributed computing project Asteroids@home, running on the Berkeley Open Infrastructure for Network Computing (BOINC) platform. This way, the period-search interval is divided into hundreds of smaller intervals. These intervals are scanned separately by different volunteers and then joined together. We also use an alternative, faster, approach when searching the best-fit period by using a model of triaxial ellipsoid. By this, we can independently confirm periods found with convex models and also find rotation periods for some of those asteroids for which the convex-model approach gives too many solutions. Results: From the analysis of Lowell photometric data of the first 100 000 numbered asteroids, we derived 328 new models. This almost doubles the number of available models. We tested the reliability of our results by comparing models that were derived from purely Lowell data with those based on dense lightcurves, and we found that the rate of false-positive solutions is very low. We also present updated plots of the distribution of spin obliquities and pole ecliptic longitudes that confirm previous findings about a non-uniform distribution of spin axes. However, the models reconstructed from noisy sparse data are heavily biased towards more elongated bodies with high
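    A hedged sketch of the parallelizable period scan described above: the trial-period interval is split into independent sub-intervals, as in Asteroids@home, and each is scanned separately. A simple phase-dispersion statistic stands in here for the full convex lightcurve inversion; the data are synthetic.

      # Hedged sketch: split-interval period search on sparse photometry.
      import numpy as np

      rng = np.random.default_rng(2)
      true_period = 6.3                          # hours
      t = np.sort(rng.uniform(0, 500, 400))      # sparse observation times
      mag = np.sin(2 * np.pi * t / true_period) + rng.normal(0, 0.1, 400)

      def dispersion(period):
          phase = (t % period) / period
          order = np.argsort(phase)
          return np.sum(np.diff(mag[order]) ** 2)  # small when the fold is coherent

      def scan(lo, hi, n=2000):
          periods = np.linspace(lo, hi, n)
          scores = [dispersion(p) for p in periods]
          return periods[int(np.argmin(scores))], min(scores)

      # Split the search interval into independent chunks, as volunteers' hosts do.
      chunks = [(2 + 2 * i, 4 + 2 * i) for i in range(4)]  # (2,4) ... (8,10) hours
      best = min((scan(lo, hi) for lo, hi in chunks), key=lambda r: r[1])
      print(f"best-fit period: {best[0]:.3f} h")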

  17. Modeling and Simulation Terrain Database Management

    DTIC Science & Technology

    2005-07-01

    data for use in combat modeling. The SVDR is still under development by the Science Applications International Corporation (SAIC) in conjunction with ... DOD organization for communication between Unmanned Systems, but how to communicate terrain between systems has not yet been worked out.

  18. Carotenoids Database: structures, chemical fingerprints and distribution among organisms.

    PubMed

    Yabuzaki, Junko

    2017-01-01

    To promote understanding of how organisms are related via carotenoids, either evolutionarily or symbiotically, or in food chains through natural histories, we built the Carotenoids Database. This provides chemical information on 1117 natural carotenoids with 683 source organisms. For extracting organisms closely related through the biosynthesis of carotenoids, we offer a new similarity search system 'Search similar carotenoids' using our original chemical fingerprint 'Carotenoid DB Chemical Fingerprints'. These Carotenoid DB Chemical Fingerprints describe the chemical substructure and the modification details based upon International Union of Pure and Applied Chemistry (IUPAC) semi-systematic names of the carotenoids. The fingerprints also allow (i) easier prediction of six biological functions of carotenoids: provitamin A, membrane stabilizers, odorous substances, allelochemicals, antiproliferative activity and reverse MDR activity against cancer cells, (ii) easier classification of carotenoid structures, (iii) partial and exact structure searching and (iv) easier extraction of structural isomers and stereoisomers. We believe this to be the first attempt to establish fingerprints using the IUPAC semi-systematic names. For extracting closely profiled organisms, we provide a new tool 'Search similar profiled organisms'. Our current statistics show some insights into natural history: carotenoids seem to have been spread largely by bacteria, as they produce C30, C40, C45 and C50 carotenoids, with the widest range of end groups, and they share a small portion of C40 carotenoids with eukaryotes. Archaea share an even smaller portion with eukaryotes. Eukaryotes have then evolved a considerable variety of C40 carotenoids. Considering carotenoids, eukaryotes seem more closely related to bacteria than to archaea, aside from 16S rRNA lineage analysis. Database URL: http://carotenoiddb.jp
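    A minimal sketch of fingerprint-based similarity search as the abstract outlines it: substructure and modification features encoded as feature sets and compared with the Tanimoto coefficient. The feature names below are invented and are not the actual Carotenoid DB Chemical Fingerprints.

      # Hedged sketch: Tanimoto similarity over illustrative feature sets.
      def tanimoto(fp_a: set, fp_b: set) -> float:
          return len(fp_a & fp_b) / len(fp_a | fp_b)

      fingerprints = {
          "beta-carotene": {"C40", "beta-ring", "beta-ring2"},
          "zeaxanthin":    {"C40", "beta-ring", "beta-ring2", "3-OH", "3'-OH"},
          "lycopene":      {"C40", "psi-end", "psi-end2"},
      }

      query = fingerprints["beta-carotene"]
      ranked = sorted(((tanimoto(query, fp), name) for name, fp in fingerprints.items()),
                      reverse=True)
      for score, name in ranked:
          print(f"{name}: {score:.2f}")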

  19. A Model Based Mars Climate Database for the Mission Design

    NASA Technical Reports Server (NTRS)

    2005-01-01

    A viewgraph presentation on a model-based climate database is shown. The topics include: 1) Why a model-based climate database?; 2) Mars Climate Database v3.1: who uses it? (approx. 60 users!); 3) The new Mars Climate Database MCD v4.0; 4) MCD v4.0: what's new?; 5) Simulation of water ice clouds; 6) Simulation of the water ice cycle; 7) A new tool for surface pressure prediction; 8) Access to the database MCD 4.0; 9) How to access the database; and 10) New web access.

  20. Integrated Space Asset Management Database and Modeling

    NASA Technical Reports Server (NTRS)

    MacLeod, Todd; Gagliano, Larry; Percy, Thomas; Mason, Shane

    2015-01-01

    Effective Space Asset Management is one key to addressing the ever-growing issue of space congestion. It is imperative that agencies around the world have access to data regarding the numerous active assets and pieces of space junk currently tracked in orbit around the Earth. At the center of this issue is the effective management of data of many types related to orbiting objects. As the population of tracked objects grows, so too should the data management structure used to catalog technical specifications, orbital information, and metadata related to those populations. Marshall Space Flight Center's Space Asset Management Database (SAM-D) was implemented in order to effectively catalog a broad set of data related to known objects in space by ingesting information from a variety of databases and processing those data into useful technical information. Using the universal NORAD number as a unique identifier, SAM-D processes two-line element data into orbital characteristics and cross-references this technical data with metadata related to functional status, country of ownership, and application category. SAM-D began as an Excel spreadsheet and was later upgraded to an Access database. While SAM-D performs its task very well, it is limited by its current platform and is not available outside of the local user base. Further, while modeling and simulation can be powerful tools to exploit the information contained in SAM-D, the current system does not allow proper integration options for combining the data with both legacy and new M&S tools. This paper provides a summary of SAM-D development efforts to date and outlines a proposed data management infrastructure that extends SAM-D to support the larger data sets to be generated. A service-oriented architecture model using an information sharing platform named SIMON will allow it to easily expand to incorporate new capabilities, including advanced analytics, M&S tools, fusion techniques and user interface for
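    A hedged sketch of the processing step the abstract describes: deriving orbital characteristics from two-line element (TLE) data. The TLE line below is an ISS-like example, the fixed-column parsing follows the standard TLE layout, and the derivation uses only Kepler's third law.

      # Hedged sketch: orbital characteristics from TLE line 2.
      import math

      MU_EARTH = 398600.4418  # km^3/s^2

      tle_line2 = "2 25544  51.6400 208.9163 0006317  69.9862  25.2906 15.49390400"

      inclination_deg = float(tle_line2[8:16])          # standard TLE columns
      mean_motion_rev_per_day = float(tle_line2[52:63])

      period_s = 86400.0 / mean_motion_rev_per_day
      semi_major_axis_km = (MU_EARTH * (period_s / (2 * math.pi)) ** 2) ** (1 / 3)

      print(f"inclination: {inclination_deg:.2f} deg")
      print(f"period: {period_s / 60:.1f} min, semi-major axis: {semi_major_axis_km:.0f} km")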

  1. A computational framework for a database of terrestrial biosphere models

    NASA Astrophysics Data System (ADS)

    Metzler, Holger; Müller, Markus; Ceballos-Núñez, Verónika; Sierra, Carlos A.

    2016-04-01

    Most terrestrial biosphere models consist of a set of coupled first-order ordinary differential equations. Each equation represents a pool containing carbon with a certain turnover rate. Although such models share some basic mathematical structure, they can have very different properties such as the number of pools, cycling rates, and internal fluxes. We present a computational framework that helps analyze the structure and behavior of terrestrial biosphere models, using the process of soil organic matter decomposition as an example. The same framework can also be used for other sub-processes such as carbon fixation or allocation. First, the models are fed into a database consisting of simple text files with a common structure. They are then read in using Python and transformed into an internal 'Model Class' that can be used to automatically create an overview stating the model's structure, state variables, and internal and external fluxes. SymPy, a Python library for symbolic mathematics, also helps calculate the Jacobian matrix at steady states, where given, and the eigenvalues of this matrix. If complete parameter sets are available, the model can also be run using R to simulate its behavior under certain conditions and to support a deeper stability analysis. In this case, the framework is also able to provide phase-plane plots where appropriate. Furthermore, an overview of all the models in the database can be given to help identify their similarities and differences.
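    A minimal sketch of the framework's symbolic step, assuming a generic two-pool soil-carbon model rather than any specific entry in the database: SymPy computes the Jacobian, the steady state, and the eigenvalues that inform the stability analysis.

      # Hedged sketch: symbolic analysis of a generic two-pool carbon model.
      import sympy as sp

      C1, C2 = sp.symbols("C1 C2", positive=True)             # fast and slow carbon pools
      u, k1, k2, h = sp.symbols("u k1 k2 h", positive=True)   # input, decay rates, transfer

      dC1 = u - k1 * C1             # litter input minus fast-pool decomposition
      dC2 = h * k1 * C1 - k2 * C2   # humified fraction of fast pool minus slow decay

      J = sp.Matrix([dC1, dC2]).jacobian([C1, C2])
      steady = sp.solve([dC1, dC2], [C1, C2], dict=True)[0]

      print("Jacobian:", J)
      print("steady state:", steady)
      print("eigenvalues:", list(J.eigenvals()))  # -k1, -k2: both negative -> stable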

  2. First Database Course--Keeping It All Organized

    ERIC Educational Resources Information Center

    Baugh, Jeanne M.

    2015-01-01

    All Computer Information Systems programs require a database course for their majors. This paper describes an approach to such a course in which real world examples, both design projects and actual database application projects are incorporated throughout the semester. Students are expected to apply the traditional database concepts to actual…

  3. Synthesized Population Databases: A US Geospatial Database for Agent-Based Models.

    PubMed

    Wheaton, William D; Cajka, James C; Chasteen, Bernadette M; Wagener, Diane K; Cooley, Philip C; Ganapathi, Laxminarayana; Roberts, Douglas J; Allpress, Justine L

    2009-05-01

    Agent-based models simulate large-scale social systems. They assign behaviors and activities to "agents" (individuals) within the population being modeled and then allow the agents to interact with the environment and each other in complex simulations. Agent-based models are frequently used to simulate infectious disease outbreaks, among other uses. RTI used and extended an iterative proportional fitting method to generate a synthesized, geospatially explicit, human agent database that represents the US population in the 50 states and the District of Columbia in the year 2000. Each agent is assigned to a household; other agents make up the household occupants. For this database, RTI developed the methods for (1) generating synthesized households and persons; (2) assigning agents to schools and workplaces so that complex interactions among agents as they go about their daily activities can be taken into account; and (3) generating synthesized human agents who occupy group quarters (military bases, college dormitories, prisons, nursing homes). In this report, we describe both the methods used to generate the synthesized population database and the final data structure and data content of the database. This information will provide researchers with the information they need to use the database in developing agent-based models. Portions of the synthesized agent database are available to any user upon request. RTI will extract a portion (a county, region, or state) of the database for users who wish to use this database in their own agent-based models.
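    A minimal sketch of iterative proportional fitting (IPF), the method the report uses and extends: a seed contingency table is alternately rescaled until its row and column sums match known marginal totals (e.g., census counts). The numbers are invented.

      # Hedged sketch: two-dimensional iterative proportional fitting.
      import numpy as np

      seed = np.array([[10.0, 20.0], [30.0, 40.0]])  # e.g. household type x income bracket
      row_targets = np.array([35.0, 65.0])           # known marginals from census tables
      col_targets = np.array([45.0, 55.0])

      table = seed.copy()
      for _ in range(100):
          table *= (row_targets / table.sum(axis=1))[:, None]  # match row totals
          table *= col_targets / table.sum(axis=0)             # match column totals

      print(np.round(table, 2))
      print("row sums:", table.sum(axis=1), "col sums:", table.sum(axis=0))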

  4. Techniques to Access Databases and Integrate Data for Hydrologic Modeling

    SciTech Connect

    Whelan, Gene; Tenney, Nathan D.; Pelton, Mitchell A.; Coleman, Andre M.; Ward, Duane L.; Droppo, James G.; Meyer, Philip D.; Dorow, Kevin E.; Taira, Randal Y.

    2009-06-17

    This document addresses techniques to access and integrate data for defining site-specific conditions and behaviors associated with ground-water and surface-water radionuclide transport applicable to U.S. Nuclear Regulatory Commission reviews. Environmental models typically require input data from multiple internal and external sources that may include, but are not limited to, stream and rainfall gage data, meteorological data, hydrogeological data, habitat data, and biological data. These data may be retrieved from a variety of organizations (e.g., federal, state, and regional) and source types (e.g., HTTP, FTP, and databases). Available data sources relevant to hydrologic analyses for reactor licensing are identified and reviewed. The data sources described can be useful to define model inputs and parameters, including site features (e.g., watershed boundaries, stream locations, reservoirs, site topography), site properties (e.g., surface conditions, subsurface hydraulic properties, water quality), and site boundary conditions, input forcings, and extreme events (e.g., stream discharge, lake levels, precipitation, recharge, flood and drought characteristics). Available software tools for accessing established databases, retrieving the data, and integrating it with models were identified and reviewed. The emphasis in this review was on existing software products with minimal required modifications to enable their use with the FRAMES modeling framework. The ability of four of these tools to access and retrieve the identified data sources was reviewed. These four software tools were the Hydrologic Data Acquisition and Processing System (HDAPS), Integrated Water Resources Modeling System (IWRMS) External Data Harvester, Data for Environmental Modeling Environmental Data Download Tool (D4EM EDDT), and the FRAMES Internet Database Tools. The IWRMS External Data Harvester and the D4EM EDDT were identified as the most promising tools based on their ability to access and

  5. Fish Karyome version 2.1: a chromosome database of fishes and other aquatic organisms.

    PubMed

    Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Rashid, Iliyas; Sharma, Jyoti; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra; Murali, S

    2016-01-01

    Voluminous information is available from karyological studies of fishes; however, limited effort has been made to compile and curate the available karyological data in digital form. The 'Fish Karyome' database was a preliminary attempt to compile and digitize the available karyological information on finfishes of the Indian subcontinent. The database had limitations, however, since it covered only Indian finfishes and offered limited search options. In response to user feedback and the database's utility in fish cytogenetic studies, Fish Karyome was upgraded using Linux, Apache, MySQL and PHP (LAMP) technologies. In the present version, its scope was broadened by compiling and curating the available chromosomal information on fishes worldwide and on other aquatic organisms, such as echinoderms, molluscs and arthropods, especially those of aquaculture importance. Thus, Fish Karyome version 2.1 presently covers 866 chromosomal records for 726 species, supported by 253 published articles, and the information is updated regularly. The database provides information on chromosome number and morphology, sex chromosomes, chromosome banding, molecular cytogenetic markers, etc., supported by fish and karyotype images through interactive tools. It also enables users to browse and view chromosomal information by habitat, family, conservation status and chromosome number. The system also displays chromosome numbers in model organisms, protocols for chromosome preparation and allied techniques, and a glossary of cytogenetic terms. A data submission facility is provided through a data submission panel. The database can serve as a unique and useful resource for cytogenetic characterization, sex determination, chromosomal mapping, cytotaxonomy, karyo-evolution and systematics of fishes. Database URL: http://mail.nbfgr.res.in/Fish_Karyome.

  6. Sequence modelling and an extensible data model for genomic database

    SciTech Connect

    Li, Peter Wei-Der

    1992-01-01

    The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have the abstraction mechanisms for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP, and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework that can incorporate the sequence data model together with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented a query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.
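
    The record does not enumerate the abstract and biological sequence operators, but the idea can be sketched as a small abstract data type (our illustration; the operator names are invented):

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Seq:
            """A conceptual sequence value with a few operators."""
            elements: str

            def subseq(self, start: int, end: int) -> "Seq":    # abstract operator
                return Seq(self.elements[start:end])

            def concat(self, other: "Seq") -> "Seq":            # abstract operator
                return Seq(self.elements + other.elements)

            def reverse_complement(self) -> "Seq":              # biological operator
                comp = str.maketrans("ACGT", "TGCA")
                return Seq(self.elements.translate(comp)[::-1])

        s = Seq("ATGGCC")
        print(s.subseq(0, 3).concat(Seq("TAA")))  # Seq(elements='ATGTAA')
        print(s.reverse_complement())             # Seq(elements='GGCCAT')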

  7. Software Engineering Laboratory (SEL) database organization and user's guide, revision 2

    NASA Technical Reports Server (NTRS)

    Morusiewicz, Linda; Bristow, John

    1992-01-01

    The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base table is described. In addition, techniques for accessing the database through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL) are discussed.

  8. Object-Oriented Geographical Database Model

    NASA Technical Reports Server (NTRS)

    Johnson, M. L.; Bryant, N.; Sapounas, D.

    1996-01-01

    Terbase is an Object-Oriented database system under development at the Jet Propulsion Laboratory (JPL). Terbase is designed for flexibility, reusability, maintenance ease, multi-user collaboration and independence, and efficiency. This paper details the design and development of Terbase as a geographic data server...

  9. Solid Waste Projection Model: Database (Version 1.3)

    SciTech Connect

    Blackburn, C.L.

    1991-11-01

    The Solid Waste Projection Model (SWPM) system is an analytical tool developed by Pacific Northwest Laboratory (PNL) for Westinghouse Hanford Company (WHC). The SWPM system provides a modeling and analysis environment that supports decisions in the process of evaluating various solid waste management alternatives. This document, one of a series describing the SWPM system, contains detailed information regarding the software and data structures utilized in developing the SWPM Version 1.3 Database. This document is intended for use by experienced database specialists and supports database maintenance, utility development, and database enhancement.

  10. Integrated Space Asset Management Database and Modeling

    NASA Astrophysics Data System (ADS)

    Gagliano, L.; MacLeod, T.; Mason, S.; Percy, T.; Prescott, J.

    The Space Asset Management Database (SAM-D) was implemented in order to effectively track known objects in space by ingesting information from a variety of databases and performing calculations to determine the expected position of the object at a specified time. While SAM-D performs this task very well, it is limited by technology and is not available outside of the local user base. Modeling and simulation can be powerful tools to exploit the information contained in SAM-D. However, the current system does not allow proper integration options for combining the data with both legacy and new M&S tools. A more capable data management infrastructure would extend SAM-D to support the larger data sets to be generated by the COI. A service-oriented architecture model will allow it to easily expand to incorporate new capabilities, including advanced analytics, M&S tools, fusion techniques, and a user interface for visualizations. Based on a web-centric approach, the entire COI will be able to access the data and related analytics. In addition, tight control of information-sharing policy will increase confidence in the system, which would encourage industry partners to provide commercial data. SIMON is a Government-off-the-Shelf information-sharing platform in use throughout the DoD and DHS information sharing and situation awareness communities. SIMON provides fine-grained control to data owners, allowing them to determine exactly how and when their data are shared. SIMON supports a micro-service approach to system development, meaning M&S and analytic services can be easily built or adapted. It is uniquely positioned to fill this need as an information-sharing platform with a proven track record of successful situational awareness system deployments. Combined with the integration of new and legacy M&S tools, a SIMON-based architecture will provide a robust SA environment for the NASA SA COI that can be extended and expanded indefinitely.

  11. The BioImage Database Project: organizing multidimensional biological images in an object-relational database.

    PubMed

    Carazo, J M; Stelzer, E H

    1999-01-01

    The BioImage Database Project collects and structures multidimensional data sets recorded by various microscopic techniques relevant to modern life sciences. It provides, as precisely as possible, the circumstances in which the sample was prepared and the data were recorded. It grants access to the actual data and maintains links between related data sets. In order to promote the interdisciplinary approach of modern science, it offers a large set of key words, which covers essentially all aspects of microscopy. Nonspecialists can, therefore, access and retrieve significant information recorded and submitted by specialists in other areas. A key issue of the undertaking is to exploit the available technology and to provide a well-defined yet flexible structure for dealing with data. Its pivotal element is, therefore, a modern object relational database that structures the metadata and ameliorates the provision of a complete service. The BioImage database can be accessed through the Internet.

  12. EPA's Drinking Water Treatability Database and Treatment Cost Models

    EPA Science Inventory

    USEPA Drinking Water Treatability Database and Drinking Water Treatment Cost Models are valuable tools for determining the effectiveness and cost of treatment for contaminants of emerging concern. The models will be introduced, explained, and demonstrated.

  13. Expanding on Successful Concepts, Models, and Organization

    EPA Science Inventory

    If the goal of the AEP framework was to replace existing exposure models or databases for organizing exposure data with a concept, we would share Dr. von Göetz's concerns. Instead, the outcome we promote is broader use of an organizational framework for exposure science. The f...

  14. Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes.

    PubMed

    Santos, Alberto; Wernersson, Rasmus; Jensen, Lars Juhl

    2015-01-01

    The eukaryotic cell division cycle is a highly regulated process that consists of a complex series of events and involves thousands of proteins. Researchers have studied the regulation of the cell cycle in several organisms, employing a wide range of high-throughput technologies, such as microarray-based mRNA expression profiling and quantitative proteomics. Due to its complexity, the cell cycle can also fail or otherwise change in many different ways if important genes are knocked out, which has been studied in several microscopy-based knockdown screens. The data from these many large-scale efforts are not easily accessed, analyzed and combined due to their inherent heterogeneity. To address this, we have created Cyclebase (available at http://www.cyclebase.org), an online database that allows users to easily visualize and download results from genome-wide cell-cycle-related experiments. In Cyclebase version 3.0, we have updated the content of the database to reflect changes to genome annotation, added new mRNA and protein expression data, and integrated cell-cycle phenotype information from high-content screens and model-organism databases. The new version of Cyclebase also features a new web interface, designed around an overview figure that summarizes all the cell-cycle-related data for a gene. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. A Web Database To Manage and Organize ANSI Standards Collections.

    ERIC Educational Resources Information Center

    Matylonek, John C.; Peasley, Maren

    2001-01-01

    Discusses collections of standards by ANSI (American National Standards Institute) and the problems they create for technical libraries. Describes a custom-designed Web database at Oregon State University that is linked to online catalog records, thus enhancing access to the standards collection. (LRW)

  16. Nonparametric Bayesian Modeling for Automated Database Schema Matching

    SciTech Connect

    Ferragut, Erik M; Laska, Jason A

    2015-01-01

    The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
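
    The record does not specify the model family, but the comparison it describes, the probability that a single model generated both fields, can be illustrated with a Dirichlet-multinomial stand-in for the nonparametric models (our simplified sketch, not the authors' code):

        import numpy as np
        from scipy.special import gammaln

        def log_marginal(counts, alpha=1.0):
            """Log marginal likelihood of categorical counts under a Dirichlet prior."""
            counts = np.asarray(counts, dtype=float)
            a = np.full_like(counts, alpha)
            return (gammaln(a.sum()) - gammaln((a + counts).sum())
                    + gammaln(a + counts).sum() - gammaln(a).sum())

        def log_bayes_factor(counts_a, counts_b):
            """log P(fields share one model) - log P(two separate models)."""
            joint = np.asarray(counts_a) + np.asarray(counts_b)
            return log_marginal(joint) - log_marginal(counts_a) - log_marginal(counts_b)

        # value counts of a field in database A vs. two candidate fields in database B
        print(log_bayes_factor([50, 48], [47, 52]))  # > 0: plausibly the same field
        print(log_bayes_factor([50, 48], [5, 95]))   # << 0: almost surely different fields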

  1. A Robust Damage Assessment Model for Corrupted Database Systems

    NASA Astrophysics Data System (ADS)

    Fu, Ge; Zhu, Hong; Li, Yingjiu

    An intrusion-tolerant database uses damage assessment techniques to detect the scale of damage propagation in a corrupted database system. Traditional damage assessment approaches in an intrusion-tolerant database system can only locate damage caused by reading corrupted data. In fact, there are many other damage-spreading patterns that have not been considered in traditional damage assessment models. In this paper, we systematically analyze inter-transaction dependency relationships that have been neglected in previous research and propose four different dependency relationships between transactions which may cause damage propagation. We extend the existing damage assessment model based on these four novel dependency relationships. The essential properties of our model are also discussed.
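
    The four relationships are not listed in this record, but the baseline that the paper extends, propagation of damage to any transaction that reads corrupted data, can be sketched as follows (our illustration, not the authors' algorithm):

        def damaged_transactions(log, initially_bad):
            """Propagate read-based damage through a transaction log.

            log: list of (txn_id, read_set, write_set) in commit order.
            initially_bad: data items corrupted by the attacker.
            """
            dirty = set(initially_bad)    # corrupted data items found so far
            bad_txns = set()
            for txn, reads, writes in log:
                if dirty & set(reads):    # the transaction read a corrupted item...
                    bad_txns.add(txn)
                    dirty |= set(writes)  # ...so everything it wrote is suspect too
            return bad_txns

        log = [("T1", {"x"}, {"y"}),   # reads corrupted x, contaminates y
               ("T2", {"y"}, {"z"}),   # reads contaminated y, contaminates z
               ("T3", {"w"}, {"v"})]   # untouched by the damage
        print(damaged_transactions(log, {"x"}))  # {'T1', 'T2'}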

  2. Accessing Network Databases via SQL Transactions in a Multi-Model Database System

    DTIC Science & Technology

    1989-12-01

    [Front-matter and search-snippet residue; the recoverable content indicates that the report covers the attribute-based data model (ABDM) and its language (ABDL), the relational data model and language, the network data model, and the mapping of SQL statements to ABDL statements for accessing a network database: the user's data language is translated into the attribute-based data language (ABDL), in which the database is created.]

  3. Evaluation of an SQL model of the HELP patient database.

    PubMed

    Huff, S M; Berthelsen, C L; Pryor, T A; Dudley, A S

    1991-01-01

    We tested a new model of the HELP patient database that makes use of relational tables to store patient data and provides access to data using SQL (Structured Query Language). The SQL database required more storage space and had many more physical records than the HELP database, but it was faster and more efficient in storing data than the standard HELP utilities. The HELP utilities used disk space more efficiently and were faster than the SQL tools when retrieving data for typical clinical reports. However, the SQL model provides networking capabilities, general report writing tools, detailed user documentation, and an ability for creating secondary indexes that offset its poorer performance.

  4. Performance modeling for large database systems

    NASA Astrophysics Data System (ADS)

    Schaar, Stephen; Hum, Frank; Romano, Joe

    1997-02-01

    One of the unique approaches Science Applications International Corporation took to meet performance requirements was to start the modeling effort during the proposal phase of the Interstate Identification Index/Federal Bureau of Investigations (III/FBI) project. The III/FBI Performance Model uses analytical modeling techniques to represent the III/FBI system. Inputs to the model include workloads for each transaction type, record size for each record type, number of records for each file, hardware envelope characteristics, engineering margins and estimates for software instructions, memory, and I/O for each transaction type. The model uses queuing theory to calculate the average transaction queue length. The model calculates a response time and the resources needed for each transaction type. Outputs of the model include the total resources needed for the system, a hardware configuration, and projected inherent and operational availability. The III/FBI Performance Model is used to evaluate what-if scenarios and allows a rapid response to engineering change proposals and technical enhancements.
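
    As a toy version of the queuing step (our own numbers; the III/FBI model itself is far more detailed), an M/M/1 queue yields utilization, average queue length, and response time directly from the arrival and service rates:

        # M/M/1 queue: Poisson arrivals at rate lam, exponential service at rate mu.
        lam, mu = 40.0, 50.0     # transactions/sec (hypothetical workload and capacity)
        rho = lam / mu           # server utilization            -> 0.80
        L = rho / (1 - rho)      # mean number in system         -> 4.0
        W = 1 / (mu - lam)       # mean response time            -> 0.1 s
        Lq = rho**2 / (1 - rho)  # mean number waiting in queue  -> 3.2
        # Little's law as a consistency check: L == lam * W
        print(f"utilization={rho:.2f}  in-system={L:.1f}  response={1000*W:.0f} ms  queued={Lq:.1f}")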

  5. Effects of distributed database modeling on evaluation of transaction rollbacks

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    Data distribution, degree of data replication, and transaction access patterns are key factors in determining the performance of distributed database systems. In order to simplify the evaluation of performance measures, database designers and researchers tend to make simplistic assumptions about the system. The effect of such modeling assumptions on the evaluation of one such measure, the number of transaction rollbacks, is studied in a partitioned distributed database system. Six probabilistic models are developed, along with expressions for the number of rollbacks under each of these models. Essentially, the models differ in terms of the available system information. The analytical results so obtained are compared to results from simulation. From this, it is concluded that most of the probabilistic models yield overly conservative estimates of the number of rollbacks. The effect of transaction commutativity on system throughput is also grossly underestimated when such models are employed.

  6. Using LUCAS topsoil database to estimate soil organic carbon content in local spectral libraries

    NASA Astrophysics Data System (ADS)

    Castaldi, Fabio; van Wesemael, Bas; Chabrillat, Sabine; Chartin, Caroline

    2017-04-01

    The quantification of the soil organic carbon (SOC) content over large areas is mandatory to obtain accurate soil characterization and classification, which can improve site-specific management at local or regional scale by exploiting the strong relationship between SOC and crop growth. The estimation of SOC is not only important for agricultural purposes: in recent years, increasing attention to global warming has highlighted the crucial role of the soil in the global carbon cycle. In this context, soil spectroscopy is a well-consolidated and widespread method to estimate soil variables, exploiting the interaction between chromophores and electromagnetic radiation. The importance of spectroscopy in soil science is reflected in the increasing number of large soil spectral libraries collected around the world. These large libraries contain soil samples from a considerable number of pedological regions and thus from different parent materials and soil types; this heterogeneity entails, in turn, a large variability in mineralogical and organic composition. In light of the huge variability of the spectral responses to SOC content and composition, a rigorous classification process is necessary to subset large spectral libraries and to avoid calibrating global models that fail to predict local variation in SOC content. In this regard, this study proposes a method to subset the European LUCAS topsoil database into soil classes using a clustering analysis based on a large number of soil properties. The LUCAS database was chosen to apply a standardized multivariate calibration approach valid for large areas without the need for extensive field and laboratory work for the calibration of local models. Seven soil classes were detected by the clustering analyses, and the samples belonging to each class were used to calibrate specific partial least squares regression (PLSR) models to estimate the SOC content of three local libraries collected in Belgium (Loam belt
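
    A minimal scikit-learn sketch of the calibration strategy described here, splitting the library into classes and fitting one PLSR model per class, might read as follows (our illustration with random stand-in data; the study clustered on soil properties rather than on the spectra themselves):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.cross_decomposition import PLSRegression

        rng = np.random.default_rng(0)
        spectra = rng.random((500, 200))  # stand-in for soil reflectance spectra
        soc = rng.random(500) * 40        # stand-in for SOC content (g/kg)

        # 1) split the library into seven classes
        classes = KMeans(n_clusters=7, n_init=10, random_state=0).fit_predict(spectra)

        # 2) calibrate one PLSR model per class
        models = {c: PLSRegression(n_components=10).fit(spectra[classes == c],
                                                        soc[classes == c])
                  for c in range(7)}

        # 3) predict a new sample with the model of its assigned class
        print(models[classes[0]].predict(spectra[:1]))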

  7. GIS-based Conceptual Database Model for Planetary Geoscientific Mapping

    NASA Astrophysics Data System (ADS)

    van Gasselt, Stephan; Nass, Andrea; Neukum, Gerhard

    2010-05-01

    concerning, e.g., map products (product and cartographic representation), sensor-data products, stratigraphy definitions for each planet (facies, formation, ...), and mapping units. Domains and subtypes, as well as a set of two dozen relationships, define their interaction and allow a high level of constraints that help to limit errors through domain and topologic boundary conditions without limiting the ability of the mapper to perform his/her task. The geodatabase model is part of a data model currently under development and design in the context of providing tools and definitions for mapping, cartographic representation and data exploitation. The database model, as an integral part, is designed for portability with respect to geoscientific mapping tasks in general and can be applied to every GIS project dealing with terrestrial planetary objects. It will be accompanied by definitions and representations on the cartographic level, as well as tools and utilities providing easily accessible workflows focusing on the query, organization, maintenance and integration of planetary data and meta-information. The data model's layout is modularized, with individual components dealing with symbol representations (geology and geomorphology), metadata accessibility and modification, definition of stratigraphic entities and their relationships as well as attribute domains, extensions for planetary mapping and analysis tasks, and integration of data information at the level of vector representations for easily accessible querying, data processing in connection with ISIS/GDAL, and data integration.

  8. Human Thermal Model Evaluation Using the JSC Human Thermal Database

    NASA Technical Reports Server (NTRS)

    Bue, Grant; Makinen, Janice; Cognata, Thomas

    2012-01-01

    Human thermal modeling has considerable long-term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested space environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software-accessible format. The human thermal database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark against which to judge human thermal models, in identifying model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality. The human thermal database developed at the Johnson Space Center (JSC) is intended to evaluate a set of widely used human thermal models. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments on human subjects primarily in air, drawn from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of the relative strength and predictive quality of the models.

  9. Imprecision and Uncertainty in the UFO Database Model.

    ERIC Educational Resources Information Center

    Van Gyseghem, Nancy; De Caluwe, Rita

    1998-01-01

    Discusses how imprecision and uncertainty are dealt with in the UFO (Uncertainty and Fuzziness in an Object-oriented) database model. Such information is expressed by means of possibility distributions, and modeled by means of the proposed concept of "role objects." The role objects model uncertain, tentative information about objects,…

  10. Flood forecasting for River Mekong with data-based models

    NASA Astrophysics Data System (ADS)

    Shahzad, Khurram M.; Plate, Erich J.

    2014-09-01

    In many regions of the world, the task of flood forecasting is made difficult because only a limited database is available for generating a suitable forecast model. This paper demonstrates that in such cases parsimonious data-based hydrological models for flood forecasting can be developed if the special conditions of climate and topography are used to advantage. As an example, the middle reach of River Mekong in South East Asia is considered, where a database of discharges from seven gaging stations on the river and 31 rainfall stations on the subcatchments between gaging stations is available for model calibration. Special conditions existing for River Mekong are identified and used in developing first a network connecting all discharge gages and then models for forecasting discharge increments between gaging stations. Our final forecast model (Model 3) is a linear combination of two structurally different basic models: a model (Model 1) using linear regressions for forecasting discharge increments, and a model (Model 2) using rainfall-runoff models. Although the model based on linear regressions works reasonably well for short times, better results are obtained with rainfall-runoff modeling. However, forecast accuracy of Model 2 is limited by the quality of rainfall forecasts. For best results, both models are combined by taking weighted averages to form Model 3. Model quality is assessed by means of both persistence index PI and standard deviation of forecast error.
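
    The combination step and the persistence index PI can be illustrated in a few lines (our sketch with invented discharge values; PI compares a forecast's squared error with that of a naive "no change" persistence forecast):

        import numpy as np

        obs  = np.array([105., 120., 150., 170., 160.])  # observed discharge (m3/s)
        m1   = np.array([100., 118., 140., 175., 158.])  # Model 1: regression forecast
        m2   = np.array([108., 123., 155., 168., 163.])  # Model 2: rainfall-runoff forecast
        last = np.array([100., 105., 120., 150., 170.])  # persistence: last observed value

        w = 0.4                     # hypothetical weight found by calibration
        m3 = w * m1 + (1 - w) * m2  # Model 3: weighted average of the two forecasts

        def persistence_index(forecast):
            """PI = 1 for a perfect forecast; PI <= 0 is no better than persistence."""
            return 1 - np.sum((obs - forecast)**2) / np.sum((obs - last)**2)

        print(persistence_index(m1), persistence_index(m2), persistence_index(m3))
        # the combined forecast scores highest here, mirroring the paper's rationale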

  11. DAMIT: Database of Asteroid Models from Inversion Techniques

    NASA Astrophysics Data System (ADS)

    Kaasalainen, Mikko; Ďurech, Josef; Sidorin, Vojtěch

    2014-12-01

    DAMIT (Database of Asteroid Models from Inversion Techniques) is a database of three-dimensional models of asteroids computed using inversion techniques; it provides access to reliable and up-to-date physical models of asteroids, i.e., their shapes, rotation periods, and spin axis directions. Models from DAMIT can be used for further detailed studies of individual objects as well as for statistical studies of the whole set. The source codes for lightcurve inversion routines together with brief manuals, sample lightcurves, and the code for the direct problem are available for download.

  12. A Database Model for Medical Consultation.

    ERIC Educational Resources Information Center

    Anvari, Morteza

    1991-01-01

    Describes a relational data model that can be used for knowledge representation and manipulation in rule-based medical consultation systems. Fuzzy queries or attribute values and fuzzy set theory are discussed, functional dependencies are described, and an example is presented of a system for diagnosing causes of eye inflammation. (15 references)…

  13. Examining the Factors That Contribute to Successful Database Application Implementation Using the Technology Acceptance Model

    ERIC Educational Resources Information Center

    Nworji, Alexander O.

    2013-01-01

    Most organizations spend millions of dollars due to the impact of improperly implemented database application systems as evidenced by poor data quality problems. The purpose of this quantitative study was to use, and extend, the technology acceptance model (TAM) to assess the impact of information quality and technical quality factors on database…

  14. Human Thermal Model Evaluation Using the JSC Human Thermal Database

    NASA Technical Reports Server (NTRS)

    Cognata, T.; Bue, G.; Makinen, J.

    2011-01-01

    The human thermal database developed at the Johnson Space Center (JSC) is used to evaluate a set of widely used human thermal models. This database will facilitate a more accurate evaluation of human thermoregulatory response in a variety of situations, including those that might otherwise prove too dangerous for actual testing, such as extreme hot or cold splashdown conditions. This set includes the Wissler human thermal model, a model that has been widely used to predict the human thermoregulatory response to a variety of cold and hot environments. These models are statistically compared to the current database, which contains experiments on human subjects primarily in air, drawn from a literature survey ranging between 1953 and 2004 and from a suited experiment recently performed by the authors, for a quantitative study of the relative strength and predictive quality of the models. Human thermal modeling has considerable long-term utility to human space flight. Such models provide a tool to predict crew survivability in support of vehicle design and to evaluate crew response in untested environments. It is to the benefit of any such model not only to collect relevant experimental data to correlate it against, but also to maintain an experimental standard or benchmark for future development in a readily and rapidly searchable and software-accessible format. The human thermal database project is intended to do just that: to collect relevant data from literature and experimentation and to store the data in a database structure for immediate and future use as a benchmark to judge human thermal models against, in identifying model strengths and weaknesses, to support model development and improve correlation, and to statistically quantify a model's predictive quality.

  15. Materials Database Development for Ballistic Impact Modeling

    NASA Technical Reports Server (NTRS)

    Pereira, J. Michael

    2007-01-01

    A set of experimental data is being generated under the Fundamental Aeronautics Program Supersonics project to help create and validate accurate computational impact models of jet engine impact events. The data generated will include material property data generated at a range of different strain rates, from 1x10^-4/sec to 5x10^4/sec, over a range of temperatures. In addition, carefully instrumented ballistic impact tests will be conducted on flat plates and curved structures to provide material and structural response information to help validate the computational models. The material property data and the ballistic impact data will be generated using materials from the same lot, as far as possible. It was found in preliminary testing that the surface finish of test specimens has an effect on the measured high strain rate tension response of AL2024. Both the maximum stress and maximum elongation are greater on specimens with a smoother finish. This report gives an overview of the testing that is being conducted and presents results of preliminary testing of the surface finish study.

  1. A new world lakes database for global hydrological modelling

    NASA Astrophysics Data System (ADS)

    Pimentel, Rafael; Hasan, Abdulghani; Isberg, Kristina; Arheimer, Berit

    2017-04-01

    Lakes are crucial systems in global hydrology, constituting approximately 65% of the total amount of surface water in the world. Recent advances in remote sensing technology have provided new, higher spatiotemporal resolution information on global water bodies. Among these products are the ESA global map of water bodies, a stationary map at 150 m spatial resolution (Lamarche et al., 2015), and the new high-resolution mapping of global surface water and its long-term changes, a 32-year product with a 30 m spatial resolution (Pekel et al., 2016). Nevertheless, these databases identify all water bodies; they do not distinguish between lakes, rivers, wetlands and seas. Some global databases with isolated lake information are available, e.g. the GLWD (Global Lakes and Wetlands Database) (Lehner and Döll, 2004); however, the locations of some of the lakes are shifted in relation to the topography, and their extents have also changed since the creation of the database. This work presents a new world lake database based on the ESA global map of water bodies and relying on the lakes in the GLWD. Lakes in the ESA global map of water bodies were identified using a flood-fill algorithm, which is initialized using the centroids of the lakes defined in the GLWD. Some manual checks were done to split lakes that are really connected but identified as different lakes in the GLWD database, so that the associated information provided in the GLWD is maintained. Moreover, the locations of the outlets of all lakes were included in the new database; the high-resolution upstream area information provided by the Global Width Database for Large Rivers (GWD-LR) was used for that. These additional point locations constitute very useful information for watershed delineation in global hydrological modelling. The methodology was validated using in situ information from Swedish lakes and extended over the world. 13,500 lakes greater than 0.1 km2 were identified.
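
    The extraction step, a flood fill over the binary water mask seeded at each GLWD lake centroid, can be sketched on a toy grid as follows (our illustration, not the production code):

        import numpy as np
        from collections import deque

        water = np.array([[0, 1, 1, 0],
                          [0, 1, 1, 0],
                          [0, 0, 0, 1],
                          [1, 1, 0, 1]])  # 1 = water pixel in a toy mask

        def flood_fill(mask, seed):
            """Return the water pixels 4-connected to the seed pixel."""
            lake, queue = set(), deque([seed])
            while queue:
                r, c = queue.popleft()
                if (r, c) in lake or not (0 <= r < mask.shape[0] and 0 <= c < mask.shape[1]):
                    continue
                if mask[r, c] != 1:
                    continue
                lake.add((r, c))
                queue.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
            return lake

        # seeded at a GLWD-style centroid, only the connected lake is returned
        print(sorted(flood_fill(water, (0, 1))))  # [(0, 1), (0, 2), (1, 1), (1, 2)]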

  2. Spatial-temporal database model based on geodatabase

    NASA Astrophysics Data System (ADS)

    Zhu, Hongmei; Luo, Yu

    2009-10-01

    Entities in the real world have non-spatial attributes, as well as spatial and temporal features. A spatial-temporal data model aims to describe these intrinsic characteristics appropriately and to model them on a conceptual level, so that the model can present both static information and dynamic information that occurs over time. In this paper, we devise a novel spatial-temporal data model based on Geodatabase. The model employs an object-oriented analysis method, combining the object concept with events. An entity is defined as a feature class encapsulating attributes and operations. The operations detect change and store the changes automatically in a historic database in Geodatabase. Furthermore, the model takes advantage of the existing strengths of the relational database at the bottom level of Geodatabase, such as triggers and constraints, to monitor events on attributes or locations and respond to the events correctly. A geographic database for the Kunming municipal sewerage geographic information system was implemented with the model. The database shows excellent performance in managing data and tracking the details of change. It provides a solid data platform for querying, replaying history, and predicting future trends. The instance demonstrates that the spatial-temporal data model is efficient and practicable.

  3. BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata.

    PubMed

    Barrett, Tanya; Clark, Karen; Gevorgyan, Robert; Gorelenkov, Vyacheslav; Gribov, Eugene; Karsch-Mizrachi, Ilene; Kimelman, Michael; Pruitt, Kim D; Resenchuk, Sergei; Tatusova, Tatiana; Yaschenko, Eugene; Ostell, James

    2012-01-01

    As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, previously, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal by which to inform users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for input of rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.

  4. Overarching framework for data-based modelling

    NASA Astrophysics Data System (ADS)

    Schelter, Björn; Mader, Malenka; Mader, Wolfgang; Sommerlade, Linda; Platt, Bettina; Lai, Ying-Cheng; Grebogi, Celso; Thiel, Marco

    2014-02-01

    One of the main modelling paradigms for complex physical systems are networks. When estimating the network structure from measured signals, typically several assumptions such as stationarity are made in the estimation process. Violating these assumptions renders standard analysis techniques fruitless. We here propose a framework to estimate the network structure from measurements of arbitrary non-linear, non-stationary, stochastic processes. To this end, we propose a rigorous mathematical theory that underlies this framework. Based on this theory, we present a highly efficient algorithm and the corresponding statistics that are immediately sensibly applicable to measured signals. We demonstrate its performance in a simulation study. In experiments of transitions between vigilance stages in rodents, we infer small network structures with complex, time-dependent interactions; this suggests biomarkers for such transitions, the key to understand and diagnose numerous diseases such as dementia. We argue that the suggested framework combines features that other approaches followed so far lack.

  5. Visual Analysis of Residuals from Data-Based Models in Complex Industrial Processes

    NASA Astrophysics Data System (ADS)

    Ordoñez, Daniel G.; Cuadrado, Abel A.; Díaz, Ignacio; García, Francisco J.; Díez, Alberto B.; Fuertes, Juan J.

    2012-10-01

    The use of data-based models for visualization purposes in an industrial background is discussed. Results using Self-Organizing Maps (SOM) show how through a good design of the model and a proper visualization of the residuals generated by the model itself, the behavior of essential parameters of the process can be easily tracked in a visual way. Real data from a cold rolling facility have been used to prove the advantages of these techniques.
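
    The residual idea, comparing each observation with its best-matching SOM codebook vector, can be sketched with plain NumPy (our illustration; a trained SOM is assumed and faked here with a random codebook):

        import numpy as np

        rng = np.random.default_rng(1)
        codebook = rng.random((100, 3))  # stand-in for the weights of a trained 10x10 SOM
        x = rng.random(3)                # one new measurement of three process variables

        bmu = np.argmin(np.linalg.norm(codebook - x, axis=1))  # best-matching unit
        residual = x - codebook[bmu]                           # what the model cannot explain
        print(bmu, residual, np.linalg.norm(residual))
        # Persistent growth or structure in the residual flags a drift of the process
        # away from the conditions the SOM was trained on.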

  6. Content-Based Search on a Database of Geometric Models: Identifying Objects of Similar Shape

    SciTech Connect

    XAVIER, PATRICK G.; HENRY, TYSON R.; LAFARGE, ROBERT A.; MEIRANS, LILITA; RAY, LAWRENCE P.

    2001-11-01

    The Geometric Search Engine is a software system for storing and searching a database of geometric models. The database may be searched for modeled objects similar in shape to a target model supplied by the user. The database models are generally from CAD models, while the target model may be either a CAD model or a model generated from range data collected from a physical object. This document describes key generation, database layout, and search of the database.

  7. VIDA: a virus database system for the organization of animal virus genome open reading frames

    PubMed Central

    Albà, M. Mar; Lee, David; Pearl, Frances M. G.; Shepherd, Adrian J.; Martin, Nigel; Orengo, Christine A.; Kellam, Paul

    2001-01-01

    VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html. PMID:11125070

  8. SPECTRAFACTORY.NET: A DATABASE OF MOLECULAR MODEL SPECTRA

    SciTech Connect

    Cami, J.; Van Malderen, R.; Markwick, A. J. E-mail: Andrew.Markwick@manchester.ac.uk

    2010-04-01

    We present a homogeneous database of synthetic molecular absorption and emission spectra from the optical to mm wavelengths for a large range of temperatures and column densities relevant for various astrophysical purposes, but in particular for the analysis, identification, and first-order analysis of molecular bands in spectroscopic observations. All spectra are calculated in the LTE limit from several molecular line lists, and are presented at various spectral resolving powers corresponding to several specific instrument simulations. The database is available online at http://www.spectrafactory.net, where users can freely browse, search, display, and download the spectra. We describe how additional model spectra can be requested for (automatic) calculation and inclusion. The database already contains over half a million model spectra for 39 molecules (96 different isotopologues) over the wavelength range 350 nm-3 mm (~3-30000 cm^-1)

  9. Medical student database development: a model for record management in a multi-departmental setting.

    PubMed Central

    Vercillo, D. M.; Holmes, K. C.; Pingree, M. J.; Bray, B. E.; Lincoln, M. J.

    1999-01-01

    Student records flow through medical school offices at a rapid rate. Much of this data is often tracked on paper, spread across multiple departments. The Medical Student Informatics Group at the University of Utah School of Medicine identified offices and organizations documenting student information. We assessed departmental needs, identified records, and researched database software available within the private sector and academic community. Although a host of database applications exist, few publications discuss database models for storage and retrieval of student records. We developed and deployed an Internet based application to meet current requirements, and allow for future expandability. During a test period, users were polled regarding utility, security, stability, ease of use, data accuracy, and potential project expansion. Feedback demonstrated widespread approval, and considerable interest in additional feature development. This experience suggests that many medical schools would benefit from centralized database management of student records. PMID:10566507

  10. Database integration in a multimedia-modeling environment

    SciTech Connect

    Dorow, Kevin E.

    2002-09-02

    Integration of data from disparate remote sources has direct applicability to modeling, which can support Brownfield assessments. To accomplish this task, a data integration framework needs to be established. A key element in this framework is the metadata that creates the relationship between the pieces of information that are important in the multimedia modeling environment and the information that is stored in the remote data source. The design philosophy is to allow modelers and database owners to collaborate by defining this metadata in such a way that allows interaction between their components. The main parts of this framework include tools to facilitate metadata definition, database extraction plan creation, automated extraction plan execution / data retrieval, and a central clearing house for metadata and modeling / database resources. Cross-platform compatibility (using Java) and standard communications protocols (http / https) allow these parts to run in a wide variety of computing environments (Local Area Networks, Internet, etc.), and, therefore, this framework provides many benefits. Because of the specific data relationships described in the metadata, the amount of data that have to be transferred is kept to a minimum (only the data that fulfill a specific request are provided as opposed to transferring the complete contents of a data source). This allows for real-time data extraction from the actual source. Also, the framework sets up collaborative responsibilities such that the different types of participants have control over the areas in which they have domain knowledge: the modelers are responsible for defining the data relevant to their models, while the database owners are responsible for mapping the contents of the database using the metadata definitions. Finally, the data extraction mechanism provides the ability to control access to the data and what data are made available.

  13. An Approach to Query Cost Modelling in Numeric Databases.

    ERIC Educational Resources Information Center

    Jarvelin, Kalervo

    1989-01-01

    Examines factors that determine user charges based on query processing costs in numeric databases, and analyzes the problem of estimating such charges in advance. An approach to query cost estimation is presented which is based on the relational data model and the query optimization, cardinality estimation, and file design techniques developed in…

  14. Technical Work Plan for: Thermodynamic Database for Chemical Modeling

    SciTech Connect

    C.F. Jove-Colon

    2006-09-07

    The objective of the work scope covered by this Technical Work Plan (TWP) is to correct and improve the Yucca Mountain Project (YMP) thermodynamic databases, to update their documentation, and to ensure reasonable consistency among them. In addition, the work scope will continue to generate database revisions, which are organized and named so as to be transparent to internal and external users and reviewers. Regarding consistency among databases, it is noted that aqueous speciation and mineral solubility data for a given system may differ according to how solubility was determined, and the method used for subsequent retrieval of thermodynamic parameter values from measured data. Of particular concern are the details of the determination of "infinite dilution" constants, which involve the use of specific methods for activity coefficient corrections. That is, equilibrium constants developed for a given system for one set of conditions may not be consistent with constants developed for other conditions, depending on the species considered in the chemical reactions and the methods used in the reported studies. Hence, there will be some differences (for example in log K values) between the Pitzer and "B-dot" database parameters for the same reactions or species.

  15. INTERCOMPARISON OF ALTERNATIVE VEGETATION DATABASES FOR REGIONAL AIR QUALITY MODELING

    EPA Science Inventory

    Vegetation cover data are used to characterize several regional air quality modeling processes, including the calculation of heat, moisture, and momentum fluxes with the Mesoscale Meteorological Model (MM5) and the estimate of biogenic volatile organic compound and nitric oxide...

  17. NGNP Risk Management Database: A Model for Managing Risk

    SciTech Connect

    John Collins

    2009-09-01

    To facilitate the implementation of the Risk Management Plan, the Next Generation Nuclear Plant (NGNP) Project has developed and employed an analytical software tool called the NGNP Risk Management System (RMS). A relational database developed in Microsoft® Access, the RMS provides conventional database utility, including data maintenance, archiving, configuration control, and query ability. Additionally, the tool provides a number of unique capabilities specifically designed to facilitate the development and execution of activities outlined in the Risk Management Plan. Specifically, the RMS provides the capability to establish the risk baseline, document and analyze the risk reduction plan, track the current risk reduction status, organize risks by reference configuration system, subsystem, and component (SSC) and Area, and improve NGNP decision making.

  18. Artificial intelligence techniques for modeling database user behavior

    NASA Technical Reports Server (NTRS)

    Tanner, Steve; Graves, Sara J.

    1990-01-01

    The design and development of the adaptive modeling system is described. This system models how a user accesses a relational database management system in order to improve its performance by discovering user access patterns. In the current system, these patterns are used to improve the user interface and may be used to speed data retrieval, support query optimization, and support a more flexible data representation. The system models both syntactic and semantic information about the user's access and employs both procedural and rule-based logic to manipulate the model.
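
    As a toy illustration of access-pattern discovery of the kind described above (not the NASA system itself), the sketch below mines a query log for table pairs that a user frequently requests together; such pairs could drive interface shortcuts or prefetching. The log format is an assumption.

        # Minimal sketch: count co-queried table pairs and keep frequent ones.
        from collections import Counter
        from itertools import combinations

        query_log = [
            {"user": "u1", "tables": ["genes", "phenotypes"]},
            {"user": "u1", "tables": ["genes", "phenotypes"]},
            {"user": "u1", "tables": ["genes", "pathways"]},
        ]

        def frequent_pairs(log, min_support=2):
            counts = Counter()
            for q in log:
                for pair in combinations(sorted(set(q["tables"])), 2):
                    counts[pair] += 1
            return [(pair, n) for pair, n in counts.items() if n >= min_support]

        print(frequent_pairs(query_log))  # -> [(('genes', 'phenotypes'), 2)]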

  19. Comparing global soil models to soil carbon profile databases

    NASA Astrophysics Data System (ADS)

    Koven, C. D.; Harden, J. W.; He, Y.; Lawrence, D. M.; Nave, L. E.; O'Donnell, J. A.; Treat, C.; Sulman, B. N.; Kane, E. S.

    2015-12-01

    As global soil models begin to consider the dynamics of carbon below the surface layers, it is crucial to assess the realism of these models. We focus on the vertical profiles of soil C predicted across multiple biomes from the Community Land Model (CLM4.5), using different values for a parameter that controls the rate of decomposition at depth versus at the surface, and compare these to observation-based diagnostics derived from the International Soil Carbon Network (ISCN) database to assess the realism of model predictions of carbon depth attenuation, and the ability of observations to provide a constraint on rates of decomposition at depth.

  20. SAPling: a Scan-Add-Print barcoding database system to label and track asexual organisms

    PubMed Central

    Thomas, Michael A.; Schötz, Eva-Maria

    2011-01-01

    We have developed a 'Scan-Add-Print' database system, SAPling, to track and monitor asexually reproducing organisms. Using barcodes to uniquely identify each animal, we can record information on the life of the individual in a computerized database containing its entire family tree. SAPling has enabled us to carry out large-scale population dynamics experiments with thousands of planarians and keep track of each individual. The database stores information such as family connections, birth date, division date and generation. We show that SAPling can be easily adapted to other asexually reproducing organisms and has a strong potential for use in large-scale and/or long-term population and senescence studies as well as studies of clonal diversity. The software is platform-independent, designed for reliability and ease of use, and provided open source from our webpage to allow project-specific customization. PMID:21993779
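
    A record of the kind SAPling stores can be pictured in a few lines of Python; the sketch below is not SAPling's actual schema (the field names are assumptions), but it shows how per-barcode rows with parent links make the full family tree recoverable.

        # Minimal sketch: per-barcode rows with parent links form a family tree.
        records = {}

        def add_animal(barcode, parent=None, birth_date=None):
            gen = records[parent]["generation"] + 1 if parent else 0
            records[barcode] = {"parent": parent, "birth_date": birth_date,
                                "division_date": None, "generation": gen}

        def lineage(barcode):
            """Walk from an individual back to the founding clone."""
            chain = []
            while barcode is not None:
                chain.append(barcode)
                barcode = records[barcode]["parent"]
            return chain

        add_animal("P0001")
        add_animal("P0002", parent="P0001", birth_date="2011-03-01")
        print(lineage("P0002"))  # -> ['P0002', 'P0001']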

  1. Applications of the Cambridge Structural Database in organic chemistry and crystal chemistry.

    PubMed

    Allen, Frank H; Motherwell, W D Samuel

    2002-06-01

    The Cambridge Structural Database (CSD) and its associated software systems have formed the basis for more than 800 research applications in structural chemistry, crystallography and the life sciences. Relevant references, dating from the mid-1970s, and brief synopses of these papers are collected in a database, DBUse, which is freely available via the CCDC website. This database has been used to review research applications of the CSD in organic chemistry, including supramolecular applications, and in organic crystal chemistry. The review concentrates on applications that have been published since 1990 and covers a wide range of topics, including structure correlation, conformational analysis, hydrogen bonding and other intermolecular interactions, studies of crystal packing, extended structural motifs, crystal engineering and polymorphism, and crystal structure prediction. Applications of CSD information in studies of crystal structure precision, the determination of crystal structures from powder diffraction data, together with applications in chemical informatics, are also discussed.

  2. Accelerating Information Retrieval from Profile Hidden Markov Model Databases.

    PubMed

    Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem

    2016-01-01

    Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest in improving the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for batch query searching, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than by focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41% and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
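
    The representative-first strategy the authors describe can be sketched compactly; the function below is illustrative only, and score() stands in for whatever profile alignment scorer is used, which is an assumption rather than the paper's implementation.

        # Minimal sketch: rank clusters by their representative's score, then
        # align the query only against members of the best-scoring clusters.
        def cluster_search(query, clusters, score, top_k=2):
            # clusters: list of (representative_profile, member_profiles)
            ranked = sorted(clusters, key=lambda c: score(query, c[0]), reverse=True)
            hits = []
            for rep, members in ranked[:top_k]:  # skip unpromising clusters
                hits.extend((p, score(query, p)) for p in members)
            return sorted(hits, key=lambda h: h[1], reverse=True)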

  3. Accelerating Information Retrieval from Profile Hidden Markov Model Databases

    PubMed Central

    Ashhab, Yaqoub; Tamimi, Hashem

    2016-01-01

    Profile Hidden Markov Model (Profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is an increasing interest in improving the efficiency of searching Profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for batch query searching, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than by focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAMs profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41% and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases. PMID:27875548

  4. A database of biominerals with optical properties found in living organisms

    NASA Astrophysics Data System (ADS)

    Pamirsky, Igor E.; Gutnikov, Sergei A.; Golokhvast, Kirill S.

    2016-11-01

    Living organisms - animals, plants, algae, fungi - contain microscopic inorganic inclusions (biominerals). A rather large body of information about their chemical composition, morphological types and presence in various parts of the organisms has been accumulated. Research on biominerals has fundamental scientific value and can also be useful for the development of materials with specific properties. We propose a database intended to comprise data about all known biominerals, as an efficient practical tool for both fundamental biological research and the development of biotechnology.

  5. ECOS E-MATRIX Methane and Volatile Organic Carbon (VOC) Emissions Best Practices Database

    SciTech Connect

    Parisien, Lia

    2016-01-31

    This final scientific/technical report on the ECOS e-MATRIX Methane and Volatile Organic Carbon (VOC) Emissions Best Practices Database provides a disclaimer and acknowledgement, table of contents, executive summary, description of project activities, and briefing/technical presentation link.

  6. Using the Cambridge Structural Database to Teach Molecular Geometry Concepts in Organic Chemistry

    ERIC Educational Resources Information Center

    Wackerly, Jay Wm.; Janowicz, Philip A.; Ritchey, Joshua A.; Caruso, Mary M.; Elliott, Erin L.; Moore, Jeffrey S.

    2009-01-01

    This article reports a set of two homework assignments that can be used in a second-year undergraduate organic chemistry class. These assignments were designed to help reinforce concepts of molecular geometry and to give students the opportunity to use a technological database and data mining to analyze experimentally determined chemical…

  8. Teaching biology with model organisms

    NASA Astrophysics Data System (ADS)

    Keeley, Dolores A.

    The purpose of this study is to identify and use model organisms that represent each of the kingdoms biologists use to classify organisms, while experiencing the process of science through guided inquiry. The model organisms serve as the basis for studying the four high school life science core ideas identified by the Next Generation Science Standards (NGSS): LS1-From molecules to organisms, LS2-Ecosystems, LS3-Heredity, and LS4-Biological Evolution. The NGSS also identify four categories of science and engineering practices, which include developing and using models and planning and carrying out investigations. The living organisms are utilized to increase student interest and knowledge within the discipline of Biology. Pre-test and post-test comparison using Student's t-test supported the hypothesis: this study shows increased student learning as a result of using living organisms as models for classification and working in an inquiry-based learning environment.

  9. Fitting the Balding-Nichols model to forensic databases.

    PubMed

    Rohlfs, Rori V; Aguiar, Vitor R C; Lohmueller, Kirk E; Castro, Amanda M; Ferreira, Alessandro C S; Almeida, Vanessa C O; Louro, Iuri D; Nielsen, Rasmus

    2015-11-01

    Large forensic databases provide an opportunity to compare observed empirical rates of genotype matching with those expected under forensic genetic models. A number of researchers have taken advantage of this opportunity to validate some forensic genetic approaches, particularly to ensure that estimated rates of genotype matching between unrelated individuals are indeed slight overestimates of those observed. However, these studies have also revealed systematic error trends in genotype probability estimates. In this analysis, we investigate these error trends and show how they result from inappropriate implementation of the Balding-Nichols model in the context of database-wide matching. Specifically, we show that in addition to accounting for increased allelic matching between individuals with recent shared ancestry, studies must account for relatively decreased allelic matching between individuals with more ancient shared ancestry.
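
    For reference (this is the standard form, not quoted from the paper), the Balding-Nichols model enters database matching through its sampling formula: with coancestry coefficient \theta and population allele frequency p_A, the probability that the next sampled allele is A, given m copies of A among n alleles already sampled from the same subpopulation, is

        % Balding-Nichols sampling formula and the resulting homozygote
        % match probability (NRC II form), shown in LaTeX for reference.
        P(A \mid m \text{ of } n) = \frac{m\theta + (1-\theta)\,p_A}{1 + (n-1)\,\theta}
        % e.g. probability that a random member of the same subpopulation
        % is AA, given a database profile AA:
        P(AA \mid AA) = \frac{\bigl(2\theta + (1-\theta)p_A\bigr)\bigl(3\theta + (1-\theta)p_A\bigr)}{(1+\theta)(1+2\theta)}

    Mis-specifying \theta relative to the actual degree of shared ancestry shifts these match probabilities, which is one route to the systematic error trends the authors discuss.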

  10. CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002

    PubMed Central

    Yang, Yaohua; Feng, Jie; Li, Tao; Ge, Feng; Zhao, Jindong

    2015-01-01

    Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, BLAST is included for sequence-based similarity searching, and Cluster 3.0 as well as the R hclust function are provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns and proteomic profiles of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics PMID:25632108

  11. Organizing the Extremely Large LSST Database for Real-Time Astronomical Processing

    SciTech Connect

    Becla, Jacek; Lim, Kian-Tat; Monkewitz, Serge; Nieto-Santisteban, Maria; Thakar, Ani

    2007-11-07

    The Large Synoptic Survey Telescope (LSST) will catalog billions of astronomical objects and trillions of sources, all of which will be stored and managed by a database management system. One of the main challenges is real-time alert generation. To generate alerts, up to 100K new difference detections have to be cross-correlated with the huge historical catalogs, and then further processed to prune false alerts. This paper explains the challenges, the implementation of the LSST Association Pipeline and the database organization strategies we are planning to use to meet the real-time requirements, including data partitioning, parallelization, and pre-loading.
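
    The core of the association problem above is a spatial cross-match against a partitioned catalog. The toy sketch below shows the idea in a flat-sky approximation with a made-up zone size; the real LSST partitioning and parallelization scheme is far more elaborate.

        # Minimal sketch: bucket catalog objects into zones so each detection
        # is compared only against nearby objects, not the full catalog.
        from collections import defaultdict
        import math

        ZONE = 0.1  # zone size in degrees (illustrative)

        def zone_key(ra, dec):
            return (int(ra // ZONE), int(dec // ZONE))

        def build_index(catalog):  # catalog: iterable of (obj_id, ra, dec)
            index = defaultdict(list)
            for obj in catalog:
                index[zone_key(obj[1], obj[2])].append(obj)
            return index

        def crossmatch(ra, dec, index, radius=0.001):
            kx, ky = zone_key(ra, dec)
            matches = []
            for dx in (-1, 0, 1):      # search the 3x3 zone neighborhood
                for dy in (-1, 0, 1):
                    for obj_id, ora, odec in index.get((kx + dx, ky + dy), []):
                        if math.hypot(ora - ra, odec - dec) < radius:
                            matches.append(obj_id)
            return matches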

  12. PANTHER: a browsable database of gene products organized by biological function, using curated protein family and subfamily classification

    PubMed Central

    Thomas, Paul D.; Kejariwal, Anish; Campbell, Michael J.; Mi, Huaiyu; Diemer, Karen; Guo, Nan; Ladunga, Istvan; Ulitsky-Lazareva, Betty; Muruganujan, Anushya; Rabkin, Steven; Vandergriff, Jody A.; Doremieux, Olivier

    2003-01-01

    The PANTHER database was designed for high-throughput analysis of protein sequences. One of the key features is a simplified ontology of protein function, which allows browsing of the database by biological function. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (Hidden Markov Models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. PANTHER is publicly available on the web at http://panther.celera.com. PMID:12520017

  13. Filling Terrorism Gaps: VEOs, Evaluating Databases, and Applying Risk Terrain Modeling to Terrorism

    SciTech Connect

    Hagan, Ross F.

    2016-08-29

    This paper aims to address three issues: the lack of literature differentiating terrorism and violent extremist organizations (VEOs), terrorism incident databases, and the applicability of Risk Terrain Modeling (RTM) to terrorism. Current open source literature and publicly available government sources do not differentiate between terrorism and VEOs; furthermore, they fail to define them. Addressing the lack of a comprehensive comparison of existing terrorism data sources, a matrix comparing a dozen terrorism databases is constructed, providing insight toward the array of data available. RTM, a method for spatial risk analysis at a micro level, has some applicability to terrorism research, particularly for studies looking at risk indicators of terrorism. Leveraging attack data from multiple databases, combined with RTM, offers one avenue for closing existing research gaps in terrorism literature.

  14. Verification of road databases using multiple road models

    NASA Astrophysics Data System (ADS)

    Ziems, Marcel; Rottensteiner, Franz; Heipke, Christian

    2017-08-01

    In this paper a new approach for automatic road database verification based on remote sensing images is presented. In contrast to existing methods, the applicability of the new approach is not restricted to specific road types, context areas or geographic regions. This is achieved by combining several state-of-the-art road detection and road verification approaches that work well under different circumstances. Each one serves as an independent module representing a unique road model and a specific processing strategy. All modules provide independent solutions for the verification problem of each road object stored in the database in the form of two probability distributions, the first for the state of a database object (correct or incorrect), and the second for the state of the underlying road model (applicable or not applicable). In accordance with the Dempster-Shafer theory, both distributions are mapped to a new state space comprising the classes correct, incorrect and unknown. Statistical reasoning is applied to obtain the optimal state of a road object. A comparison with state-of-the-art road detection approaches using benchmark datasets shows that in general the proposed approach provides results with larger completeness. Additional experiments reveal that a highly reliable semi-automatic approach for road database verification can be designed based on the proposed method.
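
    The mapping into {correct, incorrect, unknown} and the subsequent fusion can be sketched compactly; the code below is an illustrative reading of the scheme (the mass assignments and module outputs are assumptions), with applicability discounting a module's verdict into "unknown" and Dempster's rule combining modules.

        # Minimal sketch: applicability-discounted masses + Dempster's rule
        # over the frame {correct, incorrect}; "unknown" is the whole frame.
        def to_masses(p_correct, p_applicable):
            return {"correct": p_correct * p_applicable,
                    "incorrect": (1 - p_correct) * p_applicable,
                    "unknown": 1 - p_applicable}

        def combine(m1, m2):
            out = {"correct": 0.0, "incorrect": 0.0, "unknown": 0.0}
            conflict = 0.0
            for a, wa in m1.items():
                for b, wb in m2.items():
                    if a == "unknown":
                        key = b          # unknown is compatible with anything
                    elif b == "unknown" or a == b:
                        key = a
                    else:                # correct vs incorrect: conflict
                        conflict += wa * wb
                        continue
                    out[key] += wa * wb
            return {k: v / (1 - conflict) for k, v in out.items()}

        print(combine(to_masses(0.9, 0.8), to_masses(0.6, 0.5)))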

  15. MtDB: a database for personalized data mining of the model legume Medicago truncatula transcriptome.

    PubMed

    Lamblin, Anne-Françoise J; Crow, John A; Johnson, James E; Silverstein, Kevin A T; Kunau, Timothy M; Kilian, Alan; Benz, Diane; Stromvik, Martina; Endré, Gabriella; VandenBosch, Kathryn A; Cook, Douglas R; Young, Nevin D; Retzel, Ernest F

    2003-01-01

    In order to identify the genes and gene functions that underlie key aspects of legume biology, researchers have selected the cool season legume Medicago truncatula (Mt) as a model system for legume research. A set of >170 000 Mt ESTs has been assembled based on in-depth sampling from various developmental stages and pathogen-challenged tissues. MtDB is a relational database that integrates Mt transcriptome data and provides a wide range of user-defined data mining options. The database is interrogated through a series of interfaces with 58 options grouped into two filters. In addition, the user can select and compare unigene sets generated by different assemblers: Phrap, Cap3 and Cap4. Sequence identifiers from all public Mt sites (e.g. IDs from GenBank, CCGB, TIGR, NCGR, INRA) are fully cross-referenced to facilitate comparisons between different sites, and hypertext links to the appropriate database records are provided for all query results. MtDB's goal is to provide researchers with the means to quickly and independently identify sequences that match specific research interests based on user-defined criteria. The underlying database and query software have been designed for ease of updates and portability to other model organisms. Public access to the database is at http://www.medicago.org/MtDB.

  16. ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species

    PubMed Central

    Zeng, Victor; Extavour, Cassandra G.

    2012-01-01

    The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, determine coding regions, identify protein domains and assign Gene Ontology (GO) terms to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental…

  17. ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.

    PubMed

    Zeng, Victor; Extavour, Cassandra G

    2012-01-01

    The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, determine coding regions, identify protein domains and assign Gene Ontology (GO) terms to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental…

  18. NGNP Risk Management Database: A Model for Managing Risk

    SciTech Connect

    John Collins; John M. Beck

    2011-11-01

    The Next Generation Nuclear Plant (NGNP) Risk Management System (RMS) is a database used to maintain the project risk register. The RMS also maps risk reduction activities to specific identified risks. Further functionality of the RMS includes mapping reactor suppliers' Design Data Needs (DDNs) to risk reduction tasks and mapping Phenomena Identification and Ranking Tables (PIRTs) to associated risks. This document outlines the basic instructions on how to use the RMS. This document constitutes Revision 1 of the NGNP Risk Management Database: A Model for Managing Risk, and incorporates the latest enhancements to the RMS. The enhancements include six new custom views of risk data: Impact/Consequence, Tasks by Project Phase, Tasks by Status, Tasks by Project Phase/Status, Tasks by Impact/WBS, and Tasks by Phase/Impact/WBS.

  19. GADB: A database facility for modelling naturally occurring geophysical fields

    NASA Technical Reports Server (NTRS)

    Dampney, C. N. G.

    1983-01-01

    In certain kinds of geophysical surveys, the fields are continua, but are measured at discrete points referenced by their position or time of measurement. Systems of this kind are better modelled by databases built from basic data structures attuned to representing traverses across continua that are not of pre-defined fixed length. The General Array DataBase (GADB) is built on arrays (ordered sequences of data), with each array holding data elements of one type. The arrays each occupy their own physical data set, in turn inter-related by a hierarchy to other arrays over the same space/time reference points. The GADB illustrates the principle that a data facility should reflect the fundamental properties of its data and support retrieval based on the application's view. The GADB is being tested through its use in NASA's project MAGSAT.
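
    The structure described above, typed arrays tied together by shared reference points, can be pictured with Python's standard array module; in the sketch the measurement values are made up, and retrieval is driven by the application's view of position along a traverse rather than by row number.

        # Minimal sketch: sibling typed arrays share the same reference points.
        import array

        positions = array.array("d", [0.0, 12.5, 25.0, 37.5])  # km along traverse
        magnetic = array.array("d", [51231.0, 51228.4, 51240.2, 51233.7])  # nT

        def value_at(pos, xs=positions, ys=magnetic):
            """Retrieve by position (the application's view), not by row index."""
            i = min(range(len(xs)), key=lambda j: abs(xs[j] - pos))
            return ys[i]

        print(value_at(13.0))  # nearest measurement to km 13 on the traverse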

  20. Organization's Orderly Interest Exploration: Inception, Development and Insights of AIAA's Topics Database

    NASA Technical Reports Server (NTRS)

    Marshall, Joseph R.; Morris, Allan T.

    2007-01-01

    Since 2003, AIAA's Computer Systems and Software Systems Technical Committees (TCs) have developed a database that helps technical committee management map technical topics to their members. This Topics/Interest (T/I) database grew out of a collection of charts and spreadsheets maintained by the TCs. Since its inception, the tool has evolved into a multi-dimensional database whose dimensions include the importance, interest and expertise of TC members and whether or not a member and/or a TC is actively involved with the topic. In 2005, the database was expanded to include the TCs in AIAA's Information Systems Group and then expanded further to include all AIAA TCs. It was field tested at an AIAA Technical Activities Committee (TAC) Workshop in early 2006 through live access by over 80 users. Through the use of the topics database, TC and program committee (PC) members can accomplish relevant tasks such as identifying topic experts (for Aerospace America articles or external contacts), determining the interests of members, identifying overlapping topics between diverse TCs and PCs, guiding new member drives and revealing emerging topics. This paper describes the origins, inception, initial development, field test and current version of the tool, and elucidates the benefits and insights gained by using the database to aid the management of various TC functions. Suggestions are provided to guide future development of the database for the purpose of providing dynamic, system-level benefits to AIAA that currently do not exist in any technical organization.

  1. Lagrangian modelling tool for IAGOS database added-value products

    NASA Astrophysics Data System (ADS)

    Fontaine, Alain; Auby, Antoine; Petetin, Hervé; Sauvage, Bastien; Thouret, Valérie; Boulanger, Damien

    2015-04-01

    Since 1994, the IAGOS (In-Service Aircraft for a Global Observing System, http://www.iagos.fr) project has produced in-situ measurements of chemical species such as ozone, carbon monoxide and nitrogen oxides through more than 40000 commercial aircraft flights. To help analyse these observations, a tool linking the observed pollutants to their sources was developed, based on the Stohl et al. (2003) methodology. Built on the Lagrangian particle dispersion model FLEXPART coupled with ECMWF meteorological fields, this tool simulates the contributions of anthropogenic and biomass burning emissions from the ECCAD database to the measured carbon monoxide mixing ratio along each IAGOS flight. Thanks to automated processes, 20-day backward simulations are run from each observation, separating the individual contributions from the different source regions. The main goal is to supply added-value products to the IAGOS database showing the geographical origin and emission type of pollutants, and to link trends in the atmospheric composition to changes in the transport pathways and in the evolution of emissions. This tool may also be used for statistical validation and intercomparison of emission inventories, which can be compared with the in-situ observations from the IAGOS database.

  2. Post-transplant lymphoproliferative disorder after pancreas transplantation: a United Network for Organ Sharing database analysis.

    PubMed

    Jackson, K; Ruppert, K; Shapiro, R

    2013-01-01

    There is not a great deal of data on post-transplant lymphoproliferative disorder (PTLD) following pancreas transplantation. We analyzed the United Network for Organ Sharing national database of pancreas transplants to identify predictors of PTLD development. A univariate Cox model was generated for each potential predictor, and those at least marginally associated (p < 0.15) with PTLD were entered into a multivariable Cox model. PTLD developed in 43 patients (1.0%) of 4205 pancreas transplants. Mean follow-up time was 4.9 ± 2.2 yr. In the multivariable Cox model, recipient EBV seronegativity (HR 5.52, 95% CI: 2.99-10.19, p < 0.001), not having tacrolimus in the immunosuppressive regimen (HR 6.02, 95% CI: 2.74-13.19, p < 0.001), recipient age (HR 0.96, 95% CI: 0.92-0.99, p = 0.02), non-white ethnicity (HR 0.11, 95% CI: 0.02-0.84, p = 0.03), and HLA mismatching (HR 0.80, 95% CI: 0.67-0.97, p = 0.02) were significantly associated with the development of PTLD. Patient survival was significantly decreased in patients with PTLD, with one-, three-, and five-yr survival of 91%, 76%, and 70%, compared with 97%, 93%, and 88% in patients without PTLD (p < 0.001). PTLD is an uncommon but potentially lethal complication following pancreas transplantation. Patients with the risk factors identified should be monitored closely for the development of PTLD.
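
    The two-stage screening described above (univariate Cox models with a p < 0.15 entry criterion, then one multivariable model) is straightforward to express with the lifelines package; the sketch below is illustrative, and the file and column names are assumptions, not the study's actual variables.

        # Minimal sketch: univariate screen at p < 0.15, then a joint Cox fit.
        import pandas as pd
        from lifelines import CoxPHFitter

        df = pd.read_csv("pancreas_tx.csv")  # hypothetical per-recipient extract
        candidates = ["recipient_age", "ebv_seronegative", "tacrolimus", "hla_mismatch"]

        kept = []
        for var in candidates:
            m = CoxPHFitter().fit(df[[var, "years_followup", "ptld"]],
                                  duration_col="years_followup", event_col="ptld")
            if m.summary.loc[var, "p"] < 0.15:  # marginal association threshold
                kept.append(var)

        final = CoxPHFitter().fit(df[kept + ["years_followup", "ptld"]],
                                  duration_col="years_followup", event_col="ptld")
        final.print_summary()  # hazard ratios, CIs and p-values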

  3. Human Exposure Modeling - Databases to Support Exposure Modeling

    EPA Pesticide Factsheets

    Human exposure modeling relates pollutant concentrations in the larger environmental media to pollutant concentrations in the immediate exposure media. The models described here are available on other EPA websites.

  4. A future of the model organism model.

    PubMed

    Rine, Jasper

    2014-03-01

    Changes in technology are fundamentally reframing our concept of what constitutes a model organism. Nevertheless, research advances in the more traditional model organisms have enabled fresh and exciting opportunities for young scientists to establish new careers and offer the hope of comprehensive understanding of fundamental processes in life. New advances in translational research can be expected to heighten the importance of basic research in model organisms and expand opportunities. However, researchers must take special care and implement new resources to enable the newest members of the community to engage fully with the remarkable legacy of information in these fields.

  5. A future of the model organism model

    PubMed Central

    Rine, Jasper

    2014-01-01

    Changes in technology are fundamentally reframing our concept of what constitutes a model organism. Nevertheless, research advances in the more traditional model organisms have enabled fresh and exciting opportunities for young scientists to establish new careers and offer the hope of comprehensive understanding of fundamental processes in life. New advances in translational research can be expected to heighten the importance of basic research in model organisms and expand opportunities. However, researchers must take special care and implement new resources to enable the newest members of the community to engage fully with the remarkable legacy of information in these fields. PMID:24577733

  6. An atmospheric tritium release database for model comparisons. Revision 1

    SciTech Connect

    Murphy, C.E. Jr.; Wortham, G.R.

    1995-01-01

    A database of vegetation, soil, and air tritium concentrations at gridded coordinate locations following nine accidental atmospheric releases is described. While none of the releases caused a significant dose to the public, the data collected are valuable for comparison with the results of tritium transport models used for risk assessment. The largest potential individual off-site dose from any of the releases was calculated to be 1.6 mrem. The population dose from this same release was 46 person-rem, which represents 0.04% of the natural background radiation dose to the population in the path of the release.

  7. On the Perceptual Organization of Image Databases Using Cognitive Discriminative Biplots

    NASA Astrophysics Data System (ADS)

    Theoharatos, Christos; Laskaris, Nikolaos A.; Economou, George; Fotopoulos, Spiros

    2006-12-01

    A human-centered approach to image database organization is presented in this study. The management of a generic image database is pursued using a standard psychophysical experimental procedure followed by a well-suited data analysis methodology that is based on simple geometrical concepts. The end result is a cognitive discriminative biplot, which is a visualization of the intrinsic organization of the image database best reflecting the user's perception. The discriminating power of the introduced cognitive biplot constitutes an appealing tool for image retrieval and a flexible interface for visual data mining tasks. These ideas were evaluated in two ways. First, the separability of semantically distinct image classes was measured according to their reduced representations on the biplot. Then, a nearest-neighbor retrieval scheme was run on the emerged low-dimensional terrain to measure the suitability of the biplot for performing content-based image retrieval (CBIR). The achieved organization performance when compared with the performance of a contemporary system was found superior. This promoted the further discussion of packing these ideas into a realizable algorithmic procedure for an efficient and effective personalized CBIR system.

  8. Database and Interim Glass Property Models for Hanford HLW Glasses

    SciTech Connect

    Hrma, Pavel R.; Piepel, Gregory F.; Vienna, John D.; Cooley, Scott K.; Kim, Dong-Sang; Russell, Renee L.

    2001-07-24

    The purpose of this report is to provide a methodology for increasing the efficiency and decreasing the cost of vitrifying high-level waste (HLW) by optimizing HLW glass formulation. This methodology consists of collecting and generating a database of glass properties that determine HLW glass processability and acceptability, and relating these properties to glass composition. The report explains how the property-composition models are developed, fitted to data, used for glass formulation optimization, and continuously updated in response to changes in HLW composition estimates and changes in glass processing technology. Further, the report reviews the glass property-composition literature data and presents their preliminary critical evaluation and screening. Finally, the report provides interim property-composition models for melt viscosity, for liquidus temperature (with spinel and zircon primary crystalline phases), and for the product consistency test normalized releases of B, Na, and Li. Models were fitted to a subset of the screened database deemed most relevant for the current HLW composition region.
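
    Property-composition models of this kind are often first-order mixture models: a property (or a transform of it, such as ln viscosity) regressed on component mass fractions. The sketch below fits such a model by least squares; the compositions, property values and component set are placeholders, not data from the report.

        # Minimal sketch: first-order mixture model, property = sum(b_i * g_i).
        import numpy as np

        # rows: glasses; columns: mass fractions of SiO2, B2O3, Na2O (sum to 1)
        G = np.array([[0.55, 0.25, 0.20],
                      [0.60, 0.20, 0.20],
                      [0.50, 0.30, 0.20],
                      [0.58, 0.22, 0.20]])
        y = np.array([4.1, 4.6, 3.7, 4.4])  # e.g. ln(viscosity) at fixed T (made up)

        b, *_ = np.linalg.lstsq(G, y, rcond=None)  # per-component coefficients
        print(dict(zip(["SiO2", "B2O3", "Na2O"], b.round(2))))
        print("fitted:", G @ b)  # check the model against the training glasses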

  9. A Thermal Model Preprocessor For Graphics And Material Database Generation

    NASA Astrophysics Data System (ADS)

    Jones, Jack C.; Gonda, Teresa G.

    1989-08-01

    The process of developing a physical description of a target for thermal models is a time consuming and tedious task. The problem is one of data collection, data manipulation, and data storage. Information on targets can come from many sources and therefore could be in any form (2-D drawings, 3-D wireframe or solid model representations, etc.). TACOM has developed a preprocessor that decreases the time involved in creating a faceted target representation. This program allows the user to create the graphics for the vehicle and to assign the material properties to the graphics. The vehicle description file is then automatically generated by the preprocessor. By containing all the information in one database, the modeling process is made more accurate and data tracing can be done easily. A bridge to convert other graphics packages (such as BRL-CAD) to a faceted representation is being developed. When the bridge is finished, this preprocessor will be used to manipulate the converted data.

  10. Modeling, Measurements, and Fundamental Database Development for Nonequilibrium Hypersonic Aerothermodynamics

    NASA Technical Reports Server (NTRS)

    Bose, Deepak

    2012-01-01

    The design of entry vehicles requires predictions of the aerothermal environment during the hypersonic phase of their flight trajectories. These predictions are made using computational fluid dynamics (CFD) codes that often rely on physics and chemistry models of nonequilibrium processes. The primary processes of interest are gas phase chemistry, internal energy relaxation, electronic excitation, nonequilibrium emission and absorption of radiation, and gas-surface interaction leading to surface recession and catalytic recombination. NASA's Hypersonics Project is advancing the state-of-the-art in modeling of nonequilibrium phenomena by making detailed spectroscopic measurements in shock tubes and arcjets, using ab-initio quantum mechanical techniques to develop fundamental chemistry and spectroscopic databases, making fundamental measurements of finite-rate gas-surface interactions, and implementing detailed mechanisms in state-of-the-art CFD codes. The development of new models is based on validation with relevant experiments. We will present the latest developments and a roadmap for the technical areas mentioned above.

  11. The OTP-model applied to the Aklim site database

    NASA Astrophysics Data System (ADS)

    Mraini, Kamilia; Jabiri, Abdelhadi; Benkhaldoun, Zouhair; Bounhir, Aziza; Hach, Youssef; Sabil, Mohammed; Habib, Abdelfettah

    2014-08-01

    Within the framework of the site prospection for the future European Extremely Large Telescope (E-ELT), a wide site characterization campaign was carried out. Aklim site, located at an altitude of 2350 m at the geographical coordinates lat. = 30°07'38" N, long. = 8°18'31" W in the Moroccan Middle Atlas Mountains, was one of the candidate sites chosen by the Framework Programme VI (FP6) of the European Union. To complement the studies already carried out ([19]; [21]), we have used the ModelOTP (model of optical turbulence profiles) established by [15] and improved by [6]. This model provides built-in profiles of the optical turbulence under various conditions. In this paper, we present an overview of the Aklim database results, in the boundary layer and in the free atmosphere separately, and we make a comparison with the Cerro Pachon results [15].

  12. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases.

    PubMed

    Voss, Erica A; Makadia, Rupa; Matcho, Amy; Ma, Qianli; Knoll, Chris; Schuemie, Martijn; DeFalco, Frank J; Londhe, Ajit; Zhu, Vivienne; Ryan, Patrick B

    2015-05-01

    To evaluate the utility of applying the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) across multiple observational databases within an organization and to apply standardized analytics tools for conducting observational research. Six deidentified patient-level datasets were transformed to the OMOP CDM. We evaluated the extent of information loss that occurred through the standardization process. We developed a standardized analytic tool to replicate the cohort construction process from a published epidemiology protocol and applied the analysis to all 6 databases to assess time-to-execution and comparability of results. Transformation to the CDM resulted in minimal information loss across all 6 databases. Patients and observations were excluded due to identified data quality issues in the source systems; 96% to 99% of condition records and 90% to 99% of drug records were successfully mapped into the CDM using the standard vocabulary. The full cohort replication and descriptive baseline summary was executed for 2 cohorts in 6 databases in less than 1 hour. The standardization process improved data quality, increased efficiency, and facilitated cross-database comparisons to support a more systematic approach to observational research. Comparisons across data sources showed consistency in the impact of the protocol's inclusion criteria, and identified differences in patient characteristics and coding practices across databases. Standardizing data structure (through a CDM), content (through a standard vocabulary with source code mappings), and analytics can enable an institution to apply a network-based approach to observational research across multiple, disparate observational health databases. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
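
    The vocabulary-mapping step behind the reported 90-99% coverage figures can be pictured with a toy stand-in for the OMOP standard vocabulary; in the sketch below the mapping rows and record layout are illustrative, not actual OMOP content, and unmapped codes are flagged so an information-loss rate can be reported.

        # Minimal sketch: map source codes to standard concept ids, track misses.
        source_to_concept = {("ICD9", "250.00"): 201826,       # toy mapping rows
                             ("NDC", "00093-1048"): 1503297}

        def map_records(records):
            mapped, unmapped = [], []
            for rec in records:
                cid = source_to_concept.get((rec["vocab"], rec["code"]))
                (mapped if cid else unmapped).append({**rec, "concept_id": cid})
            return mapped, unmapped

        recs = [{"vocab": "ICD9", "code": "250.00"},
                {"vocab": "ICD9", "code": "V99.9"}]
        mapped, unmapped = map_records(recs)
        print(f"mapped {len(mapped)}/{len(recs)} records")  # coverage metric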

  13. Feasibility and utility of applications of the common data model to multiple, disparate observational health databases

    PubMed Central

    Makadia, Rupa; Matcho, Amy; Ma, Qianli; Knoll, Chris; Schuemie, Martijn; DeFalco, Frank J; Londhe, Ajit; Zhu, Vivienne; Ryan, Patrick B

    2015-01-01

    Objectives To evaluate the utility of applying the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) across multiple observational databases within an organization and to apply standardized analytics tools for conducting observational research. Materials and methods Six deidentified patient-level datasets were transformed to the OMOP CDM. We evaluated the extent of information loss that occurred through the standardization process. We developed a standardized analytic tool to replicate the cohort construction process from a published epidemiology protocol and applied the analysis to all 6 databases to assess time-to-execution and comparability of results. Results Transformation to the CDM resulted in minimal information loss across all 6 databases. Patients and observations were excluded due to identified data quality issues in the source systems; 96% to 99% of condition records and 90% to 99% of drug records were successfully mapped into the CDM using the standard vocabulary. The full cohort replication and descriptive baseline summary was executed for 2 cohorts in 6 databases in less than 1 hour. Discussion The standardization process improved data quality, increased efficiency, and facilitated cross-database comparisons to support a more systematic approach to observational research. Comparisons across data sources showed consistency in the impact of the protocol's inclusion criteria, and identified differences in patient characteristics and coding practices across databases. Conclusion Standardizing data structure (through a CDM), content (through a standard vocabulary with source code mappings), and analytics can enable an institution to apply a network-based approach to observational research across multiple, disparate observational health databases. PMID:25670757

  14. Developing High-resolution Soil Database for Regional Crop Modeling in East Africa

    NASA Astrophysics Data System (ADS)

    Han, E.; Ines, A. V. M.

    2014-12-01

    The most readily available soil data for regional crop modeling in Africa is the World Inventory of Soil Emission potentials (WISE) dataset, which has 1125 soil profiles for the world but does not extensively cover Ethiopia, Kenya, Uganda and Tanzania in East Africa. Another available dataset is HC27 (Harvest Choice by IFPRI), in a gridded format (10 km) but composed of generic soil profiles based on only three criteria (texture, rooting depth, and organic carbon content). In this paper, we present the development and application of a high-resolution (1 km), gridded soil database for regional crop modeling in East Africa. Basic soil information is extracted from the Africa Soil Information Service (AfSIS), which provides essential soil properties (bulk density, soil organic carbon, soil pH and percentages of sand, silt and clay) for 6 different standardized soil layers (5, 15, 30, 60, 100 and 200 cm) at 1 km resolution. Soil hydraulic properties (e.g., field capacity and wilting point) are derived from the AfSIS soil dataset using well-proven pedo-transfer functions and are customized for DSSAT-CSM soil data requirements. The crop model is used to evaluate crop yield forecasts using the new high-resolution soil database, compared with WISE and HC27. We will also present the results of DSSAT loosely coupled with a hydrologic model (VIC) to assimilate root-zone soil moisture. Creating a grid-based soil database that provides a consistent soil input for two different models (DSSAT and VIC) is a critical part of this work. The soil database created is expected to contribute to future applications of DSSAT crop simulation in East Africa, where food security is highly vulnerable.
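
    As an illustration of the pedo-transfer step, the sketch below derives water-retention points from sand/clay/organic-matter fractions using a Rawls-style linear form; the coefficients shown are placeholders for illustration only, and real work would use a published PTF (e.g., Saxton-Rawls) with its actual coefficients.

        # Minimal sketch: a linear pedo-transfer function applied per soil layer.
        def ptf_water_retention(sand, clay, om):
            """sand, clay, om as fractions of 1; returns (wilting point,
            field capacity) in m3/m3. Coefficients are illustrative only."""
            wp = 0.026 + 0.500 * clay + 1.58 * om
            fc = 0.2576 - 0.200 * sand + 0.360 * clay + 2.99 * om
            return wp, fc

        # example: a loam layer from one 1-km grid cell (values made up)
        print(ptf_water_retention(sand=0.40, clay=0.20, om=0.015))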

  15. What makes a model organism?

    PubMed

    Leonelli, Sabina; Ankeny, Rachel A

    2013-12-01

    This article explains the key role of model organisms within contemporary research, while at the same time acknowledging their limitations as biological models. We analyse the epistemic and social characteristics of model organism biology as a form of "big science", which includes the development of large, centralised infrastructures, a shared ethos and a specific long-term vision about the "right way" to do research. In order to make wise use of existing resources, researchers now find themselves committed to carrying out this vision with its accompanying assumptions. By clarifying the specific characteristics of model organism work, we aim to provide a framework to assess how much funding should be allocated to such research. On the one hand, it is imperative to exploit the resources and knowledge accumulated using these models to study more diverse groups of organisms. On the other hand, this type of research may be inappropriate for research programmes where the processes of interest are much more delimited, can be usefully studied in isolation and/or are simply not captured by model organism biology.

  16. MOSAIC: An organic geochemical and sedimentological database for marine surface sediments

    NASA Astrophysics Data System (ADS)

    Tavagna, Maria Luisa; Usman, Muhammed; De Avelar, Silvania; Eglinton, Timothy

    2015-04-01

    Modern ocean sediments serve as the interface between the biosphere and the geosphere, play a key role in biogeochemical cycles and provide a window on how contemporary processes are written into the sedimentary record. Research over past decades has resulted in a wealth of information on the content and composition of organic matter in marine sediments, with ever-more sophisticated techniques continuing to yield information of greater detail at an accelerating pace. However, there has been no attempt to synthesize this wealth of information. We are establishing a new database that incorporates information relevant to local, regional and global-scale assessment of the content, source and fate of organic materials accumulating in contemporary marine sediments. In the MOSAIC (Modern Ocean Sediment Archive and Inventory of Carbon) database, particular emphasis is placed on molecular and isotopic information, coupled with contextual information (e.g., sedimentological properties) relevant to elucidating the factors that influence the efficiency and nature of organic matter burial. The main features of MOSAIC include: (i) emphasis on continental margin sediments as major loci of carbon burial, and as the interface between terrestrial and oceanic realms; (ii) bulk to molecular-level organic geochemical properties and parameters, including concentrations and isotopic compositions; (iii) inclusion of extensive contextual data regarding the depositional setting, in particular with respect to sedimentological and redox characteristics. The ultimate goal is to create an open-access instrument, available on the web, to be utilized for research and education by the international community, who can both contribute to and interrogate the database. Submission will be accomplished by means of a pre-configured table available on the MOSAIC webpage. The information in the submitted tables will be checked and eventually imported, via Structured Query Language (SQL), into the database.

  17. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases

    PubMed Central

    2012-01-01

    Background Increasingly, metabolite and reaction information is organized in the form of genome-scale metabolic reconstructions that describe the reaction stoichiometry, directionality, and gene to protein to reaction associations. A key bottleneck in the pace of reconstruction of new, high-quality metabolic models is the inability to directly make use of metabolite/reaction information from biological databases or other models due to incompatibilities in content representation (i.e., metabolites with multiple names across databases and models), stoichiometric errors such as elemental or charge imbalances, and incomplete atomistic detail (e.g., use of generic R-group or non-explicit specification of stereo-specificity). Description MetRxn is a knowledgebase that includes standardized metabolite and reaction descriptions by integrating information from BRENDA, KEGG, MetaCyc, Reactome.org and 44 metabolic models into a single unified data set. All metabolite entries have matched synonyms, resolved protonation states, and are linked to unique structures. All reaction entries are elementally and charge balanced. This is accomplished through the use of a workflow of lexicographic, phonetic, and structural comparison algorithms. MetRxn allows for the download of standardized versions of existing genome-scale metabolic models and the use of metabolic information for the rapid reconstruction of new ones. Conclusions The standardization in description allows for the direct comparison of the metabolite and reaction content between metabolic models and databases and the exhaustive prospecting of pathways for biotechnological production. This ever-growing dataset currently consists of over 76,000 metabolites participating in more than 72,000 reactions (including unresolved entries). MetRxn is hosted on a web-based platform that uses relational database models (MySQL). PMID:22233419

  18. The eukaryotic promoter database in its 30th year: focus on non-vertebrate organisms

    PubMed Central

    Dreos, René; Ambrosini, Giovanna; Groux, Romain; Cavin Périer, Rouaïda; Bucher, Philipp

    2017-01-01

    We present an update of the Eukaryotic Promoter Database EPD (http://epd.vital-it.ch), more specifically on the EPDnew division, which contains comprehensive organism-specific transcription start site (TSS) collections automatically derived from next-generation sequencing (NGS) data. Thanks to the abundant release of new high-throughput transcript mapping data (CAGE, TSS-seq, GRO-cap), the database could be extended to plant and fungal species. We further report on the expansion of the mass genome annotation (MGA) repository containing promoter-relevant chromatin profiling data and on improvements to the EPD entry viewers. Finally, we present a new data access tool, ChIP-Extract, which enables computational biologists to extract diverse types of promoter-associated data in numerical table formats that are readily imported into statistical analysis platforms such as R. PMID:27899657
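    A small sketch of consuming a ChIP-Extract-style numerical table outside R; the tab-separated layout assumed here (a header line, a promoter identifier in the first column, signal values by position relative to the TSS) is an illustration, not the tool's documented format.

    ```python
    # Average a promoters-by-positions signal matrix, column by column.
    import csv

    def mean_profile(tsv_path: str) -> list:
        """Return the mean signal at each position across all promoters.
        Assumes a header line and a promoter identifier in column 1."""
        with open(tsv_path, newline="") as fh:
            reader = csv.reader(fh, delimiter="\t")
            next(reader)  # skip the header line
            rows = [[float(x) for x in r[1:]] for r in reader]
        return [sum(col) / len(rows) for col in zip(*rows)]
    ```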

  19. Mouse Genome Database: From sequence to phenotypes and disease models

    PubMed Central

    Richardson, Joel E.; Kadin, James A.; Smith, Cynthia L.; Blake, Judith A.; Bult, Carol J.

    2015-01-01

    Summary The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities. genesis 53:458–473, 2015. © 2015 The Authors. Genesis Published by Wiley Periodicals, Inc. PMID:26150326

  20. LANL High-Level Model (HLM) database development letter report

    SciTech Connect

    1995-10-01

    Traditional methods of evaluating munitions have been able to successfully compare like munitions' capabilities. On the modern battlefield, however, many different types of munitions compete for the same set of targets. Assessing the overall stockpile capability and the proper mix of these weapons is not a simple task, as their use depends upon the specific geographic region of the world, the threat capabilities, the tactics and operational strategy used by both the US and Threat commanders, and of course the type and quantity of munitions available to the CINC. To sort out these types of issues, a hierarchical set of dynamic, two-sided combat simulations is generally used. The DoD has numerous suitable models for this purpose, but rarely are the models focused on munitions expenditures. Rather, they are designed to perform overall platform assessments and force-mix evaluations. However, in some cases the models could be easily adapted to provide this information, since it is resident in the models' databases. Unfortunately, these simulations' complexity (their greatest strength) precludes quick-turnaround assessments of the type and scope required by senior decision-makers.

  1. Virtual Organizations: Trends and Models

    NASA Astrophysics Data System (ADS)

    Nami, Mohammad Reza; Malekpour, Abbaas

    The use of ICT in business has changed views about traditional business. With VOs, organizations without physical, geographical, or structural constraints can collaborate in order to fulfill customer requests in a networked environment. This idea improves resource utilization, shortens development processes, reduces costs, and saves time. A Virtual Organization (VO) is always a form of partnership, and managing partners and handling partnerships are crucial. Virtual organizations are defined as temporary collections of enterprises that cooperate and share resources, knowledge, and competencies to better respond to business opportunities. This paper presents an overview of virtual organizations and the main issues in collaboration, such as security and management. It also presents a number of different model approaches according to their purpose and applications.

  2. An atmospheric tritium release database for model comparisons

    SciTech Connect

    Murphy, C.E. Jr.; Wortham, G.R.

    1997-10-13

    A database of vegetation, soil, and air tritium concentrations at gridded coordinate locations following nine accidental atmospheric releases is described. The concentration data are supported by climatological data taken during and immediately after the releases. In six cases, the release data are supplemented with meteorological data taken at seven towers scattered throughout the immediate area of the releases and data from a single television tower instrumented at eight heights. While none of the releases caused a significant dose to the public, the data collected are valuable for comparison with the results of tritium transport models used for risk assessment. The largest potential off-site dose from any of the releases was calculated to be 1.6 mrem. The population dose from this same release was 46 person-rem, which represents 0.04 percent of the natural background dose to the population in the path of the release.
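    A sketch of the kind of model-versus-measurement comparison such a database supports. Fractional bias is one standard evaluation metric for transport models; the arrays below are placeholder values, not data from the releases.

    ```python
    # Score a transport model against a measured grid with fractional bias.
    import numpy as np

    def fractional_bias(observed: np.ndarray, predicted: np.ndarray) -> float:
        """FB = 2*(mean_obs - mean_pred)/(mean_obs + mean_pred); 0 is a perfect
        match, and |FB| <= 0.3 is a commonly cited acceptance band."""
        mo, mp = observed.mean(), predicted.mean()
        return 2.0 * (mo - mp) / (mo + mp)

    obs = np.array([1.2, 0.8, 2.5, 0.4])   # measured tritium, arbitrary units
    mod = np.array([1.0, 1.1, 2.0, 0.5])   # modeled values at the same points
    print(fractional_bias(obs, mod))
    ```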

  3. Global search tool for the Advanced Photon Source Integrated Relational Model of Installed Systems (IRMIS) database.

    SciTech Connect

    Quock, D. E. R.; Cianciarulo, M. B.; APS Engineering Support Division; Purdue Univ.

    2007-01-01

    The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, the necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.
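    A hedged sketch of a one-stop search across per-category tables, in the spirit of the global search tool described above; the table and column names are invented, and the real tool's AJAX layer and data-integrity reporting are omitted.

    ```python
    # Two-tier search: per-category hit counts first, matching rows second.
    import sqlite3

    CATEGORY_TABLES = {"ioc": "name", "plc": "name", "process_variable": "pv_name"}

    def global_search(conn: sqlite3.Connection, term: str):
        """Return ({category: hit count}, {category: matching names})."""
        summary, details = {}, {}
        for table, column in CATEGORY_TABLES.items():
            rows = conn.execute(
                f"SELECT {column} FROM {table} WHERE {column} LIKE ?",
                (f"%{term}%",),   # the search term itself is parameterized
            ).fetchall()
            summary[table] = len(rows)
            details[table] = [r[0] for r in rows]
        return summary, details
    ```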

  4. Podiform chromite deposits--database and grade and tonnage models

    USGS Publications Warehouse

    Mosier, Dan L.; Singer, Donald A.; Moring, Barry C.; Galloway, John P.

    2012-01-01

    Chromite ((Mg, Fe++)(Cr, Al, Fe+++)2O4) is the only source for the metallic element chromium, which is used in the metallurgical, chemical, and refractory industries. Podiform chromite deposits are small magmatic chromite bodies formed in the ultramafic section of an ophiolite complex in the oceanic crust. These deposits have been found in midoceanic ridge, off-ridge, and suprasubduction tectonic settings. Most podiform chromite deposits are found in dunite or peridotite near the contact of the cumulate and tectonite zones in ophiolites. We have identified 1,124 individual podiform chromite deposits, based on a 100-meter spatial rule, and have compiled them in a database. Of these, 619 deposits have been used to create three new grade and tonnage models for podiform chromite deposits. The major podiform chromite model has a median tonnage of 11,000 metric tons and a mean grade of 45 percent Cr2O3. The minor podiform chromite model has a median tonnage of 100 metric tons and a mean grade of 43 percent Cr2O3. The banded podiform chromite model has a median tonnage of 650 metric tons and a mean grade of 42 percent Cr2O3. Observed frequency distributions are also given for grades of rhodium, iridium, ruthenium, palladium, and platinum. In resource assessment applications, both major and minor podiform chromite models may be used for any ophiolite complex regardless of its tectonic setting or ophiolite zone. Expected sizes of undiscovered podiform chromite deposits, with respect to degree of deformation or ore-forming process, may determine which model is appropriate. The banded podiform chromite model may be applicable for ophiolites in both suprasubduction and midoceanic ridge settings.
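    For readers unfamiliar with grade-and-tonnage models, a minimal sketch of the summary statistics they report (median tonnage, mean grade) over hypothetical deposit records; the USGS models themselves are built from the 619 deposits described above.

    ```python
    # Summarize hypothetical deposit records the way a grade-tonnage model does.
    import statistics

    deposits = [  # (tonnage in metric tons, grade in percent Cr2O3); placeholders
        (2_000, 46.0), (11_000, 48.5), (25_000, 41.2), (9_500, 44.0),
    ]
    tonnages = [t for t, _ in deposits]
    grades = [g for _, g in deposits]
    print("median tonnage:", statistics.median(tonnages))
    print("mean grade:", statistics.fmean(grades))
    ```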

  5. Determination of urban volatile organic compound emission ratios and comparison with an emissions database

    NASA Astrophysics Data System (ADS)

    Warneke, C.; McKeen, S. A.; de Gouw, J. A.; Goldan, P. D.; Kuster, W. C.; Holloway, J. S.; Williams, E. J.; Lerner, B. M.; Parrish, D. D.; Trainer, M.; Fehsenfeld, F. C.; Kato, S.; Atlas, E. L.; Baker, A.; Blake, D. R.

    2007-05-01

    During the NEAQS-ITCT2k4 campaign in New England, anthropogenic VOCs and CO were measured downwind of New York City and Boston. The emission ratios of VOCs relative to CO and acetylene were calculated using a method in which the ratio of a VOC to acetylene is plotted versus photochemical age; the intercept at a photochemical age of zero gives the emission ratio. The emission ratios determined in this way were compared to other measurement sets, including data from the same location in 2002, canister samples collected inside New York City and Boston, aircraft measurements from Los Angeles in 2002, and the average urban composition of 39 U.S. cities. All the measurements generally agree within a factor of two. The measured emission ratios also agree for most compounds within a factor of two with vehicle exhaust data, indicating that a major source of VOCs in urban areas is automobiles. A comparison with an anthropogenic emission database shows less agreement. Especially large discrepancies were found for the C2-C4 alkanes and most oxygenated species. As an example, the database overestimated toluene by almost a factor of three, which caused an air quality forecast model (WRF-CHEM) using this database to overpredict the toluene mixing ratio by a factor of 2.5 as well. On the other hand, the overall reactivity of the measured species and the reactivity of the same compounds in the emission database were found to agree within 30%.
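    The intercept method the abstract describes, as a small sketch: regress the VOC-to-acetylene ratio against photochemical age and evaluate the fit at age zero. The data points below are hypothetical.

    ```python
    # Emission ratio = intercept of (VOC/acetylene) vs. photochemical age.
    import numpy as np

    age_h = np.array([5.0, 10.0, 18.0, 26.0, 40.0])   # photochemical age, hours
    ratio = np.array([0.95, 0.90, 0.78, 0.70, 0.55])  # measured VOC / acetylene

    slope, intercept = np.polyfit(age_h, ratio, 1)    # ordinary linear fit
    print(f"emission ratio at age zero: {intercept:.2f}")
    ```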

  6. BioModels Database: An enhanced, curated and annotated resource for published quantitative kinetic models

    PubMed Central

    2010-01-01

    Background Quantitative models of biochemical and cellular systems are used to answer a variety of questions in the biological sciences. The number of published quantitative models is growing steadily thanks to increasing interest in the use of models as well as the development of improved software systems and the availability of better, cheaper computer hardware. To maximise the benefits of this growing body of models, the field needs centralised model repositories that will encourage, facilitate and promote model dissemination and reuse. Ideally, the models stored in these repositories should be extensively tested and encoded in community-supported and standardised formats. In addition, the models and their components should be cross-referenced with other resources in order to allow their unambiguous identification. Description BioModels Database http://www.ebi.ac.uk/biomodels/ is aimed at addressing exactly these needs. It is a freely-accessible online resource for storing, viewing, retrieving, and analysing published, peer-reviewed quantitative models of biochemical and cellular systems. The structure and behaviour of each simulation model distributed by BioModels Database are thoroughly checked; in addition, model elements are annotated with terms from controlled vocabularies as well as linked to relevant data resources. Models can be examined online or downloaded in various formats. Reaction network diagrams generated from the models are also available in several formats. BioModels Database also provides features such as online simulation and the extraction of components from large scale models into smaller submodels. Finally, the system provides a range of web services that external software systems can use to access up-to-date data from the database. Conclusions BioModels Database has become a recognised reference resource for systems biology. It is being used by the community in a variety of ways; for example, it is used to benchmark different simulation

  7. MicrobesFlux: a web platform for drafting metabolic models from the KEGG database.

    PubMed

    Feng, Xueyang; Xu, You; Chen, Yixin; Tang, Yinjie J

    2012-08-02

    Concurrent with the efforts currently underway to map microbial genomes using high-throughput sequencing methods, systems biologists are building metabolic models to characterize and predict cell metabolisms. One of the key steps in building a metabolic model is using multiple databases to collect and assemble essential information about genome annotations and the architecture of the metabolic network for a specific organism. To speed up metabolic model development for a large number of microorganisms, we need a user-friendly platform to construct metabolic networks and to perform constraint-based flux balance analysis based on genome databases and experimental results. We have developed a semi-automatic, web-based platform (MicrobesFlux) for generating and reconstructing metabolic models for annotated microorganisms. MicrobesFlux is able to automatically download the metabolic network (including enzymatic reactions and metabolites) of ~1,200 species from the KEGG database (Kyoto Encyclopedia of Genes and Genomes) and then convert it to a metabolic model draft. The platform also provides diverse customized tools, such as gene knockouts and the introduction of heterologous pathways, for users to reconstruct the model network. The reconstructed metabolic network can be formulated as a constraint-based flux model to predict and analyze the carbon fluxes in microbial metabolisms. The simulation results can be exported in SBML format (the Systems Biology Markup Language). We also demonstrated the platform's functionalities by developing an FBA model (including 229 reactions) for a recently annotated bioethanol producer, Thermoanaerobacter sp. strain X514, to predict its biomass growth and ethanol production. MicrobesFlux is an installation-free and open-source platform that enables biologists without prior programming knowledge to develop metabolic models for annotated microorganisms in the KEGG database. Our system facilitates users to reconstruct
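    A toy flux balance analysis of the kind such platforms formulate, using a three-reaction network: maximize the biomass flux subject to the steady-state mass balance S v = 0 and flux bounds. This is a generic FBA sketch, not MicrobesFlux code.

    ```python
    # Toy FBA: maximize biomass drain subject to S v = 0 and flux bounds.
    import numpy as np
    from scipy.optimize import linprog

    # Columns: uptake (ext -> A), conversion (A -> B), biomass drain (B ->).
    S = np.array([[ 1, -1,  0],    # metabolite A balance
                  [ 0,  1, -1]])   # metabolite B balance
    bounds = [(0, 10), (0, None), (0, None)]   # uptake capped at 10 units
    c = [0, 0, -1]                 # linprog minimizes, so negate biomass flux

    res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds, method="highs")
    print("optimal biomass flux:", -res.fun)   # -> 10.0
    ```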

  8. dSED: A database tool for modeling sediment early diagenesis

    NASA Astrophysics Data System (ADS)

    Katsev, S.; Rancourt, D. G.; L'Heureux, I.

    2003-04-01

    Sediment early diagenesis reaction transport models (RTMs) are becoming powerful tools in providing kinetic descriptions of the metal and nutrient diagenetic cycling in marine, lacustrine, estuarine, and other aquatic sediments, as well as of exchanges with the water column. Whereas there exist several good database/program combinations for thermodynamic equilibrium calculations in aqueous systems, at present there exist no database tools for classification and analysis of the kinetic data essential to RTM development. We present a database tool that is intended to serve as an online resource for information about chemical reactions, solid phase and solute reactants, sorption reactions, transport mechanisms, and kinetic and equilibrium parameters that are relevant to sediment diagenesis processes. The list of reactive substances includes but is not limited to organic matter, Fe and Mn oxides and oxyhydroxides, sulfides and sulfates, calcium, iron, and manganese carbonates, phosphorus-bearing minerals, and silicates. Aqueous phases include dissolved carbon dioxide, oxygen, methane, hydrogen sulfide, sulfate, nitrate, phosphate, some organic compounds, and dissolved metal species. A number of filters allow extracting information according to user-specified criteria, e.g., about a class of substances contributing to the cycling of iron. The database also includes bibliographic information about published diagenetic models and the reactions and processes that they consider. At the time of preparing this abstract, dSED contained 128 reactions and 12 pre-defined filters. dSED is maintained by the Lake Sediment Structure and Evolution (LSSE) group at the University of Ottawa (www.science.uottawa.ca/LSSE/dSED) and we invite input from the geochemical community.

  9. A Conceptual Model of the Information Requirements of Nursing Organizations

    PubMed Central

    Miller, Emmy

    1989-01-01

    Three related issues play a role in the identification of the information requirements of nursing organizations. These issues are the current state of computer systems in health care organizations, the lack of a well-defined data set for nursing, and the absence of models representing data and information relevant to clinical and administrative nursing practice. This paper will examine current methods of data collection, processing, and storage in clinical and administrative nursing practice for the purpose of identifying the information requirements of nursing organizations. To satisfy these information requirements, database technology can be used; however, a model for database design is needed that reflects the conceptual framework of nursing and the professional concerns of nurses. A conceptual model of the types of data necessary to produce the desired information will be presented and the relationships among data will be delineated.

  10. Modeling of heavy organic deposition

    SciTech Connect

    Chung, F.T.H.

    1992-01-01

    Organic deposition is often a major problem in petroleum production and processing. This problem is manifested by current activities in gas flooding and heavy oil production. The need for understanding the nature of asphaltenes and asphaltics and developing solutions to the deposition problem is well recognized. Prediction technique is crucial to solution development. In the past 5 years, some progress in modeling organic deposition has been made. A state-of-the-art review of methods for modeling organic deposition is presented in this report. Two new models were developed in this work; one based on a thermodynamic equilibrium principle and the other on the colloidal stability theory. These two models are more general and realistic than others previously reported. Because experimental results on the characteristics of asphaltene are inconclusive, it is still not well known whether the asphaltenes is crude oil exist as a true solution or as a colloidal suspension. Further laboratory work which is designed to study the solubility properties of asphaltenes and to provide additional information for model development is proposed. Some experimental tests have been conducted to study the mechanisms of CO{sub 2}-induced asphaltene precipitation. Coreflooding experiments show that asphaltene precipitation occurs after gas breakthrough. The mechanism of CO{sub 2}-induced asphaltene precipitation is believed to occur by hydrocarbon extraction which causes change in oil composition. Oil swelling due to CO{sub 2} solubilization does not induce asphaltene precipitation.

  11. Share and enjoy: anatomical models database--generating and sharing cardiovascular model data using web services.

    PubMed

    Kerfoot, Eric; Lamata, Pablo; Niederer, Steve; Hose, Rod; Spaan, Jos; Smith, Nic

    2013-11-01

    Sharing data between scientists and with clinicians in cardiac research has been facilitated significantly by the use of web technologies. The potential of this technology has meant that information sharing has been routinely promoted through databases that have encouraged stakeholder participation in communities around these services. In this paper we discuss the Anatomical Model Database (AMDB) (Gianni et al. Functional imaging and modeling of the heart. Springer, Heidelberg, 2009; Gianni et al. Phil Trans Ser A Math Phys Eng Sci 368:3039-3056, 2010) which both facilitate a database-centric approach to collaboration, and also extends this framework with new capabilities for creating new mesh data. AMDB currently stores cardiac geometric models described in Gianni et al. (Functional imaging and modelling of the heart. Springer, Heidelberg, 2009), a number of additional cardiac models describing geometry and functional properties, and most recently models generated using a web service. The functional models represent data from simulations in geometric form, such as electrophysiology or mechanics, many of which are present in AMDB as part of a benchmark study. Finally, the heartgen service has been added for producing left or bi-ventricle models derived from binary image data using the methods described in Lamata et al. (Med Image Anal 15:801-813, 2011). The results can optionally be hosted on AMDB alongside other community-provided anatomical models. AMDB is, therefore, a unique database storing geometric data (rather than abstract models or image data) combined with a powerful web service for generating new geometric models.

  12. An Access Path Model for Physical Database Design.

    DTIC Science & Technology

    1979-12-28

    target system. 4.1 Algebraic Structure for Physical Design For the purposes of implementation-oriented design, we shall use the logical access paths ... subsection, we present an algorithm for generating a maximal labelling that specifies superior support for the access paths most heavily travelled. Assume ... A.C.M. SIGMOD Conf., (May 79). [CARD73] Cardenas, A. F., "Evaluation and Selection of File Organization - A Model and a System," Comm. A.C.M., V 16, N

  13. Cascade fuzzy ART: a new extensible database for model-based object recognition

    NASA Astrophysics Data System (ADS)

    Hung, Hai-Lung; Liao, Hong-Yuan M.; Lin, Shing-Jong; Lin, Wei-Chung; Fan, Kuo-Chin

    1996-02-01

    In this paper, we propose a cascade fuzzy ART (CFART) neural network which can be used as an extensible database in a model-based object recognition system. The proposed CFART network accepts both binary and continuous inputs. Moreover, it preserves the prominent characteristics of a fuzzy ART network and extends fuzzy ART's capability toward a hierarchical class representation of input patterns. The learning processes of the proposed network are unsupervised and self-organizing, and include coupled top-down searching and bottom-up learning processes. In addition, a global search tree is built to speed up the learning and recognition processes.
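    For readers unfamiliar with the fuzzy ART building block that the cascade extends, a minimal sketch of its category-matching step (choice function plus vigilance test) follows. Parameter names alpha and rho follow the standard fuzzy ART formulation; a complement-coded input vector is assumed.

    ```python
    # Standard fuzzy ART matching: best choice value, then vigilance check.
    import numpy as np

    def fuzzy_art_match(I, weights, alpha=0.001, rho=0.75):
        """I: complement-coded input vector; weights: list of category weights.
        Returns the index of the resonating category, or None (novel input)."""
        choice = [np.minimum(I, w).sum() / (alpha + w.sum()) for w in weights]
        for j in np.argsort(choice)[::-1]:          # try categories best-first
            if np.minimum(I, weights[j]).sum() / I.sum() >= rho:  # vigilance
                return j
        return None                                 # no category resonates
    ```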

  14. Global and Regional Ecosystem Modeling: Databases of Model Drivers and Validation Measurements

    SciTech Connect

    Olson, R.J.

    2002-03-19

    …-grid cells for which inventory, modeling, or remote-sensing tools were used to scale up the point measurements. Documentation of the content and organization of the EMDI databases is provided.

  15. Generic models of deep formation water calculated with PHREEQC using the "gebo"-database

    NASA Astrophysics Data System (ADS)

    Bozau, E.; van Berk, W.

    2012-04-01

    To identify processes during the use of formation waters for geothermal energy production, an extended hydrogeochemical thermodynamic database (named the "gebo" database) for the well-known and commonly used software PHREEQC has been developed by collecting and inserting data from the literature. The following solution master species are added to the database "pitzer.dat", which is provided with the code PHREEQC: Fe(+2), Fe(+3), S(-2), C(-4), Si, Zn, Pb, and Al. According to the solution master species, the necessary solution species and phases (solid phases and gases) are implemented. Furthermore, temperature and pressure adaptations of the mass-action-law constants, Pitzer parameters for the calculation of activity coefficients in waters of high ionic strength, and solubility equilibria among gaseous and aqueous species of CO2, methane, and hydrogen sulphide are implemented in the "gebo" database. Combined with the "gebo" database, the code PHREEQC can be used to test the behaviour of highly concentrated solutions (e.g., formation waters, brines). Chemical changes caused by temperature and pressure gradients, as well as by exposure of the water to the atmosphere and technical equipment, can be modelled. To check the plausibility of additional and adapted data/parameters, experimental solubility data from the literature (e.g., for sulfate and carbonate minerals) are compared to modelled mineral solubilities at elevated levels of Total Dissolved Solids (TDS), temperature, and pressure. First results show good matches between modelled and experimental mineral solubility for barite, celestite, anhydrite, and calcite in high-TDS waters, indicating the plausibility of the additional and adapted data and parameters. Furthermore, chemical parameters of geothermal wells in the North German Basin are used to test the "gebo" database. The analysed water composition (starting with the main cations and anions) is calculated by thermodynamic equilibrium reactions of pure water with the minerals found in
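    As an illustration of the temperature adaptation such a database must encode, here is a minimal van 't Hoff adjustment of a log K value away from 25 °C, assuming a temperature-independent reaction enthalpy; the numbers are hypothetical, not "gebo" entries.

    ```python
    # van 't Hoff shift of an equilibrium constant away from 25 degC.
    import math

    R = 8.314  # gas constant, J/(mol*K)

    def log_k_at_t(log_k25: float, delta_h: float, t_celsius: float) -> float:
        """ln K(T) = ln K(298.15 K) - (dH/R) * (1/T - 1/298.15)."""
        t = t_celsius + 273.15
        ln_k = log_k25 * math.log(10) - (delta_h / R) * (1 / t - 1 / 298.15)
        return ln_k / math.log(10)

    # Hypothetical dissolution reaction, dH = -20 kJ/mol, evaluated at 80 degC.
    print(log_k_at_t(-8.48, -20_000.0, 80.0))
    ```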

  16. Spatial Models for Architectural Heritage in Urban Database Context

    NASA Astrophysics Data System (ADS)

    Costamagna, E.; Spanò, A.

    2011-08-01

    Although GIS (Geographic Information Systems/Geospatial Information Systems) provide several applications to manage two-dimensional geometric information and arrange the topological relations among different spatial primitives, most of these systems have limited capabilities for managing three-dimensional space. Other tools, such as CAD systems, have already achieved a full capability of representing 3D data. Much of the research in the field of GIS has underlined the necessity of a full 3D management capability, which is not yet achieved by the available systems (Rahman, Pilouk 2008) (Zlatanova 2002). To reach this goal, it is first important to define the spatial data model, which is at the same time a geometric and topological model, integrating these two aspects in relation to database management efficiency and documentation purposes. The application field on which this model can be tested is the management of spatial data for Architectural Heritage documentation, to evaluate the pertinence of such spatial models at the scale required for the needs of this documentation. Among the most important aspects are the integration of metric data originating from different sources and the representation and management of multiscale data. The issues connected with the representation of objects at higher LODs than those defined by CityGML will be taken into account. The aim of this paper is then to investigate the favorable applications of a framework that integrates two different approaches: architectural heritage spatial documentation and urban-scale spatial data management.

  17. GIS-based hydrogeological databases and groundwater modelling

    NASA Astrophysics Data System (ADS)

    Gogu, Radu Constantin; Carabin, Guy; Hallet, Vincent; Peters, Valerie; Dassargues, Alain

    2001-12-01

    Reliability and validity of groundwater analysis strongly depend on the availability of large volumes of high-quality data. Putting all data into a coherent and logical structure supported by a computing environment helps ensure validity and availability and provides a powerful tool for hydrogeological studies. A hydrogeological geographic information system (GIS) database that offers facilities for groundwater-vulnerability analysis and hydrogeological modelling has been designed in Belgium for the Walloon region. Data from five river basins, chosen for their contrasting hydrogeological characteristics, have been included in the database, and a set of applications that have been developed now allow further advances. Interest is growing in the potential for integrating GIS technology and groundwater simulation models. A "loose-coupling" tool was created between the spatial-database scheme and the groundwater numerical model interface GMS (Groundwater Modelling System). Following time and spatial queries, the hydrogeological data stored in the database can be easily used within different groundwater numerical models.

  18. Incorporation of the CrossFire Beilstein Database into the Organic Chemistry Curriculum at the Royal Danish School of Pharmacy

    NASA Astrophysics Data System (ADS)

    Brøgger Christensen, S.; Franzyk, Henrik; Frølund, Bente; Jaroszewski, Jerzy W.; Stærk, Dan; Vedsø, Per

    2002-06-01

    The CrossFire Beilstein database has been incorporated into the organic chemistry curriculum at the Royal Danish School of Pharmacy as a powerful pedagogic tool. During a laboratory course in organic synthesis the database enables the students to get comprehensive overviews of known synthetic methods for a given compound. During a laboratory course in identification and as a part of an applied course in organic spectroscopy the students use the database for obtaining lists of all recorded isomeric compounds, facilitating an exhaustive identification. The main entrances for identification purposes are molecular formulas deduced either from titrations or from mass spectra combined with partial structures identified by chemical tests, or by interpretation of spectra. Thus, identifications made using the CrossFire Beilstein database will exclude some possibilities and point to correct structures from a selection of existing compounds. This appears to help the learning process considerably.

  19. Data model and relational database design for the New England Water-Use Data System (NEWUDS)

    USGS Publications Warehouse

    Tessler, Steven

    2001-01-01

    The New England Water-Use Data System (NEWUDS) is a database for the storage and retrieval of water-use data. NEWUDS can handle data covering many facets of water use, including (1) tracking various types of water-use activities (withdrawals, returns, transfers, distributions, consumptive-use, wastewater collection, and treatment); (2) the description, classification and location of places and organizations involved in water-use activities; (3) details about measured or estimated volumes of water associated with water-use activities; and (4) information about data sources and water resources associated with water use. In NEWUDS, each water transaction occurs unidirectionally between two site objects, and the sites and conveyances form a water network. The core entities in the NEWUDS model are site, conveyance, transaction/rate, location, and owner. Other important entities include water resources (used for withdrawals and returns), data sources, and aliases. Multiple water-exchange estimates can be stored for individual transactions based on different methods or data sources. Storage of user-defined details is accommodated for several of the main entities. Numerous tables containing classification terms facilitate detailed descriptions of data items and can be used for routine or custom data summarization. NEWUDS handles single-user and aggregate-user water-use data, can be used for large or small water-network projects, and is available as a stand-alone Microsoft Access database structure. Users can customize and extend the database, link it to other databases, or implement the design in other relational database applications.
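    A compressed, hypothetical sketch of the core entities named above (site, conveyance, transaction/rate, location, owner) as SQLite DDL; the actual NEWUDS table definitions are richer and differ in detail.

    ```python
    # Core-entity sketch of a NEWUDS-like schema; names are illustrative.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE owner     (owner_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE location  (location_id INTEGER PRIMARY KEY, town TEXT, state TEXT);
    CREATE TABLE site      (site_id INTEGER PRIMARY KEY, site_type TEXT,
                            owner_id INTEGER REFERENCES owner,
                            location_id INTEGER REFERENCES location);
    CREATE TABLE conveyance(conveyance_id INTEGER PRIMARY KEY,
                            from_site INTEGER REFERENCES site,
                            to_site   INTEGER REFERENCES site);
    -- Each water transaction moves one way along a conveyance between two sites.
    CREATE TABLE water_transaction(
        transaction_id INTEGER PRIMARY KEY,
        conveyance_id  INTEGER REFERENCES conveyance,
        activity_type  TEXT,      -- withdrawal, return, transfer, ...
        volume_mgd     REAL,      -- measured or estimated rate
        data_source    TEXT);
    """)
    ```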

  20. Filling a missing link between biogeochemical, climate and ecosystem studies: a global database of atmospheric water-soluble organic nitrogen

    NASA Astrophysics Data System (ADS)

    Cornell, Sarah

    2015-04-01

    It is time to collate a global community database of atmospheric water-soluble organic nitrogen deposition. Organic nitrogen (ON) has long been known to be globally ubiquitous in atmospheric aerosol and precipitation, with implications for air and water quality, climate, biogeochemical cycles, ecosystems and human health. The number of studies of atmospheric ON deposition has increased steadily in recent years, but to date there is no accessible global dataset, for either bulk ON or its major components. Improved qualitative and quantitative understanding of the organic nitrogen component is needed to complement the well-established knowledge base pertaining to other components of atmospheric deposition (cf. Vet et al 2014). Without this basic information, we are increasingly constrained in addressing the current dynamics and potential interactions of atmospheric chemistry, climate and ecosystem change. To see the full picture we need global data synthesis, more targeted data gathering, and models that let us explore questions about the natural and anthropogenic dynamics of atmospheric ON. Collectively, our research community already has a substantial amount of atmospheric ON data. Published reports extend back over a century and now have near-global coverage. However, datasets available from the literature are very piecemeal and too often lack crucially important information that would enable aggregation or re-use. I am initiating an open collaborative process to construct a community database, so we can begin to systematically synthesize these datasets (generally from individual studies at a local and temporally limited scale) to increase their scientific usability and statistical power for studies of global change and anthropogenic perturbation. In drawing together our disparate knowledge, we must address various challenges and concerns, not least about the comparability of analysis and sampling methodologies, and the known complexity of composition of ON. We

  1. The Mouse Genome Database: Genotypes, Phenotypes, and Models of Human Disease

    PubMed Central

    Bult, Carol J.; Eppig, Janan T.; Blake, Judith A.; Kadin, James A.; Richardson, Joel E.

    2013-01-01

    The laboratory mouse is the premier animal model for studying human biology because all life stages can be accessed experimentally, a completely sequenced reference genome is publicly available and there exists a myriad of genomic tools for comparative and experimental research. In the current era of genome scale, data-driven biomedical research, the integration of genetic, genomic and biological data are essential for realizing the full potential of the mouse as an experimental model. The Mouse Genome Database (MGD; http://www.informatics.jax.org), the community model organism database for the laboratory mouse, is designed to facilitate the use of the laboratory mouse as a model system for understanding human biology and disease. To achieve this goal, MGD integrates genetic and genomic data related to the functional and phenotypic characterization of mouse genes and alleles and serves as a comprehensive catalog for mouse models of human disease. Recent enhancements to MGD include the addition of human ortholog details to mouse Gene Detail pages, the inclusion of microRNA knockouts to MGD’s catalog of alleles and phenotypes, the addition of video clips to phenotype images, providing access to genotype and phenotype data associated with quantitative trait loci (QTL) and improvements to the layout and display of Gene Ontology annotations. PMID:23175610

  2. The mouse genome database: genotypes, phenotypes, and models of human disease.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2013-01-01

    The laboratory mouse is the premier animal model for studying human biology because all life stages can be accessed experimentally, a completely sequenced reference genome is publicly available and there exists a myriad of genomic tools for comparative and experimental research. In the current era of genome scale, data-driven biomedical research, the integration of genetic, genomic and biological data are essential for realizing the full potential of the mouse as an experimental model. The Mouse Genome Database (MGD; http://www.informatics.jax.org), the community model organism database for the laboratory mouse, is designed to facilitate the use of the laboratory mouse as a model system for understanding human biology and disease. To achieve this goal, MGD integrates genetic and genomic data related to the functional and phenotypic characterization of mouse genes and alleles and serves as a comprehensive catalog for mouse models of human disease. Recent enhancements to MGD include the addition of human ortholog details to mouse Gene Detail pages, the inclusion of microRNA knockouts to MGD's catalog of alleles and phenotypes, the addition of video clips to phenotype images, providing access to genotype and phenotype data associated with quantitative trait loci (QTL) and improvements to the layout and display of Gene Ontology annotations.

  3. Geospatial Database for Strata Objects Based on Land Administration Domain Model (ladm)

    NASA Astrophysics Data System (ADS)

    Nasorudin, N. N.; Hassan, M. I.; Zulkifli, N. A.; Rahman, A. Abdul

    2016-09-01

    Recently in our country, the construction of buildings has become more complex, and a strata-objects database has become more important for registering the real world, as people now own and use multiple levels of space. Strata titles are increasingly important and need to be well managed. LADM, also known as ISO 19152, is a standard model for land administration that allows integrated 2D and 3D representation of spatial units. The aim of this paper is to develop a strata-objects database using LADM. This paper discusses the current 2D geospatial database and the need for a 3D geospatial database in the future. It also attempts to develop a strata-objects database using a standard data model (LADM) and to analyze the developed database against that model. The current cadastre system in Malaysia, including strata titles, is discussed, the problems of the 2D geospatial database are listed, and the need for a future 3D geospatial database is examined. The design of the strata-objects database proceeds through conceptual, logical, and physical stages. The database will allow users to find both non-spatial and spatial strata-title information and thus show the location of a strata unit. This development may help in handling strata titles and the associated information.

  4. A database for estimating organ dose for coronary angiography and brain perfusion CT scans for arbitrary spectra and angular tube current modulation

    SciTech Connect

    Rupcich, Franco; Badal, Andreu; Kyprianou, Iacovos; Schmidt, Taly Gilat

    2012-09-15

    Purpose: The purpose of this study was to develop a database for estimating organ dose in a voxelized patient model for coronary angiography and brain perfusion CT acquisitions with any spectra and angular tube current modulation setting. The database enables organ dose estimation for existing and novel acquisition techniques without requiring Monte Carlo simulations. Methods: The study simulated transport of monoenergetic photons between 5 and 150 keV for 1000 projections over 360° through anthropomorphic voxelized female chest and head (0° and 30° tilt) phantoms and standard head and body CTDI dosimetry cylinders. The simulations resulted in tables of normalized dose deposition for several radiosensitive organs, quantifying the organ dose per emitted photon for each incident photon energy and projection angle for coronary angiography and brain perfusion acquisitions. The values in a table can be multiplied by an incident spectrum and the number of photons at each projection angle and then summed across all energies and angles to estimate total organ dose. Scanner-specific organ dose may be approximated by normalizing the database-estimated organ dose by the database-estimated CTDIvol and multiplying by a physical CTDIvol measurement. Two examples are provided demonstrating how to use the tables to estimate relative organ dose. In the first, the change in breast and lung dose during coronary angiography CT scans is calculated for reduced-kVp, angular tube current modulation, and partial-angle scanning protocols relative to a reference protocol. In the second example, the change in dose to the eye lens is calculated for a brain perfusion CT acquisition in which the gantry is tilted 30° relative to a nontilted scan. Results: Our database provides tables of normalized dose deposition for several radiosensitive organs irradiated during coronary angiography and brain perfusion CT scans. Validation results indicate
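    A sketch of the table-lookup arithmetic the abstract describes: the per-photon dose table is weighted by the spectrum and by the per-projection photon counts, then summed over both axes. Array shapes and values are placeholders, not data from the study.

    ```python
    # Organ dose = sum over energies and angles of table * spectrum * photons.
    import numpy as np

    n_energies, n_views = 146, 1000                   # 5-150 keV, 1000 projections
    dose_table = np.random.rand(n_energies, n_views)  # normalized dose per photon
    spectrum = np.random.rand(n_energies)             # photons per energy bin
    tube_current = np.ones(n_views)                   # angular modulation weights

    organ_dose = np.einsum("ev,e,v->", dose_table, spectrum, tube_current)

    # Scanner-specific scaling, per the abstract:
    # dose_scanner = organ_dose / ctdi_vol_database * ctdi_vol_measured
    print(organ_dose)
    ```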

  5. Avibase – a database system for managing and organizing taxonomic concepts

    PubMed Central

    Lepage, Denis; Vaidya, Gaurav; Guralnick, Robert

    2014-01-01

    Abstract Scientific names of biological entities offer an imperfect resolution of the concepts that they are intended to represent. Often they are labels applied to entities ranging from entire populations to individual specimens representing those populations, even though such names only unambiguously identify the type specimen to which they were originally attached. Thus the real-life referents of names are constantly changing as biological circumscriptions are redefined and thereby alter the sets of individuals bearing those names. This problem is compounded by other characteristics of names that make them ambiguous identifiers of biological concepts, including emendations, homonymy and synonymy. Taxonomic concepts have been proposed as a way to address issues related to scientific names, but they have yet to receive broad recognition or implementation. Some efforts have been made towards building systems that address these issues by cataloguing and organizing taxonomic concepts, but most are still in conceptual or proof-of-concept stage. We present the on-line database Avibase as one possible approach to organizing taxonomic concepts. Avibase has been successfully used to describe and organize 844,000 species-level and 705,000 subspecies-level taxonomic concepts across every major bird taxonomic checklist of the last 125 years. The use of taxonomic concepts in place of scientific names, coupled with efficient resolution services, is a major step toward addressing some of the main deficiencies in the current practices of scientific name dissemination and use. PMID:25061375

  6. Models for financial sustainability of biological databases and resources

    PubMed Central

    Chandras, Christina; Weaver, Thomas; Zouberakis, Michael; Smedley, Damian; Schughart, Klaus; Rosenthal, Nadia; Hancock, John M.; Kollias, George; Aidinis, Vassilis

    2009-01-01

    Following the technological advances that have enabled genome-wide analysis in most model organisms over the last decade, there has been unprecedented growth in genomic and post-genomic science with concomitant generation of an exponentially increasing volume of data and material resources. As a result, numerous repositories have been created to store and archive data, organisms and material, which are of substantial value to the whole community. Sustained access, facilitating re-use of these resources, is essential, not only for validation, but for re-analysis, testing of new hypotheses and developing new technologies/platforms. A common challenge for most data resources and biological repositories today is finding financial support for maintenance and development to best serve the scientific community. In this study we examine the problems that currently confront the data and resource infrastructure underlying the biomedical sciences. We discuss the financial sustainability issues and potential business models that could be adopted by biological resources and consider long term preservation issues within the context of mouse functional genomics efforts in Europe. PMID:20157490

  7. Demonstration of SLUMIS: a clinical database and management information system for a multi organ transplant program.

    PubMed Central

    Kurtz, M.; Bennett, T.; Garvin, P.; Manuel, F.; Williams, M.; Langreder, S.

    1991-01-01

    Because of the rapid evolution of the heart, heart/lung, liver, kidney and kidney/pancreas transplant programs at our institution, and because of a lack of an existing comprehensive database, we were required to develop a computerized management information system capable of supporting both clinical and research requirements of a multifaceted transplant program. SLUMIS (ST. LOUIS UNIVERSITY MULTI-ORGAN INFORMATION SYSTEM) was developed for the following reasons: 1) to comply with the reporting requirements of various transplant registries, 2) for reporting to an increasing number of government agencies and insurance carriers, 3) to obtain updates of our operative experience at regular intervals, 4) to integrate the Histocompatibility and Immunogenetics Laboratory (HLA) for online test result reporting, and 5) to facilitate clinical investigation. PMID:1807741

  8. Expanding on Successful Concepts, Models, and Organization

    SciTech Connect

    Teeguarden, Justin G.; Tan, Yu-Mei; Edwards, Stephen W.; Leonard, Jeremy A.; Anderson, Kim A.; Corley, Richard A.; Kile, Molly L.; L. Massey Simonich, Staci; Stone, David; Tanguay, Robert L.; Waters, Katrina M.; Harper, Stacey L.; Williams, David E.

    2016-09-06

    In her letter to the editor1 regarding our recent Feature Article “Completing the Link between Exposure Science and Toxicology for Improved Environmental Health Decision Making: The Aggregate Exposure Pathway Framework”2, Dr. von Göetz expressed several concerns about terminology and the perception that we propose replacing successful approaches and models for exposure assessment with a concept. We are glad to have the opportunity to address these issues here. If the goal of the AEP framework were to replace existing exposure models or databases for organizing exposure data with a concept, we would share Dr. von Göetz's concerns. Instead, the outcome we promote is broader use of an organizational framework for exposure science. The framework would support improved generation, organization, and interpretation of data, as well as modeling and prediction, not the replacement of models. The field of toxicology has seen the benefits of wide use of one or more organizational frameworks (e.g., mode and mechanism of action, adverse outcome pathway). These frameworks influence how experiments are designed; how data are collected, curated, stored, and interpreted; and ultimately how data are used in risk assessment. Exposure science is poised to benefit similarly from broader use of a parallel organizational framework which, as Dr. von Göetz correctly points out, is currently used in the exposure modeling community. In our view, the concepts used so effectively in the exposure modeling community, expanded upon in the AEP framework, could see wider adoption by the field as a whole. The value of such a framework was recognized by the National Academy of Sciences.3 Replacement of models, databases, or any application with the AEP framework was not proposed in our article. The positive role that broader, more consistent use of such a framework might have in enabling and advancing “general activities such as data acquisition, organization…,” and exposure modeling was discussed

  9. MOAtox: A comprehensive mode of action and acute aquatic toxicity database for predictive model development.

    PubMed

    Barron, M G; Lilavois, C R; Martin, T M

    2015-04-01

    The mode of toxic action (MOA) has been recognized as a key determinant of chemical toxicity and as an alternative to chemical class-based predictive toxicity modeling. However, the development of quantitative structure activity relationship (QSAR) and other models has been limited by the availability of comprehensive high quality MOA and toxicity databases. The current study developed a dataset of MOA assignments for 1213 chemicals that included a diversity of metals, pesticides, and other organic compounds that encompassed six broad and 31 specific MOAs. MOA assignments were made using a combination of high confidence approaches that included international consensus classifications, QSAR predictions, and weight of evidence professional judgment based on an assessment of structure and literature information. A toxicity database of 674 acute values linked to chemical MOA was developed for fish and invertebrates. Additionally, species-specific measured or high confidence estimated acute values were developed for the four aquatic species with the most reported toxicity values: rainbow trout (Oncorhynchus mykiss), fathead minnow (Pimephales promelas), bluegill (Lepomis macrochirus), and the cladoceran (Daphnia magna). Measured acute toxicity values met strict standardization and quality assurance requirements. Toxicity values for chemicals with missing species-specific data were estimated using established interspecies correlation models and procedures (Web-ICE; http://epa.gov/ceampubl/fchain/webice/), with the highest confidence values selected. The resulting dataset of MOA assignments and paired toxicity values are provided in spreadsheet format as a comprehensive standardized dataset available for predictive aquatic toxicology model development.

  10. Representativeness of the Traumatic Brain Injury Model Systems National Database

    PubMed Central

    Corrigan, John D.; Cuthbert, Jeffrey P; Whiteneck, Gale G.; Dijkers, Marcel P.; Coronado, Victor; Heinemann, Allen W.; Harrison-Felix, Cynthia; Graham, James E.

    2012-01-01

    Objective To determine whether the Traumatic Brain Injury Model Systems National Database (TBIMS-NDB) is representative of individuals aged 16 years and older admitted for acute, inpatient rehabilitation in the United States with a primary diagnosis of traumatic brain injury (TBI). Design Secondary analysis of existing datasets. Setting Acute inpatient rehabilitation facilities. Participants Patients 16 years of age and older with a primary rehabilitation diagnosis of TBI. Interventions None. Main Outcome Measure demographic characteristics, functional status and hospital length of stay. Results From October 2001 through December 2007 patients included in the TBIMS-NDB were largely representative of all individuals 16 years and older admitted for rehabilitation in the U.S. with a primary diagnosis of TBI. The major difference in distribution was age—the TBIMS-NDB cohort did not include as many patients over age 65 as were admitted for rehabilitation with a primary diagnosis of TBI in the United States. Distributional differences for age-related characteristics were observed; however, groups of patients partitioned at age 65 differed minimally, especially the under 65 subset. Regardless of age, the proportion of patients with a rehabilitation stay of 1-9 days was larger nationwide. Nationwide admissions showed an age distribution similar to patients discharged alive from acute care with moderate, severe or penetrating TBI. The proportion of patients age 70 and older admitted for TBI rehabilitation in the United States increased every year, a trend that was not evident in the general population, TBIMS-NDB or among TBI patients in acute care. Conclusions These results provide substantial empirical evidence that the TBIMS-NDB is representative of patients receiving inpatient rehabilitation for TBI in the U.S. Researchers utilizing the TBIMS-NDB may want to adjust statistically for the lower percentage of patients over age 65 or those with stays less than 10 days

  11. Inorganic bromine in organic molecular crystals: Database survey and four case studies

    NASA Astrophysics Data System (ADS)

    Nemec, Vinko; Lisac, Katarina; Stilinović, Vladimir; Cinčić, Dominik

    2017-01-01

    We present a Cambridge Structural Database (CSD) and experimental study of multicomponent molecular crystals containing bromine. The CSD study covers the supramolecular behaviour of bromide and tribromide anions, as well as halogen-bonded dibromine molecules, in crystal structures of organic salts and cocrystals, and a study of the geometries and complexities of polybromide anion systems. In addition, we present four case studies of organic structures with bromide, tribromide and polybromide anions as well as the neutral dibromine molecule. These include the first observed crystal with diprotonated phenazine, a double salt of phenazinium bromide and tribromide, a cocrystal of 4-methoxypyridine with the neutral dibromine molecule as a halogen bond donor, as well as bis(4-methoxypyridine)bromonium polybromide. Structural features of the four case studies are for the most part consistent with the statistically prevalent behaviour indicated by the CSD study for the given bromine species, although they do exhibit some unorthodox structural features and thereby indicate possible supramolecular causes for deviations from the statistically most abundant (and presumably most favourable) geometries.

  12. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    SciTech Connect

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K.

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classifying gene functions relevant to environmental microorganisms, based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e., 'profiles') were tailored to a large group of target KEGG Orthologs (KOs), from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and their hierarchy. FOAM allows the user to select the target search space before the HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
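    A downstream-use sketch, under the assumption that reads are screened against the FOAM HMMs with HMMER and the results are written in its whitespace-delimited --tblout format; the function keeps the best-scoring model per read. This is generic HMMER post-processing, not part of FOAM itself.

    ```python
    # Keep the best-scoring HMM hit per read from an hmmscan --tblout file.
    def best_hits(tblout_path: str, max_evalue: float = 1e-5) -> dict:
        """Return {read id: (HMM name, bit score)}; '#' lines are comments."""
        best = {}
        with open(tblout_path) as fh:
            for line in fh:
                if line.startswith("#"):
                    continue
                cols = line.split()
                target, query = cols[0], cols[2]          # HMM, read
                evalue, score = float(cols[4]), float(cols[5])
                if evalue <= max_evalue and (
                        query not in best or score > best[query][1]):
                    best[query] = (target, score)
        return best
    ```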

  13. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  15. PCoM-DB Update: A Protein Co-Migration Database for Photosynthetic Organisms.

    PubMed

    Takabayashi, Atsushi; Takabayashi, Saeka; Takahashi, Kaori; Watanabe, Mai; Uchida, Hiroko; Murakami, Akio; Fujita, Tomomichi; Ikeuchi, Masahiko; Tanaka, Ayumi

    2016-12-22

    The identification of protein complexes is important for the understanding of protein structure and function and the regulation of cellular processes. We used blue-native PAGE and tandem mass spectrometry to identify protein complexes systematically, and built a web database, the protein co-migration database (PCoM-DB, http://pcomdb.lowtem.hokudai.ac.jp/proteins/top), to provide prediction tools for protein complexes. PCoM-DB provides migration profiles for any given protein of interest, and allows users to compare them with migration profiles of other proteins, showing the oligomeric states of proteins and thus identifying potential interaction partners. The initial version of PCoM-DB (launched in January 2013) included protein complex data for Synechocystis whole cells and Arabidopsis thaliana thylakoid membranes. Here we report PCoM-DB version 2.0, which includes new data sets and analytical tools. Additional data are included from whole cells of the pelagic marine picocyanobacterium Prochlorococcus marinus, the thermophilic cyanobacterium Thermosynechococcus elongatus, the unicellular green alga Chlamydomonas reinhardtii and the bryophyte Physcomitrella patens. The Arabidopsis protein data now include data for intact mitochondria, intact chloroplasts, chloroplast stroma and chloroplast envelopes. The new tools comprise a multiple-protein search form and a heat map viewer for protein migration profiles. Users can compare migration profiles of a protein of interest among different organelles or compare migration profiles among different proteins within the same sample. For Arabidopsis proteins, users can compare migration profiles of a protein of interest with putative homologous proteins from non-Arabidopsis organisms. The updated PCoM-DB will help researchers find novel protein complexes and estimate their evolutionary changes in the green lineage.
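
    The co-migration idea lends itself to a small illustration. The sketch below, with invented abundance profiles and an arbitrary 0.9 correlation cutoff, shows how migration profiles across gel slices could be compared to propose complex partners; it is not PCoM-DB code.

    ```python
    # Proteins whose abundance profiles across BN-PAGE gel slices correlate
    # strongly are candidate complex partners. Profile values are made up.
    import numpy as np

    profiles = {               # protein id -> abundance across 10 gel slices
        "PsbA": np.array([0, 1, 5, 9, 4, 1, 0, 0, 0, 0], float),
        "PsbD": np.array([0, 1, 4, 8, 5, 1, 0, 0, 0, 0], float),
        "RbcL": np.array([0, 0, 0, 0, 1, 3, 7, 9, 3, 0], float),
    }

    def comigration_partners(query, cutoff=0.9):
        q = profiles[query]
        hits = []
        for name, p in profiles.items():
            if name == query:
                continue
            r = np.corrcoef(q, p)[0, 1]   # Pearson correlation of profiles
            if r >= cutoff:
                hits.append((name, round(float(r), 3)))
        return sorted(hits, key=lambda t: -t[1])

    print(comigration_partners("PsbA"))   # PsbD co-migrates; RbcL does not
    ```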

  16. Functional Analysis and Discovery of Microbial Genes Transforming Metallic and Organic Pollutants: Database and Experimental Tools

    SciTech Connect

    Lawrence P. Wackett; Lynda B.M. Ellis

    2004-12-09

    Microbial functional genomics is faced with a burgeoning list of genes which are denoted as unknown or hypothetical for lack of any knowledge about their function. The majority of microbial genes encode enzymes. Enzymes are the catalysts of metabolism: catabolism, anabolism, stress responses, and many other cell functions. A major problem facing microbial functional genomics is proposed here to derive from the breadth of microbial metabolism, much of which remains undiscovered. The breadth of microbial metabolism has been surveyed by the PIs and represented according to reaction types on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD): http://umbbd.ahc.umn.edu/search/FuncGrps.html. The database depicts metabolism of 49 chemical functional groups, representing most of current knowledge. Twice that number of chemical groups are proposed here to be metabolized by microbes. Thus, at least 50% of the unique biochemical reactions catalyzed by microbes remain undiscovered. This further suggests that many unknown and hypothetical genes encode functions yet undiscovered. This gap will be partly filled by the current proposal. The UM-BBD will be greatly expanded as a resource for microbial functional genomics. Computational methods will be developed to predict microbial metabolism which is not yet discovered. Moreover, a concentrated effort to discover new microbial metabolism will be conducted. The research will focus on metabolism of direct interest to DOE, dealing with the transformation of metals, metalloids, organometallics and toxic organics. This is precisely the type of metabolism which has been characterized most poorly to date. Moreover, these studies will directly impact functional genomic analysis of DOE-relevant genomes.

  17. Solid Waste Projection Model: Database (Version 1.3). Technical reference manual

    SciTech Connect

    Blackburn, C.L.

    1991-11-01

    The Solid Waste Projection Model (SWPM) system is an analytical tool developed by Pacific Northwest Laboratory (PNL) for Westinghouse Hanford Company (WHC). The SWPM system provides a modeling and analysis environment that supports decisions in the process of evaluating various solid waste management alternatives. This document, one of a series describing the SWPM system, contains detailed information regarding the software and data structures utilized in developing the SWPM Version 1.3 Database. This document is intended for use by experienced database specialists and supports database maintenance, utility development, and database enhancement.

  18. Bayesian statistical modeling of disinfection byproduct (DBP) bromine incorporation in the ICR database.

    PubMed

    Francis, Royce A; Vanbriesen, Jeanne M; Small, Mitchell J

    2010-02-15

    Statistical models are developed for bromine incorporation in the trihalomethane (THM), trihaloacetic acids (THAA), dihaloacetic acid (DHAA), and dihaloacetonitrile (DHAN) subclasses of disinfection byproducts (DBPs) using distribution system samples from plants applying only free chlorine as a primary or residual disinfectant in the Information Collection Rule (ICR) database. The objective of this study is to characterize the effect of water quality conditions before, during, and post-treatment on distribution system bromine incorporation into DBP mixtures. Bayesian Markov Chain Monte Carlo (MCMC) methods are used to model individual DBP concentrations and estimate the coefficients of the linear models used to predict the bromine incorporation fraction for distribution system DBP mixtures in each of the four priority DBP classes. The bromine incorporation models achieve good agreement with the data. The most important predictors of bromine incorporation fraction across DBP classes are alkalinity, specific UV absorption (SUVA), and the bromide to total organic carbon ratio (Br:TOC) at the first point of chlorine addition. Free chlorine residual in the distribution system, distribution system residence time, distribution system pH, turbidity, and temperature only slightly influence bromine incorporation. The bromide to applied chlorine (Br:Cl) ratio is not a significant predictor of the bromine incorporation fraction (BIF) in any of the four classes studied. These results indicate that removal of natural organic matter and the location of chlorine addition are important treatment decisions that have substantial implications for bromine incorporation into disinfection byproduct in drinking waters.
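
    To make the modeling approach concrete, here is a toy random-walk Metropolis sampler for a linear bromine-incorporation model on synthetic data. The predictors, prior, and noise level are invented; the actual study fit richer hierarchical models to the ICR samples.

    ```python
    # Toy MCMC for BIF ~ b0 + b1*(Br:TOC) + b2*SUVA + noise, synthetic data.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    X = np.column_stack([np.ones(n), rng.uniform(0, 2, n), rng.uniform(1, 5, n)])
    beta_true = np.array([0.05, 0.30, -0.02])
    y = X @ beta_true + rng.normal(0, 0.02, n)

    def log_post(beta, sigma=0.02):
        # Gaussian likelihood plus a weak Gaussian prior on the coefficients.
        resid = y - X @ beta
        return -0.5 * np.sum(resid**2) / sigma**2 - 0.5 * np.sum(beta**2) / 10.0

    beta, lp, samples = np.zeros(3), None, []
    lp = log_post(beta)
    for step in range(20000):
        prop = beta + rng.normal(0, 0.01, 3)      # random-walk proposal
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis accept/reject
            beta, lp = prop, lp_prop
        if step > 5000 and step % 10 == 0:        # burn-in, then thin
            samples.append(beta.copy())

    print(np.mean(samples, axis=0))               # posterior means near beta_true
    ```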

  19. Solid Waste Projection Model: Database (Version 1.4). Technical reference manual

    SciTech Connect

    Blackburn, C.; Cillan, T.

    1993-09-01

    The Solid Waste Projection Model (SWPM) system is an analytical tool developed by Pacific Northwest Laboratory (PNL) for Westinghouse Hanford Company (WHC). The SWPM system provides a modeling and analysis environment that supports decisions in the process of evaluating various solid waste management alternatives. This document, one of a series describing the SWPM system, contains detailed information regarding the software and data structures utilized in developing the SWPM Version 1.4 Database. This document is intended for use by experienced database specialists and supports database maintenance, utility development, and database enhancement. Those interested in using the SWPM database should refer to the SWPM Database User's Guide. This document is available from the PNL Task M Project Manager (D. L. Stiles, 509-372-4358), the PNL Task L Project Manager (L. L. Armacost, 509-372-4304), the WHC Restoration Projects Section Manager (509-372-1443), or the WHC Waste Characterization Manager (509-372-1193).

  20. Modeling BVOC isoprene emissions based on a GIS and remote sensing database

    NASA Astrophysics Data System (ADS)

    Wong, Man Sing; Sarker, Md. Latifur Rahman; Nichol, Janet; Lee, Shun-cheng; Chen, Hongwei; Wan, Yiliang; Chan, P. W.

    2013-04-01

    This paper presents a geographic information systems (GIS) model relating biogenic volatile organic compound (BVOC) isoprene emissions to ecosystem type, as well as to environmental drivers such as light intensity, temperature, landscape factor and foliar density. Data and techniques have recently become available which permit new, improved estimates of isoprene emissions over Hong Kong. The techniques are based on Guenther et al.'s (1993, 1999) model. Isoprene emissions were mapped over Hong Kong at a spatially detailed resolution of 100 m, and a database was constructed for retrieval of the isoprene maps from February 2007 to January 2008. This approach assigns emission rates directly to ecosystem types rather than to individual species, since unlike in temperate regions, where one or two species may dominate over large areas, Hong Kong's vegetation is extremely diverse, with up to 300 different species in 1 ha. Field measurements of emissions by canister sampling obtained a range of ambient emissions under different climatic conditions for Hong Kong's main ecosystem types in both urban and rural areas, and these were used for model validation. Results show the model-derived isoprene flux to have high to moderate correlations with field observations (r2 = 0.77 for all 24 field measurements, r2 = 0.63 for the summer subset, and r2 = 0.37 for the winter subset), which indicates the robustness of the approach when applied to tropical forests at a detailed level, as well as the promising role of remote sensing in isoprene mapping. The GIS model and raster database provide a simple, low-cost estimate of BVOC isoprene in Hong Kong at a detailed level. City planners and environmental authorities may use the derived models for estimating isoprene transport and its interaction with anthropogenic pollutants in urban areas.
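
    The light and temperature activity factors from Guenther et al. (1993) that such models build on can be written compactly. The sketch below uses the commonly cited constants from that paper; the basal emission rate, its units and the example inputs are assumptions.

    ```python
    # Guenther et al. (1993) activity factors: E = EF * C_L * C_T, where EF is
    # an ecosystem-specific basal rate (assumed here in umol m-2 h-1).
    import math

    R = 8.314                         # gas constant, J mol-1 K-1
    ALPHA, C_L1 = 0.0027, 1.066       # light-response constants
    C_T1, C_T2 = 95000.0, 230000.0    # temperature-response constants, J mol-1
    T_S, T_M = 303.0, 314.0           # standard and optimum temperatures, K

    def light_factor(par):            # par: PAR, umol photons m-2 s-1
        return ALPHA * C_L1 * par / math.sqrt(1.0 + ALPHA**2 * par**2)

    def temp_factor(t):               # t: leaf temperature, K
        num = math.exp(C_T1 * (t - T_S) / (R * T_S * t))
        den = 1.0 + math.exp(C_T2 * (t - T_M) / (R * T_S * t))
        return num / den

    def isoprene_emission(basal_rate, par, t):
        return basal_rate * light_factor(par) * temp_factor(t)

    # At standard conditions the correction factors are close to 1.
    print(isoprene_emission(8.0, par=1000.0, t=303.0))
    ```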

  1. IDAAPM: integrated database of ADMET and adverse effects of predictive modeling based on FDA approved drug data.

    PubMed

    Legehar, Ashenafi; Xhaard, Henri; Ghemtio, Leo

    2016-01-01

    The disposition of a pharmaceutical compound within an organism, i.e. its Absorption, Distribution, Metabolism, Excretion, Toxicity (ADMET) properties and adverse effects, critically affects late stage failure of drug candidates and has led to the withdrawal of approved drugs. Computational methods are effective approaches to reduce the number of safety issues by analyzing possible links between chemical structures and ADMET or adverse effects, but this is limited by the size, quality, and heterogeneity of the data available from individual sources. Thus, large, clean and integrated databases of approved drug data, associated with fast and efficient predictive tools are desirable early in the drug discovery process. We have built a relational database (IDAAPM) to integrate available approved drug data such as drug approval information, ADMET and adverse effects, chemical structures and molecular descriptors, targets, bioactivity and related references. The database has been coupled with a searchable web interface and modern data analytics platform (KNIME) to allow data access, data transformation, initial analysis and further predictive modeling. Data were extracted from FDA resources and supplemented from other publicly available databases. Currently, the database contains information regarding about 19,226 FDA approval applications for 31,815 products (small molecules and biologics) with their approval history, 2505 active ingredients, together with as many ADMET properties, 1629 molecular structures, 2.5 million adverse effects and 36,963 experimental drug-target bioactivity data. IDAAPM is a unique resource that, in a single relational database, provides detailed information on FDA approved drugs including their ADMET properties and adverse effects, the corresponding targets with bioactivity data, coupled with a data analytics platform. It can be used to perform basic to complex drug-target ADMET or adverse effects analysis and predictive modeling. IDAAPM is

  2. Beyond emotion archetypes: databases for emotion modelling using neural networks.

    PubMed

    Cowie, Roddy; Douglas-Cowie, Ellen; Cox, Cate

    2005-05-01

    There has been rapid development in conceptions of the kind of database that is needed for emotion research. Familiar archetypes are still influential, but the state of the art has moved beyond them. There is concern to capture emotion as it occurs in action and interaction ('pervasive emotion') as well as in short episodes dominated by emotion, and therefore in a range of contexts, which shape the way it is expressed. Context links to modality: different contexts favour different modalities. The strategy of using acted data is not suited to those aims, and has been supplemented by work on both fully natural emotion and emotion induced by various techniques that allow more controlled records. Applications for that kind of work go far beyond the 'trouble shooting' that has been the focus for application: 'really natural language processing' is a key goal. The descriptions included in such a database ideally cover quality, emotional content, emotion-related signals and signs, and context. Several schemes are emerging as candidates for describing pervasive emotion. The major contemporary databases are listed, emphasising those which are naturalistic or induced, multimodal, and influential.

  3. Generating a mortality model from a pediatric ICU (PICU) database utilizing knowledge discovery.

    PubMed Central

    Kennedy, Curtis E.; Aoki, Noriaki

    2002-01-01

    Current models for predicting outcomes are limited by biases inherent in a priori hypothesis generation. Knowledge discovery algorithms generate models directly from databases, minimizing such limitations. Our objective was to generate a mortality model from a PICU database utilizing knowledge discovery techniques. The database contained 5067 records with 192 clinically relevant variables. It was randomly split into training (75%) and validation (25%) groups. We used decision tree induction to generate a mortality model from the training data, and validated its performance on the validation data. The original PRISM algorithm was used for comparison. The decision tree model contained 25 variables and predicted 53/88 deaths; 29 correctly (Sens:33%, Spec:98%, PPV:54%). PRISM predicted 27/88 deaths correctly (Sens:30%, Spec:98%, PPV:51%). Performance difference between models was not significant. We conclude that knowledge discovery algorithms can generate a mortality model from a PICU database, helping establish validity of such tools in the clinical medical domain. PMID:12463850
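
    A hedged re-creation of the general workflow on synthetic data (not the paper's dataset or algorithm settings): a 75/25 split, a depth-limited decision tree, and the same three reported metrics.

    ```python
    # Decision tree induction on a synthetic, highly imbalanced cohort
    # (~2% positives, mimicking PICU mortality rates).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=5067, n_features=192,
                               weights=[0.98], random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                              random_state=0)

    tree = DecisionTreeClassifier(max_depth=6, class_weight="balanced",
                                  random_state=0).fit(X_tr, y_tr)
    pred = tree.predict(X_te)

    tp = int(np.sum((pred == 1) & (y_te == 1)))
    fp = int(np.sum((pred == 1) & (y_te == 0)))
    fn = int(np.sum((pred == 0) & (y_te == 1)))
    tn = int(np.sum((pred == 0) & (y_te == 0)))
    ppv = tp / max(tp + fp, 1)   # guard against zero positive predictions
    print(f"Sens={tp/(tp+fn):.2f} Spec={tn/(tn+fp):.2f} PPV={ppv:.2f}")
    ```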

  4. Developing a comprehensive database management system for organization and evaluation of mammography datasets.

    PubMed

    Wu, Yirong; Rubin, Daniel L; Woods, Ryan W; Elezaby, Mai; Burnside, Elizabeth S

    2014-01-01

    We aimed to design and develop a comprehensive mammography database system (CMDB) to collect clinical datasets for outcome assessment and development of decision support tools. A Health Insurance Portability and Accountability Act (HIPAA) compliant CMDB was created to store multi-relational datasets of demographic risk factors and mammogram results using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. The CMDB collected both biopsy pathology outcomes, in a breast pathology lexicon compiled by extending BI-RADS, and our institutional breast cancer registry. The audit results derived from the CMDB were in accordance with Mammography Quality Standards Act (MQSA) audits and national benchmarks. The CMDB has managed the challenges of multi-level organization demanded by the complexity of mammography practice and lexicon development in pathology. We foresee that the CMDB will be useful for efficient quality assurance audits and development of decision support tools to improve breast cancer diagnosis. Our procedure of developing the CMDB provides a framework to build a detailed data repository for breast imaging quality control and research, which has the potential to augment existing resources.

  5. Developing a Comprehensive Database Management System for Organization and Evaluation of Mammography Datasets

    PubMed Central

    Wu, Yirong; Rubin, Daniel L; Woods, Ryan W; Elezaby, Mai; Burnside, Elizabeth S

    2014-01-01

    We aimed to design and develop a comprehensive mammography database system (CMDB) to collect clinical datasets for outcome assessment and development of decision support tools. A Health Insurance Portability and Accountability Act (HIPAA) compliant CMDB was created to store multi-relational datasets of demographic risk factors and mammogram results using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. The CMDB collected both biopsy pathology outcomes, in a breast pathology lexicon compiled by extending BI-RADS, and our institutional breast cancer registry. The audit results derived from the CMDB were in accordance with Mammography Quality Standards Act (MQSA) audits and national benchmarks. The CMDB has managed the challenges of multi-level organization demanded by the complexity of mammography practice and lexicon development in pathology. We foresee that the CMDB will be useful for efficient quality assurance audits and development of decision support tools to improve breast cancer diagnosis. Our procedure of developing the CMDB provides a framework to build a detailed data repository for breast imaging quality control and research, which has the potential to augment existing resources. PMID:25368510

  6. A Database for Propagation Models and Conversion to C++ Programming Language

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.; Angkasa, Krisjani; Rucker, James

    1996-01-01

    In the past few years, a computer program was produced to contain propagation models and the necessary prediction methods for most propagation phenomena. The propagation model database described here creates a user-friendly environment that makes using the database easy for experienced users and novices alike. The database is designed to pass data through the desired models easily and generate relevant results quickly. The database already contains many of the propagation phenomena models accepted by the propagation community, and new models are added every year. The major sources of the included models are the NASA Propagation Effects Handbook, the International Radio Consultative Committee (CCIR), and publications of organizations such as the Institute of Electrical and Electronics Engineers (IEEE).

  8. Database specification for the Worldwide Port System (WPS) Regional Integrated Cargo Database (ICDB)

    SciTech Connect

    Faby, E.Z.; Fluker, J.; Hancock, B.R.; Grubb, J.W.; Russell, D.L.; Loftis, J.P.; Shipe, P.C.; Truett, L.F.

    1994-03-01

    This Database Specification for the Worldwide Port System (WPS) Regional Integrated Cargo Database (ICDB) describes the database organization and storage allocation, provides the detailed data model of the logical and physical designs, and provides information for the construction of parts of the database such as tables, data elements, and associated dictionaries and diagrams.

  9. DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS

    EPA Science Inventory

    DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models
    Ann M. Richard
    US Environmental Protection Agency, Research Triangle Park, NC, USA

    Distributed: Decentralized set of standardized, field-delimited databases,...

  10. Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints

    ERIC Educational Resources Information Center

    Philip, George C.

    2007-01-01

    This paper identifies several areas of database modeling and design that have been problematic for students and even are likely to confuse faculty. Major contributing factors are the lack of clarity and inaccuracies that persist in the presentation of some basic database concepts in textbooks. The paper analyzes the problems and discusses ways to…

  11. Integrated Functional and Executional Modelling of Software Using Web-Based Databases

    NASA Technical Reports Server (NTRS)

    Kulkarni, Deepak; Marietta, Roberta

    1998-01-01

    NASA's software subsystems undergo extensive modification and updates over their operational lifetimes. It is imperative that modified software satisfy safety goals. This report discusses the difficulties encountered in doing so and presents a solution based on integrated modelling of software, use of automatic information extraction tools, web technology and databases. To appear in the Journal of Database Management.

  14. Publication Trends in Model Organism Research

    PubMed Central

    Dietrich, Michael R.; Ankeny, Rachel A.; Chen, Patrick M.

    2014-01-01

    In 1990, the National Institutes of Health (NIH) gave some organisms special status as designated model organisms. This article documents publication trends for these NIH-designated model organisms over the past 40 years. We find that being designated a model organism by the NIH does not guarantee an increasing publication trend. An analysis of model and nonmodel organisms included in GENETICS since 1960 does reveal a sharp decline in the number of publications using nonmodel organisms yet no decline in the overall species diversity. We suggest that organisms with successful publication records tend to share critical characteristics, such as being well developed as standardized, experimental systems and being used by well-organized communities with good networks of exchange and methods of communication. PMID:25381363

  15. Knowledge discovery in clinical databases based on variable precision rough set model.

    PubMed Central

    Tsumoto, S.; Ziarko, W.; Shan, N.; Tanaka, H.

    1995-01-01

    Since a large amount of clinical data is being stored electronically, discovery of knowledge from such clinical databases is one of the important growing research areas in medical informatics. For this purpose, we developed KDD-R (a system for Knowledge Discovery in Databases using Rough sets), an experimental system for knowledge discovery and machine learning research using the variable precision rough set (VPRS) model, which is an extension of the original rough set model. This system works in the following steps. First, it preprocesses databases and translates continuous data into discretized form. Second, KDD-R checks dependencies between attributes and reduces spurious data. Third, the system computes rules from the reduced databases. Finally, it evaluates decision making. For evaluation, the system was applied to a clinical database of meningoencephalitis, and the computational results yielded several new findings. PMID:8563283
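
    One way to picture the VPRS idea: an equivalence class of records is admitted to the beta-lower approximation of a concept when at most a fraction beta of its members disagree with the concept. The records, attribute names and threshold below are invented, and this is one of several formulations in the literature, not the KDD-R implementation.

    ```python
    # Toy beta-lower approximation for a variable precision rough set model.
    from collections import defaultdict

    records = [
        ({"fever": "high", "stiff_neck": "yes"}, "meningitis"),
        ({"fever": "high", "stiff_neck": "yes"}, "meningitis"),
        ({"fever": "high", "stiff_neck": "yes"}, "other"),
        ({"fever": "low",  "stiff_neck": "no"},  "other"),
    ]

    def beta_lower_approximation(records, concept, beta=0.3):
        # Group records into equivalence classes by their condition attributes.
        classes = defaultdict(list)
        for attrs, label in records:
            classes[tuple(sorted(attrs.items()))].append(label)
        admitted = []
        for key, labels in classes.items():
            purity = labels.count(concept) / len(labels)
            if purity >= 1.0 - beta:   # tolerate up to beta misclassification
                admitted.append(dict(key))
        return admitted

    # The "high fever + stiff neck" class is 2/3 meningitis, so it is admitted
    # at beta=0.34 even though it would fail the classical (beta=0) model.
    print(beta_lower_approximation(records, "meningitis", beta=0.34))
    ```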

  16. Modeling and Measuring Organization Capital

    ERIC Educational Resources Information Center

    Atkeson, Andrew; Kehoe, Patrick J.

    2005-01-01

    Manufacturing plants have a clear life cycle: they are born small, grow substantially with age, and eventually die. Economists have long thought that this life cycle is driven by organization capital, the accumulation of plant-specific knowledge. The location of plants in the life cycle determines the size of the payments, or organization rents,…

  17. Organization Development: Strategies and Models.

    ERIC Educational Resources Information Center

    Beckhard, Richard

    This book, written for managers, specialists, and students of management, is based largely on the author's experience in helping organization leaders with planned-change efforts, and on related experience of colleagues in the field. Chapter 1 presents the background and causes for the increased concern with organization development and planned…

  19. Tree-Structured Digital Organisms Model

    NASA Astrophysics Data System (ADS)

    Suzuki, Teruhiko; Nobesawa, Shiho; Tahara, Ikuo

    Tierra and Avida are well-known models of digital organisms. They describe a life process as a sequence of computation codes. A linear sequence model, though very simple for a computer-based model, may not be the only way to describe a digital organism. Thus we propose a new digital organism model based on a tree structure, which is rather similar to genetic programming. With our model, a life process is a combination of various functions, much as life in the real world is. This implies that our model can easily describe the hierarchical structure of life, and it can simulate evolutionary computation through mutual interaction of functions. We verified by simulation that our model can be regarded as a digital organism model according to its definitions. Our model even succeeded in creating species such as viruses and parasites.
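
    A generic toy of the tree-genome idea (not the authors' implementation): genomes are nested tuples of functions, evaluation is recursive, and mutation rewrites operators at random nodes.

    ```python
    # Tree-structured genome: internal nodes are functions, leaves are inputs.
    import random

    OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

    def evaluate(node, env):
        if isinstance(node, str):              # leaf: variable name
            return env[node]
        op, left, right = node                 # internal node: (op, subtree, subtree)
        return OPS[op](evaluate(left, env), evaluate(right, env))

    def mutate(node, rate=0.2):
        if isinstance(node, str):
            return node
        op, left, right = node
        if random.random() < rate:             # point mutation on the operator
            op = random.choice(list(OPS))
        return (op, mutate(left, rate), mutate(right, rate))

    genome = ("add", ("mul", "x", "y"), "x")   # computes x*y + x
    print(evaluate(genome, {"x": 2, "y": 3}))  # 8
    print(evaluate(mutate(genome), {"x": 2, "y": 3}))
    ```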

  20. Organ Impairment—Drug–Drug Interaction Database: A Tool for Evaluating the Impact of Renal or Hepatic Impairment and Pharmacologic Inhibition on the Systemic Exposure of Drugs

    PubMed Central

    Yeung, CK; Yoshida, K; Kusama, M; Zhang, H; Ragueneau-Majlessi, I; Argon, S; Li, L; Chang, P; Le, CD; Zhao, P; Zhang, L; Sugiyama, Y; Huang, S-M

    2015-01-01

    The organ impairment and drug–drug interaction (OI-DDI) database is the first rigorously assembled database of pharmacokinetic drug exposure data from publicly available renal and hepatic impairment studies presented together with the maximum change in drug exposure from drug interaction inhibition studies. The database was used to conduct a systematic comparison of the effect of renal/hepatic impairment and pharmacologic inhibition on drug exposure. Additional applications are feasible with the public availability of this database. PMID:26380158

  1. Analysis of the Properties of Working Substances for the Organic Rankine Cycle based Database "REFPROP"

    NASA Astrophysics Data System (ADS)

    Galashov, Nikolay; Tsibulskiy, Svyatoslav; Serova, Tatiana

    2016-02-01

    The objects of the study are substances used as working fluids in systems operating on the basis of an organic Rankine cycle. The purpose of the research is to find substances with the best thermodynamic, thermal and environmental properties. The research was conducted by analyzing the thermodynamic and thermal properties of substances from the "REFPROP" database and by numerical simulation of a combined-cycle plant with a triple utilization cycle, where the bottoming cycle is an organic Rankine cycle. The "REFPROP" database describes, and allows calculation of, the thermodynamic and thermophysical parameters of most of the main substances used in production processes. On the basis of scientific publications on working fluids for the organic Rankine cycle, ozone-friendly low-boiling substances were selected for analysis: ammonia, butane, pentane and the freons R134a, R152a, R236fa and R245fa. For these substances, the molecular weight, triple-point temperature, boiling point at atmospheric pressure, critical-point parameters, the derivative of temperature with respect to entropy along the saturated vapor line, and the ozone depletion and global warming potentials were identified and tabulated. The thermodynamic and thermophysical parameters of the vapor and liquid states at saturation at a temperature of 15 °C were also tabulated. This temperature is adopted as the minimum heat-rejection temperature in the Rankine cycle when operating on water. Studies have shown that, of the substances considered, pentane, butane and R245fa have the best thermodynamic, thermal and environmental properties. For a more thorough analysis, a mathematical model of a combined cycle gas turbine (CCGT) plant with a triple cycle was developed on the basis of the NK-36ST gas turbine plant, where the bottoming cycle is an organic Rankine cycle and an air-cooled condenser is used. The air-cooled condenser allows condensation of the working substance at temperatures below 0 °C. Calculation of the
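
    Properties like those tabulated here can be looked up programmatically. The sketch below uses the open-source CoolProp library as a stand-in for REFPROP (CoolProp can also delegate to a REFPROP installation) to print saturation pressure and latent heat at 15 °C for several of the candidate fluids; the choice of outputs is illustrative.

    ```python
    # Saturation properties at 15 degC for candidate ORC working fluids.
    from CoolProp.CoolProp import PropsSI

    T = 288.15  # 15 degC in kelvin
    for fluid in ["Pentane", "Butane", "R245fa", "R134a", "Ammonia"]:
        p_sat = PropsSI("P", "T", T, "Q", 0, fluid)        # saturation pressure, Pa
        h_fg = (PropsSI("H", "T", T, "Q", 1, fluid)
                - PropsSI("H", "T", T, "Q", 0, fluid))     # latent heat, J/kg
        print(f"{fluid:8s} p_sat={p_sat/1e3:8.1f} kPa  h_fg={h_fg/1e3:6.1f} kJ/kg")
    ```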

  2. Compartmental and Data-Based Modeling of Cerebral Hemodynamics: Linear Analysis

    PubMed Central

    Henley, B.C.; Shin, D.C.; Zhang, R.; Marmarelis, V.Z.

    2015-01-01

    Compartmental and data-based modeling of cerebral hemodynamics are alternative approaches that utilize distinct model forms and have been employed in the quantitative study of cerebral hemodynamics. This paper examines the relation between a compartmental equivalent-circuit and a data-based input-output model of dynamic cerebral autoregulation (DCA) and CO2-vasomotor reactivity (DVR). The compartmental model is constructed as an equivalent-circuit utilizing putative first principles and previously proposed hypothesis-based models. The linear input-output dynamics of this compartmental model are compared with data-based estimates of the DCA-DVR process. This comparative study indicates that there are some qualitative similarities between the two-input compartmental model and experimental results. PMID:26900535

  3. Interconnection of the Graphics Language for Database System to the Multi-Lingual, Multi-Model, Multi-Backend Database System Over an Ethernet Network

    DTIC Science & Technology

    1989-12-01

    the Attribute Based Data Language (ABDL) interface of MBDS, which receives requests (generated by GLAD) for opening and querying databases. These ... are the same for all interfaces. The attribute-based data model and language (ABDL) have been chosen as the KDM and UDM. ... the Attribute-Based Data Model (ABDM) and Language (ABDL). Model description: any database consists of a collection of files, each of which consists of a

  4. Relational-database model for improving quality assurance and process control in a composite manufacturing environment

    NASA Astrophysics Data System (ADS)

    Gentry, Jeffery D.

    2000-05-01

    A relational database is a powerful tool for collecting and analyzing the vast amounts of interrelated data associated with the manufacture of composite materials. A relational database contains many individual database tables that store data that are related in some fashion. Manufacturing process variables as well as quality assurance measurements can be collected and stored in database tables indexed according to lot numbers, part type or individual serial numbers. Relationships between manufacturing process and product quality can then be correlated over a wide range of product types and process variations. This paper presents details on how relational databases are used to collect, store, and analyze process variables and quality assurance data associated with the manufacture of advanced composite materials. Important considerations are covered, including how the various types of data are organized and how relationships between the data are defined. Employing relational database techniques to establish correlative relationships between process variables and quality assurance measurements is then explored. Finally, the benefits of database techniques such as data warehousing, data mining and web based client/server architectures are discussed in the context of composite material manufacturing.
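
    A minimal sketch of the kind of schema the paper describes, using SQLite for illustration: process variables and quality measurements live in separate tables related through a lot number, so they can be correlated with a join. Table and column names are invented.

    ```python
    # Process variables and QA measurements keyed by lot number.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE lots    (lot_id TEXT PRIMARY KEY, part_type TEXT);
    CREATE TABLE process (lot_id TEXT REFERENCES lots, variable TEXT, value REAL);
    CREATE TABLE qa      (lot_id TEXT REFERENCES lots, measurement TEXT, value REAL);
    """)
    con.execute("INSERT INTO lots VALUES ('L001', 'wing-skin')")
    con.executemany("INSERT INTO process VALUES (?, ?, ?)",
                    [("L001", "cure_temp_C", 177.0),
                     ("L001", "cure_pressure_kPa", 620.0)])
    con.execute("INSERT INTO qa VALUES ('L001', 'void_content_pct', 0.8)")

    # Correlate process conditions with quality outcomes via a join on lot number.
    for row in con.execute("""
        SELECT p.variable, p.value, q.measurement, q.value
        FROM process p JOIN qa q USING (lot_id) WHERE p.lot_id = 'L001'
    """):
        print(row)
    ```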

  5. Solid Waste Projection Model: Database User's Guide. Version 1.4

    SciTech Connect

    Blackburn, C.L.

    1993-10-01

    The Solid Waste Projection Model (SWPM) system is an analytical tool developed by Pacific Northwest Laboratory (PNL) for Westinghouse Hanford Company (WHC) specifically to address Hanford solid waste management issues. This document is one of a set of documents supporting the SWPM system and providing instructions in the use and maintenance of SWPM components. This manual contains instructions for using Version 1.4 of the SWPM database: system requirements and preparation, entering and maintaining data, and performing routine database functions. This document supports only those operations which are specific to SWPM database menus and functions and does not provide instruction in the use of Paradox, the database management system in which the SWPM database is established.

  6. TMPL: a database of experimental and theoretical transmembrane protein models positioned in the lipid bilayer

    PubMed Central

    Ghouzam, Yassine; Etchebest, Catherine

    2017-01-01

    Knowing the position of protein structures within the membrane is crucial for fundamental and applied research in the field of molecular biology. Only a few web resources provide coordinate files of oriented transmembrane proteins, and these exclude predicted structures, although predicted structures represent the largest part of the available models. In this article, we present TMPL (http://www.dsimb.inserm.fr/TMPL/), a database of transmembrane protein structures (α-helical and β-sheet) positioned in the lipid bilayer. It is the first database to include theoretical models of transmembrane protein structures, making it a large repository with more than 11 000 entries. The TMPL database also contains experimentally solved protein structures, which are available as either atomistic or coarse-grained models. A unique feature of TMPL is the possibility for users to update the database by uploading, through an intuitive web interface, the membrane assignments they can obtain with our recent OREMPRO web server. PMID:28365741

  7. A comprehensive model for reproductive and developmental toxicity hazard identification: I. Development of a weight of evidence QSAR database.

    PubMed

    Matthews, Edwin J; Kruhlak, Naomi L; Daniel Benz, R; Contrera, Joseph F

    2007-03-01

    A weight of evidence (WOE) reproductive and developmental toxicology (reprotox) database was constructed that is suitable for quantitative structure-activity relationship (QSAR) modeling and human hazard identification of untested chemicals. The database was derived from multiple publicly available reprotox databases and consists of more than 10,000 individual rat, mouse, or rabbit reprotox tests linked to 2134 different organic chemical structures. The reprotox data were classified into seven general classes (male reproductive toxicity, female reproductive toxicity, fetal dysmorphogenesis, functional toxicity, mortality, growth, and newborn behavioral toxicity), and 90 specific categories as defined in the source reprotox databases. Each specific category contained over 500 chemicals, but the percentage of active chemicals was low, generally only 0.1-10%. The mathematical WOE model placed greater significance on confirmatory observations from repeat experiments, chemicals with multiple findings within a category, and the categorical relatedness of the findings. Using the weighted activity scores, statistical analyses were performed for specific data sets to identify clusters of categories that were correlated, containing similar profiles of active and inactive chemicals. The analysis revealed clusters of specific categories that contained chemicals that were active in two or more mammalian species (trans-species). Such chemicals are considered to have the highest potential risk to humans. In contrast, some specific categories exhibited only single species-specific activities. Results also showed that the rat and mouse were more susceptible to dysmorphogenesis than rabbits (6.1- and 3.6-fold, respectively).

  8. SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803

    PubMed Central

    Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

    2008-01-01

    Background: Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. Description: We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. Conclusion: SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at or directly at . PMID:18315852

  9. An Online Database for Informing Ecological Network Models: http://kelpforest.ucsc.edu

    PubMed Central

    Beas-Luna, Rodrigo; Novak, Mark; Carr, Mark H.; Tinker, Martin T.; Black, August; Caselle, Jennifer E.; Hoban, Michael; Malone, Dan; Iles, Alison

    2014-01-01

    Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel, yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui). PMID:25343723

  10. Approach for ontological modeling of database schema for the generation of semantic knowledge on the web

    NASA Astrophysics Data System (ADS)

    Rozeva, Anna

    2015-11-01

    Currently there is a large quantity of content on web pages that is generated from relational databases. Conceptual domain models provide for the integration of heterogeneous content on the semantic level. Using an ontology as the conceptual model of relational data sources makes them available to web agents and services and provides for the employment of ontological techniques for data access, navigation and reasoning. Achieving interoperability between relational databases and ontologies enriches the web with semantic knowledge. Establishing a semantic database conceptual model based on an ontology facilitates the development of data integration systems that use the ontology as a unified global view. An approach for the generation of an ontologically based conceptual model is presented. The ontology representing the database schema is obtained by matching schema elements to ontology concepts. An algorithm for the matching process is designed. An infrastructure for mediation between database and ontology, bridging legacy data with formal semantic meaning, is presented. The knowledge modeling approach is implemented on a sample database.
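
    As a toy version of the schema-to-ontology matching step (the paper's actual algorithm is not reproduced here), one can score table and column names against concept labels by string similarity; the concept names, schema and threshold below are placeholders.

    ```python
    # Match relational schema elements to ontology concepts by name similarity.
    from difflib import SequenceMatcher

    ontology_concepts = ["Person", "Organization", "PostalAddress", "Order"]
    schema = {"persons": ["person_id", "name"], "org": ["org_id", "postal_address"]}

    def similarity(a, b):
        # Normalize case and underscores before comparing.
        return SequenceMatcher(None, a.lower().replace("_", ""), b.lower()).ratio()

    def match_schema(schema, concepts, threshold=0.8):
        mapping = {}
        for table, columns in schema.items():
            for element in [table] + columns:
                best = max(concepts, key=lambda c: similarity(element, c))
                if similarity(element, best) >= threshold:
                    mapping[element] = best
        return mapping

    # 'persons' and 'postal_address' map cleanly; 'org' falls below the cutoff
    # and would need a richer matcher (synonyms, structure, instances).
    print(match_schema(schema, ontology_concepts))
    ```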

  11. An online database for informing ecological network models: http://kelpforest.ucsc.edu.

    PubMed

    Beas-Luna, Rodrigo; Novak, Mark; Carr, Mark H; Tinker, Martin T; Black, August; Caselle, Jennifer E; Hoban, Michael; Malone, Dan; Iles, Alison

    2014-01-01

    Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel, yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui).

  12. An online database for informing ecological network models: http://kelpforest.ucsc.edu

    USGS Publications Warehouse

    Beas-Luna, Rodrigo; Tinker, M. Tim; Novak, Mark; Carr, Mark H.; Black, August; Caselle, Jennifer E.; Hoban, Michael; Malone, Dan; Iles, Alison C.

    2014-01-01

    Ecological network models and analyses are recognized as valuable tools for understanding the dynamics and resiliency of ecosystems, and for informing ecosystem-based approaches to management. However, few databases exist that can provide the life history, demographic and species interaction information necessary to parameterize ecological network models. Faced with the difficulty of synthesizing the information required to construct models for kelp forest ecosystems along the West Coast of North America, we developed an online database (http://kelpforest.ucsc.edu/) to facilitate the collation and dissemination of such information. Many of the database's attributes are novel, yet the structure is applicable and adaptable to other ecosystem modeling efforts. Information for each taxonomic unit includes stage-specific life history, demography, and body-size allometries. Species interactions include trophic, competitive, facilitative, and parasitic forms. Each data entry is temporally and spatially explicit. The online data entry interface allows researchers anywhere to contribute and access information. Quality control is facilitated by attributing each entry to unique contributor identities and source citations. The database has proven useful as an archive of species and ecosystem-specific information in the development of several ecological network models, for informing management actions, and for education purposes (e.g., undergraduate and graduate training). To facilitate adaptation of the database by other researchers for other ecosystems, the code and technical details on how to customize this database and apply it to other ecosystems are freely available and located at the following link (https://github.com/kelpforest-cameo/databaseui).

  13. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database

    PubMed Central

    Jia, Baofeng; Raphenya, Amogelang R.; Alcock, Brian; Waglechner, Nicholas; Guo, Peiyao; Tsang, Kara K.; Lago, Briony A.; Dave, Biren M.; Pereira, Sheldon; Sharma, Arjun N.; Doshi, Sachin; Courtot, Mélanie; Lo, Raymond; Williams, Laura E.; Frye, Jonathan G.; Elsayegh, Tariq; Sardar, Daim; Westman, Erin L.; Pawlowski, Andrew C.; Johnson, Timothy A.; Brinkman, Fiona S.L.; Wright, Gerard D.; McArthur, Andrew G.

    2017-01-01

    The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis. PMID:27789705

  14. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database.

    PubMed

    Jia, Baofeng; Raphenya, Amogelang R; Alcock, Brian; Waglechner, Nicholas; Guo, Peiyao; Tsang, Kara K; Lago, Briony A; Dave, Biren M; Pereira, Sheldon; Sharma, Arjun N; Doshi, Sachin; Courtot, Mélanie; Lo, Raymond; Williams, Laura E; Frye, Jonathan G; Elsayegh, Tariq; Sardar, Daim; Westman, Erin L; Pawlowski, Andrew C; Johnson, Timothy A; Brinkman, Fiona S L; Wright, Gerard D; McArthur, Andrew G

    2017-01-04

    The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis.

  15. Combining computational models, semantic annotations and simulation experiments in a graph database.

    PubMed

    Henkel, Ron; Wolkenhauer, Olaf; Waltemath, Dagmar

    2015-01-01

    Model repositories such as the BioModels Database, the CellML Model Repository or JWS Online are frequently accessed to retrieve computational models of biological systems. However, their storage concepts support only restricted types of queries and not all data inside the repositories can be retrieved. In this article we present a storage concept that meets this challenge. It is grounded in a graph database, reflects the models' structure, incorporates semantic annotations and simulation descriptions and ultimately connects different types of model-related data. The connections between heterogeneous model-related data and bio-ontologies enable efficient search via biological facts and grant access to new model features. The introduced concept notably improves the access of computational models and associated simulations in a model repository. This has positive effects on tasks such as model search, retrieval, ranking, matching and filtering. Furthermore, our work for the first time enables CellML- and Systems Biology Markup Language-encoded models to be effectively maintained in one database. We show how these models can be linked via annotations and queried. Database URL: https://sems.uni-rostock.de/projects/masymos/
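
    The storage concept can be gestured at with an in-memory graph. The sketch below uses networkx as a stand-in for a real graph database; the node identifiers and edge types are illustrative, not the repository's actual schema.

    ```python
    # Models, species, annotations and simulations as nodes with typed edges.
    import networkx as nx

    g = nx.MultiDiGraph()
    g.add_node("BIOMD0000000012", kind="model", format="SBML")
    g.add_node("glucose", kind="species")
    g.add_node("CHEBI:17234", kind="ontology_term")      # ChEBI term for glucose
    g.add_node("SED-ML:sim1", kind="simulation")

    g.add_edge("BIOMD0000000012", "glucose", type="HAS_PART")
    g.add_edge("glucose", "CHEBI:17234", type="IS_ANNOTATED_WITH")
    g.add_edge("SED-ML:sim1", "BIOMD0000000012", type="SIMULATES")

    # Query: which models contain a species annotated with CHEBI:17234?
    for model, species, data in g.edges(data=True):
        if data["type"] == "HAS_PART":
            annotated = any(d["type"] == "IS_ANNOTATED_WITH" and v == "CHEBI:17234"
                            for _, v, d in g.out_edges(species, data=True))
            if annotated:
                print(model, "contains", species)
    ```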

  16. Database Manager

    ERIC Educational Resources Information Center

    Martin, Andrew

    2010-01-01

    It is normal practice today for organizations to store large quantities of records of related information as computer-based files or databases. Purposeful information is retrieved by performing queries on the data sets. The purpose of DATABASE MANAGER is to communicate to students the method by which the computer performs these queries. This…

  18. QSAR Modeling Using Large-Scale Databases: Case Study for HIV-1 Reverse Transcriptase Inhibitors.

    PubMed

    Tarasova, Olga A; Urusova, Aleksandra F; Filimonov, Dmitry A; Nicklaus, Marc C; Zakharov, Alexey V; Poroikov, Vladimir V

    2015-07-27

    Large-scale databases are important sources of training sets for various QSAR modeling approaches. Generally, these databases contain information extracted from different sources. This variety of sources can produce inconsistency in the data, defined as sometimes widely diverging activity results for the same compound against the same target. Because such inconsistency can reduce the accuracy of predictive models built from these data, we are addressing the question of how best to use data from publicly and commercially accessible databases to create accurate and predictive QSAR models. We investigate the suitability of commercially and publicly available databases to QSAR modeling of antiviral activity (HIV-1 reverse transcriptase (RT) inhibition). We present several methods for the creation of modeling (i.e., training and test) sets from two, either commercially or freely available, databases: Thomson Reuters Integrity and ChEMBL. We found that the typical predictivities of QSAR models obtained using these different modeling set compilation methods differ significantly from each other. The best results were obtained using training sets compiled for compounds tested using only one method and material (i.e., a specific type of biological assay). Compound sets aggregated by target only typically yielded poorly predictive models. We discuss the possibility of "mix-and-matching" assay data across aggregating databases such as ChEMBL and Integrity and their current severe limitations for this purpose. One of them is the general lack of complete and semantic/computer-parsable descriptions of assay methodology carried by these databases that would allow one to determine mix-and-matchability of result sets at the assay level.
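
    One data-hygiene step the paper motivates can be sketched directly: group records by compound and assay type, aggregate only within a single assay type, and flag groups whose replicates diverge. The column names and the one-log-unit cutoff below are assumptions.

    ```python
    # Flag inconsistent activity records before building a QSAR training set.
    import pandas as pd

    records = pd.DataFrame({
        "compound": ["C1", "C1", "C1", "C2", "C2"],
        "assay":    ["enzymatic", "enzymatic", "cell-based", "enzymatic", "enzymatic"],
        "pIC50":    [7.1, 7.3, 4.9, 5.5, 5.6],
    })

    # Only aggregate results measured with the same assay type...
    per_assay = records.groupby(["compound", "assay"])["pIC50"].agg(
        ["mean", "std", "count"])

    # ...and flag compound/assay pairs whose replicates disagree by > 1 log unit.
    spread = records.groupby(["compound", "assay"])["pIC50"].agg(
        lambda s: s.max() - s.min())
    print(per_assay)
    print(spread[spread > 1.0])
    ```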

  19. Automatic generation of conceptual database design tools from data model specifications

    SciTech Connect

    Hong, Shuguang.

    1989-01-01

    The problems faced in the design and implementation of database software systems based on object-oriented data models are similar to those of other software design: difficult and complex, yet involving redundant effort. Automatic generation of database software systems has been proposed as a solution to these problems. In order to generate database software systems for a variety of object-oriented data models, two critical issues must be addressed: data model specification and software generation. SeaWeed is a software system that automatically generates conceptual database design tools from data model specifications. A meta model has been defined for the specification of a class of object-oriented data models. This meta model provides a set of primitive modeling constructs that can be used to express the semantics, or unique characteristics, of specific data models. Software reusability has been adopted for the software generation. The technique of design reuse is utilized to derive the requirement specification of the software to be generated from data model specifications. The mechanism of code reuse is used to produce the necessary reusable software components. This dissertation presents the research results of SeaWeed, including the meta model, data model specification, a formal representation of design reuse and code reuse, and the software generation paradigm.

  20. Combining computational models, semantic annotations and simulation experiments in a graph database

    PubMed Central

    Henkel, Ron; Wolkenhauer, Olaf; Waltemath, Dagmar

    2015-01-01

    Model repositories such as the BioModels Database, the CellML Model Repository or JWS Online are frequently accessed to retrieve computational models of biological systems. However, their storage concepts support only restricted types of queries, and not all data inside the repositories can be retrieved. In this article we present a storage concept that meets this challenge. It is grounded in a graph database, reflects the models' structure, incorporates semantic annotations and simulation descriptions, and ultimately connects different types of model-related data. The connections between heterogeneous model-related data and bio-ontologies enable efficient search via biological facts and grant access to new model features. The introduced concept notably improves access to computational models and associated simulations in a model repository. This has positive effects on tasks such as model search, retrieval, ranking, matching and filtering. Furthermore, our work for the first time enables CellML- and Systems Biology Markup Language-encoded models to be effectively maintained in one database. We show how these models can be linked via annotations and queried. Database URL: https://sems.uni-rostock.de/projects/masymos/ PMID:25754863
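
    A hedged sketch of how such a graph-backed repository might be queried from Python. The node labels, relationship type and properties below are illustrative assumptions; the actual MaSyMoS schema differs.

```python
# Querying a model repository stored in a graph database (Neo4j, via the
# official Python driver). Labels/properties here are invented for illustration.
from neo4j import GraphDatabase

QUERY = """
MATCH (m:Model)-[:HAS_ANNOTATION]->(a:Annotation)
WHERE a.term = $term
RETURN m.name AS model, a.ontology AS ontology
"""

def find_models(uri: str, auth: tuple, term: str) -> list[dict]:
    driver = GraphDatabase.driver(uri, auth=auth)
    with driver.session() as session:
        return [dict(rec) for rec in session.run(QUERY, term=term)]

# Example: retrieve all models annotated with a given GO term.
# find_models("bolt://localhost:7687", ("neo4j", "secret"), "GO:0006096")
```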

  1. Guide on Data Models in the Selection and Use of Database Management Systems. Final Report.

    ERIC Educational Resources Information Center

    Gallagher, Leonard J.; Draper, Jesse M.

    A tutorial introduction to data models in general is provided, with particular emphasis on the relational and network models defined by the two proposed ANSI (American National Standards Institute) database language standards. Examples based on the network and relational models include specific syntax and semantics, while examples from the other…

  2. Database/Template Protocol to Automate Development of Complex Environmental Input Models

    SciTech Connect

    COLLARD, LEONARD

    2004-11-10

    At the U.S. Department of Energy Savannah River Site, complex environmental models were required to analyze the performance of a suite of radionuclides, including decay chains consisting of multiple radionuclides. To facilitate preparation of the model for each radionuclide, a sophisticated protocol was established to link a database containing material information with a template. The protocol consists of data and special commands in the template, control information in the database, and key selection information in the database. A preprocessor program reads a template, incorporates the appropriate information from the database, and generates the final model. In effect, the database/template protocol forms a command language. That command language typically allows the user to perform multiple independent analyses merely by setting environment variables to identify the nuclides to be analyzed and having the template reference those variables. The environment variables can be set by a batch or script file that serves as a shell to analyze each radionuclide in a separate subdirectory (if desired) and to conduct any preprocessing and postprocessing functions. The user has complete control over how the database is generated and how it interacts with the template. This protocol was valuable for analyzing multiple radionuclides for a single disposal unit. It can easily be applied to other disposal units, to uncertainty studies, and to sensitivity studies. The protocol can be applied to any type of model input for any computer program. A primary advantage of this protocol is that it requires no programming or compiling while providing robust applicability.
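
    The mechanics of such a database/template preprocessor fit in a short sketch. This is an illustrative assumption, not the report's implementation: the token syntax (@NAME@), the table layout, and the NUCLIDE environment variable are all invented here.

```python
# Illustrative database/template preprocessor: an environment variable selects
# the nuclide; its properties are pulled from a database and substituted into
# template tokens to emit one model input file.
import os
import re
import sqlite3

def expand_template(template_path: str, db_path: str) -> str:
    nuclide = os.environ["NUCLIDE"]  # e.g. set per-run by a driver script
    con = sqlite3.connect(db_path)
    row = con.execute(
        "SELECT half_life, kd FROM materials WHERE nuclide = ?", (nuclide,)
    ).fetchone()
    values = {"NUCLIDE": nuclide, "HALF_LIFE": str(row[0]), "KD": str(row[1])}
    text = open(template_path).read()
    # Replace tokens such as @HALF_LIFE@ with values pulled from the database;
    # an unknown token raises KeyError, surfacing template/database mismatches.
    return re.sub(r"@(\w+)@", lambda m: values[m.group(1)], text)
```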

  3. ExtraTrain: a database of Extragenic regions and Transcriptional information in prokaryotic organisms

    PubMed Central

    Pareja, Eduardo; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Bonal, Javier; Tobes, Raquel

    2006-01-01

    Background Transcriptional regulation processes are the principal mechanisms of adaptation in prokaryotes. In these processes, the key elements involved are the regulatory proteins and the regulatory DNA signals located in extragenic regions. As all extragenic spaces are putative regulatory regions, ExtraTrain covers all extragenic regions of available genomes and regulatory proteins from bacteria and archaea included in the UniProt database. Description ExtraTrain provides integrated and easily manageable information for 679816 extragenic regions and for the genes delimiting each of them. In addition, ExtraTrain supplies a tool to explore extragenic regions, named Palinsight, oriented to detecting and searching for palindromic patterns. This interactive visual tool is totally integrated in the database, allowing the search for regulatory signals in user-defined sets of extragenic regions. The 26046 regulatory proteins included in ExtraTrain belong to the families AraC/XylS, ArsR, AsnC, Cold shock domain, CRP-FNR, DeoR, GntR, IclR, LacI, LuxR, LysR, MarR, MerR, NtrC/Fis, OmpR and TetR. The database follows the InterPro criteria to define these families. The information about regulators includes manually curated sets of references specifically associated to regulator entries. In order to achieve a sustainable and maintainable knowledge database, ExtraTrain is a platform open to the contribution of knowledge by the scientific community, providing a system for the incorporation of textual knowledge. Conclusion ExtraTrain is a new database for exploring Extragenic regions and Transcriptional information in bacteria and archaea. The ExtraTrain database is available online. PMID:16539733

  4. Modeling and database for melt-water interfacial heat transfer

    SciTech Connect

    Farmer, M.T.; Spencer, B.W.; Schneider, J.P.; Bonomo, B.; Theofanous, G.

    1992-04-01

    A mechanistic model is developed to predict the transition superficial gas velocity between bulk cooldown and crust-limited heat transfer regimes in a sparged molten pool with a coolant overlayer. The model has direct applications in the analysis of ex-vessel severe accidents, where molten corium interacts with concrete, thereby producing sparging concrete decomposition gases. The analysis approach embodies thermal, mechanical, and hydrodynamic aspects associated with incipient crust formation at the melt/coolant interface. The model is validated against experimental data obtained with water (melt) and liquid nitrogen (coolant) simulants. Predictions are then made for the critical gas velocity at which crust formation will occur for core material interacting with concrete in the presence of water.

  6. Integration of solid modeling and database management for CAD/CAM

    SciTech Connect

    Lee, Y.C.

    1984-01-01

    Focusing on geometric completeness and data independence respectively, solid modeling and database management could be bridged together so as to provide a CAD/CAM environment with a unified geometric database for multiple application uses. The proposed approach is based on the CSG (Constructive Solid Geometry) scheme for solid modeling and the generic relational model for database management, primarily for the purpose of data conciseness while obeying a rigorous database design discipline. To facilitate data scheme definition, a systematic procedure was devised to convert a set of grammar rules into a generic scheme. A relational query language, SEQUEL, was modified to define, control, and manipulate the flat relations that represent the highly structural generic relational model. To further justify the usefulness of the proposed geometric database management system as well as its supporting query language, two application issues, namely, a solid modeler for CAD and a preprocessor for CAM, were also investigated. The solid modeler implemented not only acts as an application to be interfaced by the query language but also serves as a tool to visually verify the geometric data. Research efforts on CAM issues have also resulted in an algorithm capable of extracting and unifying certain manufacturing features that are embedded in CSG trees.
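
    The core representational idea — persisting a CSG tree in flat relations — can be sketched briefly. The schema below (id, parent_id, op, primitive) is an invented illustration, not the generic relational scheme the dissertation defines.

```python
# Flattening a CSG tree into relational tuples (toy schema for illustration).
from dataclasses import dataclass

@dataclass
class CSGNode:
    op: str                       # 'union' | 'difference' | 'primitive'
    left: "CSGNode | None" = None
    right: "CSGNode | None" = None
    primitive: str | None = None  # e.g. 'block', 'cylinder'

def to_rows(node: CSGNode, rows: list, parent: int | None = None) -> int:
    """Depth-first walk emitting (id, parent_id, op, primitive) tuples."""
    node_id = len(rows)
    rows.append((node_id, parent, node.op, node.primitive))
    for child in (node.left, node.right):
        if child is not None:
            to_rows(child, rows, node_id)
    return node_id

rows: list = []
to_rows(CSGNode("union",
                CSGNode("primitive", primitive="block"),
                CSGNode("primitive", primitive="cylinder")), rows)
print(rows)  # [(0, None, 'union', None), (1, 0, ..., 'block'), (2, 0, ..., 'cylinder')]
```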

  7. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    DOE PAGES

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; ...

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classifying gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. 'profiles') were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.
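
    In practice, screening a metagenomic protein set against an HMM collection like FOAM is typically done with HMMER. The sketch below is a hedged example: the file names are placeholders, though the hmmsearch options shown (--tblout, -E) are standard HMMER flags.

```python
# Run HMMER's hmmsearch against an HMM database (file names are placeholders).
import subprocess

subprocess.run(
    ["hmmsearch",
     "--tblout", "foam_hits.tbl",     # tabular per-sequence output
     "-E", "1e-5",                    # E-value reporting threshold
     "FOAM.hmm",                      # concatenated HMM database
     "metagenome_proteins.faa"],      # query protein sequences
    check=True,
)
```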

  8. Primate Models in Organ Transplantation

    PubMed Central

    Anderson, Douglas J.; Kirk, Allan D.

    2013-01-01

    Large animal models have long served as the proving grounds for advances in transplantation, bridging the gap between inbred mouse experimentation and human clinical trials. Although a variety of species have been and continue to be used, the emergence of highly targeted biologic- and antibody-based therapies has required models to have a high degree of homology with humans. Thus, the nonhuman primate has become the model of choice in many settings. This article will provide an overview of nonhuman primate models of transplantation. Issues of primate genetics and care will be introduced, and a brief overview of technical aspects for various transplant models will be discussed. Finally, several prominent immunosuppressive and tolerance strategies used in primates will be reviewed. PMID:24003248

  9. Microporoelastic Modeling of Organic-Rich Shales

    NASA Astrophysics Data System (ADS)

    Khosh Sokhan Monfared, S.; Abedi, S.; Ulm, F. J.

    2014-12-01

    Organic-rich shale is an extremely complex, naturally occurring geo-composite. The heterogeneous nature of organic-rich shale and its anisotropic behavior pose grand challenges for characterization, modeling and engineering design. The intricacy of organic-rich shale, in the context of its mechanical and poromechanical properties, originates in the presence of organic/inorganic constituents and their interfaces, as well as the occurrence of porosity and elastic anisotropy, at multiple length scales. To capture the first-order mechanisms responsible for the complex behavior of organic-rich shale, we introduce an original approach for micromechanical modeling of organic-rich shales which accounts for the effect of the maturity of organics on the overall elasticity through morphology considerations. This morphology contribution is captured by means of an effective media theory that bridges the gap between immature and mature systems through the choice of the system's microtexture; namely, a matrix-inclusion morphology (Mori-Tanaka) for immature systems and a polycrystal/granular morphology for mature systems. Also, we show that interfaces play a role in the effective elasticity of mature, organic-rich shales. The models are calibrated by means of ultrasonic pulse velocity measurements of elastic properties and validated by means of nanoindentation results. Sensitivity analyses using Spearman's Partial Rank Correlation Coefficient show the importance of porosity and Total Organic Carbon (TOC) as key input parameters for accurate model predictions. These modeling developments pave the way to reach a "unique" set of clay properties and highlight the importance of depositional environment, burial and diagenetic processes on the overall mechanical and poromechanical behavior of organic-rich shale. These developments also emphasize the importance of understanding and modeling clay elasticity and organic maturity on the overall rock behavior, which is of critical importance for a…

  10. Database Design Learning: A Project-Based Approach Organized through a Course Management System

    ERIC Educational Resources Information Center

    Dominguez, Cesar; Jaime, Arturo

    2010-01-01

    This paper describes an active method for database design learning through practical tasks development by student teams in a face-to-face course. This method integrates project-based learning, and project management techniques and tools. Some scaffolding is provided at the beginning that forms a skeleton that adapts to a great variety of…

  12. Functional Decomposition of Modeling and Simulation Terrain Database Generation Process

    DTIC Science & Technology

    2008-09-19

    …with ArcGIS by Environmental Systems Research Institute (ESRI) and TerraTools by TerraSim, respectively. …The Common Data Model Framework (CDMF) contains a set of tools for creating and analyzing EDMs. CDMF is a government-off-the-shelf technology designed and…

  13. Database Needs for Modeling and Simulation of Plasma Processing.

    DTIC Science & Technology

    1996-01-01

    …structure codes as well as semiempirical methods, should be encouraged. 2. A spectrum of plasma models should be developed, aimed at a variety of uses… One set of codes should be developed to provide a compact, relatively fast simulation that addresses plasma and surface kinetics and is useful for process engineers. Convenient user interfaces would be important for this set of codes. A second set of codes would include more sophisticated algorithms…

  14. Challenges of Country Modeling with Databases, Newsfeeds, and Expert Surveys

    DTIC Science & Technology

    2008-01-01

    route to obtaining our information of interest, namely the information we need to determine our model parameters. Instead, it seems wisest to fully...9.5, we will focus on surveying the kinds of automated data extraction technologies that are available today to obtain empirical materials from the...preferences, and obtaining this informa- tion can be a daunting task. To our relief, we have access to an extensive collection of survey results

  15. Circulation Control Model Experimental Database for CFD Validation

    NASA Technical Reports Server (NTRS)

    Paschal, Keith B.; Neuhart, Danny H.; Beeler, George B.; Allan, Brian G.

    2012-01-01

    A 2D circulation control wing was tested in the Basic Aerodynamic Research Tunnel at the NASA Langley Research Center. A traditional circulation control wing employs tangential blowing along the span over a trailing-edge Coanda surface for the purpose of lift augmentation. This model has been tested extensively at the Georgia Tech Research Institute for the purpose of performance documentation at various blowing rates. The current study seeks to expand on the previous work by documenting additional flow-field data needed for validation of computational fluid dynamics. Two jet momentum coefficients were tested during this entry: 0.047 and 0.114. Boundary-layer transition was investigated and turbulent boundary layers were established on both the upper and lower surfaces of the model. Chordwise and spanwise pressure measurements were made, and tunnel sidewall pressure footprints were documented. Laser Doppler Velocimetry measurements were made on both the upper and lower surface of the model at two chordwise locations (x/c = 0.8 and 0.9) to document the state of the boundary layers near the spanwise blowing slot.

  16. Extracting protein alignment models from the sequence database.

    PubMed Central

    Neuwald, A F; Liu, J S; Lipman, D J; Lawrence, C E

    1997-01-01

    Biologists often gain structural and functional insights into a protein sequence by constructing a multiple alignment model of the family. Here a program called Probe fully automates this process of model construction starting from a single sequence. Central to this program is a powerful new method to locate and align only those, often subtly, conserved patterns essential to the family as a whole. When applied to randomly chosen proteins, Probe found on average about four times as many relationships as a pairwise search and yielded many new discoveries. These include: an obscure subfamily of globins in the roundworm Caenorhabditis elegans; two new superfamilies of metallohydrolases; a lipoyl/biotin swinging arm domain in bacterial membrane fusion proteins; and a DH domain in the yeast Bud3 and Fus2 proteins. By identifying distant relationships and merging families into superfamilies in this way, this analysis further confirms the notion that proteins evolved from relatively few ancient sequences. Moreover, this method automatically generates models of these ancient conserved regions for rapid and sensitive screening of sequences. PMID:9108146

  17. Guidelines for the Effective Use of Entity-Attribute-Value Modeling for Biomedical Databases

    PubMed Central

    Dinu, Valentin; Nadkarni, Prakash

    2007-01-01

    Purpose To introduce the goals of EAV database modeling, to describe the situations where Entity-Attribute-Value (EAV) modeling is a useful alternative to conventional relational methods of database modeling, and to describe the fine points of implementation in production systems. Methods We analyze the following circumstances: 1) data are sparse and have a large number of applicable attributes, but only a small fraction will apply to a given entity; 2) numerous classes of data need to be represented, each class has a limited number of attributes, but the number of instances of each class is very small. We also consider situations calling for a mixed approach where both conventional and EAV design are used for appropriate data classes. Results and Conclusions In robust production systems, EAV-modeled databases trade a modest data sub-schema for a complex metadata sub-schema. The need to design the metadata effectively makes EAV design potentially more challenging than conventional design. PMID:17098467
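
    The core EAV trade-off — a narrow data table plus a metadata sub-schema that constrains which attributes are legal — fits in a few lines of SQL. The schema below is a minimal invented illustration of the pattern, not a production design from the paper.

```python
# Minimal EAV sketch in SQLite: sparse attributes live in one narrow table,
# and an attribute-definition (metadata) table constrains legal attributes.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE attribute_def (attr TEXT PRIMARY KEY, datatype TEXT NOT NULL);
CREATE TABLE eav (
  entity_id INTEGER NOT NULL,
  attr      TEXT NOT NULL REFERENCES attribute_def(attr),
  value     TEXT,
  PRIMARY KEY (entity_id, attr)
);
""")
con.execute("INSERT INTO attribute_def VALUES ('serum_glucose', 'float')")
con.execute("INSERT INTO eav VALUES (1, 'serum_glucose', '5.4')")
# 'Pivoting' one entity's sparse attributes back into a row-like view:
rows = con.execute("SELECT attr, value FROM eav WHERE entity_id = 1").fetchall()
print(dict(rows))  # {'serum_glucose': '5.4'}
```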

  18. [Estimation of China soil organic carbon storage and density based on 1:1,000,000 soil database].

    PubMed

    Yu, Dongsheng; Shi, Xuezheng; Sun, Weixia; Wang, Hongjie; Liu, Qinghua; Zhao, Yongcun

    2005-12-01

    Based on a 1:1,000,000 soil database and employing spatial expression methods, this paper estimated the soil organic carbon storage (SOCS) and density (SOCD) of China. The database consists of a 1:1,000,000 digital soil map, a soil profile attribute database, and a soil reference system. The digital soil map contains 926 soil mapping units, 690 soil families, and more than 94,000 polygons, while the soil profile attribute database collects 7292 soil profiles with 81 attribute fields. The SOCDs of the soil profiles were calculated and linked to the soil polygons in the digital soil map by the method of "GIS linkage based on soil type", resulting in a 1:1,000,000 vector map of China SOCD. The SOCS of the country or of a given soil can be estimated by summing the SOCS of all polygons or of the polygons of that soil, and the SOCD is the SOCS divided by the corresponding area. The estimated SOCS and SOCD of the country were 89.14 Pg (1 Pg = 10^15 g) and 9.60 kg m^-2, respectively, covering all soils with a total area of 928.10 x 10^4 km^2; these estimates might be considered closest to the real values.
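
    The polygon-level bookkeeping behind such estimates is simple: per-polygon storage is density times area, national storage is the sum over polygons, and national density is storage divided by total area. A toy sketch with invented numbers:

```python
# Area-weighted SOC arithmetic (the two polygons below are made up).
polygons = [
    {"socd_kg_m2": 9.2, "area_km2": 1200.0},
    {"socd_kg_m2": 11.5, "area_km2": 800.0},
]

# 1 km^2 = 1e6 m^2; report storage in Pg (1 Pg = 1e15 g = 1e12 kg).
storage_kg = sum(p["socd_kg_m2"] * p["area_km2"] * 1e6 for p in polygons)
total_area_m2 = sum(p["area_km2"] for p in polygons) * 1e6
print(storage_kg / 1e12, "Pg;", storage_kg / total_area_m2, "kg m^-2")
```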

  19. The Cambridge MRI database for animal models of Huntington disease.

    PubMed

    Sawiak, Stephen J; Morton, A Jennifer

    2016-01-01

    We describe the Cambridge animal brain magnetic resonance imaging repository, comprising 400 datasets to date from mouse models of Huntington disease. The data include raw images as well as segmented grey and white matter images with maps of cortical thickness. All images and phenotypic data for each subject are freely available without restriction from http://www.dspace.cam.ac.uk/handle/1810/243361/. Software and anatomical population templates optimised for animal brain analysis with MRI are also available from this site. Copyright © 2015. Published by Elsevier Inc.

  20. Use of an accurate-mass database for the systematic identification of transformation products of organic contaminants in wastewater effluents.

    PubMed

    Gómez-Ramos, María del Mar; Pérez-Parada, Andrés; García-Reyes, Juan F; Fernández-Alba, Amadeo R; Agüera, Ana

    2011-11-04

    In this article, a systematic approach is proposed to assist and simplify the identification of transformation products (TPs) of organic contaminants. This approach is based on the characteristic fragmentation that organic contaminants undergo during MS/MS fragmentation events, and its relationship and consistency with the transformations experienced by these chemicals in the environment or during water treatment processes. With this in mind, a database containing accurate-mass information for 147 compounds and their main fragments generated in CID MS/MS fragmentation experiments was created using an LC-QTOF-MS/MS system. The developed database was applied to the identification of tentative TPs and related unexpected compounds in eight wastewater effluent samples. The approach comprises three stages: (a) automatic screening, (b) identification of possible TPs and (c) confirmation by MS/MS analysis. Parameters related to the search of compounds in the database were optimized, and their influence on the exhaustiveness of the study was evaluated. Eight degradation products, from the pharmaceuticals acetaminophen, amoxicillin, carbamazepine, erythromycin and azithromycin and from the pesticide diazinon, were identified with a high degree of accuracy. Three of them were confirmed by analysis of the corresponding analytical standards. Copyright © 2011 Elsevier B.V. All rights reserved.
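
    The screening logic can be illustrated with a small sketch: a candidate TP is postulated as the parent mass plus a known transformation mass shift, and measured accurate masses are matched within a ppm tolerance. The shift table and tolerance below are illustrative assumptions, not the paper's 147-compound database.

```python
# Match measured accurate masses against postulated transformation products.
PROTON = 1.007276  # proton mass (Da), for [M+H]+ ions

# A few common biotransformation mass shifts (Da), e.g. hydroxylation = +O.
SHIFTS = {"hydroxylation": 15.994915, "demethylation": -14.015650}

def match_tps(parent_mass: float, peaks: list[float], tol_ppm: float = 5.0):
    """Return (peak m/z, transformation) pairs within the ppm tolerance."""
    hits = []
    for name, shift in SHIFTS.items():
        mz_expected = parent_mass + shift + PROTON
        for mz in peaks:
            if abs(mz - mz_expected) / mz_expected * 1e6 <= tol_ppm:
                hits.append((mz, name))
    return hits
```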

  1. Hydraulic fracture propagation modeling and data-based fracture identification

    NASA Astrophysics Data System (ADS)

    Zhou, Jing

    Successful shale gas and tight oil production is enabled by the engineering innovation of horizontal drilling and hydraulic fracturing. Hydraulically induced fractures will most likely deviate from the bi-wing planar pattern and generate complex fracture networks due to mechanical interactions and reservoir heterogeneity, both of which render the conventional fracture simulators insufficient to characterize the fractured reservoir. Moreover, in reservoirs with ultra-low permeability, the natural fractures are widely distributed, which will result in hydraulic fractures branching and merging at the interface and consequently lead to the creation of more complex fracture networks. Thus, developing a reliable hydraulic fracturing simulator, including both mechanical interaction and fluid flow, is critical in maximizing hydrocarbon recovery and optimizing fracture/well design and completion strategy in multistage horizontal wells. A novel fully coupled reservoir flow and geomechanics model based on the dual-lattice system is developed to simulate multiple nonplanar fractures' propagation in both homogeneous and heterogeneous reservoirs with or without pre-existing natural fractures. Initiation, growth, and coalescence of the microcracks will lead to the generation of macroscopic fractures, which is explicitly mimicked by failure and removal of bonds between particles from the discrete element network. This physics-based modeling approach leads to realistic fracture patterns without using the empirical rock failure and fracture propagation criteria required in conventional continuum methods. Based on this model, a sensitivity study is performed to investigate the effects of perforation spacing, in-situ stress anisotropy, rock properties (Young's modulus, Poisson's ratio, and compressive strength), fluid properties, and natural fracture properties on hydraulic fracture propagation. In addition, since reservoirs are buried thousands of feet below the surface, the

  2. Modeling Powered Aerodynamics for the Orion Launch Abort Vehicle Aerodynamic Database

    NASA Technical Reports Server (NTRS)

    Chan, David T.; Walker, Eric L.; Robinson, Philip E.; Wilson, Thomas M.

    2011-01-01

    Modeling the aerodynamics of the Orion Launch Abort Vehicle (LAV) has presented many technical challenges to the developers of the Orion aerodynamic database. During a launch abort event, the aerodynamic environment around the LAV is very complex as multiple solid rocket plumes interact with each other and the vehicle. It is further complicated by vehicle separation events such as between the LAV and the launch vehicle stack or between the launch abort tower and the crew module. The aerodynamic database for the LAV was developed mainly from wind tunnel tests involving powered jet simulations of the rocket exhaust plumes, supported by computational fluid dynamic simulations. However, limitations in both methods have made it difficult to properly capture the aerodynamics of the LAV in experimental and numerical simulations. These limitations have also influenced decisions regarding the modeling and structure of the aerodynamic database for the LAV and led to compromises and creative solutions. Two database modeling approaches are presented in this paper (incremental aerodynamics and total aerodynamics), with examples showing strengths and weaknesses of each approach. In addition, the unique problems presented to the database developers by the large data space required for modeling a launch abort event illustrate the complexities of working with multi-dimensional data.
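
    The two database structures named here can be caricatured in a few lines: an incremental database stores a power-off baseline plus separately modeled increments (e.g., plume-interaction and proximity effects), whereas a total-aerodynamics database stores fully combined coefficients. The function and numbers below are invented for illustration, not values from the Orion database.

```python
# Incremental-aerodynamics lookup: total coefficient = baseline + increments.
def incremental_lookup(base: float, increments: list[float]) -> float:
    """Build up a total coefficient from a power-off baseline plus deltas."""
    return base + sum(increments)

# e.g. a normal-force coefficient: baseline plus two modeled increments
cn = incremental_lookup(0.85, [0.07, -0.02])  # -> 0.90
```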

  3. An Object-Relational Ifc Storage Model Based on Oracle Database

    NASA Astrophysics Data System (ADS)

    Li, Hang; Liu, Hua; Liu, Yong; Wang, Yuan

    2016-06-01

    As building models become increasingly complicated, the level of collaboration across professionals attracts more attention in the architecture, engineering and construction (AEC) industry. In order to adapt to this change, buildingSMART developed the Industry Foundation Classes (IFC) to facilitate interoperability between software platforms. However, IFC data are currently shared in the form of text files, which has drawbacks. In this paper, considering the object-based inheritance hierarchy of IFC and the storage features of different database management systems (DBMS), we propose a novel object-relational storage model that uses an Oracle database to store IFC data. Firstly, we establish the mapping rules between data types in the IFC specification and the Oracle database. Secondly, we design the IFC database according to the relationships among IFC entities. Thirdly, we parse the IFC file and extract the IFC data. And lastly, we store the IFC data into the corresponding tables in the IFC database. In experiments, three different building models are selected to demonstrate the effectiveness of our storage model. The comparison of experimental statistics proves that IFC data are lossless during data exchange.
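
    Step one of this workflow — mapping IFC (EXPRESS) data types onto Oracle column types — might look like the sketch below. This particular mapping table is an assumption for illustration; the authors' published rules are not reproduced here.

```python
# Illustrative IFC-to-Oracle type mapping used to emit column definitions.
IFC_TO_ORACLE = {
    "INTEGER": "NUMBER(10)",
    "REAL": "BINARY_DOUBLE",
    "STRING": "VARCHAR2(4000)",
    "BOOLEAN": "CHAR(1)",   # stored as 'T'/'F'
    "LIST": "VARRAY",       # ordered collections -> Oracle varrays
}

def column_ddl(attr_name: str, ifc_type: str) -> str:
    """Fall back to CLOB for types without a direct mapping."""
    return f"{attr_name} {IFC_TO_ORACLE.get(ifc_type, 'CLOB')}"

print(column_ddl("GlobalId", "STRING"))  # GlobalId VARCHAR2(4000)
```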

  4. A data model and database for high-resolution pathology analytical image informatics

    PubMed Central

    Wang, Fusheng; Kong, Jun; Cooper, Lee; Pan, Tony; Kurc, Tahsin; Chen, Wenjin; Sharma, Ashish; Niedermayr, Cristobal; Oh, Tae W; Brat, Daniel; Farris, Alton B; Foran, David J; Saltz, Joel

    2011-01-01

    Background: The systematic analysis of imaged pathology specimens often results in a vast amount of morphological information at both the cellular and sub-cellular scales. While microscopy scanners and computerized analysis are capable of capturing and analyzing data rapidly, microscopy image data remain underutilized in research and clinical settings. One major obstacle which tends to reduce wider adoption of these new technologies throughout the clinical and scientific communities is the challenge of managing, querying, and integrating the vast amounts of data resulting from the analysis of large digital pathology datasets. This paper presents a data model, which addresses these challenges, and demonstrates its implementation in a relational database system. Context: This paper describes a data model, referred to as Pathology Analytic Imaging Standards (PAIS), and a database implementation, which are designed to support the data management and query requirements of detailed characterization of micro-anatomic morphology through many interrelated analysis pipelines on whole-slide images and tissue microarrays (TMAs). Aims: (1) Development of a data model capable of efficiently representing and storing virtual slide related image, annotation, markup, and feature information. (2) Development of a database, based on the data model, capable of supporting queries for data retrieval based on analysis and image metadata, queries for comparison of results from different analyses, and spatial queries on segmented regions, features, and classified objects. Settings and Design: The work described in this paper is motivated by the challenges associated with characterization of micro-scale features for comparative and correlative analyses involving whole-slides tissue images and TMAs. Technologies for digitizing tissues have advanced significantly in the past decade. Slide scanners are capable of producing high-magnification, high-resolution images from whole slides and TMAs

  6. Data-mining analysis of the global distribution of soil carbon in observational databases and Earth system models

    NASA Astrophysics Data System (ADS)

    Hashimoto, Shoji; Nanko, Kazuki; Ťupek, Boris; Lehtonen, Aleksi

    2017-03-01

    Future climate change will dramatically change the carbon balance in the soil, and this change will affect the terrestrial carbon stock and the climate itself. Earth system models (ESMs) are used to understand the current climate and to project future climate conditions, but the soil organic carbon (SOC) stocks simulated by ESMs and those in observational databases are not well correlated when the two are compared at fine grid scales. However, the specific key processes and factors, as well as the relationships among these factors that govern the SOC stock, remain unclear; the inclusion of such missing information would improve the agreement between modeled and observational data. In this study, we sought to identify the influential factors that govern global SOC distribution in observational databases, as well as those simulated by ESMs. We used a data-mining (machine-learning) scheme, boosted regression trees (BRT), to identify the factors affecting the SOC stock. We applied the BRT scheme to three observational databases and 15 ESM outputs from the fifth phase of the Coupled Model Intercomparison Project (CMIP5) and examined the effects of 13 variables/factors categorized into five groups (climate, soil property, topography, vegetation, and land-use history). Globally, the contributions of mean annual temperature, clay content, carbon-to-nitrogen (CN) ratio, wetland ratio, and land cover were high in the observational databases, whereas the contributions of mean annual temperature, land cover, and net primary productivity (NPP) were predominant in the SOC distribution in ESMs. A comparison of the influential factors at a global scale revealed that the most distinct differences between the SOCs from the observational databases and the ESMs were the low clay content and CN ratio contributions and the high NPP contribution in the ESMs. The results of this study will aid in identifying the causes of the current mismatches between observational SOC databases and ESM outputs.
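
    A minimal sketch of a BRT-style analysis (here with scikit-learn's gradient boosting rather than the R gbm stack such studies typically use): fit boosted trees to SOC with environmental covariates, then inspect relative variable influences. The data and column names are placeholders, not the study's inputs.

```python
# Boosted-tree variable-influence sketch with synthetic covariates.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))            # stand-ins for MAT, clay %, C:N ratio
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

brt = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05)
brt.fit(X, y)
# Relative influence of each covariate on the fitted SOC surrogate:
print(dict(zip(["MAT", "clay", "CN_ratio"], brt.feature_importances_)))
```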

  7. Modeling Virtual Organization Architecture with the Virtual Organization Breeding Methodology

    NASA Astrophysics Data System (ADS)

    Paszkiewicz, Zbigniew; Picard, Willy

    While Enterprise Architecture Modeling (EAM) methodologies are becoming more and more popular, an EAM methodology tailored to the needs of virtual organizations (VO) is still to be developed. Among the most popular EAM methodologies, TOGAF has been chosen as the basis for a new EAM methodology taking into account the characteristics of VOs presented in this paper. In this new methodology, referred to as the Virtual Organization Breeding Methodology (VOBM), concepts developed within the ECOLEAD project, e.g. the concept of a Virtual Breeding Environment (VBE) or the VO creation schema, serve as fundamental elements for the development of VOBM. VOBM is a generic methodology that should be adapted to a given VBE. VOBM defines the structure of VBE and VO architectures in a service-oriented environment, as well as an architecture development method for virtual organizations (ADM4VO). Finally, a preliminary set of tools and methods for VOBM is given in this paper.

  8. Modeling personnel turnover in the parametric organization

    NASA Technical Reports Server (NTRS)

    Dean, Edwin B.

    1991-01-01

    A model is developed for simulating the dynamics of a newly formed organization, credible during all phases of organizational development. The model development process is broken down into the activities of determining the tasks required for parametric cost analysis (PCA), determining the skills required for each PCA task, determining the skills available in the applicant marketplace, determining the structure of the model, implementing the model, and testing it. The model, parameterized by the likelihood of job function transition, has demonstrated the capability to represent the transition of personnel across functional boundaries within a parametric organization using a linear dynamical system, and the ability to predict the staffing profiles required to meet functional needs at the desired time. The model can be extended by revisions of the state and transition structure to provide refinements in functional definition for the parametric and extended organization.

  10. A New Global River Network Database for Macroscale Hydrologic modeling

    SciTech Connect

    Wu, Huan; Kimball, John S.; Li, Hongyi; Huang, Maoyi; Leung, Lai-Yung R.; Adler, Robert F.

    2012-09-28

    Coarse resolution (upscaled) river networks are critical inputs for runoff routing in macroscale hydrologic models. Recently, Wu et al. (2011) developed a hierarchical Dominant River Tracing (DRT) algorithm for automated extraction and spatial upscaling of basin flow directions and river networks using fine-scale hydrography inputs (e.g., flow direction, river networks, and flow accumulation). The DRT was initially applied using HYDRO1K baseline fine-scale hydrography inputs and the resulting upscaled global hydrography maps were produced at several spatial scales, and verified against other available regional and global datasets. New baseline fine-scale hydrography data from HydroSHEDS are now available for many regions and provide superior scale and quality relative to HYDRO1K. However, HydroSHEDS does not cover regions above 60°N. In this study, we applied the DRT algorithms using combined HydroSHEDS and HYDRO1K global fine-scale hydrography inputs, and produced a new series of upscaled global river network data at multiple (1/16° to 2°) spatial resolutions in a consistent (WGS84) projection. The new upscaled river networks are internally consistent and congruent with the baseline fine-scale inputs. The DRT results preserve baseline fine-scale river networks independent of spatial scales, with consistency in river network, basin shape, basin area, river length, and basin internal drainage structure between upscaled and baseline fine-scale hydrography. These digital data are available online for public access (ftp://ftp.ntsg.umt.edu/pub/data/DRT/) and should facilitate improved regional to global scale hydrological simulations, including runoff routing and river discharge calculations.

  11. Cardiac Electromechanical Models: From Cell to Organ

    PubMed Central

    Trayanova, Natalia A.; Rice, John Jeremy

    2011-01-01

    The heart is a multiphysics and multiscale system that has driven the development of the most sophisticated mathematical models at the frontiers of computational physiology and medicine. This review focuses on electromechanical (EM) models of the heart from the molecular level of myofilaments to anatomical models of the organ. Because of the coupling in terms of function and emergent behaviors at each level of biological hierarchy, separation of behaviors at a given scale is difficult. Here, a separation is drawn at the cell level so that the first half addresses subcellular/single-cell models and the second half addresses organ models. At the subcellular level, myofilament models represent actin–myosin interaction and Ca-based activation. The discussion of specific models emphasizes the roles of cooperative mechanisms and sarcomere length dependence of contraction force, considered to be the cellular basis of the Frank–Starling law. A model of electrophysiology and Ca handling can be coupled to a myofilament model to produce an EM cell model, and representative examples are summarized to provide an overview of the progression of the field. The second half of the review covers organ-level models that require solution of the electrical component as a reaction–diffusion system and the mechanical component, in which active tension generated by the myocytes produces deformation of the organ as described by the equations of continuum mechanics. As outlined in the review, different organ-level models have chosen to use different ionic and myofilament models depending on the specific application; this choice has been largely dictated by compromises between model complexity and computational tractability. The review also addresses application areas of EM models such as cardiac resynchronization therapy and the role of mechano-electric coupling in arrhythmias and defibrillation. PMID:21886622

  13. Query Monitoring and Analysis for Database Privacy - A Security Automata Model Approach.

    PubMed

    Kumar, Anand; Ligatti, Jay; Tu, Yi-Cheng

    2015-11-01

    Privacy and usage restriction issues are important when valuable data are exchanged or acquired by different organizations. Standard access control mechanisms either restrict or completely grant access to valuable data. On the other hand, data obfuscation limits the overall usability and may result in loss of total value. There are no standard policy enforcement mechanisms for data acquired through mutual and copyright agreements. In practice, many different types of policies can be enforced in protecting data privacy. Hence there is the need for an unified framework that encapsulates multiple suites of policies to protect the data. We present our vision of an architecture named security automata model (SAM) to enforce privacy-preserving policies and usage restrictions. SAM analyzes the input queries and their outputs to enforce various policies, liberating data owners from the burden of monitoring data access. SAM allows administrators to specify various policies and enforces them to monitor queries and control the data access. Our goal is to address the problems of data usage control and protection through privacy policies that can be defined, enforced, and integrated with the existing access control mechanisms using SAM. In this paper, we lay out the theoretical foundation of SAM, which is based on an automata named Mandatory Result Automata. We also discuss the major challenges of implementing SAM in a real-world database environment as well as ideas to meet such challenges.
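
    A security automaton for query monitoring can be pictured as a small state machine that transitions on each query result and blocks once a policy would be violated. The sketch below is an invented toy policy (a cumulative row-disclosure cap), not SAM's actual Mandatory Result Automata formalism.

```python
# Toy security-automaton query monitor: state = cumulative rows disclosed.
class QueryMonitor:
    def __init__(self, max_rows: int):
        self.max_rows = max_rows   # policy: total rows a user may retrieve
        self.disclosed = 0         # automaton state

    def check(self, result_rows: int) -> bool:
        """Transition on each query result; False = suppress the result."""
        if self.disclosed + result_rows > self.max_rows:
            return False           # policy violation: enforcement blocks
        self.disclosed += result_rows
        return True

monitor = QueryMonitor(max_rows=1000)
assert monitor.check(400) and monitor.check(500)
assert not monitor.check(200)  # would exceed the cap; blocked
```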

  14. Can simple population genetic models reconcile partial match frequencies observed in large forensic databases?

    PubMed

    Mueller, Laurence D

    2008-08-01

    A recent study of partial matches in the Arizona offender database of DNA profiles has revealed a large number of nine and ten locus matches. I use simple models that incorporate the product rule, population substructure, and relatedness to predict the expected number of matches in large databases. I find that there is a relatively narrow window of parameter values that can plausibly describe the Arizona results. Further research could help determine if the Arizona samples are congruent with some of the models presented here or whether fundamental assumptions for predicting these match frequencies requires adjustments.
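
    The back-of-the-envelope version of such a calculation: with n profiles there are n(n-1)/2 pairwise comparisons, and the expected number of k-locus matches is that count times the per-pair match probability. Both numbers in the example below are illustrative assumptions, not estimates from the paper.

```python
# Expected pairwise matches in a database of n profiles.
from math import comb

def expected_matches(n_profiles: int, p_pair_match: float) -> float:
    """n-choose-2 pairwise comparisons times the per-pair match probability."""
    return comb(n_profiles, 2) * p_pair_match

# e.g. 65,000 profiles and an assumed 9-locus match probability of 1e-7:
print(expected_matches(65_000, 1e-7))  # about 211 expected matching pairs
```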

  15. Model Organisms for Studying the Cell Cycle.

    PubMed

    Tang, Zhaohua

    2016-01-01

    Regulation of the cell-division cycle is fundamental for the growth, development, and reproduction of all species of life. In the past several decades, a conserved theme of cell cycle regulation has emerged from research in diverse model organisms. A comparison of distinct features of several diverse model organisms commonly used in cell cycle studies highlights their suitability for various experimental approaches, and recaptures their contributions to our current understanding of the eukaryotic cell cycle. A historical perspective recalls the breakthroughs by which scientists working with diverse model organisms uncovered the universal principles of cell cycle control, giving an appreciation of the discovery pathways in this field. A comprehensive understanding is necessary to address current challenging questions about cell cycle control. Advances in genomics, proteomics, quantitative methodologies, and approaches of systems biology are redefining the traditional concept of what constitutes a model organism and have established a new era for the development of novel model organisms and the refinement of established ones. Researchers working in the field are no longer separated by their favorite model organisms; they have become more integrated into a larger community for gaining greater insights into how a cell divides and cycles. The new technologies provide a broad evolutionary spectrum of the cell-division cycle and allow informative comparisons among different species at a level that has never been possible, exerting unimaginable impact on our comprehensive understanding of cell cycle regulation.

  16. A Relational Database Model and Data Migration Plan for the Student Services Department at the Marine Corps Institute

    DTIC Science & Technology

    1997-09-01

    response to MCI's request. It investigates data modeling and database design using the Integration Definition for Information Modeling (IDEF1X) methodology and the relational model. It also addresses the migration of data and databases from legacy to open systems. The application of the IDEF1X model…

  17. Empirical evaluation of analytical models for parallel relational data-base queries. Master's thesis

    SciTech Connect

    Denham, M.C.

    1990-12-01

    This thesis documents the design and implementation of three parallel join algorithms to be used in the verification of analytical models developed by Kearns. Kearns developed a set of analytical models for a variety of relational database queries. These models serve as tools for the design of parallel relational database systems. Each of Kearns' models is classified as either single step or multiple step. The single step models reflect queries that require only one operation, while the multiple step models reflect queries that require multiple operations. Three parallel join algorithms were implemented based upon Kearns' models. Two are based upon single step join models and one is based upon a multiple step join model. They are implemented on an Intel iPSC/1 parallel computer. The single step join algorithms include the parallel nested-loop join and the bucket (or hash) join. The multiple step algorithm that was implemented is a pipelined version of the bucket join. The results show that, within the constraints of the test cases run, all three models are accurate to within about 8.5%, and they should prove useful in the design of parallel relational database systems.
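
    A single-process sketch of the bucket (hash) join idea: both relations are partitioned by a hash of the join key, and each bucket pair can be joined independently (on a machine like the iPSC/1, by a separate processor). This illustrates the general algorithm only; it is not the thesis code.

```python
# Bucket (hash) join: partition by hash of the join key, probe per bucket.
from collections import defaultdict

def hash_join(r, s, key_r, key_s, n_buckets=4):
    buckets_r = defaultdict(list)
    for row in r:
        buckets_r[hash(row[key_r]) % n_buckets].append(row)
    out = []
    for row in s:
        b = hash(row[key_s]) % n_buckets
        # Probe only the matching bucket instead of scanning all of r.
        out.extend({**m, **row} for m in buckets_r[b] if m[key_r] == row[key_s])
    return out

r = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}]
s = [{"id": 2, "y": "c"}]
print(hash_join(r, s, "id", "id"))  # [{'id': 2, 'x': 'b', 'y': 'c'}]
```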

  18. High rate of unemployment after liver transplantation: analysis of the United Network for Organ Sharing database.

    PubMed

    Huda, Amina; Newcomer, Robert; Harrington, Charlene; Blegen, Mary G; Keeffe, Emmet B

    2012-01-01

    The goal of liver transplantation (LT) is to maximize the length and quality of a patient's life and facilitate his or her return to full productivity. The aims of this study were (1) to use the United Network for Organ Sharing (UNOS) data set to determine the proportions of recipients who were employed and unemployed within 24 months after LT between 2002 and 2008 and (2) to examine the factors associated with a return to employment. UNOS data that were collected since the adoption of the Model for End-Stage Liver Disease scoring system on February 27, 2002 were analyzed. There were 21,942 transplant recipients who met the inclusion criteria. The employment status of the recipients was analyzed within a 60-day window at the following times after transplantation: 6, 12, and 24 months. Approximately one-quarter of the LT recipients (5360 or 24.4%) were employed within 24 months after transplantation, and the remaining recipients had not returned to work. The demographic variables that were independently associated with posttransplant employment included an age of 18 to 40 years, male sex, a college degree, Caucasian race, and pretransplant employment. Patients with alcoholic liver disease had a significantly lower rate of employment than patients with other etiologies of liver disease. The recipients who were employed after transplantation had significantly better functional status than those who were not employed. In conclusion, the employment rate after LT is low, with only one-quarter of LT recipients employed. New national and individual transplant program policies are needed to assess the root causes of unemployment in recipients who wish to work after LT. Copyright © 2011 American Association for the Study of Liver Diseases.

  19. Data model and relational database design for the New Jersey Water-Transfer Data System (NJWaTr)

    USGS Publications Warehouse

    Tessler, Steven

    2003-01-01

    The New Jersey Water-Transfer Data System (NJWaTr) is a database design for the storage and retrieval of water-use data. NJWaTr can manage data encompassing many facets of water use, including (1) the tracking of various types of water-use activities (withdrawals, returns, transfers, distributions, consumptive-use, wastewater collection, and treatment); (2) the storage of descriptions, classifications and locations of places and organizations involved in water-use activities; (3) the storage of details about measured or estimated volumes of water associated with water-use activities; and (4) the storage of information about data sources and water resources associated with water use. In NJWaTr, each water transfer occurs unidirectionally between two site objects, and the sites and conveyances form a water network. The core entities in the NJWaTr model are site, conveyance, transfer/volume, location, and owner. Other important entities include water resource (used for withdrawals and returns), data source, permit, and alias. Multiple water-exchange estimates based on different methods or data sources can be stored for individual transfers. Storage of user-defined details is accommodated for several of the main entities. Many tables contain classification terms to facilitate the detailed description of data items and can be used for routine or custom data summarization. NJWaTr accommodates single-user and aggregate-user water-use data, can be used for large or small water-network projects, and is available as a stand-alone Microsoft Access database. Data stored in the NJWaTr structure can be retrieved in user-defined combinations to serve visualization and analytical applications. Users can customize and extend the database, link it to other databases, or implement the design in other relational database applications.

  20. Modeling and implementing a database on drugs into a hospital intranet.

    PubMed

    François, M; Joubert, M; Fieschi, D; Fieschi, M

    1998-09-01

    Our objective was to develop a drug information service by implementing a database on drugs in our university hospitals information system. Thériaque is a database, maintained by a group of pharmacists and physicians, covering all the drugs available in France. Before its implementation we modeled its content (chemical classes, active components, excipients, indications, contra-indications, side effects, and so on) according to an object-oriented method. Then we designed HTML pages whose layout reflects the structure of the classes of objects in the model. Fields in the pages are dynamically filled with the results of queries to a relational database in which information on drugs is stored. This allowed a fast implementation and meant that no client application had to be ported to the thousands of workstations on the network. The interface provides end-users with an easy-to-use and natural way to access drug-related information in an internet environment.
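
    The pattern described — HTML pages whose fields are filled at request time from relational queries — reduces to a few lines. The schema and template below are invented for illustration; Thériaque's actual object model is far richer.

```python
# Fill an HTML template from a relational query (toy schema and template).
import sqlite3
from string import Template

PAGE = Template(
    "<html><body><h1>$name</h1><p>Indications: $indications</p></body></html>"
)

def drug_page(con: sqlite3.Connection, drug_id: int) -> str:
    row = con.execute(
        "SELECT name, indications FROM drug WHERE id = ?", (drug_id,)
    ).fetchone()
    return PAGE.substitute(name=row[0], indications=row[1])
```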

  1. Crystal Plasticity Modeling of Microstructure Evolution and Mechanical Fields During Processing of Metals Using Spectral Databases

    NASA Astrophysics Data System (ADS)

    Knezevic, Marko; Kalidindi, Surya R.

    2017-02-01

    This article reviews the advances made in the development and implementation of a novel approach to speeding up crystal plasticity simulations of metal processing by one to three orders of magnitude when compared with the conventional approaches, depending on the specific details of implementation. This is mainly accomplished through the use of spectral crystal plasticity (SCP) databases grounded in the compact representation of the functions central to crystal plasticity computations. A key benefit of the databases is that they allow for a noniterative retrieval of constitutive solutions for any arbitrary plastic stretching tensor (i.e., deformation mode) imposed on a crystal of arbitrary orientation. The article emphasizes the latest developments in terms of embedding SCP databases within implicit finite elements. To illustrate the potential of these novel implementations, the results from several process modeling applications including equal-channel angular extrusion and rolling are presented and compared with experimental measurements and predictions from other models.
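
    As a toy one-dimensional analogue of such a spectral database: an expensive response function is sampled once offline, compressed into a few Fourier coefficients, and thereafter evaluated noniteratively for any query orientation. The real SCP databases use generalized spherical harmonic bases over orientation space; this sketch only illustrates the precompute-then-retrieve pattern:

        import numpy as np

        # offline: sample an expensive constitutive response on a periodic grid
        N = 256
        theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
        response = np.cos(3 * theta) + 0.2 * np.sin(7 * theta)  # stand-in output

        coeffs = np.fft.rfft(response)[:16] / N   # compact "spectral database"

        def retrieve(q):
            # noniterative evaluation of the truncated Fourier series at angle q
            k = np.arange(1, len(coeffs))
            return np.real(coeffs[0]) + 2 * np.sum(
                np.real(coeffs[1:]) * np.cos(k * q)
                - np.imag(coeffs[1:]) * np.sin(k * q))

        print(retrieve(0.4), np.cos(3 * 0.4) + 0.2 * np.sin(7 * 0.4))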

  3. Hydrologic Derivatives for Modeling and Analysis—A new global high-resolution database

    USGS Publications Warehouse

    Verdin, Kristine L.

    2017-07-17

    The U.S. Geological Survey has developed a new global high-resolution hydrologic derivative database. Loosely modeled on the HYDRO1k database, this new database, entitled Hydrologic Derivatives for Modeling and Analysis, provides comprehensive and consistent global coverage of topographically derived raster layers (digital elevation model data, flow direction, flow accumulation, slope, and compound topographic index) and vector layers (streams and catchment boundaries). The coverage of the data is global, and the underlying digital elevation model is a hybrid of three datasets: HydroSHEDS (Hydrological data and maps based on SHuttle Elevation Derivatives at multiple Scales), GMTED2010 (Global Multi-resolution Terrain Elevation Data 2010), and the SRTM (Shuttle Radar Topography Mission). For most of the globe south of 60°N., the raster resolution of the data is 3 arc-seconds, corresponding to the resolution of the SRTM. For the areas north of 60°N., the resolution is 7.5 arc-seconds (the highest resolution of the GMTED2010 dataset) except for Greenland, where the resolution is 30 arc-seconds. The streams and catchments are attributed with Pfafstetter codes, based on a hierarchical numbering system, that carry important topological information. This database is appropriate for use in continental-scale modeling efforts. The work described in this report was conducted by the U.S. Geological Survey in cooperation with the National Aeronautics and Space Administration Goddard Space Flight Center.

  4. Modeling Personnel Turnover in the Parametric Organization

    NASA Technical Reports Server (NTRS)

    Dean, Edwin B.

    1991-01-01

    A primary issue in organizing a new parametric cost analysis function is to determine the skill mix and number of personnel required. The skill mix can be obtained by a functional decomposition of the tasks required within the organization and a matrixed correlation with educational or experience backgrounds. The number of personnel is a function of the skills required to cover all tasks, personnel skill background and cross training, the intensity of the workload for each task, migration through various tasks by personnel along a career path, personnel hiring limitations imposed by management and the applicant marketplace, personnel training limitations imposed by management and personnel capability, and the rate at which personnel leave the organization for whatever reason. Faced with the task of relating all of these organizational facets in order to grow a parametric cost analysis (PCA) organization from scratch, it was decided that a dynamic model was required in order to account for the obvious dynamics of the forming organization. The challenge was to create a simple model that would remain credible during all phases of organizational development. The model development process was broken down into the activities of determining the tasks required for PCA, determining the skills required for each PCA task, determining the skills available in the applicant marketplace, determining the structure of the dynamic model, implementing the dynamic model, and testing the dynamic model.
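
    A minimal discrete-time sketch of the kind of dynamic model described, tracking trainee and skilled headcounts under hiring, training, and attrition flows; all rates are illustrative assumptions, not values from the study:

        # monthly headcount flows; all rates are illustrative assumptions
        months = 60
        hire_rate = 2.0            # hires per month (management limit)
        train_rate = 1.0 / 6.0     # ~6-month training pipeline
        attrition = 0.02           # 2% of staff leave per month

        trainees, skilled = 0.0, 0.0
        for t in range(months):
            graduating = train_rate * trainees
            trainees += hire_rate - graduating - attrition * trainees
            skilled += graduating - attrition * skilled

        print("skilled analysts after %d months: %.1f" % (months, skilled))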

  5. Condensing Organic Aerosols in a Microphysical Model

    NASA Astrophysics Data System (ADS)

    Gao, Y.; Tsigaridis, K.; Bauer, S.

    2015-12-01

    The condensation of organic aerosols is represented in a newly developed box-model scheme, and its effects on the growth and composition of particles are examined. We implemented the volatility-basis set (VBS) framework into the aerosol mixing state resolving microphysical scheme Multiconfiguration Aerosol TRacker of mIXing state (MATRIX). This new scheme advances the representation of organic aerosols in models in that, contrary to the traditional treatment of organic aerosols as non-volatile in most climate models and in the original version of MATRIX, it treats them as semi-volatile. Such treatment is important because low-volatility organics contribute significantly to the growth of particles. The new scheme includes several classes of semi-volatile organic compounds from the VBS framework that can partition among aerosol populations in MATRIX, thus representing the growth of particles via condensation of low-volatility organic vapors. Results from test cases representing Mexico City and Finnish forest conditions show good representation of the time evolution of concentrations for VBS species in the gas phase and in the condensed particulate phase. Emitted semi-volatile primary organic aerosols evaporate almost completely in the high-volatility range, and they condense more efficiently in the low-volatility range.
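
    The core of the VBS treatment is a small fixed-point problem: each volatility bin i partitions to the particle phase with fraction xi_i = 1/(1 + C*_i/C_OA), while the organic aerosol mass C_OA = sum_i xi_i C_i itself depends on those fractions. A sketch of the standard VBS partitioning calculation; the bin values are illustrative, and this is the generic formula rather than the MATRIX implementation:

        import numpy as np

        c_star = np.array([0.1, 1.0, 10.0, 100.0])  # saturation conc. (ug/m3)
        c_tot = np.array([0.5, 2.0, 4.0, 8.0])      # gas+particle total per bin

        c_oa = 1.0                                  # initial guess (ug/m3)
        for _ in range(100):
            xi = 1.0 / (1.0 + c_star / c_oa)        # particle-phase fraction
            c_oa_new = float(np.sum(xi * c_tot))
            if abs(c_oa_new - c_oa) < 1e-9:         # converged fixed point
                break
            c_oa = c_oa_new

        print("C_OA = %.3f ug/m3, bin fractions:" % c_oa, np.round(xi, 3))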

  6. A genome-scale metabolic flux model of Escherichia coli K–12 derived from the EcoCyc database

    PubMed Central

    2014-01-01

    advantages can be derived from the combination of model organism databases and flux balance modeling represented by MetaFlux. Interpretation of the EcoCyc database as a flux balance model results in a highly accurate metabolic model and provides a rigorous consistency check for information stored in the database. PMID:24974895
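
    The flux balance computation at the heart of such a model reduces to a linear program: choose fluxes v that maximize a biomass objective subject to the steady-state balance S·v = 0 and capacity bounds. A toy sketch with scipy; the three-reaction network is invented for illustration, not drawn from EcoCyc:

        import numpy as np
        from scipy.optimize import linprog

        # toy network: uptake -> v2 -> biomass; rows are metabolites A and B
        S = np.array([[1.0, -1.0, 0.0],    # A: made by uptake, used by v2
                      [0.0, 1.0, -1.0]])   # B: made by v2, used by biomass
        c = np.array([0.0, 0.0, 1.0])      # objective: biomass flux v3

        res = linprog(-c,                  # linprog minimizes, so negate
                      A_eq=S, b_eq=np.zeros(2),
                      bounds=[(0, 10), (0, None), (0, None)])
        print("optimal biomass flux:", -res.fun, "fluxes:", res.x)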

  7. A virtual observatory for photoionized nebulae: the Mexican Million Models database (3MdB).

    NASA Astrophysics Data System (ADS)

    Morisset, C.; Delgado-Inglada, G.; Flores-Fajardo, N.

    2015-04-01

    Photoionization models obtained with numerical codes are widely used to study the physics of the interstellar medium (planetary nebulae, HII regions, etc.). Grids of models are computed to understand the effects of the different parameters used to describe these regions on the observables (mainly emission line intensities). Most of the time, only a small part of the computed results of such grids is published, and the results are sometimes hard to obtain in a user-friendly format. We present here the Mexican Million Models dataBase (3MdB), an effort to resolve both of these issues in the form of a database of photoionization models, easily accessible through the MySQL protocol and containing many useful outputs from the models, such as the intensities of 178 emission lines and the ionic fractions of all the ions. Some examples of the use of the 3MdB are also presented.
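
    Because the abstract notes access through the MySQL protocol, a query against the database might look like the following; the host, credentials, table, and column names are placeholders, not the documented 3MdB connection details:

        import pymysql  # assumes the PyMySQL client library is installed

        conn = pymysql.connect(host="3mdb.example.org",  # placeholder host
                               user="guest", password="guest",
                               database="3MdB")
        with conn.cursor() as cur:
            # hypothetical columns: an emission-line ratio cut over the grid
            cur.execute("SELECT model_id, OIII_5007, H_beta FROM models "
                        "WHERE OIII_5007 / H_beta > %s LIMIT 10", (3.0,))
            for row in cur.fetchall():
                print(row)
        conn.close()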

  8. Modeling the High Speed Research Cycle 2B Longitudinal Aerodynamic Database Using Multivariate Orthogonal Functions

    NASA Technical Reports Server (NTRS)

    Morelli, E. A.; Proffitt, M. S.

    1999-01-01

    The data for longitudinal non-dimensional, aerodynamic coefficients in the High Speed Research Cycle 2B aerodynamic database were modeled using polynomial expressions identified with an orthogonal function modeling technique. The discrepancy between the tabular aerodynamic data and the polynomial models was tested and shown to be less than 15 percent for drag, lift, and pitching moment coefficients over the entire flight envelope. Most of this discrepancy was traced to smoothing local measurement noise and to the omission of mass case 5 data in the modeling process. A simulation check case showed that the polynomial models provided a compact and accurate representation of the nonlinear aerodynamic dependencies contained in the HSR Cycle 2B tabular aerodynamic database.
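
    The underlying idea, fitting tabular aerodynamic data with multivariate polynomial terms and checking the worst-case discrepancy, can be sketched with a generic least-squares fit; the actual technique selects orthogonal terms incrementally by a predicted-error criterion, and the data below are synthetic:

        import numpy as np

        # toy table: lift coefficient vs. angle of attack and Mach number
        alpha, mach = np.meshgrid(np.linspace(0, 10, 11),
                                  np.linspace(0.3, 0.9, 7))
        CL = 0.1 * alpha + 0.02 * alpha * mach + 0.01 * mach ** 2

        # candidate polynomial terms (1, a, m, a*m, m^2), least-squares fit
        A = np.column_stack([np.ones(alpha.size), alpha.ravel(), mach.ravel(),
                             (alpha * mach).ravel(), (mach ** 2).ravel()])
        coef, *_ = np.linalg.lstsq(A, CL.ravel(), rcond=None)

        rel_err = np.abs(A @ coef - CL.ravel()).max() / np.abs(CL).max()
        print("max relative discrepancy: %.2e" % rel_err)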

  9. Integrated Functional and Executional Modelling of Software Using Web-Based Databases

    NASA Technical Reports Server (NTRS)

    Kulkarni, Deepak; Marietta, Roberta

    1998-01-01

    NASA's software subsystems undergo extensive modification and updates over their operational lifetimes. It is imperative that modified software satisfy safety goals. This report discusses the difficulties encountered in doing so and presents a solution based on integrated modelling of software, use of automatic information extraction tools, web technology, and databases.

  10. A Transaction Workload Model and its Application to a Tactical C3 Distributed Database System

    DTIC Science & Technology

    1980-07-01

    34IANI’.LP MISSiON PLANNING FRAG ENIR’Y REVIEW FLIGHT SCHEDULING GERATION MONITORING IMMEDIATC MISSION PLANNING OPERATION ADOJS1MENT DAI A MAINTSNANCE 0 2...the performance and design of a distributed database system. The effect certaiftly warrants future investigation to which our model is we]i suited

  11. The Subject-Object Relationship Interface Model in Database Management Systems.

    ERIC Educational Resources Information Center

    Yannakoudakis, Emmanuel J.; Attar-Bashi, Hussain A.

    1989-01-01

    Describes a model that displays structures necessary to map between the conceptual and external levels in database management systems, using an algorithm that maps the syntactic representations of tuples onto semantic representations. A technique for translating tuples into natural language sentences is introduced, and a system implemented in…

  12. A Multiscale Database of Soil Properties for Regional Environmental Quality Modeling in the Western United States

    USDA-ARS's Scientific Manuscript database

    The USDA-NRCS STATSGO regional soil database can provide generalized soil information for regional-scale modeling, planning and management of soil and water conservation, and assessment of environmental quality. However, the data available in STATSGO cannot be readily extracted or parameterized to...

  13. Selecting representative model micro-organisms

    PubMed Central

    Holland, BR; Schmid, J

    2005-01-01

    Background Micro-biological research relies on the use of model organisms that act as representatives of their species or subspecies; these are frequently well-characterized laboratory strains. However, it has often become apparent that the model strain initially chosen does not represent important features of the species. For micro-organisms, the diversity of their genomes is such that even the best possible choice of initial strain for sequencing may not assure that the genome obtained adequately represents the species. To acquire information about a species' genome as efficiently as possible, we require a method to choose strains for analysis on the basis of how well they represent the species. Results We develop the Best Total Coverage (BTC) method for selecting one or more representative model organisms from a group of interest, given that rough genetic distances between the members of the group are known. Software implementing a "greedy" version of the method can be used with large data sets; its effectiveness was tested using both constructed and biological data sets. Conclusion In both the simulated and biological examples the greedy-BTC method outperformed random selection of model organisms, and for two biological examples it outperformed selection of model strains based on phylogenetic structure. Although the method was designed with microbial species in mind, and is tested here on three microbial data sets, it will also be applicable to other types of organism.
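
    A sketch of greedy representative selection in the spirit of the BTC method: given pairwise genetic distances, repeatedly add the strain that most reduces the total distance from every strain to its nearest chosen representative. The published coverage score may differ in detail; this shows the greedy structure:

        import numpy as np

        def greedy_representatives(D, k):
            """D: symmetric distance matrix; returns indices of k representatives."""
            n = len(D)
            chosen = []
            nearest = np.full(n, np.inf)   # distance to closest chosen strain
            for _ in range(k):
                # total coverage score if candidate j were added
                scores = [np.minimum(nearest, D[j]).sum() for j in range(n)]
                best = int(np.argmin(scores))
                chosen.append(best)
                nearest = np.minimum(nearest, D[best])
            return chosen

        rng = np.random.default_rng(0)
        pts = rng.random((20, 2))                     # stand-in strains
        D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        print(greedy_representatives(D, 3))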

  14. modbase, a database of annotated comparative protein structure models and associated resources

    PubMed Central

    Pieper, Ursula; Eswar, Narayanan; Webb, Ben M.; Eramian, David; Kelly, Libusha; Barkan, David T.; Carter, Hannah; Mankoo, Parminder; Karchin, Rachel; Marti-Renom, Marc A.; Davis, Fred P.; Sali, Andrej

    2009-01-01

    MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence–structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5 152 695 reliable models for domains in 1 593 209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). PMID:18948282

  15. Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17.

    PubMed

    Ruddigkeit, Lars; van Deursen, Ruud; Blum, Lorenz C; Reymond, Jean-Louis

    2012-11-26

    Drug molecules consist of a few tens of atoms connected by covalent bonds. How many such molecules are possible in total and what is their structure? This question is of pressing interest in medicinal chemistry to help solve the problems of drug potency, selectivity, and toxicity and reduce attrition rates by pointing to new molecular series. To better define the unknown chemical space, we have enumerated 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens forming the chemical universe database GDB-17, covering a size range containing many drugs and typical for lead compounds. GDB-17 contains millions of isomers of known drugs, including analogs with high shape similarity to the parent drug. Compared to known molecules in PubChem, GDB-17 molecules are much richer in nonaromatic heterocycles, quaternary centers, and stereoisomers, densely populate the third dimension in shape space, and represent many more scaffold types.

  16. PK/DB: database for pharmacokinetic properties and predictive in silico ADME models.

    PubMed

    Moda, Tiago L; Torres, Leonardo G; Carrara, Alexandre E; Andricopulo, Adriano D

    2008-10-01

    The study of pharmacokinetic (PK) properties is of great importance in drug discovery and development. In the present work, PK/DB (a new freely available database for PK) was designed with the aim of creating robust databases for pharmacokinetic studies and in silico absorption, distribution, metabolism and excretion (ADME) prediction. Comprehensive, web-based and easy to access, PK/DB manages 1203 compounds representing 2973 pharmacokinetic measurements, and includes five models for in silico ADME prediction (human intestinal absorption, human oral bioavailability, plasma protein binding, blood-brain barrier and water solubility). The database is available at http://www.pkdb.ifsc.usp.br.

  17. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  18. The LAILAPS search engine: a feature model for relevance ranking in life science databases.

    PubMed

    Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

    2010-03-25

    Efficient and effective information retrieval in the life sciences is one of the most pressing challenges in bioinformatics. The growth of life science databases into a vast network of interconnected information systems is both a major challenge and a great opportunity for life science research. The knowledge found on the Web, and in life-science databases in particular, is a valuable resource, and bringing it to the scientist's desktop requires well-performing search engines. For queries returning millions of results, the crucial factor is neither response time nor the number of hits but the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by observations of user behavior during inspection of search engine results, we condensed a set of nine relevance-discriminating features. These features are intuitively used by scientists who briefly screen database entries for potential relevance; they are both sufficient to estimate potential relevance and efficiently quantifiable. Deriving a relevance prediction function that computes relevance from these features constitutes a regression problem. To solve it, we used artificial neural networks trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
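
    The regression step, learning a relevance score from a small feature vector with an artificial neural network, can be sketched with scikit-learn; the nine LAILAPS features and their training labels are not reproduced here, so random stand-ins are used:

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(1)
        X = rng.random((200, 9))            # 9 relevance-discriminating features
        y = X @ rng.random(9)               # stand-in for curated relevance labels

        net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                           random_state=1)
        net.fit(X, y)

        query_entries = rng.random((5, 9))  # feature vectors of database hits
        ranking = np.argsort(-net.predict(query_entries))
        print("entries ranked by predicted relevance:", ranking)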

  19. The origin and evolution of model organisms

    NASA Technical Reports Server (NTRS)

    Hedges, S. Blair

    2002-01-01

    The phylogeny and timescale of life are becoming better understood as the analysis of genomic data from model organisms continues to grow. As a result, discoveries are being made about the early history of life and the origin and development of complex multicellular life. This emerging comparative framework and the emphasis on historical patterns is helping to bridge barriers among organism-based research communities.

  1. Putting "Organizations" into an Organization Theory Course: A Hybrid CAO Model for Teaching Organization Theory

    ERIC Educational Resources Information Center

    Hannah, David R.; Venkatachary, Ranga

    2010-01-01

    In this article, the authors present a retrospective analysis of an instructor's multiyear redesign of a course on organization theory into what is called a hybrid Classroom-as-Organization model. It is suggested that this new course design served to apprentice students to function in quasi-real organizational structures. The authors further argue…

  2. Putting "Organizations" into an Organization Theory Course: A Hybrid CAO Model for Teaching Organization Theory

    ERIC Educational Resources Information Center

    Hannah, David R.; Venkatachary, Ranga

    2010-01-01

    In this article, the authors present a retrospective analysis of an instructor's multiyear redesign of a course on organization theory into what is called a hybrid Classroom-as-Organization model. It is suggested that this new course design served to apprentice students to function in quasi-real organizational structures. The authors further argue…

  3. Database assessment of CMIP5 and hydrological models to determine flood risk areas

    NASA Astrophysics Data System (ADS)

    Limlahapun, Ponthip; Fukui, Hiromichi

    2016-11-01

    Water-related disasters may not be solved with a single scientific method. Based on this premise, we combined logical conceptions, linked sequential results among models, and used database applications to analyse historical and future scenarios in the context of flooding. The three main models used in this study are (1) the fifth phase of the Coupled Model Intercomparison Project (CMIP5), to derive precipitation; (2) the Integrated Flood Analysis System (IFAS), to extract the amount of discharge; and (3) the Hydrologic Engineering Center (HEC) model, to generate inundated areas. This research notably focused on integrating data regardless of system-design complexity; database approaches are flexible, manageable, and well supported for system data transfer, which makes them suitable for monitoring a flood. The resulting flood map, together with real-time stream data, can help local communities identify areas at risk of flooding in advance.
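
    The database-centred hand-off between models can be sketched as staging tables that each stage writes to and the next stage reads from; the table names and the toy rainfall-runoff factor below are assumptions, standing in for actual CMIP5, IFAS, and HEC runs:

        import sqlite3

        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE precip (cell INT, t INT, mm REAL)")      # CMIP5
        db.execute("CREATE TABLE discharge (gauge INT, t INT, m3s REAL)") # IFAS

        # placeholder hand-offs; real runs would call the external models
        db.executemany("INSERT INTO precip VALUES (?,?,?)",
                       [(1, t, 5.0 + t) for t in range(3)])
        rows = db.execute("SELECT t, SUM(mm) FROM precip GROUP BY t").fetchall()
        db.executemany("INSERT INTO discharge VALUES (?,?,?)",
                       [(1, t, 0.8 * mm) for t, mm in rows])  # toy runoff factor
        print(db.execute("SELECT * FROM discharge").fetchall())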

  4. A C library for retrieving specific reactions from the BioModels database

    PubMed Central

    Neal, M. L.; Galdzicki, M.; Gallimore, J. T.; Sauro, H. M.

    2014-01-01

    Summary: We describe libSBMLReactionFinder, a C library for retrieving specific biochemical reactions from the curated systems biology markup language models contained in the BioModels database. The library leverages semantic annotations in the database to associate reactions with human-readable descriptions, making the reactions retrievable through simple string searches. Our goal is to provide a useful tool for quantitative modelers who seek to accelerate modeling efforts through the reuse of previously published representations of specific chemical reactions. Availability and implementation: The library is open-source and dual licensed under the Mozilla Public License Version 2.0 and GNU General Public License Version 2.0. Project source code, downloads and documentation are available at http://code.google.com/p/lib-sbml-reaction-finder. Contact: mneal@uw.edu PMID:24078714

  5. Examination of the U.S. EPA’s vapor intrusion database based on models

    PubMed Central

    Yao, Yijun; Shen, Rui; Pennell, Kelly G.; Suuberg, Eric M.

    2013-01-01

    In the United States Environmental Protection Agency (U.S. EPA)’s vapor intrusion (VI) database, there appears to be a trend showing an inverse relationship between the indoor air concentration attenuation factor and the subsurface source vapor concentration. This is inconsistent with the physical understanding in current vapor intrusion models. This paper explores possible reasons for this apparent discrepancy. Soil vapor transport processes occur independently of the actual building entry process, and are consistent with the trends in the database results. A recent EPA technical report provided a list of factors affecting vapor intrusion, and the influence of some of these are explored in the context of the database results. PMID:23293835

  6. ITVO and BaSTI: databases and services for cosmological and stellar models.

    NASA Astrophysics Data System (ADS)

    Manzato, P.; Molinaro, M.; Gasparo, F.; Pasian, F.; Pietrinferni, A.; Cassisi, S.; Gheller, C.; Ameglio, S.; Murante, G.; Borgani, S.

    We have created a database structure to store the metadata of different types of cosmological simulations (Gadget, Enzo, FLY), as well as the first relational database for stellar evolution models, BaSTI, which includes tracks and isochrones computed with the FRANEC code. We are also studying the feasibility of including different sets of theory data and services in the Virtual Observatory (VObs). Examples of such services are the on-the-fly calculation of profiles of selected quantities for the simulated galaxy clusters, the preview of an object's image opened with a VObs tool, and retrieval in the standard VOTable format. Furthermore, the development of the BaSTI database serves as a use case for studying the feasibility of storing in it the output of new simulations performed using the Grid infrastructure, as demonstrated in the EU-funded VO-DCA WP5 project. All of this could be a matter of discussion between the tool developers and the users, the scientists.

  7. Modelling nitrous oxide emissions from organic soils in Europe

    NASA Astrophysics Data System (ADS)

    Leppelt, Thomas; Dechow, Rene; Gebbert, Sören; Freibauer, Annette

    2013-04-01

    Estimates of the greenhouse gas emission potential of peatland ecosystems are mandatory for a complete annual emission budget in Europe. The GHG-Europe project aims to improve the modelling capabilities for greenhouse gases such as nitrous oxide. The heterogeneous, event-driven fluxes of nitrous oxide are challenging to model at the European scale, especially with regard to upscaling and the estimation of certain parameters, so adequate techniques are needed to create a robust empirical model. To this end, a literature study of nitrous oxide fluxes from organic soils was carried out. The resulting database contains flux data from the boreal and temperate climate zones and covers the different land use categories: cropland, grassland, forest, natural sites, and peat extraction sites. Managed crop- and grassland sites in particular feature high emission potential. In general, nitrous oxide emissions increase significantly with deep drainage and intensive nitrogen fertilisation, whereas natural peatland sites with a near-surface groundwater table can act as a nitrous oxide sink. An empirical fuzzy logic model was applied to predict annual nitrous oxide emissions from organic soils. Calibration resulted in two separate models, with best performance for bogs and fens, respectively. The derived parameter combinations of these models comprise mean groundwater table, nitrogen fertilisation, annual precipitation, air temperature, carbon content, and pH value; the influence of the calibrated parameters on nitrous oxide fluxes is corroborated by several studies in the literature. The extrapolation potential was tested by cross validation, and the parameter ranges of the calibrated models were compared with the values occurring at European scale to avoid unknown systematic errors in the regionalisation. Additionally, a sensitivity analysis specifies the model behaviour for each varying parameter. The upscaling process for European peatland

  8. BriX: a database of protein building blocks for structural analysis, modeling and design

    PubMed Central

    Vanhee, Peter; Verschueren, Erik; Baeten, Lies; Stricher, Francois; Serrano, Luis

    2011-01-01

    High-resolution structures of proteins remain the most valuable source for understanding their function in the cell and provide leads for drug design. Since the availability of sufficient protein structures to tackle complex problems such as modeling backbone moves or docking remains a problem, alternative approaches using small, recurrent protein fragments have been employed. Here we present two databases that provide a vast resource for implementing such fragment-based strategies. The BriX database contains fragments from over 7000 non-homologous proteins from the Astral collection, segmented in lengths from 4 to 14 residues and clustered according to structural similarity, summing up to a content of 2 million fragments per length. To overcome the lack of loops classified in BriX, we constructed the Loop BriX database of non-regular structure elements, clustered according to end-to-end distance between the regular residues flanking the loop. Both databases are available online (http://brix.crg.es) and can be accessed through a user-friendly web-interface. For high-throughput queries a web-based API is provided, as well as full database downloads. In addition, two exciting applications are provided as online services: (i) user-submitted structures can be covered on the fly with BriX classes, representing putative structural variation throughout the protein and (ii) gaps or low-confidence regions in these structures can be bridged with matching fragments. PMID:20972210

  10. Prediction of biological targets for compounds using multiple-category Bayesian models trained on chemogenomics databases.

    PubMed

    Nidhi; Glick, Meir; Davies, John W; Jenkins, Jeremy L

    2006-01-01

    Target identification is a critical step following the discovery of small molecules that elicit a biological phenotype. The present work seeks to provide an in silico correlate of experimental target fishing technologies in order to rapidly fish out potential targets for compounds on the basis of chemical structure alone. A multiple-category Laplacian-modified naïve Bayesian model was trained on extended-connectivity fingerprints of compounds from 964 target classes in the WOMBAT (World Of Molecular BioAcTivity) chemogenomics database. The model was employed to predict the top three most likely protein targets for all MDDR (MDL Drug Database Report) database compounds. On average, the correct target was found 77% of the time for compounds from 10 MDDR activity classes with known targets. For MDDR compounds annotated with only therapeutic or generic activities such as "antineoplastic", "kinase inhibitor", or "anti-inflammatory", the model was able to systematically deconvolute the generic activities to specific targets associated with the therapeutic effect. Examples of successful deconvolution are given, demonstrating the usefulness of the tool for improving knowledge in chemogenomics databases and for predicting new targets for orphan compounds.
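
    The modeling step, a multiple-category Bayesian classifier over binary structural fingerprints that returns the top-ranked targets per compound, can be sketched with scikit-learn's Bernoulli naive Bayes with Laplace smoothing standing in for the Laplacian-modified variant; random bit-vectors stand in for extended-connectivity fingerprints and WOMBAT annotations:

        import numpy as np
        from sklearn.naive_bayes import BernoulliNB

        rng = np.random.default_rng(7)
        X = rng.integers(0, 2, size=(300, 64))   # fingerprint bit-vectors
        y = rng.integers(0, 5, size=300)         # 5 stand-in target classes

        clf = BernoulliNB(alpha=1.0)             # Laplace-smoothed Bayes
        clf.fit(X, y)

        probs = clf.predict_proba(rng.integers(0, 2, size=(1, 64)))[0]
        top3 = np.argsort(-probs)[:3]            # three most likely targets
        print("top-3 predicted target classes:", top3)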

  11. LegumeIP: an integrative database for comparative genomics and transcriptomics of model legumes.

    PubMed

    Li, Jun; Dai, Xinbin; Liu, Tingsong; Zhao, Patrick Xuechun

    2012-01-01

    Legumes play a vital role in maintaining the nitrogen cycle of the biosphere. They conduct symbiotic nitrogen fixation through endosymbiotic relationships with bacteria in root nodules. However, this and other characteristics of legumes, including mycorrhization, compound leaf development and profuse secondary metabolism, are absent in the typical model plant Arabidopsis thaliana. We present LegumeIP (http://plantgrn.noble.org/LegumeIP/), an integrative database for comparative genomics and transcriptomics of model legumes, for studying gene function and genome evolution in legumes. LegumeIP compiles gene and gene family information, syntenic and phylogenetic context and tissue-specific transcriptomic profiles. The database holds the genomic sequences of three model legumes, Medicago truncatula, Glycine max and Lotus japonicus, plus two reference plant species, A. thaliana and Populus trichocarpa, with annotations based on UniProt, InterProScan, Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes databases. LegumeIP also contains large-scale microarray and RNA-Seq-based gene expression data. Our new database is capable of systematic synteny analysis across M. truncatula, G. max, L. japonicus and A. thaliana, as well as construction and phylogenetic analysis of gene families across the five hosted species. Finally, LegumeIP provides comprehensive search and visualization tools that enable flexible queries based on gene annotation, gene family, synteny and relative gene expression.

  12. A database and tool for boundary conditions for regional air quality modeling: description and evaluation

    NASA Astrophysics Data System (ADS)

    Henderson, B. H.; Akhtar, F.; Pye, H. O. T.; Napelenok, S. L.; Hutzell, W. T.

    2013-09-01

    Transported air pollutants receive increasing attention as regulations tighten and global concentrations increase. The need to represent international transport in regional air quality assessments requires improved representation of boundary concentrations. Currently available observations are too sparse vertically to provide boundary information, particularly for ozone precursors, but global simulations can be used to generate spatially and temporally varying Lateral Boundary Conditions (LBC). This study presents a public database of global simulations designed and evaluated for use as LBC for air quality models (AQMs). The database covers the contiguous United States (CONUS) for the years 2000-2010 and contains hourly varying concentrations of ozone, aerosols, and their precursors. The database is complemented by a tool for configuring the global results as inputs to regional scale models (e.g., Community Multiscale Air Quality or Comprehensive Air quality Model with extensions). This study also presents an example application based on the CONUS domain, which is evaluated against satellite-retrieved ozone vertical profiles. The results show that performance is largely within uncertainty estimates for the Tropospheric Emission Spectrometer (TES), with some exceptions; the major difference is a high bias in the upper troposphere along the southern boundary in January. This publication documents the global simulation database, the tool for conversion to LBC, and the fidelity of concentrations on the boundaries. This documentation is intended to support applications that require representation of long-range transport of air pollutants.

  13. Understanding Terrorist Organizations with a Dynamic Model

    NASA Astrophysics Data System (ADS)

    Gutfraind, Alexander

    Terrorist organizations change over time because of processes such as recruitment and training as well as counter-terrorism (CT) measures, but the effects of these processes are typically studied qualitatively and in separation from each other. Seeking a more quantitative and integrated understanding, we constructed a simple dynamic model where equations describe how these processes change an organization's membership. Analysis of the model yields a number of intuitive as well as novel findings. Most importantly, it becomes possible to predict whether counter-terrorism measures would be sufficient to defeat the organization. Furthermore, we can prove in general that an organization would collapse if its strength and its pool of foot soldiers decline simultaneously. In contrast, a simultaneous decline in its strength and its pool of leaders is often insufficient and short-termed. These results and others like them demonstrate the great potential of dynamic models for informing terrorism scholarship and counter-terrorism policy making.
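
    An illustrative two-compartment version of such a model, with foot soldiers F and leaders L, recruitment driven by leaders, promotion from the ranks, and counter-terrorism removal rates; the equations are a generic sketch, not the paper's actual system:

        from scipy.integrate import solve_ivp

        def rhs(t, y, recruit=0.5, promote=0.05, ct_f=0.4, ct_l=0.1):
            F, L = y                        # foot soldiers, leaders
            dF = recruit * L - promote * F - ct_f * F
            dL = promote * F - ct_l * L
            return [dF, dL]

        sol = solve_ivp(rhs, (0, 50), [100.0, 10.0])
        print("membership after 50 time units: F=%.1f, L=%.1f"
              % (sol.y[0, -1], sol.y[1, -1]))

    With these parameter values both compartments decay toward zero, the kind of collapse prediction the abstract describes.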

  14. Research proceedings on amphibian model organisms

    PubMed Central

    LIU, Lu-Sha; ZHAO, Lan-Ying; WANG, Shou-Hong; JIANG, Jian-Ping

    2016-01-01

    Model organisms have long been important in biology and medicine due to their specific characteristics. Amphibians, especially Xenopus, play key roles in answering fundamental questions on developmental biology, regeneration, genetics, and toxicology due to their large and abundant eggs, as well as their versatile embryos, which can be readily manipulated and developed in vivo. Furthermore, amphibians have also proven to be of considerable benefit in human disease research due to their conserved cellular developmental and genomic organization. This review gives a brief introduction on the progress and limitations of these animal models in biology and human disease research, and discusses the potential and challenge of Microhyla fissipes as a new model organism. PMID:27469255

  15. Geometric modeling of pelvic organs with thickness

    NASA Astrophysics Data System (ADS)

    Bay, T.; Chen, Z.-W.; Raffin, R.; Daniel, M.; Joli, P.; Feng, Z.-Q.; Bellemare, M.-E.

    2012-03-01

    Physiological changes in the spatial configuration of the internal organs in the abdomen can induce various disorders that require surgery. Given the complexity of the surgical procedure, mechanical simulations are necessary, but the in vivo setting complicates the study of pelvic organs. To determine realistic behavior of these organs, an accurate geometric model associated with physical modeling is therefore required. Our approach couples a geometric module with a physical module. The geometric modeling seeks to build a continuous geometric model: from a dataset of 3D points provided by a segmentation step, surfaces are created through a B-spline fitting process. An energy function is built to measure the bidirectional distance between surface and data, and this energy is minimized with an alternating iterative Hoschek-like method. A thickness is added with an offset formulation, and the geometric model is finally exported as a hexahedral mesh. The physical modeling then seeks to calculate the properties of the soft tissues in order to simulate organ displacements. The physical parameters attached to the data are determined with a feedback loop between finite-element deformations and ground-truth acquisition (dynamic MRI).
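
    A one-dimensional analogue of the fitting step, a least-squares B-spline fit to noisy sampled points using scipy; the actual work fits B-spline surfaces to 3D point sets and minimizes a bidirectional distance energy:

        import numpy as np
        from scipy.interpolate import make_lsq_spline

        # noisy samples standing in for segmented contour points
        x = np.linspace(0, 1, 50)
        rng = np.random.default_rng(2)
        y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=50)

        k = 3                                        # cubic B-spline
        t = np.r_[[0.0] * (k + 1), [0.25, 0.5, 0.75], [1.0] * (k + 1)]
        spline = make_lsq_spline(x, y, t, k)         # least-squares fit

        print("max fitting residual: %.3f" % np.abs(spline(x) - y).max())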

  16. Improving the quality of protein identification in non-model species. Characterization of Quercus ilex seed and Pinus radiata needle proteomes by using SEQUEST and custom databases.

    PubMed

    Romero-Rodríguez, M Cristina; Pascual, Jesús; Valledor, Luis; Jorrín-Novo, Jesús

    2014-06-13

    ), as we demonstrated analyzing Quercus seeds and Pine needles. The proposed approach based on the building of a custom database is not difficult or time consuming, so we recommend its routine use when working with non-model species. This article is part of a Special Issue entitled: Proteomics of non-model organisms. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. Creating a model to detect dairy cattle farms with poor welfare using a national database.

    PubMed

    Krug, C; Haskell, M J; Nunes, T; Stilwell, G

    2015-12-01

    The objective of this study was to determine whether dairy farms with poor cow welfare could be identified using a national database for bovine identification and registration that monitors cattle deaths and movements. The welfare of dairy cattle was assessed using the Welfare Quality® protocol (WQ) on 24 Portuguese dairy farms and on 1930 animals. Five farms were classified as having poor welfare and the other 19 were classified as having good welfare. Fourteen million records from the national cattle database were analysed to identify potential welfare indicators for dairy farms. Fifteen potential national welfare indicators were calculated based on that database, and the link between the results of the WQ evaluation and the national cattle database was made using the identification code of each farm. Among the potential national welfare indicators, only two differed significantly between farms with good welfare and farms with poor welfare: 'proportion of on-farm deaths' (p<0.01) and 'female/male birth ratio' (p<0.05). To determine whether the database welfare indicators could be used to distinguish farms with good welfare from farms with poor welfare, we created a model using the J48 classifier of the Waikato Environment for Knowledge Analysis (WEKA). The model was a decision tree based on two variables, 'proportion of on-farm deaths' and 'calving-to-calving interval', and it was able to correctly identify 70% and 79% of the farms classified as having poor and good welfare, respectively. The national cattle database analysis could be useful in helping official veterinary services to detect farms that have poor welfare and in determining which welfare indicators are poor on each particular farm.
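
    An analogous two-variable decision tree can be sketched with scikit-learn standing in for WEKA's J48; the data below are fabricated placeholders, not the Portuguese farm records:

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier, export_text

        rng = np.random.default_rng(3)
        # features: proportion of on-farm deaths, calving interval (days)
        X = np.column_stack([rng.uniform(0, 0.1, 60),
                             rng.uniform(360, 480, 60)])
        y = (X[:, 0] > 0.05).astype(int)             # toy 'poor welfare' label

        tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
        print(export_text(tree, feature_names=["on_farm_deaths",
                                               "calving_interval"]))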

  18. Toxicity of halogenated organic compounds. (Latest citations from the NTIS bibliographic database). Published Search

    SciTech Connect

    Not Available

    1993-09-01

    The bibliography contains citations concerning health and environmental effects of halogenated organic compounds. Topics include laboratory and field investigations regarding bioaccumulation and concentration, metabolic aspects, and specific site studies in industrial and commercial operations. Pesticides, solvents, and a variety of industrial compounds are discussed. (Contains 250 citations and includes a subject term index and title list.)

  19. 3D Digital Model Database Applied to Conservation and Research of Wooden Construction in China

    NASA Astrophysics Data System (ADS)

    Zheng, Y.

    2013-07-01

    Protected by the Tai-Hang Mountains, Shanxi Province, located in north-central China, is a highly prosperous, densely populated valley and considered to be one of the cradles of Chinese civilization. Its continuous habitation and rich culture have given rise to a large number of temple complexes and pavilions. Among these structures, 153 in the southern Shanxi area can be dated from the Tang dynasty (618-907 C.E.) to the end of the Yuan dynasty (1279-1368 C.E.). The buildings are the best-preserved examples of wooden Chinese architecture in existence, exemplifying historic building technology and displaying highly intricate architectural decoration and detailing. They have survived war, earthquakes, and, in the last hundred years, neglect. In 2005, a decade-long conservation project was initiated by the State Administration of Cultural Heritage of China (SACH) to conserve and document these important buildings. The conservation process requires stabilization, conservation of important features, and, where necessary, partial dismantlement in order to replace unsound structural elements. The CHCC project team has developed a practical recording system that creates a record of all building components prior to and during the conservation process. We are now trying to establish a comprehensive database covering all 153 early buildings, through which information on the wooden constructions can easily be entered, browsed, and indexed, down to component details. The database can support comparative studies of these wooden structures and provide important support for the continued conservation of these heritage buildings. For some of the most important wooden structures, we have established three-dimensional models. By connecting the database with the 3D digital models in ArcGIS, we have developed a 3D Digital Model Database for these cherished buildings. The 3D Digital Model Database helps us set up an integrated information inventory

  20. The Transporter Classification Database

    PubMed Central

    Saier, Milton H.; Reddy, Vamsee S.; Tamang, Dorjee G.; Västermark, Åke

    2014-01-01

    The Transporter Classification Database (TCDB; http://www.tcdb.org) serves as a common reference point for transport protein research. The database contains more than 10 000 non-redundant proteins that represent all currently recognized families of transmembrane molecular transport systems. Proteins in TCDB are organized in a five level hierarchical system, where the first two levels are the class and subclass, the second two are the family and subfamily, and the last one is the transport system. Superfamilies that contain multiple families are included as hyperlinks to the five tier TC hierarchy. TCDB includes proteins from all types of living organisms and is the only transporter classification system that is both universal and recognized by the International Union of Biochemistry and Molecular Biology. It has been expanded by manual curation, contains extensive text descriptions providing structural, functional, mechanistic and evolutionary information, is supported by unique software and is interconnected to many other relevant databases. TCDB is of increasing usefulness to the international scientific community and can serve as a model for the expansion of database technologies. This manuscript describes an update of the database descriptions previously featured in NAR database issues. PMID:24225317

  1. Polymer models of chromosome (re)organization

    NASA Astrophysics Data System (ADS)

    Mirny, Leonid

    Chromosome Conformation Capture technique (Hi-C) provides comprehensive information about frequencies of spatial interactions between genomic loci. Inferring 3D organization of chromosomes from these data is a challenging biophysical problem. We develop a top-down approach to biophysical modeling of chromosomes. Starting with a minimal set of biologically motivated interactions we build ensembles of polymer conformations that can reproduce major features observed in Hi-C experiments. I will present our work on modeling organization of human metaphase and interphase chromosomes. Our work suggests that active processes of loop extrusion can be a universal mechanism responsible for formation of domains in interphase and chromosome compaction in metaphase.

  2. Mouse genome database 2016

    PubMed Central

    Bult, Carol J.; Eppig, Janan T.; Blake, Judith A.; Kadin, James A.; Richardson, Joel E.

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  3. Database of the United States Coal Pellet Collection of the U.S. Geological Survey Organic Petrology Laboratory

    USGS Publications Warehouse

    Deems, Nikolaus J.; Hackley, Paul C.

    2012-01-01

    The Organic Petrology Laboratory (OPL) of the U.S. Geological Survey (USGS) Eastern Energy Resources Science Center in Reston, Virginia, contains several thousand processed coal sample materials that were loosely organized in laboratory drawers for the past several decades. The majority of these were prepared as 1-inch-diameter particulate coal pellets (more than 6,000 pellets; one sample usually was prepared as two pellets, although some samples were prepared in as many as four pellets), which were polished and used in reflected light petrographic studies. These samples represent the work of many scientists from the 1970s to the present, most notably Ron Stanton, who managed the OPL until 2001 (see Warwick and Ruppert, 2005, for a comprehensive bibliography of Ron Stanton's work). The purpose of the project described herein was to organize and catalog the U.S. part of the petrographic sample collection into a comprehensive database (available with this report as a Microsoft Excel file) and to compile and list published studies associated with the various sample sets. Through this work, the extent of the collection is publicly documented as a resource and sample library available to other scientists and researchers working in U.S. coal basins previously studied by organic petrologists affiliated with the USGS. Other researchers may obtain samples in the OPL collection on loan at the discretion of the USGS authors listed in this report and its associated Web page.

  4. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. Our objective was to explore the use of conditional logistic regression to increase prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of
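
    The conditional approach described, stratifying records by rules suggested by an influential variable and then fitting a logistic model within each stratum, can be sketched as follows on synthetic data; the HCUP variables are not reproduced:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(5)
        X = rng.random((500, 4))
        y = (X[:, 0] + 0.5 * X[:, 1]
             + rng.normal(0, 0.3, 500) > 0.9).astype(int)

        # stratify on an influential variable, as a decision tree might suggest
        strata = X[:, 0] > 0.5
        models = {}
        for s in (False, True):
            idx = strata == s
            models[s] = LogisticRegression().fit(X[idx], y[idx])

        x_new = rng.random((1, 4))                   # score a new record
        p = models[bool(x_new[0, 0] > 0.5)].predict_proba(x_new)[0, 1]
        print("predicted readmission probability: %.2f" % p)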

  5. Microtechnology-Based Multi-Organ Models

    PubMed Central

    Lee, Seung Hwan; Sung, Jong Hwan

    2017-01-01

    Drugs affect the human body through absorption, distribution, metabolism, and elimination (ADME) processes. Due to their importance, the ADME processes need to be studied to determine the efficacy and side effects of drugs. Various in vitro model systems have been developed and used to realize the ADME processes. However, conventional model systems have failed to simulate the ADME processes because they are different from in vivo, which has resulted in a high attrition rate of drugs and a decrease in the productivity of new drug development. Recently, a microtechnology-based in vitro system called “organ-on-a-chip” has been gaining attention, with more realistic cell behavior and physiological reactions, capable of better simulating the in vivo environment. Furthermore, multi-organ-on-a-chip models that can provide information on the interaction between the organs have been developed. The ultimate goal is the development of a “body-on-a-chip”, which can act as a whole body model. In this review, we introduce and summarize the current progress in the development of multi-organ models as a foundation for the development of body-on-a-chip. PMID:28952525

  7. Object-oriented urban 3D spatial data model organization method

    NASA Astrophysics Data System (ADS)

    Li, Jing-wen; Li, Wen-qing; Lv, Nan; Su, Tao

    2015-12-01

    This paper combines a 3D data model with an object-oriented organization method and puts forward an object-oriented model for 3D data. It implements rapid construction of logically and semantically expressive city 3D models, and solves the representation problem in city 3D spatial information of one location having multiple properties and one property spanning multiple locations. It designs point, line, polygon, and body spatial object structures for a city 3D spatial database, providing a new approach to city 3D GIS modeling and organization management.
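
    The many-properties-per-location and many-locations-per-property organization described above can be sketched with two cross-referencing object types. A minimal illustration in Python; the class and field names are hypothetical, not the paper's design.

      from dataclasses import dataclass, field

      @dataclass
      class SpatialObject:
          oid: int
          geometry: str                   # 'point' | 'line' | 'polygon' | 'body'
          coords: list
          properties: list = field(default_factory=list)  # many properties per place

      @dataclass
      class Property:
          name: str
          locations: list = field(default_factory=list)   # many places per property

      building = SpatialObject(1, "body", [(0, 0, 0), (10, 10, 30)])
      landuse = Property("commercial")
      building.properties.append(landuse)
      landuse.locations.append(building)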

  8. Application of statistical models for secondary data usage of the US Navy's Occupational Exposure Database (NOED).

    PubMed

    Formisano, J A; Still, K; Alexander, W; Lippmann, M

    2001-02-01

    Many organizations around the world have collected data related to individual worker exposures that are used to determine compliance with workplace standards. These data are often warehoused and thereafter rarely used as an information resource. Using appropriate groupings and analysis of OSHA data, Gómez showed that such stored data can provide additional insight on factors affecting occupational exposures. Using data from the Occupational Exposure Database of the United States Navy, the usefulness of statistical models for defining probabilities of exposure above permissible limits for observed work conditions is examined. Analyses have highlighted worker Similar Exposure Groups (SEGs) with potential for overexposure to asbestos and lead. In terms of grouping data, Rappaport et al. defined the Within-Between Lognormal Model, a scale-independent measure for quantifying between-worker variability within a selected worker group: BR0.95 = exp(3.92·sB), where sB is the estimated between-worker standard deviation of the log-transformed exposures; BR0.95 represents the ratio of arithmetic mean exposures received by workers in the 97.5th and 2.5th percentiles. To help search for groups, the Proportional Odds Model, a generalization of the logistic model to ordinal data, can predict probabilities for group exposure above the Occupational Exposure Limit (OEL), or the Action Level (AL), which is one-half of the OEL. Worker SEGs have been identified for asbestos workers removing friable asbestos (BR0.95 = 11.0) and nonfriable asbestos (BR0.95 = 6.5); metal cleaning workers sanding specialized equipment (BR0.95 = 11.3); and workers at target shooting ranges cleaning up lead debris (BR0.95 = 10). Estimated probabilities for the exposure categories support the current understanding of the work processes examined. Differences in probability noted between tasks and levels of ventilation validate this method for evaluating other available workplace exposure determinants, and for predicting probability of membership in categories that may help further define worker
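
    The two statistics named above are easy to state numerically. The snippet below is an illustrative reading of those formulas, not the authors' code; the value of sB is made up.

      import numpy as np

      s_b = 0.61                          # illustrative between-worker SD of log-exposures
      BR95 = np.exp(3.92 * s_b)           # ratio of 97.5th- to 2.5th-percentile mean exposures
      print(round(BR95, 1))               # ~10.9, the same order as the SEGs flagged above

      # Proportional odds model: cumulative probabilities share one slope vector
      # beta, with category-specific intercepts alpha_j (ordinal logistic link).
      def cum_prob(alpha_j, beta, x):
          return 1.0 / (1.0 + np.exp(-(alpha_j + np.dot(beta, x))))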

  9. Evaluation of low wind modeling approaches for two tall-stack databases.

    PubMed

    Paine, Robert; Samani, Olga; Kaplan, Mary; Knipping, Eladio; Kumar, Naresh

    2015-11-01

    The performance of the AERMOD air dispersion model under low wind speed conditions, especially for applications with only one level of meteorological data and no direct turbulence measurements or vertical temperature gradient observations, is the focus of this study. The analysis documented in this paper addresses evaluations for low wind conditions involving tall stack releases for which multiple years of concurrent emissions, meteorological data, and monitoring data are available. AERMOD was tested on two field-study databases involving several SO2 monitors and hourly emissions data, with sub-hourly meteorological data (e.g., 10-min averages) available, using several technical options: default mode, with various low wind speed beta options, and using the available sub-hourly meteorological data. These field study databases included (1) Mercer County, a North Dakota database featuring five SO2 monitors within 10 km of the Dakota Gasification Company's plant and the Antelope Valley Station power plant in an area of both flat and elevated terrain, and (2) a flat-terrain setting database with four SO2 monitors within 6 km of the Gibson Generating Station in southwest Indiana. Both sites featured regionally representative 10-m meteorological databases, with no significant terrain obstacles between the meteorological site and the emission sources. The low wind beta options show improvement in model performance, helping to reduce some of the over-prediction biases currently present in AERMOD when run with regulatory default options. The overall findings with the low wind speed testing on these tall stack field-study databases indicate that AERMOD low wind speed options have a minor effect for flat terrain locations, but can have a significant effect for elevated terrain locations. The performance of AERMOD using low wind speed options leads to improved consistency of meteorological conditions associated with the highest observed and predicted concentration events. The

  10. Thermodynamic modeling for organic solid precipitation

    SciTech Connect

    Chung, T.H.

    1992-12-01

    A generalized predictive model based on thermodynamic principles of solid-liquid phase equilibrium has been developed for organic solid precipitation. The model takes into account the effects of temperature, composition, and activity coefficient on the solubility of wax and asphaltenes in organic solutions. The solid-liquid equilibrium K-value is expressed as a function of the heat of melting, melting point temperature, solubility parameter, and the molar volume of each component in the solution. All these parameters have been correlated with molecular weight. Thus, the model can be applied to crude oil systems. The model has been tested with experimental data for wax formation and asphaltene precipitation. The predicted wax appearance temperature is very close to the measured temperature. The model not only can match the measured asphaltene solubility data but also can be used to predict the solubility of asphaltene in organic solvents or crude oils. The model assumes that asphaltenes are dissolved in oil in a true liquid state, not in colloidal suspension, and the precipitation-dissolution process is reversible by changing thermodynamic conditions. The model is thermodynamically consistent and has no ambiguous assumptions.
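
    In the ideal-solution limit, the solid-liquid K-value the abstract refers to reduces to a function of the heat of melting and the melting temperature. A minimal sketch under that assumption, with activity-coefficient corrections omitted and illustrative property values:

      import math

      R = 8.314  # J/(mol K)

      def sle_k(dH_melt, T_melt, T):
          """K = x_solid/x_liquid for one component, ideal-solution limit."""
          return math.exp((dH_melt / (R * T)) * (1.0 - T / T_melt))

      # Example: a heavy wax-like pseudocomponent 20 K below its melting point.
      print(sle_k(dH_melt=60e3, T_melt=330.0, T=310.0))   # K > 1: solid favored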

  12. Measuring the effects of distributed database models on transaction availability measures

    NASA Technical Reports Server (NTRS)

    Mukkamala, Ravi

    1991-01-01

    Data distribution, data replication, and system reliability are key factors in determining the availability measures for transactions in distributed database systems. In order to simplify the evaluation of these measures, database designers and researchers tend to make unrealistic assumptions about these factors. Here, the effect of such assumptions on the computational complexity and accuracy of such evaluations is investigated. A database system is represented with five parameters related to the above factors. Probabilistic analysis is employed to evaluate the availability of read-one and read-write transactions. Both the read-one/write-all and the majority-read/majority-write replication control policies are considered. It is concluded that transaction availability is more sensitive to variations in degrees of replication, less sensitive to data distribution, and insensitive to reliability variations in a heterogeneous system. The computational complexity of the evaluations is found to be mainly determined by the chosen distributed database model, while the accuracy of the results depends much less on the model chosen.
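
    For a single data item with n replicas, each up independently with probability p, the availability under the two replication-control policies named above can be computed directly. A hedged sketch of that textbook calculation, not the paper's five-parameter model:

      from math import comb

      def read_one_write_all(n, p):
          read = 1 - (1 - p) ** n          # any single replica suffices for a read
          write = p ** n                   # every replica must be up for a write
          return read, write

      def majority(n, p):
          q = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                  for k in range(n // 2 + 1, n + 1))
          return q, q                      # reads and writes both need a majority

      print(read_one_write_all(5, 0.95))   # (~1.000, ~0.774)
      print(majority(5, 0.95))             # (~0.999, ~0.999)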

  13. Acquisition of Seco Creek GIS database and its use in water quality models

    SciTech Connect

    Steers, C.A.; Steiner, M.; Taylor, B. )

    1993-02-01

    The Seco Creek Water Quality Demonstration Project covers 1,700,670 acres in parts of Bandera, Frio, Medina and Uvalde Counties in south central Texas. The Seco Creek Database was constructed as part of the Soil Conservation Service's National Water Quality Program to develop hydrologic tools that measure the effects of agricultural nonpoint source pollution and to demonstrate the usefulness of GIS in natural resources management. This project will be part of a GRASS-Water Quality Model Interface, which will integrate watershed models with water quality planning and implementation by January of 1994. The Seco Creek Demonstration Area is the sole water supply for 1.3 million people in the San Antonio area. The database constructed for the project will help maintain the excellent water quality that flows directly into the Edwards Aquifer. The database consists of several vector and raster layers including SSURGO-quality soils, elevation, roads, streams, and detailed data on field ownership, cropping and grazing practices, and other land uses. This paper describes the development and planned uses of the Seco Creek Database.

  14. A scalable database model for multiparametric time series: a volcano observatory case study

    NASA Astrophysics Data System (ADS)

    Montalto, Placido; Aliotta, Marco; Cassisi, Carmelo; Prestifilippo, Michele; Cannata, Andrea

    2014-05-01

    The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. By the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced, the series has a fixed period, or sampling frequency. Our work describes in detail a possible methodology for storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), in order to acquire time series from different data sources and standardize them within a relational database. The operation of standardization provides the ability to perform operations, such as query and visualization, on many measures, synchronizing them using a common time scale. The proposed architecture follows a multiple layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the loader layer performs a security check of the working status of each piece of running software through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by using a smart table partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which provide the possibility to query different time series on a specified time range, or follow the realtime signal acquisition, according to a data access policy for the users.
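
    The standardization step can be pictured with a toy relational schema keyed on a common time scale. The table and column names below are illustrative, not the actual TSDSystem layout, and sqlite3 stands in for the production database.

      import sqlite3

      con = sqlite3.connect(":memory:")
      con.executescript("""
      CREATE TABLE series (
          id      INTEGER PRIMARY KEY,
          name    TEXT NOT NULL,        -- e.g. station/channel identifier
          unit    TEXT,
          rate_hz REAL                  -- sampling frequency; NULL if irregular
      );
      CREATE TABLE sample (
          series_id INTEGER REFERENCES series(id),
          t_utc     TEXT NOT NULL,      -- common time scale for synchronization
          value     REAL,
          PRIMARY KEY (series_id, t_utc)
      );
      """)

      # Queries over a time range then need no per-source format handling.
      a, b = "2014-05-01T00:00:00Z", "2014-05-02T00:00:00Z"
      rows = con.execute("SELECT t_utc, value FROM sample WHERE series_id = ? "
                         "AND t_utc BETWEEN ? AND ? ORDER BY t_utc", (1, a, b))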

  15. The Chinchilla Research Resource Database: resource for an otolaryngology disease model

    PubMed Central

    Shimoyama, Mary; Smith, Jennifer R.; De Pons, Jeff; Tutaj, Marek; Khampang, Pawjai; Hong, Wenzhou; Erbe, Christy B.; Ehrlich, Garth D.; Bakaletz, Lauren O.; Kerschner, Joseph E.

    2016-01-01

    The long-tailed chinchilla (Chinchilla lanigera) is an established animal model for diseases of the inner and middle ear, among others. In particular, chinchilla is commonly used to study diseases involving viral and bacterial pathogens and polymicrobial infections of the upper respiratory tract and the ear, such as otitis media. The value of the chinchilla as a model for human diseases prompted the sequencing of its genome in 2012 and the more recent development of the Chinchilla Research Resource Database (http://crrd.mcw.edu) to provide investigators with easy access to relevant datasets and software tools to enhance their research. The Chinchilla Research Resource Database contains a complete catalog of genes for chinchilla and, for comparative purposes, human. Chinchilla genes can be viewed in the context of their genomic scaffold positions using the JBrowse genome browser. In contrast to the corresponding records at NCBI, individual gene reports at CRRD include functional annotations for Disease, Gene Ontology (GO) Biological Process, GO Molecular Function, GO Cellular Component and Pathway assigned to chinchilla genes based on annotations from the corresponding human orthologs. Data can be retrieved via keyword and gene-specific searches. Lists of genes with similar functional attributes can be assembled by leveraging the hierarchical structure of the Disease, GO and Pathway vocabularies through the Ontology Search and Browser tool. Such lists can then be further analyzed for commonalities using the Gene Annotator (GA) Tool. All data in the Chinchilla Research Resource Database is freely accessible and downloadable via the CRRD FTP site or using the download functions available in the search and analysis tools. The Chinchilla Research Resource Database is a rich resource for researchers using, or considering the use of, chinchilla as a model for human disease. Database URL: http://crrd.mcw.edu PMID:27173523

  16. The Chinchilla Research Resource Database: resource for an otolaryngology disease model.

    PubMed

    Shimoyama, Mary; Smith, Jennifer R; De Pons, Jeff; Tutaj, Marek; Khampang, Pawjai; Hong, Wenzhou; Erbe, Christy B; Ehrlich, Garth D; Bakaletz, Lauren O; Kerschner, Joseph E

    2016-01-01

    The long-tailed chinchilla (Chinchilla lanigera) is an established animal model for diseases of the inner and middle ear, among others. In particular, chinchilla is commonly used to study diseases involving viral and bacterial pathogens and polymicrobial infections of the upper respiratory tract and the ear, such as otitis media. The value of the chinchilla as a model for human diseases prompted the sequencing of its genome in 2012 and the more recent development of the Chinchilla Research Resource Database (http://crrd.mcw.edu) to provide investigators with easy access to relevant datasets and software tools to enhance their research. The Chinchilla Research Resource Database contains a complete catalog of genes for chinchilla and, for comparative purposes, human. Chinchilla genes can be viewed in the context of their genomic scaffold positions using the JBrowse genome browser. In contrast to the corresponding records at NCBI, individual gene reports at CRRD include functional annotations for Disease, Gene Ontology (GO) Biological Process, GO Molecular Function, GO Cellular Component and Pathway assigned to chinchilla genes based on annotations from the corresponding human orthologs. Data can be retrieved via keyword and gene-specific searches. Lists of genes with similar functional attributes can be assembled by leveraging the hierarchical structure of the Disease, GO and Pathway vocabularies through the Ontology Search and Browser tool. Such lists can then be further analyzed for commonalities using the Gene Annotator (GA) Tool. All data in the Chinchilla Research Resource Database is freely accessible and downloadable via the CRRD FTP site or using the download functions available in the search and analysis tools. The Chinchilla Research Resource Database is a rich resource for researchers using, or considering the use of, chinchilla as a model for human disease. Database URL: http://crrd.mcw.edu.

  17. WholeCellSimDB: a hybrid relational/HDF database for whole-cell model predictions.

    PubMed

    Karr, Jonathan R; Phillips, Nolan C; Covert, Markus W

    2014-01-01

    Mechanistic 'whole-cell' models are needed to develop a complete understanding of cell physiology. However, extracting biological insights from whole-cell models requires running and analyzing large numbers of simulations. We developed WholeCellSimDB, a database for organizing whole-cell simulations. WholeCellSimDB was designed to enable researchers to search simulation metadata to identify simulations for further analysis, and quickly slice and aggregate simulation results data. In addition, WholeCellSimDB enables users to share simulations with the broader research community. The database uses a hybrid relational/hierarchical data format architecture to efficiently store and retrieve both simulation setup metadata and results data. WholeCellSimDB provides a graphical Web-based interface to search, browse, plot and export simulations; a JavaScript Object Notation (JSON) Web service to retrieve data for Web-based visualizations; a command-line interface to deposit simulations; and a Python API to retrieve data for advanced analysis. Overall, we believe WholeCellSimDB will help researchers use whole-cell models to advance basic biological science and bioengineering. Database URL: http://www.wholecellsimdb.org. Source code repository: http://github.com/CovertLab/WholeCellSimDB. © The Author(s) 2014. Published by Oxford University Press.
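
    The hybrid relational/HDF pattern described above, queryable metadata in a relational store, bulk results arrays in HDF5, can be sketched briefly. File, table, and dataset names below are illustrative, not WholeCellSimDB's actual schema; sqlite3 and h5py are assumed available.

      import sqlite3
      import h5py
      import numpy as np

      meta = sqlite3.connect("sims.sqlite")
      meta.execute("CREATE TABLE IF NOT EXISTS sim (id INTEGER PRIMARY KEY, "
                   "model TEXT, batch TEXT, length_s REAL)")
      meta.execute("INSERT INTO sim (model, batch, length_s) VALUES (?, ?, ?)",
                   ("wildtype", "2014-01", 3600.0))
      meta.commit()

      with h5py.File("sim_1.h5", "w") as f:      # bulk results keyed by sim id
          f.create_dataset("states/Mass/total", data=np.random.rand(3600))

      # Search the metadata first, then slice only the arrays that are needed.
      (sim_id,) = meta.execute("SELECT id FROM sim WHERE model = 'wildtype'").fetchone()
      with h5py.File(f"sim_{sim_id}.h5", "r") as f:
          first_minute = f["states/Mass/total"][:60]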

  18. Chromatin fiber functional organization: Some plausible models

    NASA Astrophysics Data System (ADS)

    Lesne, A.; Victor, J.-M.

    2006-03-01

    We here present a modeling study of the chromatin fiber functional organization. Multi-scale modeling is required to unravel the complex interplay between the fiber and the DNA levels. It suggests plausible scenarios, including both physical and biological aspects, for fiber condensation, its targeted decompaction, and transcription regulation. We conclude that a major role of the chromatin fiber structure might be to endow DNA with allosteric potentialities and to control DNA transactions by an epigenetic tuning of its mechanical and topological constraints.

  19. Resveratrol and Lifespan in Model Organisms.

    PubMed

    Pallauf, Kathrin; Rimbach, Gerald; Rupp, Petra Maria; Chin, Dawn; Wolf, Insa M A

    2016-01-01

    Resveratrol may possess life-prolonging and health-benefitting properties, some of which may resemble the effect of caloric restriction (CR). CR appears to prolong the lifespan of model organisms in some studies and may benefit human health. However, for humans, restricting food intake for an extended period of time seems impracticable and substances imitating the beneficial effects of CR without having to reduce food intake could improve health in an aging and overweight population. We have reviewed the literature studying the influence of resveratrol on the lifespan of model organisms including yeast, flies, worms, and rodents. We summarize the in vivo findings, describe modulations of molecular targets and gene expression observed in vivo and in vitro, and discuss how these changes may contribute to lifespan extension. Data from clinical studies are summarized to provide an insight about the potential of resveratrol supplementation in humans. Resveratrol supplementation has been shown to prolong lifespan in approximately 60% of the studies conducted in model organisms. However, current literature is contradictory, indicating that the lifespan effects of resveratrol vary strongly depending on the model organism. While worms and killifish seemed very responsive to resveratrol, resveratrol failed to affect lifespan in the majority of the studies conducted in flies and mice. Furthermore, factors such as dose, gender, genetic background and diet composition may contribute to the high variance in the observed effects. It remains inconclusive whether resveratrol is indeed a CR mimetic and possesses life-prolonging properties. The limited bioavailability of resveratrol may further impede its potential effects.

  20. Data-based stochastic subgrid-scale parametrization: an approach using cluster-weighted modelling.

    PubMed

    Kwasniok, Frank

    2012-03-13

    A new approach for data-based stochastic parametrization of unresolved scales and processes in numerical weather and climate prediction models is introduced. The subgrid-scale model is conditional on the state of the resolved scales, consisting of a collection of local models. A clustering algorithm in the space of the resolved variables is combined with statistical modelling of the impact of the unresolved variables. The clusters and the parameters of the associated subgrid models are estimated simultaneously from data. The method is implemented and explored in the framework of the Lorenz '96 model using discrete Markov processes as local statistical models. Performance of the cluster-weighted Markov chain scheme is investigated for long-term simulations as well as ensemble prediction. It clearly outperforms simple parametrization schemes and compares favourably with another recently proposed subgrid modelling scheme also based on conditional Markov chains.
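
    The scheme lends itself to a compact illustration: cluster the resolved state, then estimate one transition matrix per cluster for a discretized subgrid variable. The toy sketch below uses synthetic stand-in arrays and scikit-learn's KMeans; it is not the paper's implementation.

      import numpy as np
      from sklearn.cluster import KMeans

      rng = np.random.default_rng(0)
      X = rng.normal(size=(5000, 8))     # stand-in for the resolved-state history
      u = rng.normal(size=5000)          # stand-in for the subgrid tendency

      bins = np.quantile(u, [0.25, 0.5, 0.75])
      s = np.digitize(u, bins)           # discretize the tendency into 4 states

      km = KMeans(n_clusters=5, n_init=10).fit(X)
      c = km.labels_                     # cluster of the resolved state at each time

      P = np.zeros((5, 4, 4))            # P[k] = Markov transition matrix in cluster k
      for t in range(len(s) - 1):
          P[c[t], s[t], s[t + 1]] += 1
      P /= P.sum(axis=2, keepdims=True).clip(min=1)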

  1. Leaf respiration (GlobResp) - global trait database supports Earth System Models

    SciTech Connect

    Wullschleger, Stan D.; Warren, Jeffrey; Thornton, Peter E.

    2015-03-20

    Here we detail how Atkin and his colleagues compiled a global database (GlobResp) that details rates of leaf dark respiration and associated traits from sites that span Arctic tundra to tropical forests. This compilation builds upon earlier research (Reich et al., 1998; Wright et al., 2006) and was supplemented by recent field campaigns and unpublished data. In keeping with other trait databases, GlobResp provides insights on how physiological traits, especially rates of dark respiration, vary as a function of environment and how that variation can be used to inform terrestrial biosphere models and land surface components of Earth System Models. Although an important component of plant and ecosystem carbon (C) budgets (Wythers et al., 2013), respiration has only limited representation in models. Seen through the eyes of a plant scientist, Atkin et al. (2015) give readers a unique perspective on the climatic controls on respiration, thermal acclimation and evolutionary adaptation of dark respiration, and insights into the covariation of respiration with other leaf traits. We find there is ample evidence that once large databases are compiled, like GlobResp, they can reveal new knowledge of plant function and provide a valuable resource for hypothesis testing and model development.

  2. Leaf respiration (GlobResp) - global trait database supports Earth System Models

    DOE PAGES

    Wullschleger, Stan D.; Warren, Jeffrey; Thornton, Peter E.

    2015-03-20

    Here we detail how Atkin and his colleagues compiled a global database (GlobResp) that details rates of leaf dark respiration and associated traits from sites that span Arctic tundra to tropical forests. This compilation builds upon earlier research (Reich et al., 1998; Wright et al., 2006) and was supplemented by recent field campaigns and unpublished data. In keeping with other trait databases, GlobResp provides insights on how physiological traits, especially rates of dark respiration, vary as a function of environment and how that variation can be used to inform terrestrial biosphere models and land surface components of Earth System Models. Although an important component of plant and ecosystem carbon (C) budgets (Wythers et al., 2013), respiration has only limited representation in models. Seen through the eyes of a plant scientist, Atkin et al. (2015) give readers a unique perspective on the climatic controls on respiration, thermal acclimation and evolutionary adaptation of dark respiration, and insights into the covariation of respiration with other leaf traits. We find there is ample evidence that once large databases are compiled, like GlobResp, they can reveal new knowledge of plant function and provide a valuable resource for hypothesis testing and model development.

  3. Data extraction tool and colocation database for satellite and model product evaluation (Invited)

    NASA Astrophysics Data System (ADS)

    Ansari, S.; Zhang, H.; Privette, J. L.; Del Greco, S.; Urzen, M.; Pan, Y.; Cook, R. B.; Wilson, B. E.; Wei, Y.

    2009-12-01

    The Satellite Product Evaluation Center (SPEC) is an ongoing project to integrate operational monitoring of data products from satellite and model analysis, with support for quantitative calibration, validation and algorithm improvement. The system uniquely allows scientists and others to rapidly access, subset, visualize, statistically compare and download multi-temporal data from multiple in situ, satellite, weather radar and model sources without reference to native data and metadata formats, packaging or physical location. Although still in initial development, the SPEC database and services will contain a wealth of integrated data for evaluation, validation, and discovery science activities across many different disciplines. The SPEC data extraction architecture departs from traditional dataset- and research-driven approaches through the use of standards and relational database technology. The NetCDF for Java API is used as a framework for data decoding and abstraction. The data are treated as generic feature types (such as Grid or Swath) as defined by the NetCDF Climate and Forecast (CF) metadata conventions. Colocation data for various field measurement networks, such as the Climate Reference Network (CRN) and Ameriflux network, are extracted offline, from local disk or distributed sources. The resulting data subsets are loaded into a relational database for fast access. URL-based (Representational State Transfer (REST)) web services are provided for simple database access to application programmers and scientists. SPEC supports broad NOAA, U.S. Global Change Research Program (USGCRP) and World Climate Research Programme (WCRP) initiatives including the National Polar-orbiting Operational Environmental Satellite System (NPOESS) and NOAA’s Climate Data Record (CDR) programs. SPEC is a collaboration between NOAA’s National Climatic Data Center (NCDC) and DOE’s Oak Ridge National Laboratory (ORNL). In this presentation we will describe the data extraction

  4. S-World: A high resolution global soil database for simulation modelling (Invited)

    NASA Astrophysics Data System (ADS)

    Stoorvogel, J. J.

    2013-12-01

    There is an increasing call for high resolution soil information at the global level. A good example of such a call is the Global Gridded Crop Model Intercomparison carried out within AgMIP. While local studies can make use of surveying techniques to collect additional data, this is practically impossible at the global level. It is therefore important to rely on legacy data like the Harmonized World Soil Database. Several efforts do exist that aim at the development of global gridded soil property databases. These estimates of the variation of soil properties can be used to assess, e.g., global soil carbon stocks. However, they do not allow for simulation runs with, e.g., crop growth simulation models, as these models require a description of the entire pedon rather than a few soil properties. This study provides the required quantitative description of pedons at a 1 km resolution for simulation modelling. It uses the Harmonized World Soil Database (HWSD) for the spatial distribution of soil types, the ISRIC-WISE soil profile database to derive information on soil properties per soil type, and a range of co-variables on topography, climate, and land cover to further disaggregate the available data. The methodology aims to take stock of these available data. The soil database is developed in five main steps. Step 1: All 148 soil types are ordered on the basis of their expected topographic position using, e.g., drainage, salinization, and pedogenesis. Using the topographic ordering and combining the HWSD with a digital elevation model allows for the spatial disaggregation of the composite soil units. This results in a new soil map with homogeneous soil units. Step 2: The ranges of major soil properties for the topsoil and subsoil of each of the 148 soil types are derived from the ISRIC-WISE soil profile database. Step 3: A model of soil formation is developed that focuses on the basic conceptual question where we are within the range of a particular soil property

  5. Emergent organization in a model market

    NASA Astrophysics Data System (ADS)

    Yadav, Avinash Chand; Manchanda, Kaustubh; Ramaswamy, Ramakrishna

    2017-09-01

    We study the collective behaviour of interacting agents in a simple model of market economics that was originally introduced by Nørrelykke and Bak. A general theoretical framework for interacting traders on an arbitrary network is presented, with the interaction consisting of buying (namely consumption) and selling (namely production) of commodities. Extremal dynamics is introduced by having the agent with the least profit in the market readjust prices, causing the market to self-organize. In addition to examining this model market on regular lattices in two dimensions, we also study the cases of random complex networks both with and without community structures. Fluctuations in an activity signal exhibit properties that are characteristic of avalanches observed in models of self-organized criticality, and these can be described by power-law distributions when the system is in the critical state.
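
    The extremal-dynamics update rule is simple enough to sketch in a few lines: at each step the least-profitable agent readjusts its price, which perturbs itself and its neighbours. The toy loop below is illustrative (agents on a ring, uniform redraws), not the authors' model specification.

      import numpy as np

      rng = np.random.default_rng(0)
      N = 200
      profit = rng.random(N)               # agents arranged on a ring

      for step in range(10_000):
          i = profit.argmin()              # extremal site: least profit readjusts
          for j in (i - 1, i, (i + 1) % N):
              profit[j] = rng.random()     # price change shocks agent and neighbours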

  6. Bridging the gap between climate models and impact studies: the FORESEE Database.

    PubMed

    Dobor, L; Barcza, Z; Hlásny, T; Havasi, Á; Horváth, F; Ittzés, P; Bartholy, J

    2015-07-01

    Studies on climate change impacts are essential for identifying vulnerabilities and developing adaptation options. However, such studies depend crucially on the availability of reliable climate data. In this study, we introduce the climatological database called FORESEE (Open Database for Climate Change Related Impact Studies in Central Europe), which was developed to support the research of and adaptation to climate change in Central and Eastern Europe: a region where knowledge of possible climate change effects is inadequate. A questionnaire-based survey was used to specify database structure and content. FORESEE contains the seamless combination of gridded daily observation-based data (1951-2013) built on the E-OBS and CRU TS datasets, and a collection of climate projections (2014-2100). The future climate is represented by bias-corrected meteorological data from 10 regional climate models (RCMs), driven by the A1B emission scenario. These latter data were developed within the framework of the ENSEMBLES FP6 project. Although FORESEE only covers a limited area of Central and Eastern Europe, the methodology of database development, the applied bias correction techniques, and the data dissemination method can serve as a blueprint for similar initiatives.
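
    Bias correction of RCM output, mentioned above, is commonly done by quantile mapping against observations over a control period. A generic sketch of that idea with synthetic data; this is not the specific FORESEE procedure.

      import numpy as np

      rng = np.random.default_rng(0)
      obs_ctrl = rng.gamma(2.0, 2.0, 5000)        # observed control-period values
      mod_ctrl = obs_ctrl * 1.3 + 0.5             # biased model over the same period
      mod_fut = rng.gamma(2.0, 2.4, 5000) * 1.3   # biased future projection

      # Map each future value through matched quantiles of model vs. observations.
      q = np.linspace(0.0, 1.0, 101)
      corrected = np.interp(mod_fut, np.quantile(mod_ctrl, q), np.quantile(obs_ctrl, q))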

  7. Modeling global organic aerosol formation and growth

    NASA Astrophysics Data System (ADS)

    Tsimpidi, Alexandra; Karydis, Vlasios; Pandis, Spyros; Lelieveld, Jos

    2014-05-01

    A computationally efficient framework for the description of organic aerosol (OA)-gas partitioning and chemical aging has been developed and implemented into the EMAC atmospheric chemistry-climate model. This model simulates the formation of primary (POA) and secondary organic aerosols (SOA) from semi-volatile (SVOC), intermediate-volatility (IVOC) and volatile organic compounds (VOC). POA are divided into two groups, OA from fossil fuel combustion and OA from biomass burning, each represented by surrogate species with saturation concentrations at 298 K of 0.1, 10, 1000, and 100000 µg m-3. The first two surrogate species from each group represent the SVOC, while the other surrogate species represent the IVOC. Photochemical reactions that change the volatility of the organics in the gas phase are taken into account. The oxidation products from each group of precursors (SVOC, IVOC, and VOC) are lumped into an additional set of oxidized surrogate species (S-SOA, I-SOA, and V-SOA, respectively) in order to track their source of origin. This model is used to i) estimate the relative contributions of SOA and POA to total OA, ii) determine how SOA concentrations are affected by biogenic and anthropogenic emissions, and iii) evaluate the effect of photochemical aging and long-range transport on the OA budget over specific regions.
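
    The volatility-bin partitioning underlying such surrogate species follows a simple fixed-point calculation: the condensed fraction of bin i is 1/(1 + C*_i/C_OA), where total OA mass C_OA itself depends on the result. An illustrative sketch with made-up concentrations, not the EMAC implementation:

      import numpy as np

      c_star = np.array([0.1, 10.0, 1000.0, 100000.0])   # saturation conc., ug m-3
      c_tot = np.array([2.0, 4.0, 6.0, 8.0])             # total (gas + aerosol) per bin

      c_oa = 1.0
      for _ in range(50):                                # fixed-point iteration
          frac = 1.0 / (1.0 + c_star / c_oa)             # condensed fraction per bin
          c_oa = float((c_tot * frac).sum())
      print(c_oa, frac)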

  8. Self-organized model of cascade spreading

    NASA Astrophysics Data System (ADS)

    Gualdi, S.; Medo, M.; Zhang, Y.-C.

    2011-01-01

    We study simultaneous price drops of real stocks and show that for high drop thresholds they follow a power-law distribution. To reproduce these collective downturns, we propose a minimal self-organized model of cascade spreading based on a probabilistic response of the system elements to stress conditions. This model is solvable using the theory of branching processes and the mean-field approximation. For a wide range of parameters, the system is in a critical state and displays a power-law cascade-size distribution similar to the empirically observed one. We further generalize the model to reproduce volatility clustering and other observed properties of real stocks.
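
    The branching-process mechanism can be illustrated with a toy simulation: each stressed element topples a fixed number of potential offspring independently with probability p, and when the mean offspring number k·p is near one the cascade-size distribution develops a power-law tail. A sketch under these assumptions, not the authors' code:

      import numpy as np

      rng = np.random.default_rng(1)

      def cascade_size(p, k=2, cap=10_000):
          active, size = 1, 0
          while active and size < cap:
              size += active
              active = rng.binomial(active * k, p)   # offspring of this generation
          return size

      sizes = [cascade_size(p=0.5) for _ in range(10_000)]   # k*p = 1: critical point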

  9. Animal models of organic heart valve disease.

    PubMed

    Roosens, Bram; Bala, Gezim; Droogmans, Steven; Van Camp, Guy; Breyne, Joke; Cosyns, Bernard

    2013-05-25

    Heart valve disease is a frequently encountered pathology, related to high morbidity and mortality rates in industrialized and developing countries. Animal models are valuable for investigating not only the causes but also the underlying mechanisms and potential treatments of human valvular diseases. Recently, animal models of heart valve disease have been developed, which allow investigation of the pathophysiology and make it possible to follow the progression and potential regression of disease under therapeutics over time. The present review provides an overview of animal models of primary, organic heart valve disease: myxoid age-related, infectious, drug-induced, degenerative calcified, and mechanically induced valvular heart disease. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  10. Theory and modeling of stereoselective organic reactions.

    PubMed

    Houk, K N; Paddon-Row, M N; Rondan, N G; Wu, Y D; Brown, F K; Spellmeyer, D C; Metz, J T; Li, Y; Loncharich, R J

    1986-03-07

    Theoretical investigations of the transition structures of additions and cycloadditions reveal details about the geometries of bond-forming processes that are not directly accessible by experiment. The conformational analysis of transition states has been developed from theoretical generalizations about the preferred angle of attack by reagents on multiple bonds and predictions of conformations with respect to partially formed bonds. Qualitative rules for the prediction of the stereochemistries of organic reactions have been devised, and semi-empirical computational models have also been developed to predict the stereoselectivities of reactions of large organic molecules, such as nucleophilic additions to carbonyls, electrophilic hydroborations and cycloadditions, and intramolecular radical additions and cycloadditions.

  11. Sediment-Hosted Copper Deposits of the World: Deposit Models and Database

    USGS Publications Warehouse

    Cox, Dennis P.; Lindsey, David A.; Singer, Donald A.; Diggles, Michael F.

    2003-01-01

    Introduction This publication contains four descriptive models and four grade-tonnage models for sediment hosted copper deposits. Descriptive models are useful in exploration planning and resource assessment because they enable the user to identify deposits in the field and to identify areas on geologic and geophysical maps where deposits could occur. Grade and tonnage models are used in resource assessment to predict the likelihood of different combinations of grades and tonnages that could occur in undiscovered deposits in a specific area. They are also useful in exploration in deciding which deposit types meet the economic objectives of the exploration company. The models in this report supersede the sediment-hosted copper models in USGS Bulletin 1693 (Cox, 1986, and Mosier and others, 1986) and are subdivided into a general type and three subtypes. The general model is useful in classifying deposits whose features are obscured by metamorphism or are otherwise poorly described, and for assessing regions in which the geologic environments are poorly understood. The three subtypes are based on differences in deposit form and environments of deposition. These differences are described under subtypes in the general model. Deposit models are based on the descriptions of geologic environments and physical characteristics, and on metal grades and tonnages of many individual deposits. Data used in this study are presented in a database representing 785 deposits in nine continents. This database was derived partly from data published by Kirkham and others (1994) and from new information in recent publications. To facilitate the construction of grade and tonnage models, the information, presented by Kirkham in disaggregated form, was brought together to provide a single grade and a single tonnage for each deposit. Throughout the report individual deposits are defined as being more than 2,000 meters from the nearest adjacent deposit. The deposit models are presented here as

  12. Use of models of biomacromolecule separation in AMT database generation for shotgun proteomics.

    PubMed

    Pridatchenko, M L; Tarasova, I A; Guryca, V; Kononikhin, A S; Adams, C; Tolmachev, D A; Agapov, A Yu; Evreinov, V V; Popov, I A; Nikolaev, E N; Zubarev, R A; Gorshkov, A V; Masselon, C D; Gorshkov, M V

    2009-11-01

    Generation of a complex proteome database requires use of powerful analytical methods capable of following rapid changes in the proteome due to changing physiological and pathological states of the organism under study. One of the promising technologies in this regard is the use of so-called Accurate Mass and Time (AMT) tag peptide databases. Generation of an AMT database for a complex proteome requires combined efforts by many research groups and laboratories, but the chromatography data resulting from these efforts are tied to the particular experimental conditions and, in general, are not transferable from one platform to another. In this work, we consider an approach to solve this problem that is based on the generation of a universal scale for the chromatography data using a multiple-point normalization method. The method follows from the concept of linear correlation between chromatography data obtained over a wide range of separation parameters. The method is further tested on tryptic peptide mixtures, with experimental data collected from collaborative studies by different independent research groups using different separation protocols and mass spectrometry data processing tools.
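
    Under the stated linear-correlation assumption, multiple-point normalization amounts to fitting one least-squares line through shared reference peptides and mapping a platform's retention times onto the universal scale. A minimal sketch with made-up retention times:

      import numpy as np

      t_local = np.array([12.1, 18.7, 25.3, 33.8, 41.2])   # this platform's RTs (min)
      t_univ = np.array([0.15, 0.27, 0.39, 0.55, 0.68])    # universal-scale values

      slope, intercept = np.polyfit(t_local, t_univ, 1)    # one line, many anchor points

      def to_universal(t):
          """Map a locally measured retention time onto the universal scale."""
          return slope * t + intercept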

  13. An evaluation of the THIN database in the OMOP Common Data Model for active drug safety surveillance.

    PubMed

    Zhou, Xiaofeng; Murugesan, Sundaresan; Bhullar, Harshvinder; Liu, Qing; Cai, Bing; Wentworth, Chuck; Bate, Andrew

    2013-02-01

    There has been increased interest in using multiple observational databases to understand the safety profile of medical products during the postmarketing period. However, it is challenging to perform analyses across these heterogeneous data sources. The Observational Medical Outcomes Partnership (OMOP) provides a Common Data Model (CDM) for organizing and standardizing databases. OMOP's work with the CDM has primarily focused on US databases. As a participant in the OMOP Extended Consortium, we implemented the OMOP CDM on the UK Electronic Healthcare Record database, The Health Improvement Network (THIN). The aim of the study was to evaluate the implementation of the THIN database in the OMOP CDM and explore its use for active drug safety surveillance. Following the OMOP CDM specification, the raw THIN database was mapped into a CDM THIN database. Ten Drugs of Interest (DOI) and nine Health Outcomes of Interest (HOI), defined and focused by the OMOP, were created using the CDM THIN database. Quantitative comparison of raw THIN to CDM THIN was performed by execution and analysis of OMOP standardized reports and additional analyses. The practical value of CDM THIN for drug safety and pharmacoepidemiological research was assessed by implementing three analysis methods: Proportional Reporting Ratio (PRR), Univariate Self-Case Control Series (USCCS) and High-Dimensional Propensity Score (HDPS). A published study using raw THIN data was selected to examine the external validity of CDM THIN. Overall demographic characteristics were the same in both databases. Mapping medical and drug codes into the OMOP terminology dictionary was incomplete: 25% of medical codes and 55% of drug codes in raw THIN were not listed in the OMOP terminology dictionary, representing 6% of condition occurrence counts, 4% of procedure occurrence counts and 7% of drug exposure counts in raw THIN. Seven DOIs had <0.3% and three DOIs had 1% of unmapped drug exposure counts; each HOI had at least one definition
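
    The PRR named above is a standard 2x2 disproportionality statistic over drug-outcome counts. A minimal worked example; the counts are made up for illustration.

      def prr(a, b, c, d):
          """a: DOI with HOI, b: DOI without HOI, c: other drugs with HOI, d: neither."""
          return (a / (a + b)) / (c / (c + d))

      print(prr(a=20, b=980, c=100, d=49900))   # 0.02 / 0.002 = 10.0 -> flag for review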

  14. A Global Database of Land Surface Parameters at 1-km Resolution in Meteorological and Climate Models.

    NASA Astrophysics Data System (ADS)

    Masson, Valéry; Champeaux, Jean-Louis; Chauvin, Fabrice; Meriguet, Christelle; Lacaze, Roselyne

    2003-05-01

    Ecoclimap, a new complete surface parameter global dataset at a 1-km resolution, is presented. It is intended to be used to initialize the soil-vegetation-atmosphere transfer schemes (SVATs) in meteorological and climate models (at all horizontal scales). The database supports the 'tile' approach, which is utilized by an increasing number of SVATs. Two hundred and fifteen ecosystems representing areas of homogeneous vegetation are derived by combining existing land cover maps and climate maps, in addition to using Advanced Very High Resolution Radiometer (AVHRR) satellite data. Then, all surface parameters are derived for each of these ecosystems using lookup tables, with the annual cycle of the leaf area index (LAI) being constrained by the AVHRR information. The resulting LAI is validated against a large amount of in situ ground observations, and it is also compared to LAI derived from the International Satellite Land Surface Climatology Project (ISLSCP-2) database and the Polarization and Directionality of the Earth's Reflectance (POLDER) satellite. The comparison shows that this new LAI both reproduces values coherent at large scales with other datasets, and includes the high spatial variations owing to the input land cover data at a 1-km resolution. In terms of climate modeling studies, the use of this new database is shown to improve the surface climatology of the ARPEGE climate model.

  15. Donor Age Still Matters in Liver Transplant: Results From the United Network for Organ Sharing-Scientific Registry of Transplant Recipients Database.

    PubMed

    Montenovo, Martin I; Hansen, Ryan N; Dick, André A S; Reyes, Jorge

    2017-10-01

    Donors older than 60 years provide a large proportion of the organs available for orthotopic liver transplant. However, the use of organs from older donors remains controversial. We hypothesized that the use of older donors would not affect patient and graft survival due to significant improvements in donor-recipient management. We conducted a retrospective cohort analysis using the United Network for Organ Sharing database from February 2002 through December 2012, including non-HCV-infected adults (18 and older) who underwent primary orthotopic liver transplant. We compared patient and graft survival between 4 cohorts based on donor's age (< 60, 60-69, 70-79, and 80+ years) using the Kaplan-Meier estimator. Cox proportional hazards models were constructed to adjust for recipient and donor characteristics to estimate the risk associated with organs from older donors. We identified 35 788 liver transplant recipients. Unadjusted analyses indicated that both patient and graft survival were similar among recipients of donors older than 60 years but significantly inferior to those of recipients who received a liver from a donor younger than 60 years. Multivariate regression revealed that all 3 categories of donor age > 60 years old were significantly associated with worse patient and graft survival. Model for End-Stage Liver Disease score was not an effective modifier of the association between donor age and survival. The use of liver grafts from elderly donors has a negative impact on both patient and graft survival. Recipient's Model for End-Stage Liver Disease score did not change survival based on donor age.
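
    The adjusted analysis described above has the shape of a standard Cox proportional hazards fit. A hedged sketch using the lifelines package with synthetic stand-in rows; the column names are hypothetical, and the study's actual covariate set follows the UNOS/SRTR registry.

      import numpy as np
      import pandas as pd
      from lifelines import CoxPHFitter

      # Synthetic stand-in rows; the study used UNOS recipient records (2002-2012).
      rng = np.random.default_rng(0)
      n = 500
      tx = pd.DataFrame({
          "years": rng.exponential(5.0, n),                          # follow-up time
          "graft_loss": rng.integers(0, 2, n),                       # event indicator
          "donor_age_band": rng.choice(["<60", "60-69", "70-79", "80+"], n),
          "meld": rng.integers(6, 40, n),
      })

      cph = CoxPHFitter().fit(tx, duration_col="years", event_col="graft_loss",
                              formula="donor_age_band + meld")
      cph.print_summary()   # hazard ratios by donor age band, adjusted for MELD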

  16. Modeling organic nitrogen conversions in activated sludge bioreactors.

    PubMed

    Makinia, Jacek; Pagilla, Krishna; Czerwionka, Krzysztof; Stensel, H David

    2011-01-01

    For biological nutrient removal (BNR) systems designed to maximize nitrogen removal, the effluent total nitrogen (TN) concentration may range from 2.0 to 4.0 g N/m(3) with about 25-50% in the form of organic nitrogen (ON). In this study, current approaches to modeling organic N conversions (separate processes vs. constant contents of organic fractions) were compared. A new conceptual model of ON conversions was developed and combined with Activated Sludge Model No. 2d (ASM2d). The model incorporates new insights into the processes of ammonification, biomass decay and hydrolysis of particulate and colloidal ON (PON and CON, respectively). The three major ON fractions incorporated are defined as dissolved (DON) (<0.1 µm), CON (0.1-1.2 µm) and PON (>1.2 µm). Each major fraction was further divided into two sub-fractions: biodegradable and non-biodegradable. Experimental data were collected during field measurements and lab experiments conducted at the "Wschod" WWTP (570,000 PE) in Gdansk (Poland). Accurate steady-state predictions of DON and CON profiles were possible by varying ammonification and hydrolysis rates under different electron acceptor conditions. With the same model parameter set, the behaviors of both inorganic N forms (NH4-N, NOX-N) and ON forms (DON, CON) in the batch experiments were predicted. The challenges in accurately simulating and predicting effluent ON levels from BNR systems stem from the analytical methods required for direct ON measurement (replacing TKN) and from the lack of a large enough database (in-process measurements, dynamic variations of the ON concentrations) from which parameter value ranges can be determined.

  17. A prediction model to estimate completeness of electronic physician claims databases.

    PubMed

    Lix, Lisa M; Yao, Xue; Kephart, George; Quan, Hude; Smith, Mark; Kuwornu, John Paul; Manoharan, Nitharsana; Kouokam, Wilfrid; Sikdar, Khokan

    2015-08-26

    Electronic physician claims databases are widely used for chronic disease research and surveillance, but quality of the data may vary with a number of physician characteristics, including payment method. The objectives were to develop a prediction model for the number of prevalent diabetes cases in fee-for-service (FFS) electronic physician claims databases and apply it to estimate cases among non-FFS (NFFS) physicians, for whom claims data are often incomplete. A retrospective observational cohort design was adopted. Data from the Canadian province of Newfoundland and Labrador were used to construct the prediction model and data from the province of Manitoba were used to externally validate the model. A cohort of diagnosed diabetes cases was ascertained from physician claims, insured resident registry and hospitalisation records. A cohort of FFS physicians who were responsible for the diagnosis was ascertained from physician claims and registry data. A generalised linear model with a γ distribution was used to model the number of diabetes cases per FFS physician as a function of physician characteristics. The expected number of diabetes cases per NFFS physician was estimated. The diabetes case cohort consisted of 31,714 individuals; the mean cases per FFS physician was 75.5 (median = 49.0). Sex and years since specialty licensure were significantly associated (p < 0.05) with the number of cases per physician. Applying the prediction model to NFFS physician registry data resulted in an estimate of 18,546 cases; only 411 were observed in claims data. The model demonstrated face validity in an independent data set. Comparing observed and predicted disease cases is a useful and generalisable approach to assess the quality of electronic databases for population-based research and surveillance. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
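
    The model form described, a γ-family GLM for per-physician case counts, can be sketched with statsmodels. The data below are synthetic stand-ins for the registry variables; a log link is assumed here, which the abstract does not specify.

      import numpy as np
      import pandas as pd
      import statsmodels.api as sm
      import statsmodels.formula.api as smf

      # Synthetic stand-in physician rows (one row per FFS physician).
      rng = np.random.default_rng(0)
      phys = pd.DataFrame({
          "n_cases": rng.gamma(shape=2.0, scale=40.0, size=300),   # cases per physician
          "sex": rng.choice(["F", "M"], size=300),
          "years_since_licensure": rng.integers(1, 40, size=300),
      })

      fit = smf.glm("n_cases ~ sex + years_since_licensure", data=phys,
                    family=sm.families.Gamma(link=sm.families.links.Log())).fit()
      expected = fit.predict(phys).sum()   # in the study, applied to NFFS physicians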

  18. A prediction model to estimate completeness of electronic physician claims databases

    PubMed Central

    Lix, Lisa M; Yao, Xue; Kephart, George; Quan, Hude; Smith, Mark; Kuwornu, John Paul; Manoharan, Nitharsana; Kouokam, Wilfrid; Sikdar, Khokan

    2015-01-01

    Objectives Electronic physician claims databases are widely used for chronic disease research and surveillance, but quality of the data may vary with a number of physician characteristics, including payment method. The objectives were to develop a prediction model for the number of prevalent diabetes cases in fee-for-service (FFS) electronic physician claims databases and apply it to estimate cases among non-FFS (NFFS) physicians, for whom claims data are often incomplete. Design A retrospective observational cohort design was adopted. Setting Data from the Canadian province of Newfoundland and Labrador were used to construct the prediction model and data from the province of Manitoba were used to externally validate the model. Participants A cohort of diagnosed diabetes cases was ascertained from physician claims, insured resident registry and hospitalisation records. A cohort of FFS physicians who were responsible for the diagnosis was ascertained from physician claims and registry data. Primary and secondary outcome measures A generalised linear model with a γ distribution was used to model the number of diabetes cases per FFS physician as a function of physician characteristics. The expected number of diabetes cases per NFFS physician was estimated. Results The diabetes case cohort consisted of 31 714 individuals; the mean cases per FFS physician was 75.5 (median=49.0). Sex and years since specialty licensure were significantly associated (p<0.05) with the number of cases per physician. Applying the prediction model to NFFS physician registry data resulted in an estimate of 18 546 cases; only 411 were observed in claims data. The model demonstrated face validity in an independent data set. Conclusions Comparing observed and predicted disease cases is a useful and generalisable approach to assess the quality of electronic databases for population-based research and surveillance. PMID:26310395

  19. Epidemiology of Occupational Accidents in Iran Based on Social Security Organization Database

    PubMed Central

    Mehrdad, Ramin; Seifmanesh, Shahdokht; Chavoshi, Farzaneh; Aminian, Omid; Izadi, Nazanin

    2014-01-01

    Background: Today, occupational accidents are one of the most important problems in the industrial world. Due to the lack of an appropriate system for registration and reporting, there are no accurate statistics of occupational accidents all over the world, especially in developing countries. Objectives: The aim of this study is epidemiological assessment of occupational accidents in Iran. Materials and Methods: Information on available occupational accidents in the Social Security Organization was extracted from accident reporting and registration forms. In this cross-sectional study, gender, age, economic activity, type of accident and injured body part in 22158 registered accidents during 2008 were described. Results: The occupational accident rate was 253 per 100,000 workers in 2008. 98.2% of injured workers were men. The mean age of injured workers was 32.07 ± 9.12 years. The highest percentage belonged to the age group of 25-34 years old. In our study, most of the accidents occurred in the basic metals industry, electrical and non-electrical machines and the construction industry. Falling down from height and crush injury were the most prevalent accidents. Upper and lower extremities were the most commonly injured body parts. Conclusion: Due to the high rate of accidents in metal and construction industries, engineering controls, the use of appropriate protective equipment and safety worker training seem necessary. PMID:24719699

  20. Epidemiology of occupational accidents in Iran based on Social Security Organization database.

    PubMed

    Mehrdad, Ramin; Seifmanesh, Shahdokht; Chavoshi, Farzaneh; Aminian, Omid; Izadi, Nazanin

    2014-01-01

    Today, occupational accidents are one of the most important problems in the industrial world. Due to the lack of an appropriate system for registration and reporting, there are no accurate statistics of occupational accidents all over the world, especially in developing countries. The aim of this study is epidemiological assessment of occupational accidents in Iran. Information on available occupational accidents in the Social Security Organization was extracted from accident reporting and registration forms. In this cross-sectional study, gender, age, economic activity, type of accident and injured body part in 22158 registered accidents during 2008 were described. The occupational accident rate was 253 per 100,000 workers in 2008. 98.2% of injured workers were men. The mean age of injured workers was 32.07 ± 9.12 years. The highest percentage belonged to the age group of 25-34 years old. In our study, most of the accidents occurred in the basic metals industry, electrical and non-electrical machines and the construction industry. Falling down from height and crush injury were the most prevalent accidents. Upper and lower extremities were the most commonly injured body parts. Due to the high rate of accidents in metal and construction industries, engineering controls, the use of appropriate protective equipment and safety worker training seem necessary.

  1. SMIfp (SMILES fingerprint) chemical space for virtual screening and visualization of large databases of organic molecules.

    PubMed

    Schwartz, Julian; Awale, Mahendra; Reymond, Jean-Louis

    2013-08-26

    SMIfp (SMILES fingerprint) is defined here as a scalar fingerprint describing organic molecules by counting the occurrences of 34 different symbols in their SMILES strings, which creates a 34-dimensional chemical space. Ligand-based virtual screening using the city-block distance CBD(SMIfp) as similarity measure provides good AUC values and enrichment factors for recovering series of actives from the directory of useful decoys (DUD-E) and from ZINC. DrugBank, ChEMBL, ZINC, PubChem, GDB-11, GDB-13, and GDB-17 can be searched by CBD(SMIfp) using an online SMIfp-browser at www.gdb.unibe.ch. Visualization of the SMIfp chemical space was performed by principal component analysis and color-coded maps of the (PC1, PC2)-planes, with interactive access to the molecules enabled by the Java application SMIfp-MAPPLET available from www.gdb.unibe.ch. These maps spread molecules according to their fraction of aromatic atoms, size and polarity. SMIfp provides a new and relevant entry to explore the small molecule chemical space.
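
    The counting scheme behind SMIfp is simple to sketch: tally a fixed symbol alphabet in each SMILES string and compare the resulting count vectors by city-block distance. The alphabet below is abridged and illustrative, not the paper's exact 34-symbol set.

      SYMBOLS = ["C", "c", "N", "n", "O", "o", "S", "F", "Cl", "Br",
                 "=", "#", "(", ")", "1", "2", "3", "+", "-", "@"]

      def smifp(smiles):
          # Count two-character symbols first so "Cl" is not read as "C" + "l".
          counts, i = dict.fromkeys(SYMBOLS, 0), 0
          while i < len(smiles):
              two = smiles[i:i + 2]
              if two in counts:
                  counts[two] += 1
                  i += 2
              else:
                  if smiles[i] in counts:
                      counts[smiles[i]] += 1
                  i += 1
          return [counts[s] for s in SYMBOLS]

      def cbd(fp1, fp2):
          """City-block (Manhattan) distance between two count vectors."""
          return sum(abs(a - b) for a, b in zip(fp1, fp2))

      print(cbd(smifp("CCO"), smifp("c1ccccc1O")))   # ethanol vs. phenol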

  2. Chess databases as a research vehicle in psychology: Modeling large data.

    PubMed

    Vaci, Nemanja; Bilalić, Merim

    2016-09-01

    The game of chess has often been used for psychological investigations, particularly in cognitive science. The clear-cut rules and well-defined environment of chess provide a model for investigations of basic cognitive processes, such as perception, memory, and problem solving, while the precise rating system for the measurement of skill has enabled investigations of individual differences and expertise-related effects. In the present study, we focus on another appealing feature of chess, namely the large archive databases associated with the game. The German national chess database presented in this study represents fruitful ground for the investigation of multiple longitudinal research questions, since it collects the data of over 130,000 players and spans over 25 years. The German chess database collects the data of all players, including hobby players, and all tournaments played. This results in a rich and complete collection of the skill, age, and activity of the whole population of chess players in Germany. The database therefore complements the commonly used expertise approach in cognitive science by opening up new possibilities for the investigation of multiple factors that underlie expertise and skill acquisition. Since large datasets are not common in psychology, their introduction also raises the question of optimal and efficient statistical analysis. We offer the database for download and illustrate how it can be used by providing concrete examples and a step-by-step tutorial using different statistical analyses on a range of topics, including skill development over the lifetime, birth cohort effects, effects of activity and inactivity on skill, and gender differences.

  3. A database and tool for boundary conditions for regional air quality modeling: description and evaluation

    NASA Astrophysics Data System (ADS)

    Henderson, B. H.; Akhtar, F.; Pye, H. O. T.; Napelenok, S. L.; Hutzell, W. T.

    2014-02-01

    Transported air pollutants receive increasing attention as regulations tighten and global concentrations increase. The need to represent international transport in regional air quality assessments requires improved representation of boundary concentrations. Currently available observations are too sparse vertically to provide boundary information, particularly for ozone precursors, but global simulations can be used to generate spatially and temporally varying lateral boundary conditions (LBC). This study presents a public database of global simulations designed and evaluated for use as LBC for air quality models (AQMs). The database covers the contiguous United States (CONUS) for the years 2001-2010 and contains hourly varying concentrations of ozone, aerosols, and their precursors. The database is complemented by a tool for configuring the global results as inputs to regional scale models (e.g., Community Multiscale Air Quality or Comprehensive Air quality Model with extensions). This study also presents an example application based on the CONUS domain, which is evaluated against satellite retrieved ozone and carbon monoxide vertical profiles. The results show performance is largely within uncertainty estimates for ozone from the Ozone Monitoring Instrument and carbon monoxide from the Measurements Of Pollution In The Troposphere (MOPITT), but there were some notable biases compared with Tropospheric Emission Spectrometer (TES) ozone. Compared with TES, our ozone predictions are high-biased in the upper troposphere, particularly in the south during January. This publication documents the global simulation database, the tool for conversion to LBC, and the evaluation of concentrations on the boundaries. This documentation is intended to support applications that require representation of long-range transport of air pollutants.

  4. Model Organisms and Traditional Chinese Medicine Syndrome Models

    PubMed Central

    Xu, Jin-Wen

    2013-01-01

    Traditional Chinese medicine (TCM) is an ancient medical system with a unique cultural background. Nowadays, more and more Western countries are accepting it because of its therapeutic efficacy. However, the safety and precise pharmacological mechanisms of TCM remain uncertain. Given the potential application of TCM in healthcare, it is necessary to construct a scientific evaluation system with TCM characteristics and to benchmark how it differs from the standards of Western medicine. Model organisms have played an important role in the understanding of basic biological processes; they are easier to study in certain research contexts and can yield information applicable to other species. Despite the controversy over suitable syndrome animal models under TCM theoretical guidance, it is unquestionable that model organisms should be used in studies of TCM modernization, which will bring modern scientific standards to this ancient system of medicine. In this review, we aim to summarize the use of model organisms in the construction of TCM syndrome models and highlight the relevance of modern medicine to TCM syndrome animal models. This review will serve as a foundation for further research on model organisms and their application in TCM syndrome models. PMID:24381636

  5. Data-Based Consultation in Student Affairs.

    ERIC Educational Resources Information Center

    Newman, Jody L.; Fuqua, Dale R.

    1984-01-01

    Provides an introduction to data-based interventions in the organizational context. Compares four models of data use and discusses how they pertain to student affairs organizations and to staff training and development. (JAC)

  6. Very fast road database verification using textured 3D city models obtained from airborne imagery

    NASA Astrophysics Data System (ADS)

    Bulatov, Dimitri; Ziems, Marcel; Rottensteiner, Franz; Pohl, Melanie

    2014-10-01

    Road databases are known to be an important part of any geodata infrastructure, e.g. as the basis for urban planning or emergency services. Updating road databases for crisis events must be performed quickly and with the highest possible degree of automation. We present a semi-automatic algorithm for road verification using textured 3D city models, starting from aerial or even UAV images. This algorithm contains two processes, which exchange input and output but basically run independently of each other: textured urban terrain reconstruction and road verification. The first process performs a dense photogrammetric reconstruction of the 3D geometry of the scene using depth maps. The second process is our core procedure, since it contains various methods for road verification. Each method represents a unique road model and a specific strategy, and thus is able to deal with a specific type of road. Each method is designed to provide two probability distributions, where the first describes the state of a road object (correct, incorrect) and the second describes the state of its underlying road model (applicable, not applicable). Based on Dempster-Shafer theory, both distributions are mapped to a single distribution that refers to three states: correct, incorrect, and unknown. With respect to the interaction of both processes, the normalized elevation map and the digital orthophoto generated during 3D reconstruction are the necessary input - together with initial road database entries - for the road verification process. If the entries of the database are too obsolete or not available at all, sensor data evaluation enables classification of the road pixels of the elevation map, followed by road map extraction by means of vectorization and filtering of the geometrically and topologically inconsistent objects. Depending on the time issue and availability of a geo-database for buildings, the urban terrain reconstruction procedure has semantic models
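
    The abstract does not detail the Dempster-Shafer construction, so the following minimal Python sketch shows only one plausible reading, in which the applicability mass scales the road-state evidence and the non-applicability mass becomes ignorance ("unknown"); the numbers are invented.

    def fuse(p_correct: float, p_applicable: float) -> dict[str, float]:
        """Scale road-state evidence by model applicability; the remaining
        mass expresses ignorance and goes to the 'unknown' state."""
        return {
            "correct": p_correct * p_applicable,
            "incorrect": (1.0 - p_correct) * p_applicable,
            "unknown": 1.0 - p_applicable,
        }

    print(fuse(p_correct=0.9, p_applicable=0.7))
    # approx. {'correct': 0.63, 'incorrect': 0.07, 'unknown': 0.30}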

  7. Charophytes: Evolutionary Giants and Emerging Model Organisms

    PubMed Central

    Domozych, David S.; Popper, Zoë A.; Sørensen, Iben

    2016-01-01

    Charophytes are the group of green algae whose ancestral lineage gave rise to land plants, in what was a profoundly transformative event in the natural history of the planet. Extant charophytes exhibit many features that are similar to those found in land plants, and their relatively simple phenotypes make them efficacious organisms for the study of many fundamental biological phenomena. Several taxa including Micrasterias, Penium, Chara, and Coleochaete are valuable model organisms for the study of cell biology, development, physiology and ecology of plants. New and rapidly expanding molecular studies are increasing the use of charophytes, which in turn will dramatically enhance our understanding of the evolution of plants and the adaptations that allowed for survival on land. The Frontiers in Plant Science series on “Charophytes” provides an assortment of new research reports and reviews on charophytes and their emerging significance as model plants. PMID:27777578

  8. Experiment Databases

    NASA Astrophysics Data System (ADS)

    Vanschoren, Joaquin; Blockeel, Hendrik

    Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.
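
    As a toy illustration of such a repository, the following sketch stores a few algorithm runs in an in-memory SQLite database and queries the combined results; the schema, run records and scores are invented and do not correspond to any existing experiment database.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE experiment (
        id        INTEGER PRIMARY KEY,
        algorithm TEXT NOT NULL,
        dataset   TEXT NOT NULL,
        params    TEXT,            -- e.g. serialized key=value settings
        accuracy  REAL
    );
    """)
    runs = [
        ("C4.5", "iris", "cf=0.25", 0.94),
        ("C4.5", "wine", "cf=0.25", 0.91),
        ("SVM",  "iris", "C=1.0",   0.96),
        ("SVM",  "wine", "C=1.0",   0.97),
    ]
    conn.executemany(
        "INSERT INTO experiment (algorithm, dataset, params, accuracy) "
        "VALUES (?, ?, ?, ?)", runs)

    # Query the combined results of many prior runs, e.g. the mean
    # accuracy per algorithm across all datasets.
    for row in conn.execute(
            "SELECT algorithm, AVG(accuracy) FROM experiment "
            "GROUP BY algorithm ORDER BY 2 DESC"):
        print(row)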

  9. Modeling Attrition in Organizations from Email Communication

    DTIC Science & Technology

    2013-09-08

    Modeling people's online behavior in relation to their real-world social context is an interesting and important problem. Application scenarios include attrition in online multiplayer games, user involvement and movement in online social networking platforms, and employee turnover within an organization. The same problem of reducing churn rates appears in the scenario of social networking platforms.

  10. Modeling plasmonic efficiency enhancement in organic photovoltaics.

    PubMed

    Taff, Y; Apter, B; Katz, E A; Efron, U

    2015-09-10

    Efficiency enhancement of bulk heterojunction (BHJ) organic solar cells by means of the plasmonic effect is investigated by using finite-difference time-domain (FDTD) optical simulations combined with analytical modeling of exciton dissociation and charge transport efficiencies. The proposed method provides an improved analysis of the cell performance compared to previous FDTD studies. The results of the simulations predict an 11.8% increase in the cell's short circuit current with the use of Ag nano-hexagons.

  11. Empirical cost models for estimating power and energy consumption in database servers

    NASA Astrophysics Data System (ADS)

    Valdivia Garcia, Harold Dwight

    The explosive growth in the size of data centers, coupled with the widespread use of virtualization technology, has made power and energy consumption major concerns for data center administrators. Provisioning decisions must take into consideration not only target application performance but also the power demands and total energy consumption incurred by the hardware and software to be deployed at the data center. Failure to do so will result in damaged equipment, power outages, and inefficient operation. Since database servers comprise one of the most popular and important server applications deployed in such facilities, it becomes necessary to have accurate cost models that can predict the power and energy demands that each database workload will impose on the system. In this work we present an empirical methodology to estimate the power and energy cost of database operations. Our methodology uses multiple linear regression to derive accurate cost models that depend only on readily available statistics such as selectivity factors, tuple size, number of columns and relational cardinality. Moreover, our method does not need measurements of individual hardware components, but rather the total power and energy consumption measured at a server. We have implemented our methodology and run experiments with several server configurations. Our experiments indicate that we can predict power and energy more accurately than alternative methods found in the literature.
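
    The following Python sketch illustrates the regression idea: fit server power draw as a linear function of readily available query statistics. The feature set mirrors the statistics named above, but all numbers are invented for illustration, not the paper's measurements.

    import numpy as np

    # Each row: [selectivity, tuple_size_bytes, n_columns, cardinality]
    X = np.array([
        [0.10,  64,  4, 1e5],
        [0.50, 128,  8, 5e5],
        [0.90, 256, 16, 1e6],
        [0.25,  64,  6, 2e5],
        [0.75, 512, 12, 8e5],
        [0.60, 256, 10, 6e5],
    ])
    y = np.array([110.0, 145.0, 220.0, 120.0, 190.0, 170.0])  # measured watts

    # Add an intercept column and solve ordinary least squares.
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)

    new_query = np.array([1.0, 0.30, 128, 8, 3e5])  # leading 1 = intercept
    print("predicted power (W):", new_query @ coef)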

  12. Hydrodynamic interaction of two swimming model micro-organisms

    NASA Astrophysics Data System (ADS)

    Ishikawa, Takuji; Simmonds, M. P.; Pedley, T. J.

    2006-12-01

    In order to understand the rheological and transport properties of a suspension of swimming micro-organisms, it is necessary to analyse the fluid-dynamical interaction of pairs of such swimming cells. In this paper, a swimming micro-organism is modelled as a squirming sphere with prescribed tangential surface velocity, referred to as a squirmer. The centre of mass of the sphere may be displaced from the geometric centre (bottom-heaviness). The effects of inertia and Brownian motion are neglected, because real micro-organisms swim at very low Reynolds numbers but are too large for Brownian effects to be important. The interaction of two squirmers is calculated analytically for the limits of small and large separations and is also calculated numerically using a boundary-element method. The analytical and the numerical results for the translational and rotational velocities and for the stresslet of two squirmers correspond very well. We sought to generate a database for an interacting pair of squirmers from which one can easily predict the motion of a collection of squirmers. The behaviour of two interacting squirmers is also discussed phenomenologically. The results for the trajectories of two squirmers show that the squirmers first attract each other, then change their orientation dramatically when they are in near contact, and finally separate from each other. The effect of bottom-heaviness is considerable. Restricting the trajectories to two dimensions is shown to give misleading results. Some movies of interacting squirmers are available with the online version of the paper.

  13. Modeling and Design of Capacitive Micromachined Ultrasonic Transducers Based on Database Optimization

    NASA Astrophysics Data System (ADS)

    Chang, M. W.; Gwo, T. J.; Deng, T. M.; Chang, H. C.

    2006-04-01

    A Capacitive Micromachined Ultrasonic Transducer (CMUT) simulation database, based on electromechanical coupling theory, has been fully developed for versatile capacitive microtransducer design and analysis. Both arithmetic and graphic configurations are used to find optimal parameters based on serial coupling simulations. The key modeling parameters identified can effectively improve microtransducer characteristics and reliability. This method can be used to reduce design time and fabrication cost by eliminating trial-and-error procedures. Various microtransducers with optimized characteristics can be developed economically using the developed database. A simulation to design an ultrasonic microtransducer is presented as a worked example. The dependence of the output response on membrane geometry and vibration displacement is demonstrated. The electromechanical coupling effects, mechanical impedance and frequency response are also taken into consideration for optimal microstructures. The microdevice parameters with the best output signal response are predicted, and microfabrication processing constraints and realities are also taken into account.

  14. Discovery of approximate concepts in clinical databases based on a rough set model

    NASA Astrophysics Data System (ADS)

    Tsumoto, Shusaku

    2000-04-01

    Rule discovery methods have been introduced to find useful and unexpected patterns in databases. However, one of the most important problems with these methods is that the extracted rules contain only positive knowledge: they do not include the negative information that medical experts need in order to confirm whether a patient will suffer from symptoms caused by a drug side-effect. This paper first discusses the characteristics of medical reasoning and defines positive and negative rules based on a rough set model. Algorithms for the induction of positive and negative rules are then introduced. The proposed method was evaluated on clinical databases, and the experimental results show that several interesting patterns were discovered, such as a rule describing a relation between urticaria caused by antibiotics and food.
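
    The abstract does not reproduce the rule definitions, so the sketch below uses a common rough-set-style formulation, not necessarily the paper's exact one: a candidate rule "condition -> disease" is scored by accuracy (how reliably the condition implies the disease) and coverage (how much of the disease the condition captures), with positive rules demanding high accuracy and negative/exclusive rules demanding high coverage. The toy records are invented.

    def accuracy_coverage(records, condition, disease):
        """Accuracy = |condition & disease| / |condition|;
        coverage = |condition & disease| / |disease|."""
        matched = [r for r in records if condition(r)]
        diseased = [r for r in records if r["diagnosis"] == disease]
        both = [r for r in matched if r["diagnosis"] == disease]
        alpha = len(both) / len(matched) if matched else 0.0
        kappa = len(both) / len(diseased) if diseased else 0.0
        return alpha, kappa

    records = [
        {"urticaria": True,  "antibiotic": True,  "diagnosis": "side-effect"},
        {"urticaria": True,  "antibiotic": False, "diagnosis": "food-allergy"},
        {"urticaria": True,  "antibiotic": True,  "diagnosis": "side-effect"},
        {"urticaria": False, "antibiotic": True,  "diagnosis": "side-effect"},
    ]
    cond = lambda r: r["urticaria"] and r["antibiotic"]
    alpha, kappa = accuracy_coverage(records, cond, "side-effect")
    print(f"accuracy={alpha:.2f} coverage={kappa:.2f}")  # 1.00 and 0.67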

  15. Associative memory model for searching an image database by image snippet

    NASA Astrophysics Data System (ADS)

    Khan, Javed I.; Yun, David Y.

    1994-09-01

    This paper presents an associative memory called multidimensional holographic associative computing (MHAC), which can potentially be used to perform feature-based image database queries using an image snippet. MHAC has the unique capability to selectively focus on specific segments of a query frame during associative retrieval. As a result, this model can perform searches on the basis of the featural significance described by a subset of the snippet pixels. This capability is critical for visual query in image databases, because the cognitive index features in the snippet are quite often statistically weak. Unlike conventional artificial associative memories, MHAC uses a two-level representation and incorporates additional meta-knowledge about the reliability status of the segments of information it receives and forwards. In this paper we present an analysis of the focus characteristics of MHAC.

  16. Seismic hazard assessment for Myanmar: Earthquake model database, ground-motion scenarios, and probabilistic assessments

    NASA Astrophysics Data System (ADS)

    Chan, C. H.; Wang, Y.; Thant, M.; Maung Maung, P.; Sieh, K.

    2015-12-01

    We have constructed an earthquake and fault database, conducted a series of ground-shaking scenarios, and proposed seismic hazard maps for all of Myanmar and hazard curves for selected cities. Our earthquake database integrates the ISC, ISC-GEM and global ANSS Comprehensive Catalogues, and includes harmonized magnitude scales without duplicate events. Our active fault database includes active fault data from previous studies. Using the parameters from these updated databases (i.e., the Gutenberg-Richter relationship, slip rate, maximum magnitude and the elapsed time since the last events), we determined earthquake recurrence models for the seismogenic sources. To evaluate ground-shaking behaviour in different tectonic regimes, we conducted a series of tests matching modelled ground motions to the felt intensities of earthquakes. The case of the 1975 Bagan earthquake showed that the ground motion prediction equation (GMPE) of Atkinson and Boore (2003) best fits subduction events, while the 2011 Tarlay and 2012 Thabeikkyin events suggested that the GMPE of Akkar and Cagnan (2010) best fits crustal earthquakes. We thus incorporated the best-fitting GMPEs and site conditions based on Vs30 (the average shear-wave velocity down to 30 m depth) from analysis of topographic slope and microtremor array measurements to assess seismic hazard. The hazard is highest in regions close to the Sagaing Fault and along the western coast of Myanmar, as seismic sources there produce earthquakes at short intervals and/or their last events occurred a long time ago. The hazard curves for the cities of Bago, Mandalay, Sagaing, Taungoo and Yangon show higher hazard for sites close to an active fault or with a low Vs30, e.g., downtown Sagaing and the Shwemawdaw Pagoda in Bago.
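
    The Gutenberg-Richter relationship that underlies such recurrence models is log10 N(M) = a - b*M, where N(M) is the annual number of events of magnitude M or larger. A minimal Python sketch, with hypothetical a/b values rather than the study's Myanmar parameters:

    def annual_rate(magnitude: float, a: float, b: float) -> float:
        """Annual rate of earthquakes with magnitude >= `magnitude`."""
        return 10.0 ** (a - b * magnitude)

    a, b = 4.5, 1.0  # hypothetical source-zone parameters
    for m in (5.0, 6.0, 7.0):
        rate = annual_rate(m, a, b)
        print(f"M>={m}: {rate:.4f}/yr (return period {1 / rate:.0f} yr)")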

  17. Influence of high-resolution surface databases on the modeling of local atmospheric circulation systems

    NASA Astrophysics Data System (ADS)

    Paiva, L. M. S.; Bodstein, G. C. R.; Pimentel, L. C. G.

    2014-08-01

    Large-eddy simulations are performed using the Advanced Regional Prediction System (ARPS) code at horizontal grid resolutions as fine as 300 m to assess the influence of detailed and updated surface databases on the modeling of local atmospheric circulation systems of urban areas with complex terrain. Applications to air pollution and wind energy are sought. These databases comprise 3 arc-sec topographic data from the Shuttle Radar Topography Mission, 10 arc-sec vegetation-type data from the European Space Agency (ESA) GlobCover project, and 30 arc-sec leaf area index and fraction of absorbed photosynthetically active radiation data from the ESA GlobCarbon project. Simulations are carried out for the metropolitan area of Rio de Janeiro using six one-way nested-grid domains that allow the choice of distinct parametric models and vertical resolutions associated with each grid. ARPS is initialized using the Global Forecasting System with 0.5°-resolution data from the National Centers for Environmental Prediction, which is also used every 3 h as the lateral boundary condition. Topographic shading is turned on, and two soil layers are used to compute the soil temperature and moisture budgets in all runs. Results for two simulated runs covering three periods of time are compared to surface and upper-air observational data to explore the dependence of the simulations on initial and boundary conditions, grid resolution, and topographic and land-use databases. Our comparisons show overall good agreement between simulated and observed data, mainly for the potential temperature and wind speed fields, and clearly indicate that the use of high-resolution databases significantly improves our ability to predict the local atmospheric circulation.

  18. Influence of high-resolution surface databases on the modeling of local atmospheric circulation systems

    NASA Astrophysics Data System (ADS)

    Paiva, L. M. S.; Bodstein, G. C. R.; Pimentel, L. C. G.

    2013-12-01

    Large-eddy simulations are performed using the Advanced Regional Prediction System (ARPS) code at horizontal grid resolutions as fine as 300 m to assess the influence of detailed and updated surface databases on the modeling of local atmospheric circulation systems of urban areas with complex terrain. Applications to air pollution and wind energy are sought. These databases comprise 3 arc-sec topographic data from the Shuttle Radar Topography Mission, 10 arc-sec vegetation-type data from the European Space Agency (ESA) GlobCover Project, and 30 arc-sec Leaf Area Index and Fraction of Absorbed Photosynthetically Active Radiation data from the ESA GlobCarbon Project. Simulations are carried out for the Metropolitan Area of Rio de Janeiro using six one-way nested-grid domains that allow the choice of distinct parametric models and vertical resolutions associated with each grid. ARPS is initialized using the Global Forecasting System with 0.5°-resolution data from the National Centers for Environmental Prediction, which is also used every 3 h as the lateral boundary condition. Topographic shading is turned on, and two soil layers with depths of 0.01 and 1.0 m are used to compute the soil temperature and moisture budgets in all runs. Results for two simulated runs covering the period from 6 to 7 September 2007 are compared to surface and upper-air observational data to explore the dependence of the simulations on initial and boundary conditions, topographic and land-use databases, and grid resolution. Our comparisons show overall good agreement between simulated and observed data, and also indicate that the low resolution of the 30 arc-sec soil database from the United States Geological Survey, the soil moisture and skin temperature initial conditions assimilated from the GFS analyses, and the synoptic forcing on the lateral boundaries of the finer grids may compromise an adequate spatial description of the meteorological variables.

  19. Longitudinal driver model and collision warning and avoidance algorithms based on human driving databases

    NASA Astrophysics Data System (ADS)

    Lee, Kangwon

    Intelligent vehicle systems, such as Adaptive Cruise Control (ACC) or Collision Warning/Collision Avoidance (CW/CA), are currently under development, and several companies already offer ACC on selected models. Control or decision-making algorithms of these systems are commonly evaluated through extensive computer simulations and well-defined scenarios on test tracks. However, they have rarely been validated with large quantities of naturalistic human driving data. This dissertation utilized two University of Michigan Transportation Research Institute databases (Intelligent Cruise Control Field Operational Test and System for Assessment of Vehicle Motion Environment) in the development and evaluation of longitudinal driver models and CW/CA algorithms. First, to examine how drivers normally follow other vehicles, the vehicle motion data from the databases were processed using a Kalman smoother. The processed data were then used to fit and evaluate existing longitudinal driver models (e.g., the linear follow-the-leader model, Newell's special model, the nonlinear follow-the-leader model, the linear optimal control model, the Gipps model and the optimal velocity model). A modified version of the Gipps model was proposed and found to be accurate in both the microscopic (vehicle) and macroscopic (traffic) senses. Second, to examine emergency braking behavior and to evaluate CW/CA algorithms, the concepts of signal detection theory and a performance index suitable for unbalanced situations (few threatening data points vs. many safe data points) are introduced. Selected existing CW/CA algorithms were found to have a performance index (the geometric mean of true-positive rate and precision) not exceeding 20%. To optimize the parameters of the CW/CA algorithms, a new numerical optimization scheme was developed to replace the original data points with their representative statistics. A new CW/CA algorithm was proposed, which was found to score higher than 55% in the
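
    The performance index mentioned above, the geometric mean of true-positive rate and precision, is straightforward to compute; the counts in this sketch are invented for illustration.

    import math

    def gmean_index(tp: int, fp: int, fn: int) -> float:
        """Geometric mean of recall and precision, robust for unbalanced
        data (few threats, many safe samples)."""
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        return math.sqrt(recall * precision)

    # A warning algorithm that fires on 45 of 60 true threats while
    # raising 180 false alarms:
    print(f"{gmean_index(tp=45, fp=180, fn=15):.2f}")  # about 0.39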

  20. Solubility Database

    National Institute of Standards and Technology Data Gateway

    SRD 106 IUPAC-NIST Solubility Database (Web, free access) These solubilities are compiled from 18 volumes of the International Union of Pure and Applied Chemistry (IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

  1. Compartmental and Data-Based Modeling of Cerebral Hemodynamics: Nonlinear Analysis.

    PubMed

    Henley, Brandon; Shin, Dae; Zhang, Rong; Marmarelis, Vasilis

    2016-07-09

    Objective: As an extension to our study comparing a putative compartmental and a data-based model of linear dynamic cerebral autoregulation (CA) and CO2 vasomotor reactivity (VR), we study the CA-VR process in a nonlinear context. Methods: We use the concept of Principal Dynamic Modes (PDM) in order to obtain a compact and more easily interpretable input-output model. This in silico study permits the use of input data with a dynamic range large enough to simulate the classic homeostatic CA and VR curves using a putative structural model of the regulatory control of the cerebral circulation. The PDM models obtained using theoretical and experimental data are compared. Results: The PDM model was able to reflect accurately both the simulated static CA and VR curves in the Associated Nonlinear Functions (ANFs). Similar to experimental observations, the PDM model essentially separates the pressure-flow relationship into a linear component with fast dynamics and nonlinear components with slow dynamics. In addition, we found good qualitative agreement between the PDMs representing the dynamic theoretical and experimental CO2-flow relationships. Conclusion: Under the modeling assumptions and in light of other experimental findings, we hypothesize that PDMs obtained from experimental data correspond to passive fluid-dynamical and active regulatory mechanisms. Significance: Both hypothesis-based and data-based modeling approaches can be combined to offer insight into the physiological basis of the PDM model obtained from human experimental data. The PDM modeling approach potentially offers a practical way to quantify the status of specific regulatory mechanisms in the CA-VR process.

  2. Global Exposure Modelling of Semivolatile Organic Compounds

    NASA Astrophysics Data System (ADS)

    Guglielmo, F.; Lammel, G.; Maier-Reimer, E.

    2008-12-01

    Organic compounds which are persistent and toxic, such as the agrochemicals γ-hexachlorocyclohexane (γ-HCH, lindane) and dichlorodiphenyltrichloroethane (DDT), pose a hazard to ecosystems. These compounds are semivolatile, hence multicompartmental substances, and are subject to long-range transport (LRT) in the atmosphere and ocean. Being lipophilic, they accumulate in the tissues of exposed organisms and biomagnify along food chains. The multicompartmental global fate and LRT of DDT and lindane in the atmosphere and ocean have been studied using application data for 1980, on a decadal scale, with a model based on coupled atmosphere and (for the first time for these compounds) ocean General Circulation Models (ECHAM5 and MPI-OM). The model system furthermore encompasses 2D terrestrial compartments (soil and vegetation) and sea ice, a fully dynamic atmospheric aerosol module (HAM) and an ocean biogeochemistry module (HAMOCC5). Large mass fractions of the compounds are found in soil. Lindane is also found in comparable amounts in the ocean. DDT has the longest residence time in almost all compartments. The sea-ice compartment locally almost inhibits volatilization from the sea. Air/sea exchange is also affected, with a reduction of up to 35% for DDT, by partitioning to the organic phases (suspended and dissolved particulate matter) in the global oceans. Partitioning enhances vertical transport in the sea. Ocean dynamics are found to be more significant for vertical transport than sinking associated with particulate matter. LRT in the global environment is determined by the fast atmospheric circulation. Net meridional transport taking place in the ocean is locally effective mostly via western boundary currents, for applications at mid-latitudes. The pathways of the long-lived semivolatile organic compounds studied include a sequence of several cycles of volatilisation, transport in the atmosphere, deposition and transport in the ocean (multihopping substances). Multihopping is

  3. Transposing an active fault database into a seismic hazard fault model for nuclear facilities - Part 1: Building a database of potentially active faults (BDFA) for metropolitan France

    NASA Astrophysics Data System (ADS)

    Jomard, Hervé; Cushing, Edward Marc; Palumbo, Luigi; Baize, Stéphane; David, Claire; Chartier, Thomas

    2017-09-01

    The French Institute for Radiation Protection and Nuclear Safety (IRSN), with the support of the Ministry of Environment, compiled a database (BDFA) to define and characterize known potentially active faults of metropolitan France. The general structure of BDFA is presented in this paper. To date, BDFA includes 136 faults and represents a first step toward the implementation of seismic source models that can be used for both deterministic and probabilistic seismic hazard calculations. A robustness index was introduced, highlighting that less than 15% of the database is controlled by reasonably complete data sets. An example of transposing BDFA into a fault source model for PSHA (probabilistic seismic hazard analysis) calculation is presented for the Upper Rhine Graben (eastern France) and exploited in the companion paper (Chartier et al., 2017, hereafter Part 2) in order to illustrate ongoing challenges for probabilistic fault-based seismic hazard calculations.

  4. Modelling motions within the organ of Corti

    NASA Astrophysics Data System (ADS)

    Ni, Guangjian; Baumgart, Johannes; Elliott, Stephen

    2015-12-01

    Most cochlear models used to describe basilar membrane vibration along the cochlea are concerned with macromechanics and often assume that the organ of Corti moves as a single unit, ignoring the individual motion of its different components. New experimental technologies provide the opportunity to measure the dynamic behaviour of different components within the organ of Corti, but only for certain types of excitation. It is thus still difficult to measure every aspect of cochlear dynamics directly, particularly for acoustic excitation of the fully active cochlea. The present work studies the dynamic response of a model of the cross-section of the cochlea, at the microscopic level, using the finite element method. The elastic components are modelled with plate elements, and the perilymph and endolymph are modelled with inviscid fluid elements. The individual motion of each component within the organ of Corti is calculated with dynamic pressure loading on the basilar membrane, and the motions of the experimentally accessible parts are compared with measurements. The reticular lamina moves as a stiff plate, without much bending, pivoting around a point close to the region of the inner hair cells, as observed experimentally. The basilar membrane shows a slightly asymmetric mode shape, with maximum displacement occurring between the second and third rows of the outer hair cells. The dynamic response when driven by the outer hair cells is also calculated and compared with experiments. The receptance of the basilar membrane motion and of the deflection of the hair bundles of the outer hair cells is thus obtained when driven either acoustically or electrically. In this way, the fully active linear response of the basilar membrane to acoustic excitation can be predicted by using a linear superposition of the calculated receptances and a defined gain function for the outer hair cell feedback.

  5. Markov model recognition and classification of DNA/protein sequences within large text databases.

    PubMed

    Wren, Jonathan D; Hildebrand, William H; Chandrasekaran, Sreedevi; Melcher, Ulrich

    2005-11-01

    Short sequence patterns frequently define regions of biological interest (binding sites, immune epitopes, primers, etc.), yet a large fraction of this information exists only within the scientific literature and is thus difficult to locate via conventional means (e.g. keyword queries or manual searches). We describe herein a system to accurately identify and classify sequence patterns from within large corpora using an n-gram Markov model (MM). As expected, on test sets we found that identification of sequences with limited alphabets and/or regular structures such as nucleic acids (non-ambiguous) and peptide abbreviations (3-letter) was highly accurate, whereas classification of symbolic (1-letter) peptide strings with more complex alphabets was more problematic. The MM was used to analyze two very large, sequence-containing corpora: over 7.75 million Medline abstracts and 9000 full-text articles from Journal of Virology. Performance was benchmarked by comparing the results with Journal of Virology entries in two existing manually curated databases: VirOligo and the HLA Ligand Database. Performance estimates were 98 ± 2% precision/84% recall for primer identification and classification and 67 ± 6% precision/85% recall for peptide epitopes. We also find a dramatic difference between the amounts of sequence-related data reported in abstracts versus full text. Our results suggest that automated extraction and classification of sequence elements is a promising, low-cost means of sequence database curation and annotation. MM routine and datasets are available upon request.
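
    As a rough sketch of the classification idea (not the paper's exact model or training corpus), the following bigram Markov model scores how DNA-like a token is; a token containing symbols never seen in training scores minus infinity.

    import math
    from collections import defaultdict

    def train_bigram(strings, alphabet):
        """Bigram transition probabilities with add-one smoothing."""
        counts = defaultdict(lambda: {c: 1 for c in alphabet})
        for s in strings:
            for a, b in zip(s, s[1:]):
                counts[a][b] += 1
        return {a: {b: n / sum(row.values()) for b, n in row.items()}
                for a, row in counts.items()}

    def score(model, s):
        """Length-normalised log-likelihood of a token under the model."""
        ll = 0.0
        for a, b in zip(s, s[1:]):
            row = model.get(a)
            if row is None or b not in row:
                return float("-inf")  # symbol outside the model's alphabet
            ll += math.log(row[b])
        return ll / max(len(s) - 1, 1)

    dna = train_bigram(["ACGTACGGTTACG", "GGCATTACGATCG"], "ACGT")
    print(score(dna, "ACGGTTAC"))  # finite: DNA-like
    print(score(dna, "PROTEINS"))  # -inf: not DNA-like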

  6. A grid-based model for integration of distributed medical databases.

    PubMed

    Luo, Yongxing; Jiang, Lijun; Zhuang, Tian-ge

    2009-12-01

    Grid has emerged recently as an integration infrastructure for the sharing and coordinated use of diverse resources in dynamic, distributed environments. In this paper, we present a prototype system for the integration of heterogeneous medical databases based on Grid technology, which can provide a uniform access interface and an efficient query mechanism for different medical databases. After presenting the architecture of the prototype system, which employs corresponding Grid services and middleware technologies, we analyze in detail its basic functional components, including OGSA-DAI, the metadata model, transaction management, and query processing, which cooperate with each other to enable uniform access to and seamless integration of the underlying heterogeneous medical databases. We then test the effectiveness and performance of the system with a query instance, analyze the experimental results, and discuss some issues relating to practical medical applications. Although the prototype system has been implemented and tested in a simulated hospital information environment at present, the underlying principles are applicable to practical applications.

  7. Tectonic database and plate tectonic model of the former USSR territory

    SciTech Connect

    Bocharova, N.Yu.; Scotese, C.R.; Pristavakina, E.I.; Zonenshain, L.P. (Center for Russian Geology and Tectonics)

    1993-02-01

    A digital geographic database for the former USSR was compiled using published geologic and geodynamic maps and the unpublished suture map of Lev Zonenshain (1991). The database includes more than 900 tectonic features: strike-slip faults, sutures, thrusts, fossil and active rifts, fossil and active subduction zones, boundaries of the major and minor Precambrian blocks, ophiolites, and various volcanic complexes. The attributes of each structural unit include the type of structure, name, age, tectonic setting and geographical coordinates. Paleozoic and Early Mesozoic reconstructions of the former USSR and adjacent regions were constructed using this tectonic database together with paleomagnetic data and the motions of continents over fixed hot spots. Global apparent polar wander paths in European and Siberian coordinates were calculated back to Cambrian time, using the paleomagnetic pole summaries of Van der Voo (1992) and Khramov (1992) and the global plate tectonic model of the Paleomap Project (Scotese and Becker, 1992). Trajectories of intraplate volcanics in South Siberia, Mongolia and Scandinavia, and data on the White Mountain plutons and Karoo flood basalts, were also taken into account. Using these new data, the authors recalculated the stage and finite poles of rotation of Siberia and Europe with respect to the hot spot reference frame for the time interval 160 to 450 Ma.

  8. Theory and modeling of stereoselective organic reactions

    SciTech Connect

    Houk, K.N.; Paddon-Row, M.N.; Rondan, N.G.; Wu, Y.D.; Brown, F.K.; Spellmeyer, D.C.; Metz, J.T.; Li, Y.; Loncharich, R.J.

    1986-03-07

    Theoretical investigations of the transition structures of additions and cycloadditions reveal details about the geometries of bond-forming processes that are not directly accessible by experiment. The conformational analysis of transition states has been developed from theoretical generalizations about the preferred angle of attack by reagents on multiple bonds and predictions of conformations with respect to partially formed bonds. Qualitative rules for the prediction of the stereochemistries of organic reactions have been devised, and semi-empirical computational models have also been developed to predict the stereoselectivities of reactions of large organic molecules, such as nucleophilic additions to carbonyls, electrophilic hydroborations and cycloadditions, and intramolecular radical additions and cycloadditions. 52 references, 7 figures.

  9. Assessment of cloud cover in climate models and reanalysis databases with ISCCP over the Mediterranean region

    NASA Astrophysics Data System (ADS)

    Enriquez, Aaron; Calbo, Josep; Gonzalez, Josep-Abel

    2013-04-01

    Clouds are an important regulator of climate due to their influence on the water balance of the atmosphere and their interaction with solar and infrared radiation. At any time, clouds cover a great percentage of the Earth's surface but their distribution is very irregular along time and space, which makes the evaluation of their influence on climate a difficult task. At present there are few studies related to cloud cover comparing current climate models with observational data. In this study, the database of monthly cloud cover provided by the International Satellite Cloud Climatology Project (ISCCP) has been chosen as a reference against which we compare the output of CMIP5 climate models and reanalysis databases, on the domain South-Europe-Mediterranean (SEM) established by the Intergovernmental Panel on Climate Change (IPCC) [1]. The study covers the period between 1984 and 2009, and the performance of cloud cover estimations for seasons has also been studied. To quantify the agreement between the databases we use two types of statistics: bias and SkillScore, which is based on the probability density functions (PDFs) of the databases [2]. We also use Taylor diagrams to visualize the statistics. Results indicate that there are areas where the models accurately describe what it is observed by ISCCP, for some periods of the year (e.g. Northern Africa, for autumn), compared to other areas and periods for which the agreement is lower (Iberian Peninsula in winter and the Black Sea for the summer months). However these differences should be attributed not only to the limitations of climate models, but possibly also to the data provided by ISCCP. References [1] Intergovernmental Panel on Climate Change (2007) Fourth Assessment Report: Climate Change 2007: Working Group I Report: The Physical Science Basis. [2] Ranking the AR4 climate models over the Murray Darling Basin using simulated maximum temperature, minimum temperature and precipitation. Int J Climatol 28
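
    The PDF-based skill score cited above (reference [2]) can be sketched as the overlap of two binned probability distributions: 1 for identical distributions, 0 for disjoint ones. The bin count and the synthetic cloud-fraction samples below are illustrative assumptions.

    import numpy as np

    def skill_score(model_vals, obs_vals, bins=20, value_range=(0.0, 1.0)):
        """Sum over bins of the minimum of the two empirical PDFs."""
        pdf_m, _ = np.histogram(model_vals, bins=bins, range=value_range)
        pdf_o, _ = np.histogram(obs_vals, bins=bins, range=value_range)
        pdf_m = pdf_m / pdf_m.sum()
        pdf_o = pdf_o / pdf_o.sum()
        return float(np.minimum(pdf_m, pdf_o).sum())

    rng = np.random.default_rng(0)
    obs = rng.beta(2.0, 2.0, size=300)    # synthetic observed cloud fractions
    model = rng.beta(2.5, 2.0, size=300)  # synthetic climate-model output
    print(f"bias: {model.mean() - obs.mean():+.3f}")
    print(f"skill score: {skill_score(model, obs):.2f}")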

  10. Improving Quality and Quantity of Contributions: Two Models for Promoting Knowledge Exchange with Shared Databases

    ERIC Educational Resources Information Center

    Cress, U.; Barquero, B.; Schwan, S.; Hesse, F. W.

    2007-01-01

    Shared databases are used for knowledge exchange in groups. Whether a person is willing to contribute knowledge to a shared database presents a social dilemma: Each group member saves time and energy by not contributing any information to the database and by using the database only to retrieve information which was contributed by others. But if…

  12. Direct data-based model predictive control with applications to structures, robotic swarms, and aircraft

    NASA Astrophysics Data System (ADS)

    Barlow, Jonathan S.

    A direct method to design data-based model predictive controllers is presented. The design method uses system identification techniques to identify model predictive controller gains directly from a set of excitation inputs and disturbance-corrupted outputs. The design is direct in that the controller gains can be designed directly from input and disturbance-corrupted output data without an intermediate identification step. The direct design is simpler than previous two-step designs and reduces the computation time for the design of the controller. The direct design also enables an adaptive implementation capable of identifying controller gains online. The direct data-based controllers can be used for vibration suppression, disturbance rejection, and tracking, and are applied to structures, robot swarms and aircraft. For the cases of vibration suppression and disturbance rejection, the data-based controller has the advantage that any disturbances present in the design data are automatically rejected without needing to know the details of the disturbances. For the case of robot swarms, extensions are made for formation control and obstacle avoidance, and the controller can be implemented as a decentralized controller in real time, in parallel on individual vehicles, with communication limited to past input and past output data. A formulation for improving the robustness of the controller to parametric variations is also developed. Finally, the adaptive implementation is shown to be useful for the control of linear time-varying systems and has been successfully implemented to control a linear time-varying model of a Cruise Efficient Short Take-Off and Landing (CESTOL) type aircraft.

  13. Estimating the computational limits of detection of microbial non-model organisms.

    PubMed

    Kuhring, Mathias; Renard, Bernhard Y

    2015-10-01

    Mass spectrometry has become a key instrument for proteomic studies of single bacteria as well as microbial communities. However, the identification of spectra from MS/MS experiments is still challenging, in particular for non-model organisms. Due to the limited number of related protein reference sequences, underexplored organisms often remain completely unidentified, or their spectra match to peptides with an uncertain degree of relation. Alternative strategies such as error-tolerant spectrum searches or proteogenomic approaches may reduce the number of unidentified spectra and lead to peptide matches at more closely related taxonomic levels. However, to what extent these strategies will be successful is difficult to judge prior to an MS/MS experiment. In this contribution, we introduce a method to estimate the suitability of databases of interest. Further, it allows estimating the possible influence of error-tolerant searches and proteogenomic approaches on databases of interest with respect to the number of unidentified spectra and the taxonomic distances of identified spectra. Furthermore, we provide an implementation of our approach that supports experimental design by evaluating the benefit and need of different search strategies with respect to available databases and the organisms under study. We provide several examples which highlight the different effects of additional search strategies on databases and organisms with varying amounts of known related species available.

  14. Object-Oriented Database for Managing Building Modeling Components and Metadata: Preprint

    SciTech Connect

    Long, N.; Fleming, K.; Brackney, L.

    2011-12-01

    Building simulation enables users to explore and evaluate multiple building designs. When tools for optimization, parametrics, and uncertainty analysis are combined with analysis engines, the sheer number of discrete simulation datasets makes it difficult to keep track of the inputs. The integrity of the input data is critical to designers, engineers, and researchers for code compliance, validation, and building commissioning long after the simulations are finished. This paper discusses an application that stores inputs needed for building energy modeling in a searchable, indexable, flexible, and scalable database to help address the problem of managing simulation input data.

  15. Creating standard models for the specialized GIS database and their functions in solving forecast tasks

    NASA Astrophysics Data System (ADS)

    Sharapatov, Abish

    2015-04-01

    Standard models of skarn-magnetite deposits in the folded regions of Kazakhstan are constructed using generalized geological and geophysical parameters of similar existing deposits, such as the Sarybay, Sokolovskoe and other deposits of the Valeryanovskaya structural-facies zone (SFZ) in the Torgay paleorift structure, which are located in the north of the SFZ. The forecast area is located in the south of the SFZ, in the North Aral Sea region. These models are derived from the study of the deep structure of the region using geophysical data. Upper and deep zones were studied by separating the gravity and magnetic fields into regional and local components. Seismic and geoelectric data for the region were used in the interpretation. Thus, the similarity between the northern and southern parts of the SFZ has been identified in geophysical terms, through regional and local geophysical characteristics. Creating standard models of skarn-magnetite deposits for a GIS database allows the forecast criteria for this deposit type to be highlighted. These include: the presence of fault zones; a volcanic strata thickness of about 2 km or more, with a total thickness of circum-ore metasomatic rocks of about 1.5 km or more; the spatial positions and geometry of the ore bodies (steeply dipping bodies within medium gabbroic intrusions and at their contact with carbonate-dolomitic strata); the presence in the geological section of a near-surface zone with an electrical resistivity of 200 ohm·m, corresponding to Devonian and Early Carboniferous volcanic sediments and volcanics associated with subvolcanic bodies and intrusions; a relatively shallow depth of the zone with Vp = 6.4-6.8 km/s (an uplifted Conrad boundary and thickening of the granulite-basic layer); and positive, high-amplitude magnetic and gravity anomalies. A geological forecast model is constructed by structuring geodata based on detailed analysis and aggregation of geological and formal knowledge bases on standard targets. Aggregation method of

  16. A Bayesian Multivariate Receptor Model for Estimating Source Contributions to Particulate Matter Pollution using National Databases

    PubMed Central

    Hackstadt, Amber J.; Peng, Roger D.

    2014-01-01

    Summary: Time series studies have suggested that air pollution can negatively impact health. These studies have typically focused on the total mass of fine particulate matter air pollution or the individual chemical constituents that contribute to it, and not on source-specific contributions to air pollution. Source-specific contribution estimates are useful from a regulatory standpoint, since they allow regulators to focus limited resources on reducing emissions from sources that are major contributors to air pollution, and are also desired when estimating source-specific health effects. However, researchers often lack direct observations of the emissions at the source level. We propose a Bayesian multivariate receptor model to infer information about source contributions from ambient air pollution measurements. The proposed model incorporates information from national databases containing data on both the composition of source emissions and the amount of emissions from known sources of air pollution. The proposed model is used to perform source apportionment analyses for two distinct locations in the United States (Boston, Massachusetts and Phoenix, Arizona). Our results mirror previous source apportionment analyses that did not utilize the information from national databases and provide additional information about uncertainty that is relevant to the estimation of health effects. PMID:25309119
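
    As a much-simplified sketch of the receptor-model idea (the paper's Bayesian machinery is replaced here by a plain non-negative least-squares solve), ambient constituent concentrations x are modelled as a non-negative mixture of known source profiles F, i.e. x = F g with g >= 0. The profiles and measurements are invented numbers.

    import numpy as np
    from scipy.optimize import nnls

    # Columns: source profiles (constituent fraction per unit contribution)
    F = np.array([
        # traffic  coal   soil
        [0.30, 0.05, 0.02],   # elemental carbon
        [0.40, 0.10, 0.05],   # organic carbon
        [0.02, 0.60, 0.03],   # sulfate
        [0.05, 0.05, 0.70],   # crustal elements
    ])
    x = np.array([3.1, 4.6, 5.8, 2.9])  # ambient concentrations (ug/m3)

    g, residual = nnls(F, x)  # non-negative source contributions
    for name, contrib in zip(["traffic", "coal", "soil"], g):
        print(f"{name}: {contrib:.2f} ug/m3")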

  17. Ad Hoc Model Generation Using Multiscale LIDAR Data from a Geospatial Database

    NASA Astrophysics Data System (ADS)

    Gordon, M.; Borgmann, B.; Gehrung, J.; Hebel, M.; Arens, M.

    2015-08-01

    Due to the spread of economically priced laser scanning technology nowadays, especially in the field of topographic surveying and mapping, ever-growing amounts of data need to be handled. Depending on the requirements of the specific application, airborne, mobile or terrestrial laser scanners are commonly used. Since visualizing this flood of data is not feasible with classical approaches like raw point cloud rendering, real-time decision making requires sophisticated solutions. In addition, the efficient storage and recovery of 3D measurements is a challenging task. We therefore propose an approach for the intelligent storage of 3D point clouds using a spatial database. For a given region of interest, the database is queried for the available data. All resulting point clouds are fused in a model generation process, utilizing the fact that low-density airborne measurements can supplement higher-density mobile or terrestrial laser scans. The octree-based modeling approach divides and subdivides the world into cells of varying size and fits one plane per cell once a specified number of points is present. The resulting model exceeds the completeness and precision of every single data source and enables real-time visualization. This is especially supported by data compression ratios of about 90%.
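
    The per-cell plane-fitting step can be sketched with a PCA-style least-squares fit, where the plane normal is the direction of least variance; octree bookkeeping and subdivision are omitted, and the point threshold and synthetic points are assumptions.

    import numpy as np

    def fit_plane(points: np.ndarray):
        """Least-squares plane through an (n, 3) point set via SVD."""
        centroid = points.mean(axis=0)
        _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
        normal = vt[-1]                     # direction of least variance
        dist = (points - centroid) @ normal
        rms = float(np.sqrt(np.mean(dist ** 2)))
        return centroid, normal, rms

    MIN_POINTS = 50  # the "specified amount of points" before modelling
    rng = np.random.default_rng(1)
    cell_points = rng.normal(size=(200, 3)) * np.array([5.0, 5.0, 0.05])
    if len(cell_points) >= MIN_POINTS:
        centroid, normal, rms = fit_plane(cell_points)
        print("normal:", np.round(normal, 3), "rms residual:", round(rms, 4))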

  18. Transforming the Premier Perspective® Hospital Database into the Observational Medical Outcomes Partnership (OMOP) Common Data Model

    PubMed Central

    Makadia, Rupa; Ryan, Patrick B.

    2014-01-01

    Background: The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) has been implemented on various claims and electronic health record (EHR) databases, but has not been applied to a hospital transactional database. This study addresses the implementation of the OMOP CDM on the U.S. Premier hospital database. Methods: We designed and implemented an extract, transform, load (ETL) process to convert the Premier hospital database into the OMOP CDM. Standard charge codes in Premier were mapped between the OMOP version 4.0 Vocabulary and standard charge descriptions. Visit logic was added to impute the visit dates. We tested the conversion by replicating a published study using the raw and transformed databases. The Premier hospital database was compared to a claims database with regard to the prevalence of disease. Findings: Transforming the data into the CDM resulted in 1% of the data being discarded due to errors in the raw data. A total of 91.4% of Premier standard charge codes were mapped successfully to a standard vocabulary. The replication study produced a similar distribution of patient characteristics. The comparison to the claims data yields notable similarities and differences among conditions represented in both databases. Discussion: The transformation of the Premier database into the OMOP CDM version 4.0 adds value in conducting analyses due to the successful mapping of drugs and procedures. The addition of visit logic gives ordinality to drugs and procedures that was not present prior to the transformation. Comparing conditions in Premier against a claims database can provide an understanding of Premier's potential use in pharmacoepidemiology studies that are traditionally conducted via claims databases. Conclusion and Next Steps: The conversion of the Premier database into the OMOP CDM 4.0 was completed successfully. The next steps include refinement of vocabularies and mappings and continual maintenance of

  19. Making designer mutants in model organisms.

    PubMed

    Peng, Ying; Clark, Karl J; Campbell, Jarryd M; Panetta, Magdalena R; Guo, Yi; Ekker, Stephen C

    2014-11-01

    Recent advances in the targeted modification of complex eukaryotic genomes have unlocked a new era of genome engineering. From the pioneering work using zinc-finger nucleases (ZFNs), to the advent of the versatile and specific TALEN systems, and most recently the highly accessible CRISPR/Cas9 systems, we now possess an unprecedented ability to analyze developmental processes using sophisticated designer genetic tools. In this Review, we summarize the common approaches and applications of these still-evolving tools as they are being used in the most popular model developmental systems. Excitingly, these robust and simple genomic engineering tools also promise to revolutionize developmental studies using less well established experimental organisms.

  20. A parallel model for SQL astronomical databases based on solid state storage. Application to the Gaia Archive PostgreSQL database

    NASA Astrophysics Data System (ADS)

    González-Núñez, J.; Gutiérrez-Sánchez, R.; Salgado, J.; Segovia, J. C.; Merín, B.; Aguado-Agelet, F.

    2017-07-01

    Query planning and optimisation algorithms in most popular relational databases were developed at a time when hard disk drives were the only storage technology available. The advent of devices with higher parallel random access capacity, such as solid state disks, opens the way for intra-machine parallel computing over large datasets. We describe a two-phase parallel model for the implementation of heavy analytical processes in single-instance PostgreSQL astronomical databases. This model is then specialised to two frequent astronomical tasks: density map computation and crossmatch computation with Quad Tree Cube (Q3C) indexes. Both are implemented as part of the relational database infrastructure for the Gaia Archive, and their performance is assessed. An improvement by a factor of 28.40 over sequential execution is observed in the reference implementation for a histogram computation. Speedup ratios of 3.7 and 4.0 are attained for the reference positional crossmatches considered. We observe large performance enhancements over sequential execution for both CPU-intensive and disk-access-intensive computations, suggesting these methods might be useful with the growing data volumes in Astronomy.
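
    A minimal sketch of the two-phase pattern, assuming Python workers against a PostgreSQL instance via psycopg2: phase one computes partial aggregates over disjoint slices of the table in parallel, phase two merges them. The connection string and the modulo-based slicing are illustrative assumptions, not the Gaia Archive implementation.

      from multiprocessing import Pool
      import psycopg2

      DSN = "dbname=gaia"          # hypothetical connection string
      N_WORKERS = 8

      def partial_histogram(worker_id):
          # Phase 1: each worker aggregates a disjoint slice of the table,
          # sliced here on a modulo of the primary key.
          conn = psycopg2.connect(DSN)
          cur = conn.cursor()
          cur.execute(
              "SELECT floor(phot_g_mean_mag)::int AS bin, count(*) "
              "FROM gaia_source WHERE source_id %% %s = %s GROUP BY bin",
              (N_WORKERS, worker_id))
          rows = cur.fetchall()
          conn.close()
          return dict(rows)

      if __name__ == "__main__":
          with Pool(N_WORKERS) as pool:
              partials = pool.map(partial_histogram, range(N_WORKERS))
          # Phase 2: merge the partial aggregates into the final histogram.
          hist = {}
          for part in partials:
              for b, n in part.items():
                  hist[b] = hist.get(b, 0) + n
          print(sorted(hist.items())[:5])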

  1. Clinical risk assessment of organ manifestations in systemic sclerosis: a report from the EULAR Scleroderma Trials And Research group database

    PubMed Central

    Walker, U A; Tyndall, A; Czirják, L; Denton, C; Farge‐Bancel, D; Kowal‐Bielecka, O; Müller‐Ladner, U; Bocelli‐Tyndall, C; Matucci‐Cerinic, M

    2007-01-01

    Background Systemic sclerosis (SSc) is a multisystem autoimmune disease, which is classified into a diffuse cutaneous (dcSSc) and a limited cutaneous (lcSSc) subset according to the skin involvement. In order to better understand the vascular, immunological and fibrotic processes of SSc and to guide its treatment, the EULAR Scleroderma Trials And Research (EUSTAR) group was formed in June 2004. Aims and methods EUSTAR prospectively collects the Minimal Essential Data Set (MEDS) on all sequential patients fulfilling the American College of Rheumatology diagnostic criteria in participating centres. We aimed to characterise the demographic, clinical and laboratory characteristics of disease presentation in SSc and analysed EUSTAR baseline visits. Results In April 2006, a total of 3656 patients (1349 with dcSSc and 2101 with lcSSc) were enrolled in 102 centres and 30 countries. 1330 individuals had anti-Scl70 autoantibodies and 1106 had anticentromere antibodies. 87% of patients were women. On multivariate analysis, scleroderma subsets (dcSSc vs lcSSc), antibody status and age at onset of Raynaud's phenomenon, but not gender, were found to be independently associated with the prevalence of organ manifestations. Autoantibody status in this analysis was more closely associated with clinical manifestations than were SSc subsets. Conclusion dcSSc and lcSSc subsets are associated with particular organ manifestations, but in this analysis the clinical distinction seemed to be superseded by an antibody-based classification in predicting some scleroderma complications. The EUSTAR MEDS database facilitates the analysis of clinical patterns in SSc, and contributes to the standardised assessment and monitoring of SSc internationally. PMID:17234652

  2. Standard versus bicaval techniques for orthotopic heart transplantation: an analysis of the United Network for Organ Sharing database.

    PubMed

    Davies, Ryan R; Russo, Mark J; Morgan, Jeffrey A; Sorabella, Robert A; Naka, Yoshifumi; Chen, Jonathan M

    2010-09-01

    Most studies of anastomotic technique have been underpowered to detect subtle differences in survival. We analyzed the United Network for Organ Sharing database for trends in use and outcomes after either bicaval or traditional (biatrial) anastomoses for heart implantation. Review of United Network for Organ Sharing data identified 20,999 recipients of heart transplants from 1997 to 2007. Patients were stratified based on the technique of atrial anastomosis: standard biatrial (atrial group, n = 11,919, 59.3%), bicaval (caval group, n = 7661, 38.1%), or total orthotopic (total group, n = 519, 2.6%). The use of the bicaval anastomosis is increasing, but many transplantations continue to use a biatrial anastomosis (1997, 0.2% vs 97.6%; 2007, 62.0% vs 34.7%; P < .0001). Atrial group patients required permanent pacemaker implantation more often (odds ratio, 2.6; 95% confidence interval, 2.2-3.1). Caval group patients had a significant advantage in 30-day mortality (odds ratio, 0.83; 95% confidence interval, 0.75-0.93), and Cox regression analysis confirmed the decreased long-term survival in the atrial group (hazard ratio, 1.11; 95% confidence interval, 1.04-1.19). Heart transplantations performed with bicaval anastomoses require postoperative permanent pacemaker implantation at lower frequency and have a small but significant survival advantage compared with biatrial anastomoses. We recommend that except where technical considerations require a biatrial technique, bicaval anastomoses should be performed for heart transplantation. Copyright © 2010 The American Association for Thoracic Surgery. Published by Mosby, Inc. All rights reserved.

  3. Evaluation of a vortex-based subgrid stress model using DNS databases

    NASA Technical Reports Server (NTRS)

    Misra, Ashish; Lund, Thomas S.

    1996-01-01

    The performance of a SubGrid Stress (SGS) model for Large-Eddy Simulation (LES) developed by Misra & Pullin (1996) is studied for forced and decaying isotropic turbulence on a 32³ grid. The physical viability of the model assumptions is tested using DNS databases. The results from LES of forced turbulence at Taylor Reynolds number Rλ ≈ 90 are compared with filtered DNS fields. Probability density functions (pdfs) of the subgrid energy transfer, total dissipation, and the stretch of the subgrid vorticity by the resolved velocity-gradient tensor show reasonable agreement with the DNS data. The model is also tested in LES of decaying isotropic turbulence, where it correctly predicts the decay rate and energy spectra measured by Comte-Bellot & Corrsin (1971).

  4. Subject and authorship of records related to the Organization for Tropical Studies (OTS) in BINABITROP, a comprehensive database about Costa Rican biology.

    PubMed

    Monge-Nájera, Julián; Nielsen-Muñoz, Vanessa; Azofeifa-Mora, Ana Beatriz

    2013-06-01

    BINABITROP is a bibliographical database of more than 38,000 records about the ecosystems and organisms of Costa Rica. In contrast with commercial databases such as Web of Knowledge and Scopus, which exclude most of the scientific journals published in tropical countries, BINABITROP is a comprehensive record of knowledge on the tropical ecosystems and organisms of Costa Rica. We analyzed its contents for three sites (La Selva, Palo Verde and Las Cruces) and recorded scientific field, taxonomic group and authorship. We found that most records dealt with ecology and systematics, and that most authors published only one article in the study period (1963-2011). Most research was published in four journals: Biotropica, Revista de Biología Tropical/International Journal of Tropical Biology and Conservation, Zootaxa and Brenesia. This may be the first such study of a comprehensive database of tropical biology literature.

  5. Modeling disordered morphologies in organic semiconductors.

    PubMed

    Neumann, Tobias; Danilov, Denis; Lennartz, Christian; Wenzel, Wolfgang

    2013-12-05

    Organic thin film devices are investigated for many diverse applications, including light-emitting diodes, organic photovoltaics and organic field-effect transistors. Modeling their properties on the basis of their detailed molecular structure requires the generation of representative morphologies, many of which are amorphous. Because the time scales for the formation of the molecular structure are long, we have developed a linear-scaling single-molecule deposition protocol which generates morphologies by simulating vapor deposition of molecular films. We have applied this protocol to systems comprising argon, buckminsterfullerene, N,N-Di(naphthalene-1-yl)-N,N'-diphenyl-benzidine, mer-tris(8-hydroxy-quinoline)aluminum(III), and phenyl-C61-butyric acid methyl ester, with and without postdeposition relaxation of the individually deposited molecules. The proposed single-molecule deposition protocol leads to the formation of highly ordered morphologies in the argon and buckminsterfullerene systems when postdeposition relaxation is used to locally anneal the configuration in the vicinity of the newly deposited molecule. The other systems formed disordered amorphous morphologies, and the postdeposition local relaxation step has only a small effect on the characteristics of the disordered morphology, in contrast to the crystal-forming materials.

  6. Prediction model of potential hepatocarcinogenicity of rat hepatocarcinogens using a large-scale toxicogenomics database

    SciTech Connect

    Uehara, Takeki; Minowa, Yohsuke; Morikawa, Yuji; Kondo, Chiaki; Maruyama, Toshiyuki; Kato, Ikuo; Nakatsu, Noriyuki; Igarashi, Yoshinobu; Ono, Atsushi; Hayashi, Hitomi; Mitsumori, Kunitoshi; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

    2011-09-15

    The present study was performed to develop a robust gene-based prediction model for early assessment of the potential hepatocarcinogenicity of chemicals in rats by using our toxicogenomics database, TG-GATEs (Genomics-Assisted Toxicity Evaluation System, developed by the Toxicogenomics Project in Japan). The positive training set consisted of high- or middle-dose groups that received 6 different non-genotoxic hepatocarcinogens during a 28-day period. The negative training set consisted of high- or middle-dose groups of 54 non-carcinogens. A support vector machine combined with wrapper-type gene selection algorithms was used for modeling. Consequently, our best classifier yielded prediction accuracies for hepatocarcinogenicity of 99% sensitivity and 97% specificity in the training data set, and false positive prediction was almost completely eliminated. Pathway analysis of feature genes revealed that the mitogen-activated protein kinase p38- and phosphatidylinositol-3-kinase-centered interactome and the v-myc myelocytomatosis viral oncogene homolog-centered interactome were the 2 most significant networks. The usefulness and robustness of our predictor were further confirmed in an independent validation data set obtained from the public database. Interestingly, similar positive predictions were obtained for several genotoxic hepatocarcinogens as well as non-genotoxic hepatocarcinogens. These results indicate that the expression profiles of our newly selected candidate biomarker genes might be common characteristics of the early stage of carcinogenesis for both genotoxic and non-genotoxic carcinogens in the rat liver. Our toxicogenomic model might be useful for the prospective screening of the hepatocarcinogenicity of compounds and prioritization of compounds for carcinogenicity testing. Highlights: We developed a toxicogenomic model to predict the hepatocarcinogenicity of chemicals; the optimized model, consisting of 9 probes, had 99% sensitivity and 97% specificity.
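
    The following sketch illustrates the general flavor of this approach with scikit-learn, using recursive feature elimination as a stand-in for the wrapper-type gene selection and synthetic data in place of TG-GATEs expression profiles; it is not the authors' pipeline.

      import numpy as np
      from sklearn.svm import SVC
      from sklearn.feature_selection import RFE
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      X = rng.normal(size=(60, 500))      # 60 dose groups x 500 probes (synthetic)
      y = rng.integers(0, 2, size=60)     # 1 = hepatocarcinogen, 0 = non-carcinogen

      # Wrapper-flavored selection: recursively drop the probes the linear SVM
      # weights least, keeping a small candidate biomarker panel.
      svm = SVC(kernel="linear")
      selector = RFE(svm, n_features_to_select=9, step=0.2).fit(X, y)
      panel = np.flatnonzero(selector.support_)

      scores = cross_val_score(svm, X[:, panel], y, cv=5)
      print(len(panel), "probes selected; CV accuracy %.2f" % scores.mean())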

  7. Modeling the intracellular organization of calcium signaling.

    PubMed

    Dupont, Geneviève

    2014-01-01

    Calcium (Ca²⁺) is a key signaling ion that plays a fundamental role in many cellular processes in most types of tissues and organisms. The versatility of this signaling pathway is remarkable. Depending on the cell type and the stimulus, intracellular Ca²⁺ increases can last over different periods, as short spikes or more sustained signals. From a spatial point of view, they can be localized or invade the whole cell. Such a richness of behaviors is possible thanks to numerous exchange processes with the external medium or internal Ca²⁺ pools, mainly the endoplasmic or sarcoplasmic reticulum and mitochondria. These fluxes are also highly regulated. In order to get an accurate description of the spatiotemporal organization of Ca²⁺ signaling, it is useful to resort to modeling. Thus, each flux can be described by an appropriate kinetic expression. Ca²⁺ dynamics in a given cell type can then be simulated by a modular approach, consisting of the assembly of computational descriptions of the appropriate fluxes and regulations. Modeling can also be used to gain insight into the mechanisms by which the Ca²⁺ signals responsible for cellular responses are decoded. Cells can use frequency or amplitude coding, as well as take advantage of Ca²⁺ oscillations to increase their sensitivity to small average Ca²⁺ increases. © 2014 Wiley Periodicals, Inc.
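
    A minimal sketch of this modular approach in Python: each flux below is one kinetic term (parameter values are illustrative, not taken from any specific cell type), and the right-hand side simply assembles the modules into a two-pool cytosol/ER model integrated with SciPy.

      from scipy.integrate import solve_ivp

      # Each flux is one modular kinetic expression (illustrative parameters).
      def serca(c, v=1.0, k=0.3):              # pumping into the ER, Hill kinetics
          return v * c**2 / (k**2 + c**2)

      def release(c, c_er, k_f=0.8, k=0.5):    # CICR-like release from the ER
          return k_f * c**2 / (k**2 + c**2) * (c_er - c)

      def leak(c, c_er, k_l=0.05):             # passive ER leak
          return k_l * (c_er - c)

      def rhs(t, y):
          # Assemble the modules; what leaves the ER enters the cytosol.
          c, c_er = y
          j_net = release(c, c_er) + leak(c, c_er) - serca(c)
          return [j_net, -j_net]

      sol = solve_ivp(rhs, (0, 100), [0.1, 10.0], max_step=0.1)
      # sol.y[0] holds the simulated cytosolic calcium trace.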

  8. Organic acid modeling and model validation: Workshop summary. Final report

    SciTech Connect

    Sullivan, T.J.; Eilers, J.M.

    1992-08-14

    A workshop was held in Corvallis, Oregon on April 9-10, 1992 at the offices of E&S Environmental Chemistry, Inc. The purpose of this workshop was to initiate research efforts on the project entitled "Incorporation of an organic acid representation into MAGIC (Model of Acidification of Groundwater in Catchments) and testing of the revised model using independent data sources." The workshop was attended by a team of internationally recognized experts in the fields of surface water acid-base chemistry, organic acids, and watershed modeling. The rationale for the proposed research is based on the recent comparison between MAGIC model hindcasts and paleolimnological inferences of historical acidification for a set of 33 statistically selected Adirondack lakes. Agreement between diatom-inferred and MAGIC-hindcast lakewater chemistry in the earlier research had been less than satisfactory. Based on preliminary analyses, it was concluded that incorporation of a reasonable organic acid representation into the version of MAGIC used for hindcasting was the logical next step toward improving model agreement.

  10. An open source web interface for linking models to infrastructure system databases

    NASA Astrophysics Data System (ADS)

    Knox, S.; Mohamed, K.; Harou, J. J.; Rheinheimer, D. E.; Medellin-Azuara, J.; Meier, P.; Tilmant, A.; Rosenberg, D. E.

    2016-12-01

    Models of networked engineered resource systems such as water or energy systems are often built collaboratively, with developers from different domains working at different locations. These models can be linked to large-scale real-world databases, and they are constantly being improved and extended. As the development and application of these models becomes more sophisticated, and the computing power required for simulations and/or optimisations increases, so has the need for online services and tools which enable the efficient development and deployment of these models. Hydra Platform is an open-source, web-based data management system that allows modellers of network-based models to remotely store network topology and associated data in a generalised manner, enabling it to serve multiple disciplines. Hydra Platform exposes a JSON web API that allows external programs (referred to as 'Apps') to interact with its stored networks and perform actions such as importing data, running models, or exporting the networks to different formats. Hydra Platform supports multiple users accessing the same network and has a suite of functions for managing users and data. We present an ongoing development in Hydra Platform, the Hydra Web User Interface, through which users can collaboratively manage network data and models in a web browser. The web interface allows multiple users to graphically access, edit and share their networks, run apps and view results. Through apps, which are located on the server, the web interface can give users access to external data sources and models without the need to install or configure any software. This also ensures model results can be reproduced, by removing platform or version dependence. Managing data and deploying models via the web interface provides a way for multiple modellers to collaboratively manage data, deploy and monitor model runs, and analyse results.
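
    As a sketch of how an external App might talk to such a JSON web API from Python (the endpoint, wire format and function names below are hypothetical, not Hydra Platform's actual specification):

      import json
      import requests

      BASE = "http://localhost:8080/json"      # hypothetical server location

      def call(session, func, **kwargs):
          # One JSON-RPC-style request whose single key names the function;
          # the wire format here is illustrative, not Hydra Platform's spec.
          resp = session.post(BASE, data=json.dumps({func: kwargs}))
          resp.raise_for_status()
          return resp.json()

      session = requests.Session()
      call(session, "login", username="demo", password="demo")
      net = call(session, "get_network", network_id=42)
      print(len(net.get("nodes", [])), "nodes retrieved")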

  11. ModBase, a database of annotated comparative protein structure models and associated resources

    PubMed Central

    Pieper, Ursula; Webb, Benjamin M.; Dong, Guang Qiang; Schneidman-Duhovny, Dina; Fan, Hao; Kim, Seung Joong; Khuri, Natalia; Spill, Yannick G.; Weinkam, Patrick; Hammel, Michal; Tainer, John A.; Nilges, Michael; Sali, Andrej

    2014-01-01

    ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains almost 30 million reliable models for domains in 4.7 million unique protein sequences. ModBase allows users to compute or update comparative models on demand, through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the AllosMod server for modeling ligand-induced protein dynamics (http://salilab.org/allosmod), the AllosMod-FoXS server for predicting a structural ensemble that fits an SAXS profile (http://salilab.org/allosmod-foxs), the FoXSDock server for protein–protein docking filtered by an SAXS profile (http://salilab.org/foxsdock), the SAXS Merge server for automatic merging of SAXS profiles (http://salilab.org/saxsmerge) and the Pose & Rank server for scoring protein–ligand complexes (http://salilab.org/poseandrank). In this update, we also highlight two applications of ModBase: a PSI:Biology initiative to maximize the structural coverage of the human alpha-helical transmembrane proteome and a determination of structural determinants of human immunodeficiency virus-1 protease specificity. PMID:24271400

  12. A database and model to support proactive management of sediment-related sewer blockages.

    PubMed

    Rodríguez, Juan Pablo; McIntyre, Neil; Díaz-Granados, Mario; Maksimović, Cedo

    2012-10-01

    Due to increasing customer and political pressures, and more stringent environmental regulations, sediment and other blockage issues are now a high priority when assessing sewer system operational performance. Blockages caused by sediment deposits reduce sewer system reliability and demand remedial action at considerable operational cost. Consequently, procedures are required for identifying which parts of the sewer system are in most need of proactive removal of sediments. This paper presents an exceptionally long (7.5 years) and spatially detailed (9658 grid squares--0.03 km² each--covering a population of nearly 7.5 million) data set obtained from a customer complaints database in Bogotá (Colombia). The sediment-related blockage data are modelled using homogeneous and non-homogeneous Poisson process models. In most of the analysed areas the inter-arrival time between blockages can be represented by the homogeneous process, but there are a considerable number of areas (up to 34%) for which there is strong evidence of non-stationarity. In most of these cases, the mean blockage rate increases over time, signifying a continual deterioration of the system despite repairs, this being particularly marked for pipe and gully pot related blockages. The physical properties of the system (mean pipe slope, diameter and pipe length) have a clear but weak influence on observed blockage rates. The Bogotá case study illustrates the potential value of customer complaints databases and formal analysis frameworks for proactive sewerage maintenance scheduling in large cities.
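
    As a small illustration of the modeling choice, assuming blockage report times for one grid square (synthetic data below): the homogeneous Poisson rate is just the event count over the observation window, and a Laplace trend test flags the non-stationary squares where the rate drifts over time.

      import numpy as np
      from scipy.stats import norm

      def laplace_trend_test(event_times, T):
          # Under a homogeneous Poisson process on [0, T] this statistic is
          # ~N(0,1); large positive values indicate an increasing rate.
          n = len(event_times)
          u = (np.mean(event_times) - T / 2) / (T * np.sqrt(1 / (12 * n)))
          return u, 2 * norm.sf(abs(u))

      # Synthetic blockage report times (days) for one grid square, skewed
      # toward the end of the record to mimic a deteriorating system.
      rng = np.random.default_rng(1)
      T = 7.5 * 365
      times = np.sort(T * rng.random(40) ** 0.7)

      rate = len(times) / T                    # homogeneous MLE, blockages/day
      u, p = laplace_trend_test(times, T)
      print("rate %.4f per day, Laplace u=%.2f, p=%.4f" % (rate, u, p))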

  13. Web application and database modeling of traffic impact analysis using Google Maps

    NASA Astrophysics Data System (ADS)

    Yulianto, Budi; Setiono

    2017-06-01

    Traffic impact analysis (TIA) is a traffic study that aims at identifying the impact of traffic generated by development or change in land use. In addition to identifying the traffic impact, TIA also includes mitigation measures to minimize that impact. TIA has become increasingly important since it was defined in legislation as one of the requirements for a building permit application. This requirement has encouraged a number of TIA studies in various cities in Indonesia, including Surakarta. For that reason, it is necessary to study the development of TIA by adopting the concept of Transportation Impact Control (TIC) in the implementation of the TIA standard document and multimodal modeling. This includes TIA standardization for technical guidelines, a database, and inspection, by providing TIA checklists, monitoring and evaluation. The research was undertaken by collecting historical data on junctions, modeling the data as a relational database, and building a web user interface for CRUD operations (Create, Read, Update and Delete) on the TIA data using Google Maps libraries. The result of the research is a system that provides information supporting the improvement and repair of today's TIA documents, making them more transparent, reliable and credible.
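
    A minimal sketch of the CRUD layer described, assuming Flask and SQLite with an invented junction schema; the real system's stack and fields may differ.

      import sqlite3
      from flask import Flask, jsonify, request

      app = Flask(__name__)
      DB = "tia.db"                            # hypothetical junction database

      def db():
          con = sqlite3.connect(DB)
          con.execute("CREATE TABLE IF NOT EXISTS junction("
                      "id INTEGER PRIMARY KEY, name TEXT, lat REAL, lng REAL)")
          return con

      @app.route("/junctions", methods=["GET", "POST"])
      def junctions():
          con = db()
          if request.method == "POST":         # Create
              j = request.get_json()
              con.execute("INSERT INTO junction(name, lat, lng) VALUES (?,?,?)",
                          (j["name"], j["lat"], j["lng"]))
              con.commit()
          # Read: the lat/lng pairs are what a front end would hand to the
          # Google Maps JavaScript API as marker positions.
          return jsonify(con.execute(
              "SELECT id, name, lat, lng FROM junction").fetchall())

      if __name__ == "__main__":
          app.run(debug=True)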

  14. ADANS database specification

    SciTech Connect

    1997-01-16

    The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.

  15. Data-based modeling of the geomagnetosphere with an IMF-dependent magnetopause

    NASA Astrophysics Data System (ADS)

    Tsyganenko, N. A.

    2014-01-01

    The paper presents first results of the data-based modeling of the geomagnetospheric magnetic field, using the data of Polar, Geotail, Cluster, and Time History of Events and Macroscale Interactions during Substorms satellites, taken during the period 1995-2012 and covering 123 storm events with SYM-H ≥ -200 nT. The most important innovations in the model are (1) taking into account the interplanetary magnetic field (IMF)-dependent shape of the model magnetopause, (2) a physically more consistent global deformation of the equatorial current sheet due to the geodipole tilt, (3) symmetric and partial components of the ring current are calculated based on a realistic background magnetic field, instead of a purely dipolar field, used in earlier models, and (4) the validity region on the nightside is extended to ˜40-50 RE. The model field is confined within a magnetopause, based on Lin et al. (2010) empirical model, driven by the dipole tilt angle, solar wind pressure, and IMF Bz. A noteworthy finding is a significant dependence of the magnetotail flux connection across the equatorial plane on the model magnetopause flaring rate, controlled by the southward component of the IMF.

  16. A Conceptual Model and Database to Integrate Data and Project Management

    NASA Astrophysics Data System (ADS)

    Guarinello, M. L.; Edsall, R.; Helbling, J.; Evaldt, E.; Glenn, N. F.; Delparte, D.; Sheneman, L.; Schumaker, R.

    2015-12-01

    Data management is critically foundational to doing effective science in our data-intensive research era and done well can enhance collaboration, increase the value of research data, and support requirements by funding agencies to make scientific data and other research products available through publically accessible online repositories. However, there are few examples (but see the Long-term Ecological Research Network Data Portal) of these data being provided in such a manner that allows exploration within the context of the research process - what specific research questions do these data seek to answer? what data were used to answer these questions? what data would have been helpful to answer these questions but were not available? We propose an agile conceptual model and database design, as well as example results, that integrate data management with project management not only to maximize the value of research data products but to enhance collaboration during the project and the process of project management itself. In our project, which we call 'Data Map,' we used agile principles by adopting a user-focused approach and by designing our database to be simple, responsive, and expandable. We initially designed Data Map for the Idaho EPSCoR project "Managing Idaho's Landscapes for Ecosystem Services (MILES)" (see https://www.idahoecosystems.org//) and will present example results for this work. We consulted with our primary users- project managers, data managers, and researchers to design the Data Map. Results will be useful to project managers and to funding agencies reviewing progress because they will readily provide answers to the questions "For which research projects/questions are data available and/or being generated by MILES researchers?" and "Which research projects/questions are associated with each of the 3 primary questions from the MILES proposal?" To be responsive to the needs of the project, we chose to streamline our design for the prototype

  17. Reflective Database Access Control

    ERIC Educational Resources Information Center

    Olson, Lars E.

    2009-01-01

    "Reflective Database Access Control" (RDBAC) is a model in which a database privilege is expressed as a database query itself, rather than as a static privilege contained in an access control list. RDBAC aids the management of database access controls by improving the expressiveness of policies. However, such policies introduce new interactions…

  19. Studying Oogenesis in a Non-model Organism Using Transcriptomics: Assembling, Annotating, and Analyzing Your Data.

    PubMed

    Carter, Jean-Michel; Gibbs, Melanie; Breuker, Casper J

    2016-01-01

    This chapter provides a guide to processing and analyzing RNA-Seq data in a non-model organism. This approach was implemented for studying oogenesis in the Speckled Wood Butterfly Pararge aegeria. We focus in particular on how to perform a more informative primary annotation of your non-model organism by implementing our multi-BLAST annotation strategy. We also provide a general guide to other essential steps in the next-generation sequencing analysis workflow. Before undertaking these methods, we recommend you familiarize yourself with command line usage and fundamental concepts of database handling. Most of the operations in the primary annotation pipeline can be performed in Galaxy (or equivalent standalone versions of the tools) and through the use of common database operations (e.g. to remove duplicates) but other equivalent programs and/or custom scripts can be implemented for further automation.
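
    As a rough sketch of a multi-database annotation loop of this kind, assuming NCBI BLAST+ is installed and using placeholder database names; the published multi-BLAST strategy and its parameters may differ.

      import subprocess

      # Placeholder database names; each is a BLAST-formatted protein set.
      DATABASES = ["dmel_proteins", "bmori_proteins", "swissprot"]

      def multi_blast(transcripts_fasta):
          # Run blastx against several databases and keep the top hit per
          # database, so each transcript carries one annotation per reference.
          hits = {}
          for db in DATABASES:
              out = subprocess.run(
                  ["blastx", "-query", transcripts_fasta, "-db", db,
                   "-evalue", "1e-5", "-max_target_seqs", "1", "-outfmt", "6"],
                  capture_output=True, text=True, check=True).stdout
              for line in out.splitlines():
                  query, subject = line.split("\t")[:2]
                  hits.setdefault(query, {}).setdefault(db, subject)
          return hits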

  20. The Society of Thoracic Surgeons Congenital Heart Surgery Database Mortality Risk Model: Part 2-Clinical Application.

    PubMed

    Jacobs, Jeffrey P; O'Brien, Sean M; Pasquali, Sara K; Gaynor, J William; Mayer, John E; Karamlou, Tara; Welke, Karl F; Filardo, Giovanni; Han, Jane M; Kim, Sunghee; Quintessenza, James A; Pizarro, Christian; Tchervenkov, Christo I; Lacour-Gayet, Francois; Mavroudis, Constantine; Backer, Carl L; Austin, Erle H; Fraser, Charles D; Tweddell, James S; Jonas, Richard A; Edwards, Fred H; Grover, Frederick L; Prager, Richard L; Shahian, David M; Jacobs, Marshall L

    2015-09-01

    The empirically derived 2014 Society of Thoracic Surgeons Congenital Heart Surgery Database Mortality Risk Model incorporates adjustment for procedure type and patient-specific factors. The purpose of this report is to describe this model and its application in the assessment of variation in outcomes across centers. All index cardiac operations in The Society of Thoracic Surgeons Congenital Heart Surgery Database (January 1, 2010, to December 31, 2013) were eligible for inclusion. Isolated patent ductus arteriosus closures in patients weighing less than or equal to 2.5 kg were excluded, as were centers with more than 10% missing data and patients with missing data for key variables. The model includes the following covariates: primary procedure, age, any prior cardiovascular operation, any noncardiac abnormality, any chromosomal abnormality or syndrome, important preoperative factors (mechanical circulatory support, shock persisting at time of operation, mechanical ventilation, renal failure requiring dialysis or renal dysfunction (or both), and neurological deficit), any other preoperative factor, prematurity (neonates and infants), and weight (neonates and infants). Variation across centers was assessed. Centers for which the 95% confidence interval for the observed-to-expected mortality ratio does not include unity are identified as lower-performing or higher-performing programs with respect to operative mortality. Included were 52,224 operations from 86 centers. Overall discharge mortality was 3.7% (1,931 of 52,224). Discharge mortality by age category was neonates, 10.1% (1,129 of 11,144); infants, 3.0% (564 of 18,554); children, 0.9% (167 of 18,407); and adults, 1.7% (71 of 4,119). For all patients, 12 of 86 centers (14%) were lower-performing programs, 67 (78%) were not outliers, and 7 (8%) were higher-performing programs. The 2014 Society of Thoracic Surgeons Congenital Heart Surgery Database Mortality Risk Model facilitates description of outcomes
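
    As a simplified numerical illustration of the outlier criterion (not the STS methodology, which relies on the full risk model): treat a center's observed deaths as Poisson, form the observed-to-expected ratio, and flag the center when the 95% CI excludes 1.

      import numpy as np

      def oe_ratio_ci(observed, expected, z=1.96):
          # Observed-to-expected mortality ratio with an approximate 95% CI,
          # treating observed deaths as Poisson.
          oe = observed / expected
          se_log = 1 / np.sqrt(observed)       # SE of log(O) under Poisson
          return oe, oe * np.exp(-z * se_log), oe * np.exp(z * se_log)

      oe, lo, hi = oe_ratio_ci(observed=25, expected=18.2)
      flag = ("lower-performing" if lo > 1 else
              "higher-performing" if hi < 1 else "not an outlier")
      print("O/E = %.2f (95%% CI %.2f-%.2f): %s" % (oe, lo, hi, flag))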

  1. High resolution topography and land cover databases for wind resource assessment using mesoscale models

    NASA Astrophysics Data System (ADS)

    Barranger, Nicolas; Stathopoulos, Christos; Kallos, Georges

    2013-04-01

    In wind resource assessment, mesoscale models can provide wind flow characteristics without the use of mast measurements. In complex terrain, the assimilation of local orography and land cover data is essential to accurately simulate the wind flow pattern within the atmospheric boundary layer. State-of-the-art mesoscale models such as RAMS usually provide orography and land use data with a resolution of 30 arc seconds (about 1 km). This resolution is sufficient for resolving mesoscale phenomena accurately, but not when the aim is to quantitatively estimate the characteristics of wind flow passing over sharp hills or ridges. Furthermore, abrupt changes in land cover are not always captured by a low-resolution land use database. Where land cover changes dramatically, parameters such as roughness, albedo or soil moisture can strongly influence the meteorological characteristics of the atmospheric boundary layer, and therefore need to be assimilated accurately into the model. In recent years, high-resolution databases derived from satellite imagery (MODIS, SRTM, Landsat, SPOT) have become available online. Once converted to the input formats RAMS requires, their effect on the model needs to be evaluated. For this purpose, three new high-resolution land cover databases and two topographical databases are implemented and tested in RAMS. The analysis of terrain variability is performed using basis functions of spatial frequency and amplitude. In practice, one- and two-dimensional Fast Fourier Transforms are applied to the terrain height to reveal the main characteristics of the local orography according to the resulting wave spectrum. In this way, different topographic data sets are compared on the basis of the terrain power spectrum entailed in the terrain height input. Furthermore, this analysis is a powerful tool for determining the proper horizontal grid resolution required to resolve most of the energy-containing spectrum
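
    A short sketch of the spectral analysis step in Python/NumPy, assuming the terrain height comes as a regular grid; the bin count and grid spacing are illustrative.

      import numpy as np

      def terrain_spectrum(z, dx, nbins=40):
          # Radially averaged 2-D power spectrum of a terrain-height grid z
          # sampled at spacing dx: shows which wavelengths dominate the relief.
          ny, nx = z.shape
          power = np.abs(np.fft.fftshift(np.fft.fft2(z - z.mean()))) ** 2
          ky, kx = np.meshgrid(np.fft.fftshift(np.fft.fftfreq(ny, dx)),
                               np.fft.fftshift(np.fft.fftfreq(nx, dx)),
                               indexing="ij")
          k = np.hypot(kx, ky).ravel()
          edges = np.linspace(0, k.max(), nbins + 1)
          idx = np.clip(np.digitize(k, edges) - 1, 0, nbins - 1)
          mean_power = (np.bincount(idx, weights=power.ravel(), minlength=nbins)
                        / np.maximum(np.bincount(idx, minlength=nbins), 1))
          return 0.5 * (edges[1:] + edges[:-1]), mean_power

      # e.g. wavenumbers, p = terrain_spectrum(height_grid, dx=90.0)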

  2. MODEL-BASED HYDROACOUSTIC BLOCKAGE ASSESSMENT AND DEVELOPMENT OF AN EXPLOSIVE SOURCE DATABASE

    SciTech Connect

    Matzel, E; Ramirez, A; Harben, P

    2005-07-11

    We are continuing the development of the Hydroacoustic Blockage Assessment Tool (HABAT), which is designed for use by analysts to predict which hydroacoustic monitoring stations can be used in discrimination analysis for any particular event. The research involves two approaches: (1) model-based assessment of blockage, and (2) ground-truth data-based assessment of blockage. The tool presents the analyst with a map of the world and plots raypath blockages from stations to sources. The analyst inputs source locations and blockage criteria, and the tool returns a list of blockage statuses from all source locations to all hydroacoustic stations. We are currently using the tool in an assessment of blockage criteria for simple direct-path arrivals. Hydroacoustic data, predominantly from earthquake sources, are read in and assessed for blockage at all available stations. Several measures are taken. First, can the event be observed at a station above background noise? Second, can we establish a backazimuth from the station to the source? Third, how large is the decibel drop at one station relative to other stations? These observational results are then compared with model estimates to identify the best set of blockage criteria and used to create a set of blockage maps for each station. The model-based estimates are currently limited by the coarse bathymetry of existing databases and by the limitations inherent in the raytrace method. In collaboration with BBN Inc., the Hydroacoustic Coverage Assessment Model (HydroCAM), which generates the blockage files that serve as input to HABAT, is being extended to include high-resolution bathymetry databases in key areas that increase the reliability of model-based blockage assessment. An important aspect of this capability is to eventually include reflected T-phases where they reliably occur and to identify the associated reflectors. To assess how well any given hydroacoustic discriminant works in separating earthquake and in-water explosion

  3. Outcomes of third heart transplants in pediatric and young adult patients: analysis of the United Network for Organ Sharing database.

    PubMed

    Friedland-Little, Joshua M; Gajarski, Robert J; Yu, Sunkyung; Donohue, Janet E; Zamberlan, Mary C; Schumacher, Kurt R

    2014-09-01

    Repeat heart transplantation (re-HTx) is standard practice in many pediatric centers. There are limited data available on outcomes of third HTx after failure of a second graft. We sought to compare outcomes of third HTx in pediatric and young adult patients with outcomes of second HTx in comparable recipients. All recipients of a third HTx in whom the primary HTx occurred before 21 years of age were identified in the United Network for Organ Sharing database (1985 to 2011) and matched 1:3 with a control group of second HTx patients by age, era and re-HTx indication. Outcomes including survival, rejection and cardiac allograft vasculopathy (CAV) were compared between groups. There was no difference between third HTx patients (n = 27) and control second HTx patients (n = 79) with respect to survival (76% vs 80% at 1 year, 62% vs 58% at 5 years and 53% vs 34% at 10 years, p = 0.75), early (<1 year from HTx) rejection (33.3% vs 44.3%, p = 0.32) or CAV (14.8% vs 30.4%, p = 0.11). Factors associated with non-survival in third HTx patients included mechanical ventilation at listing or HTx, extracorporeal membrane oxygenation support at listing or HTx, and elevated serum bilirubin at HTx. Outcomes among recipients of a third HTx are similar to those with a second HTx in matched patients, with no difference in short- or long-term survival and comparable rates of early rejection and CAV. Although the occurrence of a third HTx remains relatively rare in the USA, consideration of a third HTx appears reasonable in appropriately selected patients. Copyright © 2014 International Society for Heart and Lung Transplantation. Published by Elsevier Inc. All rights reserved.

  4. A neotropical Miocene pollen database employing image-based search and semantic modeling

    PubMed Central

    Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W.; Jaramillo, Carlos; Shyu, Chi-Ren

    2014-01-01

    • Premise of the study: Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Methods: Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Results: Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Discussion: Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery. PMID:25202648

  5. Meta-Analysis in Human Neuroimaging: Computational Modeling of Large-Scale Databases

    PubMed Central

    Fox, Peter T.; Lancaster, Jack L.; Laird, Angela R.; Eickhoff, Simon B.

    2016-01-01

    Spatial normalization—applying standardized coordinates as anatomical addresses within a reference space—was introduced to human neuroimaging research nearly 30 years ago. Over these three decades, an impressive series of methodological advances have adopted, extended, and popularized this standard. Collectively, this work has generated a methodologically coherent literature of unprecedented rigor, size, and scope. Large-scale online databases have compiled these observations and their associated meta-data, stimulating the development of meta-analytic methods to exploit this expanding corpus. Coordinate-based meta-analytic methods have emerged and evolved in rigor and utility. Early methods computed cross-study consensus, in a manner roughly comparable to traditional (nonimaging) meta-analysis. Recent advances now compute coactivation-based connectivity, connectivity-based functional parcellation, and complex network models powered from data sets representing tens of thousands of subjects. Meta-analyses of human neuroimaging data in large-scale databases now stand at the forefront of computational neurobiology. PMID:25032500

  6. A One-Degree Seismic Tomographic Model Based on a Sensitivity Kernel Database

    NASA Astrophysics Data System (ADS)

    Sales de Andrade, E.; Liu, Q.; Manners, U.; Lee-Varisco, E.; Ma, Z.; Masters, G.

    2013-12-01

    Seismic tomography is instrumental in mapping 3D velocity structures of the Earth's interior based on travel-time measurements and waveform differences. Although both ray theory and other asymptotic methods have been successfully employed in global tomography, they are less accurate for long-period waves or steep velocity gradients. They also lack the ability to predict 'non-geometrical' effects such as those for the core-diffracted phases (Pdiff, Sdiff), which are crucial for mapping heterogeneities in the lowermost mantle (D'' layer). On the other hand, sensitivity kernels can be accurately calculated with no approximations by the interaction of forward and adjoint wavefields, both numerically simulated by spectral element methods. We have previously shown that by taking advantage of the symmetry of 1D reference models, we can efficiently construct sensitivity kernels of both P and S wavespeeds based on the simulation and storage of forward and adjoint strain fields for select source and receiver geometries. This technique has been used to create a database of strain fields as well as sensitivity kernels for phases typically used in global inversions. We also performed picks for 27,000 Sdiff, 35,000 Pdiff, 400,000 S, and 600,000 P phases and 33,000 SS-S, 33,000 PP-P, and 41,000 ScS-S differential phases, which provide much improved coverage of the globe. Using these travel times and our sensitivity kernel database in a parallel LSQR inversion, we generate an updated tomographic model with 1° resolution. Using this improved coverage, we investigate differences between global models inverted based on ray theory and finite-frequency kernels.
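
    A minimal sketch of the inversion step with SciPy's LSQR, using a synthetic sparse system as a stand-in for the kernel matrix; the sizes, density and damping below are illustrative only.

      import numpy as np
      from scipy.sparse import random as sparse_random
      from scipy.sparse.linalg import lsqr

      # Synthetic stand-in for the tomographic system: each row of G holds one
      # measurement's sensitivity kernel sampled on the model grid.
      rng = np.random.default_rng(0)
      G = sparse_random(5000, 2000, density=0.01, random_state=0)
      m_true = rng.normal(size=2000)                 # "true" model perturbations
      d = G @ m_true + 0.01 * rng.normal(size=5000)  # travel-time residuals

      # Damped least squares, the workhorse of large linearized inversions.
      m_est = lsqr(G, d, damp=0.1)[0]
      print("recovered model norm: %.2f" % np.linalg.norm(m_est))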

  7. Computational Thermochemistry: Scale Factor Databases and Scale Factors for Vibrational Frequencies Obtained from Electronic Model Chemistries.

    PubMed

    Alecu, I M; Zheng, Jingjing; Zhao, Yan; Truhlar, Donald G

    2010-09-14

    Optimized scale factors for calculating vibrational harmonic and fundamental frequencies and zero-point energies have been determined for 145 electronic model chemistries, including 119 based on approximate functionals depending on occupied orbitals, 19 based on single-level wave function theory, three based on the neglect-of-diatomic-differential-overlap, two based on doubly hybrid density functional theory, and two based on multicoefficient correlation methods. Forty of the scale factors are obtained from large databases, which are also used to derive two universal scale factor ratios that can be used to interconvert between scale factors optimized for various properties, enabling the derivation of three key scale factors at the effort of optimizing only one of them. A reduced scale factor optimization model is formulated in order to further reduce the cost of optimizing scale factors, and the reduced model is illustrated by using it to obtain 105 additional scale factors. Using root-mean-square errors from the values in the large databases, we find that scaling reduces errors in zero-point energies by a factor of 2.3 and errors in fundamental vibrational frequencies by a factor of 3.0, but it reduces errors in harmonic vibrational frequencies by only a factor of 1.3. It is shown that, upon scaling, the balanced multicoefficient correlation method based on coupled cluster theory with single and double excitations (BMC-CCSD) can lead to very accurate predictions of vibrational frequencies. With a polarized, minimally augmented basis set, the density functionals with zero-point energy scale factors closest to unity are MPWLYP1M (1.009), τHCTHhyb (0.989), BB95 (1.012), BLYP (1.013), BP86 (1.014), B3LYP (0.986), MPW3LYP (0.986), and VSXC (0.986).
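
    For a single property, the least-squares optimization underlying such scale factors has a closed form: the lambda minimizing sum_i (lambda*w_calc,i - w_ref,i)^2 is sum_i w_calc,i*w_ref,i / sum_i w_calc,i^2. A small sketch with synthetic frequencies (not data from the databases used in the paper):

      import numpy as np

      def optimal_scale_factor(calc, ref):
          # lam minimizing sum_i (lam*calc_i - ref_i)^2, plus the scaled RMSE.
          calc, ref = np.asarray(calc), np.asarray(ref)
          lam = np.dot(calc, ref) / np.dot(calc, calc)
          rmse = np.sqrt(np.mean((lam * calc - ref) ** 2))
          return lam, rmse

      # Synthetic example: computed harmonics that systematically overshoot.
      calc = np.array([3100.0, 1650.0, 1200.0, 800.0])
      ref = 0.97 * calc + np.random.default_rng(2).normal(0.0, 5.0, 4)
      lam, rmse = optimal_scale_factor(calc, ref)
      print("lambda = %.3f, RMSE = %.1f cm^-1" % (lam, rmse))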

  8. Bootstrap imputation with a disease probability model minimized bias from misclassification due to administrative database codes.

    PubMed

    van Walraven, Carl

    2017-04-01

    Diagnostic codes used in administrative databases cause bias due to misclassification of patient disease status. It is unclear which methods minimize this bias. Serum creatinine measures were used to determine severe renal failure status in 50,074 hospitalized patients. The true prevalence of severe renal failure and its association with covariates were measured. These were compared to results in which renal failure status was determined using surrogate measures, including: (1) diagnostic codes; (2) categorization of probability estimates of renal failure determined from a previously validated model; or (3) bootstrap imputation of disease status using model-derived probability estimates. Bias in estimates of severe renal failure prevalence and its association with covariates was minimal when bootstrap methods were used to impute renal failure status from model-based probability estimates. In contrast, biases were extensive when renal failure status was determined using codes or methods in which model-based condition probability was categorized. Bias due to misclassification from inaccurate diagnostic codes can be minimized by using bootstrap methods to impute condition status from multivariable model-derived probability estimates. Copyright © 2017 Elsevier Inc. All rights reserved.
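
    A simplified sketch of the imputation idea (one flavor of it, not the paper's exact procedure): draw each patient's disease status from the model-based probability in every bootstrap iteration, so that classification uncertainty propagates into the prevalence estimate.

      import numpy as np

      def bootstrap_imputed_prevalence(p_model, n_boot=1000, seed=0):
          # Impute each patient's status as a Bernoulli draw from the model
          # probability in every iteration, so the uncertainty of the
          # imputation is carried into the prevalence estimate.
          rng = np.random.default_rng(seed)
          prev = np.array([(rng.random(len(p_model)) < p_model).mean()
                           for _ in range(n_boot)])
          return prev.mean(), np.percentile(prev, [2.5, 97.5])

      # Model-derived renal-failure probabilities for a synthetic cohort.
      p = np.random.default_rng(1).beta(1, 30, size=50074)
      est, (lo, hi) = bootstrap_imputed_prevalence(p)
      print("prevalence %.3f%% (%.3f-%.3f%%)" % (100*est, 100*lo, 100*hi))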

  9. Mouse Tumor Biology (MTB): a database of mouse models for human cancer.

    PubMed

    Bult, Carol J; Krupke, Debra M; Begley, Dale A; Richardson, Joel E; Neuhauser, Steven B; Sundberg, John P; Eppig, Janan T

    2015-01-01

    The Mouse Tumor Biology (MTB; http://tumor.informatics.jax.org) database is a unique online compendium of mouse models for human cancer. MTB provides online access to expertly curated information on diverse mouse models for human cancer and interfaces for searching and visualizing data associated with these models. The information in MTB is designed to facilitate the selection of strains for cancer research and is a platform for mining data on tumor development and patterns of metastases. MTB curators acquire data through manual curation of peer-reviewed scientific literature and from direct submissions by researchers. Data in MTB are also obtained from other bioinformatics resources including PathBase, the Gene Expression Omnibus and ArrayExpress. Recent enhancements to MTB improve the association between mouse models and human genes commonly mutated in a variety of cancers as identified in large-scale cancer genomics studies, provide new interfaces for exploring regions of the mouse genome associated with cancer phenotypes and incorporate data and information related to Patient-Derived Xenograft models of human cancers.

  10. Predictive modeling using a nationally representative database to identify patients at risk of developing microalbuminuria.

    PubMed

    Villa-Zapata, Lorenzo; Warholak, Terri; Slack, Marion; Malone, Daniel; Murcko, Anita; Runger, George; Levengood, Michael

    2016-02-01

    Predictive models allow clinicians to identify higher- and lower-risk patients and make targeted treatment decisions. Microalbuminuria (MA) is a condition whose presence is understood to be an early marker for cardiovascular disease. The aims of this study were to develop a patient data-driven predictive model and a risk-score assessment to improve the identification of MA. The 2007-2008 National Health and Nutrition Examination Survey (NHANES) was utilized to create a predictive model. The dataset was split into thirds; one-third was used to develop the model, while the other two-thirds were utilized for internal validation. The 2012-2013 NHANES was used as an external validation database. Multivariate logistic regression was performed to create the model. Performance was evaluated using three criteria: (1) receiver operating characteristic curves; (2) pseudo-R² values; and (3) goodness of fit (Hosmer-Lemeshow). The model was then used to develop a risk-score chart. A model was developed using variables for which there was a significant relationship. Variables included were systolic blood pressure, fasting glucose, C-reactive protein, blood urea nitrogen, and alcohol consumption. The model performed well, and no significant differences were observed when it was utilized in the validation datasets. A risk score was developed, and the probability of developing MA for each score was calculated. The predictive model provides new evidence about variables related to MA and may be used by clinicians to identify at-risk patients and to tailor treatment. The risk score developed may allow clinicians to measure a patient's MA risk.
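
    As a generic illustration of turning logistic-regression output into a risk score (all coefficients and variable names below are invented, not the published model's): each beta is expressed in multiples of a fixed log-odds increment and rounded to integer points.

      import math

      # Invented coefficients for illustration only (not the published model).
      COEF = {"sbp_per_10mmHg": 0.20, "glucose_per_10mgdl": 0.15,
              "crp_elevated": 0.40, "bun_per_5mgdl": 0.10, "alcohol_use": 0.25}
      INTERCEPT = -4.0

      def risk_points(x, per_point=0.10):
          # Express each beta in multiples of a fixed log-odds increment and
          # round to whole points, the usual recipe for bedside risk scores.
          return sum(round(COEF[k] / per_point) * v for k, v in x.items())

      def probability(x):
          logit = INTERCEPT + sum(COEF[k] * v for k, v in x.items())
          return 1 / (1 + math.exp(-logit))

      patient = {"sbp_per_10mmHg": 2, "glucose_per_10mgdl": 1,
                 "crp_elevated": 1, "bun_per_5mgdl": 0, "alcohol_use": 1}
      print("%d points, p = %.1f%%"
            % (risk_points(patient), 100 * probability(patient)))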

  11. How I do it: a practical database management system to assist clinical research teams with data collection, organization, and reporting.

    PubMed

    Lee, Howard; Chapiro, Julius; Schernthaner, Rüdiger; Duran, Rafael; Wang, Zhijun; Gorodetski, Boris; Geschwind, Jean-François; Lin, MingDe

    2015-04-01

    The objective of this study was to demonstrate that an intra-arterial liver therapy clinical research database system is a more workflow-efficient and robust tool for clinical research than a spreadsheet storage system. The database system could be used to generate clinical research study populations easily, with custom search and retrieval criteria. A questionnaire was designed and distributed to 21 board-certified radiologists to assess current data storage problems and clinician reception to a database management system. Based on the questionnaire findings, a customized database and user interface system were created to perform automatic calculations of clinical scores, including staging systems such as the Child-Pugh and Barcelona Clinic Liver Cancer stagings, and to facilitate data input and output. Questionnaire participants were favorable to a database system. The interface retrieved study-relevant data accurately and effectively. The database effectively produced easy-to-read, study-specific patient populations with custom-defined inclusion/exclusion criteria. The database management system is workflow-efficient and robust in retrieving, storing, and analyzing data. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
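
    As an example of the kind of automatic score calculation such a system performs, here is a Child-Pugh calculator from the standard clinical cut-offs (a sketch; the study's own implementation is not shown in the abstract):

      def child_pugh(bilirubin, albumin, inr, ascites, encephalopathy):
          # Standard cut-offs: bilirubin in mg/dL, albumin in g/dL; ascites and
          # encephalopathy graded 0 = none, 1 = mild, 2 = severe.
          pts = 1 if bilirubin < 2 else 2 if bilirubin <= 3 else 3
          pts += 1 if albumin > 3.5 else 2 if albumin >= 2.8 else 3
          pts += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
          pts += ascites + 1
          pts += encephalopathy + 1
          grade = "A" if pts <= 6 else "B" if pts <= 9 else "C"
          return pts, grade

      print(child_pugh(1.5, 3.0, 1.9, ascites=1, encephalopathy=0))  # (8, 'B')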

  12. The NCBI Taxonomy database.

    PubMed

    Federhen, Scott

    2012-01-01

    The NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/taxonomy) is the standard nomenclature and classification repository for the International Nucleotide Sequence Database Collaboration (INSDC), comprising the GenBank, ENA (EMBL) and DDBJ databases. It includes organism names and taxonomic lineages for each of the sequences represented in the INSDC's nucleotide and protein sequence databases. The taxonomy database is manually curated by a small group of scientists at the NCBI who use the current taxonomic literature to maintain a phylogenetic taxonomy for the source organisms represented in the sequence databases. The taxonomy database is a central organizing hub for many of the resources at the NCBI, and provides a means for clustering elements within other domains of the NCBI web site, for internal linking between domains of the Entrez system and for linking out to taxon-specific external resources on the web. Our primary purpose is to index the domain of sequences as conveniently as possible for our user community.

  13. MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome.

    PubMed

    Wang, Julia; Al-Ouran, Rami; Hu, Yanhui; Kim, Seon-Young; Wan, Ying-Wooi; Wangler, Michael F; Yamamoto, Shinya; Chao, Hsiao-Tuan; Comjean, Aram; Mohr, Stephanie E; Perrimon, Norbert; Liu, Zhandong; Bellen, Hugo J

    2017-06-01

    One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publicly available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms are arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  14. The LBNL Water Heater Retail Price Database

    SciTech Connect

    Lekov, Alex; Glover, Julie; Lutz, Jim

    2000-10-01

    Lawrence Berkeley National Laboratory developed the LBNL Water Heater Price Database to compile and organize information used in the revision of U.S. energy efficiency standards for water heaters. The Database covers all major components that contribute to the consumer cost of water heaters, including basic retail prices, sales taxes, installation costs, and any associated fees. In addition, the Database provides manufacturing data on the features and design characteristics of more than 1100 different water heater models. Data contained in the Database were collected over a two-year period from 1997 to 1999.

  15. Volcanogenic Massive Sulfide Deposits of the World - Database and Grade and Tonnage Models

    USGS Publications Warehouse

    Mosier, Dan L.; Berger, Vladimir I.; Singer, Donald A.

    2009-01-01

    Grade and tonnage models are useful in quantitative mineral-resource assessments. The models and database presented in this report are an update of earlier publications about volcanogenic massive sulfide (VMS) deposits. These VMS deposits include what were formerly classified as kuroko, Cyprus, and Besshi deposits. The update was necessary because of new information about some deposits, including revised grades, tonnages, ages, and locations, and because some subtypes were reclassified. In this report we have added new VMS deposits and removed a few incorrectly classified deposits. This global compilation of VMS deposits contains 1,090 deposits; however, it was not our intent to include every known deposit in the world. The data were recently used for mineral-deposit density models (Mosier and others, 2007; Singer, 2008). In this paper, 867 deposits were used to construct revised grade and tonnage models. Our new models are based on a reclassification of deposits by host lithology: Felsic, Bimodal-Mafic, and Mafic volcanogenic massive sulfide deposits. Mineral-deposit models are important in exploration planning and quantitative resource assessments for two reasons: (1) grades and tonnages among deposit types vary significantly, and (2) deposits of different types occur in distinct geologic settings that can be identified from geologic maps. Mineral-deposit models combine the diverse geoscience information on geology, mineral occurrences, geophysics, and geochemistry used in resource assessments and mineral exploration. Globally based deposit models allow recognition of important features and demonstrate how common different features are. Well-designed deposit models allow geologists to deduce possible mineral-deposit types in a given geologic environment and economists to determine the possible economic viability of these resources. Thus, mineral-deposit models play a central role in presenting geoscience
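
    Grade and tonnage models of this kind are conventionally summarized by fitting a lognormal distribution to deposit tonnages and reporting the tonnage exceeded by 90%, 50% and 10% of deposits. The sketch below shows that calculation on made-up tonnages; a minimal Python sketch, with illustrative numbers rather than values from the report's database.

        import numpy as np
        from scipy import stats

        # Illustrative deposit tonnages in million tonnes (not data from the report)
        tonnage = np.array([0.12, 0.45, 1.1, 2.3, 5.8, 9.4, 21.0, 55.0])

        log_t = np.log10(tonnage)
        mu, sigma = log_t.mean(), log_t.std(ddof=1)

        # Tonnage exceeded by a given proportion of deposits, assuming lognormality
        for p in (0.90, 0.50, 0.10):
            t_p = 10 ** stats.norm.ppf(1 - p, loc=mu, scale=sigma)
            print(f"{int(p * 100)}% of deposits exceed {t_p:.2f} Mt")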

  16. Engineering the object-relation database model in O-Raid

    NASA Technical Reports Server (NTRS)

    Dewan, Prasun; Vikram, Ashish; Bhargava, Bharat

    1989-01-01

    Raid is a distributed database system based on the relational model. O-Raid is an extension of the Raid system and will support complex data objects. The design of O-Raid is evolutionary and retains all features of relational database systems and those of a general-purpose object-oriented programming language. O-Raid has several novel properties. Objects, classes, and inheritance are supported together with a predicate-based relational query language. O-Raid objects are compatible with C++ objects and may be read and manipulated by a C++ program without any 'impedance mismatch'. Relations and columns within relations may themselves be treated as objects with associated variables and methods. Relations may contain heterogeneous objects, that is, objects of more than one class in a certain column, which can individually evolve by being reclassified. Special facilities are provided to reduce the data search in a relation containing complex objects.
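
    The idea of a relation whose column holds objects of more than one class, each carrying its own methods, can be illustrated outside C++; the Python sketch below is an analogy only, not O-Raid code.

        class Report:
            def __init__(self, pages): self.pages = pages
            def summary(self): return f"report, {self.pages} pages"

        class Image:
            def __init__(self, width, height): self.size = (width, height)
            def summary(self): return f"image, {self.size[0]}x{self.size[1]}"

        # One 'relation': the 'attachment' column holds objects of either class,
        # and a query can invoke methods without knowing the concrete class.
        relation = [
            {"id": 1, "attachment": Report(12)},
            {"id": 2, "attachment": Image(640, 480)},
        ]
        for row in relation:
            print(row["id"], row["attachment"].summary())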

  17. Modelling the Geographical Origin of Rice Cultivation in Asia Using the Rice Archaeological Database.

    PubMed

    Silva, Fabio; Stevens, Chris J; Weisskopf, Alison; Castillo, Cristina; Qin, Ling; Bevan, Andrew; Fuller, Dorian Q

    2015-01-01

    We have compiled an extensive database of archaeological evidence for rice across Asia, including 400 sites from mainland East Asia, Southeast Asia and South Asia. This dataset is used to compare several models for the geographical origins of rice cultivation and infer the most likely region(s) for its origins and subsequent outward diffusion. The approach is based on regression modelling wherein goodness of fit is obtained from power law quantile regressions of the archaeologically inferred age versus a least-cost distance from the putative origin(s). The Fast Marching method is used to estimate the least-cost distances based on simple geographical features. The origin region that best fits the archaeobotanical data is also compared to other hypothetical geographical origins derived from the literature, including from genetics, archaeology and historical linguistics. The model that best fits all available archaeological evidence is a dual origin model with two centres for the cultivation and dispersal of rice focused on the Middle Yangtze and the Lower Yangtze valleys.
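
    The core of the approach, fitting a power law to the envelope of earliest dates as a function of least-cost distance, can be reproduced with off-the-shelf quantile regression, as in the sketch below. This is a minimal sketch on synthetic data: statsmodels' QuantReg stands in for the authors' exact fitting code, and in practice the least-cost distances would come from a Fast Marching solver such as scikit-fmm.

        import numpy as np
        import statsmodels.api as sm
        from statsmodels.regression.quantile_regression import QuantReg

        rng = np.random.default_rng(0)
        # Synthetic sites: distance from a putative origin (km) and inferred age (yr BP)
        dist = rng.uniform(50, 3000, 200)
        age = 9000 * dist ** -0.15 * rng.uniform(0.7, 1.0, 200)

        # Power law age = a * dist^b becomes linear in log-log space;
        # a high quantile tracks the earliest-arrival envelope.
        X = sm.add_constant(np.log(dist))
        fit = QuantReg(np.log(age), X).fit(q=0.95)
        a, b = np.exp(fit.params[0]), fit.params[1]
        print(f"envelope: age ~ {a:.0f} * distance^{b:.3f}")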

  18. Combining a weed traits database with a population dynamics model predicts shifts in weed communities

    PubMed Central

    Storkey, J; Holst, N; Bøjer, O Q; Bigongiali, F; Bocci, G; Colbach, N; Dorner, Z; Riemens, M M; Sartorato, I; Sønderskov, M; Verschwele, A

    2015-01-01

    A functional approach to predicting shifts in weed floras in response to management or environmental change requires the combination of data on weed traits with analytical frameworks that capture the filtering effect of selection pressures on traits. A weed traits database (WTDB) was designed, populated and analysed, initially using data for 19 common European weeds, to begin to consolidate trait data in a single repository. The initial choice of traits was driven by the requirements of empirical models of weed population dynamics to identify correlations between traits and model parameters. These relationships were used to build a generic model, operating at the level of functional traits, to simulate the impact of increasing herbicide and fertiliser use on virtual weeds along gradients of seed weight and maximum height. The model generated ‘fitness contours’ (defined as population growth rates) within this trait space in different scenarios, onto which two sets of weed species, defined as common or declining in the UK, were mapped. The effect of increasing inputs on the weed flora was successfully simulated; 77% of common species were predicted to have stable or increasing populations under high fertiliser and herbicide use, in contrast with only 29% of the species that have declined. Future development of the WTDB will aim to increase the number of species covered, incorporate a wider range of traits and analyse intraspecific variability under contrasting management and environments. PMID:26190870

  19. Combining a weed traits database with a population dynamics model predicts shifts in weed communities.

    PubMed

    Storkey, J; Holst, N; Bøjer, O Q; Bigongiali, F; Bocci, G; Colbach, N; Dorner, Z; Riemens, M M; Sartorato, I; Sønderskov, M; Verschwele, A

    2015-04-01

    A functional approach to predicting shifts in weed floras in response to management or environmental change requires the combination of data on weed traits with analytical frameworks that capture the filtering effect of selection pressures on traits. A weed traits database (WTDB) was designed, populated and analysed, initially using data for 19 common European weeds, to begin to consolidate trait data in a single repository. The initial choice of traits was driven by the requirements of empirical models of weed population dynamics to identify correlations between traits and model parameters. These relationships were used to build a generic model, operating at the level of functional traits, to simulate the impact of increasing herbicide and fertiliser use on virtual weeds along gradients of seed weight and maximum height. The model generated 'fitness contours' (defined as population growth rates) within this trait space in different scenarios, onto which two sets of weed species, defined as common or declining in the UK, were mapped. The effect of increasing inputs on the weed flora was successfully simulated; 77% of common species were predicted to have stable or increasing populations under high fertiliser and herbicide use, in contrast with only 29% of the species that have declined. Future development of the WTDB will aim to increase the number of species covered, incorporate a wider range of traits and analyse intraspecific variability under contrasting management and environments.
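
    The 'fitness contours' idea can be sketched as evaluating a population growth rate over a grid of seed weight and maximum height, then asking where the rate crosses 1. Everything in the snippet below is hypothetical: the growth-rate function is invented for illustration and is not the paper's calibrated model.

        import numpy as np

        def growth_rate(seed_mg, height_cm, herbicide=0.5, fertiliser=0.5):
            """Invented toy fitness function over trait space (illustration only)."""
            survival = np.exp(-herbicide * 2.0 / np.sqrt(seed_mg))  # big seeds tolerate herbicide
            competition = height_cm / (50.0 + height_cm) * (1 + fertiliser)
            fecundity = 20.0 / seed_mg                              # small seeds -> more seeds
            return survival * competition * fecundity

        seed = np.linspace(0.1, 10, 50)     # seed weight, mg
        height = np.linspace(10, 150, 50)   # maximum height, cm
        S, H = np.meshgrid(seed, height)
        rate = growth_rate(S, H, herbicide=0.9, fertiliser=0.9)
        print("share of trait space with stable/increasing populations:",
              round(float((rate >= 1).mean()), 2))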

  20. Drug-target interaction prediction: databases, web servers and computational models.

    PubMed

    Chen, Xing; Yan, Chenggang Clarence; Zhang, Xiaotian; Zhang, Xu; Dai, Feng; Yin, Jian; Zhang, Yongdong

    2016-07-01

    Identification of drug-target interactions is an important process in drug discovery. Although high-throughput screening and other biological assays are becoming available, experimental methods for drug-target interaction identification remain extremely costly, time-consuming and challenging. Therefore, various computational models have been developed to predict potential drug-target associations on a large scale. In this review, databases and web servers involved in drug-target identification and drug discovery are summarized. In addition, we introduce state-of-the-art computational models for drug-target interaction prediction, including network-based and machine learning-based methods. For the machine learning-based methods, particular attention is paid to supervised and semi-supervised models, which differ essentially in their treatment of negative samples. Although significant improvements in drug-target interaction prediction have been achieved by many effective computational models, both network-based and machine learning-based methods have their respective disadvantages. Furthermore, we discuss the future directions of network-based drug discovery and network approaches for personalized drug discovery based on personalized medicine, genome sequencing, tumor clone-based networks and cancer hallmark-based networks. Finally, we discuss a new evaluation validation framework and the formulation of the drug-target interaction prediction problem as a more realistic regression task based on quantitative bioactivity data. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
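
    As a concrete instance of the machine learning-based approach, the sketch below trains a supervised classifier on drug-target pairs, where pairs without a known interaction are treated as negatives, precisely the design choice the review flags as the essential difference from semi-supervised models. Data and features are synthetic; this is an illustration, not any specific published model.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(1)
        n_pairs, n_feat = 600, 40
        # Each row: concatenated drug descriptor + target descriptor (synthetic)
        X = rng.normal(size=(n_pairs, n_feat))
        w = rng.normal(size=n_feat)
        y = (X @ w + rng.normal(scale=2.0, size=n_pairs) > 0).astype(int)  # 1 = interaction

        # Supervised setting: unlabeled pairs carry label 0 (assumed non-interacting)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        print("AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))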

  1. Modelling the Geographical Origin of Rice Cultivation in Asia Using the Rice Archaeological Database

    PubMed Central

    Silva, Fabio; Stevens, Chris J.; Weisskopf, Alison; Castillo, Cristina; Qin, Ling; Bevan, Andrew; Fuller, Dorian Q.

    2015-01-01

    We have compiled an extensive database of archaeological evidence for rice across Asia, including 400 sites from mainland East Asia, Southeast Asia and South Asia. This dataset is used to compare several models for the geographical origins of rice cultivation and infer the most likely region(s) for its origins and subsequent outward diffusion. The approach is based on regression modelling wherein goodness of fit is obtained from power law quantile regressions of the archaeologically inferred age versus a least-cost distance from the putative origin(s). The Fast Marching method is used to estimate the least-cost distances based on simple geographical features. The origin region that best fits the archaeobotanical data is also compared to other hypothetical geographical origins derived from the literature, including from genetics, archaeology and historical linguistics. The model that best fits all available archaeological evidence is a dual origin model with two centres for the cultivation and dispersal of rice focused on the Middle Yangtze and the Lower Yangtze valleys. PMID:26327225

  2. Speciation of volatile organic compound emissions for regional air quality modeling of particulate matter and ozone

    NASA Astrophysics Data System (ADS)

    Makar, P. A.; Moran, M. D.; Scholtz, M. T.; Taylor, A.

    2003-01-01

    A new classification scheme for the speciation of organic compound emissions for use in air quality models is described. The scheme uses 81 organic compound classes to preserve both net gas-phase reactivity and particulate matter (PM) formation potential. Chemical structure, vapor pressure, hydroxyl radical (OH) reactivity, freezing point/boiling point, and solubility data were used to create the 81 compound classes. Volatile, semivolatile, and nonvolatile organic compounds are included. The new classification scheme has been used in conjunction with the Canadian Emissions Processing System (CEPS) to process 1990 gas-phase and particle-phase organic compound emissions data for summer and winter conditions for a domain covering much of eastern North America. A simple postprocessing model was used to analyze the speciated organic emissions in terms of both gas-phase reactivity and potential to form organic PM. Previously unresolved compound classes that may have a significant impact on ozone formation include biogenic high-reactivity esters and internal C6-8 alkene-alcohols and anthropogenic ethanol and propanol. Organic radical production associated with anthropogenic organic compound emissions may be 1 or more orders of magnitude more important than biogenic-associated production in northern U.S. and Canadian cities, and a factor of 3 more important in southern U.S. cities. Previously unresolved organic compound classes such as low vapor pressure PAHs, anthropogenic diacids, dialkyl phthalates, and high carbon number alkanes may have a significant impact on organic particle formation. Primary organic particles (poorly characterized in national emissions databases) dominate total organic particle concentrations, followed by secondary formation and primary gas-particle partitioning. The influence of the assumed initial aerosol water concentration on subsequent thermodynamic calculations suggests that hydrophobic and hydrophilic compounds may form external
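
    Mechanistically, emissions speciation is an allocation step: a source's total organic-gas mass is split across model compound classes using a profile of mass fractions. The sketch below shows that bookkeeping with an invented three-class profile; it is illustrative only, whereas the actual scheme uses 81 classes and CEPS-specific profiles.

        # Invented speciation profile: mass fractions by model compound class
        PROFILE = {"alkanes_C10+": 0.52, "aromatics": 0.31, "alcohols": 0.17}

        def speciate(total_voc_tonnes, profile):
            """Split a source's total VOC mass across compound classes."""
            assert abs(sum(profile.values()) - 1.0) < 1e-9, "fractions must sum to 1"
            return {cls: total_voc_tonnes * frac for cls, frac in profile.items()}

        print(speciate(120.0, PROFILE))  # tonnes per class for a 120 t source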

  3. Coverage of whole proteome by structural genomics observed through protein homology modeling database

    PubMed Central

    Yamaguchi, Akihiro; Go, Mitiko

    2006-01-01

    We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers an entire ORF product are rare. When the portion of each ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of archaebacterial and eubacterial ORFs have modeled 3D structures covering almost the entire amino acid sequence, but the percentage falls to about 30% in eukaryotes. Comparing annual changes in the number of ORFs with modeled 3D structures, the fraction of modeled 3D structures of soluble proteins increased by 5% for archaebacteria and by 7% for eubacteria over the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble-protein model structures of prokaryotes, excluding the putative disordered regions, will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting the spatial arrangements of structural domains within an ORF will then become a central issue of structural genomics. PMID:17146617

  4. The Society of Thoracic Surgeons Congenital Heart Surgery Database Mortality Risk Model: Part 1-Statistical Methodology.

    PubMed

    O'Brien, Sean M; Jacobs, Jeffrey P; Pasquali, Sara K; Gaynor, J William; Karamlou, Tara; Welke, Karl F; Filardo, Giovanni; Han, Jane M; Kim, Sunghee; Shahian, David M; Jacobs, Marshall L

    2015-09-01

    This study's objective was to develop a risk model incorporating procedure type and patient factors to be used for case-mix adjustment in the analysis of hospital-specific operative mortality rates after congenital cardiac operations. Included were patients of all ages undergoing cardiac operations, with or without cardiopulmonary bypass, at centers participating in The Society of Thoracic Surgeons Congenital Heart Surgery Database during January 1, 2010, to December 31, 2013. Excluded were isolated patent ductus arteriosus closures in patients weighing less than or equal to 2.5 kg, centers with more than 10% missing data, and patients with missing data for key variables. Data from the first 3.5 years were used for model development, and data from the last 0.5 year were used for assessing model discrimination and calibration. Potential risk factors were proposed based on expert consensus and selected after empirically comparing a variety of modeling options. The study cohort included 52,224 patients from 86 centers with 1,931 deaths (3.7%). Covariates included in the model were primary procedure, age, weight, and 11 additional patient factors reflecting acuity status and comorbidities. The C statistic in the validation sample was 0.858. Plots of observed-vs-expected mortality rates revealed good calibration overall and within subgroups, except for a slight overestimation of risk in the highest decile of predicted risk. Removing patient preoperative factors from the model reduced the C statistic to 0.831 and affected the performance classification for 12 of 86 hospitals. The risk model is well suited to adjust for case mix in the analysis and reporting of hospital-specific mortality for congenital heart operations. Inclusion of patient factors added useful discriminatory power and reduced bias in the calculation of hospital-specific mortality metrics. Copyright © 2015 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
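
    The C statistic reported here is the probability that a randomly chosen death received a higher predicted risk than a randomly chosen survivor, i.e. the area under the ROC curve. The sketch below computes it for made-up predictions; the numbers are synthetic, not registry data.

        from sklearn.metrics import roc_auc_score

        # Synthetic example: 1 = operative mortality, 0 = survival,
        # paired with the model's predicted mortality probabilities
        observed  = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
        predicted = [0.01, 0.03, 0.40, 0.30, 0.22, 0.05, 0.08, 0.55, 0.04, 0.09]

        print("C statistic:", roc_auc_score(observed, predicted))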

  5. MEMOSys 2.0: an update of the bioinformatics database for genome-scale models and genomic data.

    PubMed

    Pabinger, Stephan; Snajder, Rene; Hardiman, Timo; Willi, Michaela; Dander, Andreas; Trajanoski, Zlatko

    2014-01-01

    The MEtabolic MOdel research and development System (MEMOSys) is a versatile database for the management, storage and development of genome-scale models (GEMs). Since its initial release, the database has undergone major improvements, and the new version introduces several new features. First, the novel concept of derived models allows users to create model hierarchies that automatically propagate modifications along their order. Second, all stored components can now be easily enhanced with additional annotations that can be directly extracted from a supplied Systems Biology Markup Language (SBML) file. Third, the web application has been substantially revised and now features new query mechanisms, an easy search system for reactions and new link-out services to publicly available databases. Fourth, the updated database now contains 20 publicly available models, which can be easily exported into standardized formats for further analysis. Fifth, MEMOSys 2.0 is now also available as a fully configured virtual image and can be found online at http://www.icbi.at/memosys and http://memosys.i-med.ac.at. Database URL: http://memosys.i-med.ac.at.
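
    Models exported in SBML can be inspected with the python-libsbml bindings, as in the sketch below. It assumes python-libsbml is installed and that model.xml is a placeholder path to a local export; error handling is minimal.

        import libsbml

        doc = libsbml.readSBML("model.xml")  # placeholder path to an exported GEM
        if doc.getNumErrors() > 0:
            doc.printErrors()

        model = doc.getModel()
        print(model.getId(), "-", model.getNumSpecies(), "species,",
              model.getNumReactions(), "reactions")
        for reaction in model.getListOfReactions():
            print(reaction.getId(), reaction.getName())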

  6. A lattice vibrational model using vibrational density of states for constructing thermodynamic databases (Invited)

    NASA Astrophysics Data System (ADS)

    Jacobs, M. H.; Van Den Berg, A. P.

    2013-12-01

    Thermodynamic databases are indispensable tools in materials science and mineral physics for deriving thermodynamic properties in regions of pressure-temperature-composition space for which experimental data are not available or scant. Because the number of phases and substances in a database can be arbitrarily large, thermodynamic formalisms coupled to these databases are often kept as simple as possible to sustain computational efficiency. Although formalisms based on parameterizations of 1 bar thermodynamic data, commonly used in Calphad methodology, meet this requirement, physically unrealistic behavior in properties hampers their application in the pressure regime prevailing in the Earth's lower mantle. The application becomes especially cumbersome when these formalisms are applied to planetary mantles of massive super-Earth exoplanets or in the development of pressure scales, where Hugoniot data at extreme conditions are involved. Methods based on the Mie-Grüneisen-Debye formalism have the advantage that physically unrealistic behavior in thermodynamic properties is absent, but because of the simple construction of the vibrational density of states (VDoS), they lack engineering precision in the low-pressure regime, especially at 1 bar, hampering the application of databases incorporating such formalisms to industrial processes. To obtain a method that is generally applicable over the complete stability range of a material, we developed a method based on an alternative use of Kieffer's lattice vibrational formalism. The method requires experimental data to constrain the model parameters and is therefore semi-empirical. It has the advantage that microscopic properties of substances, such as the VDoS, Grüneisen parameters and electronic and static lattice properties resulting from present-day ab-initio methods, can be incorporated to constrain a thermodynamic analysis of experimental data. It produces results free from physically unrealistic behavior at high pressure and temperature.
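
    For orientation, the quantity at the heart of such a model is the harmonic vibrational contribution to the heat capacity, obtained by integrating over the VDoS g(omega). In standard textbook form (not the authors' full semi-empirical formalism):

        C_V = 3 n R \int_0^{\omega_{max}} g(\omega)\,
              \frac{(\hbar\omega / k_B T)^2 \, e^{\hbar\omega / k_B T}}
                   {\left(e^{\hbar\omega / k_B T} - 1\right)^2}\, d\omega,
        \qquad
        \int_0^{\omega_{max}} g(\omega)\, d\omega = 1,

    where n is the number of atoms per formula unit and R the gas constant. Kieffer's formalism constructs g(omega) from acoustic and optic branches constrained by elastic and spectroscopic data, rather than from the single Debye branch of simpler models.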

  7. Multiple imputation as one tool to provide longitudinal databases for modelling human height and weight development.

    PubMed

    Aßmann, C

    2016-06-01

    Besides the large effort of field work, provision of valid databases requires statistical and informational infrastructure to enable long-term access to longitudinal data sets on height, weight and related measures. To foster use of longitudinal data sets within the scientific community, provision of valid databases also has to address data-protection regulations. It is therefore of major importance to prevent identification of individuals from publicly available databases. One strategy towards this goal is to provide a synthetic database to the public that allows pretesting of data-analysis strategies. Such synthetic databases can be established using multiple imputation tools. Once an analysis strategy is approved, verification is performed on the original data. Multiple imputation by chained equations is well suited to generating synthetic databases, as it can capture a wide range of statistical interdependencies. Missing values, which typically occur in longitudinal databases through item non-response, can also be addressed via multiple imputation when providing databases. The provision of synthetic databases using multiple imputation techniques is one possible strategy to ensure data protection, increase the visibility of longitudinal databases and enhance their analytical potential.
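
    Chained-equations imputation is available off the shelf; the sketch below draws several synthetic completions of a partially missing table with scikit-learn's IterativeImputer. A minimal sketch: sample_posterior=True with varying seeds approximates multiple imputation, and the data are invented.

        import numpy as np
        from sklearn.experimental import enable_iterative_imputer  # noqa: F401
        from sklearn.impute import IterativeImputer

        rng = np.random.default_rng(2)
        # Invented longitudinal table: height (cm) and weight (kg) with gaps
        X = rng.normal(loc=[170.0, 70.0], scale=[8.0, 10.0], size=(100, 2))
        X[rng.random(X.shape) < 0.15] = np.nan  # 15% item non-response

        # Draw M completed datasets; each uses a different posterior sample
        for m in range(3):
            imputer = IterativeImputer(sample_posterior=True, random_state=m)
            completed = imputer.fit_transform(X)
            print(f"imputation {m}: mean height {completed[:, 0].mean():.1f} cm")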

  8. The Time Is Right to Focus on Model Organism Metabolomes.

    PubMed

    Edison, Arthur S; Hall, Robert D; Junot, Christophe; Karp, Peter D; Kurland, Irwin J; Mistrik, Robert; Reed, Laura K; Saito, Kazuki; Salek, Reza M; Steinbeck, Christoph; Sumner, Lloyd W; Viant, Mark R

    2016-02-15

    Model organisms are an essential component of biological and biomedical research that can be used to study specific biological processes. These organisms are in part selected for facile experimental study. However, just as importantly, intensive study of a small number of model organisms yields important synergies as discoveries in one area of science for a given organism shed light on biological processes in other areas, even for other organisms. Furthermore, the extensive knowledge bases compiled for each model organism enable systems-level understandings of these species, which enhance the overall biological and biomedical knowledge for all organisms, including humans. Building upon extensive genomics research, we argue that the time is now right to focus intensively on model organism metabolomes. We propose a grand challenge for metabolomics studies of model organisms: to identify and map all metabolites onto metabolic pathways, to develop quantitative metabolic models for model organisms, and to relate organism metabolic pathways within the context of evolutionary metabolomics, i.e., phylometabolomics. These efforts should focus on a series of established model organisms in microbial, animal and plant research.

  9. The Time Is Right to Focus on Model Organism Metabolomes

    PubMed Central

    Edison, Arthur S.; Hall, Robert D.; Junot, Christophe; Karp, Peter D.; Kurland, Irwin J.; Mistrik, Robert; Reed, Laura K.; Saito, Kazuki; Salek, Reza M.; Steinbeck, Christoph; Sumner, Lloyd W.; Viant, Mark R.

    2016-01-01

    Model organisms are an essential component of biological and biomedical research that can be used to study specific biological processes. These organisms are in part selected for facile experimental study. However, just as importantly, intensive study of a small number of model organisms yields important synergies as discoveries in one area of science for a given organism shed light on biological processes in other areas, even for other organisms. Furthermore, the extensive knowledge bases compiled for each model organism enable systems-level understandings of these species, which enhance the overall biological and biomedical knowledge for all organisms, including humans. Building upon extensive genomics research, we argue that the time is now right to focus intensively on model organism metabolomes. We propose a grand challenge for metabolomics studies of model organisms: to identify and map all metabolites onto metabolic pathways, to develop quantitative metabolic models for model organisms, and to relate organism metabolic pathways within the context of evolutionary metabolomics, i.e., phylometabolomics. These efforts should focus on a series of established model organisms in microbial, animal and plant research. PMID:26891337

  10. Carbonatites of the World, Explored Deposits of Nb and REE - Database and Grade and Tonnage Models

    USGS Publications Warehouse

    Berger, Vladimir I.; Singer, Donald A.; Orris, Greta J.

    2009-01-01

    This report is based on published tonnage and grade data for 58 Nb- and rare-earth-element (REE)-bearing carbonatite deposits that are mostly well explored and are partially mined or contain resources of these elements. The deposits represent only a part of the 527 known carbonatites around the world, but they are characterized by reliable quantitative data on ore tonnages and grades of niobium and REE. Grade and tonnage models are an important component of mineral resource assessments. Carbonatites are one of the main natural sources of niobium and rare-earth elements, whose economic importance grows steadily. One purpose of this report is to update earlier publications. New information about known deposits, as well as data on new deposits published during the last decade, is incorporated in the present paper. The compiled database (appendix 1) contains 60 explored Nb- and REE-bearing carbonatite deposits; resources of 55 of these deposits are taken from publications. In the present updated grade-tonnage model we have added 24 deposits compared with the previous model of Singer (1998). The resources of most deposits are residuum ores in the upper part of carbonatite bodies. Mineral-deposit models are important in exploration planning and quantitative resource assessments for two reasons: (1) grades and tonnages among deposit types vary significantly, and (2) deposits of different types are present in distinct geologic settings that can be identified from geologic maps. Mineral-deposit models combine the diverse geoscience information on geology, mineral occurrences, geophysics, and geochemistry used in resource assessments and mineral exploration. Globally based deposit models allow recognition of important features and demonstrate how common different features are. Well-designed deposit models allow geologists to deduce possible mineral-deposit types in a given geologic environment, and the grade and tonnage models allow economists to

  11. JAK/STAT signalling--an executable model assembled from molecule-centred modules demonstrating a module-oriented database concept for systems and synthetic biology.

    PubMed

    Blätke, Mary Ann; Dittrich, Anna; Rohr, Christian; Heiner, Monika; Schaper, Fred; Marwan, Wolfgang

    2013-06-01

    Mathematical models of molecular networks regulating biological processes in cells or organisms are most frequently designed as sets of ordinary differential equations. Various modularisation methods have been applied to reduce the complexity of models, to analyse their structural properties, to separate biological processes, or to reuse model parts. Taking the JAK/STAT signalling pathway, with the extensive combinatorial cross-talk of its components, as a case study, we take a natural approach to modularisation by creating one module for each biomolecule. Each module consists of a Petri net and associated metadata and is organised in a database publicly accessible through a web interface. The Petri net describes the reaction mechanism of a given biomolecule and its functional interactions with other components, including relevant conformational states. The database is designed to support the curation, documentation, version control, and update of individual modules, and to assist the user in automatically composing complex models from modules. Biomolecule-centred modules, associated metadata, and database support together allow the automatic creation of models by considering differential gene expression in given cell types or under certain physiological conditions or states of disease. Modularity also facilitates exploring the consequences of alternative molecular mechanisms by comparative simulation of automatically created models, even for users without mathematical skills. Models may be selectively executed as ODE, stochastic, qualitative, or hybrid models and exported in the SBML format. The fully automated generation of models of redesigned networks, by metadata-guided modification of modules representing biomolecules with mutated function or specificity, is proposed.
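
    A Petri net module of this kind reduces to places with token counts plus transitions with pre- and post-conditions. The sketch below implements that token game generically in Python; it illustrates the formalism only, and the JAK/STAT-flavored names are invented, not the database's module format.

        # Minimal place/transition Petri net: marking maps places to token counts
        marking = {"JAK_inactive": 1, "cytokine": 1, "JAK_active": 0, "STAT_p": 0}

        transitions = {
            # name: (tokens consumed, tokens produced)
            "activate_JAK":  ({"JAK_inactive": 1, "cytokine": 1}, {"JAK_active": 1}),
            "phosphorylate": ({"JAK_active": 1}, {"JAK_active": 1, "STAT_p": 1}),
        }

        def enabled(name):
            pre, _ = transitions[name]
            return all(marking[p] >= k for p, k in pre.items())

        def fire(name):
            pre, post = transitions[name]
            assert enabled(name), f"{name} is not enabled"
            for p, k in pre.items():
                marking[p] -= k
            for p, k in post.items():
                marking[p] = marking.get(p, 0) + k

        fire("activate_JAK"); fire("phosphorylate")
        print(marking)  # JAK consumed the cytokine signal and phosphorylated STAT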

  12. AgBase: supporting functional modeling in agricultural organisms

    PubMed Central

    McCarthy, Fiona M.; Gresham, Cathy R.; Buza, Teresia J.; Chouvarine, Philippe; Pillai, Lakshmi R.; Kumar, Ranjit; Ozkan, Seval; Wang, Hui; Manda, Prashanti; Arick, Tony; Bridges, Susan M.; Burgess, Shane C.

    2011-01-01

    AgBase (http://www.agbase.msstate.edu/) provides resources to facilitate modeling of functional genomics data and structural and functional annotation of agriculturally important animal, plant, microbe and parasite genomes. The website is redesigned to improve accessibility and ease of use, including improved search capabilities. Expanded capabilities include new dedicated pages for horse, cat, dog, cotton, rice and soybean. We currently provide 590 240 Gene Ontology (GO) annotations to 105 454 gene products in 64 different species, including GO annotations linked to transcripts represented on agricultural microarrays. For many of these arrays, this provides the only functional annotation available. GO annotations are available for download and we provide comprehensive, species-specific GO annotation files for 18 different organisms. The tools available at AgBase have been expanded and several existing tools improved based upon user feedback. One of seven new tools available at AgBase, GOModeler, supports hypothesis testing from functional genomics data. We host several associated databases and provide genome browsers for three agricultural pathogens. Moreover, we provide comprehensive training resources (including worked examples and tutorials) via links to Educational Resources at the AgBase website. PMID:21075795
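
    GO annotations of this kind are distributed in the tab-separated GAF format, where column 2 holds the gene product identifier and column 5 the GO term. The sketch below builds a product-to-terms index from such a file; a minimal sketch, with 'annotations.gaf' as a placeholder path.

        from collections import defaultdict

        def load_gaf(path):
            """Map each gene product ID to its set of GO term IDs."""
            go_by_product = defaultdict(set)
            with open(path) as fh:
                for line in fh:
                    if line.startswith("!"):  # comment/header lines
                        continue
                    cols = line.rstrip("\n").split("\t")
                    go_by_product[cols[1]].add(cols[4])  # DB object ID -> GO ID
            return go_by_product

        annotations = load_gaf("annotations.gaf")  # placeholder file name
        print(len(annotations), "annotated gene products")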

  13. A Calibration Database for Stellar Models of Asymptotic Giant Branch Stars

    NASA Astrophysics Data System (ADS)

    Dalcanton, Julianne

    2009-07-01

    Studies of galaxy formation and evolution rely increasingly on the interpretation and modeling of near-infrared observations. At these wavelengths, the brightest stars are intermediate-mass asymptotic giant branch (AGB) stars. These stars can contribute nearly 50% of the integrated luminosity at near-infrared and even optical wavelengths, particularly for the younger stellar populations characteristic of high-redshift galaxies (z > 1). AGB stars are also significant sources of dust and heavy elements. Accurate modeling of AGB stars is therefore of the utmost importance. The primary limitation facing current models is the lack of useful calibration data. Current models are tuned to match the properties of the AGB population in the Magellanic Clouds, and thus have only been calibrated in a very narrow range of sub-solar metallicities. Preliminary observations already suggest that the models are overestimating AGB lifetimes by factors of 2-3 at lower metallicities. At higher (solar) metallicities, there are no appropriate observations for calibrating the models. We propose a WFC3/IR SNAP survey of nearby galaxies to create a large database of AGB populations spanning the full range of metallicities and star formation histories. Because of their intrinsically red colors and dusty circumstellar envelopes, tracking the numbers and bolometric fluxes of AGB stars requires the NIR observations we propose here. The resulting observations of nearby galaxies with deep ACS imaging offer the opportunity to obtain large (100-1000's), complete samples of AGB stars at a single distance, in systems with well-constrained star formation histories and metallicities.

  14. A database of lumbar spinal mechanical behavior for validation of spinal analytical models.

    PubMed

    Stokes, Ian A F; Gardner-Morse, Mack

    2016-03-21

    Data from two experimental studies with eight specimens each of spinal motion segments and/or intervertebral discs are presented in a form that can be used for comparison with finite element model predictions. The data include the effect of compressive preload (0, 250 and 500 N) with quasistatic cyclic loading (0.0115 Hz) and the effect of loading frequency (1, 0.1, 0.01 and 0.001 Hz) with a physiological compressive preload (mean 642 N). Specimens were tested with displacements in each of six degrees of freedom (three translations and three rotations) about defined anatomical axes. The three forces and three moments in the corresponding axis system were recorded during each test. Linearized stiffness matrices were calculated that could be used in multi-segmental biomechanical models of the spine and these matrices were analyzed to determine whether off-diagonal terms and symmetry assumptions should be included. These databases of lumbar spinal mechanical behavior under physiological conditions quantify behaviors that should be present in finite element model simulations. The addition of more specimens to identify sources of variability associated with physical dimensions, degeneration, and other variables would be beneficial. Supplementary data provide the recorded data and Matlab® codes for reading files. Linearized stiffness matrices derived from the tests at different preloads revealed few significant unexpected off-diagonal terms and little evidence of significant matrix asymmetry.
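
    The symmetry question is easy to pose numerically: with six degrees of freedom the linearized behavior is a 6x6 stiffness matrix K, and the size of its asymmetric part measures how far a symmetry assumption would err. A minimal numpy sketch with an illustrative K, not the measured matrices:

        import numpy as np

        rng = np.random.default_rng(3)
        base = rng.normal(size=(6, 6))
        K = base + base.T + 0.05 * rng.normal(size=(6, 6))  # nearly symmetric 6x6

        sym = 0.5 * (K + K.T)    # symmetric part
        skew = 0.5 * (K - K.T)   # asymmetric part
        print(f"relative asymmetry: {np.linalg.norm(skew) / np.linalg.norm(K):.3f}")

        # Off-diagonal coupling terms relative to the diagonal stiffnesses
        off = sym - np.diag(np.diag(sym))
        print(f"off-diagonal share: {np.abs(off).sum() / np.abs(sym).sum():.3f}")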

  15. Towards Global QSAR Model Building for Acute Toxicity: Munro Database Case Study

    PubMed Central

    Chavan, Swapnil; Nicholls, Ian A.; Karlsson, Björn C. G.; Rosengren, Annika M.; Ballabio, Davide; Consonni, Viviana; Todeschini, Roberto

    2014-01-01

    A series of 436 Munro database chemicals were studied with respect to their corresponding experimental LD50 values to investigate the possibility of establishing a global QSAR model for acute toxicity. Dragon molecular descriptors were used for the QSAR model development and genetic algorithms were used to select descriptors better correlated with toxicity data. Toxic values were discretized in a qualitative class on the basis of the Globally Harmonized Scheme: the 436 chemicals were divided into 3 classes based on their experimental LD50 values: highly toxic, intermediate toxic and low to non-toxic. The k-nearest neighbor (k-NN) classification method was calibrated on 25 molecular descriptors and gave a non-error rate (NER) equal to 0.66 and 0.57 for internal and external prediction sets, respectively. Even if the classification performances are not optimal, the subsequent analysis of the selected descriptors and their relationship with toxicity levels constitute a step towards the development of a global QSAR model for acute toxicity. PMID:25302621

  16. Towards global QSAR model building for acute toxicity: Munro database case study.

    PubMed

    Chavan, Swapnil; Nicholls, Ian A; Karlsson, Björn C G; Rosengren, Annika M; Ballabio, Davide; Consonni, Viviana; Todeschini, Roberto

    2014-10-09

    A series of 436 Munro database chemicals were studied with respect to their corresponding experimental LD50 values to investigate the possibility of establishing a global QSAR model for acute toxicity. Dragon molecular descriptors were used for the QSAR model development and genetic algorithms were used to select descriptors better correlated with toxicity data. Toxic values were discretized in a qualitative class on the basis of the Globally Harmonized Scheme: the 436 chemicals were divided into 3 classes based on their experimental LD50 values: highly toxic, intermediate toxic and low to non-toxic. The k-nearest neighbor (k-NN) classification method was calibrated on 25 molecular descriptors and gave a non-error rate (NER) equal to 0.66 and 0.57 for internal and external prediction sets, respectively. Even if the classification performances are not optimal, the subsequent analysis of the selected descriptors and their relationship with toxicity levels constitute a step towards the development of a global QSAR model for acute toxicity.
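
    The workflow maps directly onto standard tooling: a k-NN classifier over selected descriptors, scored by non-error rate, taken here to be the mean of per-class accuracies. The sketch below uses synthetic descriptor data; it illustrates the method and is not a reproduction of the paper's model.

        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import balanced_accuracy_score

        rng = np.random.default_rng(4)
        X = rng.normal(size=(436, 25))             # 25 selected descriptors
        y = rng.integers(0, 3, size=436)           # 3 GHS-style toxicity classes
        y = np.where(X[:, 0] + X[:, 1] > 1, 2, y)  # give the classes some structure

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
        knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)

        # Non-error rate taken as balanced accuracy (mean of per-class accuracies)
        print("NER:", round(balanced_accuracy_score(y_te, knn.predict(X_te)), 2))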

  17. The Neotoma Paleoecology Database

    NASA Astrophysics Data System (ADS)

    Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

    2015-12-01

    The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and in developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provides real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out of date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community.
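
    Programmatic access of the kind described looks like the sketch below. We assume the public REST endpoint at api.neotomadb.org (version 2.0) and a JSON envelope with a 'data' field; both are assumptions to verify against the live API documentation before relying on them.

        import requests

        # Assumed endpoint and parameters; check the Neotoma API docs
        resp = requests.get(
            "https://api.neotomadb.org/v2.0/data/sites",
            params={"sitename": "%lake%", "limit": 5},
            timeout=30,
        )
        resp.raise_for_status()
        for site in resp.json().get("data", []):
            print(site.get("siteid"), site.get("sitename"))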

  18. Spectral Line-Shape Model to Replace the Voigt Profile in Spectroscopic Databases

    NASA Astrophysics Data System (ADS)

    Lisak, Daniel; Ngo, Ngoc Hoa; Tran, Ha; Hartmann, Jean-Michel

    2014-06-01

    The standard description of molecular line shapes in spectral databases and radiative transfer codes is based on the Voigt profile. It is well known that its simplified assumptions of free absorber motion and independence of collisional parameters from absorber velocity lead to systematic errors in the analysis of experimental spectra and the retrieval of gas concentration. We demonstrate1,2 that the partially correlated quadratic speed-dependent hard-collision profile3 (pCqSDHCP) is a good candidate to replace the Voigt profile in the next generations of spectroscopic databases. This profile takes into account the following physical effects: the Doppler broadening, the pressure broadening and shifting of the line, the velocity-changing collisions, the speed-dependence of pressure broadening and shifting, and correlations between velocity- and phase/state-changing collisions. The speed-dependence of pressure broadening and shifting is incorporated into the pCqSDHCP in the so-called quadratic approximation. The velocity-changing collisions lead to the Dicke narrowing effect; however, in many cases correlations between velocity- and phase/state-changing collisions may lead to an effective reduction of the observed Dicke narrowing. The hard-collision model of velocity-changing collisions is also known as the Nelkin-Ghatak or Rautian model. The applicability of the pCqSDHCP to different molecular systems was tested on calculated and experimental spectra of such molecules as H2, O2, CO2 and H2O over a wide span of pressures. For all considered systems, the pCqSDHCP is able to describe molecular spectra at least an order of magnitude better than the Voigt profile, with all fitted parameters being linear with pressure. In most cases the pCqSDHCP can reproduce the reference spectra down to 0.2% or better, which fulfills the requirements of the most demanding remote-sensing applications. An important advantage of the pCqSDHCP is that a fast algorithm for its computation has been developed4,5 and allows
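
    For reference, the baseline being replaced is cheap to compute: the Voigt profile is the real part of the Faddeeva function. The sketch below is a standard formulation using scipy, where sigma_d is the Gaussian (Doppler) standard deviation and gamma_l the Lorentzian half-width; it is not the pCqSDHCP algorithm cited above.

        import numpy as np
        from scipy.special import wofz

        def voigt(nu, nu0, sigma_d, gamma_l):
            """Voigt profile: Gaussian (sigma_d) convolved with Lorentzian (gamma_l)."""
            z = ((nu - nu0) + 1j * gamma_l) / (sigma_d * np.sqrt(2.0))
            return wofz(z).real / (sigma_d * np.sqrt(2.0 * np.pi))

        nu = np.linspace(-5, 5, 11)
        print(voigt(nu, nu0=0.0, sigma_d=0.8, gamma_l=0.3))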

  19. High Prevalence of Multistability of Rest States and Bursting in a Database of a Model Neuron

    PubMed Central

    Marin, Bóris; Barnett, William H.; Doloc-Mihu, Anca; Calabrese, Ronald L.; Cymbalyuk, Gennady S.

    2013-01-01

    Flexibility in neuronal circuits has its roots in the dynamical richness of their neurons. Depending on their membrane properties, single neurons can produce a plethora of activity regimes including silence, spiking and bursting. What is less appreciated is that these regimes can coexist with each other, so that a transient stimulus can cause a persistent change in the activity of a given neuron. Such multistability of neuronal dynamics has been shown in a variety of neurons under different modulatory conditions. It can play either a functional role or present a substrate for dynamical diseases. We considered a database of an isolated leech heart interneuron model that can display silent, tonic spiking and bursting regimes. We analyzed only the cases of endogenous bursters producing functional half-center oscillators (HCOs). Using a one-parameter bifurcation analysis in the leak conductance, we extended the database to include silent regimes (stationary states) and systematically classified cases for the coexistence of silent and bursting regimes. We showed that different cases could exhibit two stable depolarized stationary states and two hyperpolarized stationary states in addition to various spiking and bursting regimes. We analyzed all cases of endogenous bursters and found that 18% of the cases were multistable, exhibiting coexistence of stationary states and bursting. Moreover, 91% of the cases exhibited multistability in some range of the leak conductance. We also explored HCOs built of multistable neuron cases with coexisting stationary states and a bursting regime. In 96% of cases analyzed, the HCOs resumed normal alternating bursting after one of the neurons was reset to a stationary state, proving themselves robust against this perturbation. PMID:23505348