Science.gov

Sample records for kegg pathways database

  1. A systematic comparison of the MetaCyc and KEGG pathway databases

    PubMed Central

    2013-01-01

    Background The MetaCyc and KEGG projects have developed large metabolic pathway databases that are used for a variety of applications including genome analysis and metabolic engineering. We present a comparison of the compound, reaction, and pathway content of MetaCyc version 16.0 and a KEGG version downloaded on Feb-27-2012 to increase understanding of their relative sizes, their degree of overlap, and their scope. To assess their overlap, we must know the correspondences between compounds, reactions, and pathways in MetaCyc, and those in KEGG. We devoted significant effort to computational and manual matching of these entities, and we evaluated the accuracy of the correspondences. Results KEGG contains 179 module pathways versus 1,846 base pathways in MetaCyc; KEGG contains 237 map pathways versus 296 super pathways in MetaCyc. KEGG pathways contain 3.3 times as many reactions on average as do MetaCyc pathways, and the databases employ different conceptualizations of metabolic pathways. KEGG contains 8,692 reactions versus 10,262 for MetaCyc. 6,174 KEGG reactions are components of KEGG pathways versus 6,348 for MetaCyc. KEGG contains 16,586 compounds versus 11,991 for MetaCyc. 6,912 KEGG compounds act as substrates in KEGG reactions versus 8,891 for MetaCyc. MetaCyc contains a broader set of database attributes than does KEGG, such as relationships from a compound to enzymes that it regulates, identification of spontaneous reactions, and the expected taxonomic range of metabolic pathways. MetaCyc contains many pathways not found in KEGG, from plants, fungi, metazoa, and actinobacteria; KEGG contains pathways not found in MetaCyc, for xenobiotic degradation, glycan metabolism, and metabolism of terpenoids and polyketides. MetaCyc contains fewer unbalanced reactions, which facilitates metabolic modeling such as using flux-balance analysis. MetaCyc includes generic reactions that may be instantiated computationally. Conclusions KEGG contains significantly more

  2. EGENES: Transcriptome-Based Plant Database of Genes with Metabolic Pathway Information and Expressed Sequence Tag Indices in KEGG1[C][W][OA

    PubMed Central

    Masoudi-Nejad, Ali; Goto, Susumu; Jauregui, Ruy; Ito, Masumi; Kawashima, Shuichi; Moriya, Yuki; Endo, Takashi R.; Kanehisa, Minoru

    2007-01-01

    EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs) that was recently added to the KEGG suite of databases. It links plant genomic information with higher order functional information in a single database. It also provides gene indices for each genome. The genomic information in EGENES is a collection of EST contigs constructed from assembly of ESTs. Due to the extremely large genomes of plant species, the bulk collection of data such as ESTs is a quick way to capture a complete repertoire of genes expressed in an organism. Using ESTs for reconstructing metabolic pathways is a new expansion in KEGG and provides researchers with a new resource for species in which only EST sequences are available. Functional annotation in EGENES is a process of linking a set of genes/transcripts in each genome with a network of interacting molecules in the cell. EGENES is a multispecies, integrated resource consisting of genomic, chemical, and network information containing a complete set of building blocks (genes and molecules) and wiring diagrams (biological pathways) to represent cellular functions. Using EGENES, genome-based pathway annotation and EST-based annotation can now be compared and mutually validated. The ultimate goals of EGENES will be to: bring new plant species into KEGG by clustering and annotating ESTs; abstract knowledge and principles from large-scale plant EST data; and improve computational prediction of systems of higher complexity. EGENES will be updated at least once a year. EGENES is publicly available and is accessible by the following link or by KEGG's navigation system (http://www.genome.jp/kegg-bin/create_kegg_menu?category=plants_egenes). PMID:17468225

  3. Precise generation of systems biology models from KEGG pathways

    PubMed Central

    2013-01-01

    Background The KEGG PATHWAY database provides a plethora of pathways for a diversity of organisms. All pathway components are directly linked to other KEGG databases, such as KEGG COMPOUND or KEGG REACTION. Therefore, the pathways can be extended with an enormous amount of information and provide a foundation for initial structural modeling approaches. As a drawback, KGML-formatted KEGG pathways are primarily designed for visualization purposes and often omit important details for the sake of a clear arrangement of its entries. Thus, a direct conversion into systems biology models would produce incomplete and erroneous models. Results Here, we present a precise method for processing and converting KEGG pathways into initial metabolic and signaling models encoded in the standardized community pathway formats SBML (Levels 2 and 3) and BioPAX (Levels 2 and 3). This method involves correcting invalid or incomplete KGML content, creating complete and valid stoichiometric reactions, translating relations to signaling models and augmenting the pathway content with various information, such as cross-references to Entrez Gene, OMIM, UniProt ChEBI, and many more. Finally, we compare several existing conversion tools for KEGG pathways and show that the conversion from KEGG to BioPAX does not involve a loss of information, whilst lossless translations to SBML can only be performed using SBML Level 3, including its recently proposed qualitative models and groups extension packages. Conclusions Building correct BioPAX and SBML signaling models from the KEGG database is a unique characteristic of the proposed method. Further, there is no other approach that is able to appropriately construct metabolic models from KEGG pathways, including correct reactions with stoichiometry. The resulting initial models, which contain valid and comprehensive SBML or BioPAX code and a multitude of cross-references, lay the foundation to facilitate further modeling steps. PMID:23433509

  4. Conversion of KEGG metabolic pathways to SBGN maps including automatic layout

    PubMed Central

    2013-01-01

    Background Biologists make frequent use of databases containing large and complex biological networks. One popular database is the Kyoto Encyclopedia of Genes and Genomes (KEGG) which uses its own graphical representation and manual layout for pathways. While some general drawing conventions exist for biological networks, arbitrary graphical representations are very common. Recently, a new standard has been established for displaying biological processes, the Systems Biology Graphical Notation (SBGN), which aims to unify the look of such maps. Ideally, online repositories such as KEGG would automatically provide networks in a variety of notations including SBGN. Unfortunately, this is non‐trivial, since converting between notations may add, remove or otherwise alter map elements so that the existing layout cannot be simply reused. Results Here we describe a methodology for automatic translation of KEGG metabolic pathways into the SBGN format. We infer important properties of the KEGG layout and treat these as layout constraints that are maintained during the conversion to SBGN maps. Conclusions This allows for the drawing and layout conventions of SBGN to be followed while creating maps that are still recognizably the original KEGG pathways. This article details the steps in this process and provides examples of the final result. PMID:23953132

  5. Computationally identifying virulence factors based on KEGG pathways.

    PubMed

    Cui, Weiren; Chen, Lei; Huang, Tao; Gao, Qian; Jiang, Min; Zhang, Ning; Zheng, Lulu; Feng, Kaiyan; Cai, Yudong; Wang, Hongwei

    2013-06-01

    Virulence factors are molecules that play very important roles in enhancing the pathogen's capability in causing diseases. Many efforts were made to investigate the mechanism of virulence factors using in silico methods. In this study, we present a novel computational method to predict virulence factors by integrating protein-protein interactions in a STRING database and biological pathways in the KEGG. Three specific species were studied according to their records in the VFDB. They are Campylobacter jejuni NCTC 11168, Escherichia coli O6 : K15 : H31 536 (UPEC) and Pseudomonas aeruginosa PAO1. The prediction accuracies reached were 0.9467, 0.9575 and 0.9180, respectively. Metabolism pathways, flagellar assembly and chemotaxis may be of importance for virulence based on the analysis of the optimal feature sets we obtained. We hope this can provide some insight and guidance for related research.

  6. Drug-Path: a database for drug-induced pathways.

    PubMed

    Zeng, Hui; Qiu, Chengxiang; Cui, Qinghua

    2015-01-01

    Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced lots of drug-induced gene expression profiles. Interestingly, drug-induced gene expression profile frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways contribute to study the mechanisms of drugs and drug-repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related researches.

  7. Putative drug and vaccine target protein identification using comparative genomic analysis of KEGG annotated metabolic pathways of Mycoplasma hyopneumoniae.

    PubMed

    Damte, Dereje; Suh, Joo-Won; Lee, Seung-Jin; Yohannes, Sileshi Belew; Hossain, Md Akil; Park, Seung-Chun

    2013-07-01

    In the present study, a computational comparative and subtractive genomic/proteomic analysis aimed at the identification of putative therapeutic target and vaccine candidate proteins from Kyoto Encyclopedia of Genes and Genomes (KEGG) annotated metabolic pathways of Mycoplasma hyopneumoniae was performed for drug design and vaccine production pipelines against M.hyopneumoniae. The employed comparative genomic and metabolic pathway analysis with a predefined computational systemic workflow extracted a total of 41 annotated metabolic pathways from KEGG among which five were unique to M. hyopneumoniae. A total of 234 proteins were identified to be involved in these metabolic pathways. Although 125 non homologous and predicted essential proteins were found from the total that could serve as potential drug targets and vaccine candidates, additional prioritizing parameters characterize 21 proteins as vaccine candidate while druggability of each of the identified proteins evaluated by the DrugBank database prioritized 42 proteins suitable for drug targets.

  8. Pathway Interaction Database (PID) —

    Cancer.gov

    The National Cancer Institute (NCI) in collaboration with Nature Publishing Group has established the Pathway Interaction Database (PID) in order to provide a highly structured, curated collection of information about known biomolecular interactions and key cellular processes assembled into signaling pathways.

  9. Microarray and synchronization of neuronal differentiation with pathway changes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) databank in nerve growth factor-treated PC12 cells.

    PubMed

    Lin, Chih-Ming; Feng, Wayne

    2012-08-01

    The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database creates networks from interrelations between molecular biology and underlying chemical elements. This allows for analysis of biologic networks, genomic information, and higher-order functional information at a systems level. We performed microarray experiments and used the KEGG database, systems biology analysis, and annotation of pathway function to study nerve growth factor (NGF)-induced differentiation of PC12 cells. Cells were cultured to 70%-80% confluence, treated with NGF for 1 or 3 hours (h), and RNA was extracted. Stage 1 data analysis involved analysis of variance (ANOVA), and stage 2 involved cluster analysis and heat map generation. We identified 2020 NGF-induced PC12 genes (1038 at 1 h and 1554 at 3 h). Results showed changes in gene expression over time. We compared these genes with 6035 genes from the KEGG database. Cross-matching resulted in 830 genes. Among these, we identified 395 altered genes (155 at 1 h and 301 at 3 h; 2-fold increase from 1 h to 3 h). We identified 191 biologic pathways in the KEGG database; the top 15 showed correlations with neuronal differentiation (mitogen-activated protein kinase [MAPK] pathway: 35 genes at 1 h, 54 genes at 3 h; genes associated with axonal guidance: 12 at 1 h, 26 at 3 h; Wnt pathway: 16 at 1 h, 25 at 3 h; neurotrophin pathway: 4 at 1 h, 14 at 3 h). Thus, we identified changes in neuronal differentiation pathways with the KEGG database, which were synchronized with NGF-induced differentiation.

  10. MPW : the metabolic pathways database.

    SciTech Connect

    Selkov, E., Jr.; Grechkin, Y.; Mikhailova, N.; Selkov, E.; Mathematics and Computer Science; Russian Academy of Sciences

    1998-01-01

    The Metabolic Pathways Database (MPW) (www.biobase.com/emphome.html/homepage. html.pags/pathways.html) a derivative of EMP (www.biobase.com/EMP) plays a fundamental role in the technology of metabolic reconstructions from sequenced genomes under the PUMA (www.mcs.anl.gov/home/compbio/PUMA/Production/ ReconstructedMetabolism/reconstruction.html), WIT (www.mcs.anl.gov/home/compbio/WIT/wit.html ) and WIT2 (beauty.isdn.msc.anl.gov/WIT2.pub/CGI/user.cgi) systems. In October 1997, it included some 2800 pathway diagrams covering primary and secondary metabolism, membrane transport, signal transduction pathways, intracellular traffic, translation and transcription. In the current public release of MPW (beauty.isdn.mcs.anl.gov/MPW), the encoding is based on the logical structure of the pathways and is represented by the objects commonly used in electronic circuit design. This facilitates drawing and editing the diagrams and makes possible automation of the basic simulation operations such as deriving stoichiometric matrices, rate laws, and, ultimately, dynamic models of metabolic pathways. Individual pathway diagrams, automatically derived from the original ASCII records, are stored as SGML instances supplemented by relational indices. An auxiliary database of compound names and structures, encoded in the SMILES format, is maintained to unambiguously connect the pathways to the chemical structures of their intermediates.

  11. SMPDB: The Small Molecule Pathway Database.

    PubMed

    Frolkis, Alex; Knox, Craig; Lim, Emilia; Jewison, Timothy; Law, Vivian; Hau, David D; Liu, Phillip; Gautam, Bijaya; Ly, Son; Guo, An Chi; Xia, Jianguo; Liang, Yongjie; Shrivastava, Savita; Wishart, David S

    2010-01-01

    The Small Molecule Pathway Database (SMPDB) is an interactive, visual database containing more than 350 small-molecule pathways found in humans. More than 2/3 of these pathways (>280) are not found in any other pathway database. SMPDB is designed specifically to support pathway elucidation and pathway discovery in clinical metabolomics, transcriptomics, proteomics and systems biology. SMPDB provides exquisitely detailed, hyperlinked diagrams of human metabolic pathways, metabolic disease pathways, metabolite signaling pathways and drug-action pathways. All SMPDB pathways include information on the relevant organs, organelles, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Each small molecule is hyperlinked to detailed descriptions contained in the Human Metabolome Database (HMDB) or DrugBank and each protein or enzyme complex is hyperlinked to UniProt. All SMPDB pathways are accompanied with detailed descriptions, providing an overview of the pathway, condition or processes depicted in each diagram. The database is easily browsed and supports full text searching. Users may query SMPDB with lists of metabolite names, drug names, genes/protein names, SwissProt IDs, GenBank IDs, Affymetrix IDs or Agilent microarray IDs. These queries will produce lists of matching pathways and highlight the matching molecules on each of the pathway diagrams. Gene, metabolite and protein concentration data can also be visualized through SMPDB's mapping interface. All of SMPDB's images, image maps, descriptions and tables are downloadable. SMPDB is available at: http://www.smpdb.ca. PMID:19948758

  12. IPAD: the Integrated Pathway Analysis Database for Systematic Enrichment Analysis

    PubMed Central

    2012-01-01

    ) cross-linking of multiple available data sources. IPAD is a comprehensive database covering about 22,498 genes, 25,469 proteins, 1956 pathways, 6704 diseases, 5615 drugs, and 52 organs integrated from databases including the BioCarta, KEGG, NCI-Nature curated, Reactome, CTD, PharmGKB, DrugBank, PharmGKB, and HOMER. The database has a web-based user interface that allows users to perform enrichment analysis from genes/proteins/molecules and inter-association analysis from a pathway, disease, drug, and organ. Moreover, the quality of the database was validated with the context of the existing biological knowledge and a "gold standard" constructed from reputable and reliable sources. Two case studies were also presented to demonstrate: 1) self-validation of enrichment analysis and inter-association analysis on brain-specific markers, and 2) identification of previously undiscovered components by the enrichment analysis from a prostate cancer study. Conclusions IPAD is a new resource for analyzing, identifying, and validating pathway, disease, drug, organ specificity and their inter-associations. The statistical method we developed for enrichment and similarity measurement and the two criteria we described for setting the threshold parameters can be extended to other enrichment applications. Enriched pathways, diseases, drugs, organs and their inter-associations can be searched, displayed, and downloaded from our online user interface. The current IPAD database can help users address a wide range of biological pathway related, disease susceptibility related, drug target related and organ specificity related questions in human disease studies. PMID:23046449

  13. KEGG: Kyoto Encyclopedia of Genes and Genomes.

    PubMed

    Ogata, H; Goto, S; Sato, K; Fujibuchi, W; Bono, H; Kanehisa, M

    1999-01-01

    Kyoto Encyclopedia of Genes and Genomes (KEGG) is a knowledge base for systematic analysis of gene functions in terms of the networks of genes and molecules. The major component of KEGG is the PATHWAY database that consists of graphical diagrams of biochemical pathways including most of the known metabolic pathways and some of the known regulatory pathways. The pathway information is also represented by the ortholog group tables summarizing orthologous and paralogous gene groups among different organisms. KEGG maintains the GENES database for the gene catalogs of all organisms with complete genomes and selected organisms with partial genomes, which are continuously re-annotated, as well as the LIGAND database for chemical compounds and enzymes. Each gene catalog is associated with the graphical genome map for chromosomal locations that is represented by Java applet. In addition to the data collection efforts, KEGG develops and provides various computational tools, such as for reconstructing biochemical pathways from the complete genome sequence and for predicting gene regulatory networks from the gene expression profiles. The KEGG databases are daily updated and made freely available (http://www.genome.ad.jp/kegg/).

  14. hiPathDB: a human-integrated pathway database with facile visualization

    PubMed Central

    Yu, Namhee; Seo, Jihae; Rho, Kyoohyoung; Jang, Yeongjun; Park, Jinah; Kim, Wan Kyu; Lee, Sanghyuk

    2012-01-01

    One of the biggest challenges in the study of biological regulatory networks is the systematic organization and integration of complex interactions taking place within various biological pathways. Currently, the information of the biological pathways is dispersed in multiple databases in various formats. hiPathDB is an integrated pathway database that combines the curated human pathway data of NCI-Nature PID, Reactome, BioCarta and KEGG. In total, it includes 1661 pathways consisting of 8976 distinct physical entities. hiPathDB provides two different types of integration. The pathway-level integration, conceptually a simple collection of individual pathways, was achieved by devising an elaborate model that takes distinct features of four databases into account and subsequently reformatting all pathways in accordance with our model. The entity-level integration creates a single unified pathway that encompasses all pathways by merging common components. Even though the detailed molecular-level information such as complex formation or post-translational modifications tends to be lost, such integration makes it possible to investigate signaling network over the entire pathways and allows identification of pathway cross-talks. Another strong merit of hiPathDB is the built-in pathway visualization module that supports explorative studies of complex networks in an interactive fashion. The layout algorithm is optimized for virtually automatic visualization of the pathways. hiPathDB is available at http://hiPathDB.kobic.re.kr. PMID:22123737

  15. Towards rule-based metabolic databases: a requirement analysis based on KEGG.

    PubMed

    Richter, Stephan; Fetzer, Ingo; Thullner, Martin; Centler, Florian; Dittrich, Peter

    2015-01-01

    Knowledge of metabolic processes is collected in easily accessable online databases which are increasing rapidly in content and detail. Using these databases for the automatic construction of metabolic network models requires high accuracy and consistency. In this bipartite study we evaluate current accuracy and consistency problems using the KEGG database as a prominent example and propose design principles for dealing with such problems. In the first half, we present our computational approach for classifying inconsistencies and provide an overview of the classes of inconsistencies we identified. We detected inconsistencies both for database entries referring to substances and entries referring to reactions. In the second part, we present strategies to deal with the detected problem classes. We especially propose a rule-based database approach which allows for the inclusion of parameterised molecular species and parameterised reactions. Detailed case-studies and a comparison of explicit networks from KEGG with their anticipated rule-based representation underline the applicability and scalability of this approach. PMID:26547981

  16. KEGG-PATH: Kyoto encyclopedia of genes and genomes-based pathway analysis using a path analysis model.

    PubMed

    Du, Junli; Yuan, Zhifa; Ma, Ziwei; Song, Jiuzhou; Xie, Xiaoli; Chen, Yulin

    2014-07-29

    The dynamic impact approach (DIA) represents an alternative to overrepresentation analysis (ORA) for functional analysis of time-course experiments or those involving multiple treatments. The DIA can be used to estimate the biological impact of the differentially expressed genes (DEGs) associated with particular biological functions, for example, as represented by the Kyoto encyclopedia of genes and genomes (KEGG) annotations. However, the DIA does not take into account the correlated dependence structure of the KEGG pathway hierarchy. We have developed herein a path analysis model (KEGG-PATH) to subdivide the total effect of each KEGG pathway into the direct effect and indirect effect by taking into account not only each KEGG pathway itself, but also the correlation with its related pathways. In addition, this work also attempts to preliminarily estimate the impact direction of each KEGG pathway by a gradient analysis method from principal component analysis (PCA). As a result, the advantage of the KEGG-PATH model is demonstrated through the functional analysis of the bovine mammary transcriptome during lactation.

  17. KEGG: kyoto encyclopedia of genes and genomes.

    PubMed

    Kanehisa, M; Goto, S

    2000-01-01

    KEGG (Kyoto Encyclopedia of Genes and Genomes) is a knowledge base for systematic analysis of gene functions, linking genomic information with higher order functional information. The genomic information is stored in the GENES database, which is a collection of gene catalogs for all the completely sequenced genomes and some partial genomes with up-to-date annotation of gene functions. The higher order functional information is stored in the PATHWAY database, which contains graphical representations of cellular processes, such as metabolism, membrane transport, signal transduction and cell cycle. The PATHWAY database is supplemented by a set of ortholog group tables for the information about conserved subpathways (pathway motifs), which are often encoded by positionally coupled genes on the chromosome and which are especially useful in predicting gene functions. A third database in KEGG is LIGAND for the information about chemical compounds, enzyme molecules and enzymatic reactions. KEGG provides Java graphics tools for browsing genome maps, comparing two genome maps and manipulating expression maps, as well as computational tools for sequence comparison, graph comparison and path computation. The KEGG databases are daily updated and made freely available (http://www. genome.ad.jp/kegg/).

  18. KEGG for representation and analysis of molecular networks involving diseases and drugs.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Furumichi, Miho; Tanabe, Mao; Hirakawa, Mika

    2010-01-01

    Most human diseases are complex multi-factorial diseases resulting from the combination of various genetic and environmental factors. In the KEGG database resource (http://www.genome.jp/kegg/), diseases are viewed as perturbed states of the molecular system, and drugs as perturbants to the molecular system. Disease information is computerized in two forms: pathway maps and gene/molecule lists. The KEGG PATHWAY database contains pathway maps for the molecular systems in both normal and perturbed states. In the KEGG DISEASE database, each disease is represented by a list of known disease genes, any known environmental factors at the molecular level, diagnostic markers and therapeutic drugs, which may reflect the underlying molecular system. The KEGG DRUG database contains chemical structures and/or chemical components of all drugs in Japan, including crude drugs and TCM (Traditional Chinese Medicine) formulas, and drugs in the USA and Europe. This database also captures knowledge about two types of molecular networks: the interaction network with target molecules, metabolizing enzymes, other drugs, etc. and the chemical structure transformation network in the history of drug development. The new disease/drug information resource named KEGG MEDICUS can be used as a reference knowledge base for computational analysis of molecular networks, especially, by integrating large-scale experimental datasets.

  19. The Use of Gene Ontology Term and KEGG Pathway Enrichment for Analysis of Drug Half-Life

    PubMed Central

    Chen, Lei; Lu, Jing; Kong, XiangYin; Huang, Tao; Li, HaiPeng

    2016-01-01

    A drug’s biological half-life is defined as the time required for the human body to metabolize or eliminate 50% of the initial drug dosage. Correctly measuring the half-life of a given drug is helpful for the safe and accurate usage of the drug. In this study, we investigated which gene ontology (GO) terms and biological pathways were highly related to the determination of drug half-life. The investigated drugs, with known half-lives, were analyzed based on their enrichment scores for associated GO terms and KEGG pathways. These scores indicate which GO terms or KEGG pathways the drug targets. The feature selection method, minimum redundancy maximum relevance, was used to analyze these GO terms and KEGG pathways and to identify important GO terms and pathways, such as sodium-independent organic anion transmembrane transporter activity (GO:0015347), monoamine transmembrane transporter activity (GO:0008504), negative regulation of synaptic transmission (GO:0050805), neuroactive ligand-receptor interaction (hsa04080), serotonergic synapse (hsa04726), and linoleic acid metabolism (hsa00591), among others. This analysis confirmed our results and may show evidence for a new method in studying drug half-lives and building effective computational methods for the prediction of drug half-lives. PMID:27780226

  20. SPIKE: a database of highly curated human signaling pathways.

    PubMed

    Paz, Arnon; Brownstein, Zippora; Ber, Yaara; Bialik, Shani; David, Eyal; Sagir, Dorit; Ulitsky, Igor; Elkon, Ran; Kimchi, Adi; Avraham, Karen B; Shiloh, Yosef; Shamir, Ron

    2011-01-01

    The rapid accumulation of knowledge on biological signaling pathways and their regulatory mechanisms has highlighted the need for specific repositories that can store, organize and allow retrieval of pathway information in a way that will be useful for the research community. SPIKE (Signaling Pathways Integrated Knowledge Engine; http://www.cs.tau.ac.il/&~spike/) is a database for achieving this goal, containing highly curated interactions for particular human pathways, along with literature-referenced information on the nature of each interaction. To make database population and pathway comprehension straightforward, a simple yet informative data model is used, and pathways are laid out as maps that reflect the curator’s understanding and make the utilization of the pathways easy. The database currently focuses primarily on pathways describing DNA damage response, cell cycle, programmed cell death and hearing related pathways. Pathways are regularly updated, and additional pathways are gradually added. The complete database and the individual maps are freely exportable in several formats. The database is accompanied by a stand-alone software tool for analysis and dynamic visualization of pathways.

  1. KaPPA-View4: a metabolic pathway database for representation and analysis of correlation networks of gene co-expression and metabolite co-accumulation and omics data.

    PubMed

    Sakurai, Nozomu; Ara, Takeshi; Ogata, Yoshiyuki; Sano, Ryosuke; Ohno, Takashi; Sugiyama, Kenjiro; Hiruta, Atsushi; Yamazaki, Kiyoshi; Yano, Kentaro; Aoki, Koh; Aharoni, Asaph; Hamada, Kazuki; Yokoyama, Koji; Kawamura, Shingo; Otsuka, Hirofumi; Tokimatsu, Toshiaki; Kanehisa, Minoru; Suzuki, Hideyuki; Saito, Kazuki; Shibata, Daisuke

    2011-01-01

    Correlations of gene-to-gene co-expression and metabolite-to-metabolite co-accumulation calculated from large amounts of transcriptome and metabolome data are useful for uncovering unknown functions of genes, functional diversities of gene family members and regulatory mechanisms of metabolic pathway flows. Many databases and tools are available to interpret quantitative transcriptome and metabolome data, but there are only limited ones that connect correlation data to biological knowledge and can be utilized to find biological significance of it. We report here a new metabolic pathway database, KaPPA-View4 (http://kpv.kazusa.or.jp/kpv4/), which is able to overlay gene-to-gene and/or metabolite-to-metabolite relationships as curves on a metabolic pathway map, or on a combination of up to four maps. This representation would help to discover, for example, novel functions of a transcription factor that regulates genes on a metabolic pathway. Pathway maps of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and maps generated from their gene classifications are available at KaPPA-View4 KEGG version (http://kpv.kazusa.or.jp/kpv4-kegg/). At present, gene co-expression data from the databases ATTED-II, COXPRESdb, CoP and MiBASE for human, mouse, rat, Arabidopsis, rice, tomato and other plants are available. PMID:21097783

  2. Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data.

    PubMed

    Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/.

  3. Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data

    PubMed Central

    Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020

  4. Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data.

    PubMed

    Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/. PMID:26363020

  5. The KEGG resource for deciphering the genome.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database.

  6. The KEGG resource for deciphering the genome.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Kawashima, Shuichi; Okuno, Yasushi; Hattori, Masahiro

    2004-01-01

    A grand challenge in the post-genomic era is a complete computer representation of the cell and the organism, which will enable computational prediction of higher-level complexity of cellular processes and organism behavior from genomic information. Toward this end we have been developing a knowledge-based approach for network prediction, which is to predict, given a complete set of genes in the genome, the protein interaction networks that are responsible for various cellular processes. KEGG at http://www.genome.ad.jp/kegg/ is the reference knowledge base that integrates current knowledge on molecular interaction networks such as pathways and complexes (PATHWAY database), information about genes and proteins generated by genome projects (GENES/SSDB/KO databases) and information about biochemical compounds and reactions (COMPOUND/GLYCAN/REACTION databases). These three types of database actually represent three graph objects, called the protein network, the gene universe and the chemical universe. New efforts are being made to abstract knowledge, both computationally and manually, about ortholog clusters in the KO (KEGG Orthology) database, and to collect and analyze carbohydrate structures in the GLYCAN database. PMID:14681412

  7. Sample Level Enrichment Analysis of KEGG Pathways Identifies Clinically Relevant Subtypes of Glioblastoma

    PubMed Central

    Wanggou, Siyi; Feng, Chengyuan; Xie, Yuanyang; Ye, Linrong; Wang, Feiyifan; Li, Xuejun

    2016-01-01

    Background: Glioblastoma is the most lethal primary brain tumor in adults. Aberrant signal transduction pathways, associated with the progression of glioblastoma, have been identified recently and may offer a potential gene therapy strategy. Methods and Findings: We first used the sample level enrichment analysis to transfer gene expression profile of TCGA dataset into pathway enrichment z-score matrix. Then, we classified glioblastoma into five subtypes (Cluster A to Cluster E) by the consensus clustering and silhouette analysis. Principle component analysis showed the five subtype could be separated by first three principle components. Integrative omics data showed that mesenchymal subtype was rich in Cluster A, neural subtype was centered in Cluster D and proneural subtype was gathered in Cluster E, while Cluster E showed a high percentage of G-CIMP subtype. Additionally, according to analyze the overall survival and progression free survival of each subtype by Kaplan-Merie analysis and Cox hazard proportion model, we identified Cluster D and Cluster E received a better prognosis. Conclusions: We report a clinically relevant classification of glioblastoma based on sample level KEGG pathway enrichment profile and this novel classification system provided new insights into the heterogeneity of glioblastoma, and may be used as an important clinical tool to predict the prognosis.

  8. Available pathways database (APD): an essential resource for combinatorial biology.

    PubMed

    Pirrung, M C; Silva, C M; Jaeger, J

    2000-10-01

    A relational database, the Available Pathways Database (APD), has been constructed of microbial natural products, their producing strains, and their biosynthetic pathways. The database allows the ready selection of donor strains for combinatorial biology experiments. It provides the same type of resource for combinatorial biology as the Available Chemicals Directory (ACD) does for combinatorial chemical library generation. Its cataloging ability can also provide insight into novel aspects of biosynthetic routes. In particular, no 10-unit Type I polyketides were found in the compilation of this edition of the APD (Version I). PMID:11076562

  9. Pathway Analysis for Drug Repositioning Based on Public Database Mining

    PubMed Central

    2015-01-01

    Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public databases including the MMDB, PubChem BioAssay, GEO DataSets, and the BioSystems databases. Entrez E-Utilities were applied, and in-house Ruby scripts were developed for data retrieval and pathway analysis to identify and evaluate relevant pathways common to the retrieved drug targets. Pathways pertinent to clinical uses or MOAs were obtained for most drugs. Interestingly, some drugs identified pathways responsible for other diseases than their current therapeutic uses, and these pathways were verified retrospectively by in vitro tests, in vivo tests, or clinical trials. The pathway enrichment analysis based on drug target information from public databases could provide a novel approach for elucidating drug MOAs and repositioning, therefore benefiting the discovery of new therapeutic treatments for diseases. PMID:24460210

  10. BlastKOALA and GhostKOALA: KEGG Tools for Functional Characterization of Genome and Metagenome Sequences.

    PubMed

    Kanehisa, Minoru; Sato, Yoko; Morishima, Kanae

    2016-02-22

    BlastKOALA and GhostKOALA are automatic annotation servers for genome and metagenome sequences, which perform KO (KEGG Orthology) assignments to characterize individual gene functions and reconstruct KEGG pathways, BRITE hierarchies and KEGG modules to infer high-level functions of the organism or the ecosystem. Both servers are made freely available at the KEGG Web site (http://www.kegg.jp/blastkoala/). In BlastKOALA, the KO assignment is performed by a modified version of the internally used KOALA algorithm after the BLAST search against a non-redundant dataset of pangenome sequences at the species, genus or family level, which is generated from the KEGG GENES database by retaining the KO content of each taxonomic category. In GhostKOALA, which utilizes more rapid GHOSTX for database search and is suitable for metagenome annotation, the pangenome dataset is supplemented with Cd-hit clusters including those for viral genes. The result files may be downloaded and manipulated for further KEGG Mapper analysis, such as comparative pathway analysis using multiple BlastKOALA results. PMID:26585406

  11. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.).

    PubMed

    Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A

    2016-01-01

    The present investigation was carried out aiming to use the bioinformatics tools in order to identify and characterize, simple sequence repeats within the third Version of the date palm genome and develop a new SSR primers database. In addition single nucleotide polymorphisms (SNPs) that are located within the SSR flanking regions were recognized. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interaction were determined. A total of 172,075 SSR motifs was identified on date palm genome sequence with a frequency of 450.97 SSRs per Mb. Out of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb. While, only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 of SSR primer pairs were designed, that represents 291.9 SSR primers per Mb. Out of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A number of 250,507 SNPs were recognized in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using KEGG data base. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. Concerning in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers have successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars. PMID:27434138

  12. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.).

    PubMed

    Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A

    2016-01-01

    The present investigation was carried out aiming to use the bioinformatics tools in order to identify and characterize, simple sequence repeats within the third Version of the date palm genome and develop a new SSR primers database. In addition single nucleotide polymorphisms (SNPs) that are located within the SSR flanking regions were recognized. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interaction were determined. A total of 172,075 SSR motifs was identified on date palm genome sequence with a frequency of 450.97 SSRs per Mb. Out of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb. While, only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 of SSR primer pairs were designed, that represents 291.9 SSR primers per Mb. Out of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A number of 250,507 SNPs were recognized in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using KEGG data base. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. Concerning in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers have successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.

  13. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.)

    PubMed Central

    2016-01-01

    The present investigation was carried out aiming to use the bioinformatics tools in order to identify and characterize, simple sequence repeats within the third Version of the date palm genome and develop a new SSR primers database. In addition single nucleotide polymorphisms (SNPs) that are located within the SSR flanking regions were recognized. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interaction were determined. A total of 172,075 SSR motifs was identified on date palm genome sequence with a frequency of 450.97 SSRs per Mb. Out of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb. While, only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 of SSR primer pairs were designed, that represents 291.9 SSR primers per Mb. Out of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A number of 250,507 SNPs were recognized in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using KEGG data base. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. Concerning in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers have successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars. PMID:27434138

  14. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

    PubMed Central

    Caspi, Ron; Altman, Tomer; Dreher, Kate; Fulcher, Carol A.; Subhraveti, Pallavi; Keseler, Ingrid M.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Ong, Quang; Paley, Suzanne; Pujar, Anuradha; Shearer, Alexander G.; Travers, Michael; Weerasinghe, Deepika; Zhang, Peifen; Karp, Peter D.

    2012-01-01

    The MetaCyc database (http://metacyc.org/) provides a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains more than 1800 pathways derived from more than 30 000 publications, and is the largest curated collection of metabolic pathways currently available. Most reactions in MetaCyc pathways are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes and literature citations. BioCyc (http://biocyc.org/) is a collection of more than 1700 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference database, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs contain additional features, including predicted operons, transport systems and pathway-hole fillers. The BioCyc website and Pathway Tools software offer many tools for querying and analysis of PGDBs, including Omics Viewers and comparative analysis. New developments include a zoomable web interface for diagrams; flux-balance analysis model generation from PGDBs; web services; and a new tool called Web Groups. PMID:22102576

  15. KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

    PubMed Central

    2013-01-01

    Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of

  16. INOH: ontology-based highly structured database of signal transduction pathways

    PubMed Central

    Sakai, Noriko; Nakamura, Hiromi; Fukagawa, Hiroshi; Fukuda, Ken; Takagi, Toshihisa

    2011-01-01

    The Integrating Network Objects with Hierarchies (INOH) database is a highly structured, manually curated database of signal transduction pathways including Mammalia, Xenopus laevis, Drosophila melanogaster, Caenorhabditis elegans and canonical. Since most pathway knowledge resides in scientific articles, the database focuses on curating and encoding textual knowledge into a machine-processable form. We use a hierarchical pathway representation model with a compound graph, and every pathway component in the INOH database is annotated by a set of uniquely developed ontologies. Finally, we developed the Similarity Search using the combination of a compound graph and hierarchical ontologies. The INOH database is to be a good resource for many users who want to analyze a large protein network. INOH ontologies and 73 signal transduction and 29 metabolic pathway diagrams (including over 6155 interactions and 3395 protein entities) are freely available in INOH XML and BioPAX formats. Database URL: http://www.inoh.org/ PMID:22120663

  17. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases.

    PubMed

    Caspi, Ron; Billington, Richard; Ferrer, Luciana; Foerster, Hartmut; Fulcher, Carol A; Keseler, Ingrid M; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S; Karp, Peter D

    2016-01-01

    The MetaCyc database (MetaCyc.org) is a freely accessible comprehensive database describing metabolic pathways and enzymes from all domains of life. The majority of MetaCyc pathways are small-molecule metabolic pathways that have been experimentally determined. MetaCyc contains more than 2400 pathways derived from >46,000 publications, and is the largest curated collection of metabolic pathways. BioCyc (BioCyc.org) is a collection of 5700 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems, and pathway-hole fillers. The BioCyc website offers a variety of tools for querying and analyzing PGDBs, including Omics Viewers and tools for comparative analysis. This article provides an update of new developments in MetaCyc and BioCyc during the last two years, including addition of Gibbs free energy values for compounds and reactions; redesign of the primary gene/protein page; addition of a tool for creating diagrams containing multiple linked pathways; several new search capabilities, including searching for genes based on sequence patterns, searching for databases based on an organism's phenotypes, and a cross-organism search; and a metabolite identifier translation service.

  18. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

    PubMed Central

    Caspi, Ron; Billington, Richard; Ferrer, Luciana; Foerster, Hartmut; Fulcher, Carol A.; Keseler, Ingrid M.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S.; Karp, Peter D.

    2016-01-01

    The MetaCyc database (MetaCyc.org) is a freely accessible comprehensive database describing metabolic pathways and enzymes from all domains of life. The majority of MetaCyc pathways are small-molecule metabolic pathways that have been experimentally determined. MetaCyc contains more than 2400 pathways derived from >46 000 publications, and is the largest curated collection of metabolic pathways. BioCyc (BioCyc.org) is a collection of 5700 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems, and pathway-hole fillers. The BioCyc website offers a variety of tools for querying and analyzing PGDBs, including Omics Viewers and tools for comparative analysis. This article provides an update of new developments in MetaCyc and BioCyc during the last two years, including addition of Gibbs free energy values for compounds and reactions; redesign of the primary gene/protein page; addition of a tool for creating diagrams containing multiple linked pathways; several new search capabilities, including searching for genes based on sequence patterns, searching for databases based on an organism's phenotypes, and a cross-organism search; and a metabolite identifier translation service. PMID:26527732

  19. Database Constraints Applied to Metabolic Pathway Reconstruction Tools

    PubMed Central

    Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

    2014-01-01

    Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745

  20. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases.

    PubMed

    Caspi, Ron; Altman, Tomer; Billington, Richard; Dreher, Kate; Foerster, Hartmut; Fulcher, Carol A; Holland, Timothy A; Keseler, Ingrid M; Kothari, Anamika; Kubo, Aya; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A; Ong, Quang; Paley, Suzanne; Subhraveti, Pallavi; Weaver, Daniel S; Weerasinghe, Deepika; Zhang, Peifen; Karp, Peter D

    2014-01-01

    The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible database describing metabolic pathways and enzymes from all domains of life. MetaCyc pathways are experimentally determined, mostly small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains >2100 pathways derived from >37,000 publications, and is the largest curated collection of metabolic pathways currently available. BioCyc (BioCyc.org) is a collection of >3000 organism-specific Pathway/Genome Databases (PGDBs), each containing the full genome and predicted metabolic network of one organism, including metabolites, enzymes, reactions, metabolic pathways, predicted operons, transport systems and pathway-hole fillers. Additions to BioCyc over the past 2 years include YeastCyc, a PGDB for Saccharomyces cerevisiae, and 891 new genomes from the Human Microbiome Project. The BioCyc Web site offers a variety of tools for querying and analysis of PGDBs, including Omics Viewers and tools for comparative analysis. New developments include atom mappings in reactions, a new representation of glycan degradation pathways, improved compound structure display, better coverage of enzyme kinetic data, enhancements of the Web Groups functionality, improvements to the Omics viewers, a new representation of the Enzyme Commission system and, for the desktop version of the software, the ability to save display states.

  1. Data mining in the MetaCyc family of pathway databases.

    PubMed

    Karp, Peter D; Paley, Suzanne; Altman, Tomer

    2013-01-01

    Pathway databases collect the bioreactions and molecular interactions that define the processes of life. The MetaCyc family of pathway databases consists of thousands of databases that were derived through computational inference of metabolic pathways from the MetaCyc pathway/genome database (PGDB). In some cases, these DBs underwent subsequent manual curation. Curated pathway DBs are now available for most of the major model organisms. Databases in the MetaCyc family are managed using the Pathway Tools software. This chapter presents methods for performing data mining on the MetaCyc family of pathway DBs. We discuss the major data access mechanisms for the family, which include data files in multiple formats; application programming interfaces (APIs) for the Lisp, Java, and Perl languages; and web services. We present an overview of the Pathway Tools schema, an understanding of which is needed to query the DBs. The chapter also presents several interactive data mining tools within Pathway Tools for performing omics data analysis. PMID:23192547

  2. Data Mining in the MetaCyc Family of Pathway Databases

    PubMed Central

    Karp, Peter D.; Paley, Suzanne; Altman, Tomer

    2013-01-01

    Pathway databases collect the bioreactions and molecular interactions that define the processes of life. The MetaCyc family of pathway databases consists of thousands of databases that were derived through computational inference of metabolic pathways from the MetaCyc Pathway/Genome Database (PGDB). In some cases these DBs underwent subsequent manual curation. Curated pathway DBs are now available for most of the major model organisms. Databases in the MetaCyc family are managed using the Pathway Tools software. This chapter presents methods for performing data mining on the MetaCyc family of pathway DBs. We discuss the major data access mechanisms for the family, which include data files in multiple formats; application programming interfaces (APIs) for the Lisp, Java, and Perl languages; and web services. We present an overview of the Pathway Tools schema, an understanding of which is needed to query the DBs. The chapter also presents several interactive data mining tools within Pathway Tools for performing omics data analysis. PMID:23192547

  3. Integrated pathway-genome databases and their role in drug discovery.

    PubMed

    Karp, P D; Krummenacker, M; Paley, S; Wagg, J

    1999-07-01

    Integrated pathway-genome databases describe the genes and genome of an organism, as well as its predicted pathways, reactions, enzymes and metabolites. In conjunction with visualization and analysis software, these databases provide a framework for improved understanding of microbial physiology and for antimicrobial drug discovery. We describe pathway-based analyses of the genomes of a number of medically relevant microorganisms and a novel software tool that visualizes gene-expression data on a diagram showing the whole metabolic network of the microorganism.

  4. Biochemical pathways analysis of microarray results: regulation of myogenesis in pigs

    PubMed Central

    te Pas, Marinus FW; Hulsegge, Ina; Coster, Albart; Pool, Marco H; Heuven, Henri H; Janss, Luc LG

    2007-01-01

    Background Combining microarray results and biological pathway information will add insight into biological processes. Pathway information is widely available in databases through the internet. Mammalian muscle formation has been previously studied using microarray technology in pigs because these animals are an interesting animal model for muscle formation due to selection for increased muscle mass. Results indicated regulation of the expression of genes involved in proliferation and differentiation of myoblasts, and energy metabolism. The aim of the present study was to analyse microarrays studying myogenesis in pigs. It was necessary to develop methods to search biochemical pathways databases. Results PERL scripts were developed that used the names of the genes on the microarray to search databases. Synonyms of gene names were added to the list by searching the Gene Ontology database. The KEGG database was searched for pathway information using this updated gene list. The KEGG database returned 88 pathways. Most genes were found in a single pathway, but others were found in up to seven pathways. Combining the pathways and the microarray information 21 pathways showed sufficient information content for further analysis. These pathways were related to regulation of several steps in myogenesis and energy metabolism. Pathways regulating myoblast proliferation and muscle fibre formation were described. Furthermore, two networks of pathways describing the formation of the myoblast cytoskeleton and regulation of the energy metabolism during myogenesis were presented. Conclusion Combining microarray results and pathways information available through the internet provide biological insight in how the process of porcine myogenesis is regulated. PMID:17567520

  5. KOBAS 2.0: a web server for annotation and identification of enriched pathways and diseases.

    PubMed

    Xie, Chen; Mao, Xizeng; Huang, Jiaju; Ding, Yang; Wu, Jianmin; Dong, Shan; Kong, Lei; Gao, Ge; Li, Chuan-Yun; Wei, Liping

    2011-07-01

    High-throughput experimental technologies often identify dozens to hundreds of genes related to, or changed in, a biological or pathological process. From these genes one wants to identify biological pathways that may be involved and diseases that may be implicated. Here, we report a web server, KOBAS 2.0, which annotates an input set of genes with putative pathways and disease relationships based on mapping to genes with known annotations. It allows for both ID mapping and cross-species sequence similarity mapping. It then performs statistical tests to identify statistically significantly enriched pathways and diseases. KOBAS 2.0 incorporates knowledge across 1327 species from 5 pathway databases (KEGG PATHWAY, PID, BioCyc, Reactome and Panther) and 5 human disease databases (OMIM, KEGG DISEASE, FunDO, GAD and NHGRI GWAS Catalog). KOBAS 2.0 can be accessed at http://kobas.cbi.pku.edu.cn.

  6. Algorithms for effective querying of compound graph-based pathway databases

    PubMed Central

    2009-01-01

    Background Graph-based pathway ontologies and databases are widely used to represent data about cellular processes. This representation makes it possible to programmatically integrate cellular networks and to investigate them using the well-understood concepts of graph theory in order to predict their structural and dynamic properties. An extension of this graph representation, namely hierarchically structured or compound graphs, in which a member of a biological network may recursively contain a sub-network of a somehow logically similar group of biological objects, provides many additional benefits for analysis of biological pathways, including reduction of complexity by decomposition into distinct components or modules. In this regard, it is essential to effectively query such integrated large compound networks to extract the sub-networks of interest with the help of efficient algorithms and software tools. Results Towards this goal, we developed a querying framework, along with a number of graph-theoretic algorithms from simple neighborhood queries to shortest paths to feedback loops, that is applicable to all sorts of graph-based pathway databases, from PPIs (protein-protein interactions) to metabolic and signaling pathways. The framework is unique in that it can account for compound or nested structures and ubiquitous entities present in the pathway data. In addition, the queries may be related to each other through "AND" and "OR" operators, and can be recursively organized into a tree, in which the result of one query might be a source and/or target for another, to form more complex queries. The algorithms were implemented within the querying component of a new version of the software tool PATIKAweb (Pathway Analysis Tool for Integration and Knowledge Acquisition) and have proven useful for answering a number of biologically significant questions for large graph-based pathway databases. Conclusion The PATIKA Project Web site is http

  7. Evaluation of database-derived pathway development for enabling biomarker discovery for hepatotoxicity.

    PubMed

    Hebels, Dennie G A; Jetten, Marlon J A; Aerts, Hugo J W; Herwig, Ralf; Theunissen, Daniël H J; Gaj, Stan; van Delft, Joost H; Kleinjans, Jos C S

    2014-01-01

    Current testing models for predicting drug-induced liver injury are inadequate, as they basically under-report human health risks. We present here an approach towards developing pathways based on hepatotoxicity-associated gene groups derived from two types of publicly accessible hepatotoxicity databases, in order to develop drug-induced liver injury biomarker profiles. One human liver 'omics-based and four text-mining-based databases were explored for hepatotoxicity-associated gene lists. Over-representation analysis of these gene lists with a hepatotoxicant-exposed primary human hepatocytes data set showed that human liver 'omics gene lists performed better than text-mining gene lists and the results of the latter differed strongly between databases. However, both types of databases contained gene lists demonstrating biomarker potential. Visualizing those in pathway format may aid in interpreting the biomolecular background. We conclude that exploiting existing and openly accessible databases in a dedicated manner seems promising in providing venues for translational research in toxicology and biomarker development.

  8. Data, information, knowledge and principle: back to metabolism in KEGG.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Sato, Yoko; Kawashima, Masayuki; Furumichi, Miho; Tanabe, Mao

    2014-01-01

    In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but they alone become less effective to compile knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process the KEGG modules defined by Boolean expressions of K numbers have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society.

  9. Data, information, knowledge and principle: back to metabolism in KEGG.

    PubMed

    Kanehisa, Minoru; Goto, Susumu; Sato, Yoko; Kawashima, Masayuki; Furumichi, Miho; Tanabe, Mao

    2014-01-01

    In the hierarchy of data, information and knowledge, computational methods play a major role in the initial processing of data to extract information, but they alone become less effective to compile knowledge from information. The Kyoto Encyclopedia of Genes and Genomes (KEGG) resource (http://www.kegg.jp/ or http://www.genome.jp/kegg/) has been developed as a reference knowledge base to assist this latter process. In particular, the KEGG pathway maps are widely used for biological interpretation of genome sequences and other high-throughput data. The link from genomes to pathways is made through the KEGG Orthology system, a collection of manually defined ortholog groups identified by K numbers. To better automate this interpretation process the KEGG modules defined by Boolean expressions of K numbers have been expanded and improved. Once genes in a genome are annotated with K numbers, the KEGG modules can be computationally evaluated revealing metabolic capacities and other phenotypic features. The reaction modules, which represent chemical units of reactions, have been used to analyze design principles of metabolic networks and also to improve the definition of K numbers and associated annotations. For translational bioinformatics, the KEGG MEDICUS resource has been developed by integrating drug labels (package inserts) used in society. PMID:24214961

  10. Creation of a genome-wide metabolic pathway database for Populus trichocarpa using a new approach for reconstruction and curation of metabolic pathways for plants.

    PubMed

    Zhang, Peifen; Dreher, Kate; Karthikeyan, A; Chi, Anjo; Pujar, Anuradha; Caspi, Ron; Karp, Peter; Kirkup, Vanessa; Latendresse, Mario; Lee, Cynthia; Mueller, Lukas A; Muller, Robert; Rhee, Seung Yon

    2010-08-01

    Metabolic networks reconstructed from sequenced genomes or transcriptomes can help visualize and analyze large-scale experimental data, predict metabolic phenotypes, discover enzymes, engineer metabolic pathways, and study metabolic pathway evolution. We developed a general approach for reconstructing metabolic pathway complements of plant genomes. Two new reference databases were created and added to the core of the infrastructure: a comprehensive, all-plant reference pathway database, PlantCyc, and a reference enzyme sequence database, RESD, for annotating metabolic functions of protein sequences. PlantCyc (version 3.0) includes 714 metabolic pathways and 2,619 reactions from over 300 species. RESD (version 1.0) contains 14,187 literature-supported enzyme sequences from across all kingdoms. We used RESD, PlantCyc, and MetaCyc (an all-species reference metabolic pathway database), in conjunction with the pathway prediction software Pathway Tools, to reconstruct a metabolic pathway database, PoplarCyc, from the recently sequenced genome of Populus trichocarpa. PoplarCyc (version 1.0) contains 321 pathways with 1,807 assigned enzymes. Comparing PoplarCyc (version 1.0) with AraCyc (version 6.0, Arabidopsis [Arabidopsis thaliana]) showed comparable numbers of pathways distributed across all domains of metabolism in both databases, except for a higher number of AraCyc pathways in secondary metabolism and a 1.5-fold increase in carbohydrate metabolic enzymes in PoplarCyc. Here, we introduce these new resources and demonstrate the feasibility of using them to identify candidate enzymes for specific pathways and to analyze metabolite profiling data through concrete examples. These resources can be searched by text or BLAST, browsed, and downloaded from our project Web site (http://plantcyc.org).

  11. FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca.

    PubMed

    Naithani, Sushma; Partipilo, Christina M; Raja, Rajani; Elser, Justin L; Jaiswal, Pankaj

    2016-01-01

    FragariaCyc is a strawberry-specific cellular metabolic network based on the annotated genome sequence of Fragaria vesca L. ssp. vesca, accession Hawaii 4. It was built on the Pathway-Tools platform using MetaCyc as the reference. The experimental evidences from published literature were used for supporting/editing existing entities and for the addition of new pathways, enzymes, reactions, compounds, and small molecules in the database. To date, FragariaCyc comprises 66 super-pathways, 488 unique pathways, 2348 metabolic reactions, 3507 enzymes, and 2134 compounds. In addition to searching and browsing FragariaCyc, researchers can compare pathways across various plant metabolic networks and analyze their data using Omics Viewer tool. We view FragariaCyc as a resource for the community of researchers working with strawberry and related fruit crops. It can help understanding the regulation of overall metabolism of strawberry plant during development and in response to diseases and abiotic stresses. FragariaCyc is available online at http://pathways.cgrb.oregonstate.edu.

  12. FragariaCyc: A Metabolic Pathway Database for Woodland Strawberry Fragaria vesca

    PubMed Central

    Naithani, Sushma; Partipilo, Christina M.; Raja, Rajani; Elser, Justin L.; Jaiswal, Pankaj

    2016-01-01

    FragariaCyc is a strawberry-specific cellular metabolic network based on the annotated genome sequence of Fragaria vesca L. ssp. vesca, accession Hawaii 4. It was built on the Pathway-Tools platform using MetaCyc as the reference. The experimental evidences from published literature were used for supporting/editing existing entities and for the addition of new pathways, enzymes, reactions, compounds, and small molecules in the database. To date, FragariaCyc comprises 66 super-pathways, 488 unique pathways, 2348 metabolic reactions, 3507 enzymes, and 2134 compounds. In addition to searching and browsing FragariaCyc, researchers can compare pathways across various plant metabolic networks and analyze their data using Omics Viewer tool. We view FragariaCyc as a resource for the community of researchers working with strawberry and related fruit crops. It can help understanding the regulation of overall metabolism of strawberry plant during development and in response to diseases and abiotic stresses. FragariaCyc is available online at http://pathways.cgrb.oregonstate.edu. PMID:26973684

  13. Comprehensive analysis of the N-glycan biosynthetic pathway using bioinformatics to generate UniCorn: A theoretical N-glycan structure database.

    PubMed

    Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P

    2016-08-01

    Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined, glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or less monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database.

  14. SPIKE – a database, visualization and analysis tool of cellular signaling pathways

    PubMed Central

    Elkon, Ran; Vesterman, Rita; Amit, Nira; Ulitsky, Igor; Zohar, Idan; Weisz, Mali; Mass, Gilad; Orlev, Nir; Sternberg, Giora; Blekhman, Ran; Assa, Jackie; Shiloh, Yosef; Shamir, Ron

    2008-01-01

    Background Biological signaling pathways that govern cellular physiology form an intricate web of tightly regulated interlocking processes. Data on these regulatory networks are accumulating at an unprecedented pace. The assimilation, visualization and interpretation of these data have become a major challenge in biological research, and once met, will greatly boost our ability to understand cell functioning on a systems level. Results To cope with this challenge, we are developing the SPIKE knowledge-base of signaling pathways. SPIKE contains three main software components: 1) A database (DB) of biological signaling pathways. Carefully curated information from the literature and data from large public sources constitute distinct tiers of the DB. 2) A visualization package that allows interactive graphic representations of regulatory interactions stored in the DB and superposition of functional genomic and proteomic data on the maps. 3) An algorithmic inference engine that analyzes the networks for novel functional interplays between network components. SPIKE is designed and implemented as a community tool and therefore provides a user-friendly interface that allows registered users to upload data to SPIKE DB. Our vision is that the DB will be populated by a distributed and highly collaborative effort undertaken by multiple groups in the research community, where each group contributes data in its field of expertise. Conclusion The integrated capabilities of SPIKE make it a powerful platform for the analysis of signaling networks and the integration of knowledge on such networks with omics data. PMID:18289391

  15. Enhancing a Pathway-Genome Database (PGDB) to Capture Subcellular Localization of Metabolites and Enzymes: The Nucleotide-Sugar Biosynthetic Pathways of Populus trichocarpa

    SciTech Connect

    Nag, A.; Karpinets, T. V.; Chang, C. H.; Bar-Peled, M.

    2012-01-01

    Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are building blocks for synthesis of cell wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations however leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s).

  16. DemaDb: an integrated dematiaceous fungal genomes database

    PubMed Central

    Kuan, Chee Sian; Yew, Su Mei; Chan, Chai Ling; Toh, Yue Fen; Lee, Kok Wei; Cheong, Wei-Hien; Yee, Wai-Yan; Hoh, Chee-Choong; Yap, Soon-Joo; Ng, Kee Peng

    2016-01-01

    Many species of dematiaceous fungi are associated with allergic reactions and potentially fatal diseases in human, especially in tropical climates. Over the past 10 years, we have isolated more than 400 dematiaceous fungi from various clinical samples. In this study, DemaDb, an integrated database was designed to support the integration and analysis of dematiaceous fungal genomes. A total of 92 072 putative genes and 6527 pathways that identified in eight dematiaceous fungi (Bipolaris papendorfii UM 226, Daldinia eschscholtzii UM 1400, D. eschscholtzii UM 1020, Pyrenochaeta unguis-hominis UM 256, Ochroconis mirabilis UM 578, Cladosporium sphaerospermum UM 843, Herpotrichiellaceae sp. UM 238 and Pleosporales sp. UM 1110) were deposited in DemaDb. DemaDb includes functional annotations for all predicted gene models in all genomes, such as Gene Ontology, EuKaryotic Orthologous Groups, Kyoto Encyclopedia of Genes and Genomes (KEGG), Pfam and InterProScan. All predicted protein models were further functionally annotated to Carbohydrate-Active enzymes, peptidases, secondary metabolites and virulence factors. DemaDb Genome Browser enables users to browse and visualize entire genomes with annotation data including gene prediction, structure, orientation and custom feature tracks. The Pathway Browser based on the KEGG pathway database allows users to look into molecular interaction and reaction networks for all KEGG annotated genes. The availability of downloadable files containing assembly, nucleic acid, as well as protein data allows the direct retrieval for further downstream works. DemaDb is a useful resource for fungal research community especially those involved in genome-scale analysis, functional genomics, genetics and disease studies of dematiaceous fungi. Database URL: http://fungaldb.um.edu.my PMID:26980516

  17. aglgenes, A curated and searchable database of archaeal N-glycosylation pathway components.

    PubMed

    Godin, Noa; Eichler, Jerry

    2014-01-01

    Whereas N-glycosylation is a posttranslational modification performed across evolution, the archaeal version of this protein-processing event presents a degree of diversity not seen in either bacteria or eukarya. Accordingly, archaeal N-glycosylation relies on a large number of enzymes that are often species-specific or restricted to a select group of species. As such, there is a need for an organized platform upon which amassing information about archaeal glycosylation (agl) genes can rest. Accordingly, the aglgenes database provides detailed descriptions of experimentally characterized archaeal N-glycosyation pathway components. For each agl gene, genomic information, supporting literature and relevant external links are provided at a functional intuitive web-interface designed for data browsing. Routine updates ensure that novel experimental information on genes and proteins contributing to archaeal N-glycosylation is incorporated into aglgenes in a timely manner. As such, aglgenes represents a specialized resource for sharing validated experimental information online, providing support for workers in the field of archaeal protein glycosylation. Database URL: www.bgu.ac.il/aglgenes.

  18. Multiomics in grape berry skin revealed specific induction of the stilbene synthetic pathway by ultraviolet-C irradiation.

    PubMed

    Suzuki, Mami; Nakabayashi, Ryo; Ogata, Yoshiyuki; Sakurai, Nozomu; Tokimatsu, Toshiaki; Goto, Susumu; Suzuki, Makoto; Jasinski, Michal; Martinoia, Enrico; Otagaki, Shungo; Matsumoto, Shogo; Saito, Kazuki; Shiratake, Katsuhiro

    2015-05-01

    Grape (Vitis vinifera) accumulates various polyphenolic compounds, which protect against environmental stresses, including ultraviolet-C (UV-C) light and pathogens. In this study, we looked at the transcriptome and metabolome in grape berry skin after UV-C irradiation, which demonstrated the effectiveness of omics approaches to clarify important traits of grape. We performed transcriptome analysis using a genome-wide microarray, which revealed 238 genes up-regulated more than 5-fold by UV-C light. Enrichment analysis of Gene Ontology terms showed that genes encoding stilbene synthase, a key enzyme for resveratrol synthesis, were enriched in the up-regulated genes. We performed metabolome analysis using liquid chromatography-quadrupole time-of-flight mass spectrometry, and 2,012 metabolite peaks, including unidentified peaks, were detected. Principal component analysis using the peaks showed that only one metabolite peak, identified as resveratrol, was highly induced by UV-C light. We updated the metabolic pathway map of grape in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and in the KaPPA-View 4 KEGG system, then projected the transcriptome and metabolome data on a metabolic pathway map. The map showed specific induction of the resveratrol synthetic pathway by UV-C light. Our results showed that multiomics is a powerful tool to elucidate the accumulation mechanisms of secondary metabolites, and updated systems, such as KEGG and KaPPA-View 4 KEGG for grape, can support such studies.

  19. Multiomics in Grape Berry Skin Revealed Specific Induction of the Stilbene Synthetic Pathway by Ultraviolet-C Irradiation1

    PubMed Central

    Suzuki, Mami; Nakabayashi, Ryo; Ogata, Yoshiyuki; Sakurai, Nozomu; Tokimatsu, Toshiaki; Goto, Susumu; Suzuki, Makoto; Jasinski, Michal; Martinoia, Enrico; Otagaki, Shungo; Matsumoto, Shogo; Saito, Kazuki; Shiratake, Katsuhiro

    2015-01-01

    Grape (Vitis vinifera) accumulates various polyphenolic compounds, which protect against environmental stresses, including ultraviolet-C (UV-C) light and pathogens. In this study, we looked at the transcriptome and metabolome in grape berry skin after UV-C irradiation, which demonstrated the effectiveness of omics approaches to clarify important traits of grape. We performed transcriptome analysis using a genome-wide microarray, which revealed 238 genes up-regulated more than 5-fold by UV-C light. Enrichment analysis of Gene Ontology terms showed that genes encoding stilbene synthase, a key enzyme for resveratrol synthesis, were enriched in the up-regulated genes. We performed metabolome analysis using liquid chromatography-quadrupole time-of-flight mass spectrometry, and 2,012 metabolite peaks, including unidentified peaks, were detected. Principal component analysis using the peaks showed that only one metabolite peak, identified as resveratrol, was highly induced by UV-C light. We updated the metabolic pathway map of grape in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and in the KaPPA-View 4 KEGG system, then projected the transcriptome and metabolome data on a metabolic pathway map. The map showed specific induction of the resveratrol synthetic pathway by UV-C light. Our results showed that multiomics is a powerful tool to elucidate the accumulation mechanisms of secondary metabolites, and updated systems, such as KEGG and KaPPA-View 4 KEGG for grape, can support such studies. PMID:25761715

  20. Pathway Network Analyses for Autism Reveal Multisystem Involvement, Major Overlaps with Other Diseases and Convergence upon MAPK and Calcium Signaling.

    PubMed

    Wen, Ya; Alshikho, Mohamad J; Herbert, Martha R

    2016-01-01

    We used established databases in standard ways to systematically characterize gene ontologies, pathways and functional linkages in the large set of genes now associated with autism spectrum disorders (ASDs). These conditions are particularly challenging--they lack clear pathognomonic biological markers, they involve great heterogeneity across multiple levels (genes, systemic biological and brain characteristics, and nuances of behavioral manifestations)-and yet everyone with this diagnosis meets the same defining behavioral criteria. Using the human gene list from Simons Foundation Autism Research Initiative (SFARI) we performed gene set enrichment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database, and then derived a pathway network from pathway-pathway functional interactions again in reference to KEGG. Through identifying the GO (Gene Ontology) groups in which SFARI genes were enriched, mapping the coherence between pathways and GO groups, and ranking the relative strengths of representation of pathway network components, we 1) identified 10 disease-associated and 30 function-associated pathways 2) revealed calcium signaling pathway and neuroactive ligand-receptor interaction as the most enriched, statistically significant pathways from the enrichment analysis, 3) showed calcium signaling pathways and MAPK signaling pathway to be interactive hubs with other pathways and also to be involved with pervasively present biological processes, 4) found convergent indications that the process "calcium-PRC (protein kinase C)-Ras-Raf-MAPK/ERK" is likely a major contributor to ASD pathophysiology, and 5) noted that perturbations associated with KEGG's category of environmental information processing were common. These findings support the idea that ASD-associated genes may contribute not only to core features of ASD themselves but also to vulnerability to other chronic and systemic problems potentially including cancer, metabolic conditions

  1. Pathway Network Analyses for Autism Reveal Multisystem Involvement, Major Overlaps with Other Diseases and Convergence upon MAPK and Calcium Signaling.

    PubMed

    Wen, Ya; Alshikho, Mohamad J; Herbert, Martha R

    2016-01-01

    We used established databases in standard ways to systematically characterize gene ontologies, pathways and functional linkages in the large set of genes now associated with autism spectrum disorders (ASDs). These conditions are particularly challenging--they lack clear pathognomonic biological markers, they involve great heterogeneity across multiple levels (genes, systemic biological and brain characteristics, and nuances of behavioral manifestations)-and yet everyone with this diagnosis meets the same defining behavioral criteria. Using the human gene list from Simons Foundation Autism Research Initiative (SFARI) we performed gene set enrichment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database, and then derived a pathway network from pathway-pathway functional interactions again in reference to KEGG. Through identifying the GO (Gene Ontology) groups in which SFARI genes were enriched, mapping the coherence between pathways and GO groups, and ranking the relative strengths of representation of pathway network components, we 1) identified 10 disease-associated and 30 function-associated pathways 2) revealed calcium signaling pathway and neuroactive ligand-receptor interaction as the most enriched, statistically significant pathways from the enrichment analysis, 3) showed calcium signaling pathways and MAPK signaling pathway to be interactive hubs with other pathways and also to be involved with pervasively present biological processes, 4) found convergent indications that the process "calcium-PRC (protein kinase C)-Ras-Raf-MAPK/ERK" is likely a major contributor to ASD pathophysiology, and 5) noted that perturbations associated with KEGG's category of environmental information processing were common. These findings support the idea that ASD-associated genes may contribute not only to core features of ASD themselves but also to vulnerability to other chronic and systemic problems potentially including cancer, metabolic conditions

  2. BioWarehouse: a bioinformatics database warehouse toolkit

    PubMed Central

    Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D

    2006-01-01

    Background This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion BioWarehouse embodies significant progress on the database integration problem for

  3. iPathCons and iPathDB: an improved insect pathway construction tool and the database.

    PubMed

    Zhang, Zan; Yin, Chuanlin; Liu, Ying; Jie, Wencai; Lei, Wenjie; Li, Fei

    2014-01-01

    Insects are one of the most successful animal groups on earth. Some insects, such as the silkworm and honeybee, are beneficial to humans, whereas others are notorious pests of crops. At present, the genomes of 38 insects have been sequenced and made publically available. In addition, the transcriptomes of dozens of insects have been sequenced. As gene data rapidly accumulate, constructing the pathway of molecular interactions becomes increasingly important for entomological research. Here, we developed an improved tool, iPathCons, for knowledge-based construction of pathways from the transcriptomes or the official gene sets of genomes. Considering the high evolution diversity in insects, iPathCons uses a voting system for Kyoto Encyclopedia of Genes and Genomes Orthology assignment. Both stand-alone software and a web server of iPathCons are provided. Using iPathCons, we constructed the pathways of molecular interactions of 52 insects, including 37 genome-sequenced and 15 transcriptome-sequenced ones. These pathways are available in the iPathDB, which provides searches, web server, data downloads, etc. This database will be highly useful for the insect research community. Database URL: http://ento.njau.edu.cn/ipath/

  4. PathCase-SB architecture and database design

    PubMed Central

    2011-01-01

    Background Integration of metabolic pathways resources and regulatory metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation in metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabolic networks and existing models, and (b) building tools to help with modeling and analysis are desirable and intellectually challenging computational tasks. Description PathCase Systems Biology (PathCase-SB) is built and released. The PathCase-SB database provides data and API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate data of selected biological data sources on the web (currently, BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database. Conclusions PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world. PMID:22070889

  5. Transcriptome and Metabolite analysis reveal candidate genes of the cardiac glycoside biosynthetic pathway from Calotropis procera

    PubMed Central

    Pandey, Akansha; Swarnkar, Vishakha; Pandey, Tushar; Srivastava, Piush; Kanojiya, Sanjeev; Mishra, Dipak Kumar; Tripathi, Vineeta

    2016-01-01

    Calotropis procera is a medicinal plant of immense importance due to its pharmaceutical active components, especially cardiac glycosides (CG). As genomic resources for this plant are limited, the genes involved in CG biosynthetic pathway remain largely unknown till date. Our study on stage and tissue specific metabolite accumulation showed that CG’s were maximally accumulated in stems of 3 month old seedlings. De novo transcriptome sequencing of same was done using high throughput Illumina HiSeq platform generating 44074 unigenes with average mean length of 1785 base pair. Around 66.6% of unigenes were annotated by using various public databases and 5324 unigenes showed significant match in the KEGG database involved in 133 different pathways of plant metabolism. Further KEGG analysis resulted in identification of 336 unigenes involved in cardenolide biosynthesis. Tissue specific expression analysis of 30 putative transcripts involved in terpenoid, steroid and cardenolide pathways showed a positive correlation between metabolite and transcript accumulation. Wound stress elevated CG levels as well the levels of the putative transcripts involved in its biosynthetic pathways. This result further validated the involvement of identified transcripts in CGs biosynthesis. The identified transcripts will lay a substantial foundation for further research on metabolic engineering and regulation of cardiac glycosides biosynthesis pathway genes. PMID:27703261

  6. Pathway Network Analyses for Autism Reveal Multisystem Involvement, Major Overlaps with Other Diseases and Convergence upon MAPK and Calcium Signaling

    PubMed Central

    Wen, Ya; Alshikho, Mohamad J.; Herbert, Martha R.

    2016-01-01

    We used established databases in standard ways to systematically characterize gene ontologies, pathways and functional linkages in the large set of genes now associated with autism spectrum disorders (ASDs). These conditions are particularly challenging—they lack clear pathognomonic biological markers, they involve great heterogeneity across multiple levels (genes, systemic biological and brain characteristics, and nuances of behavioral manifestations)—and yet everyone with this diagnosis meets the same defining behavioral criteria. Using the human gene list from Simons Foundation Autism Research Initiative (SFARI) we performed gene set enrichment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Database, and then derived a pathway network from pathway-pathway functional interactions again in reference to KEGG. Through identifying the GO (Gene Ontology) groups in which SFARI genes were enriched, mapping the coherence between pathways and GO groups, and ranking the relative strengths of representation of pathway network components, we 1) identified 10 disease-associated and 30 function-associated pathways 2) revealed calcium signaling pathway and neuroactive ligand-receptor interaction as the most enriched, statistically significant pathways from the enrichment analysis, 3) showed calcium signaling pathways and MAPK signaling pathway to be interactive hubs with other pathways and also to be involved with pervasively present biological processes, 4) found convergent indications that the process “calcium-PRC (protein kinase C)-Ras-Raf-MAPK/ERK” is likely a major contributor to ASD pathophysiology, and 5) noted that perturbations associated with KEGG’s category of environmental information processing were common. These findings support the idea that ASD-associated genes may contribute not only to core features of ASD themselves but also to vulnerability to other chronic and systemic problems potentially including cancer, metabolic

  7. An efficient algorithm for de novo predictions of biochemical pathways between chemical compounds

    PubMed Central

    2012-01-01

    Background Prediction of biochemical (metabolic) pathways has a wide range of applications, including the optimization of drug candidates, and the elucidation of toxicity mechanisms. Recently, several methods have been developed for pathway prediction to derive a goal compound from a start compound. However, these methods require high computational costs, and cannot perform comprehensive prediction of novel metabolic pathways. Our aim of this study is to develop a de novo prediction method for reconstructions of metabolic pathways and predictions of unknown biosynthetic pathways in the sense that it does not require any initial network such as KEGG metabolic network to be explored. Results We formulated pathway prediction between a start compound and a goal compound as the shortest path search problem in terms of the number of enzyme reactions applied. We propose an efficient search method based on A* algorithm and heuristic techniques utilizing Linear Programming (LP) solution for estimation of the distance to the goal. First, a chemical compound is represented by a feature vector which counts frequencies of substructure occurrences in the structural formula. Second, an enzyme reaction is represented as an operator vector by detecting the structural changes to compounds before and after the reaction. By defining compound vectors as nodes and operator vectors as edges, prediction of the reaction pathway is reduced to the shortest path search problem in the vector space. In experiments on the DDT degradation pathway, we verify that the shortest paths predicted by our method are biologically correct pathways registered in the KEGG database. The results also demonstrate that the LP heuristics can achieve significant reduction in computation time. Furthermore, we apply our method to a secondary metabolite pathway of plant origin, and successfully find a novel biochemical pathway which cannot be predicted by the existing method. For the reconstruction of a known

  8. Sentra : a database of signal transduction proteins for comparative genome analysis.

    SciTech Connect

    D'Souza, M.; Glass, E. M.; Syed, M. H.; Zhang, Y.; Rodriguez, A.; Maltsev, N.; Galerpin, M. Y.; Mathematics and Computer Science; Univ. of Chicago; NIH

    2007-01-01

    Sentra (http://compbio.mcs.anl.gov/sentra), a database of signal transduction proteins encoded in completely sequenced prokaryotic genomes, has been updated to reflect recent advances in understanding signal transduction events on a whole-genome scale. Sentra consists of two principal components, a manually curated list of signal transduction proteins in 202 completely sequenced prokaryotic genomes and an automatically generated listing of predicted signaling proteins in 235 sequenced genomes that are awaiting manual curation. In addition to two-component histidine kinases and response regulators, the database now lists manually curated Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews. All entries in Sentra are extensively annotated with relevant information from public databases (e.g. UniProt, KEGG, PDB and NCBI). Sentra's infrastructure was redesigned to support interactive cross-genome comparisons of signal transduction capabilities of prokaryotic organisms from a taxonomic and phenotypic perspective and in the framework of signal transduction pathways from KEGG. Sentra leverages the PUMA2 system to support interactive analysis and annotation of signal transduction proteins by the users.

  9. The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice.

    PubMed

    Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K

    2015-01-01

    Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer.

  10. PathwAX: a web server for network crosstalk based pathway annotation

    PubMed Central

    Ogris, Christoph; Helleday, Thomas; Sonnhammer, Erik L.L.

    2016-01-01

    Pathway annotation of gene lists is often used to functionally analyse biomolecular data such as gene expression in order to establish which processes are activated in a given experiment. Databases such as KEGG or GO represent collections of how genes are known to be organized in pathways, and the challenge is to compare a given gene list with the known pathways such that all true relations are identified. Most tools apply statistical measures to the gene overlap between the gene list and pathway. It is however problematic to avoid false negatives and false positives when only using the gene overlap. The pathwAX web server (http://pathwAX.sbc.su.se/) applies a different approach which is based on network crosstalk. It uses the comprehensive network FunCoup to analyse network crosstalk between a query gene list and KEGG pathways. PathwAX runs the BinoX algorithm, which employs Monte-Carlo sampling of randomized networks and estimates a binomial distribution, for estimating the statistical significance of the crosstalk. This results in substantially higher accuracy than gene overlap methods. The system was optimized for speed and allows interactive web usage. We illustrate the usage and output of pathwAX. PMID:27151197

  11. SoyFN: a knowledge database of soybean functional networks.

    PubMed

    Xu, Yungang; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

    2014-01-01

    Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.

  12. dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2016-01-01

    Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469

  13. dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

    PubMed

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2016-01-01

    Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf.

  14. dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock

    PubMed Central

    Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

    2016-01-01

    Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469

  15. Pedigree-based random effect tests to screen gene pathways.

    PubMed

    Almeida, Marcio; Peralta, Juan M; Farook, Vidya; Puppala, Sobha; Kent, John W; Duggirala, Ravindranath; Blangero, John

    2014-01-01

    The new generation of sequencing platforms opens new horizons in the genetics field. It is possible to exhaustively assay all genetic variants in an individual and search for phenotypic associations. The whole genome sequencing approach, when applied to a large human sample like the San Antonio Family Study, detects a very large number (>25 million) of single nucleotide variants along with other more complex variants. The analytical challenges imposed by this number of variants are formidable, suggesting that methods are needed to reduce the overall number of statistical tests. In this study, we develop a single degree-of-freedom test of variants in a gene pathway employing a random effect model that uses an empirical pathway-specific genetic relationship matrix as the focal covariance kernel. The empirical pathway-specific genetic relationship uses all variants (or a chosen subset) from gene members of a given biological pathway. Using SOLAR's pedigree-based variance components modeling, which also allows for arbitrary fixed effects, such as principal components, to deal with latent population structure, we employ a likelihood ratio test of the pathway-specific genetic relationship matrix model. We examine all gene pathways in KEGG database gene pathways using our method in the first replicate of the Genetic Analysis Workshop 18 simulation of systolic blood pressure. Our random effect approach was able to detect true association signals in causal gene pathways. Those pathways could be easily be further dissected by the independent analysis of all markers. PMID:25519354

  16. MOPED: Model Organism Protein Expression Database.

    PubMed

    Kolker, Eugene; Higdon, Roger; Haynes, Winston; Welch, Dean; Broomall, William; Lancet, Doron; Stanberry, Larissa; Kolker, Natali

    2012-01-01

    Large numbers of mass spectrometry proteomics studies are being conducted to understand all types of biological processes. The size and complexity of proteomics data hinders efforts to easily share, integrate, query and compare the studies. The Model Organism Protein Expression Database (MOPED, htttp://moped.proteinspire.org) is a new and expanding proteomics resource that enables rapid browsing of protein expression information from publicly available studies on humans and model organisms. MOPED is designed to simplify the comparison and sharing of proteomics data for the greater research community. MOPED uniquely provides protein level expression data, meta-analysis capabilities and quantitative data from standardized analysis. Data can be queried for specific proteins, browsed based on organism, tissue, localization and condition and sorted by false discovery rate and expression. MOPED empowers users to visualize their own expression data and compare it with existing studies. Further, MOPED links to various protein and pathway databases, including GeneCards, Entrez, UniProt, KEGG and Reactome. The current version of MOPED contains over 43,000 proteins with at least one spectral match and more than 11 million high certainty spectra.

  17. Pathway modeling of microarray data: A case study of pathway activity changes in the testis following in utero exposure to dibutyl phthalate (DBP)

    SciTech Connect

    Ovacik, Meric A.; Sen, Banalata; Euling, Susan Y.; Gaido, Kevin W.; Ierapetritou, Marianthi G.; Androulakis, Ioannis P.

    2013-09-15

    Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may inform whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that Pathway activity analysis may be considered for a more comprehensive analysis of microarray data.

  18. Pathway modeling of microarray data: a case study of pathway activity changes in the testis following in utero exposure to dibutyl phthalate (DBP).

    PubMed

    Ovacik, Meric A; Sen, Banalata; Euling, Susan Y; Gaido, Kevin W; Ierapetritou, Marianthi G; Androulakis, Ioannis P

    2013-09-15

    Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may inform whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that Pathway activity analysis may be considered for a more comprehensive analysis of microarray data.

  19. Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger

    PubMed Central

    Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P

    2010-01-01

    Identification of missing genes or proteins participating in the metabolic pathways as enzymes are of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in other organism. We successfully identify the missing enzymes and then reconstruct the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searched through BLAST homology search and orthologs detection through COG & KEGG databases. Conservation of protein domains and motifs was searched through CDD, PFAM & PROSITE databases. Predictions regarding how proteins act in pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and later reconstruct the metabolic pathway in target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we indicate how this approach can be used to reconstruct the reference pathway in A. niger and later results were experimentally validated through chromatography and spectroscopy techniques. PMID:20978605

  20. ESEA: Discovering the Dysregulated Pathways based on Edge Set Enrichment Analysis

    PubMed Central

    Han, Junwei; Shi, Xinrui; Zhang, Yunpeng; Xu, Yanjun; Jiang, Ying; Zhang, Chunlong; Feng, Li; Yang, Haixiu; Shang, Desi; Sun, Zeguo; Su, Fei; Li, Chunquan; Li, Xia

    2015-01-01

    Pathway analyses are playing an increasingly important role in understanding biological mechanism, cellular function and disease states. Current pathway-identification methods generally focus on only the changes of gene expression levels; however, the biological relationships among genes are also the fundamental components of pathways, and the dysregulated relationships may also alter the pathway activities. We propose a powerful computational method, Edge Set Enrichment Analysis (ESEA), for the identification of dysregulated pathways. This provides a novel way of pathway analysis by investigating the changes of biological relationships of pathways in the context of gene expression data. Simulation studies illustrate the power and performance of ESEA under various simulated conditions. Using real datasets from p53 mutation, Type 2 diabetes and lung cancer, we validate effectiveness of ESEA in identifying dysregulated pathways. We further compare our results with five other pathway enrichment analysis methods. With these analyses, we show that ESEA is able to help uncover dysregulated biological pathways underlying complex traits and human diseases via specific use of the dysregulated biological relationships. We develop a freely available R-based tool of ESEA. Currently, ESEA can support pathway analysis of the seven public databases (KEGG; Reactome; Biocarta; NCI; SPIKE; HumanCyc; Panther). PMID:26267116

  1. Ouabain rescues rat nephrogenesis during intrauterine growth restriction by regulating the complement and coagulation cascades and calcium signaling pathway.

    PubMed

    Chen, L; Yue, J; Han, X; Li, J; Hu, Y

    2016-02-01

    Intrauterine growth restriction (IUGR) is associated with a reduction in the numbers of nephrons in neonates, which increases the risk of hypertension. Our previous study showed that ouabain protects the development of the embryonic kidney during IUGR. To explore this molecular mechanism, IUGR rats were induced by protein and calorie restriction throughout pregnancy, and ouabain was delivered using a mini osmotic pump. RNA sequencing technology was used to identify the differentially expressed genes (DEGs) of the embryonic kidneys. DEGs were submitted to the Database for Annotation and Visualization and Integrated Discovery, and gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were conducted. Maternal malnutrition significantly reduced fetal weight, but ouabain treatment had no significant effect on body weight. A total of 322 (177 upregulated and 145 downregulated) DEGs were detected between control and the IUGR group. Meanwhile, 318 DEGs were found to be differentially expressed (180 increased and 138 decreased) between the IUGR group and the ouabain-treated group. KEGG pathway analysis indicated that maternal undernutrition mainly disrupts the complement and coagulation cascades and the calcium signaling pathway, which could be protected by ouabain treatment. Taken together, these two biological pathways may play an important role in nephrogenesis, indicating potential novel therapeutic targets against the unfavorable effects of IUGR.

  2. Crosstalk analysis of pathways in breast cancer using a network model based on overlapping differentially expressed genes

    PubMed Central

    SUN, YONG; YUAN, KAI; ZHANG, PENG; MA, RONG; ZHANG, QI-WEN; TIAN, XING-SONG

    2015-01-01

    Multiple signal transduction pathways can affect each other considerably through crosstalk. However, the presence and extent of this phenomenon have not been rigorously studied. The aim of the present study was to identify strong and normal interactions between pathways in breast cancer and determine the main pathway. Five sets of breast cancer data were downloaded from the high-throughput Gene Expression Omnibus (GEO) and analyzed to identify differentially expressed (DE) genes using the Rank Product (RankProd) method. A list of pathways with differential expression was obtained by gene set enrichment analysis (GSEA) of the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The DE genes that overlapped between pathways were identified and a crosstalk network diagram based on the overlap of DE genes was constructed. A total of 1,464 DE genes and 26 pathways were identified. In addition, the number of DE genes that overlapped between specific pathways were determined, and the greatest degree of overlap was between the extracellular matrix (ECM)-receptor interaction and Focal adhesion pathways, which had 22 overlapping DE genes. Weighted pathway analysis of the crosstalk between pathways identified that Pathways in cancer was the main pathway in breast cancer. PMID:26622386

  3. GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes

    PubMed Central

    Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru

    2006-01-01

    Background Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. Results We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. Conclusion The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site. PMID:16553966

  4. What's My Substrate? Computational Function Assignment of Candida parapsilosis ADH5 by Genome Database Search, Virtual Screening, and QM/MM Calculations.

    PubMed

    Dhoke, Gaurao V; Ensari, Yunus; Davari, Mehdi D; Ruff, Anna Joëlle; Schwaneberg, Ulrich; Bocola, Marco

    2016-07-25

    Zinc-dependent medium chain reductase from Candida parapsilosis can be used in the reduction of carbonyl compounds to pharmacologically important chiral secondary alcohols. To date, the nomenclature of cpADH5 is differing (CPCR2/RCR/SADH) in the literature, and its natural substrate is not known. In this study, we utilized a substrate docking based virtual screening method combined with KEGG, MetaCyc pathway, and Candida genome databases search for the discovery of natural substrates of cpADH5. The virtual screening of 7834 carbonyl compounds from the ZINC database provided 94 aldehydes or methyl/ethyl ketones as putative carbonyl substrates. Out of which, 52 carbonyl substrates of cpADH5 with catalytically active docking pose were identified by employing mechanism based substrate docking protocol. Comparison of the virtual screening results with KEGG, MetaCyc database search, and Candida genome pathway analysis suggest that cpADH5 might be involved in the Ehrlich pathway (reduction of fusel aldehydes in leucine, isoleucine, and valine degradation). Our QM/MM calculations and experimental activity measurements affirmed that butyraldehyde substrates are the potential natural substrates of cpADH5, suggesting a carbonyl reductase role for this enzyme in butyraldehyde reduction in aliphatic amino acid degradation pathways. Phylogenetic tree analysis of known ADHs from Candida albicans shows that cpADH5 is close to caADH5. We therefore propose, according to the experimental substrate identification and sequence similarity, the common name butyraldehyde dehydrogenase cpADH5 for Candida parapsilosis CPCR2/RCR/SADH. PMID:27387009

  5. KENeV: A web-application for the automated reconstruction and visualization of the enriched metabolic and signaling super-pathways deriving from genomic experiments.

    PubMed

    Pilalis, Eleftherios; Koutsandreas, Theodoros; Valavanis, Ioannis; Athanasiadis, Emmanouil; Spyrou, George; Chatziioannou, Aristotelis

    2015-01-01

    Gene expression analysis, using high throughput genomic technologies,has become an indispensable step for the meaningful interpretation of the underlying molecular complexity, which shapes the phenotypic manifestation of the investigated biological mechanism. The modularity of the cellular response to different experimental conditions can be comprehended through the exploitation of molecular pathway databases, which offer a controlled, curated background for statistical enrichment analysis. Existing tools enable pathway analysis, visualization, or pathway merging but none integrates a fully automated workflow, combining all above-mentioned modules and destined to non-programmer users. We introduce an online web application, named KEGG Enriched Network Visualizer (KENeV), which enables a fully automated workflow starting from a list of differentially expressed genes and deriving the enriched KEGG metabolic and signaling pathways, merged into two respective, non-redundant super-networks. The final networks can be downloaded as SBML files, for further analysis, or instantly visualized through an interactive visualization module. In conclusion, KENeV (available online at http://www.grissom.gr/kenev) provides an integrative tool, suitable for users with no programming experience, for the functional interpretation, at both the metabolic and signaling level, of differentially expressed gene subsets deriving from genomic experiments. PMID:26925206

  6. KENeV: A web-application for the automated reconstruction and visualization of the enriched metabolic and signaling super-pathways deriving from genomic experiments

    PubMed Central

    Pilalis, Eleftherios; Koutsandreas, Theodoros; Valavanis, Ioannis; Athanasiadis, Emmanouil; Spyrou, George; Chatziioannou, Aristotelis

    2015-01-01

    Gene expression analysis, using high throughput genomic technologies,has become an indispensable step for the meaningful interpretation of the underlying molecular complexity, which shapes the phenotypic manifestation of the investigated biological mechanism. The modularity of the cellular response to different experimental conditions can be comprehended through the exploitation of molecular pathway databases, which offer a controlled, curated background for statistical enrichment analysis. Existing tools enable pathway analysis, visualization, or pathway merging but none integrates a fully automated workflow, combining all above-mentioned modules and destined to non-programmer users. We introduce an online web application, named KEGG Enriched Network Visualizer (KENeV), which enables a fully automated workflow starting from a list of differentially expressed genes and deriving the enriched KEGG metabolic and signaling pathways, merged into two respective, non-redundant super-networks. The final networks can be downloaded as SBML files, for further analysis, or instantly visualized through an interactive visualization module. In conclusion, KENeV (available online at http://www.grissom.gr/kenev) provides an integrative tool, suitable for users with no programming experience, for the functional interpretation, at both the metabolic and signaling level, of differentially expressed gene subsets deriving from genomic experiments. PMID:26925206

  7. Identification of key pathways and genes in colorectal cancer using bioinformatics analysis.

    PubMed

    Liang, Bin; Li, Chunning; Zhao, Jianying

    2016-10-01

    Colorectal cancer (CRC) is the most common malignant tumor of digestive system. The aim of this study was to identify gene signatures during CRC and uncover their potential mechanisms. The gene expression profiles of GSE21815 were downloaded from GEO database. The GSE21815 dataset contained 141 samples, including 132 CRC and 9 normal colon epitheliums. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses were performed, and protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed by Cytoscape software. In total, 3500 DEGs were identified in CRC, including 1370 up-regulated genes and 2130 down-regulated genes. GO analysis results showed that up-regulated DEGs were significantly enriched in biological processes (BP), including cell cycle, cell division, and cell proliferation; the down-regulated DEGs were significantly enriched in biological processes, including immune response, intracellular signaling cascade and defense response. KEGG pathway analysis showed the up-regulated DEGs were enriched in cell cycle and DNA replication, while the down-regulated DEGs were enriched in drug metabolism, metabolism of xenobiotics by cytochrome P450, and retinol metabolism pathways. The top 10 hub genes, GNG2, AGT, SAA1, ADCY5, LPAR1, NMU, IL8, CXCL12, GNAI1, and CCR2 were identified from the PPI network, and sub-networks revealed these genes were involved in significant pathways, including G protein-coupled receptors signaling pathway, gastrin-CREB signaling pathway via PKC and MAPK, and extracellular matrix organization. In conclusion, the present study indicated that the identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of CRC, and might be used as molecular targets and diagnostic biomarkers for the treatment of CRC. PMID:27581154

  8. EcID. A database for the inference of functional interactions in E. coli.

    PubMed

    Andres Leon, Eduardo; Ezkurdia, Iakes; García, Beatriz; Valencia, Alfonso; Juan, David

    2009-01-01

    The EcID database (Escherichia coli Interaction Database) provides a framework for the integration of information on functional interactions extracted from the following sources: EcoCyc (metabolic pathways, protein complexes and regulatory information), KEGG (metabolic pathways), MINT and IntAct (protein interactions). It also includes information on protein complexes from the two E. coli high-throughput pull-down experiments and potential interactions extracted from the literature using the web services associated to the iHOP text-mining system. Additionally, EcID incorporates results of various prediction methods, including two protein interaction prediction methods based on genomic information (Phylogenetic Profiles and Gene Neighbourhoods) and three methods based on the analysis of co-evolution (Mirror Tree, In Silico 2 Hybrid and Context Mirror). EcID associates to each prediction a specifically developed confidence score. The two main features that make EcID different from other systems are the combination of co-evolution-based predictions with the experimental data, and the introduction of E. coli-specific information, such as gene regulation information from EcoCyc. The possibilities offered by the combination of the EcID database information are illustrated with a prediction of potential functions for a group of poorly characterized genes related to yeaG. EcID is available online at http://ecid.bioinfo.cnio.es.

  9. Bioinformatics Annotation of Human Y Chromosome-Encoded Protein Pathways and Interactions.

    PubMed

    Rengaraj, Deivendran; Kwon, Woo-Sung; Pang, Myung-Geol

    2015-09-01

    We performed a comprehensive analysis of human Y chromosome-encoded proteins, their pathways, and their interactions using bioinformatics tools. From the NCBI annotation release 107 of human genome, we retrieved a total of 66 proteins encoded on Y chromosome. Most of the retrieved proteins were also matched with the proteins listed in the core databases of the Human Proteome Project including neXtProt, PeptideAtlas, and the Human Protein Atlas. When we examined the pathways of human Y-encoded proteins through KEGG database and Pathway Studio software, many of proteins fall into the categories related to cell signaling pathways. Using the STRING program, we found a total of 49 human Y-encoded proteins showing strong/medium interaction with each other. While using the Pathway studio software, we found that a total of 16 proteins interact with other chromosome-encoded proteins. In particular, the SRY protein interacted with 17 proteins encoded on other chromosomes. Additionally, we aligned the sequences of human Y-encoded proteins with the sequences of chimpanzee and mouse Y-encoded proteins using the NCBI BLAST program. This analysis resulted in a significant number of orthologous proteins between human, chimpanzee, and mouse. Collectively, our findings provide the scientific community with additional information on the human Y chromosome-encoded proteins.

  10. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome.

    PubMed

    Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M; Aldana-Montes, José F; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M Gonzalo

    2015-01-01

    Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we presented an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. It results in a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066

  11. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome

    PubMed Central

    Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J.; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M.; Aldana-Montes, José F.; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M. Gonzalo

    2015-01-01

    Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we presented an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. It results in a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066

  12. ReprOlive: a database with linked data for the olive tree (Olea europaea L.) reproductive transcriptome.

    PubMed

    Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M; Aldana-Montes, José F; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M Gonzalo

    2015-01-01

    Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we presented an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. It results in a reproductive transcriptome comprising 72,846 contigs with average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species.

  13. Analysis of Polygala tenuifolia Transcriptome and Description of Secondary Metabolite Biosynthetic Pathways by Illumina Sequencing

    PubMed Central

    Tian, Hongling; Xu, Xiaoshuang; Zhang, Fusheng; Wang, Yaoqin; Guo, Shuhong; Qin, Xuemei; Du, Guanhua

    2015-01-01

    Radix polygalae, the dried roots of Polygala tenuifolia and P. sibirica, is one of the most well-known traditional Chinese medicinal plants. Radix polygalae contains various saponins, xanthones, and oligosaccharide esters and these compounds are responsible for several pharmacological properties. To provide basic breeding information, enhance molecular biological analysis, and determine secondary metabolite biosynthetic pathways of P. tenuifolia, we applied Illumina sequencing technology and de novo assembly. We also applied this technique to gain an overview of P. tenuifolia transcriptome from samples with different years. Using Illumina sequencing, approximately 67.2% of unique sequences were annotated by basic local alignment search tool similarity searches against public sequence databases. We classified the annotated unigenes by using Nr, Nt, GO, COG, and KEGG databases compared with NCBI. We also obtained many candidates CYP450s and UGTs by the analysis of genes in the secondary metabolite biosynthetic pathways, including putative terpenoid backbone and phenylpropanoid biosynthesis pathway. With this transcriptome sequencing, future genetic and genomics studies related to the molecular mechanisms associated with the chemical composition of P. tenuifolia may be improved. Genes involved in the enrichment of secondary metabolite biosynthesis-related pathways could enhance the potential applications of P. tenuifolia in pharmaceutical industries. PMID:26543847

  14. Pathway-Based Genome-Wide Association Studies for Two Meat Production Traits in Simmental Cattle

    PubMed Central

    Fan, Huizhong; Wu, Yang; Zhou, Xiaojing; Xia, Jiangwei; Zhang, Wengang; Song, Yuxin; Liu, Fei; Chen, Yan; Zhang, Lupei; Gao, Xue; Gao, Huijiang; Li, Junya

    2015-01-01

    Most single nucleotide polymorphisms (SNPs) detected by genome-wide association studies (GWAS), explain only a small fraction of phenotypic variation. Pathway-based GWAS were proposed to improve the proportion of genes for some human complex traits that could be explained by enriching a mass of SNPs within genetic groups. However, few attempts have been made to describe the quantitative traits in domestic animals. In this study, we used a dataset with approximately 7,700,000 SNPs from 807 Simmental cattle and analyzed live weight and longissimus muscle area using a modified pathway-based GWAS method to orthogonalise the highly linked SNPs within each gene using principal component analysis (PCA). As a result, of the 262 biological pathways of cattle collected from the KEGG database, the gamma aminobutyric acid (GABA)ergic synapse pathway and the non-alcoholic fatty liver disease (NAFLD) pathway were significantly associated with the two traits analyzed. The GABAergic synapse pathway was biologically applicable to the traits analyzed because of its roles in feed intake and weight gain. The proposed method had high statistical power and a low false discovery rate, compared to those of the smallest P-value and SNP set enrichment analysis methods. PMID:26672757

  15. Pathway modeling of microarray data: a case study of pathway activity changes in the testis following in utero exposure to dibutyl phthalate (DBP).

    PubMed

    Ovacik, Meric A; Sen, Banalata; Euling, Susan Y; Gaido, Kevin W; Ierapetritou, Marianthi G; Androulakis, Ioannis P

    2013-09-15

    Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may inform whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that Pathway activity analysis may be considered for a more comprehensive analysis of microarray data. PMID:20850466

  16. Using the Reactome Database

    PubMed Central

    Haw, Robin

    2012-01-01

    There is considerable interest in the bioinformatics community in creating pathway databases. The Reactome project (a collaboration between the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University Medical Center and the European Bioinformatics Institute) is one such pathway database and collects structured information on all the biological pathways and processes in the human. It is an expert-authored and peer-reviewed, curated collection of well-documented molecular reactions that span the gamut from simple intermediate metabolism to signaling pathways and complex cellular events. This information is supplemented with likely orthologous molecular reactions in mouse, rat, zebrafish, worm and other model organisms. This unit describes how to use the Reactome database to learn the steps of a biological pathway; navigate and browse through the Reactome database; identify the pathways in which a molecule of interest is involved; use the Pathway and Expression analysis tools to search the database for and visualize possible connections within user-supplied experimental data set and Reactome pathways; and the Species Comparison tool to compare human and model organism pathways. PMID:22700314

  17. De Novo Transcriptomic Analysis of Peripheral Blood Lymphocytes from the Chinese Goose: Gene Discovery and Immune System Pathway Description

    PubMed Central

    Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun

    2015-01-01

    Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology was applied in the genomic studies focused on novel gene discovery. Thus, this study described the transcriptome of the goose peripheral blood lymphocytes to identify immunity relevant genes. Principal Findings De novo transcriptome assembly of the goose peripheral blood lymphocytes was sequenced by Illumina-Solexa technology. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and the maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were comprised of three Gene Ontology (Go) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose. Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with

  18. Integrated interactive chart as a tool for teaching metabolic pathways.

    PubMed

    Kalogiannis, Stavros; Pagkalos, Ioannis; Koufoudakis, Panagiotis; Dashi, Ino; Pontikeri, Kyriaki; Christodoulou, Constantina

    2014-01-01

    An interactive chart of energy metabolism with didactic function, complementary to the already existing metabolic maps, located at the URL www.metpath.teithe.gr is being presented. The chart illustrates the major catabolic and biosynthetic pathways of glucose, fatty acids, and aminoacids, individually as well as in an integrated view. For every metabolite and reaction an information sheet may be presented at the side of the map as fancybox, containing chemical structural formulae, an external link to the KEGG database and links that lead to the reactions at which the produced metabolites may participate as reactants. The latter allows the user to navigate through metabolic reactions following a route similar to the metabolic flow of substances, while keeping track of the occurring chemical transformations. Simultaneously, users may observe how they move across the metabolic map, possibly along different pathways, thus enhancing the user's integrated perception of metabolism. The site has already been introduced in biochemistry lectures and the students evaluated it. Most students were helped a lot or more to understand individual pathways as well as their interconnections and they also found it pleasant and easy to navigate. The vast majority of the students considered its use in the classroom desired. The chart currently may be displayed in English and in Greek while more languages can be integrated in the future. The authors' view, in accordance to the users' perception, is that the presented site may offer biochemistry tutors and students a useful teaching aid.

  19. Transcriptome Analysis Reveals the Genetic Basis of the Resveratrol Biosynthesis Pathway in an Endophytic Fungus (Alternaria sp. MG1) Isolated from Vitis vinifera.

    PubMed

    Che, Jinxin; Shi, Junling; Gao, Zhenhong; Zhang, Yan

    2016-01-01

    Alternaria sp. MG1, an endophytic fungus previously isolated from Merlot grape, produces resveratrol from glucose, showing similar metabolic flux to the phenylpropanoid biosynthesis pathway, currently found solely in plants. In order to identify the resveratrol biosynthesis pathway in this strain at the gene level, de novo transcriptome sequencing was conducted using Illumina paired-end sequencing. A total of 22,954,434 high-quality reads were assembled into contigs and 18,570 unigenes were identified. Among these unigenes, 14,153 were annotated in the NCBI non-redundant protein database and 5341 were annotated in the Swiss-Prot database. After KEGG mapping, 2701 unigenes were mapped onto 115 pathways. Eighty-four unigenes were annotated in major pathways from glucose to resveratrol, coding 20 enzymes for glycolysis, 10 for phenylalanine biosynthesis, 4 for phenylpropanoid biosynthesis, and 4 for stilbenoid biosynthesis. Chalcone synthase was identified for resveratrol biosynthesis in this strain, due to the absence of stilbene synthase. All the identified enzymes indicated a reasonable biosynthesis pathway from glucose to resveratrol via glycolysis, phenylalanine biosynthesis, phenylpropanoid biosynthesis, and stilbenoid pathways. These results provide essential evidence for the occurrence of resveratrol biosynthesis in Alternaria sp. MG1 at the gene level, facilitating further elucidation of the molecular mechanisms involved in this strain's secondary metabolism. PMID:27588016

  20. Transcriptome Analysis Reveals the Genetic Basis of the Resveratrol Biosynthesis Pathway in an Endophytic Fungus (Alternaria sp. MG1) Isolated from Vitis vinifera

    PubMed Central

    Che, Jinxin; Shi, Junling; Gao, Zhenhong; Zhang, Yan

    2016-01-01

    Alternaria sp. MG1, an endophytic fungus previously isolated from Merlot grape, produces resveratrol from glucose, showing similar metabolic flux to the phenylpropanoid biosynthesis pathway, currently found solely in plants. In order to identify the resveratrol biosynthesis pathway in this strain at the gene level, de novo transcriptome sequencing was conducted using Illumina paired-end sequencing. A total of 22,954,434 high-quality reads were assembled into contigs and 18,570 unigenes were identified. Among these unigenes, 14,153 were annotated in the NCBI non-redundant protein database and 5341 were annotated in the Swiss-Prot database. After KEGG mapping, 2701 unigenes were mapped onto 115 pathways. Eighty-four unigenes were annotated in major pathways from glucose to resveratrol, coding 20 enzymes for glycolysis, 10 for phenylalanine biosynthesis, 4 for phenylpropanoid biosynthesis, and 4 for stilbenoid biosynthesis. Chalcone synthase was identified for resveratrol biosynthesis in this strain, due to the absence of stilbene synthase. All the identified enzymes indicated a reasonable biosynthesis pathway from glucose to resveratrol via glycolysis, phenylalanine biosynthesis, phenylpropanoid biosynthesis, and stilbenoid pathways. These results provide essential evidence for the occurrence of resveratrol biosynthesis in Alternaria sp. MG1 at the gene level, facilitating further elucidation of the molecular mechanisms involved in this strain's secondary metabolism. PMID:27588016

  1. Characterization of Differentially Expressed Genes Involved in Pathways Associated with Gastric Cancer

    PubMed Central

    Li, Hao; Yu, Beiqin; Li, Jianfang; Su, Liping; Yan, Min; Zhang, Jun; Li, Chen; Zhu, Zhenggang; Liu, Bingya

    2015-01-01

    To explore the patterns of gene expression in gastric cancer, a total of 26 paired gastric cancer and noncancerous tissues from patients were enrolled for gene expression microarray analyses. Limma methods were applied to analyze the data, and genes were considered to be significantly differentially expressed if the False Discovery Rate (FDR) value was < 0.01, P-value was <0.01 and the fold change (FC) was >2. Subsequently, Gene Ontology (GO) categories were used to analyze the main functions of the differentially expressed genes. According to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we found pathways significantly associated with the differential genes. Gene-Act network and co-expression network were built respectively based on the relationships among the genes, proteins and compounds in the database. 2371 mRNAs and 350 lncRNAs considered as significantly differentially expressed genes were selected for the further analysis. The GO categories, pathway analyses and the Gene-Act network showed a consistent result that up-regulated genes were responsible for tumorigenesis, migration, angiogenesis and microenvironment formation, while down-regulated genes were involved in metabolism. These results of this study provide some novel findings on coding RNAs, lncRNAs, pathways and the co-expression network in gastric cancer which will be useful to guide further investigation and target therapy for this disease. PMID:25928635

  2. Meta-All: a system for managing metabolic pathway information

    PubMed Central

    Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H

    2006-01-01

    Background Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. Conclusion META-ALL is a system for managing information about metabolic

  3. HypoxiaDB: a database of hypoxia-regulated proteins.

    PubMed

    Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala

    2013-01-01

    There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually-curated, non-redundant catalog of proteins whose expressions are shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72 000 manually curated entries taken on 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and duration of hypoxia. Also, it provides manually curated literature references to support the inclusion of the protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway, protein-protein interactions, protein family (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia-proteins, which otherwise requires tedious computational analysis. This includes information like chromosomal location, identifiers like Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and Genbank accession numbers associated with the protein. These are further cross-linked to respective public databases augmenting HypoxiaDB to the external repositories. (iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for

  4. Integrated miRNA–risk gene–pathway pair network analysis provides prognostic biomarkers for gastric cancer

    PubMed Central

    Cai, Hui; Xu, Jiping; Han, Yifang; Lu, Zhengmao; Han, Ting; Ding, Yibo; Ma, Liye

    2016-01-01

    Purpose This study aimed to identify molecular prognostic biomarkers for gastric cancer. Methods mRNA and miRNA expression profiles of eligible gastric cancer and control samples were downloaded from Gene Expression Omnibus to screen the differentially expressed genes (DEGs) and differentially expressed miRNAs (DEmiRs), using MetaDE and limma packages, respectively. Target genes of the DEmiRs were also collected from both predictive and experimentally validated target databases of miRNAs. The overlapping genes between selected targets and DEGs were identified as risk genes, followed by functional enrichment analysis. Human pathways and their corresponding genes were downloaded from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database for the expression analysis of each pathway in gastric cancer samples. Next, co-pathway pairs were selected according to the Pearson correlation coefficients. Finally, the co-pathway pairs, miRNA–target pairs, and risk gene–pathway pairs were merged into a complex interaction network, the most important nodes (miRNAs/target genes/co-pathway pairs) of which were selected by calculating their degrees. Results Totally, 1,260 DEGs and 144 DEmiRs were identified. There were 336 risk genes found in the 9,572 miRNA–target pairs. Judging from the pathway expression files, 45 co-pathway pairs were screened out. There were 1,389 interactive pairs and 480 nodes in the integrated network. Among all nodes in the network, focal adhesion/extracellular matrix–receptor interaction pathways, CALM2, miR-19b, and miR-181b were the hub nodes with higher degrees. Conclusion CALM2, hsa-miR-19b, and hsa-miR-181b might be used as potential prognostic targets for gastric cancer. PMID:27284247

  5. Databases for Microbiologists

    DOE PAGES

    Zhulin, Igor B.

    2015-05-26

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. Finally, the purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.

  6. Databases for Microbiologists

    PubMed Central

    2015-01-01

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493

  7. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases

    PubMed Central

    2012-01-01

    Background Increasingly, metabolite and reaction information is organized in the form of genome-scale metabolic reconstructions that describe the reaction stoichiometry, directionality, and gene to protein to reaction associations. A key bottleneck in the pace of reconstruction of new, high-quality metabolic models is the inability to directly make use of metabolite/reaction information from biological databases or other models due to incompatibilities in content representation (i.e., metabolites with multiple names across databases and models), stoichiometric errors such as elemental or charge imbalances, and incomplete atomistic detail (e.g., use of generic R-group or non-explicit specification of stereo-specificity). Description MetRxn is a knowledgebase that includes standardized metabolite and reaction descriptions by integrating information from BRENDA, KEGG, MetaCyc, Reactome.org and 44 metabolic models into a single unified data set. All metabolite entries have matched synonyms, resolved protonation states, and are linked to unique structures. All reaction entries are elementally and charge balanced. This is accomplished through the use of a workflow of lexicographic, phonetic, and structural comparison algorithms. MetRxn allows for the download of standardized versions of existing genome-scale metabolic models and the use of metabolic information for the rapid reconstruction of new ones. Conclusions The standardization in description allows for the direct comparison of the metabolite and reaction content between metabolic models and databases and the exhaustive prospecting of pathways for biotechnological production. This ever-growing dataset currently consists of over 76,000 metabolites participating in more than 72,000 reactions (including unresolved entries). MetRxn is hosted on a web-based platform that uses relational database models (MySQL). PMID:22233419

  8. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection

    PubMed Central

    Fernández-Suárez, Xosé M.; Rigden, Daniel J.; Galperin, Michael Y.

    2014-01-01

    The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI’s MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). PMID:24316579

  9. The 2014 Nucleic Acids Research Database Issue and an updated NAR online Molecular Biology Database Collection.

    PubMed

    Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y

    2014-01-01

    The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).

  10. Systematization of the protein sequence diversity in enzymes related to secondary metabolic pathways in plants, in the context of big data biology inspired by the KNApSAcK motorcycle database.

    PubMed

    Ikeda, Shun; Abe, Takashi; Nakamura, Yukiko; Kibinge, Nelson; Hirai Morita, Aki; Nakatani, Atsushi; Ono, Naoaki; Ikemura, Toshimichi; Nakamura, Kensuke; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

    2013-05-01

    Biology is increasingly becoming a data-intensive science with the recent progress of the omics fields, e.g. genomics, transcriptomics, proteomics and metabolomics. The species-metabolite relationship database, KNApSAcK Core, has been widely utilized and cited in metabolomics research, and chronological analysis of that research work has helped to reveal recent trends in metabolomics research. To meet the needs of these trends, the KNApSAcK database has been extended by incorporating a secondary metabolic pathway database called Motorcycle DB. We examined the enzyme sequence diversity related to secondary metabolism by means of batch-learning self-organizing maps (BL-SOMs). Initially, we constructed a map by using a big data matrix consisting of the frequencies of all possible dipeptides in the protein sequence segments of plants and bacteria. The enzyme sequence diversity of the secondary metabolic pathways was examined by identifying clusters of segments associated with certain enzyme groups in the resulting map. The extent of diversity of 15 secondary metabolic enzyme groups is discussed. Data-intensive approaches such as BL-SOM applied to big data matrices are needed for systematizing protein sequences. Handling big data has become an inevitable part of biology.

  11. Path2Models: large-scale generation of computational models from biochemical pathway maps

    PubMed Central

    2013-01-01

    Background Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data. Results To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps. Conclusions To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized. PMID:24180668

  12. A toolbox model of evolution of metabolic pathways on networks of arbitrary topology.

    PubMed

    Pang, Tin Yau; Maslov, Sergei

    2011-05-01

    In prokaryotic genomes the number of transcriptional regulators is known to be proportional to the square of the total number of protein-coding genes. A toolbox model of evolution was recently proposed to explain this empirical scaling for metabolic enzymes and their regulators. According to its rules, the metabolic network of an organism evolves by horizontal transfer of pathways from other species. These pathways are part of a larger "universal" network formed by the union of all species-specific networks. It remained to be understood, however, how the topological properties of this universal network influence the scaling law of functional content of genomes in the toolbox model. Here we answer this question by first analyzing the scaling properties of the toolbox model on arbitrary tree-like universal networks. We prove that critical branching topology, in which the average number of upstream neighbors of a node is equal to one, is both necessary and sufficient for quadratic scaling. We further generalize the rules of the model to incorporate reactions with multiple substrates/products as well as branched and cyclic metabolic pathways. To achieve its metabolic tasks, the new model employs evolutionary optimized pathways with minimal number of reactions. Numerical simulations of this realistic model on the universal network of all reactions in the KEGG database produced approximately quadratic scaling between the number of regulated pathways and the size of the metabolic network. To quantify the geometrical structure of individual pathways, we investigated the relationship between their number of reactions, byproducts, intermediate, and feedback metabolites. Our results validate and explain the ubiquitous appearance of the quadratic scaling for a broad spectrum of topologies of underlying universal metabolic networks. They also demonstrate why, in spite of "small-world" topology, real-life metabolic networks are characterized by a broad distribution of pathway

  13. Identification of hub genes and pathways associated with retinoblastoma based on co-expression network analysis.

    PubMed

    Wang, Q L; Chen, X; Zhang, M H; Shen, Q H; Qin, Z M

    2015-01-01

    The objective of this paper was to identify hub genes and pathways associated with retinoblastoma using centrality analysis of the co-expression network and pathway-enrichment analysis. The co-expression network of retinoblastoma was constructed by weighted gene co-expression network analysis (WGCNA) based on differentially expressed (DE) genes, and clusters were obtained through the molecular complex detection (MCODE) algorithm. Degree centrality analysis of the co-expression network was performed to explore hub genes present in retinoblastoma. Pathway-enrichment analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Validation of hub gene expression in retinoblastoma was performed by reverse transcription-polymerase chain reaction (RT-PCR) analysis. The co-expression network based on 221 DE genes between retinoblastoma and normal controls consisted of 210 nodes and 3965 edges, and 5 clusters of the network were evaluated. By assessing the centrality analysis of the co-expression network, 21 hub genes were identified, such as SNORD115-41, RASSF2, and SNORD115-44. According to RT-PCR analysis, 16 of the 21 hub genes were differently expressed, including RASSF2 and CDCA7, and 5 were not differently expressed in retinoblastoma compared to normal controls. Pathway analysis showed that genes in 2 clusters were enriched in 3 pathways: purine metabolism, p53 signaling pathway, and melanogenesis. In this study, we successfully identified 16 hub genes and 3 pathways associated with retinoblastoma, which may be potential biomarkers for early detection and therapy for retinoblastoma. PMID:26662407

  14. A Toolbox Model of Evolution of Metabolic Pathways on Networks of Arbitrary Topology

    SciTech Connect

    Maslov, S.; Pang, T.Y.

    2011-05-01

    In prokaryotic genomes the number of transcriptional regulators is known to be proportional to the square of the total number of protein-coding genes. A toolbox model of evolution was recently proposed to explain this empirical scaling for metabolic enzymes and their regulators. According to its rules, the metabolic network of an organism evolves by horizontal transfer of pathways from other species. These pathways are part of a larger 'universal' network formed by the union of all species-specific networks. It remained to be understood, however, how the topological properties of this universal network influence the scaling law of functional content of genomes in the toolbox model. Here we answer this question by first analyzing the scaling properties of the toolbox model on arbitrary tree-like universal networks. We prove that critical branching topology, in which the average number of upstream neighbors of a node is equal to one, is both necessary and sufficient for quadratic scaling. We further generalize the rules of the model to incorporate reactions with multiple substrates/products as well as branched and cyclic metabolic pathways. To achieve its metabolic tasks, the new model employs evolutionary optimized pathways with minimal number of reactions. Numerical simulations of this realistic model on the universal network of all reactions in the KEGG database produced approximately quadratic scaling between the number of regulated pathways and the size of the metabolic network. To quantify the geometrical structure of individual pathways, we investigated the relationship between their number of reactions, byproducts, intermediate, and feedback metabolites. Our results validate and explain the ubiquitous appearance of the quadratic scaling for a broad spectrum of topologies of underlying universal metabolic networks. They also demonstrate why, in spite of 'small-world' topology, real-life metabolic networks are characterized by a broad distribution of pathway

  15. EuDBase: An online resource for automated EST analysis pipeline (ESTFrontier) and database for red seaweed Eucheuma denticulatum.

    PubMed

    Hussein, Zeti Azura Mohamed; Loke, Kok Keong; Abidin, Rabiatul Adawiah Zainal; Othman, Roohaida

    2011-01-01

    Functional genomics has proven to be an efficient tool in identifying genes involved in various biological functions. However the availability of commercially important seaweed Eucheuma denticulatum functional resources is still limited. EuDBase is the first seaweed online repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stored 10,031 ESTs that are clustered and assembled into 2,275 unique transcripts (UT) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. Data was collected in MySQL database. Web interface is implemented using PHP and it allows browsing and querying EuDBase through search engine. Data is searchable via BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows the identification of sequences either using a simple text query or similarity search. The development of EuDBase is initiated to store, manage and analyze the E. denticulatum ESTs and to provide accumulative digital resources for the use of global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/.

  16. EuDBase: An online resource for automated EST analysis pipeline (ESTFrontier) and database for red seaweed Eucheuma denticulatum.

    PubMed

    Hussein, Zeti Azura Mohamed; Loke, Kok Keong; Abidin, Rabiatul Adawiah Zainal; Othman, Roohaida

    2011-01-01

    Functional genomics has proven to be an efficient tool in identifying genes involved in various biological functions. However the availability of commercially important seaweed Eucheuma denticulatum functional resources is still limited. EuDBase is the first seaweed online repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stored 10,031 ESTs that are clustered and assembled into 2,275 unique transcripts (UT) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. Data was collected in MySQL database. Web interface is implemented using PHP and it allows browsing and querying EuDBase through search engine. Data is searchable via BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows the identification of sequences either using a simple text query or similarity search. The development of EuDBase is initiated to store, manage and analyze the E. denticulatum ESTs and to provide accumulative digital resources for the use of global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/. PMID:22102771

  17. EuDBase: An online resource for automated EST analysis pipeline (ESTFrontier) and database for red seaweed Eucheuma denticulatum

    PubMed Central

    Hussein, Zeti Azura Mohamed; Loke, Kok Keong; Abidin, Rabiatul Adawiah Zainal; Othman, Roohaida

    2011-01-01

    Functional genomics has proven to be an efficient tool in identifying genes involved in various biological functions. However the availability of commercially important seaweed Eucheuma denticulatum functional resources is still limited. EuDBase is the first seaweed online repository that provides integrated access to ESTs of Eucheuma denticulatum generated from samples collected from Kudat and Semporna in Sabah, Malaysia. The database stored 10,031 ESTs that are clustered and assembled into 2,275 unique transcripts (UT) and 955 singletons. Raw data were automatically processed using ESTFrontier, an in-house automated EST analysis pipeline. Data was collected in MySQL database. Web interface is implemented using PHP and it allows browsing and querying EuDBase through search engine. Data is searchable via BLAST hit, domain search, Gene Ontology or KEGG Pathway. A user-friendly interface allows the identification of sequences either using a simple text query or similarity search. The development of EuDBase is initiated to store, manage and analyze the E. denticulatum ESTs and to provide accumulative digital resources for the use of global scientific community. EuDBase is freely available from http://www.inbiosis.ukm.my/eudbase/. PMID:22102771

  18. Altered molecular expression of the TLR4/NF-κB signaling pathway in mammary tissue of Chinese Holstein cattle with mastitis.

    PubMed

    Wu, Jie; Li, Lian; Sun, Yu; Huang, Shuai; Tang, Juan; Yu, Pan; Wang, Genlin

    2015-01-01

    Toll-like receptor 4 (TLR4) mediated activation of the nuclear transcription factor κB (NF-κB) signaling pathway by mastitis initiates expression of genes associated with inflammation and the innate immune response. In this study, the profile of mastitis-induced differential gene expression in the mammary tissue of Chinese Holstein cattle was investigated by Gene-Chip microarray and bioinformatics. The microarray results revealed that 79 genes associated with the TLR4/NF-κB signaling pathway were differentially expressed. Of these genes, 19 were up-regulated and 29 were down-regulated in mastitis tissue compared to normal, healthy tissue. Statistical analysis of transcript and protein level expression changes indicated that 10 genes, namely TLR4, MyD88, IL-6, and IL-10, were up-regulated, while, CD14, TNF-α, MD-2, IL-β, NF-κB, and IL-12 were significantly down-regulated in mastitis tissue in comparison with normal tissue. Analyses using bioinformatics database resources, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and the Gene Ontology Consortium (GO) for term enrichment analysis, suggested that these differently expressed genes implicate different regulatory pathways for immune function in the mammary gland. In conclusion, our study provides new evidence for better understanding the differential expression and mechanisms of the TLR4 /NF-κB signaling pathway in Chinese Holstein cattle with mastitis.

  19. Enchytraeus albidus Microarray: Enrichment, Design, Annotation and Database (EnchyBASE)

    PubMed Central

    Novais, Sara C.; Arrais, Joel; Lopes, Pedro; Vandenbrouck, Tine; De Coen, Wim; Roelofs, Dick; Soares, Amadeu M. V. M.; Amorim, Mónica J. B.

    2012-01-01

    Enchytraeus albidus (Oligochaeta) is an ecologically relevant species used as standard test organisms for risk assessment. Effects of stressors in this species are commonly determined at the population level using reproduction and survival as endpoints. The assessment of transcriptomic responses can be very useful e.g. to understand underlying mechanisms of toxicity with gene expression fingerprinting. In the present paper the following is being addressed: 1) development of suppressive subtractive hybridization (SSH) libraries enriched for differentially expressed genes after metal and pesticide exposures; 2) sequencing and characterization of all generated cDNA inserts; 3) development of a publicly available genomic database on E. albidus. A total of 2100 Expressed Sequence Tags (ESTs) were isolated, sequenced and assembled into 1124 clusters (947 singletons and 177 contigs). From these sequences, 41% matched known proteins in GenBank (BLASTX, e-value≤10-5) and 37% had at least one Gene Ontology (GO) term assigned. In total, 5.5% of the sequences were assigned to a metabolic pathway, based on KEGG. With this new sequencing information, an Agilent custom oligonucleotide microarray was designed, representing a potential tool for transcriptomic studies. EnchyBASE (http://bioinformatics.ua.pt/enchybase/) was developed as a web freely available database containing genomic information on E. albidus and will be further extended in the near future for other enchytraeid species. The database so far includes all ESTs generated for E. albidus from three cDNA libraries. This information can be downloaded and applied in functional genomics and transcription studies. PMID:22558086

  20. DGEM--a microarray gene expression database for primary human disease tissues.

    PubMed

    Xia, Yuni; Campen, Andrew; Rigsby, Dan; Guo, Ying; Feng, Xingdong; Su, Eric W; Palakal, Mathew; Li, Shuyu

    2007-01-01

    Gene expression patterns can reflect gene regulations in human tissues under normal or pathologic conditions. Gene expression profiling data from studies of primary human disease samples are particularly valuable since these studies often span many years in order to collect patient clinical information and achieve a large sample size. Disease-to-Gene Expression Mapper (DGEM) provides a beneficial community resource to access and analyze these data; it currently includes Affymetrix oligonucleotide array datasets for more than 40 human diseases and 1400 samples. The data are normalized to the same scale and stored in a relational database. A statistical-analysis pipeline was implemented to identify genes abnormally expressed in disease tissues or genes whose expressions are associated with clinical parameters such as cancer patient survival. Data-mining results can be queried through a web-based interface at http://dgem.dhcp.iupui.edu/. The query tool enables dynamic generation of graphs and tables that are further linked to major gene and pathway resources that connect the data to relevant biology, including Entrez Gene and Kyoto Encyclopedia of Genes and Genomes (KEGG). In summary, DGEM provides scientists and physicians a valuable tool to study disease mechanisms, to discover potential disease biomarkers for diagnosis and prognosis, and to identify novel gene targets for drug discovery. The source code is freely available for non-profit use, on request to the authors. PMID:17570735

  1. Biofuel Database

    National Institute of Standards and Technology Data Gateway

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  2. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  3. Study on the regulatory mechanism of the lipid metabolism pathways during chicken male germ cell differentiation based on RNA-seq.

    PubMed

    Zuo, Qisheng; Li, Dong; Zhang, Lei; Elsayed, Ahmed Kamel; Lian, Chao; Shi, Qingqing; Zhang, Zhentao; Zhu, Rui; Wang, Yinjie; Jin, Kai; Zhang, Yani; Li, Bichun

    2015-01-01

    Here, we explore the regulatory mechanism of lipid metabolic signaling pathways and related genes during differentiation of male germ cells in chickens, with the hope that better understanding of these pathways may improve in vitro induction. Fluorescence-activated cell sorting was used to obtain highly purified cultures of embryonic stem cells (ESCs), primitive germ cells (PGCs), and spermatogonial stem cells (SSCs). The total RNA was then extracted from each type of cell. High-throughput analysis methods (RNA-seq) were used to sequence the transcriptome of these cells. Gene Ontology (GO) analysis and the KEGG database were used to identify lipid metabolism pathways and related genes. Retinoic acid (RA), the end-product of the retinol metabolism pathway, induced in vitro differentiation of ESC into male germ cells. Quantitative real-time PCR (qRT-PCR) was used to detect changes in the expression of the genes involved in the retinol metabolic pathways. From the results of RNA-seq and the database analyses, we concluded that there are 328 genes in 27 lipid metabolic pathways continuously involved in lipid metabolism during the differentiation of ESC into SSC in vivo, including retinol metabolism. Alcohol dehydrogenase 5 (ADH5) and aldehyde dehydrogenase 1 family member A1 (ALDH1A1) are involved in RA synthesis in the cell. ADH5 was specifically expressed in PGC in our experiments and aldehyde dehydrogenase 1 family member A1 (ALDH1A1) persistently increased throughout development. CYP26b1, a member of the cytochrome P450 superfamily, is involved in the degradation of RA. Expression of CYP26b1, in contrast, decreased throughout development. Exogenous RA in the culture medium induced differentiation of ESC to SSC-like cells. The expression patterns of ADH5, ALDH1A1, and CYP26b1 were consistent with RNA-seq results. We conclude that the retinol metabolism pathway plays an important role in the process of chicken male germ cell differentiation.

  4. MetaMapR: pathway independent metabolomic network analysis incorporating unknowns

    PubMed Central

    Grapov, Dmitry; Wanichthanarak, Kwanjeera; Fiehn, Oliver

    2015-01-01

    Summary: Metabolic network mapping is a widely used approach for integration of metabolomic experimental results with biological domain knowledge. However, current approaches can be limited by biochemical domain or pathway knowledge which results in sparse disconnected graphs for real world metabolomic experiments. MetaMapR integrates enzymatic transformations with metabolite structural similarity, mass spectral similarity and empirical associations to generate richly connected metabolic networks. This open source, web-based or desktop software, written in the R programming language, leverages KEGG and PubChem databases to derive associations between metabolites even in cases where biochemical domain or molecular annotations are unknown. Network calculation is enhanced through an interface to the Chemical Translation System, which allows metabolite identifier translation between >200 common biochemical databases. Analysis results are presented as interactive visualizations or can be exported as high-quality graphics and numerical tables which can be imported into common network analysis and visualization tools. Availability and Implementation: Freely available at http://dgrapov.github.io/MetaMapR/. Requires R and a modern web browser. Installation instructions, tutorials and application examples are available at http://dgrapov.github.io/MetaMapR/. Contact: ofiehn@ucdavis.edu PMID:25847005

  5. The Importance of Biological Databases in Biological Discovery.

    PubMed

    Baxevanis, Andreas D; Bateman, Alex

    2015-01-01

    Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. PMID:26094768

  6. [Analysis of oil synthesis metabolism pathways based on transcriptome changes in tung oil tree's seeds during three different development stages].

    PubMed

    Chen, Hao; Jiang, Gui-Xiong; Long, Hong-Xu; Tan, Xiao-Feng

    2013-12-01

    Tung oil tree (Verniciafordii) is one of the important woody oil plants in China. Past researches on tung oil tree mainly focu on the cultivation and conventional breeding while the molecular mechanisms related to tung oil synthesis are still uncovered. We compared transcriptome of tung oil tree's seeds at three different oil synthesis stages using RNA-seq technology and then obtained a lot of differentially expressed Unigenes. Through GO classification and pathway enrichment analysis, all of these differentially expressed Unigenes were classified into 128 metabolism pathways including fatty acid biosynthesis and glycerophospholipid metabolism which are involved in oil synthesis. Some homologous proteins of key enzymes were obtained when the sequences of the Unigenes within these two pathways were aligned against KEGG database. Through analysis of expression profiles of these key enzyme genes during seed's oil synthesis stage, this research not only shed light on elucidation of plant oil synthesis but also provides candidate genes for genetic improvement of tung oil tree thereby increasing the yield per unit area of tung oil tree.

  7. Pathway Based Analysis of Genes and Interactions Influencing Porcine Testis Samples from Boars with Divergent Androstenone Content in Back Fat

    PubMed Central

    Sahadevan, Sudeep; Gunawan, Asep; Tholen, Ernst; Große-Brinkhaus, Christine; Tesfaye, Dawit; Schellander, Karl; Hofmann-Apitius, Martin; Cinar, Mehmet Ulas; Uddin, Muhammad Jasim

    2014-01-01

    One of the primary factors contributing to boar taint is the level of androstenone in porcine adipose tissues. A majority of the studies performed to identify candidate biomarkers for the synthesis of androstenone in testis tissues follow a reductionist approach, identifying and studying the effect of biomarkers individually. Although these studies provide detailed information on individual biomarkers, a global picture of changes in metabolic pathways that lead to the difference in androstenone synthesis is still missing. The aim of this work was to identify major pathways and interactions influencing steroid hormone synthesis and androstenone biosynthesis using an integrative approach to provide a bird’s eye view of the factors causing difference in steroidogenesis and androstenone biosynthesis. For this purpose, we followed an analysis procedure merging together gene expression data from boars with divergent levels of androstenone and pathway mapping and interaction network retrieved from KEGG database. The interaction networks were weighted with Pearson correlation coefficients calculated from gene expression data and significant interactions and enriched pathways were identified based on these networks. The results show that 1,023 interactions were significant for high and low androstenone animals and that a total of 92 pathways were enriched for significant interactions. Although published articles show that a number of these enriched pathways were activated as a result of downstream signaling of steroid hormones, we speculate that the significant interactions in pathways such as glutathione metabolism, sphingolipid metabolism, fatty acid metabolism and significant interactions in cAMP-PKA/PKC signaling might be the key factors determining the difference in steroidogenesis and androstenone biosynthesis between boars with divergent androstenone levels in our study. The results and assumptions presented in this study are from an in-silico analysis done at the

  8. Pathway based analysis of genes and interactions influencing porcine testis samples from boars with divergent androstenone content in back fat.

    PubMed

    Sahadevan, Sudeep; Gunawan, Asep; Tholen, Ernst; Große-Brinkhaus, Christine; Tesfaye, Dawit; Schellander, Karl; Hofmann-Apitius, Martin; Cinar, Mehmet Ulas; Uddin, Muhammad Jasim

    2014-01-01

    One of the primary factors contributing to boar taint is the level of androstenone in porcine adipose tissues. A majority of the studies performed to identify candidate biomarkers for the synthesis of androstenone in testis tissues follow a reductionist approach, identifying and studying the effect of biomarkers individually. Although these studies provide detailed information on individual biomarkers, a global picture of changes in metabolic pathways that lead to the difference in androstenone synthesis is still missing. The aim of this work was to identify major pathways and interactions influencing steroid hormone synthesis and androstenone biosynthesis using an integrative approach to provide a bird's eye view of the factors causing difference in steroidogenesis and androstenone biosynthesis. For this purpose, we followed an analysis procedure merging together gene expression data from boars with divergent levels of androstenone and pathway mapping and interaction network retrieved from KEGG database. The interaction networks were weighted with Pearson correlation coefficients calculated from gene expression data and significant interactions and enriched pathways were identified based on these networks. The results show that 1,023 interactions were significant for high and low androstenone animals and that a total of 92 pathways were enriched for significant interactions. Although published articles show that a number of these enriched pathways were activated as a result of downstream signaling of steroid hormones, we speculate that the significant interactions in pathways such as glutathione metabolism, sphingolipid metabolism, fatty acid metabolism and significant interactions in cAMP-PKA/PKC signaling might be the key factors determining the difference in steroidogenesis and androstenone biosynthesis between boars with divergent androstenone levels in our study. The results and assumptions presented in this study are from an in-silico analysis done at the

  9. Systematic Pathway Enrichment Analysis of a Genome-Wide Association Study on Breast Cancer Survival Reveals an Influence of Genes Involved in Cell Adhesion and Calcium Signaling on the Patients’ Clinical Outcome

    PubMed Central

    Woltmann, Andrea; Chen, Bowang; Lascorz, Jesús; Johansson, Robert; Eyfjörd, Jorunn E.; Hamann, Ute; Manjer, Jonas; Enquist-Olsson, Kerstin; Henriksson, Roger; Herms, Stefan; Hoffmann, Per; Hemminki, Kari; Lenner, Per; Försti, Asta

    2014-01-01

    Genome-wide association studies (GWASs) may help to understand the effects of genetic polymorphisms on breast cancer (BC) progression and survival. However, they give only a focused view, which cannot capture the tremendous complexity of this disease. Therefore, we investigated data from a previously conducted GWAS on BC survival for enriched pathways by different enrichment analysis tools using the two main annotation databases Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). The goal was to identify the functional categories (GO terms and KEGG pathways) that are consistently overrepresented in a statistically significant way in the list of genes generated from the single nucleotide polymorphism (SNP) data. The SNPs with allelic p-value cut-offs 0.005 and 0.01 were annotated to the genes by excluding or including a 20 kb up-and down-stream sequence of the genes and analyzed by six different tools. We identified eleven consistently enriched categories, the most significant ones relating to cell adhesion and calcium ion binding. Moreover, we investigated the similarity between our GWAS and the enrichment analyses of twelve published gene expression signatures for breast cancer prognosis. Five of them were commonly used and commercially available, five were based on different aspects of metastasis formation and two were developed from meta-analyses of published prognostic signatures. This comparison revealed similarities between our GWAS data and the general and the specific brain metastasis gene signatures as well as the Oncotype DX signature. As metastasis formation is a strong indicator of a patient’s prognosis, this result reflects the survival aspect of the conducted GWAS and supports cell adhesion and calcium signaling as important pathways in cancer progression. PMID:24886783

  10. Integrative Pathway Analysis of Metabolic Signature in Bladder Cancer: A Linkage to The Cancer Genome Atlas Project and Prediction of Survival

    PubMed Central

    von Rundstedt, Friedrich-Carl; Rajapakshe, Kimal; Ma, Jing; Arnold, James M.; Gohlke, Jie; Putluri, Vasanta; Krishnapuram, Rashmi; Piyarathna, D. Badrajee; Lotan, Yair; Gödde, Daniel; Roth, Stephan; Störkel, Stephan; Levitt, Jonathan M.; Michailidis, George; Sreekumar, Arun; Lerner, Seth P.; Coarfa, Cristian; Putluri, Nagireddy

    2016-01-01

    Purpose We used targeted mass spectrometry to study the metabolic fingerprint of urothelial cancer and determine whether the biochemical pathway analysis gene signature would have a predictive value in independent cohorts of patients with bladder cancer. Materials and Methods Pathologically evaluated, bladder derived tissues, including benign adjacent tissue from 14 patients and bladder cancer from 46, were analyzed by liquid chromatography based targeted mass spectrometry. Differential metabolites associated with tumor samples in comparison to benign tissue were identified by adjusting the p values for multiple testing at a false discovery rate threshold of 15%. Enrichment of pathways and processes associated with the metabolic signature were determined using the GO (Gene Ontology) Database and MSigDB (Molecular Signature Database). Integration of metabolite alterations with transcriptome data from TCGA (The Cancer Genome Atlas) was done to identify the molecular signature of 30 metabolic genes. Available outcome data from TCGA portal were used to determine the association with survival. Results We identified 145 metabolites, of which analysis revealed 31 differential metabolites when comparing benign and tumor tissue samples. Using the KEGG (Kyoto Encyclopedia of Genes and Genomes) Database we identified a total of 174 genes that correlated with the altered metabolic pathways involved. By integrating these genes with the transcriptomic data from the corresponding TCGA data set we identified a metabolic signature consisting of 30 genes. The signature was significant in its prediction of survival in 95 patients with a low signature score vs 282 with a high signature score (p = 0.0458). Conclusions Targeted mass spectrometry of bladder cancer is highly sensitive for detecting metabolic alterations. Applying transcriptome data allows for integration into larger data sets and identification of relevant metabolic pathways in bladder cancer progression. PMID:26802582

  11. Database Manager

    ERIC Educational Resources Information Center

    Martin, Andrew

    2010-01-01

    It is normal practice today for organizations to store large quantities of records of related information as computer-based files or databases. Purposeful information is retrieved by performing queries on the data sets. The purpose of DATABASE MANAGER is to communicate to students the method by which the computer performs these queries. This…

  12. Maize databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This chapter is a succinct overview of maize data held in the species-specific database MaizeGDB (the Maize Genomics and Genetics Database), and selected multi-species data repositories, such as Gramene/Ensembl Plants, Phytozome, UniProt and the National Center for Biotechnology Information (NCBI), ...

  13. The EcoCyc Database

    PubMed Central

    Karp, Peter D.; Riley, Monica; Saier, Milton; Paulsen, Ian T.; Collado-Vides, Julio; Paley, Suzanne M.; Pellegrini-Toole, Alida; Bonavides, César; Gama-Castro, Socorro

    2002-01-01

    EcoCyc is an organism-specific pathway/genome database that describes the metabolic and signal-transduction pathways of Escherichia coli, its enzymes, its transport proteins and its mechanisms of transcriptional control of gene expression. EcoCyc is queried using the Pathway Tools graphical user interface, which provides a wide variety of query operations and visualization tools. EcoCyc is available at http://ecocyc.org/. PMID:11752253

  14. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  15. Transcriptome-based discovery of pathways and genes related to resistance against Fusarium head blight in wheat landrace Wangshuibai

    PubMed Central

    2013-01-01

    Background Fusarium head blight (FHB), caused mainly by Fusarium graminearum (Fg) Schwabe (teleomorph: Gibberellazeae Schwble), brings serious damage to wheat production. Chinese wheat landrace Wangshuibai is one of the most important resistance sources in the world. The knowledge of mechanism underlying its resistance to FHB is still limited. Results To get an overview of transcriptome characteristics of Wangshuibai during infection by Fg, a high-throughput RNA sequencing based on next generation sequencing (NGS) technology (Illumina) were performed. Totally, 165,499 unigenes were generated and assigned to known protein databases including NCBI non-redundant protein database (nr) (82,721, 50.0%), Gene Ontology (GO) (38,184, 23.1%), Swiss-Prot (50,702, 30.6%), Clusters of orthologous groups (COG) (51,566, 31.2%) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) (30,657, 18.5%), as determined by Blastx search. With another NGS based platform, a digital gene expression (DGE) system, gene expression in Wangshuibai and its FHB susceptible mutant NAUH117 was profiled and compared at two infection stages by inoculation of Fg at 24 and 48 hour, with the aim of identifying genes involved in FHB resistance. Conclusion Pathogen-related proteins such as PR5, PR14 and ABC transporter and JA signaling pathway were crucial for FHB resistance, especially that mediated by Fhb1. ET pathway and ROS/NO pathway were not activated in Wangshuibai and may be not pivotal in defense to FHB. Consistent with the fact that in NAUH117 there presented a chromosome fragment deletion, which led to its increased FHB susceptibility, in Wangshuibai, twenty out of eighty-nine genes showed changed expression patterns upon the infection of Fg. The up-regulation of eight of them was confirmed by qRT-PCR, revealing they may be candidate genes for Fhb1 and need further functional analysis to confirm their roles in FHB resistance. PMID:23514540

  16. Quantitative Proteogenomics and the Reconstruction of the Metabolic Pathway in Lactobacillus mucosae LM1

    PubMed Central

    Lee, Ji-Yoon

    2015-01-01

    Lactobacillus mucosae is a natural resident of the gastrointestinal tract of humans and animals and a potential probiotic bacterium. To understand the global protein expression profile and metabolic features of L. mucosae LM1 in the early stationary phase, the QExactiveTM Hybrid Quadrupole-Orbitrap Mass Spectrometer was used. Characterization of the intracellular proteome identified 842 proteins, accounting for approximately 35% of the 2,404 protein-coding sequences in the complete genome of L. mucosae LM1. Proteome quantification using QExactiveTM Orbitrap MS detected 19 highly abundant proteins (> 1.0% of the intracellular proteome), including CysK (cysteine synthase, 5.41%) and EF-Tu (elongation factor Tu, 4.91%), which are involved in cell survival against environmental stresses. Metabolic pathway annotation of LM1 proteome using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database showed that half of the proteins expressed are important for basic metabolic and biosynthetic processes, and the other half might be structurally important or involved in basic cellular processes. In addition, glycogen biosynthesis was activated in the early stationary phase, which is important for energy storage and maintenance. The proteogenomic data presented in this study provide a suitable reference to understand the protein expression pattern of lactobacilli in standard conditions. PMID:26761899

  17. Solubility Database

    National Institute of Standards and Technology Data Gateway

    SRD 106 IUPAC-NIST Solubility Database (Web, free access)   These solubilities are compiled from 18 volumes (Click here for List) of the International Union for Pure and Applied Chemistry(IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

  18. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties.

    PubMed

    Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra

    2015-01-01

    Senna (Cassia angustifolia Vahl.) is a world's natural laxative medicinal plant. Laxative properties are due to sennosides (anthraquinone glycosides) natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using Illumina MiSeq platform that resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with 'green plant database (txid 33090)', Swiss Prot, Kyoto Encylcopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against green plant database of NCBI. We used InterProscan to see protein similarity at domain level, a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS differentially expressing in both young and mature leaf libraries of which 2,343 (21.7%) CDS were up-regulated in young compared to mature leaf. Several differentially expressed genes found functionally associated with sennoside biosynthesis. CDS encoding for many CYPs and TF families were identified having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in various

  19. Next Generation Sequencing and Transcriptome Analysis Predicts Biosynthetic Pathway of Sennosides from Senna (Cassia angustifolia Vahl.), a Non-Model Plant with Potent Laxative Properties

    PubMed Central

    Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra

    2015-01-01

    Senna (Cassia angustifolia Vahl.) is a world’s natural laxative medicinal plant. Laxative properties are due to sennosides (anthraquinone glycosides) natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using Illumina MiSeq platform that resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with ‘green plant database (txid 33090)’, Swiss Prot, Kyoto Encylcopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against green plant database of NCBI. We used InterProscan to see protein similarity at domain level, a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to biosynthesis of sennosides were identified. A total of 10,763 CDS differentially expressing in both young and mature leaf libraries of which 2,343 (21.7%) CDS were up-regulated in young compared to mature leaf. Several differentially expressed genes found functionally associated with sennoside biosynthesis. CDS encoding for many CYPs and TF families were identified having probable roles in metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. We have identified a set of putative genes involved in various

  20. De Novo Transcriptome Analysis of Wing Development-Related Signaling Pathways in Locusta migratoria Manilensis and Ostrinia furnacalis (Guenée)

    PubMed Central

    Chu, Yuan; Zhang, Long; Shen, Jie; An, Chunju

    2014-01-01

    Background Orthopteran migratory locust, Locusta migratoria, and lepidopteran Asian corn borer, Ostrinia furnacalis, are two types of insects undergoing incomplete and complete metamorphosis, respectively. Identification of candidate genes regulating wing development in these two insects would provide insights into the further study about the molecular mechanisms controlling metamorphosis development. We have sequenced the transcriptome of O. furnacalis larvae previously. Here we sequenced and characterized the transcriptome of L. migratoria wing discs with special emphasis on wing development-related signaling pathways. Methodology/Principal Findings Illumina Hiseq2000 was used to sequence 8.38 Gb of the transcriptome from dissected nymphal wing discs. De novo assembly generated 91,907 unigenes with mean length of 610 nt. All unigenes were searched against five databases including Nt, Nr, Swiss-Prot, COG, and KEGG for annotations using blastn or blastx algorithm with an cut-off E-value of 10−5. A total of 23,359 (25.4%) unigenes have homologs within at least one database. Based on sequence similarity to homologs known to regulate Drosophila melanogaster wing development, we identified 50 and 46 potential wing development-related unigenes from L. migratoria and O. furnacalis transcriptome, respectively. The identified unigenes encode putative orthologs for nearly all components of the Hedgehog (Hh), Decapentaplegic (Dpp), Notch (N), and Wingless (Wg) signaling pathways, which are essential for growth and pattern formation during wing development. We investigated the expression profiles of the component genes involved in these signaling pathways in forewings and hind wings of L. migratoria and O. furnacalis. The results revealed the tested genes had different expression patterns in two insects. Conclusions/Significance This study provides the comprehensive sequence resource of the wing development-related signaling pathways of L. migratoria. The obtained data

  1. Changes in the Proteome of Langat-Infected Ixodes scapularis ISE6 Cells: Metabolic Pathways Associated with Flavivirus Infection

    PubMed Central

    Grabowski, Jeffrey M.; Perera, Rushika; Roumani, Ali M.; Hedrick, Victoria E.; Inerowicz, Halina D.; Hill, Catherine A.; Kuhn, Richard J.

    2016-01-01

    Background Ticks (Family Ixodidae) transmit a variety of disease causing agents to humans and animals. The tick-borne flaviviruses (TBFs; family Flaviviridae) are a complex of viruses, many of which cause encephalitis and hemorrhagic fever, and represent global threats to human health and biosecurity. Pathogenesis has been well studied in human and animal disease models. Equivalent analyses of tick-flavivirus interactions are limited and represent an area of study that could reveal novel approaches for TBF control. Methodology/Principal Findings High resolution LC-MS/MS was used to analyze the proteome of Ixodes scapularis (Lyme disease tick) embryonic ISE6 cells following infection with Langat virus (LGTV) and identify proteins associated with viral infection and replication. Maximal LGTV infection of cells and determination of peak release of infectious virus, was observed at 36 hours post infection (hpi). Proteins were extracted from ISE6 cells treated with LGTV and non-infectious (UV inactivated) LGTV at 36 hpi and analyzed by mass spectrometry. The Omics Discovery Pipeline (ODP) identified thousands of MS peaks. Protein homology searches against the I. scapularis IscaW1 genome assembly identified a total of 486 proteins that were subsequently assigned to putative functional pathways using searches against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. 266 proteins were differentially expressed following LGTV infection relative to non-infected (mock) cells. Of these, 68 proteins exhibited increased expression and 198 proteins had decreased expression. The majority of the former were classified in the KEGG pathways: “translation”, “amino acid metabolism”, and “protein folding/sorting/degradation”. Finally, Trichostatin A and Oligomycin A increased and decreased LGTV replication in vitro in ISE6 cells, respectively. Conclusions/Significance Proteomic analyses revealed ISE6 proteins that were differentially expressed at the peak of LGTV

  2. Insilico analysis of hypothetical proteins unveils putative metabolic pathways and essential genes in Leishmania donovani.

    PubMed

    Ravooru, Nithin; Ganji, Sandesh; Sathyanarayanan, Nitish; Nagendra, Holenarsipur G

    2014-01-01

    Leishmaniasis is a parasitic disease caused by the protozoan Leishmania, which is active in two broad forms namely, Visceral Leishmaniasis (VL or Kala Azar) and Cutaneous Leishmaniasis (CL). The disease is most prevalent in the tropical regions and poses a threat to over 70 countries across the globe. About 200 million people are estimated to be at risk of developing VL in the Indian subcontinent, and this refers to around 67% of the global VL disease burden. The Indian state of Bihar alone accounts for 50% of the total VL cases. While no vaccination exists, several pentavalent antimonials and drugs like Paromomycin, Amphotericin, Miltefosine etc. are used in the treatment of Leishmaniasis. However, due to their low efficacies and the resistance developed by the bug to these medications, there is an urgent need to look into newer species specific targets. The proteome information available suggests that among the 7960 proteins in Leishmania donavani, a staggering 65% remains classified as a hypothetical uncharacterized set. In this background, we have attempted to assign probable functions to these hypothetical sequences present in this parasite, to explore their plausible roles as druggable receptors. Thus, putative functions have been defined to 105 hypothetical proteins, which exhibited a GO term correlation and PFAM domain coverage of more than 50% over the query sequence length. Of these, 27 sequences were found to be associated with a reference pathway in KEGG as well. Further, using homology approaches, four pathways viz., Ubiquinone biosynthesis, Fatty acid elongation in Mitochondria, Fatty Acid Elongation in ER and Seleno-cysteine Metabolism have been reconstructed. In addition, 7 new putative essential genes have been mined with the help of Eukaryotic Database of Essential Genes (DEG). All these information related to pathways and essential genes indeed show promise for exploiting the select molecules as potential therapeutic targets. PMID:25206363

  3. Drinking Water Treatability Database (Database)

    EPA Science Inventory

    The drinking Water Treatability Database (TDB) will provide data taken from the literature on the control of contaminants in drinking water, and will be housed on an interactive, publicly-available USEPA web site. It can be used for identifying effective treatment processes, rec...

  4. Pathway analysis of genome-wide association study and transcriptome data highlights new biological pathways in colorectal cancer.

    PubMed

    Quan, Baoku; Qi, Xingsi; Yu, Zhihui; Jiang, Yongshuai; Liao, Mingzhi; Wang, Guangyu; Feng, Rennan; Zhang, Liangcai; Chen, Zugen; Jiang, Qinghua; Liu, Guiyou

    2015-04-01

    Colorectal cancer (CRC) is a common malignancy that meets the definition of a complex disease. Genome-wide association study (GWAS) has identified several loci of weak predictive value in CRC, however, these do not fully explain the occurrence risk. Recently, gene set analysis has allowed enhanced interpretation of GWAS data in CRC, identifying a number of metabolic pathways as important for disease pathogenesis. Whether there are other important pathways involved in CRC, however, remains unclear. We present a systems analysis of KEGG pathways in CRC using (1) a human CRC GWAS dataset and (2) a human whole transcriptome CRC case-control expression dataset. Analysis of the GWAS dataset revealed significantly enriched KEGG pathways related to metabolism, immune system and diseases, cellular processes, environmental information processing, genetic information processing, and neurodegenerative diseases. Altered gene expression was confirmed in these pathways using the transcriptome dataset. Taken together, these findings not only confirm previous work in this area, but also highlight new biological pathways whose deregulation is critical for CRC. These results contribute to our understanding of disease-causing mechanisms and will prove useful for future genetic and functional studies in CRC. PMID:25362561

  5. The Comparative Toxicogenomics Database facilitates identification and understanding of chemical-gene-disease associations: arsenic as a case study

    PubMed Central

    Davis, Allan P; Murphy, Cynthia G; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J

    2008-01-01

    Background The etiology of many chronic diseases involves interactions between environmental factors and genes that modulate physiological processes. Understanding interactions between environmental chemicals and genes/proteins may provide insights into the mechanisms of chemical actions, disease susceptibility, toxicity, and therapeutic drug interactions. The Comparative Toxicogenomics Database (CTD; ) provides these insights by curating and integrating data describing relationships between chemicals, genes/proteins, and human diseases. To illustrate the scope and application of CTD, we present an analysis of curated data for the chemical arsenic. Arsenic represents a major global environmental health threat and is associated with many diseases. The mechanisms by which arsenic modulates these diseases are not well understood. Methods Curated interactions between arsenic compounds and genes were downloaded using export and batch query tools at CTD. The list of genes was analyzed for molecular interactions, Gene Ontology (GO) terms, KEGG pathway annotations, and inferred disease relationships. Results CTD contains curated data from the published literature describing 2,738 molecular interactions between 21 different arsenic compounds and 1,456 genes and proteins. Analysis of these genes and proteins provide insight into the biological functions and molecular networks that are affected by exposure to arsenic, including stress response, apoptosis, cell cycle, and specific protein signaling pathways. Integrating arsenic-gene data with gene-disease data yields a list of diseases that may be associated with arsenic exposure and genes that may explain this association. Conclusion CTD data integration and curation strategies yield insight into the actions of environmental chemicals and provide a basis for developing hypotheses about the molecular mechanisms underlying the etiology of environmental diseases. While many reports describe the molecular response to arsenic, CTD

  6. Transcriptomic analysis of the head kidney of Topmouth culter (Culter alburnus) infected with Flavobacterium columnare with an emphasis on phagosome pathway.

    PubMed

    Zhao, Lijuan; Tu, Jiagang; Zhang, Yulei; Wang, Jinfu; Yang, Ling; Wang, Weimin; Wu, Zaohe; Meng, Qinglei; Lin, Li

    2016-10-01

    Flavobacterium columnare (FC) has caused worldwide fish columnaris disease with high mortality and great economic losses in cultured fish, including Topmouth culter (Culter alburnus). However, the knowledge about the host factors involved in FC infection is little known. In this study, the transcriptomic profiles of the head kidney from Topmouth culter with or without FC infection were obtained using HiSeq™ 2500 (Illumina). Totally 79,641 unigenes with high quality were obtained. Among them, 4037 differently expressed genes, including 1217 up-regulated and 2820 down-regulated genes, were identified and enriched using databases of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). The differently expressed genes were mainly associated with pathways such as immune response, carbohydrate metabolism, amino acid metabolism, and lipid metabolism. Since phagocytosis is a central mechanism of innate immune response by host cells to defense against infectious agents, genes related to the phagosome pathway were scrutinized and 9 differently expressed phagosome-related genes were identified including 3 up-regulated and 6 down-regulated genes. Five of them were further validated by quantitative real-time polymerase chain reaction (qRT-PCR). This transcriptomic analysis of host genes in response to FC infection provides data towards understanding the infection mechanisms and will shed a new light on the prevention of columnaris. PMID:27601296

  7. Transcriptomic analysis of the head kidney of Topmouth culter (Culter alburnus) infected with Flavobacterium columnare with an emphasis on phagosome pathway.

    PubMed

    Zhao, Lijuan; Tu, Jiagang; Zhang, Yulei; Wang, Jinfu; Yang, Ling; Wang, Weimin; Wu, Zaohe; Meng, Qinglei; Lin, Li

    2016-10-01

    Flavobacterium columnare (FC) has caused worldwide fish columnaris disease with high mortality and great economic losses in cultured fish, including Topmouth culter (Culter alburnus). However, the knowledge about the host factors involved in FC infection is little known. In this study, the transcriptomic profiles of the head kidney from Topmouth culter with or without FC infection were obtained using HiSeq™ 2500 (Illumina). Totally 79,641 unigenes with high quality were obtained. Among them, 4037 differently expressed genes, including 1217 up-regulated and 2820 down-regulated genes, were identified and enriched using databases of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). The differently expressed genes were mainly associated with pathways such as immune response, carbohydrate metabolism, amino acid metabolism, and lipid metabolism. Since phagocytosis is a central mechanism of innate immune response by host cells to defense against infectious agents, genes related to the phagosome pathway were scrutinized and 9 differently expressed phagosome-related genes were identified including 3 up-regulated and 6 down-regulated genes. Five of them were further validated by quantitative real-time polymerase chain reaction (qRT-PCR). This transcriptomic analysis of host genes in response to FC infection provides data towards understanding the infection mechanisms and will shed a new light on the prevention of columnaris.

  8. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    DOE PAGES

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D.; Broadbelt, Linda J.; Hanson, Andrew D.; Fiehn, Oliver; Tyo, Keith E. J.; Henry, Christopher S.

    2015-08-28

    Metabolomics have proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likelymore » to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose

  9. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    SciTech Connect

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D.; Broadbelt, Linda J.; Hanson, Andrew D.; Fiehn, Oliver; Tyo, Keith E. J.; Henry, Christopher S.

    2015-08-28

    Metabolomics have proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. MINEs improve metabolomics peak identification as compared to general chemical databases whose results

  10. De novo Transcriptome Analysis of Sinapis alba in Revealing the Glucosinolate and Phytochelatin Pathways

    PubMed Central

    Zhang, Xiaohui; Liu, Tongjin; Duan, Mengmeng; Song, Jiangping; Li, Xixiang

    2016-01-01

    Sinapis alba is an important condiment crop and can also be used as a phytoremediation plant. Though it has important economic and agronomic values, sequence data, and the genetic tools are still rare in this plant. In the present study, a de novo transcriptome based on the transcriptions of leaves, stems, and roots was assembled for S. alba for the first time. The transcriptome contains 47,972 unigenes with a mean length of 1185 nt and an N50 of 1672 nt. Among these unigenes, 46,535 (97%) unigenes were annotated by at least one of the following databases: NCBI non-redundant (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, Gene Ontology (GO), and Clusters of Orthologous Groups of proteins (COGs). The tissue expression pattern profiles revealed that 3489, 1361, and 8482 unigenes were predominantly expressed in the leaves, stems, and roots of S. alba, respectively. Genes predominantly expressed in the leaf were enriched in photosynthesis- and carbon fixation-related pathways. Genes predominantly expressed in the stem were enriched in not only pathways related to sugar, ether lipid, and amino acid metabolisms but also plant hormone signal transduction and circadian rhythm pathways, while the root-dominant genes were enriched in pathways related to lignin and cellulose syntheses, involved in plant-pathogen interactions, and potentially responsible for heavy metal chelating, and detoxification. Based on this transcriptome, 14,727 simple sequence repeats (SSRs) were identified, and 12,830 pairs of primers were developed for 2522 SSR-containing unigenes. Additionally, the glucosinolate (GSL) and phytochelatin metabolic pathways, which give the characteristic flavor and the heavy metal tolerance of this plant, were intensively analyzed. The genes of aliphatic GSLs pathway were predominantly expressed in roots. The absence of aliphatic GSLs in leaf tissues was due to the shutdown of BCAT4, MAM1, and CYP79F1 expressions. Glutathione was extensively

  11. Orchidstra: An Integrated Orchid Functional Genomics Database

    PubMed Central

    Su, Chun-lin; Chao, Ya-Ting; Yen, Shao-Hua; Chen, Chun-Yi; Chen, Wan-Chieh; Chang, Yao-Chien Alex; Shih, Ming-Che

    2013-01-01

    A specialized orchid database, named Orchidstra (URL: http://orchidstra.abrc.sinica.edu.tw), has been constructed to collect, annotate and share genomic information for orchid functional genomics studies. The Orchidaceae is a large family of Angiosperms that exhibits extraordinary biodiversity in terms of both the number of species and their distribution worldwide. Orchids exhibit many unique biological features; however, investigation of these traits is currently constrained due to the limited availability of genomic information. Transcriptome information for five orchid species and one commercial hybrid has been included in the Orchidstra database. Altogether, these comprise >380,000 non-redundant orchid transcript sequences, of which >110,000 are protein-coding genes. Sequences from the transcriptome shotgun assembly (TSA) were obtained either from output reads from next-generation sequencing technologies assembled into contigs, or from conventional cDNA library approaches. An annotation pipeline using Gene Ontology, KEGG and Pfam was built to assign gene descriptions and functional annotation to protein-coding genes. Deep sequencing of small RNA was also performed for Phalaenopsis aphrodite to search for microRNAs (miRNAs), extending the information archived for this species to miRNA annotation, precursors and putative target genes. The P. aphrodite transcriptome information was further used to design probes for an oligonucleotide microarray, and expression profiling analysis was carried out. The intensities of hybridized probes derived from microarray assays of various tissues were incorporated into the database as part of the functional evidence. In the future, the content of the Orchidstra database will be expanded with transcriptome data and genomic information from more orchid species. PMID:23324169

  12. De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

    PubMed

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight of this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. Few genes of the alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community, interested in the molecular genetics and functional genomics of C. borivilianum.

  13. Stackfile Database

    NASA Technical Reports Server (NTRS)

    deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

    2013-01-01

    This software provides storage retrieval and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data [e.g., Gravity Recovery And Climate Experiment -- GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

  14. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

    PubMed

    Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143

  15. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms.

    PubMed

    Ortegon, Patricia; Poot-Hernández, Augusto C; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case.

  16. Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms

    PubMed Central

    Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya

    2015-01-01

    In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143

  17. Database Marketplace 2002: The Database Universe.

    ERIC Educational Resources Information Center

    Tenopir, Carol; Baker, Gayle; Robinson, William

    2002-01-01

    Reviews the database industry over the past year, including new companies and services, company closures, popular database formats, popular access methods, and changes in existing products and services. Lists 33 firms and their database services; 33 firms and their database products; and 61 company profiles. (LRW)

  18. A database of macromolecular motions.

    PubMed Central

    Gerstein, M; Krebs, W

    1998-01-01

    We describe a database of macromolecular motions meant to be of general use to the structural community. The database, which is accessible on the World Wide Web with an entry point at http://bioinfo.mbb.yale.edu/MolMovDB , attempts to systematize all instances of protein and nucleic acid movement for which there is at least some structural information. At present it contains >120 motions, most of which are of proteins. Protein motions are further classified hierarchically into a limited number of categories, first on the basis of size (distinguishing between fragment, domain and subunit motions) and then on the basis of packing. Our packing classification divides motions into various categories (shear, hinge, other) depending on whether or not they involve sliding over a continuously maintained and tightly packed interface. In addition, the database provides some indication about the evidence behind each motion (i.e. the type of experimental information or whether the motion is inferred based on structural similarity) and attempts to describe many aspects of a motion in terms of a standardized nomenclature (e.g. the maximum rotation, the residue selection of a fixed core, etc.). Currently, we use a standard relational design to implement the database. However, the complexity and heterogeneity of the information kept in the database makes it an ideal application for an object-relational approach, and we are moving it in this direction. Specifically, in terms of storing complex information, the database contains plausible representations for motion pathways, derived from restrained 3D interpolation between known endpoint conformations. These pathways can be viewed in a variety of movie formats, and the database is associated with a server that can automatically generate these movies from submitted coordinates. PMID:9722650

  19. Robust gene signatures from microarray data using genetic algorithms enriched with biological pathway keywords.

    PubMed

    Luque-Baena, R M; Urda, D; Gonzalo Claros, M; Franco, L; Jerez, J M

    2014-06-01

    Genetic algorithms are widely used in the estimation of expression profiles from microarrays data. However, these techniques are unable to produce stable and robust solutions suitable to use in clinical and biomedical studies. This paper presents a novel two-stage evolutionary strategy for gene feature selection combining the genetic algorithm with biological information extracted from the KEGG database. A comparative study is carried out over public data from three different types of cancer (leukemia, lung cancer and prostate cancer). Even though the analyses only use features having KEGG information, the results demonstrate that this two-stage evolutionary strategy increased the consistency, robustness and accuracy of a blind discrimination among relapsed and healthy individuals. Therefore, this approach could facilitate the definition of gene signatures for the clinical prognosis and diagnostic of cancer diseases in a near future. Additionally, it could also be used for biological knowledge discovery about the studied disease.

  20. Core Proteomic Analysis of Unique Metabolic Pathways of Salmonella enterica for the Identification of Potential Drug Targets

    PubMed Central

    2016-01-01

    Background Infections caused by Salmonella enterica, a Gram-negative facultative anaerobic bacteria belonging to the family of Enterobacteriaceae, are major threats to the health of humans and animals. The recent availability of complete genome data of pathogenic strains of the S. enterica gives new avenues for the identification of drug targets and drug candidates. We have used the genomic and metabolic pathway data to identify pathways and proteins essential to the pathogen and absent from the host. Methods We took the whole proteome sequence data of 42 strains of S. enterica and Homo sapiens along with KEGG-annotated metabolic pathway data, clustered proteins sequences using CD-HIT, identified essential genes using DEG database and discarded S. enterica homologs of human proteins in unique metabolic pathways (UMPs) and characterized hypothetical proteins with SVM-prot and InterProScan. Through this core proteomic analysis we have identified enzymes essential to the pathogen. Results The identification of 73 enzymes common in 42 strains of S. enterica is the real strength of the current study. We proposed all 73 unexplored enzymes as potential drug targets against the infections caused by the S. enterica. The study is comprehensive around S. enterica and simultaneously considered every possible pathogenic strain of S. enterica. This comprehensiveness turned the current study significant since, to the best of our knowledge it is the first subtractive core proteomic analysis of the unique metabolic pathways applied to any pathogen for the identification of drug targets. We applied extensive computational methods to shortlist few potential drug targets considering the druggability criteria e.g. Non-homologous to the human host, essential to the pathogen and playing significant role in essential metabolic pathways of the pathogen (i.e. S. enterica). In the current study, the subtractive proteomics through a novel approach was applied i.e. by considering only proteins

  1. Overlap in Bibliographic Databases.

    ERIC Educational Resources Information Center

    Hood, William W.; Wilson, Concepcion S.

    2003-01-01

    Examines the topic of Fuzzy Set Theory to determine the overlap of coverage in bibliographic databases. Highlights include examples of comparisons of database coverage; frequency distribution of the degree of overlap; records with maximum overlap; records unique to one database; intra-database duplicates; and overlap in the top ten databases.…

  2. De novo assembly and transcriptome analysis of the rubber tree (Hevea brasiliensis) and SNP markers development for rubber biosynthesis pathways.

    PubMed

    Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira

    2014-01-01

    Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg. is the primary source of natural rubber that is native to the Amazon rainforest. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads on the Illumina GAIIx platform. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome analysis was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular marker was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.

  3. bioDBnet: the biological database network

    PubMed Central

    Mudunuri, Uma; Che, Anney; Yi, Ming; Stephens, Robert M.

    2009-01-01

    Summary: bioDBnet is an online web resource that provides interconnected access to many types of biological databases. It has integrated many of the most commonly used biological databases and in its current state has 153 database identifiers (nodes) covering all aspects of biology including genes, proteins, pathways and other biological concepts. bioDBnet offers various ways to work with these databases including conversions, extensive database reports, custom navigation and has various tools to enhance the quality of the results. Importantly, the access to bioDBnet is updated regularly, providing access to the most recent releases of each individual database. Availability: http://biodbnet.abcc.ncifcrf.gov Contact: stephensr@mail.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online PMID:19129209

  4. Production of a reference transcriptome and transcriptomic database (PocilloporaBase) for the cauliflower coral, Pocillopora damicornis

    PubMed Central

    2011-01-01

    extensive transcriptomic data are publicly available. Through PocilloporaBase http://www.PocilloporaBase.org, one can obtain assembled contigs and raw reads and query the data according to a wide assortment of attributes including taxonomic origin, PFAM motif, KEGG pathway, and GO annotation. PMID:22126435

  5. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    SciTech Connect

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly (A){sup +} RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximate 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximate 20 times increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real

  6. The stem cell renewal and DNA damage response pathways are frequently altered in fibroepithelial tumors of breast in Indian patients.

    PubMed

    Mukherjee, Nupur; Islam, Md Saimul; Roychowdhury, Anirban; Bhattacharya, Rittwika; Chunder, Nilanjana; Bhattacharya, Nilanjana; Sinha, Satyabrata; Alam, Neyaz; Roy, Anup; Roychoudhury, Susanta; Panda, Chinmay Kumar

    2016-03-01

    Genetic and epigenetic alterations in genes associated with distinct cellular pathways were checked in fibroepithelial tumors, including fibroadenomas, benign and malignant phyllode and atypical ductal hyperplasia. A panel of 22 genes associated with different cellular pathways such as stem cell renewal (Wnt and Hedgehog), DNA damage response [homologous recombination (HR), mismatch repair (MMR) and nucleotide excision repair (NER)] and cell proliferation signaling pathway were tested. Alterations (genetic/epigenetic) of the genes associated with Wnt signaling pathway were detected in 100% (20/20) of the breast tumors for at least one out of the six Wnt antagonists tested. Frequent molecular alterations (57-64%) were detected in HR and MMR pathway and low frequency of alterations (8-25%) were seen in cell-proliferation and cell signaling pathways showing a differential pattern of alterations in different tumor types. The patterns of alterations, in particular the epigenetic alterations, differed little from that seen previously in breast carcinoma cells, suggesting epigenetic alterations to be an early event in the development of the tumors. In gene ontology analysis, it was evident that Wnt signaling pathway [GO: 0030111, Kegg: 04310], cell proliferation pathway [GO: 0008285] and pathways in cancer [Kegg: 05200] were significantly enriched by differentially altered genes in fibroadenoma and phyllode tumor types. All these results may provide a new breakthrough in early diagnosis, prognosis and treatment of these tumors. PMID:26774289

  7. Integrative data mining of high-throughput in vitro screens, in vivo data, and disease information to identify Adverse Outcome Pathway (AOP) signatures:ToxCast high-throughput screening data and Comparative Toxicogenomics Database (CTD) as a case study.

    EPA Science Inventory

    The Adverse Outcome Pathway (AOP) framework provides a systematic way to describe linkages between molecular and cellular processes and organism or population level effects. The current AOP assembly methods however, are inefficient. Our goal is to generate computationally-pr...

  8. Inference of miRNA targets using evolutionary conservation and pathway analysis

    PubMed Central

    Gaidatzis, Dimos; van Nimwegen, Erik; Hausser, Jean; Zavolan, Mihaela

    2007-01-01

    Background MicroRNAs have emerged as important regulatory genes in a variety of cellular processes and, in recent years, hundreds of such genes have been discovered in animals. In contrast, functional annotations are available only for a very small fraction of these miRNAs, and even in these cases only partially. Results We developed a general Bayesian method for the inference of miRNA target sites, in which, for each miRNA, we explicitly model the evolution of orthologous target sites in a set of related species. Using this method we predict target sites for all known miRNAs in flies, worms, fish, and mammals. By comparing our predictions in fly with a reference set of experimentally tested miRNA-mRNA interactions we show that our general method performs at least as well as the most accurate methods available to date, including ones specifically tailored for target prediction in fly. An important novel feature of our model is that it explicitly infers the phylogenetic distribution of functional target sites, independently for each miRNA. This allows us to infer species-specific and clade-specific miRNA targeting. We also show that, in long human 3' UTRs, miRNA target sites occur preferentially near the start and near the end of the 3' UTR. To characterize miRNA function beyond the predicted lists of targets we further present a method to infer significant associations between the sets of targets predicted for individual miRNAs and specific biochemical pathways, in particular those of the KEGG pathway database. We show that this approach retrieves several known functional miRNA-mRNA associations, and predicts novel functions for known miRNAs in cell growth and in development. Conclusion We have presented a Bayesian target prediction algorithm without any tunable parameters, that can be applied to sequences from any clade of species. The algorithm automatically infers the phylogenetic distribution of functional sites for each miRNA, and assigns a posterior

  9. Databases: Beyond the Basics.

    ERIC Educational Resources Information Center

    Whittaker, Robert

    This presented paper offers an elementary description of database characteristics and then provides a survey of databases that may be useful to the teacher and researcher in Slavic and East European languages and literatures. The survey focuses on commercial databases that are available, usable, and needed. Individual databases discussed include:…

  10. Reflective Database Access Control

    ERIC Educational Resources Information Center

    Olson, Lars E.

    2009-01-01

    "Reflective Database Access Control" (RDBAC) is a model in which a database privilege is expressed as a database query itself, rather than as a static privilege contained in an access control list. RDBAC aids the management of database access controls by improving the expressiveness of policies. However, such policies introduce new interactions…

  11. Human Mitochondrial Protein Database

    National Institute of Standards and Technology Data Gateway

    SRD 131 Human Mitochondrial Protein Database (Web, free access)   The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.

  12. KCF-S: KEGG Chemical Function and Substructure for improved interpretability and prediction in chemical bioinformatics

    PubMed Central

    2013-01-01

    Background In order to develop hypothesis on unknown metabolic pathways, biochemists frequently rely on literature that uses a free-text format to describe functional groups or substructures. In computational chemistry or cheminformatics, molecules are typically represented by chemical descriptors, i.e., vectors that summarize information on its various properties. However, it is difficult to interpret these chemical descriptors since they are not directly linked to the terminology of functional groups or substructures that the biochemists use. Methods In this study, we used KEGG Chemical Function (KCF) format to computationally describe biochemical substructures in seven attributes that resemble biochemists' way of dealing with substructures. Results We established KCF-S (KCF-and-Substructures) format as an additional structural information of KCF. Applying KCF-S revealed the specific appearance of substructures from various datasets of molecules that describes the characteristics of the respective datasets. Structure-based clustering of molecules using KCF-S resulted the clusters in which molecular weights and structures were less diverse than those obtained by conventional chemical fingerprints. We further applied KCF-S to find the pairs of molecules that are possibly converted to each other in enzymatic reactions, and KCF-S clearly improved predictive performance than that presented previously. Conclusions KCF-S defines biochemical substructures with keeping interpretability, suggesting the potential to apply more studies on chemical bioinformatics. KCF and KCF-S can be automatically converted from Molfile format, enabling to deal with molecules from any data sources. PMID:24564846

  13. Pathway-based factor analysis of gene expression data produces highly heritable phenotypes that associate with age.

    PubMed

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-03-09

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 "pathway phenotypes" that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold ([Formula: see text]). These phenotypes are more heritable ([Formula: see text]) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors.

  14. Pathway-Based Factor Analysis of Gene Expression Data Produces Highly Heritable Phenotypes That Associate with Age

    PubMed Central

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-01-01

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 “pathway phenotypes” that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold (P<5.38×10−5). These phenotypes are more heritable (h2=0.32) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors. PMID:25758824

  15. In silico enzymatic synthesis of a 400,000 compound biochemical database for nontargeted metabolomics.

    PubMed

    Menikarachchi, Lochana C; Hill, Dennis W; Hamdalla, Mai A; Mandoiu, Ion I; Grant, David F

    2013-09-23

    Current methods of structure identification in mass-spectrometry-based nontargeted metabolomics rely on matching experimentally determined features of an unknown compound to those of candidate compounds contained in biochemical databases. A major limitation of this approach is the relatively small number of compounds currently included in these databases. If the correct structure is not present in a database, it cannot be identified, and if it cannot be identified, it cannot be included in a database. Thus, there is an urgent need to augment metabolomics databases with rationally designed biochemical structures using alternative means. Here we present the In Vivo/In Silico Metabolites Database (IIMDB), a database of in silico enzymatically synthesized metabolites, to partially address this problem. The database, which is available at http://metabolomics.pharm.uconn.edu/iimdb/, includes ~23,000 known compounds (mammalian metabolites, drugs, secondary plant metabolites, and glycerophospholipids) collected from existing biochemical databases plus more than 400,000 computationally generated human phase-I and phase-II metabolites of these known compounds. IIMDB features a user-friendly web interface and a programmer-friendly RESTful web service. Ninety-five percent of the computationally generated metabolites in IIMDB were not found in any existing database. However, 21,640 were identical to compounds already listed in PubChem, HMDB, KEGG, or HumanCyc. Furthermore, the vast majority of these in silico metabolites were scored as biological using BioSM, a software program that identifies biochemical structures in chemical structure space. These results suggest that in silico biochemical synthesis represents a viable approach for significantly augmenting biochemical databases for nontargeted metabolomics applications.

  16. Transcriptome Analysis of Pig In Vivo, In Vitro–Fertilized, and Nuclear Transfer Blastocyst-Stage Embryos Treated with Histone Deacetylase Inhibitors Postfusion and Activation Reveals Changes in the Lysosomal Pathway

    PubMed Central

    Whitworth, Kristin M.; Mao, Jiude; Lee, Kiho; Spollen, William G.; Samuel, Melissa S.; Walters, Eric M.; Spate, Lee D.

    2015-01-01

    Abstract Genetically modified pigs are commonly created via somatic cell nuclear transfer (SCNT). Treatment of reconstructed embryos with histone deacetylase inhibitors (HDACi) immediately after activation improves cloning efficiency. The objective of this experiment was to evaluate the transcriptome of SCNT embryos treated with suberoylanilide hydroxamic acid (SAHA), 4-iodo-SAHA (ISAHA), or Scriptaid as compared to untreated SCNT, in vitro–fertilized (IVF), and in vivo (IVV) blastocyst-stage embryos. SAHA (10 μM) had the highest level of blastocyst development at 43.9%, and all treatments except 10 μM ISAHA had the same percentage of blastocyst development as Scriptaid (p<0.05). Two treatments, 1.0 μM ISAHA and 1.0 μM SAHA, had higher mean cell number than No HDACi treatment (p<0.021). Embryo transfers performed with 10 μM SAHA- and 1 μM ISAHA-treated embryos resulted in the birth of healthy piglets. GenBank accession numbers from up- and downregulated transcripts were loaded into the Database for Annotation, Visualization and Integrated Discovery to identify enriched biological themes. HDACi treatment yielded the highest enrichment for transcripts within the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway, lysosome. The mean intensity of LysoTracker was lower in IVV embryos compared to IVF and SCNT embryos (p<0.0001). SAHA and ISAHA can successfully be used to create healthy piglets from SCNT. PMID:26731590

  17. Physiological Information Database (PID)

    EPA Science Inventory

    EPA has developed a physiological information database (created using Microsoft ACCESS) intended to be used in PBPK modeling. The database contains physiological parameter values for humans from early childhood through senescence as well as similar data for laboratory animal spec...

  18. Network II Database

    1994-11-07

    The Oak Ridge National Laboratory (ORNL) Rail and Barge Network II Database is a representation of the rail and barge system of the United States. The network is derived from the Federal Rail Administration (FRA) rail database.

  19. THE ECOTOX DATABASE

    EPA Science Inventory

    The database provides chemical-specific toxicity information for aquatic life, terrestrial plants, and terrestrial wildlife. ECOTOX is a comprehensive ecotoxicology database and is therefore essential for providing and suppoirting high quality models needed to estimate population...

  20. Household Products Database: Pesticides

    MedlinePlus

    ... Names Types of Products Manufacturers Ingredients About the Database FAQ Product Recalls Help Glossary Contact Us More ... holders. Information is extracted from Consumer Product Information Database ©2001-2015 by DeLima Associates. All rights reserved. ...

  1. Aviation Safety Issues Database

    NASA Technical Reports Server (NTRS)

    Morello, Samuel A.; Ricks, Wendell R.

    2009-01-01

    The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis to making the database publically accessible and writing this report.

  2. Scopus database: a review

    PubMed Central

    Burnham, Judy F

    2006-01-01

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but complements each other. If a library can only afford one, choice must be based in institutional needs. PMID:16522216

  3. Scopus database: a review.

    PubMed

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but complements each other. If a library can only afford one, choice must be based in institutional needs.

  4. Mission and Assets Database

    NASA Technical Reports Server (NTRS)

    Baldwin, John; Zendejas, Silvino; Gutheinz, Sandy; Borden, Chester; Wang, Yeou-Fang

    2009-01-01

    Mission and Assets Database (MADB) Version 1.0 is an SQL database system with a Web user interface to centralize information. The database stores flight project support resource requirements, view periods, antenna information, schedule, and forecast results for use in mid-range and long-term planning of Deep Space Network (DSN) assets.

  5. Exploring metabolic pathway disruption in the subchronic phencyclidine model of schizophrenia with the Generalized Singular Value Decomposition

    PubMed Central

    2011-01-01

    (KEGG) metabolite pathway database) were altered in the PFC of PCP-treated rats. Several significant changes were discovered, notably: 1) neuroactive ligands active at glutamate and GABA receptors are disrupted in the PFC of PCP-treated animals, 2) glutamate dysfunction in these animals was not limited to compromised glutamatergic neurotransmission but also involves the disruption of metabolic pathways linked to glutamate; and 3) a specific series of purine reactions Xanthine ← Hypoxyanthine ↔ Inosine ← IMP → adenylosuccinate is also disrupted in the PFC of PCP-treated animals. Conclusions Network reordering via the GSVD provides a means to discover statistically validated differences in clustering between a pair of networks. In practice this analytical approach, when applied to metabolomic data, allows us to quantify the alterations in metabolic pathways between two experimental groups. With this new computational technique we identified metabolic pathway alterations that are consistent with known results. Furthermore, we discovered disruption in a novel series of purine reactions that may contribute to the PFC dysfunction and cognitive deficits seen in schizophrenia. PMID:21575198

  6. Development of SRS.php, a Simple Object Access Protocol-based library for data acquisition from integrated biological databases.

    PubMed

    Barbosa-Silva, A; Pafilis, E; Ortega, J M; Schneider, R

    2007-12-11

    Data integration has become an important task for biological database providers. The current model for data exchange among different sources simplifies the manner that distinct information is accessed by users. The evolution of data representation from HTML to XML enabled programs, instead of humans, to interact with biological databases. We present here SRS.php, a PHP library that can interact with the data integration Sequence Retrieval System (SRS). The library has been written using SOAP definitions, and permits the programmatic communication through webservices with the SRS. The interactions are possible by invoking the methods described in WSDL by exchanging XML messages. The current functions available in the library have been built to access specific data stored in any of the 90 different databases (such as UNIPROT, KEGG and GO) using the same query syntax format. The inclusion of the described functions in the source of scripts written in PHP enables them as webservice clients to the SRS server. The functions permit one to query the whole content of any SRS database, to list specific records in these databases, to get specific fields from the records, and to link any record among any pair of linked databases. The case study presented exemplifies the library usage to retrieve information regarding registries of a Plant Defense Mechanisms database. The Plant Defense Mechanisms database is currently being developed, and the proposal of SRS.php library usage is to enable the data acquisition for the further warehousing tasks related to its setup and maintenance.

  7. Pathways with PathWhiz

    PubMed Central

    Pon, Allison; Jewison, Timothy; Su, Yilu; Liang, Yongjie; Knox, Craig; Maciejewski, Adam; Wilson, Michael; Wishart, David S.

    2015-01-01

    PathWhiz (http://smpdb.ca/pathwhiz) is a web server designed to create colourful, visually pleasing and biologically accurate pathway diagrams that are both machine-readable and interactive. As a web server, PathWhiz is accessible from almost any place and compatible with essentially any operating system. It also houses a public library of pathways and pathway components that can be easily viewed and expanded upon by its users. PathWhiz allows users to readily generate biologically complex pathways by using a specially designed drawing palette to quickly render metabolites (including automated structure generation), proteins (including quaternary structures, covalent modifications and cofactors), nucleic acids, membranes, subcellular structures, cells, tissues and organs. Both small-molecule and protein/gene pathways can be constructed by combining multiple pathway processes such as reactions, interactions, binding events and transport activities. PathWhiz's pathway replication and propagation functions allow for existing pathways to be used to create new pathways or for existing pathways to be automatically propagated across species. PathWhiz pathways can be saved in BioPAX, SBGN-ML and SBML data exchange formats, as well as PNG, PWML, HTML image map or SVG images that can be viewed offline or explored using PathWhiz's interactive viewer. PathWhiz has been used to generate over 700 pathway diagrams for a number of popular databases including HMDB, DrugBank and SMPDB. PMID:25934797

  8. The NCBI Taxonomy database.

    PubMed

    Federhen, Scott

    2012-01-01

    The NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/taxonomy) is the standard nomenclature and classification repository for the International Nucleotide Sequence Database Collaboration (INSDC), comprising the GenBank, ENA (EMBL) and DDBJ databases. It includes organism names and taxonomic lineages for each of the sequences represented in the INSDC's nucleotide and protein sequence databases. The taxonomy database is manually curated by a small group of scientists at the NCBI who use the current taxonomic literature to maintain a phylogenetic taxonomy for the source organisms represented in the sequence databases. The taxonomy database is a central organizing hub for many of the resources at the NCBI, and provides a means for clustering elements within other domains of NCBI web site, for internal linking between domains of the Entrez system and for linking out to taxon-specific external resources on the web. Our primary purpose is to index the domain of sequences as conveniently as possible for our user community.

  9. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection.

    PubMed

    Rigden, Daniel J; Fernández-Suárez, Xosé M; Galperin, Michael Y

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases. PMID:26740669

  10. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection

    PubMed Central

    Rigden, Daniel J.; Fernández-Suárez, Xosé M.; Galperin, Michael Y.

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases. PMID:26740669

  11. The 2016 database issue of Nucleic Acids Research and an updated molecular biology database collection.

    PubMed

    Rigden, Daniel J; Fernández-Suárez, Xosé M; Galperin, Michael Y

    2016-01-01

    The 2016 Database Issue of Nucleic Acids Research starts with overviews of the resources provided by three major bioinformatics centers, the U.S. National Center for Biotechnology Information (NCBI), the European Bioinformatics Institute (EMBL-EBI) and Swiss Institute for Bioinformatics (SIB). Also included are descriptions of 62 new databases and updates on 95 databases that have been previously featured in NAR plus 17 previously described elsewhere. A number of papers in this issue deal with resources on nucleic acids, including various kinds of non-coding RNAs and their interactions, molecular dynamics simulations of nucleic acid structure, and two databases of super-enhancers. The protein database section features important updates on the EBI's Pfam, PDBe and PRIDE databases, as well as a variety of resources on pathways, metabolomics and metabolic modeling. This issue also includes updates on popular metagenomics resources, such as MG-RAST, EBI Metagenomics, and probeBASE, as well as a newly compiled Human Pan-Microbe Communities database. A significant fraction of the new and updated databases are dedicated to the genetic basis of disease, primarily cancer, and various aspects of drug research, including resources for patented drugs, their side effects, withdrawn drugs, and potential drug targets. A further six papers present updated databases of various antimicrobial and anticancer peptides. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/). The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been updated with the addition of 88 new resources and removal of 23 obsolete websites, which brought the current listing to 1685 databases.

  12. SENTRA, a database of signal transduction proteins.

    SciTech Connect

    D'Souza, M.; Romine, M. F.; Maltsev, N.; Mathematics and Computer Science; PNNL

    2000-01-01

    SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaroytic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

  13. An Introduction to Database Structure and Database Machines.

    ERIC Educational Resources Information Center

    Detweiler, Karen

    1984-01-01

    Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

  14. HMDB: the Human Metabolome Database

    PubMed Central

    Wishart, David S.; Tzur, Dan; Knox, Craig; Eisner, Roman; Guo, An Chi; Young, Nelson; Cheng, Dean; Jewell, Kevin; Arndt, David; Sawhney, Summit; Fung, Chris; Nikolai, Lisa; Lewis, Mike; Coutouly, Marie-Aude; Forsythe, Ian; Tang, Peter; Shrivastava, Savita; Jeroncic, Kevin; Stothard, Paul; Amegbey, Godwin; Block, David; Hau, David. D.; Wagner, James; Miniaci, Jessica; Clements, Melisa; Gebremedhin, Mulu; Guo, Natalie; Zhang, Ying; Duggan, Gavin E.; MacInnis, Glen D.; Weljie, Alim M.; Dowlatabadi, Reza; Bamforth, Fiona; Clive, Derrick; Greiner, Russ; Li, Liang; Marrie, Tom; Sykes, Brian D.; Vogel, Hans J.; Querengesser, Lori

    2007-01-01

    The Human Metabolome Database (HMDB) is currently the most complete and comprehensive curated collection of human metabolite and human metabolism data in the world. It contains records for more than 2180 endogenous metabolites with information gathered from thousands of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the HMDB also contains an extensive collection of experimental metabolite concentration data compiled from hundreds of mass spectra (MS) and Nuclear Magnetic resonance (NMR) metabolomic analyses performed on urine, blood and cerebrospinal fluid samples. This is further supplemented with thousands of NMR and MS spectra collected on purified, reference metabolites. Each metabolite entry in the HMDB contains an average of 90 separate data fields including a comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, biofluid concentrations, disease associations, pathway information, enzyme data, gene sequence data, SNP and mutation data as well as extensive links to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided. The HMDB is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. The HMDB is available at: PMID:17202168

  15. Thermodynamic Analysis of Biodegradation Pathways

    PubMed Central

    Finley, Stacey D.; Broadbelt, Linda J.

    2014-01-01

    Microorganisms provide a wealth of biodegradative potential in the reduction and elimination of xenobiotic compounds in the environment. One useful metric to evaluate potential biodegradation pathways is thermodynamic feasibility. However, experimental data for the thermodynamic properties of xenobiotics is scarce. The present work uses a group contribution method to study the thermodynamic properties of the University of Minnesota Biocatalysis/Biodegradation Database. The Gibbs free energies of formation and reaction are estimated for 914 compounds (81%) and 902 reactions (75%), respectively, in the database. The reactions are classified based on the minimum and maximum Gibbs free energy values, which accounts for uncertainty in the free energy estimates and a feasible concentration range relevant to biodegradation. Using the free energy estimates, the cumulative free energy change of 89 biodegradation pathways (51%) in the database could be estimated. A comparison of the likelihood of the biotransformation rules in the Pathway Prediction System and their thermodynamic feasibility was then carried out. This analysis revealed that when evaluating the feasibility of biodegradation pathways, it is important to consider the thermodynamic topology of the reactions in the context of the complete pathway. Group contribution is shown to be a viable tool for estimating, a priori, the thermodynamic feasibility and the relative likelihood of alternative biodegradation reactions. This work offers a useful tool to a broad range of researchers interested in estimating the feasibility of the reactions in existing or novel biodegradation pathways. PMID:19288443

  16. MetaMapp: mapping and visualizing metabolomic data by integrating information from biochemical pathways and chemical and mass spectral similarity

    PubMed Central

    2012-01-01

    Background Exposure to environmental tobacco smoke (ETS) leads to higher rates of pulmonary diseases and infections in children. To study the biochemical changes that may precede lung diseases, metabolomic effects on fetal and maternal lungs and plasma from rats exposed to ETS were compared to filtered air control animals. Genome- reconstructed metabolic pathways may be used to map and interpret dysregulation in metabolic networks. However, mass spectrometry-based non-targeted metabolomics datasets often comprise many metabolites for which links to enzymatic reactions have not yet been reported. Hence, network visualizations that rely on current biochemical databases are incomplete and also fail to visualize novel, structurally unidentified metabolites. Results We present a novel approach to integrate biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp) using KEGG reactant pair database, Tanimoto chemical and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time of flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that overall, fetal lung metabolism was more impaired than lungs and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for in lung development. Conclusions MetaMapp graphs efficiently

  17. The BioPAX community standard for pathway

    SciTech Connect

    Syed, Mustafa H

    2010-01-01

    Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. The rapid growth of the volume of pathway data has spurred the development of databases and computational tools to aid interpretation; however, use of these data is hampered by the current fragmentation of pathway information across many databases with incompatible formats. BioPAX, which was created through a community process, solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. Using BioPAX, millions of interactions, organized into thousands of pathways, from many organisms are available from a growing number of databases. This large amount of pathway data in a computable form will support visualization, analysis and biological discovery.

  18. BSD: the Biodegradative Strain Database.

    PubMed

    Urbance, John W; Cole, James; Saxman, Paul; Tiedje, James M

    2003-01-01

    The Biodegradative Strain Database (BSD) is a freely-accessible, web-based database providing detailed information on degradative bacteria and the hazardous substances that they degrade, including corresponding literature citations, relevant patents and links to additional web-based biological and chemical data. The BSD (http://bsd.cme.msu.edu) is being developed within the phylogenetic framework of the Ribosomal Database Project II (RDPII: http://rdp.cme.msu.edu/html) to provide a biological complement to the chemical and degradative pathway data of the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD: http://umbbd.ahc.umn.edu). Data is accessible through a series of strain, chemical and reference lists or by keyword search. The web site also includes on-line data submission and user survey forms to solicit user contributions and suggestions. The current release contains information on over 250 degradative bacterial strains and 150 hazardous substances. The transformation of xenobiotics and other environmentally toxic compounds by microorganisms is central to strategies for biocatalysis and the bioremediation of contaminated environments. However, practical, comprehensive, strain-level information on biocatalytic/biodegradative microbes is not readily available and is often difficult to compile. Similarly, for any given environmental contaminant, there is no single resource that can provide comparative information on the array of identified microbes capable of degrading the chemical. A web site that consolidates and cross-references strain, chemical and reference data related to biocatalysis, biotransformation, biodegradation and bioremediation would be an invaluable tool for academic and industrial researchers and environmental engineers.

  19. ITS-90 Thermocouple Database

    National Institute of Standards and Technology Data Gateway

    SRD 60 NIST ITS-90 Thermocouple Database (Web, free access)   Web version of Standard Reference Database 60 and NIST Monograph 175. The database gives temperature -- electromotive force (emf) reference functions and tables for the letter-designated thermocouple types B, E, J, K, N, R, S and T. These reference functions have been adopted as standards by the American Society for Testing and Materials (ASTM) and the International Electrotechnical Commission (IEC).

  20. 2010 Worldwide Gasification Database

    DOE Data Explorer

    The 2010 Worldwide Gasification Database describes the current world gasification industry and identifies near-term planned capacity additions. The database lists gasification projects and includes information (e.g., plant location, number and type of gasifiers, syngas capacity, feedstock, and products). The database reveals that the worldwide gasification capacity has continued to grow for the past several decades and is now at 70,817 megawatts thermal (MWth) of syngas output at 144 operating plants with a total of 412 gasifiers.

  1. Veterans Administration Databases

    Cancer.gov

    The Veterans Administration Information Resource Center provides database and informatics experts, customer service, expert advice, information products, and web technology to VA researchers and others.

  2. Databases for LDEF results

    NASA Technical Reports Server (NTRS)

    Bohnhoff-Hlavacek, Gail

    1992-01-01

    One of the objectives of the team supporting the LDEF Systems and Materials Special Investigative Groups is to develop databases of experimental findings. These databases identify the hardware flown, summarize results and conclusions, and provide a system for acknowledging investigators, tracing sources of data, and future design suggestions. To date, databases covering the optical experiments, and thermal control materials (chromic acid anodized aluminum, silverized Teflon blankets, and paints) have been developed at Boeing. We used the Filemaker Pro software, the database manager for the Macintosh computer produced by the Claris Corporation. It is a flat, text-retrievable database that provides access to the data via an intuitive user interface, without tedious programming. Though this software is available only for the Macintosh computer at this time, copies of the databases can be saved to a format that is readable on a personal computer as well. Further, the data can be exported to more powerful relational databases, capabilities, and use of the LDEF databases and describe how to get copies of the database for your own research.

  3. Potential hippocampal genes and pathways involved in Alzheimer's disease: a bioinformatic analysis.

    PubMed

    Zhang, L; Guo, X Q; Chu, J F; Zhang, X; Yan, Z R; Li, Y Z

    2015-06-29

    Alzheimer's disease (AD) is a neurodegenerative disor-der and the most common cause of dementia in elderly people. Nu-merous studies have focused on the dysregulated genes in AD, but the pathogenesis is still unknown. In this study, we explored critical hippocampal genes and pathways that might potentially be involved in the pathogenesis of AD. Four transcriptome datasets for the hip-pocampus of patients with AD were downloaded from ArrayExpress, and the gene signature was identified by integrated analysis of mul-tiple transcriptomes using novel genome-wide relative significance and genome-wide global significance models. A protein-protein interaction network was constructed, and five clusters were selected. The biologi-cal functions and pathways were identified by Gene Ontology and Kyo-to Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. A total of 6994 genes were screened, and the top 300 genes were subjected to further analysis. Four significant KEGG pathways were identified, including oxidative phosphorylation and Parkinson's disease, Huntington's disease, and Alzheimer's disease pathways. The hub network of cluster 1 with the highest average rank value was de-fined. The genes (NDUFB3, NDUFA9, NDUFV1, NDUFV2, NDUFS3, NDUFA10, COX7B, and UQCR1) were considered critical with high degree in cluster 1 as well as being shared by the four significant path-ways. The oxidative phosphorylation process was also involved in the other three pathways and is considered to be relevant to energy-related AD pathology in the hippocampus. This research provides a perspec-tive from which to explore critical genes and pathways for potential AD therapies.

  4. Deriving pathway maps from automated text analysis using a grammar-based approach.

    PubMed

    Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn

    2006-04-01

    We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases. PMID:16819797

  5. Database in Artificial Intelligence.

    ERIC Educational Resources Information Center

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  6. BioImaging Database

    SciTech Connect

    David Nix, Lisa Simirenko

    2006-10-25

    The Biolmaging Database (BID) is a relational database developed to store the data and meta-data for the 3D gene expression in early Drosophila embryo development on a cellular level. The schema was written to be used with the MySQL DBMS but with minor modifications can be used on any SQL compliant relational DBMS.

  7. Biological Macromolecule Crystallization Database

    National Institute of Standards and Technology Data Gateway

    SRD 21 Biological Macromolecule Crystallization Database (Web, free access)   The Biological Macromolecule Crystallization Database and NASA Archive for Protein Crystal Growth Data (BMCD) contains the conditions reported for the crystallization of proteins and nucleic acids used in X-ray structure determinations and archives the results of microgravity macromolecule crystallization studies.

  8. Online Database Searching Workbook.

    ERIC Educational Resources Information Center

    Littlejohn, Alice C.; Parker, Joan M.

    Designed primarily for use by first-time searchers, this workbook provides an overview of online searching. Following a brief introduction which defines online searching, databases, and database producers, five steps in carrying out a successful search are described: (1) identifying the main concepts of the search statement; (2) selecting a…

  9. Ionic Liquids Database- (ILThermo)

    National Institute of Standards and Technology Data Gateway

    SRD 147 Ionic Liquids Database- (ILThermo) (Web, free access)   IUPAC Ionic Liquids Database, ILThermo, is a free web research tool that allows users worldwide to access an up-to-date data collection from the publications on experimental investigations of thermodynamic, and transport properties of ionic liquids as well as binary and ternary mixtures containing ionic liquids.

  10. HIV Structural Database

    National Institute of Standards and Technology Data Gateway

    SRD 102 HIV Structural Database (Web, free access)   The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.

  11. Morchella MLST database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...

  12. Atomic Spectra Database (ASD)

    National Institute of Standards and Technology Data Gateway

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  13. First Look: TRADEMARKSCAN Database.

    ERIC Educational Resources Information Center

    Fernald, Anne Conway; Davidson, Alan B.

    1984-01-01

    Describes database produced by Thomson and Thomson and available on Dialog which contains over 700,000 records representing all active federal trademark registrations and applications for registrations filed in United States Patent and Trademark Office. A typical record, special features, database applications, learning to use TRADEMARKSCAN, and…

  14. Dictionary as Database.

    ERIC Educational Resources Information Center

    Painter, Derrick

    1996-01-01

    Discussion of dictionaries as databases focuses on the digitizing of The Oxford English dictionary (OED) and the use of Standard Generalized Mark-Up Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

  15. Structural Ceramics Database

    National Institute of Standards and Technology Data Gateway

    SRD 30 NIST Structural Ceramics Database (Web, free access)   The NIST Structural Ceramics Database (WebSCD) provides evaluated materials property data for a wide range of advanced ceramics known variously as structural ceramics, engineering ceramics, and fine ceramics.

  16. Build Your Own Database.

    ERIC Educational Resources Information Center

    Jacso, Peter; Lancaster, F. W.

    This book is intended to help librarians and others to produce databases of better value and quality, especially if they have had little previous experience in database construction. Drawing upon almost 40 years of experience in the field of information retrieval, this book emphasizes basic principles and approaches rather than in-depth and…

  17. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  18. Database Searching by Managers.

    ERIC Educational Resources Information Center

    Arnold, Stephen E.

    Managers and executives need the easy and quick access to business and management information that online databases can provide, but many have difficulty articulating their search needs to an intermediary. One possible solution would be to encourage managers and their immediate support staff members to search textual databases directly as they now…

  19. A Quality System Database

    NASA Technical Reports Server (NTRS)

    Snell, William H.; Turner, Anne M.; Gifford, Luther; Stites, William

    2010-01-01

    A quality system database (QSD), and software to administer the database, were developed to support recording of administrative nonconformance activities that involve requirements for documentation of corrective and/or preventive actions, which can include ISO 9000 internal quality audits and customer complaints.

  20. Assignment to database industy

    NASA Astrophysics Data System (ADS)

    Abe, Kohichiroh

    Various kinds of databases are considered to be essential part in future large sized systems. Information provision only by databases is also considered to be growing as the market becomes mature. This paper discusses how such circumstances have been built and will be developed from now on.

  1. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    SciTech Connect

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Ta , Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K.

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.

  2. FOAM (Functional Ontology Assignments for Metagenomes): a Hidden Markov Model (HMM) database with environmental focus.

    PubMed

    Prestat, Emmanuel; David, Maude M; Hultman, Jenni; Taş, Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D; Jumpponen, Ari; Tringe, Susannah G; Holman, Elizabeth; Mavromatis, Konstantinos; Jansson, Janet K

    2014-10-29

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. 'profiles') were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associated functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.

  3. Cascadia Tsunami Deposit Database

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce; Gelfenbaum, Guy; Peterson, Curt

    2003-01-01

    The Cascadia Tsunami Deposit Database contains data on the location and sedimentological properties of tsunami deposits found along the Cascadia margin. Data have been compiled from 52 studies, documenting 59 sites from northern California to Vancouver Island, British Columbia that contain known or potential tsunami deposits. Bibliographical references are provided for all sites included in the database. Cascadia tsunami deposits are usually seen as anomalous sand layers in coastal marsh or lake sediments. The studies cited in the database use numerous criteria based on sedimentary characteristics to distinguish tsunami deposits from sand layers deposited by other processes, such as river flooding and storm surges. Several studies cited in the database contain evidence for more than one tsunami at a site. Data categories include age, thickness, layering, grainsize, and other sedimentological characteristics of Cascadia tsunami deposits. The database documents the variability observed in tsunami deposits found along the Cascadia margin.

  4. ECMDB: the E. coli Metabolome Database.

    PubMed

    Guo, An Chi; Jewison, Timothy; Wilson, Michael; Liu, Yifeng; Knox, Craig; Djoumbou, Yannick; Lo, Patrick; Mandal, Rupasri; Krishnamurthy, Ram; Wishart, David S

    2013-01-01

    The Escherichia coli Metabolome Database (ECMDB, http://www.ecmdb.ca) is a comprehensively annotated metabolomic database containing detailed information about the metabolome of E. coli (K-12). Modelled closely on the Human and Yeast Metabolome Databases, the ECMDB contains >2600 metabolites with links to ∼1500 different genes and proteins, including enzymes and transporters. The information in the ECMDB has been collected from dozens of textbooks, journal articles and electronic databases. Each metabolite entry in the ECMDB contains an average of 75 separate data fields, including comprehensive compound descriptions, names and synonyms, chemical taxonomy, compound structural and physicochemical data, bacterial growth conditions and substrates, reactions, pathway information, enzyme data, gene/protein sequence data and numerous hyperlinks to images, references and other public databases. The ECMDB also includes an extensive collection of intracellular metabolite concentration data compiled from our own work as well as other published metabolomic studies. This information is further supplemented with thousands of fully assigned reference nuclear magnetic resonance and mass spectrometry spectra obtained from pure E. coli metabolites that we (and others) have collected. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of E. coli's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers but also to molecular biologists, systems biologists and individuals in the biotechnology industry.

  5. DIP: The Database of Interacting Proteins

    DOE Data Explorer

    The DIP Database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. By interaction, the DIP Database creators mean that two amino acid chains were experimentally identified to bind to each other. The database lists such pairs to aid those studying a particular protein-protein interaction but also those investigating entire regulatory and signaling pathways as well as those studying the organisation and complexity of the protein interaction network at the cellular level. The data stored within the DIP database were curated, both, manually by expert curators and also automatically using computational approaches that utilize the knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. It is a relational database that can be searched by protein, sequence, motif, article information, and pathBLAST. The website also serves as an access point to a number of projects related to DIP, such as LiveDIP, The Database of Ligand-Receptor Partners (DLRP) and JDIP. Users have free and open access to DIP after login. [Taken from the DIP Guide and the DIP website] (Specialized Interface) (Registration Required)

  6. ClonorESTdb: a comprehensive database for Clonorchis sinensis EST sequences

    PubMed Central

    2014-01-01

    Background Clonorchiasis, which is primarily caused by liver fluke (Platyhelminthes), is a fatal infectious disease that is mainly associated with bile duct malignancy and the subsequent development of cholangiocarcinoma. Thus, a genomic approach now represents an important step to further our knowledge of biology and the pathology of these parasites. The results of expressed sequence tags (ESTs) sequencing need to be well organized into databases to provide an integrated set of tools and functional information. Findings Here, the ClonorESTdb database represents a collection of Clonorchis sinensis ESTs that is intended as a resource for parasite functional genomics. A total of 55,736 successful EST sequences, which are cleaned and clustered into non-redundant 13,305 C. sinensis assembled EST sequences (6,497 clusters and 6,808 singletons), were obtained from three in-house prepared cDNA libraries of C. sinensis at different developmental stages. The assembled consensus sequences were annotated using the BLAST algorithm or/and hmm against NCBI NR, UniProt, KEGG and InterProScan. The ClonorESTdb database provides functional annotation, their expression profiles, tandem repeats and putative single nucleotide polymorphisms with utility tools such as local BLAST search and text retrieval. Conclusions This resource enables the researcher to identify and compare expression signatures under different biological stages and promotes ongoing parasite drug and vaccine development and biological research. Database URL: http://pathod.cdc.go.kr/clonorestdb/ PMID:24957044

  7. An extended bioreaction database that significantly improves reconstruction and analysis of genome-scale metabolic networks.

    PubMed

    Stelzer, Michael; Sun, Jibin; Kamphans, Tom; Fekete, Sándor P; Zeng, An-Ping

    2011-11-01

    The bioreaction database established by Ma and Zeng (Bioinformatics, 2003, 19, 270-277) for in silico reconstruction of genome-scale metabolic networks has been widely used. Based on more recent information in the reference databases KEGG LIGAND and Brenda, we upgrade the bioreaction database in this work by almost doubling the number of reactions from 3565 to 6851. Over 70% of the reactions have been manually updated/revised in terms of reversibility, reactant pairs, currency metabolites and error correction. For the first time, 41 spontaneous sugar mutarotation reactions are introduced into the biochemical database. The upgrade significantly improves the reconstruction of genome scale metabolic networks. Many gaps or missing biochemical links can be recovered, as exemplified with three model organisms Homo sapiens, Aspergillus niger, and Escherichia coli. The topological parameters of the constructed networks were also largely affected, however, the overall network structure remains scale-free. Furthermore, we consider the problem of computing biologically feasible shortest paths in reconstructed metabolic networks. We show that these paths are hard to compute and present solutions to find such paths in networks of small and medium size.

  8. Hazard Analysis Database Report

    SciTech Connect

    GRAMS, W.H.

    2000-12-28

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U S . Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3 ,4 , and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from the results of the hazard evaluations, and (2) Hazard Topography Database: Data from the system familiarization and hazard identification.

  9. Hazard Analysis Database Report

    SciTech Connect

    GAULT, G.W.

    1999-10-13

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for US Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for the Tank Waste Remediation System (TWRS) Final Safety Analysis Report (FSAR). The FSAR is part of the approved TWRS Authorization Basis (AB). This document describes, identifies, and defines the contents and structure of the TWRS FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The TWRS Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The database supports the preparation of Chapters 3,4, and 5 of the TWRS FSAR and the USQ process and consists of two major, interrelated data sets: (1) Hazard Evaluation Database--Data from the results of the hazard evaluations; and (2) Hazard Topography Database--Data from the system familiarization and hazard identification.

  10. Database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.

    1991-01-01

    A propagation researcher or a systems engineer who intends to use the results of a propagation experiment is generally faced with various database tasks such as the selection of the computer software, the hardware, and the writing of the programs to pass the data through the models of interest. This task is repeated every time a new experiment is conducted or the same experiment is carried out at a different location generating different data. Thus the users of this data have to spend a considerable portion of their time learning how to implement the computer hardware and the software towards the desired end. This situation may be facilitated considerably if an easily accessible propagation database is created that has all the accepted (standardized) propagation phenomena models approved by the propagation research community. Also, the handling of data will become easier for the user. Such a database construction can only stimulate the growth of the propagation research it if is available to all the researchers, so that the results of the experiment conducted by one researcher can be examined independently by another, without different hardware and software being used. The database may be made flexible so that the researchers need not be confined only to the contents of the database. Another way in which the database may help the researchers is by the fact that they will not have to document the software and hardware tools used in their research since the propagation research community will know the database already. The following sections show a possible database construction, as well as properties of the database for the propagation research.

  11. A computational platform to maintain and migrate manual functional annotations for BioCyc databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Model organism databases are an important resource for information on biological pathways and genomic data. Such databases represent the accumulation of biological data, some of which has been manually curated from literature. An essential feature of these databases is the continuing data integratio...

  12. International Comparisions Database

    National Institute of Standards and Technology Data Gateway

    International Comparisions Database (Web, free access)   The International Comparisons Database (ICDB) serves the U.S. and the Inter-American System of Metrology (SIM) with information based on Appendices B (International Comparisons), C (Calibration and Measurement Capabilities) and D (List of Participating Countries) of the Comit� International des Poids et Mesures (CIPM) Mutual Recognition Arrangement (MRA). The official source of the data is The BIPM key comparison database. The ICDB provides access to results of comparisons of measurements and standards organized by the consultative committees of the CIPM and the Regional Metrology Organizations.

  13. Phase Equilibria Diagrams Database

    National Institute of Standards and Technology Data Gateway

    SRD 31 NIST/ACerS Phase Equilibria Diagrams Database (PC database for purchase)   The Phase Equilibria Diagrams Database contains commentaries and more than 21,000 diagrams for non-organic systems, including those published in all 21 hard-copy volumes produced as part of the ACerS-NIST Phase Equilibria Diagrams Program (formerly titled Phase Diagrams for Ceramists): Volumes I through XIV (blue books); Annuals 91, 92, 93; High Tc Superconductors I & II; Zirconium & Zirconia Systems; and Electronic Ceramics I. Materials covered include oxides as well as non-oxide systems such as chalcogenides and pnictides, phosphates, salt systems, and mixed systems of these classes.

  14. JICST Factual Database

    NASA Astrophysics Data System (ADS)

    Suzuki, Kazuaki; Shimura, Kazuki; Monma, Yoshio; Sakamoto, Masao; Morishita, Hiroshi; Kanazawa, Kenji

    The Japan Information Center of Science and Technology (JICST) has started the on-line service of JICST/NRIM Materials Strength Database for Engineering Steels and Alloys (JICST ME) in this March (1990). This database has been developed under the joint research between JICST and the National Research Institute for Metals (NRIM). It provides material strength data (creep, fatigue, etc.) of engineering steels and alloys. It is able to search and display on-line, and to analyze the searched data statistically and plot the result on graphic display. The database system and the data in JICST ME are described.

  15. Hybrid Terrain Database

    NASA Technical Reports Server (NTRS)

    Arthur, Trey

    2006-01-01

    A prototype hybrid terrain database is being developed in conjunction with other databases and with hardware and software that constitute subsystems of aerospace cockpit display systems (known in the art as synthetic vision systems) that generate images to increase pilots' situation awareness and eliminate poor visibility as a cause of aviation accidents. The basic idea is to provide a clear view of the world around an aircraft by displaying computer-generated imagery derived from an onboard database of terrain, obstacle, and airport information.

  16. Dynameomics: A comprehensive database of protein dynamics

    PubMed Central

    van der Kamp, Marc W.; Schaeffer, Richard D.; Jonsson, Amanda L.; Scouras, Alexander D.; Simms, Andrew; Toofanny, Rudesh D.; Benson, Noah C.; Anderson, Peter C.; Merkley, Eric D.; Rysavy, Steve; Bromley, Denny; Beck, David A. C.; Daggett, Valerie

    2010-01-01

    Summary The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 1000 proteins, representing the majority of folds in globular proteins. These data are stored and organized using an innovative database approach, which can be mined to obtain both general and specific information about the dynamics and folding/unfolding of proteins, relevant subsets thereof, and individual proteins. Here we describe the project in general terms and the type of information contained in the database. Then we provide examples of mining the database for information relevant to protein folding, structure building, the effect of single-nucleotide polymorphisms, and drug design. The native state simulation data and corresponding analyses for the 100 most populated metafolds, together with related resources, are publicly accessible through www.dynameomics.org. PMID:20399180

  17. Large-scale annotation of small-molecule libraries using public databases.

    PubMed

    Zhou, Yingyao; Zhou, Bin; Chen, Kaisheng; Yan, S Frank; King, Frederick J; Jiang, Shumei; Winzeler, Elizabeth A

    2007-01-01

    While many large publicly accessible databases provide excellent annotation for biological macromolecules, the same is not true for small chemical compounds. Commercial data sources also fail to encompass an annotation interface for large numbers of compounds and tend to be cost prohibitive to be widely available to biomedical researchers. Therefore, using annotation information for the selection of lead compounds from a modern day high-throughput screening (HTS) campaign presently occurs only under a very limited scale. The recent rapid expansion of the NIH PubChem database provides an opportunity to link existing biological databases with compound catalogs and provides relevant information that potentially could improve the information garnered from large-scale screening efforts. Using the 2.5 million compound collection at the Genomics Institute of the Novartis Research Foundation (GNF) as a model, we determined that approximately 4% of the library contained compounds with potential annotation in such databases as PubChem and the World Drug Index (WDI) as well as related databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and ChemIDplus. Furthermore, the exact structure match analysis showed 32% of GNF compounds can be linked to third party databases via PubChem. We also showed annotations such as MeSH (medical subject headings) terms can be applied to in-house HTS databases in identifying signature biological inhibition profiles of interest as well as expediting the assay validation process. The automated annotation of thousands of screening hits in batch is becoming feasible and has the potential to play an essential role in the hit-to-lead decision making process.

  18. Cancer Metabolomics and the Human Metabolome Database

    PubMed Central

    Wishart, David S.; Mandal, Rupasri; Stanislaus, Avalyn; Ramirez-Gaona, Miguel

    2016-01-01

    The application of metabolomics towards cancer research has led to a renewed appreciation of metabolism in cancer development and progression. It has also led to the discovery of metabolite cancer biomarkers and the identification of a number of novel cancer causing metabolites. The rapid growth of metabolomics in cancer research is also leading to challenges. In particular, with so many cancer-associate metabolites being identified, it is often difficult to keep track of which compounds are associated with which cancers. It is also challenging to track down information on the specific pathways that particular metabolites, drugs or drug metabolites may be affecting. Even more frustrating are the difficulties associated with identifying metabolites from NMR or MS spectra. Fortunately, a number of metabolomics databases are emerging that are designed to address these challenges. One such database is the Human Metabolome Database (HMDB). The HMDB is currently the world’s largest and most comprehensive, organism-specific metabolomics database. It contains more than 40,000 metabolite entries, thousands of metabolite concentrations, >700 metabolic and disease-associated pathways, as well as information on dozens of cancer biomarkers. This review is intended to provide a brief summary of the HMDB and to offer some guidance on how it can be used in metabolomic studies of cancer. PMID:26950159

  19. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1994-05-27

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  20. Nuclear Science References Database

    SciTech Connect

    Pritychenko, B.; Běták, E.; Singh, B.; Totans, J.

    2014-06-15

    The Nuclear Science References (NSR) database together with its associated Web interface, is the world's only comprehensive source of easily accessible low- and intermediate-energy nuclear physics bibliographic information for more than 210,000 articles since the beginning of nuclear science. The weekly-updated NSR database provides essential support for nuclear data evaluation, compilation and research activities. The principles of the database and Web application development and maintenance are described. Examples of nuclear structure, reaction and decay applications are specifically included. The complete NSR database is freely available at the websites of the National Nuclear Data Center (http://www.nndc.bnl.gov/nsr) and the International Atomic Energy Agency (http://www-nds.iaea.org/nsr)

  1. Hawaii bibliographic database

    USGS Publications Warehouse

    Wright, T.L.; Takahashi, T.J.

    1998-01-01

    The Hawaii bibliographic database has been created to contain all of the literature, from 1779 to the present, pertinent to the volcanological history of the Hawaiian-Emperor volcanic chain. References are entered in a PC- and Macintosh-compatible EndNote Plus bibliographic database with keywords and abstracts or (if no abstract) with annotations as to content. Keywords emphasize location, discipline, process, identification of new chemical data or age determinations, and type of publication. The database is updated approximately three times a year and is available to upload from an ftp site. The bibliography contained 8460 references at the time this paper was submitted for publication. Use of the database greatly enhances the power and completeness of library searches for anyone interested in Hawaiian volcanism.

  2. Navigating public microarray databases.

    PubMed

    Penkett, Christopher J; Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources.

  3. Chemical Kinetics Database

    National Institute of Standards and Technology Data Gateway

    SRD 17 NIST Chemical Kinetics Database (Web, free access)   The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000.

  4. TREATABILITY DATABASE DESCRIPTION

    EPA Science Inventory

    The Drinking Water Treatability Database (TDB) presents referenced information on the control of contaminants in drinking water. It allows drinking water utilities, first responders to spills or emergencies, treatment process designers, research organizations, academics, regulato...

  5. THE CTEPP DATABASE

    EPA Science Inventory

    The CTEPP (Children's Total Exposure to Persistent Pesticides and Other Persistent Organic Pollutants) database contains a wealth of data on children's aggregate exposures to pollutants in their everyday surroundings. Chemical analysis data for the environmental media and ques...

  6. Requirements Management Database

    2009-08-13

    This application is a simplified and customized version of the RBA and CTS databases to capture federal, site, and facility requirements, link to actions that must be performed to maintain compliance with their contractual and other requirements.

  7. Steam Properties Database

    National Institute of Standards and Technology Data Gateway

    SRD 10 NIST/ASME Steam Properties Database (PC database for purchase)   Based upon the International Association for the Properties of Water and Steam (IAPWS) 1995 formulation for the thermodynamic properties of water and the most recent IAPWS formulations for transport and other properties, this updated version provides water properties over a wide range of conditions according to the accepted international standards.

  8. Database computing in HEP

    SciTech Connect

    Day, C.T.; Loken, S.; MacFarlane, J.F. ); May, E.; Lifka, D.; Lusk, E.; Price, L.E. ); Baden, A. . Dept. of Physics); Grossman, R.; Qin, X. . Dept. of Mathematics, Statistics and Computer Science); Cormell, L.; Leibold, P.; Liu, D

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors. I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototype based on relational and object-oriented databases of CDF data samples.

  9. Database computing in HEP

    NASA Technical Reports Server (NTRS)

    Day, C. T.; Loken, S.; Macfarlane, J. F.; May, E.; Lifka, D.; Lusk, E.; Price, L. E.; Baden, A.; Grossman, R.; Qin, X.

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  10. Querying genomic databases

    SciTech Connect

    Baehr, A.; Hagstrom, R.; Joerg, D.; Overbeek, R.

    1991-09-01

    A natural-language interface has been developed that retrieves genomic information by using a simple subset of English. The interface spares the biologist from the task of learning database-specific query languages and computer programming. Currently, the interface deals with the E. coli genome. It can, however, be readily extended and shows promise as a means of easy access to other sequenced genomic databases as well.

  11. Drinking Water Database

    NASA Technical Reports Server (NTRS)

    Murray, ShaTerea R.

    2004-01-01

    This summer I had the opportunity to work in the Environmental Management Office (EMO) under the Chemical Sampling and Analysis Team or CS&AT. This team s mission is to support Glenn Research Center (GRC) and EM0 by providing chemical sampling and analysis services and expert consulting. Services include sampling and chemical analysis of water, soil, fbels, oils, paint, insulation materials, etc. One of this team s major projects is the Drinking Water Project. This is a project that is done on Glenn s water coolers and ten percent of its sink every two years. For the past two summers an intern had been putting together a database for this team to record the test they had perform. She had successfully created a database but hadn't worked out all the quirks. So this summer William Wilder (an intern from Cleveland State University) and I worked together to perfect her database. We began be finding out exactly what every member of the team thought about the database and what they would change if any. After collecting this data we both had to take some courses in Microsoft Access in order to fix the problems. Next we began looking at what exactly how the database worked from the outside inward. Then we began trying to change the database but we quickly found out that this would be virtually impossible.

  12. The Halophile protein database.

    PubMed

    Sharma, Naveen; Farooqi, Mohammad Samir; Chaturvedi, Krishna Kumar; Lal, Shashi Bhushan; Grover, Monendra; Rai, Anil; Pandey, Pankaj

    2014-01-01

    Halophilic archaea/bacteria adapt to different salt concentration, namely extreme, moderate and low. These type of adaptations may occur as a result of modification of protein structure and other changes in different cell organelles. Thus proteins may play an important role in the adaptation of halophilic archaea/bacteria to saline conditions. The Halophile protein database (HProtDB) is a systematic attempt to document the biochemical and biophysical properties of proteins from halophilic archaea/bacteria which may be involved in adaptation of these organisms to saline conditions. In this database, various physicochemical properties such as molecular weight, theoretical pI, amino acid composition, atomic composition, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (Gravy) have been listed. These physicochemical properties play an important role in identifying the protein structure, bonding pattern and function of the specific proteins. This database is comprehensive, manually curated, non-redundant catalogue of proteins. The database currently contains 59 897 proteins properties extracted from 21 different strains of halophilic archaea/bacteria. The database can be accessed through link. Database URL: http://webapp.cabgrid.res.in/protein/

  13. Crude Oil Analysis Database

    DOE Data Explorer

    Shay, Johanna Y.

    The composition and physical properties of crude oil vary widely from one reservoir to another within an oil field, as well as from one field or region to another. Although all oils consist of hydrocarbons and their derivatives, the proportions of various types of compounds differ greatly. This makes some oils more suitable than others for specific refining processes and uses. To take advantage of this diversity, one needs access to information in a large database of crude oil analyses. The Crude Oil Analysis Database (COADB) currently satisfies this need by offering 9,056 crude oil analyses. Of these, 8,500 are United States domestic oils. The database contains results of analysis of the general properties and chemical composition, as well as the field, formation, and geographic location of the crude oil sample. [Taken from the Introduction to COAMDATA_DESC.pdf, part of the zipped software and database file at http://www.netl.doe.gov/technologies/oil-gas/Software/database.html] Save the zipped file to your PC. When opened, it will contain PDF documents and a large Excel spreadsheet. It will also contain the database in Microsoft Access 2002.

  14. Open systems and databases

    SciTech Connect

    Martire, G.S. ); Nuttall, D.J.H. )

    1993-05-01

    This paper is part of a series of papers invited by the IEEE POWER CONTROL CENTER WORKING GROUP concerning the changing designs of modern control centers. Papers invited by the Working Group discuss the following issues: Benefits of Openness, Criteria for Evaluating Open EMS Systems, Hardware Design, Configuration Management, Security, Project Management, Databases, SCADA, Inter- and Intra-System Communications and Man-Machine Interfaces,'' The goal of this paper is to provide an introduction to the issues pertaining to Open Systems and Databases.'' The intent is to assist understanding of some of the underlying factors that effect choices that must be made when selecting a database system for use in a control room environment. This paper describes and compares the major database information models which are in common use for database systems and provides an overview of SQL. A case for the control center community to follow the workings of the non-formal standards bodies is presented along with possible uses and the benefits of commercially available databases within the control center. The reasons behind the emergence of industry supported standards organizations such as the Open Software Foundation (OSF) and SQL Access are presented.

  15. The comprehensive peptaibiotics database.

    PubMed

    Stoppacher, Norbert; Neumann, Nora K N; Burgstaller, Lukas; Zeilinger, Susanne; Degenkolb, Thomas; Brückner, Hans; Schuhmacher, Rainer

    2013-05-01

    Peptaibiotics are nonribosomally biosynthesized peptides, which - according to definition - contain the marker amino acid α-aminoisobutyric acid (Aib) and possess antibiotic properties. Being known since 1958, a constantly increasing number of peptaibiotics have been described and investigated with a particular emphasis on hypocrealean fungi. Starting from the existing online 'Peptaibol Database', first published in 1997, an exhaustive literature survey of all known peptaibiotics was carried out and resulted in a list of 1043 peptaibiotics. The gathered information was compiled and used to create the new 'The Comprehensive Peptaibiotics Database', which is presented here. The database was devised as a software tool based on Microsoft (MS) Access. It is freely available from the internet at http://peptaibiotics-database.boku.ac.at and can easily be installed and operated on any computer offering a Windows XP/7 environment. It provides useful information on characteristic properties of the peptaibiotics included such as peptide category, group name of the microheterogeneous mixture to which the peptide belongs, amino acid sequence, sequence length, producing fungus, peptide subfamily, molecular formula, and monoisotopic mass. All these characteristics can be used and combined for automated search within the database, which makes The Comprehensive Peptaibiotics Database a versatile tool for the retrieval of valuable information about peptaibiotics. Sequence data have been considered as to December 14, 2012. PMID:23681723

  16. Specialist Bibliographic Databases

    PubMed Central

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  17. Specialist Bibliographic Databases.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  18. Specialist Bibliographic Databases.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  19. Great Basin paleontological database

    USGS Publications Warehouse

    Zhang, N.; Blodgett, R.B.; Hofstra, A.H.

    2008-01-01

    The U.S. Geological Survey has constructed a paleontological database for the Great Basin physiographic province that can be served over the World Wide Web for data entry, queries, displays, and retrievals. It is similar to the web-database solution that we constructed for Alaskan paleontological data (www.alaskafossil.org). The first phase of this effort was to compile a paleontological bibliography for Nevada and portions of adjacent states in the Great Basin that has recently been completed. In addition, we are also compiling paleontological reports (Known as E&R reports) of the U.S. Geological Survey, which are another extensive source of l,egacy data for this region. Initial population of the database benefited from a recently published conodont data set and is otherwise focused on Devonian and Mississippian localities because strata of this age host important sedimentary exhalative (sedex) Au, Zn, and barite resources and enormons Carlin-type An deposits. In addition, these strata are the most important petroleum source rocks in the region, and record the transition from extension to contraction associated with the Antler orogeny, the Alamo meteorite impact, and biotic crises associated with global oceanic anoxic events. The finished product will provide an invaluable tool for future geologic mapping, paleontological research, and mineral resource investigations in the Great Basin, making paleontological data acquired over nearly the past 150 yr readily available over the World Wide Web. A description of the structure of the database and the web interface developed for this effort are provided herein. This database is being used ws a model for a National Paleontological Database (which we am currently developing for the U.S. Geological Survey) as well as for other paleontological databases now being developed in other parts of the globe. ?? 2008 Geological Society of America.

  20. Implication of the immune system in Alzheimer's disease: evidence from genome-wide pathway analysis.

    PubMed

    Lambert, Jean-Charles; Grenier-Boley, Benjamin; Chouraki, Vincent; Heath, Simon; Zelenika, Diana; Fievet, Nathalie; Hannequin, Didier; Pasquier, Florence; Hanon, Olivier; Brice, Alexis; Epelbaum, Jacques; Berr, Claudine; Dartigues, Jean-Francois; Tzourio, Christophe; Campion, Dominique; Lathrop, Mark; Amouyel, Philippe

    2010-01-01

    The results of several genome-wide association studies (GWASs) in the field of Alzheimer's disease (AD) have recently been published. Although these studies reported in detail on single-nucleotide polymorphisms (SNPs) and the neighboring genes with the strongest evidence of association with AD, little attention was paid to the rest of the genome. However, complementary statistical and bio-informatics approaches now enable the extraction of pertinent information from other SNPs and/or genes which are only nominally associated with the disease risk. Two different tools (the ALIGATOR and GenGen/KEGG software packages) were used to analyze a large GWAS dataset containing 2,032 AD cases and 5,328 controls. Convergent outputs from the two gene set enrichment approaches suggested an immune system dysfunction in AD. Furthermore, although these statistical approaches did not adopt a priori hypotheses concerning a biological function's putative role in the disease process, genes associated with AD risk were overrepresented in the "Alzheimer's disease" KEGG pathway. In conclusion, a systematic search for biological pathways using GWAS data set seems to comfort the primary causes already suspected but may specifically highlight the importance of the immune system in AD.

  1. FishTraits Database

    USGS Publications Warehouse

    Angermeier, Paul L.; Frimpong, Emmanuel A.

    2009-01-01

    The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. FishTraits is a database of >100 traits for 809 (731 native and 78 exotic) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database contains information on four major categories of traits: (1) trophic ecology, (2) body size and reproductive ecology (life history), (3) habitat associations, and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status is also included. Together, we refer to the traits, distribution, and conservation status information as attributes. Descriptions of attributes are available here. Many sources were consulted to compile attributes, including state and regional species accounts and other databases.

  2. ADANS database specification

    SciTech Connect

    1997-01-16

    The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.

  3. NASA Records Database

    NASA Technical Reports Server (NTRS)

    Callac, Christopher; Lunsford, Michelle

    2005-01-01

    The NASA Records Database, comprising a Web-based application program and a database, is used to administer an archive of paper records at Stennis Space Center. The system begins with an electronic form, into which a user enters information about records that the user is sending to the archive. The form is smart : it provides instructions for entering information correctly and prompts the user to enter all required information. Once complete, the form is digitally signed and submitted to the database. The system determines which storage locations are not in use, assigns the user s boxes of records to some of them, and enters these assignments in the database. Thereafter, the software tracks the boxes and can be used to locate them. By use of search capabilities of the software, specific records can be sought by box storage locations, accession numbers, record dates, submitting organizations, or details of the records themselves. Boxes can be marked with such statuses as checked out, lost, transferred, and destroyed. The system can generate reports showing boxes awaiting destruction or transfer. When boxes are transferred to the National Archives and Records Administration (NARA), the system can automatically fill out NARA records-transfer forms. Currently, several other NASA Centers are considering deploying the NASA Records Database to help automate their records archives.

  4. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.

    2011-01-01

    With three missions outstanding, the Shuttle Hypervelocity Impact Database has nearly 3000 entries. The data is divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts compromising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. Details and insights on the contents of the database including examples of descriptive statistics will be provided. Post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will also be discussed. Potential enhancements to the database structure and availability of the data for other researchers will be addressed in the Future Work section. A related database of returned surfaces from the International Space Station will also be introduced.

  5. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James I.; Christiansen, Eric I.; Lear, Dana M.

    2011-01-01

    With three flights remaining on the manifest, the shuttle impact hypervelocity database has over 2800 entries. The data is currently divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts compromising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. The paper will provide details and insights on the contents of the database including examples of descriptive statistics using the impact data. A discussion of post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will be presented. Future work to be discussed will be possible enhancements to the database structure and availability of the data for other researchers. A related database of ISS returned surfaces that are under development will also be introduced.

  6. Computer Databases: A Survey; Part 1: General and News Databases.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1986-01-01

    Descriptions and evaluations of 13 databases devoted to computer information are presented by type under four headings: bibliographic databases; daily news services; online computer magazines; and specialized computer industry databases. Information on database producers, starting date of file, update frequency, vendors, and prices is summarized…

  7. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    SciTech Connect

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences are important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~;;30K unique sequences (UniSeqs) representing ~;;19K clusters were generated from ~;;98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases.Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  8. Reconstruction of Sugar Metabolic Pathways of Giardia lamblia

    PubMed Central

    Han, Jian; Collins, Lesley J.

    2012-01-01

    Giardia lamblia is an “important” pathogen of humans, but as a diplomonad excavate it is evolutionarily distant from other eukaryotes and relatively little is known about its core metabolic pathways. KEGG, the widely referenced site for providing information of metabolism, does not yet include many enzymes from Giardia species. Here we identify Giardia's core sugar metabolism using standard bioinformatic approaches. By comparing Giardia proteomes with known enzymes from other species, we have identified enzymes in the glycolysis pathway, as well as some enzymes involved in the TCA cycle and oxidative phosphorylation. However, the majority of enzymes from the latter two pathways were not identifiable, indicating the likely absence of these functionalities. We have also found enzymes from the Giardia glycolysis pathway that appear more similar to those from bacteria. Because these enzymes are different from those found in mammals, the host organisms for Giardia, we raise the possibility that these bacteria-like enzymes could be novel drug targets for treating Giardia infections. PMID:23119161

  9. VIEWCACHE: An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nick; Sellis, Timoleon

    1991-01-01

    The objective is to illustrate the concept of incremental access to distributed databases. An experimental database management system, ADMS, which has been developed at the University of Maryland, in College Park, uses VIEWCACHE, a database access method based on incremental search. VIEWCACHE is a pointer-based access method that provides a uniform interface for accessing distributed databases and catalogues. The compactness of the pointer structures formed during database browsing and the incremental access method allow the user to search and do inter-database cross-referencing with no actual data movement between database sites. Once the search is complete, the set of collected pointers pointing to the desired data are dereferenced.

  10. De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris

    PubMed Central

    Niu, Shan-Ce; Xu, Qing; Zhang, Guo-Qiang; Zhang, Yong-Qiang; Tsai, Wen-Chieh; Hsu, Jui-Ling; Liang, Chieh-Kai; Luo, Yi-Bo; Liu, Zhong-Jian

    2016-01-01

    Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues representing the root, stem, leaf, flower buds, column, lip, petal, sepal and three developmental stages of seeds. Our aims were to contribute to a better understanding of the molecular mechanisms driving the analysed tissue characteristics and to enrich the available data for P. equestris. Here, we present three databases. The first dataset is the RNA-Seq raw reads, which can be used to execute new experiments with different analysis approaches. The other two datasets allow different types of searches for candidate homologues. The second dataset includes the sets of assembled unigenes and predicted coding sequences and proteins, enabling a sequence-based search. The third dataset consists of the annotation results of the aligned unigenes versus the Nonredundant (Nr) protein database, Kyoto Encyclopaedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) databases with low e-values, enabling a name-based search. PMID:27673730

  11. Open Geoscience Database

    NASA Astrophysics Data System (ADS)

    Bashev, A.

    2012-04-01

    Currently there is an enormous amount of various geoscience databases. Unfortunately the only users of the majority of the databases are their elaborators. There are several reasons for that: incompaitability, specificity of tasks and objects and so on. However the main obstacles for wide usage of geoscience databases are complexity for elaborators and complication for users. The complexity of architecture leads to high costs that block the public access. The complication prevents users from understanding when and how to use the database. Only databases, associated with GoogleMaps don't have these drawbacks, but they could be hardly named "geoscience" Nevertheless, open and simple geoscience database is necessary at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and web interface to work with them and now it is accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) in a station with a certain position, associated with metadata: the date when the result was obtained; the type of a station (lake, soil etc); the contributor that sent the result. Each contributor has its own profile, that allows to estimate the reliability of the data. The results can be represented on GoogleMaps space image as a point in a certain position, coloured according to the value of the parameter. There are default colour scales and each registered user can create the own scale. The results can be also extracted in *.csv file. For both types of representation one could select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Lattitude(dd.dddddd); Longitude(ddd.dddddd); Station type; Parameter type; Parameter value; Date(yyyy-mm-dd). The contributor is recognised while entering. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data

  12. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air- conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on R-32, R-123, R-124, R- 125, R-134a, R-141b, R142b, R-143a, R-152a, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses polyalkylene glycol (PAG), ester, and other lubricants. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits.

  13. Enhancing medical database semantics.

    PubMed Central

    Leão, B. de F.; Pavan, A.

    1995-01-01

    Medical Databases deal with dynamic, heterogeneous and fuzzy data. The modeling of such complex domain demands powerful semantic data modeling methodologies. This paper describes GSM-Explorer a Case Tool that allows for the creation of relational databases using semantic data modeling techniques. GSM Explorer fully incorporates the Generic Semantic Data Model-GSM enabling knowledge engineers to model the application domain with the abstraction mechanisms of generalization/specialization, association and aggregation. The tool generates a structure that implements persistent database-objects through the automatic generation of customized SQL ANSI scripts that sustain the semantics defined in the higher lever. This paper emphasizes the system architecture and the mapping of the semantic model into relational tables. The present status of the project and its further developments are discussed in the Conclusions. PMID:8563288

  14. Protein Structure Databases.

    PubMed

    Laskowski, Roman A

    2016-01-01

    Web-based protein structure databases come in a wide variety of types and levels of information content. Those having the most general interest are the various atlases that describe each experimentally determined protein structure and provide useful links, analyses, and schematic diagrams relating to its 3D structure and biological function. Also of great interest are the databases that classify 3D structures by their folds as these can reveal evolutionary relationships which may be hard to detect from sequence comparison alone. Related to these are the numerous servers that compare folds-particularly useful for newly solved structures, and especially those of unknown function. Beyond these are a vast number of databases for the more specialized user, dealing with specific families, diseases, structural features, and so on. PMID:27115626

  15. Mouse genome database 2016.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.

  16. Mouse genome database 2016

    PubMed Central

    Bult, Carol J.; Eppig, Janan T.; Blake, Judith A.; Kadin, James A.; Richardson, Joel E.

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  17. Mouse genome database 2016.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  18. National Ambient Radiation Database

    SciTech Connect

    Dziuban, J.; Sears, R.

    2003-02-25

    The U.S. Environmental Protection Agency (EPA) recently developed a searchable database and website for the Environmental Radiation Ambient Monitoring System (ERAMS) data. This site contains nationwide radiation monitoring data for air particulates, precipitation, drinking water, surface water and pasteurized milk. This site provides location-specific as well as national information on environmental radioactivity across several media. It provides high quality data for assessing public exposure and environmental impacts resulting from nuclear emergencies and provides baseline data during routine conditions. The database and website are accessible at www.epa.gov/enviro/. This site contains (1) a query for the general public which is easy to use--limits the amount of information provided, but includes the ability to graph the data with risk benchmarks and (2) a query for a more technical user which allows access to all of the data in the database, (3) background information on ER AMS.

  19. The Neotoma Paleoecology Database

    NASA Astrophysics Data System (ADS)

    Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

    2015-12-01

    The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and for developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software-interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provide real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out-of-date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community

  20. The Ribosomal Database Project.

    PubMed Central

    Maidak, B L; Larsen, N; McCaughey, M J; Overbeek, R; Olsen, G J; Fogel, K; Blandy, J; Woese, C R

    1994-01-01

    The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services, and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous ftp (rdp.life.uiuc.edu), electronic mail (server/rdp.life.uiuc.edu) and gopher (rdpgopher.life.uiuc.edu). The electronic mail server also provides ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for chimeric nature of newly sequenced rRNAs, and automated alignment. PMID:7524021

  1. Database Management System

    NASA Technical Reports Server (NTRS)

    1990-01-01

    In 1981 Wayne Erickson founded Microrim, Inc, a company originally focused on marketing a microcomputer version of RIM (Relational Information Manager). Dennis Comfort joined the firm and is now vice president, development. The team developed an advanced spinoff from the NASA system they had originally created, a microcomputer database management system known as R:BASE 4000. Microrim added many enhancements and developed a series of R:BASE products for various environments. R:BASE is now the second largest selling line of microcomputer database management software in the world.

  2. The Genopolis Microarray Database

    PubMed Central

    Splendiani, Andrea; Brandizi, Marco; Even, Gael; Beretta, Ottavio; Pavelka, Norman; Pelizzola, Mattia; Mayhaus, Manuel; Foti, Maria; Mauri, Giancarlo; Ricciardi-Castagnoli, Paola

    2007-01-01

    Background Gene expression databases are key resources for microarray data management and analysis and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The scope of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic and macrophage cells functions and host-parasite interactions. Results The Genopolis Database system allows the community to build an object based MIAME compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip® platform. It supports dynamical definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows a precise control of the visibility of the database content to different sub groups in the community and facilitates exports of its content to public repositories. It provides an interactive users interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion The Genopolis Database supports a community in building a common coherent knowledge base and analyse it. This fills a gap between a local

  3. Low-Budget Graphic Databases.

    ERIC Educational Resources Information Center

    Mahoney, Dan

    1994-01-01

    Explains the use of a standard text-based database program (i.e., dBase III) to run external programs that display graphic files during a database session and reduces costs normally encountered when preparing a computer to run a graphical database. An example is given of a simple database with two fields. (LRW)

  4. TREC Document Database: Disk 4

    National Institute of Standards and Technology Data Gateway

    NIST TREC Document Database: Disk 4 (PC database for purchase)   NIST TREC Document Databases (Special Database 22) are distributed for the development and testing of information retrieval (IR) systems and related natural language processing research. The document collections consist of the full text of various newspaper and newswire articles plus government proceedings.

  5. TREC Document Database: Disk 5

    National Institute of Standards and Technology Data Gateway

    NIST TREC Document Database: Disk 5 (PC database for purchase)   NIST TREC Document Databases (Special Database 23) are distributed for the development and testing of information retrieval (IR) systems and related natural language processing research. The document collections consist of the full text of various newspaper and newswire articles plus government proceedings.

  6. Reactome from a WikiPathways Perspective.

    PubMed

    Bohler, Anwesha; Wu, Guanming; Kutmon, Martina; Pradhana, Leontius Adhika; Coort, Susan L; Hanspers, Kristina; Haw, Robin; Pico, Alexander R; Evelo, Chris T

    2016-05-01

    Reactome and WikiPathways are two of the most popular freely available databases for biological pathways. Reactome pathways are centrally curated with periodic input from selected domain experts. WikiPathways is a community-based platform where pathways are created and continually curated by any interested party. The nascent collaboration between WikiPathways and Reactome illustrates the mutual benefits of combining these two approaches. We created a format converter that converts Reactome pathways to the GPML format used in WikiPathways. In addition, we developed the ComplexViz plugin for PathVisio which simplifies looking up complex components. The plugin can also score the complexes on a pathway based on a user defined criterion. This score can then be visualized on the complex nodes using the visualization options provided by the plugin. Using the merged collection of curated and converted Reactome pathways, we demonstrate improved pathway coverage of relevant biological processes for the analysis of a previously described polycystic ovary syndrome gene expression dataset. Additionally, this conversion allows researchers to visualize their data on Reactome pathways using PathVisio's advanced data visualization functionalities. WikiPathways benefits from the dedicated focus and attention provided to the content converted from Reactome and the wealth of semantic information about interactions. Reactome in turn benefits from the continuous community curation available on WikiPathways. The research community at large benefits from the availability of a larger set of pathways for analysis in PathVisio and Cytoscape. The pathway statistics results obtained from PathVisio are significantly better when using a larger set of candidate pathways for analysis. The conversion serves as a general model for integration of multiple pathway resources developed using different approaches. PMID:27203685

  7. Reactome from a WikiPathways Perspective

    PubMed Central

    Bohler, Anwesha; Wu, Guanming; Pradhana, Leontius Adhika; Hanspers, Kristina; Haw, Robin; Pico, Alexander R.

    2016-01-01

    Reactome and WikiPathways are two of the most popular freely available databases for biological pathways. Reactome pathways are centrally curated with periodic input from selected domain experts. WikiPathways is a community-based platform where pathways are created and continually curated by any interested party. The nascent collaboration between WikiPathways and Reactome illustrates the mutual benefits of combining these two approaches. We created a format converter that converts Reactome pathways to the GPML format used in WikiPathways. In addition, we developed the ComplexViz plugin for PathVisio which simplifies looking up complex components. The plugin can also score the complexes on a pathway based on a user defined criterion. This score can then be visualized on the complex nodes using the visualization options provided by the plugin. Using the merged collection of curated and converted Reactome pathways, we demonstrate improved pathway coverage of relevant biological processes for the analysis of a previously described polycystic ovary syndrome gene expression dataset. Additionally, this conversion allows researchers to visualize their data on Reactome pathways using PathVisio’s advanced data visualization functionalities. WikiPathways benefits from the dedicated focus and attention provided to the content converted from Reactome and the wealth of semantic information about interactions. Reactome in turn benefits from the continuous community curation available on WikiPathways. The research community at large benefits from the availability of a larger set of pathways for analysis in PathVisio and Cytoscape. The pathway statistics results obtained from PathVisio are significantly better when using a larger set of candidate pathways for analysis. The conversion serves as a general model for integration of multiple pathway resources developed using different approaches. PMID:27203685

  8. The CEBAF Element Database

    SciTech Connect

    Theodore Larrieu, Christopher Slominski, Michele Joyce

    2011-03-01

    With the inauguration of the CEBAF Element Database (CED) in Fall 2010, Jefferson Lab computer scientists have taken a step toward the eventual goal of a model-driven accelerator. Once fully populated, the database will be the primary repository of information used for everything from generating lattice decks to booting control computers to building controls screens. A requirement influencing the CED design is that it provide access to not only present, but also future and past configurations of the accelerator. To accomplish this, an introspective database schema was designed that allows new elements, types, and properties to be defined on-the-fly with no changes to table structure. Used in conjunction with Oracle Workspace Manager, it allows users to query data from any time in the database history with the same tools used to query the present configuration. Users can also check-out workspaces to use as staging areas for upcoming machine configurations. All Access to the CED is through a well-documented Application Programming Interface (API) that is translated automatically from original C++ source code into native libraries for scripting languages such as perl, php, and TCL making access to the CED easy and ubiquitous.

  9. Triatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 117 Triatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 55 triatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  10. Hydrocarbon Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 115 Hydrocarbon Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 91 hydrocarbon molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  11. Diatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 114 Diatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 121 diatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.

  12. High Performance Buildings Database

    DOE Data Explorer

    The High Performance Buildings Database is a shared resource for the building industry, a unique central repository of in-depth information and data on high-performance, green building projects across the United States and abroad. The database includes information on the energy use, environmental performance, design process, finances, and other aspects of each project. Members of the design and construction teams are listed, as are sources for additional information. In total, up to twelve screens of detailed information are provided for each project profile. Projects range in size from small single-family homes or tenant fit-outs within buildings to large commercial and institutional buildings and even entire campuses. The database is a data repository as well. A series of Web-based data-entry templates allows anyone to enter information about a building project into the database. Once a project has been submitted, each of the partner organizations can review the entry and choose whether or not to publish that particular project on its own Web site.

  13. The Ribosomal Database Project

    NASA Technical Reports Server (NTRS)

    Olsen, G. J.; Overbeek, R.; Larsen, N.; Marsh, T. L.; McCaughey, M. J.; Maciukenas, M. A.; Kuan, W. M.; Macke, T. J.; Xing, Y.; Woese, C. R.

    1992-01-01

    The Ribosomal Database Project (RDP) complies ribosomal sequences and related data, and redistributes them in aligned and phylogenetically ordered form to its user community. It also offers various software packages for handling, analyzing and displaying sequences. In addition, the RDP offers (or will offer) certain analytic services. At present the project is in an intermediate stage of development.

  14. Weathering Database Technology

    ERIC Educational Resources Information Center

    Snyder, Robert

    2005-01-01

    Collecting weather data is a traditional part of a meteorology unit at the middle level. However, making connections between the data and weather conditions can be a challenge. One way to make these connections clearer is to enter the data into a database. This allows students to quickly compare different fields of data and recognize which…

  15. Patent Family Databases.

    ERIC Educational Resources Information Center

    Simmons, Edlyn S.

    1985-01-01

    Reports on retrieval of patent information online and includes definition of patent family, basic and equivalent patents, "parents and children" applications, designated states, patent family databases--International Patent Documentation Center, World Patents Index, APIPAT (American Petroleum Institute), CLAIMS (IFI/Plenum). A table noting country…

  16. LQTS gene LOVD database.

    PubMed

    Zhang, Tao; Moss, Arthur; Cong, Peikuan; Pan, Min; Chang, Bingxi; Zheng, Liangrong; Fang, Quan; Zareba, Wojciech; Robinson, Jennifer; Lin, Changsong; Li, Zhongxiang; Wei, Junfang; Zeng, Qiang; Qi, Ming

    2010-11-01

    The Long QT Syndrome (LQTS) is a group of genetically heterogeneous disorders that predisposes young individuals to ventricular arrhythmias and sudden death. LQTS is mainly caused by mutations in genes encoding subunits of cardiac ion channels (KCNQ1, KCNH2,SCN5A, KCNE1, and KCNE2). Many other genes involved in LQTS have been described recently(KCNJ2, AKAP9, ANK2, CACNA1C, SCNA4B, SNTA1, and CAV3). We created an online database(http://www.genomed.org/LOVD/introduction.html) that provides information on variants in LQTS-associated genes. As of February 2010, the database contains 1738 unique variants in 12 genes. A total of 950 variants are considered pathogenic, 265 are possible pathogenic, 131 are unknown/unclassified, and 292 have no known pathogenicity. In addition to these mutations collected from published literature, we also submitted information on gene variants, including one possible novel pathogenic mutation in the KCNH2 splice site found in ten Chinese families with documented arrhythmias. The remote user is able to search the data and is encouraged to submit new mutations into the database. The LQTS database will become a powerful tool for both researchers and clinicians. PMID:20809527

  17. LQTS gene LOVD database.

    PubMed

    Zhang, Tao; Moss, Arthur; Cong, Peikuan; Pan, Min; Chang, Bingxi; Zheng, Liangrong; Fang, Quan; Zareba, Wojciech; Robinson, Jennifer; Lin, Changsong; Li, Zhongxiang; Wei, Junfang; Zeng, Qiang; Qi, Ming

    2010-11-01

    The Long QT Syndrome (LQTS) is a group of genetically heterogeneous disorders that predisposes young individuals to ventricular arrhythmias and sudden death. LQTS is mainly caused by mutations in genes encoding subunits of cardiac ion channels (KCNQ1, KCNH2,SCN5A, KCNE1, and KCNE2). Many other genes involved in LQTS have been described recently(KCNJ2, AKAP9, ANK2, CACNA1C, SCNA4B, SNTA1, and CAV3). We created an online database(http://www.genomed.org/LOVD/introduction.html) that provides information on variants in LQTS-associated genes. As of February 2010, the database contains 1738 unique variants in 12 genes. A total of 950 variants are considered pathogenic, 265 are possible pathogenic, 131 are unknown/unclassified, and 292 have no known pathogenicity. In addition to these mutations collected from published literature, we also submitted information on gene variants, including one possible novel pathogenic mutation in the KCNH2 splice site found in ten Chinese families with documented arrhythmias. The remote user is able to search the data and is encouraged to submit new mutations into the database. The LQTS database will become a powerful tool for both researchers and clinicians.

  18. Databases and data mining

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Over the course of the past decade, the breadth of information that is made available through online resources for plant biology has increased astronomically, as have the interconnectedness among databases, online tools, and methods of data acquisition and analysis. For maize researchers, the numbe...

  19. Redis database administration tool

    SciTech Connect

    Martinez, J. J.

    2013-02-13

    MyRedis is a product of the Lorenz subproject under the ASC Scirntific Data Management effort. MyRedis is a web based utility designed to allow easy administration of instances of Redis databases. It can be usedd to view and manipulate data as well as run commands directly against a variety of different Redis hosts.

  20. The AMMA database

    NASA Astrophysics Data System (ADS)

    Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim

    2010-05-01

    The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaigns datasets; - historical data in West Africa from 1850 (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); - model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed as the satellite products are. Before accessing the data, any user has to sign the AMMA data and publication policy. This chart only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and the usage for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication is also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using an unique web portal. This website is composed of different modules : - Registration: forms to register, read and sign the data use chart when an user visits for the first time - Data access interface: friendly tool allowing to build a data extraction request by selecting various criteria like location, time, parameters... The request can

  1. JDD, Inc. Database

    NASA Technical Reports Server (NTRS)

    Miller, David A., Jr.

    2004-01-01

    JDD Inc, is a maintenance and custodial contracting company whose mission is to provide their clients in the private and government sectors "quality construction, construction management and cleaning services in the most efficient and cost effective manners, (JDD, Inc. Mission Statement)." This company provides facilities support for Fort Riley in Fo,rt Riley, Kansas and the NASA John H. Glenn Research Center at Lewis Field here in Cleveland, Ohio. JDD, Inc. is owned and operated by James Vaughn, who started as painter at NASA Glenn and has been working here for the past seventeen years. This summer I worked under Devan Anderson, who is the safety manager for JDD Inc. in the Logistics and Technical Information Division at Glenn Research Center The LTID provides all transportation, secretarial, security needs and contract management of these various services for the center. As a safety manager, my mentor provides Occupational Health and Safety Occupation (OSHA) compliance to all JDD, Inc. employees and handles all other issues (Environmental Protection Agency issues, workers compensation, safety and health training) involving to job safety. My summer assignment was not as considered "groundbreaking research" like many other summer interns have done in the past, but it is just as important and beneficial to JDD, Inc. I initially created a database using a Microsoft Excel program to classify and categorize data pertaining to numerous safety training certification courses instructed by our safety manager during the course of the fiscal year. This early portion of the database consisted of only data (training field index, employees who were present at these training courses and who was absent) from the training certification courses. Once I completed this phase of the database, I decided to expand the database and add as many dimensions to it as possible. Throughout the last seven weeks, I have been compiling more data from day to day operations and been adding the

  2. Tautomerism in large databases

    PubMed Central

    Sitzmann, Markus; Ihlenfeldt, Wolf-Dietrich

    2010-01-01

    We have used the Chemical Structure DataBase (CSDB) of the NCI CADD Group, an aggregated collection of over 150 small-molecule databases totaling 103.5 million structure records, to conduct tautomerism analyses on one of the largest currently existing sets of real (i.e. not computer-generated) compounds. This analysis was carried out using calculable chemical structure identifiers developed by the NCI CADD Group, based on hash codes available in the chemoinformatics toolkit CACTVS and a newly developed scoring scheme to define a canonical tautomer for any encountered structure. CACTVS’s tautomerism definition, a set of 21 transform rules expressed in SMIRKS line notation, was used, which takes a comprehensive stance as to the possible types of tautomeric interconversion included. Tautomerism was found to be possible for more than 2/3 of the unique structures in the CSDB. A total of 680 million tautomers were calculated from, and including, the original structure records. Tautomerism overlap within the same individual database (i.e. at least one other entry was present that was really only a different tautomeric representation of the same compound) was found at an average rate of 0.3% of the original structure records, with values as high as nearly 2% for some of the databases in CSDB. Projected onto the set of unique structures (by FICuS identifier), this still occurred in about 1.5% of the cases. Tautomeric overlap across all constituent databases in CSDB was found for nearly 10% of the records in the collection. PMID:20512400

  3. MEROPS: the peptidase database.

    PubMed

    Rawlings, N D; Barrett, A J

    1999-01-01

    The MEROPS database (http://www.bi.bbsrc.ac.uk/Merops/Merops.+ ++htm) provides a catalogue and structure-based classification of peptidases (i.e. all proteolytic enzymes). This is a large group of proteins (approximately 2% of all gene products) that is of particular importance in medicine and biotechnology. An index of the peptidases by name or synonym gives access to a set of files termed PepCards each of which provides information on a single peptidase. Each card file contains information on classification and nomenclature, and hypertext links to the relevant entries in online databases for human genetics, protein and nucleic acid sequence data and tertiary structure. Another index provides access to the PepCards by organism name so that the user can retrieve all known peptidases from a particular species. The peptidases are classified into families on the basis of statistically significant similarities between the protein sequences in the part termed the 'peptidase unit' that is most directly responsible for activity. Families that are thought to have common evolutionary origins and are known or expected to have similar tertiary folds are grouped into clans. The MEROPS database provides sets of files called FamCards and ClanCards describing the individual families and clans. Each FamCard document provides links to other databases for sequence motifs and secondary and tertiary structures, and shows the distribution of the family across the major kingdoms of living creatures. Release 3.03 of MEROPS contains 758 peptidases, 153 families and 22 clans. We suggest that the MEROPS database provides a model for a way in which a system of classification for a functional group of proteins can be developed and used as an organizational framework around which to assemble a variety of related information.

  4. Ovarian Kaleidoscope database: ten years and beyond.

    PubMed

    Hsueh, Aaron J; Rauch, Rami

    2012-06-01

    Ovarian Kaleidoscope database (OKdb) is an online, searchable, public database containing text-based and DNA microarray data to facilitate research by ovarian researchers. Using key words and predetermined categories, users can search ovarian gene information based on gene function, cell type of expression, cellular localization, hormonal regulation, mutant phenotypes, chromosomal location, ligand-receptor relationship, and other criteria, either alone or in combination. For individual genes, users can access more than 10 extensive DNA microarray datasets to interrogate gene expression patterns in a development-specific and cell type-specific manner. All ligand and receptor genes expressed in the ovary are matched to facilitate investigation of paracrine/autocrine signaling. More than 3500 ovarian genes in the database are matched to 185 gene pathways in the Kyoto Encyclopedia of Genes and Genomes to allow for elucidation of gene interactions and relationships. In addition to >400 genes with infertility or subfertility phenotypes when mutated in mice or humans, the OKdb also lists ~50 and ~40 genes associated with polycystic ovarian syndrome and primary ovarian insufficiency, respectively. The expanding OKdb is updated weekly and allows submission of new genes by ovarian researchers to allow instant access to DNA microarray datasets for newly submitted genes. The present database is a virtual community for ovarian researchers and allows users to instantaneously provide their comments for individual gene pages based on an automated Web-discussion system. In the coming years, we will continue to add new features to serve the ovarian research community. PMID:22441797

  5. The genetic basis for inactivation of Wnt pathway in human osteosarcoma

    PubMed Central

    2014-01-01

    Background Osteosarcoma is a highly genetically unstable tumor with poor prognosis. We performed microarray-based comparative genomic hybridization (aCGH), transcriptome sequencing (RNA-seq), and pathway analysis to gain a systemic view of the pathway alterations of osteosarcoma. Methods aCGH experiments were carried out on 10 fresh osteosarcoma samples. The output data (Gene Expression Omnibus Series accession number GSE19180) were pooled with published aCGH raw data (GSE9654) to determine recurrent copy number changes. These were analyzed using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis to identify altered pathways in osteosarcoma. Transcriptome sequencing of six osteosarcomas was performed to detect the expression profile of Wnt signaling pathway genes. Protein expression of WNT1, β-catenin, c-myc, and cyclin D1 in the Wnt pathway was detected by immunohistochemistry (IHC) in an independent group of 46 osteosarcoma samples. Results KEGG pathway analysis identified frequent deletions of Wnt and other Wnt signaling pathway genes. At the mRNA level, transcriptome sequencing found reduced levels of mRNA expression of Wnt signaling pathway transcripts. While WNT1 protein expression was detected by IHC in 69.6% (32/46) of the osteosarcomas, no β-catenin protein was detected in the nucleus. β-catenin protein expression was, however, detected in the membrane and cytoplasm of 69.6% (32/46) of the osteosarcomas. c-myc protein expression was detected in only 47.8% (22/46) and cyclin D1 protein expression in 52.2% (24/46) of osteosarcoma samples. Kaplan-Meier survival analysis showed that WNT1-negative patients had a trend towards longer disease free survival than WNT1-positive patients. Interestingly, in WNT1-negative patients, those who were also cyclin D1-negative had significantly longer disease free survival than cyclin D1-positive patients. However, there was no significant association between any of the investigated proteins and overall

  6. The GLIMS Glacier Database

    NASA Astrophysics Data System (ADS)

    Raup, B. H.; Khalsa, S. S.; Armstrong, R.

    2007-12-01

    The Global Land Ice Measurements from Space (GLIMS) project has built a geospatial and temporal database of glacier data, composed of glacier outlines and various scalar attributes. These data are being derived primarily from satellite imagery, such as from ASTER and Landsat. Each "snapshot" of a glacier is from a specific time, and the database is designed to store multiple snapshots representative of different times. We have implemented two web-based interfaces to the database; one enables exploration of the data via interactive maps (web map server), while the other allows searches based on text-field constraints. The web map server is an Open Geospatial Consortium (OGC) compliant Web Map Server (WMS) and Web Feature Server (WFS). This means that other web sites can display glacier layers from our site over the Internet, or retrieve glacier features in vector format. All components of the system are implemented using Open Source software: Linux, PostgreSQL, PostGIS (geospatial extensions to the database), MapServer (WMS and WFS), and several supporting components such as Proj.4 (a geographic projection library) and PHP. These tools are robust and provide a flexible and powerful framework for web mapping applications. As a service to the GLIMS community, the database contains metadata on all ASTER imagery acquired over glacierized terrain. Reduced-resolution of the images (browse imagery) can be viewed either as a layer in the MapServer application, or overlaid on the virtual globe within Google Earth. The interactive map application allows the user to constrain by time what data appear on the map. For example, ASTER or glacier outlines from 2002 only, or from Autumn in any year, can be displayed. The system allows users to download their selected glacier data in a choice of formats. The results of a query based on spatial selection (using a mouse) or text-field constraints can be downloaded in any of these formats: ESRI shapefiles, KML (Google Earth), Map

  7. Open geochemical database

    NASA Astrophysics Data System (ADS)

    Zhilin, Denis; Ilyin, Vladimir; Bashev, Anton

    2010-05-01

    We regard "geochemical data" as data on chemical parameters of the environment, linked with the geographical position of the corresponding point. Boosting development of global positioning system (GPS) and measuring instruments allows fast collecting of huge amounts of geochemical data. Presently they are published in scientific journals in text format, that hampers searching for information about particular places and meta-analysis of the data, collected by different researchers. Part of the information is never published. To make the data available and easy to find, it seems reasonable to elaborate an open database of geochemical information, accessible via Internet. It also seems reasonable to link the data with maps or space images, for example, from GoogleEarth service. For this purpose an open geochemical database is being elaborating (http://maps.sch192.ru). Any user after registration can upload geochemical data (position, type of parameter and value of the parameter) and edit them. Every user (including unregistered) can (a) extract the values of parameters, fulfilling desired conditions and (b) see the points, linked to GoogleEarth space image, colored according to a value of selected parameter. Then he can treat extracted values any way he likes. There are the following data types in the database: authors, points, seasons and parameters. Author is a person, who publishes the data. Every author can declare his own profile. A point is characterized by its geographical position and type of the object (i.e. river, lake etc). Value of parameters are linked to a point, an author and a season, when they were obtained. A user can choose a parameter to place on GoogleEarth space image and a scale to color the points on the image according to the value of a parameter. Currently (December, 2009) the database is under construction, but several functions (uploading data on pH and electrical conductivity and placing colored points onto GoogleEarth space image) are

  8. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-11-09

    The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air- conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R- 717 (ammonia), ethers, and others as well as azeotropic and zeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.

  9. The apoptosis database.

    PubMed

    Doctor, K S; Reed, J C; Godzik, A; Bourne, P E

    2003-06-01

    The apoptosis database is a public resource for researchers and students interested in the molecular biology of apoptosis. The resource provides functional annotation, literature references, diagrams/images, and alternative nomenclatures on a set of proteins having 'apoptotic domains'. These are the distinctive domains that are often, if not exclusively, found in proteins involved in apoptosis. The initial choice of proteins to be included is defined by apoptosis experts and bioinformatics tools. Users can browse through the web accessible lists of domains, proteins containing these domains and their associated homologs. The database can also be searched by sequence homology using basic local alignment search tool, text word matches of the annotation, and identifiers for specific records. The resource is available at http://www.apoptosis-db.org and is updated on a regular basis.

  10. Medical Image Databases

    PubMed Central

    Tagare, Hemant D.; Jaffe, C. Carl; Duncan, James

    1997-01-01

    Abstract Information contained in medical images differs considerably from that residing in alphanumeric format. The difference can be attributed to four characteristics: (1) the semantics of medical knowledge extractable from images is imprecise; (2) image information contains form and spatial data, which are not expressible in conventional language; (3) a large part of image information is geometric; (4) diagnostic inferences derived from images rest on an incomplete, continuously evolving model of normality. This paper explores the differentiating characteristics of text versus images and their impact on design of a medical image database intended to allow content-based indexing and retrieval. One strategy for implementing medical image databases is presented, which employs object-oriented iconic queries, semantics by association with prototypes, and a generic schema. PMID:9147338

  11. The Listeria monocytogenes strain 10403S BioCyc database.

    PubMed

    Orsi, Renato H; Bergholz, Teresa M; Wiedmann, Martin; Boor, Kathryn J

    2015-01-01

    Listeria monocytogenes is a food-borne pathogen of humans and other animals. The striking ability to survive several stresses usually used for food preservation makes L. monocytogenes one of the biggest concerns to the food industry, while the high mortality of listeriosis in specific groups of humans makes it a great concern for public health. Previous studies have shown that a regulatory network involving alternative sigma (σ) factors and transcription factors is pivotal to stress survival. However, few studies have evaluated at the metabolic networks controlled by these regulatory mechanisms. The L. monocytogenes BioCyc database uses the strain 10403S as a model. Computer-generated initial annotation for all genes also allowed for identification, annotation and display of predicted reactions and pathways carried out by a single cell. Further ongoing manual curation based on published data as well as database mining for selected genes allowed the more refined annotation of functions, which, in turn, allowed for annotation of new pathways and fine-tuning of previously defined pathways to more L. monocytogenes-specific pathways. Using RNA-Seq data, several transcription start sites and promoter regions were mapped to the 10403S genome and annotated within the database. Additionally, the identification of promoter regions and a comprehensive review of available literature allowed the annotation of several regulatory interactions involving σ factors and transcription factors. The L. monocytogenes 10403S BioCyc database is a new resource for researchers studying Listeria and related organisms. It allows users to (i) have a comprehensive view of all reactions and pathways predicted to take place within the cell in the cellular overview, as well as to (ii) upload their own data, such as differential expression data, to visualize the data in the scope of predicted pathways and regulatory networks and to carry on enrichment analyses using several different annotations

  12. Design and Performance of a Xenobiotic Metabolism Database Manager for Building Metabolic Pathway Databases

    EPA Science Inventory

    A major challenge for scientists and regulators is accounting for the metabolic activation of chemicals that may lead to increased toxicity. Reliable forecasting of chemical metabolism is a critical factor in estimating a chemical’s toxic potential. Research is underway to develo...

  13. Real Time Baseball Database

    NASA Astrophysics Data System (ADS)

    Fukue, Yasuhiro

    The author describes the system outline, features and operations of "Nikkan Sports Realtime Basaball Database" which was developed and operated by Nikkan Sports Shimbun, K. K. The system enables to input numerical data of professional baseball games as they proceed simultaneously, and execute data updating at realtime, just-in-time. Other than serving as supporting tool for prepareing newspapers it is also available for broadcasting media, general users through NTT dial Q2 and others.

  14. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1999-01-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilities access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  15. Clinical Genomic Database

    PubMed Central

    Solomon, Benjamin D.; Nguyen, Anh-Dao; Bear, Kelly A.; Wolfsberg, Tyra G.

    2013-01-01

    Technological advances have greatly increased the availability of human genomic sequencing. However, the capacity to analyze genomic data in a clinically meaningful way lags behind the ability to generate such data. To help address this obstacle, we reviewed all conditions with genetic causes and constructed the Clinical Genomic Database (CGD) (http://research.nhgri.nih.gov/CGD/), a searchable, freely Web-accessible database of conditions based on the clinical utility of genetic diagnosis and the availability of specific medical interventions. The CGD currently includes a total of 2,616 genes organized clinically by affected organ systems and interventions (including preventive measures, disease surveillance, and medical or surgical interventions) that could be reasonably warranted by the identification of pathogenic mutations. To aid independent analysis and optimize new data incorporation, the CGD also includes all genetic conditions for which genetic knowledge may affect the selection of supportive care, informed medical decision-making, prognostic considerations, reproductive decisions, and allow avoidance of unnecessary testing, but for which specific interventions are not otherwise currently available. For each entry, the CGD includes the gene symbol, conditions, allelic conditions, clinical categorization (for both manifestations and interventions), mode of inheritance, affected age group, description of interventions/rationale, links to other complementary databases, including databases of variants and presumed pathogenic mutations, and links to PubMed references (>20,000). The CGD will be regularly maintained and updated to keep pace with scientific discovery. Further content-based expert opinions are actively solicited. Eventually, the CGD may assist the rapid curation of individual genomes as part of active medical care. PMID:23696674

  16. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-11-15

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  17. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-07-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  18. The Ribonuclease P Database.

    PubMed

    Brown, J W

    1999-01-01

    Ribonuclease P is responsible for the 5'-maturation of tRNA precursors. Ribonuclease P is a ribonucleoprotein, and in bacteria (and some Archaea) the RNA subunit alone is catalytically active in vitro, i.e. it is a ribozyme. The Ribonuclease P Database is a compilation of ribonuclease P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information, available via the World Wide Web at the following URL: http://www.mbio.ncsu.edu/RNaseP/home .html

  19. The Cambridge Structural Database.

    PubMed

    Groom, Colin R; Bruno, Ian J; Lightfoot, Matthew P; Ward, Suzanna C

    2016-04-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface.

  20. The Cambridge Structural Database

    PubMed Central

    Groom, Colin R.; Bruno, Ian J.; Lightfoot, Matthew P.; Ward, Suzanna C.

    2016-01-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal–organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface. PMID:27048719

  1. ARTI Refrigerant Database

    SciTech Connect

    Cain, J.M. , Great Falls, VA )

    1993-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  2. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-04-15

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates. Citations in this report are divided into the following topics: thermophysical properties; materials compatibility; lubricants and tribology; application data; safety; test and analysis methods; impacts; regulatory actions; substitute refrigerants; identification; absorption and adsorption; research programs; and miscellaneous documents. Information is also presented on ordering instructions for the computerized version.

  3. Curcumin Resource Database

    PubMed Central

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoid, Curcuma longa being its principal producer. This apart, a class of promising curcumin analogs has been generated in laboratories, aptly named as Curcuminoids which are showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin as well as curcuminoids has been felt by the curcumin research community for long. Hence, in an attempt to address this stumbling block, we have developed Curcumin Resource Database (CRDB) that aims to perform as a gateway-cum-repository to access all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that are hyperlinked with external sources. CRDB is expected to be highly useful to the researchers working on structure as well as ligand-based molecular design of curcumin analogs. Database URL: http://www.crdb.in PMID:26220923

  4. The Cambridge Structural Database.

    PubMed

    Groom, Colin R; Bruno, Ian J; Lightfoot, Matthew P; Ward, Suzanna C

    2016-04-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface. PMID:27048719

  5. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1997-02-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alterative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  6. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1998-08-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufactures and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  7. Conserved Senescence Associated Genes and Pathways in Primary Human Fibroblasts Detected by RNA-Seq

    PubMed Central

    Marthandan, S.; Baumgart, M.; Priebe, S.; Groth, M.; Schaer, J.; Kaether, C.; Guthke, R.; Cellerino, A.; Platzer, M.; Diekmann, S.; Hemmerich, P.

    2016-01-01

    Cellular senescence correlates with changes in the transcriptome. To obtain a complete view on senescence-associated transcription networks and pathways, we assessed by deep RNA sequencing the transcriptomes of five of the most commonly used laboratory strains of human fibroblasts during their transition into senescence. In a number of cases, we verified the RNA-seq data by real-time PCR. By determining cellular protein levels we observed that the age-related expression of most but not all genes is regulated at the transcriptional level. We found that 78% of the age-affected differentially expressed genes were commonly regulated in the same direction (either up- or down-regulated) in all five fibroblast strains, indicating a strong conservation of age-associated changes in the transcriptome. KEGG pathway analyses confirmed up-regulation of the senescence-associated secretory phenotype and down-regulation of DNA synthesis/repair and most cell cycle pathways common in all five cell strains. Newly identified senescence-induced pathways include up-regulation of endocytotic/phagocytic pathways and down-regulation of the mRNA metabolism and the mRNA splicing pathways. Our results provide an unprecedented comprehensive and deep view into the individual and common transcriptome and pathway changes during the transition into of senescence of five human fibroblast cell strains. PMID:27140416

  8. Computer Databases: A Survey. Part 3: Product Databases.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1987-01-01

    Describes five online databases that focus on computer products, primarily software and microcomputing hardware, and compares the databases in terms of record content, product coverage, vertical market coverage, currency, availability, and price. Sample records and searches are provided, as well as a directory of product databases. (CLB)

  9. SmallSat Database

    NASA Technical Reports Server (NTRS)

    Petropulos, Dolores; Bittner, David; Murawski, Robert; Golden, Bert

    2015-01-01

    The SmallSat has an unrealized potential in both the private industry and in the federal government. Currently over 70 companies, 50 universities and 17 governmental agencies are involved in SmallSat research and development. In 1994, the U.S. Army Missile and Defense mapped the moon using smallSat imagery. Since then Smart Phones have introduced this imagery to the people of the world as diverse industries watched this trend. The deployment cost of smallSats is also greatly reduced compared to traditional satellites due to the fact that multiple units can be deployed in a single mission. Imaging payloads have become more sophisticated, smaller and lighter. In addition, the growth of small technology obtained from private industries has led to the more widespread use of smallSats. This includes greater revisit rates in imagery, significantly lower costs, the ability to update technology more frequently and the ability to decrease vulnerability of enemy attacks. The popularity of smallSats show a changing mentality in this fast paced world of tomorrow. What impact has this created on the NASA communication networks now and in future years? In this project, we are developing the SmallSat Relational Database which can support a simulation of smallSats within the NASA SCaN Compatability Environment for Networks and Integrated Communications (SCENIC) Modeling and Simulation Lab. The NASA Space Communications and Networks (SCaN) Program can use this modeling to project required network support needs in the next 10 to 15 years. The SmallSat Rational Database could model smallSats just as the other SCaN databases model the more traditional larger satellites, with a few exceptions. One being that the smallSat Database is designed to be built-to-order. The SmallSat database holds various hardware configurations that can be used to model a smallSat. It will require significant effort to develop as the research material can only be populated by hand to obtain the unique data

  10. A Case for Database Filesystems

    SciTech Connect

    Adams, P A; Hax, J C

    2009-05-13

    Data intensive science is offering new challenges and opportunities for Information Technology and traditional relational databases in particular. Database filesystems offer the potential to store Level Zero data and analyze Level 1 and Level 3 data within the same database system [2]. Scientific data is typically composed of both unstructured files and scalar data. Oracle SecureFiles is a new database filesystem feature in Oracle Database 11g that is specifically engineered to deliver high performance and scalability for storing unstructured or file data inside the Oracle database. SecureFiles presents the best of both the filesystem and the database worlds for unstructured content. Data stored inside SecureFiles can be queried or written at performance levels comparable to that of traditional filesystems while retaining the advantages of the Oracle database.

  11. High Temperature Superconducting Materials Database

    National Institute of Standards and Technology Data Gateway

    SRD 149 NIST High Temperature Superconducting Materials Database (Web, free access)   The NIST High Temperature Superconducting Materials Database (WebHTS) provides evaluated thermal, mechanical, and superconducting property data for oxides and other nonconventional superconductors.

  12. Dietary Supplement Label Database (DSLD)

    MedlinePlus

    ... Print Report Error T he Dietary Supplement Label Database (DSLD) is a joint project of the National ... participants in the latest survey in the DSLD database (NHANES): The search options: Quick Search, Browse Dietary ...

  13. ThermoData Engine Database

    National Institute of Standards and Technology Data Gateway

    SRD 103 NIST ThermoData Engine Database (PC database for purchase)   ThermoData Engine is the first product fully implementing all major principles of the concept of dynamic data evaluation formulated at NIST/TRC.

  14. The ConsensusPathDB interaction database: 2013 update.

    PubMed

    Kamburov, Atanas; Stelzl, Ulrich; Lehrach, Hans; Herwig, Ralf

    2013-01-01

    Knowledge of the various interactions between molecules in the cell is crucial for understanding cellular processes in health and disease. Currently available interaction databases, being largely complementary to each other, must be integrated to obtain a comprehensive global map of the different types of interactions. We have previously reported the development of an integrative interaction database called ConsensusPathDB (http://ConsensusPathDB.org) that aims to fulfill this task. In this update article, we report its significant progress in terms of interaction content and web interface tools. ConsensusPathDB has grown mainly due to the integration of 12 further databases; it now contains 215 541 unique interactions and 4601 pathways from overall 30 databases. Binary protein interactions are scored with our confidence assessment tool, IntScore. The ConsensusPathDB web interface allows users to take advantage of these integrated interaction and pathway data in different contexts. Recent developments include pathway analysis of metabolite lists, visualization of functional gene/metabolite sets as overlap graphs, gene set analysis based on protein complexes and induced network modules analysis that connects a list of genes through various interaction types. To facilitate the interactive, visual interpretation of interaction and pathway data, we have re-implemented the graph visualization feature of ConsensusPathDB using the Cytoscape.js library.

  15. Hydrogen Leak Detection Sensor Database

    NASA Technical Reports Server (NTRS)

    Baker, Barton D.

    2010-01-01

    This slide presentation reviews the characteristics of the Hydrogen Sensor database. The database is the result of NASA's continuing interest in and improvement of its ability to detect and assess gas leaks in space applications. The database specifics and a snapshot of an entry in the database are reviewed. Attempts were made to determine the applicability of each of the 65 sensors for ground and/or vehicle use.

  16. Pathway-specific differences between tumor cell lines and normal and tumor tissue cells

    PubMed Central

    Ertel, Adam; Verghese, Arun; Byers, Stephen W; Ochs, Michael; Tozeren, Aydin

    2006-01-01

    Background Cell lines are used in experimental investigation of cancer but their capacity to represent tumor cells has yet to be quantified. The aim of the study was to identify significant alterations in pathway usage in cell lines in comparison with normal and tumor tissue. Methods This study utilized a pathway-specific enrichment analysis of publicly accessible microarray data and quantified the gene expression differences between cell lines, tumor, and normal tissue cells for six different tissue types. KEGG pathways that are significantly different between cell lines and tumors, cell lines and normal tissues and tumor and normal tissue were identified through enrichment tests on gene lists obtained using Significance Analysis of Microarrays (SAM). Results Cellular pathways that were significantly upregulated in cell lines compared to tumor cells and normal cells of the same tissue type included ATP synthesis, cell communication, cell cycle, oxidative phosphorylation, purine, pyrimidine and pyruvate metabolism, and proteasome. Results on metabolic pathways suggested an increase in the velocity nucleotide metabolism and RNA production. Pathways that were downregulated in cell lines compared to tumor and normal tissue included cell communication, cell adhesion molecules (CAMs), and ECM-receptor interaction. Only a fraction of the significantly altered genes in tumor-to-normal comparison had similar expressions in cancer cell lines and tumor cells. These genes were tissue-specific and were distributed sparsely among multiple pathways. Conclusion Significantly altered genes in tumors compared to normal tissue were largely tissue specific. Among these genes downregulation was a major trend. In contrast, cell lines contained large sets of significantly upregulated genes that were common to multiple tissue types. Pathway upregulation in cell lines was most pronounced over metabolic pathways including cell nucleotide metabolism and oxidative phosphorylation. Signaling

  17. Ontology-based aggregation of biological pathway datasets.

    PubMed

    Jiang, Keyuan; Nash, Christopher

    2005-01-01

    The massive accumulation of biological data in the past decades has generated a significant amount of biological knowledge which is represented in one way as biological pathways. The existence of over 150 pathway databases reflects the diversity of the biological data and heterogeneity of data models, storage formats and access methods. To address an intriguing biological question, it is not uncommon for a biologist to query more one pathway database to acquire a more complete picture of current understanding of biology. To facility life scientists in searching biological pathway data, we designed a biological pathway aggregator which aggregates various pathway datasets via the BioPAX ontology, a community-developed ontology based upon the concept of Semantic Web for integrating and exchanging biological pathway data. Our aggregator is composed of modules that retrieve the data from various sources, transform the raw data to BioPAX format, persist the converted data in the persistent data store, and enable queries by other applications.

  18. Scientific and Technical Document Database

    National Institute of Standards and Technology Data Gateway

    NIST Scientific and Technical Document Database (PC database for purchase)   The images in NIST Special Database 20 contain a very rich set of graphic elements from scientific and technical documents, such as graphs, tables, equations, two column text, maps, pictures, footnotes, annotations, and arrays of such elements.

  19. Microbial Properties Database Editor Tutorial

    EPA Science Inventory

    A Microbial Properties Database Editor (MPDBE) has been developed to help consolidate microbial-relevant data to populate a microbial database and support a database editor by which an authorized user can modify physico-microbial properties related to microbial indicators and pat...

  20. DDRprot: a database of DNA damage response-related proteins

    PubMed Central

    Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M.

    2016-01-01

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome’s integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in (PTM). Sequence searches using hidden Markov models can be also used. Database URL: http://ddr.cbbio.es. PMID:27577567

  1. DDRprot: a database of DNA damage response-related proteins.

    PubMed

    Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M

    2016-01-01

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome's integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in (PTM). Sequence searches using hidden Markov models can be also used.Database URL: http://ddr.cbbio.es. PMID:27577567

  2. Pleurochrysome: A Web Database of Pleurochrysis Transcripts and Orthologs Among Heterogeneous Algae

    PubMed Central

    Fujiwara, Shoko; Takatsuka, Yukiko; Hirokawa, Yasutaka; Tsuzuki, Mikio; Takano, Tomoyuki; Kobayashi, Masaaki; Suda, Kunihiro; Asamizu, Erika; Yokoyama, Koji; Shibata, Daisuke; Tabata, Satoshi; Yano, Kentaro

    2016-01-01

    Pleurochrysis is a coccolithophorid genus, which belongs to the Coccolithales in the Haptophyta. The genus has been used extensively for biological research, together with Emiliania in the Isochrysidales, to understand distinctive features between the two coccolithophorid-including orders. However, molecular biological research on Pleurochrysis such as elucidation of the molecular mechanism behind coccolith formation has not made great progress at least in part because of lack of comprehensive gene information. To provide such information to the research community, we built an open web database, the Pleurochrysome (http://bioinf.mind.meiji.ac.jp/phapt/), which currently stores 9,023 unique gene sequences (designated as UNIGENEs) assembled from expressed sequence tag sequences of P. haptonemofera as core information. The UNIGENEs were annotated with gene sequences sharing significant homology, conserved domains, Gene Ontology, KEGG Orthology, predicted subcellular localization, open reading frames and orthologous relationship with genes of 10 other algal species, a cyanobacterium and the yeast Saccharomyces cerevisiae. This sequence and annotation information can be easily accessed via several search functions. Besides fundamental functions such as BLAST and keyword searches, this database also offers search functions to explore orthologous genes in the 12 organisms and to seek novel genes. The Pleurochrysome will promote molecular biological and phylogenetic research on coccolithophorids and other haptophytes by helping scientists mine data from the primary transcriptome of P. haptonemofera. PMID:26746174

  3. Web-accessible proteome databases for microbial research.

    PubMed

    Pleissner, Klaus-Peter; Eifert, Till; Buettner, Sven; Schmidt, Frank; Boehme, Martina; Meyer, Thomas F; Kaufmann, Stefan H E; Jungblut, Peter R

    2004-05-01

    The analysis of proteomes of biological organisms represents a major challenge of the post-genome era. Classical proteomics combines two-dimensional electrophoresis (2-DE) and mass spectrometry (MS) for the identification of proteins. Novel technologies such as isotope coded affinity tag (ICAT)-liquid chromatography/mass spectrometry (LC/MS) open new insights into protein alterations. The vast amount and diverse types of proteomic data require adequate web-accessible computational and database technologies for storage, integration, dissemination, analysis and visualization. A proteome database system (http://www.mpiib-berlin.mpg.de/2D-PAGE) for microbial research has been constructed which integrates 2-DE/MS, ICAT-LC/MS and functional classification data of proteins with genomic, metabolic and other biological knowledge sources. The two-dimensional polyacrylamide gel electrophoresis database delivers experimental data on microbial proteins including mass spectra for the validation of protein identification. The ICAT-LC/MS database comprises experimental data for protein alterations of mycobacterial strains BCG vs. H37Rv. By formulating complex queries within a functional protein classification database "FUNC_CLASS" for Mycobacterium tuberculosis and Helicobacter pylori the researcher can gather precise information on genes, proteins, protein classes and metabolic pathways. The use of the R language in the database architecture allows high-level data analysis and visualization to be performed "on-the-fly". The database system is centrally administrated, and investigators without specific bioinformatic competence in database construction can submit their data. The database system also serves as a template for a prototype of a European Proteome Database of Pathogenic Bacteria. Currently, the database system includes proteome information for six strains of microorganisms.

  4. Downregulation of Type II Diabetes Mellitus and Maturity Onset Diabetes of Young Pathways in Human Pancreatic Islets from Hyperglycemic Donors

    PubMed Central

    Groop, Leif

    2014-01-01

    Although several molecular pathways have been linked to type 2 diabetes (T2D) pathogenesis, it is uncertain which pathway has the most implication on the disease. Changes in the expression of an entire pathway might be more important for disease pathogenesis than changes in the expression of individual genes. To identify the molecular alterations in T2D, DNA microarrays of human pancreatic islets from donors with hyperglycemia (n = 20) and normoglycemia (n = 58) were subjected to Gene Set Enrichment Analysis (GSEA). About 178 KEGG pathways were investigated for gene expression changes between hyperglycemic donors compared to normoglycemic. Pathway enrichment analysis showed that type II diabetes mellitus (T2DM) and maturity onset diabetes of the young (MODY) pathways are downregulated in hyperglycemic donors, while proteasome and spliceosome pathways are upregulated. The mean centroid of gene expression of T2DM and MODY pathways was shown to be associated positively with insulin secretion and negatively with HbA1c level. To conclude, downregulation of T2DM and MODY pathways is involved in islet function and might be involved in T2D. Also, the study demonstrates that gene expression profiles from pancreatic islets can reveal some of the biological processes related to regulation of glucose hemostats and diabetes pathogenesis. PMID:25379510

  5. Comparison of Signaling Pathways Gene Expression in CD34(-) Umbilical Cord Blood and Bone Marrow Stem Cells.

    PubMed

    Stojko, Rafał; Bojdys-Szyndlar, Monika; Drosdzol-Cop, Agnieszka; Madej, Andrzej; Wilk, Krzysztof

    2016-01-01

    The aim of the study was to compare the biological activity of the total pool of genes in CD34(-) umbilical cord blood and bone marrow stem cells and to search for the differences in signaling pathway gene expression responsible for the biological processes. The introductory analysis revealed a big similarity of gene expression among stem cells. When analyzing GO terms for biological processes, we observed an increased activity of JAK-STAT signaling pathway, calcium-mediated, cytokine-mediated, integrin-mediated signaling pathway, and MAPK in a cluster of upregulating genes in CD34(-) umbilical cord blood stem cells. At the same time, we observed a decreased activity of BMP signaling pathways, TGF-beta pathway, and VEGF receptor signaling pathway in a cluster of downregulating genes in CD34(-) umbilical cord blood stem cells. In accordance with KEGG classification, the cytokine-cytokine receptor interaction, toll-like receptor signaling pathway, and JAK-STAT signaling pathway are overrepresented in CD34(-) umbilical cord blood stem cells. A similar gene expression in both CD34(-) UCB and BM stem cells was characteristic for such biological processes as cell division, cell cycle gene expression, mitosis, telomere maintenance with telomerase, RNA and DNA treatment processes during cell division, and similar genes activity of Notch and Wnt signaling pathways.

  6. Comparison of Signaling Pathways Gene Expression in CD34− Umbilical Cord Blood and Bone Marrow Stem Cells

    PubMed Central

    Stojko, Rafał; Bojdys-Szyndlar, Monika; Drosdzol-Cop, Agnieszka; Madej, Andrzej; Wilk, Krzysztof

    2016-01-01

    The aim of the study was to compare the biological activity of the total pool of genes in CD34− umbilical cord blood and bone marrow stem cells and to search for the differences in signaling pathway gene expression responsible for the biological processes. The introductory analysis revealed a big similarity of gene expression among stem cells. When analyzing GO terms for biological processes, we observed an increased activity of JAK-STAT signaling pathway, calcium-mediated, cytokine-mediated, integrin-mediated signaling pathway, and MAPK in a cluster of upregulating genes in CD34− umbilical cord blood stem cells. At the same time, we observed a decreased activity of BMP signaling pathways, TGF-beta pathway, and VEGF receptor signaling pathway in a cluster of downregulating genes in CD34− umbilical cord blood stem cells. In accordance with KEGG classification, the cytokine-cytokine receptor interaction, toll-like receptor signaling pathway, and JAK-STAT signaling pathway are overrepresented in CD34− umbilical cord blood stem cells. A similar gene expression in both CD34− UCB and BM stem cells was characteristic for such biological processes as cell division, cell cycle gene expression, mitosis, telomere maintenance with telomerase, RNA and DNA treatment processes during cell division, and similar genes activity of Notch and Wnt signaling pathways. PMID:26839563

  7. EMU Lessons Learned Database

    NASA Technical Reports Server (NTRS)

    Matthews, Kevin M., Jr.; Crocker, Lori; Cupples, J. Scott

    2011-01-01

    As manned space exploration takes on the task of traveling beyond low Earth orbit, many problems arise that must be solved in order to make the journey possible. One major task is protecting humans from the harsh space environment. The current method of protecting astronauts during Extravehicular Activity (EVA) is through use of the specially designed Extravehicular Mobility Unit (EMU). As more rigorous EVA conditions need to be endured at new destinations, the suit will need to be tailored and improved in order to accommodate the astronaut. The Objective behind the EMU Lessons Learned Database(LLD) is to be able to create a tool which will assist in the development of next-generation EMUs, along with maintenance and improvement of the current EMU, by compiling data from Failure Investigation and Analysis Reports (FIARs) which have information on past suit failures. FIARs use a system of codes that give more information on the aspects of the failure, but if one is unfamiliar with the EMU they will be unable to decipher the information. A goal of the EMU LLD is to not only compile the information, but to present it in a user-friendly, organized, searchable database accessible to all familiarity levels with the EMU; both newcomers and veterans alike. The EMU LLD originally started as an Excel database, which allowed easy navigation and analysis of the data through pivot charts. Creating an entry requires access to the Problem Reporting And Corrective Action database (PRACA), which contains the original FIAR data for all hardware. FIAR data are then transferred to, defined, and formatted in the LLD. Work is being done to create a web-based version of the LLD in order to increase accessibility to all of Johnson Space Center (JSC), which includes converting entries from Excel to the HTML format. FIARs related to the EMU have been completed in the Excel version, and now focus has shifted to expanding FIAR data in the LLD to include EVA tools and support hardware such as

  8. Databases as an information service

    NASA Technical Reports Server (NTRS)

    Vincent, D. A.

    1983-01-01

    The relationship of databases to information services, and the range of information services users and their needs for information is explored and discussed. It is argued that for database information to be valuable to a broad range of users, it is essential that access methods be provided that are relatively unstructured and natural to information services users who are interested in the information contained in databases, but who are not willing to learn and use traditional structured query languages. Unless this ease of use of databases is considered in the design and application process, the potential benefits from using database systems may not be realized.

  9. Construction of file database management

    SciTech Connect

    MERRILL,KYLE J.

    2000-03-01

    This work created a database for tracking data analysis files from multiple lab techniques and equipment stored on a central file server. Experimental details appropriate for each file type are pulled from the file header and stored in a searchable database. The database also stores specific location and self-directory structure for each data file. Queries can be run on the database according to file type, sample type or other experimental parameters. The database was constructed in Microsoft Access and Visual Basic was used for extraction of information from the file header.

  10. Database-assisted promoter analysis.

    PubMed

    Hehl, R; Wingender, E

    2001-06-01

    The analysis of regulatory sequences is greatly facilitated by database-assisted bioinformatic approaches. The TRANSFAC database contains information on transcription factors and their origins, functional properties and sequence-specific binding activities. Software tools enable us to screen the database with a given DNA sequence for interacting transcription factors. If a regulatory function is already attributed to this sequence then the database-assisted identification of binding sites for proteins or protein classes and subsequent experimental verification might establish functionally relevant sites within this sequence. The binding transcription factors and interacting factors might already be present in the database.

  11. UGTA Photograph Database

    SciTech Connect

    NSTec Environmental Restoration

    2009-04-20

    One of the advantages of the Nevada Test Site (NTS) is that most of the geologic and hydrologic features such as hydrogeologic units (HGUs), hydrostratigraphic units (HSUs), and faults, which are important aspects of flow and transport modeling, are exposed at the surface somewhere in the vicinity of the NTS and thus are available for direct observation. However, due to access restrictions and the remote locations of many of the features, most Underground Test Area (UGTA) participants cannot observe these features directly in the field. Fortunately, National Security Technologies, LLC, geologists and their predecessors have photographed many of these features through the years. During fiscal year 2009, work was done to develop an online photograph database for use by the UGTA community. Photographs were organized, compiled, and imported into Adobe® Photoshop® Elements 7. The photographs were then assigned keyword tags such as alteration type, HGU, HSU, location, rock feature, rock type, and stratigraphic unit. Some fully tagged photographs were then selected and uploaded to the UGTA website. This online photograph database provides easy access for all UGTA participants and can help “ground truth” their analytical and modeling tasks. It also provides new participants a resource to more quickly learn the geology and hydrogeology of the NTS.

  12. Asbestos Exposure Assessment Database

    NASA Technical Reports Server (NTRS)

    Arcot, Divya K.

    2010-01-01

    Exposure to particular hazardous materials in a work environment is dangerous to the employees who work directly with or around the materials as well as those who come in contact with them indirectly. In order to maintain a national standard for safe working environments and protect worker health, the Occupational Safety and Health Administration (OSHA) has set forth numerous precautionary regulations. NASA has been proactive in adhering to these regulations by implementing standards which are often stricter than regulation limits and administering frequent health risk assessments. The primary objective of this project is to create the infrastructure for an Asbestos Exposure Assessment Database specific to NASA Johnson Space Center (JSC) which will compile all of the exposure assessment data into a well-organized, navigable format. The data includes Sample Types, Samples Durations, Crafts of those from whom samples were collected, Job Performance Requirements (JPR) numbers, Phased Contrast Microscopy (PCM) and Transmission Electron Microscopy (TEM) results and qualifiers, Personal Protective Equipment (PPE), and names of industrial hygienists who performed the monitoring. This database will allow NASA to provide OSHA with specific information demonstrating that JSC s work procedures are protective enough to minimize the risk of future disease from the exposures. The data has been collected by the NASA contractors Computer Sciences Corporation (CSC) and Wyle Laboratories. The personal exposure samples were collected from devices worn by laborers working at JSC and by building occupants located in asbestos-containing buildings.

  13. Curcumin Resource Database.

    PubMed

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoid, Curcuma longa being its principal producer. This apart, a class of promising curcumin analogs has been generated in laboratories, aptly named as Curcuminoids which are showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin as well as curcuminoids has been felt by the curcumin research community for long. Hence, in an attempt to address this stumbling block, we have developed Curcumin Resource Database (CRDB) that aims to perform as a gateway-cum-repository to access all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that are hyperlinked with external sources. CRDB is expected to be highly useful to the researchers working on structure as well as ligand-based molecular design of curcumin analogs.

  14. Curcumin Resource Database.

    PubMed

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoid, Curcuma longa being its principal producer. This apart, a class of promising curcumin analogs has been generated in laboratories, aptly named as Curcuminoids which are showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin as well as curcuminoids has been felt by the curcumin research community for long. Hence, in an attempt to address this stumbling block, we have developed Curcumin Resource Database (CRDB) that aims to perform as a gateway-cum-repository to access all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that are hyperlinked with external sources. CRDB is expected to be highly useful to the researchers working on structure as well as ligand-based molecular design of curcumin analogs. PMID:26220923

  15. Petrophysical database of Uganda

    NASA Astrophysics Data System (ADS)

    Ruotoistenmäki, Tapio; Birungi, Nelson R.

    2015-06-01

    The petrophysical database of Uganda contains data on ca. 5800 rock samples collected and analyzed during 2009-2012 in international geological and geophysical projects covering the main part of the land area of Uganda. The parameters included are the susceptibilities and densities of all available field samples. Susceptibilities were measured from the samples from three directions. Using these parameters, we also calculated the ratios of susceptibility maxima/minima reflecting direction homogeneity of magnetic minerals, and estimated the iron content of paramagnetic samples and the magnetite content of ferrimagnetic samples. Statistical and visual analysis of the petrophysical data of Uganda demonstrated their wide variation, thus emphasizing their importance in analyzing the bedrock variations in three dimensions. Using the density-susceptibility diagram, the data can be classified into six main groups: 1. A low density and susceptibility group, consisting of sedimentary and altered rocks. 2. Low-susceptibility, felsic rocks (e.g. quartzites and metasandstones). 3. Paramagnetic, felsic rocks (e.g. granites). 4. Ferrimagnetic, magnetite-containing felsic rocks (e.g. granites). 5. Paramagnetic mafic rocks (e.g. amphibolites and dolerites). 6. Ferrimagnetic, mafic rocks containing magnetite and high-density mafic minerals (mainly dolerites). Moreover, analysis revealed that the parameter distributions of even a single rock type (e.g. granites) can be very variable, forming separate clusters. This demonstrates that the simple calculation of density or susceptibility averages of rock types can be highly erratic. For example, the average can lie between two groups, where only few, if any, samples exist. Therefore, estimation of the representative density and susceptibility must be visually verified from these diagrams. The areal distribution of parameters and their calculated derivatives generally correlate well with the regional distribution of lithological and

  16. CSF Proteomics Identifies Specific and Shared Pathways for Multiple Sclerosis Clinical Subtypes

    PubMed Central

    Avsar, Timucin; Durası, İlknur Melis; Uygunoğlu, Uğur; Tütüncü, Melih; Demirci, Nuri Onat; Saip, Sabahattin; Sezerman, O. Uğur; Siva, Aksel; Tahir Turanlı, Eda

    2015-01-01

    Multiple sclerosis (MS) is an immune-mediated, neuro-inflammatory, demyelinating and neurodegenerative disease of the central nervous system (CNS) with a heterogeneous clinical presentation and course. There is a remarkable phenotypic heterogeneity in MS, and the molecular mechanisms underlying it remain unknown. We aimed to investigate further the etiopathogenesis related molecular pathways in subclinical types of MS using proteomic and bioinformatics approaches in cerebrospinal fluids of patients with clinically isolated syndrome, relapsing remitting MS and progressive MS (n=179). Comparison of disease groups with controls revealed a total of 151 proteins that are differentially expressed in clinically different MS subtypes. KEGG analysis using PANOGA tool revealed the disease related pathways including aldosterone-regulated sodium reabsorption (p=8.02x10-5) which is important in the immune cell migration, renin-angiotensin (p=6.88x10-5) system that induces Th17 dependent immunity, notch signaling (p=1.83x10-10) pathway indicating the activated remyelination and vitamin digestion and absorption pathways (p=1.73x10-5). An emerging theme from our studies is that whilst all MS clinical forms share common biological pathways, there are also clinical subtypes specific and pathophysiology related pathways which may have further therapeutic implications. PMID:25942430

  17. Enriched pathways for major depressive disorder identified from a genome-wide association study.

    PubMed

    Kao, Chung-Feng; Jia, Peilin; Zhao, Zhongming; Kuo, Po-Hsiu

    2012-11-01

    Major depressive disorder (MDD) has caused a substantial burden of disease worldwide with moderate heritability. Despite efforts through conducting numerous association studies and now, genome-wide association (GWA) studies, the success of identifying susceptibility loci for MDD has been limited, which is partially attributed to the complex nature of depression pathogenesis. A pathway-based analytic strategy to investigate the joint effects of various genes within specific biological pathways has emerged as a powerful tool for complex traits. The present study aimed to identify enriched pathways for depression using a GWA dataset for MDD. For each gene, we estimated its gene-wise p value using combined and minimum p value, separately. Canonical pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and BioCarta were used. We employed four pathway-based analytic approaches (gene set enrichment analysis, hypergeometric test, sum-square statistic, sum-statistic). We adjusted for multiple testing using Benjamini & Hochberg's method to report significant pathways. We found 17 significantly enriched pathways for depression, which presented low-to-intermediate crosstalk. The top four pathways were long-term depression (p⩽1×10-5), calcium signalling (p⩽6×10-5), arrhythmogenic right ventricular cardiomyopathy (p⩽1.6×10-4) and cell adhesion molecules (p⩽2.2×10-4). In conclusion, our comprehensive pathway analyses identified promising pathways for depression that are related to neurotransmitter and neuronal systems, immune system and inflammatory response, which may be involved in the pathophysiological mechanisms underlying depression. We demonstrated that pathway enrichment analysis is promising to facilitate our understanding of complex traits through a deeper interpretation of GWA data. Application of this comprehensive analytic strategy in upcoming GWA data for depression could validate the findings reported in this study.

  18. Clerkship pathway

    PubMed Central

    MacLellan, Anne-Marie; Brailovsky, Carlos; Miller, François; Leboeuf, Sylvie

    2012-01-01

    Abstract Objective To identify factors that help predict success for international medical graduates (IMGs) who train in Canadian residency programs and pass the Canadian certification examinations. Design A retrospective analysis of 58 variables in the files of IMGs who applied to the Collège des médecins du Québec between 2000 and 2008. Setting Quebec. Participants Eight hundred ten IMGs who applied to the Collège des médecins du Québec through either the “equivalency pathway” (ie, starting training at a residency level) or the “clerkship pathway” (ie, relearning at the level of a medical student in the last 2 years of the MD diploma). Main outcome measures Success factors in achieving certification. Data were analyzed using descriptive statistics and ANOVA (analysis of variance). Results International medical graduates who chose the “clerkship pathway” had greater success on certification examinations than those who started at the residency level did. Conclusion There are several factors that influence IMGs’ success on certification examinations, including integration issues, the acquisition of clinical decision-making skills, and the varied educational backgrounds. These factors perhaps can be better addressed by a regular clerkship pathway, in which IMGs benefit from learner-centred teaching and have more time for reflection on and understanding of the North American approach to medical education. The clerkship pathway is a useful strategy for assuring the integration of IMGs in the North American health care system. A 2-year relearning period in medical school at a clinical clerkship level deserves careful consideration. PMID:22859630

  19. kpath: integration of metabolic pathway linked data

    PubMed Central

    Navas-Delgado, Ismael; García-Godoy, María Jesús; López-Camacho, Esteban; Rybinski, Maciej; Reyes-Palomares, Armando; Medina, Miguel Ángel; Aldana-Montes, José F.

    2015-01-01

    In the last few years, the Life Sciences domain has experienced a rapid growth in the amount of available biological databases. The heterogeneity of these databases makes data integration a challenging issue. Some integration challenges are locating resources, relationships, data formats, synonyms or ambiguity. The Linked Data approach partially solves the heterogeneity problems by introducing a uniform data representation model. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that enables not only the browsing, but also the deep use of the integrated data to build metabolic networks based on existing disperse knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases. Database URL: The public Linked Data repository can be queried at http://sparql.kpath.khaos.uma.es using the graph URI “www.khaos.uma.es/metabolic-pathways-app”. The GUI providing navigational access to kpath database is available at http://browser.kpath.khaos.uma.es. PMID:26055101

  20. National Geochronological Database

    USGS Publications Warehouse

    Revised by Sloan, Jan; Henry, Christopher D.; Hopkins, Melanie; Ludington, Steve; Original database by Zartman, Robert E.; Bush, Charles A.; Abston, Carl

    2003-01-01

    The National Geochronological Data Base (NGDB) was established by the United States Geological Survey (USGS) to collect and organize published isotopic (also known as radiometric) ages of rocks in the United States. The NGDB (originally known as the Radioactive Age Data Base, RADB) was started in 1974. A committee appointed by the Director of the USGS was given the mission to investigate the feasibility of compiling the published radiometric ages for the United States into a computerized data bank for ready access by the user community. A successful pilot program, which was conducted in 1975 and 1976 for the State of Wyoming, led to a decision to proceed with the compilation of the entire United States. For each dated rock sample reported in published literature, a record containing information on sample location, rock description, analytical data, age, interpretation, and literature citation was constructed and included in the NGDB. The NGDB was originally constructed and maintained on a mainframe computer, and later converted to a Helix Express relational database maintained on an Apple Macintosh desktop computer. The NGDB and a program to search the data files were published and distributed on Compact Disc-Read Only Memory (CD-ROM) in standard ISO 9660 format as USGS Digital Data Series DDS-14 (Zartman and others, 1995). As of May 1994, the NGDB consisted of more than 18,000 records containing over 30,000 individual ages, which is believed to represent approximately one-half the number of ages published for the United States through 1991. Because the organizational unit responsible for maintaining the database was abolished in 1996, and because we wanted to provide the data in more usable formats, we have reformatted the data, checked and edited the information in some records, and provided this online version of the NGDB. This report describes the changes made to the data and formats, and provides instructions for the use of the database in geographic

  1. A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations

    PubMed Central

    Zhang, Han; Wheeler, William; Hyland, Paula L.; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

    2016-01-01

    Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed 7 out of the 43 pathways identified in European populations remained to be significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs. PMID:27362418

  2. A Powerful Procedure for Pathway-Based Meta-analysis Using Summary Statistics Identifies 43 Pathways Associated with Type II Diabetes in European Populations.

    PubMed

    Zhang, Han; Wheeler, William; Hyland, Paula L; Yang, Yifan; Shi, Jianxin; Chatterjee, Nilanjan; Yu, Kai

    2016-06-01

    Meta-analysis of multiple genome-wide association studies (GWAS) has become an effective approach for detecting single nucleotide polymorphism (SNP) associations with complex traits. However, it is difficult to integrate the readily accessible SNP-level summary statistics from a meta-analysis into more powerful multi-marker testing procedures, which generally require individual-level genetic data. We developed a general procedure called Summary based Adaptive Rank Truncated Product (sARTP) for conducting gene and pathway meta-analysis that uses only SNP-level summary statistics in combination with genotype correlation estimated from a panel of individual-level genetic data. We demonstrated the validity and power advantage of sARTP through empirical and simulated data. We conducted a comprehensive pathway-based meta-analysis with sARTP on type 2 diabetes (T2D) by integrating SNP-level summary statistics from two large studies consisting of 19,809 T2D cases and 111,181 controls with European ancestry. Among 4,713 candidate pathways from which genes in neighborhoods of 170 GWAS established T2D loci were excluded, we detected 43 T2D globally significant pathways (with Bonferroni corrected p-values < 0.05), which included the insulin signaling pathway and T2D pathway defined by KEGG, as well as the pathways defined according to specific gene expression patterns on pancreatic adenocarcinoma, hepatocellular carcinoma, and bladder carcinoma. Using summary data from 8 eastern Asian T2D GWAS with 6,952 cases and 11,865 controls, we showed 7 out of the 43 pathways identified in European populations remained to be significant in eastern Asians at the false discovery rate of 0.1. We created an R package and a web-based tool for sARTP with the capability to analyze pathways with thousands of genes and tens of thousands of SNPs.

  3. The CHIANTI atomic database

    NASA Astrophysics Data System (ADS)

    Young, P. R.; Dere, K. P.; Landi, E.; Del Zanna, G.; Mason, H. E.

    2016-04-01

    The freely available CHIANTI atomic database was first released in 1996 and has had a huge impact on the analysis and modeling of emissions from astrophysical plasmas. It contains data and software for modeling optically thin atom and positive ion emission from low density (≲1013 cm-3) plasmas from x-ray to infrared wavelengths. A key feature is that the data are assessed and regularly updated, with version 8 released in 2015. Atomic data for modeling the emissivities of 246 ions and neutrals are contained in CHIANTI, together with data for deriving the ionization fractions of all elements up to zinc. The different types of atomic data are summarized here and their formats discussed. Statistics on the impact of CHIANTI to the astrophysical community are given and examples of the diverse range of applications are presented.

  4. Ribosomal Database Project II

    DOE Data Explorer

    The Ribosomal Database Project (RDP) provides ribosome related data and services to the scientific community, including online data analysis and aligned and annotated Bacterial small-subunit 16S rRNA sequences. As of March 2008, RDP Release 10 is available and currently (August 2009) contains 1,074,075 aligned 16S rRNA sequences. Data that can be downloaded include zipped GenBank and FASTA alignment files, a histogram (in Excel) of the number of RDP sequences spanning each base position, data in the Functional Gene Pipeline Repository, and various user submitted data. The RDP-II website also provides numerous analysis tools.[From the RDP-II home page at http://rdp.cme.msu.edu/index.jsp

  5. View generated database

    NASA Technical Reports Server (NTRS)

    Downward, James G.

    1992-01-01

    This document represents the final report for the View Generated Database (VGD) project, NAS7-1066. It documents the work done on the project up to the point at which all project work was terminated due to lack of project funds. The VGD was to provide the capability to accurately represent any real-world object or scene as a computer model. Such models include both an accurate spatial/geometric representation of surfaces of the object or scene, as well as any surface detail present on the object. Applications of such models are numerous, including acquisition and maintenance of work models for tele-autonomous systems, generation of accurate 3-D geometric/photometric models for various 3-D vision systems, and graphical models for realistic rendering of 3-D scenes via computer graphics.

  6. kpath: integration of metabolic pathway linked data.

    PubMed

    Navas-Delgado, Ismael; García-Godoy, María Jesús; López-Camacho, Esteban; Rybinski, Maciej; Reyes-Palomares, Armando; Medina, Miguel Ángel; Aldana-Montes, José F

    2015-01-01

    In the last few years, the Life Sciences domain has experienced a rapid growth in the amount of available biological databases. The heterogeneity of these databases makes data integration a challenging issue. Some integration challenges are locating resources, relationships, data formats, synonyms or ambiguity. The Linked Data approach partially solves the heterogeneity problems by introducing a uniform data representation model. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that enables not only the browsing, but also the deep use of the integrated data to build metabolic networks based on existing disperse knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases. PMID:26055101

  7. kpath: integration of metabolic pathway linked data.

    PubMed

    Navas-Delgado, Ismael; García-Godoy, María Jesús; López-Camacho, Esteban; Rybinski, Maciej; Reyes-Palomares, Armando; Medina, Miguel Ángel; Aldana-Montes, José F

    2015-01-01

    In the last few years, the Life Sciences domain has experienced a rapid growth in the amount of available biological databases. The heterogeneity of these databases makes data integration a challenging issue. Some integration challenges are locating resources, relationships, data formats, synonyms or ambiguity. The Linked Data approach partially solves the heterogeneity problems by introducing a uniform data representation model. Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. This article introduces kpath, a database that integrates information related to metabolic pathways. kpath also provides a navigational interface that enables not only the browsing, but also the deep use of the integrated data to build metabolic networks based on existing disperse knowledge. This user interface has been used to showcase relationships that can be inferred from the information available in several public databases.

  8. CycADS: an annotation database system to ease the development and update of BioCyc databases.

    PubMed

    Vellozo, Augusto F; Véron, Amélie S; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

    2011-01-01

    In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway tool software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including for example: KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage includes also, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Linked to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. Database URL: http://www.cycadsys.org.

  9. CycADS: an annotation database system to ease the development and update of BioCyc databases.

    PubMed

    Vellozo, Augusto F; Véron, Amélie S; Baa-Puyoulet, Patrice; Huerta-Cepas, Jaime; Cottret, Ludovic; Febvay, Gérard; Calevro, Federica; Rahbé, Yvan; Douglas, Angela E; Gabaldón, Toni; Sagot, Marie-France; Charles, Hubert; Colella, Stefano

    2011-01-01

    In recent years, genomes from an increasing number of organisms have been sequenced, but their annotation remains a time-consuming process. The BioCyc databases offer a framework for the integrated analysis of metabolic networks. The Pathway tool software suite allows the automated construction of a database starting from an annotated genome, but it requires prior integration of all annotations into a specific summary file or into a GenBank file. To allow the easy creation and update of a BioCyc database starting from the multiple genome annotation resources available over time, we have developed an ad hoc data management system that we called Cyc Annotation Database System (CycADS). CycADS is centred on a specific database model and on a set of Java programs to import, filter and export relevant information. Data from GenBank and other annotation sources (including for example: KAAS, PRIAM, Blast2GO and PhylomeDB) are collected into a database to be subsequently filtered and extracted to generate a complete annotation file. This file is then used to build an enriched BioCyc database using the PathoLogic program of Pathway Tools. The CycADS pipeline for annotation management was used to build the AcypiCyc database for the pea aphid (Acyrthosiphon pisum) whose genome was recently sequenced. The AcypiCyc database webpage includes also, for comparative analyses, two other metabolic reconstruction BioCyc databases generated using CycADS: TricaCyc for Tribolium castaneum and DromeCyc for Drosophila melanogaster. Linked to its flexible design, CycADS offers a powerful software tool for the generation and regular updating of enriched BioCyc databases. The CycADS system is particularly suited for metabolic gene annotation and network reconstruction in newly sequenced genomes. Because of the uniform annotation used for metabolic network reconstruction, CycADS is particularly useful for comparative analysis of the metabolism of different organisms. Database URL: http

  10. Inorganic Crystal Structure Database (ICSD)

    National Institute of Standards and Technology Data Gateway

    SRD 84 FIZ/NIST Inorganic Crystal Structure Database (ICSD) (PC database for purchase)   The Inorganic Crystal Structure Database (ICSD) is produced cooperatively by the Fachinformationszentrum Karlsruhe(FIZ) and the National Institute of Standards and Technology (NIST). The ICSD is a comprehensive collection of crystal structure data of inorganic compounds containing more than 140,000 entries and covering the literature from 1915 to the present.

  11. OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder

    PubMed Central

    Privitera, Anna P.; Distefano, Rosario; Wefer, Hugo A.; Ferro, Alfredo; Pulvirenti, Alfredo; Giugno, Rosalba

    2015-01-01

    Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwilling thoughts (obsessions) giving rise to anxiety. The patients feel obliged to perform a behavior (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. In the class of Anxiety Disorders, OCD is a pathology that shows an hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would be helpful to improve the current knowledge on this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatments. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references with a brief description of the gene and the experimental conditions are shown. The database also lists the polymorphisms within genes and its chromosomal regions. OCDB data is enriched with both validated and predicted miRNA-target and drug-target information. The transcription factors regulations, which are also included, are taken from David and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez-gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, Kegg, Disease-ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. This allows (i) to browse data by category and (ii) to navigate in the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format as well as the whole database in.sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. It also facilitates the evaluation of genetic data

  12. Biological Databases for Behavioral Neurobiology

    PubMed Central

    Baker, Erich J.

    2014-01-01

    Databases are, at their core, abstractions of data and their intentionally derived relationships. They serve as a central organizing metaphor and repository, supporting or augmenting nearly all bioinformatics. Behavioral domains provide a unique stage for contemporary databases, as research in this area spans diverse data types, locations, and data relationships. This chapter provides foundational information on the diversity and prevalence of databases, how data structures support the various needs of behavioral neuroscience analysis and interpretation. The focus is on the classes of databases, data curation, and advanced applications in bioinformatics using examples largely drawn from research efforts in behavioral neuroscience. PMID:23195119

  13. Relativistic quantum private database queries

    NASA Astrophysics Data System (ADS)

    Sun, Si-Jia; Yang, Yu-Guang; Zhang, Ming-Ou

    2015-04-01

    Recently, Jakobi et al. (Phys Rev A 83, 022301, 2011) suggested the first practical private database query protocol (J-protocol) based on the Scarani et al. (Phys Rev Lett 92, 057901, 2004) quantum key distribution protocol. Unfortunately, the J-protocol is just a cheat-sensitive private database query protocol. In this paper, we present an idealized relativistic quantum private database query protocol based on Minkowski causality and the properties of quantum information. Also, we prove that the protocol is secure in terms of the user security and the database security.

  14. iTRAQ-Based Quantitative Proteomic Analysis of Cotton Roots and Leaves Reveals Pathways Associated with Salt Stress

    PubMed Central

    Shang, Haihong; Liu, Shaodong; Peng, Jun; Gong, Wankui; Shi, Yuzhen; Zhang, Siping; Li, Junwen; Gong, Juwu; Ge, Qun; Liu, Aiying; Ma, Huijuan; Zhao, Xinhua; Yuan, Youlu

    2016-01-01

    Salinity is a major abiotic stress that affects plant growth and development. In this study, we performed a proteomic analysis of cotton roots and leaf tissue following exposure to saline stress. 611 and 1477 proteins were differentially expressed in the roots and leaves, respectively. In the roots, 259 (42%) proteins were up-regulated and 352 (58%) were down-regulated. In the leaves, 748 (51%) proteins were up-regulated and 729 (49%) were down-regulated. On the basis of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we concluded that the phenylalanine metabolism and starch and sucrose metabolism were active for energy homeostasis to cope with salt stress in cotton roots. Moreover, photosynthesis, pyruvate metabolism, glycolysis / gluconeogenesis, carbon fixation in photosynthetic organisms and phenylalanine metabolism were inhabited to reduce energy consumption. Characterization of the signaling pathways will help elucidate the mechanism activated by cotton in response to salt stress. PMID:26841024

  15. iTRAQ-Based Quantitative Proteomic Analysis of Cotton Roots and Leaves Reveals Pathways Associated with Salt Stress.

    PubMed

    Chen, Tingting; Zhang, Lei; Shang, Haihong; Liu, Shaodong; Peng, Jun; Gong, Wankui; Shi, Yuzhen; Zhang, Siping; Li, Junwen; Gong, Juwu; Ge, Qun; Liu, Aiying; Ma, Huijuan; Zhao, Xinhua; Yuan, Youlu

    2016-01-01

    Salinity is a major abiotic stress that affects plant growth and development. In this study, we performed a proteomic analysis of cotton roots and leaf tissue following exposure to saline stress. 611 and 1477 proteins were differentially expressed in the roots and leaves, respectively. In the roots, 259 (42%) proteins were up-regulated and 352 (58%) were down-regulated. In the leaves, 748 (51%) proteins were up-regulated and 729 (49%) were down-regulated. On the basis of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we concluded that the phenylalanine metabolism and starch and sucrose metabolism were active for energy homeostasis to cope with salt stress in cotton roots. Moreover, photosynthesis, pyruvate metabolism, glycolysis / gluconeogenesis, carbon fixation in photosynthetic organisms and phenylalanine metabolism were inhabited to reduce energy consumption. Characterization of the signaling pathways will help elucidate the mechanism activated by cotton in response to salt stress.

  16. Conceptual Universal Database Language: Moving Up the Database Design Levels

    NASA Astrophysics Data System (ADS)

    Karanikolas, Nikitas N.; Vassilakopoulos, Michael Gr.

    Today, the simplicity of the relational model types affects Information Systems design. We favor another approach where the Information System designers would be able to portray directly the real world in a database model that provides more powerful and composite data types, as those of the real world. However, more powerful models, like the Frame Database Model (FDB) model, need query and manipulation languages that can handle the features of the new composite data types. We demonstrate that the adoption of such a language, the Conceptual Universal Database Language (CUDL), leads to higher database design levels: a database modeled by Entity-Relationship (ER) diagrams can be first transformed to the CUDL Abstraction Level (CAL), which can be then transformed to the FDB model. Since, the latter transformation has been previously studied, to complete the design process, we present a set of rules for the transformation from ER diagrams to CAL.

  17. The Protein Ensemble Database.

    PubMed

    Varadi, Mihaly; Tompa, Peter

    2015-01-01

    The scientific community's major conceptual notion of structural biology has recently shifted in emphasis from the classical structure-function paradigm due to the emergence of intrinsically disordered proteins (IDPs). As opposed to their folded cousins, these proteins are defined by the lack of a stable 3D fold and a high degree of inherent structural heterogeneity that is closely tied to their function. Due to their flexible nature, solution techniques such as small-angle X-ray scattering (SAXS), nuclear magnetic resonance (NMR) spectroscopy and fluorescence resonance energy transfer (FRET) are particularly well-suited for characterizing their biophysical properties. Computationally derived structural ensembles based on such experimental measurements provide models of the conformational sampling displayed by these proteins, and they may offer valuable insights into the functional consequences of inherent flexibility. The Protein Ensemble Database (http://pedb.vib.be) is the first openly accessible, manually curated online resource storing the ensemble models, protocols used during the calculation procedure, and underlying primary experimental data derived from SAXS and/or NMR measurements. By making this previously inaccessible data freely available to researchers, this novel resource is expected to promote the development of more advanced modelling methodologies, facilitate the design of standardized calculation protocols, and consequently lead to a better understanding of how function arises from the disordered state.

  18. The Mars Observer database

    NASA Technical Reports Server (NTRS)

    Albee, Arden L.

    1988-01-01

    Mars Observer will study the surface, atmosphere, and climate of Mars in a systematic way over an entire Martian year. The observations of the surface will provide a database that will be invaluable to the planning of a future Mars sample return mission. Mars Observer is planned for a September 1992 launch from the Space Shuttle, using an upper-stage. After the one year transit the spacecraft is injected into orbit about Mars and the orbit adjusted to a near-circular, sun-synchronous low-altitude, polar orbit. During the Martian year in this mapping orbit the instruments gather both geoscience data and climatological data by repetitive global mapping. The scientific objectives of the mission are to: (1) determine the global elemental and mineralogical character of the surface material; (2) define globally the topography and gravitational field; (3) establish the nature of the magnetic field; (4) determine the time and space distribution, abundance, sources, and sinks of volatile material and dust over a seasonal cycle; and (5) explore the structure and aspects of the circulation of the atmosphere. The science investigations and instruments for Mars Observer have been chosen with these objectives in mind. These instruments, the principal investigator or team leader and the objectives are discussed.

  19. Pathways associated with lignin biosynthesis in lignomaniac jute fibres.

    PubMed

    Chakraborty, Avrajit; Sarkar, Debabrata; Satya, Pratik; Karmakar, Pran Gobinda; Singh, Nagendra Kumar

    2015-08-01

    We generated the bast transcriptomes of a deficient lignified phloem fibre mutant and its wild-type jute (Corchorus capsularis) using Illumina paired-end sequencing. A total of 34,163 wild-type and 29,463 mutant unigenes, with average lengths of 1442 and 1136 bp, respectively, were assembled de novo, ~77-79 % of which were functionally annotated. These annotated unigenes were assigned to COG (~37-40 %) and GO (~22-28 %) classifications and mapped to 189 KEGG pathways (~19-21 %). We discovered 38 and 43 isoforms of 16 and 10 genes of the upstream shikimate-aromatic amino acid and downstream monolignol biosynthetic pathways, respectively, rendered their sequence similarities, confirmed the identities of 22 of these candidate gene families by phylogenetic analyses and reconstructed the pathway leading to lignin biosynthesis in jute fibres. We also identified major genes and bast-related transcription factors involved in secondary cell wall (SCW) formation. The quantitative RT-PCRs revealed that phenylalanine ammonia-lyase 1 (CcPAL1) was co-down-regulated with several genes of the upstream shikimate pathway in mutant bast tissues at an early growth stage, although its expression relapsed to the normal level at the later growth stage. However, cinnamyl alcohol dehydrogenase 7 (CcCAD7) was strongly down-regulated in mutant bast tissues irrespective of growth stages. CcCAD7 disruption at an early growth stage was accompanied by co-up-regulation of SCW-specific genes cellulose synthase A7 (CcCesA7) and fasciclin-like arabinogalactan 6 (CcFLA6), which was predicted to be involved in coordinating the S-layers' deposition in the xylan-type jute fibres. Our results identified CAD as a promising target for developing low-lignin jute fibres using genomics-assisted molecular approaches.

  20. Pathways associated with lignin biosynthesis in lignomaniac jute fibres.

    PubMed

    Chakraborty, Avrajit; Sarkar, Debabrata; Satya, Pratik; Karmakar, Pran Gobinda; Singh, Nagendra Kumar

    2015-08-01

    We generated the bast transcriptomes of a deficient lignified phloem fibre mutant and its wild-type jute (Corchorus capsularis) using Illumina paired-end sequencing. A total of 34,163 wild-type and 29,463 mutant unigenes, with average lengths of 1442 and 1136 bp, respectively, were assembled de novo, ~77-79 % of which were functionally annotated. These annotated unigenes were assigned to COG (~37-40 %) and GO (~22-28 %) classifications and mapped to 189 KEGG pathways (~19-21 %). We discovered 38 and 43 isoforms of 16 and 10 genes of the upstream shikimate-aromatic amino acid and downstream monolignol biosynthetic pathways, respectively, rendered their sequence similarities, confirmed the identities of 22 of these candidate gene families by phylogenetic analyses and reconstructed the pathway leading to lignin biosynthesis in jute fibres. We also identified major genes and bast-related transcription factors involved in secondary cell wall (SCW) formation. The quantitative RT-PCRs revealed that phenylalanine ammonia-lyase 1 (CcPAL1) was co-down-regulated with several genes of the upstream shikimate pathway in mutant bast tissues at an early growth stage, although its expression relapsed to the normal level at the later growth stage. However, cinnamyl alcohol dehydrogenase 7 (CcCAD7) was strongly down-regulated in mutant bast tissues irrespective of growth stages. CcCAD7 disruption at an early growth stage was accompanied by co-up-regulation of SCW-specific genes cellulose synthase A7 (CcCesA7) and fasciclin-like arabinogalactan 6 (CcFLA6), which was predicted to be involved in coordinating the S-layers' deposition in the xylan-type jute fibres. Our results identified CAD as a promising target for developing low-lignin jute fibres using genomics-assisted molecular approaches. PMID:25724692

  1. Understanding pathogenic Burkholderia glumae metabolic and signaling pathways within rice tissues through in vivo transcriptome analyses.

    PubMed

    Kim, Sunyoung; Park, Jungwook; Lee, Jongyun; Shin, Dongjin; Park, Dong-Soo; Lim, Jong-Sung; Choi, Ik-Young; Seo, Young-Su

    2014-08-15

    Burkholderia glumae is a causal agent of rice grain and sheath rot. Similar to other phytopathogens, B. glumae adapts well to the host environment and controls its biology to induce diseases in the host plant; however, its molecular mechanisms are not yet fully understood. To gain a better understating of the actual physiological changes that occur in B. glumae during infection, we analyzed B. glumae transcriptome from infected rice tissues using an RNA-seq technique. To accomplish this, we analyzed differentially expressed genes (DEGs) and identified 2653 transcripts that were significantly altered. We then performed KEGG pathway and module enrichment of the DEGs. Interestingly, most genes involved bacterial chemotaxis-mediated motility, ascorbate and trehalose metabolisms, and sugar transporters including l-arabinose and d-xylose were found to be highly enriched. The in vivo transcriptional profiling of pathogenic B. glumae will facilitate elucidation of unknown plant-pathogenic bacteria interactions, as well as the overall infection processes.

  2. XCOM: Photon Cross Sections Database

    National Institute of Standards and Technology Data Gateway

    SRD 8 XCOM: Photon Cross Sections Database (Web, free access)   A web database is provided which can be used to calculate photon cross sections for scattering, photoelectric absorption and pair production, as well as total attenuation coefficients, for any element, compound or mixture (Z <= 100) at energies from 1 keV to 100 GeV.

  3. Databases for K-8 Students

    ERIC Educational Resources Information Center

    Young, Terrence E., Jr.

    2004-01-01

    Today's elementary school students have been exposed to computers since birth, so it is not surprising that they are so proficient at using them. As a result, they are ready to search databases that include topics and information appropriate for their age level. Subscription databases are digital copies of magazines, newspapers, journals,…

  4. The Yield of Bibliographic Databases.

    ERIC Educational Resources Information Center

    Kowalski, Kazimierz; Hackett, Timothy P.

    1992-01-01

    Demonstrates a means for estimating the number of retrieved items using well-established selective dissemination of information (SDI) profiles in the SCI, INSPEC, ISMEC, CAS, and PASCAL databases. A correlation between individual database size and number of retrieved documents in technical fields is also examined. (17 references) (LAE)

  5. Mathematical Notation in Bibliographic Databases.

    ERIC Educational Resources Information Center

    Pasterczyk, Catherine E.

    1990-01-01

    Discusses ways in which using mathematical symbols to search online bibliographic databases in scientific and technical areas can improve search results. The representations used for Greek letters, relations, binary operators, arrows, and miscellaneous special symbols in the MathSci, Inspec, Compendex, and Chemical Abstracts databases are…

  6. Wind turbine reliability database update.

    SciTech Connect

    Peters, Valerie A.; Hill, Roger Ray; Stinebaugh, Jennifer A.; Veers, Paul S.

    2009-03-01

    This report documents the status of the Sandia National Laboratories' Wind Plant Reliability Database. Included in this report are updates on the form and contents of the Database, which stems from a fivestep process of data partnerships, data definition and transfer, data formatting and normalization, analysis, and reporting. Selected observations are also reported.

  7. The Slovenian food composition database.

    PubMed

    Korošec, Mojca; Golob, Terezija; Bertoncelj, Jasna; Stibilj, Vekoslava; Seljak, Barbara Koroušić

    2013-10-01

    The preliminary Slovenian food composition database was created in 2003, through the application of the Data management and Alimenta nutritional software. In the subsequent projects, data on the composition of meat and meat products of Slovenian origin were gathered from analyses, and low-quality data of the preliminary database were discarded. The first volume of the Slovenian food composition database was published in 2006, in both electronic and paper versions. When Slovenia joined the EuroFIR NoE, the LanguaL indexing system was adopted. The Optijed nutritional software was developed, and later upgraded to the OPEN platform. This platform serves as an electronic database that currently comprises 620 foods, and as the Slovenian node in the EuroFIR virtual information platform. With the assimilation of the data on the compositions of foods of plant origin obtained within the latest project, the Slovenian database provides a good source for food compositional values of consistent and compatible quality.

  8. The EMBL nucleotide sequence database.

    PubMed Central

    Stoesser, G; Moseley, M A; Sleep, J; McGowran, M; Garcia-Pastor, M; Sterk, P

    1998-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl. html ) constitutes Europe's primary nucleotide sequence resource. DNA and RNA sequences are directly submitted from researchers and genome sequencing groups and collected from the scientific literature and patent applications (Fig. 1). In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute. Database releases are produced quarterly and are distributed on CD-ROM. EBI's network services allow access to the most up-to-date data collection via Internet and World Wide Web interface, providing database searching and sequence similarity facilities plus access to a large number of additional databases. PMID:9399791

  9. GOTTCHA Database, Version 1

    2015-08-03

    One major challenge in the field of shotgun metagenomics is the accurate identification of the organisms present within the community, based on classification of short sequence reads. Though microbial community profiling methods have emerged to attempt to rapidly classify the millions of reads output from contemporary sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discoverymore » rates (FDR). Here we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling tool with significantly smaller FDR, which is also capable of classifying never-before seen genomes into the appropriate parent taxa.The algorithm is based upon three primary computational phases: (I) genomic decomposition into bit vectors, (II) bit vector intersections to identify shared regions, and (III) bit vector subtractions to remove shared regions and reveal unique, signature regions.In the Decomposition phase, genomic data is first masked to highlight only the valid (non-ambiguous) regions and then decomposed into overlapping 24-mers. The k-mers are sorted along with their start positions, de-replicated, and then prefixed, to minimize data duplication. The prefixes are indexed and an identical data structure is created for the start positions to mimic that of the k-mer data structure.During the Intersection phase -- which is the most computationally intensive phase -- as an all-vs-all comparison is made, the number of comparisons is first reduced by four methods: (a) Prefix restriction, (b) Overlap detection, (c) Overlap restriction, and (d) Result recording. In Prefix restriction, only k-mers of the same prefix are compared. Within that group, potential overlap of k-mer suffixes that would result in a non-empty set intersection are screened for. If such an overlap exists, the region which

  10. GOTTCHA Database, Version 1

    SciTech Connect

    Freitas, Tracey; Chain, Patrick; Lo, Chien-Chi; Li, Po-E

    2015-08-03

    One major challenge in the field of shotgun metagenomics is the accurate identification of the organisms present within the community, based on classification of short sequence reads. Though microbial community profiling methods have emerged to attempt to rapidly classify the millions of reads output from contemporary sequencers, the combination of incomplete databases, similarity among otherwise divergent genomes, and the large volumes of sequencing data required for metagenome sequencing has led to unacceptably high false discovery rates (FDR). Here we present the application of a novel, gene-independent and signature-based metagenomic taxonomic profiling tool with significantly smaller FDR, which is also capable of classifying never-before seen genomes into the appropriate parent taxa.The algorithm is based upon three primary computational phases: (I) genomic decomposition into bit vectors, (II) bit vector intersections to identify shared regions, and (III) bit vector subtractions to remove shared regions and reveal unique, signature regions.In the Decomposition phase, genomic data is first masked to highlight only the valid (non-ambiguous) regions and then decomposed into overlapping 24-mers. The k-mers are sorted along with their start positions, de-replicated, and then prefixed, to minimize data duplication. The prefixes are indexed and an identical data structure is created for the start positions to mimic that of the k-mer data structure.During the Intersection phase -- which is the most computationally intensive phase -- as an all-vs-all comparison is made, the number of comparisons is first reduced by four methods: (a) Prefix restriction, (b) Overlap detection, (c) Overlap restriction, and (d) Result recording. In Prefix restriction, only k-mers of the same prefix are compared. Within that group, potential overlap of k-mer suffixes that would result in a non-empty set intersection are screened for. If such an overlap exists, the region which intersects is

  11. OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder.

    PubMed

    Privitera, Anna P; Distefano, Rosario; Wefer, Hugo A; Ferro, Alfredo; Pulvirenti, Alfredo; Giugno, Rosalba

    2015-01-01

    Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwilling thoughts (obsessions) giving rise to anxiety. The patients feel obliged to perform a behavior (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. In the class of Anxiety Disorders, OCD is a pathology that shows an hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would be helpful to improve the current knowledge on this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatments. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references with a brief description of the gene and the experimental conditions are shown. The database also lists the polymorphisms within genes and its chromosomal regions. OCDB data is enriched with both validated and predicted miRNA-target and drug-target information. The transcription factors regulations, which are also included, are taken from David and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez-gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, Kegg, Disease-ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. This allows (i) to browse data by category and (ii) to navigate in the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format as well as the whole database in.sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. It also facilitates the evaluation of genetic data

  12. OCDB: a database collecting genes, miRNAs and drugs for obsessive-compulsive disorder.

    PubMed

    Privitera, Anna P; Distefano, Rosario; Wefer, Hugo A; Ferro, Alfredo; Pulvirenti, Alfredo; Giugno, Rosalba

    2015-01-01

    Obsessive-compulsive disorder (OCD) is a psychiatric condition characterized by intrusive and unwilling thoughts (obsessions) giving rise to anxiety. The patients feel obliged to perform a behavior (compulsions) induced by the obsessions. The World Health Organization ranks OCD as one of the 10 most disabling medical conditions. In the class of Anxiety Disorders, OCD is a pathology that shows an hereditary component. Consequently, an online resource collecting and integrating scientific discoveries and genetic evidence about OCD would be helpful to improve the current knowledge on this disorder. We have developed a manually curated database, OCD Database (OCDB), collecting the relations between candidate genes in OCD, microRNAs (miRNAs) involved in the pathophysiology of OCD and drugs used in its treatments. We have screened articles from PubMed and MEDLINE. For each gene, the bibliographic references with a brief description of the gene and the experimental conditions are shown. The database also lists the polymorphisms within genes and its chromosomal regions. OCDB data is enriched with both validated and predicted miRNA-target and drug-target information. The transcription factors regulations, which are also included, are taken from David and TransmiR. Moreover, a scoring function ranks the relevance of data in the OCDB context. The database is also integrated with the main online resources (PubMed, Entrez-gene, HGNC, dbSNP, DrugBank, miRBase, PubChem, Kegg, Disease-ontology and ChEBI). The web interface has been developed using phpMyAdmin and Bootstrap software. This allows (i) to browse data by category and (ii) to navigate in the database by searching genes, miRNAs, drugs, SNPs, regions, drug targets and articles. The data can be exported in textual format as well as the whole database in.sql or tabular format. OCDB is an essential resource to support genome-wide analysis, genetic and pharmacological studies. It also facilitates the evaluation of genetic data

  13. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

    PubMed

    Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

    2016-09-01

    Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms.

  14. Cloudsat tropical cyclone database

    NASA Astrophysics Data System (ADS)

    Tourville, Natalie D.

    CloudSat (CS), the first 94 GHz spaceborne cloud profiling radar (CPR), launched in 2006 to study the vertical distribution of clouds. Not only are CS observations revealing inner vertical cloud details of water and ice globally but CS overpasses of tropical cyclones (TC's) are providing a new and exciting opportunity to study the vertical structure of these storm systems. CS TC observations are providing first time vertical views of TC's and demonstrate a unique way to observe TC structure remotely from space. Since December 2009, CS has intersected every globally named TC (within 1000 km of storm center) for a total of 5,278 unique overpasses of tropical systems (disturbance, tropical depression, tropical storm and hurricane/typhoon/cyclone (HTC)). In conjunction with the Naval Research Laboratory (NRL), each CS TC overpass is processed into a data file containing observational data from the afternoon constellation of satellites (A-TRAIN), Navy's Operational Global Atmospheric Prediction System Model (NOGAPS), European Center for Medium range Weather Forecasting (ECMWF) model and best track storm data. This study will describe the components and statistics of the CS TC database, present case studies of CS TC overpasses with complementary A-TRAIN observations and compare average reflectivity stratifications of TC's across different atmospheric regimes (wind shear, SST, latitude, maximum wind speed and basin). Average reflectivity stratifications reveal that characteristics in each basin vary from year to year and are dependent upon eye overpasses of HTC strength storms and ENSO phase. West Pacific (WPAC) basin storms are generally larger in size (horizontally and vertically) and have greater values of reflectivity at a predefined height than all other basins. Storm structure at higher latitudes expands horizontally. Higher vertical wind shear (≥ 9.5 m/s) reduces cloud top height (CTH) and the intensity of precipitation cores, especially in HTC strength storms

  15. The Berlin Emissivity Database

    NASA Astrophysics Data System (ADS)

    Helbert, Jorn

    Remote sensing infrared spectroscopy is the principal field of investigation for planetary surfaces composition. Past, present and future missions to the solar system bodies include in their payload instruments measuring the emerging radiation in the infrared range. TES on Mars Global Surveyor and THEMIS on Mars Odyssey have in many ways changed our views of Mars. The PFS instrument on the ESA Mars Express mission has collected spectra since the beginning of 2004. In spring 2006 the VIRTIS experiment started its operation on the ESA Venus Express mission, allowing for the first time to map the surface of Venus using the 1 µm emission from the surface. The MERTIS spectrometer is included in the payload of the ESA BepiColombo mission to Mercury, scheduled for 2013. For the interpretation of the measured data an emissivity spectral library of planetary analogue materials is needed. The Berlin Emissivity Database (BED) presented here is focused on relatively fine-grained size separates, providing a realistic basis for interpretation of thermal emission spectra of planetary regoliths. The BED is therefore complimentary to existing thermal emission libraries, like the ASU library for example. The BED contains currently entries for plagioclase and potassium feldspars, low Ca and high Ca pyroxenes, olivine, elemental sulphur, common martian analogues (JSC Mars-1, Salten Skov, palagonites, montmorillonite) and a lunar highland soil sample measured in the wavelength range from 3 to 50 µm as a function of particle size. For each sample, the spectra of four well defined particle size separates (¡25 µm , 25-63 µm, 63-125 µm, 125-250 µm) are measured with a 4 cm-1 spectral resolution. These size separates have been selected as typical representations for most of the planetary surfaces. Following an ongoing upgrade of the Planetary Emmissivity Laboratory (PEL) at DLR in Berlin measurements can be obtained at temperatures up to 500° C - realistic for the dayside conditions

  16. miR2Gene: pattern discovery of single gene, multiple genes, and pathways by enrichment analysis of their microRNA regulators

    PubMed Central

    2011-01-01

    Background In recent years, a number of tools have been developed to explore microRNAs (miRNAs) by analyzing their target genes. However, a reverse problem, that is, inferring patterns of protein-coding genes through their miRNA regulators, has not been explored. As various miRNA annotation data become available, exploring gene patterns by analyzing the prior knowledge of their miRNA regulators is becoming more feasible. Results In this study, we developed a tool, miR2Gene, for this purpose. Various sets of miRNAs, according to prior rules such as function, associated disease, tissue specificity, family, and cluster, were integrated with miR2Gene. For given genes, miR2Gene evaluates the enrichment of the predicted miRNAs that regulate them in each miRNA set. This tool can be used for single genes, multiple genes, and KEGG pathways. For the KEGG pathway, genes with enriched miRNA sets are highlighted according to various rules. We confirmed the usefulness of miR2Gene through case studies. Conclusions miR2Gene represents a novel and useful tool that integrates miRNA knowledge for protein-coding gene analysis. miR2Gene is freely available at http://cmbi.hsc.pku.edu.cn/mir2gene. PMID:22784580

  17. Population databases in development analysis.

    PubMed

    Chamie, J

    1994-01-01

    Population databases are very important in formulating analyses of social and economic change and development. Since such analyses are often the basis for policy making and program formulation, it is important to have a sound understanding of their strengths and limitations. This paper focuses upon databases which deal with population size, life expectancy at birth, and infant mortality. Considerable progress has been made in producing population databases over the last several decades, but many problems remain with regard to their comparability, completeness of coverage, and accuracy. Governmental and political circumstances greatly influence the availability and quality of population databases. Globally, the comparability of data remains a serious concern due to deviations from standard definitions. The completeness of coverage of databases among less developed countries varies widely by region, while the data for preparing estimates and assessing demographic trends are deficient and problematic. Technological advances and the repackaging of population databases have greatly advanced their production and availability, but confusion and ignorance have become widespread regarding the original source and nature of the data. Database users therefore too often undertake faulty analyses which lead to false conclusions.

  18. Database tomography for commercial application

    NASA Technical Reports Server (NTRS)

    Kostoff, Ronald N.; Eberhart, Henry J.

    1994-01-01

    Database tomography is a method for extracting themes and their relationships from text. The algorithms, employed begin with word frequency and word proximity analysis and build upon these results. When the word 'database' is used, think of medical or police records, patents, journals, or papers, etc. (any text information that can be computer stored). Database tomography features a full text, user interactive technique enabling the user to identify areas of interest, establish relationships, and map trends for a deeper understanding of an area of interest. Database tomography concepts and applications have been reported in journals and presented at conferences. One important feature of the database tomography algorithm is that it can be used on a database of any size, and will facilitate the users ability to understand the volume of content therein. While employing the process to identify research opportunities it became obvious that this promising technology has potential applications for business, science, engineering, law, and academe. Examples include evaluating marketing trends, strategies, relationships and associations. Also, the database tomography process would be a powerful component in the area of competitive intelligence, national security intelligence and patent analysis. User interests and involvement cannot be overemphasized.

  19. Database of Properties of Meteors

    NASA Technical Reports Server (NTRS)

    Suggs, Rob; Anthea, Coster

    2006-01-01

    A database of properties of meteors, and software that provides access to the database, are being developed as a contribution to continuing efforts to model the characteristics of meteors with increasing accuracy. Such modeling is necessary for evaluation of the risk of penetration of spacecraft by meteors. For each meteor in the database, the record will include an identification, date and time, radiant properties, ballistic coefficient, radar cross section, size, density, and orbital elements. The property of primary interest in the present case is density, and one of the primary goals in this case is to derive densities of meteors from their atmospheric decelerations. The database and software are expected to be valid anywhere in the solar system. The database will incorporate new data plus results of meteoroid analyses that, heretofore, have not been readily available to the aerospace community. Taken together, the database and software constitute a model that is expected to provide improved estimates of densities and to result in improved risk analyses for interplanetary spacecraft. It is planned to distribute the database and software on a compact disk.

  20. The YH database: the first Asian diploid genome database.

    PubMed

    Li, Guoqing; Ma, Lijia; Song, Chao; Yang, Zhentao; Wang, Xiulan; Huang, Hui; Li, Yingrui; Li, Ruiqiang; Zhang, Xiuqing; Yang, Huanming; Wang, Jian; Wang, Jun

    2009-01-01

    The YH database is a server that allows the user to easily browse and download data from the first Asian diploid genome. The aim of this platform is to facilitate the study of this Asian genome and to enable improved organization and presentation large-scale personal genome data. Powered by GBrowse, we illustrate here the genome sequences, SNPs, and sequencing reads in the MapView. The relationships between phenotype and genotype can be searched by location, dbSNP ID, HGMD ID, gene symbol and disease name. A BLAST web service is also provided for the purpose of aligning query sequence against YH genome consensus. The YH database is currently one of the three personal genome database, organizing the original data and analysis results in a user-friendly interface, which is an endeavor to achieve fundamental goals for establishing personal medicine. The database is available at http://yh.genomics.org.cn.

  1. FOAM (Functional Ontology Assignments for Metagenomes): A Hidden Markov Model (HMM) database with environmental focus

    DOE PAGES

    Prestat, Emmanuel; David, Maude M.; Hultman, Jenni; Ta , Neslihan; Lamendella, Regina; Dvornik, Jill; Mackelprang, Rachel; Myrold, David D.; Jumpponen, Ari; Tringe, Susannah G.; et al

    2014-09-26

    A new functional gene database, FOAM (Functional Ontology Assignments for Metagenomes), was developed to screen environmental metagenomic sequence datasets. FOAM provides a new functional ontology dedicated to classify gene functions relevant to environmental microorganisms based on Hidden Markov Models (HMMs). Sets of aligned protein sequences (i.e. ‘profiles’) were tailored to a large group of target KEGG Orthologs (KOs) from which HMMs were trained. The alignments were checked and curated to make them specific to the targeted KO. Within this process, sequence profiles were enriched with the most abundant sequences available to maximize the yield of accurate classifier models. An associatedmore » functional ontology was built to describe the functional groups and hierarchy. FOAM allows the user to select the target search space before HMM-based comparison steps and to easily organize the results into different functional categories and subcategories. FOAM is publicly available at http://portal.nersc.gov/project/m1317/FOAM/.« less

  2. Draft secure medical database standard.

    PubMed

    Pangalos, George

    2002-01-01

    Medical database security is a particularly important issue for all Healthcare establishments. Medical information systems are intended to support a wide range of pertinent health issues today, for example: assure the quality of care, support effective management of the health services institutions, monitor and contain the cost of care, implement technology into care without violating social values, ensure the equity and availability of care, preserve humanity despite the proliferation of technology etc.. In this context, medical database security aims primarily to support: high availability, accuracy and consistency of the stored data, the medical professional secrecy and confidentiality, and the protection of the privacy of the patient. These properties, though of technical nature, basically require that the system is actually helpful for medical care and not harmful to patients. These later properties require in turn not only that fundamental ethical principles are not violated by employing database systems, but instead, are effectively enforced by technical means. This document reviews the existing and emerging work on the security of medical database systems. It presents in detail the related problems and requirements related to medical database security. It addresses the problems of medical database security policies, secure design methodologies and implementation techniques. It also describes the current legal framework and regulatory requirements for medical database security. The issue of medical database security guidelines is also examined in detailed. The current national and international efforts in the area are studied. It also gives an overview of the research work in the area. The document also presents in detail the most complete to our knowledge set of security guidelines for the development and operation of medical database systems.

  3. Databases of the marine metagenomics.

    PubMed

    Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    The metagenomic data obtained from marine environments is significantly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach of metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping a diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates accumulation of the metagenome data as well as increase of data complexity. Moreover, when metagenomic approach is used for monitoring a time change of marine environments at multiple locations of the seawater, accumulation of metagenomics data will become tremendous with an enormous speed. Because this kind of situation has started becoming of reality at many marine research institutions and stations all over the world, it looks obvious that the data management and analysis will be confronted by the so-called Big Data issues such as how the database can be constructed in an efficient way and how useful knowledge should be extracted from a vast amount of the data. In this review, we summarize the outline of all the major databases of marine metagenome that are currently publically available, noting that database exclusively on marine metagenome is none but the number of metagenome databases including marine metagenome data are six, unexpectedly still small. We also extend our explanation to the databases, as reference database we call, that will be useful for constructing a marine metagenome database as well as complementing important information with the database. Then, we would point out a number of challenges to be conquered in constructing the marine metagenome database.

  4. Curation and Computational Design of Bioenergy-Related Metabolic Pathways

    SciTech Connect

    Karp, Peter D.

    2014-09-12

    Pathway Tools is a systems-biology software package written by SRI International (SRI) that produces Pathway/Genome Databases (PGDBs) for organisms with a sequenced genome. Pathway Tools also provides a wide range of capabilities for analyzing predicted metabolic networks and user-generated omics data. More than 5,000 academic, industrial, and government groups have licensed Pathway Tools. This user community includes researchers at all three DOE bioenergy centers, as well as academic and industrial metabolic engineering (ME) groups. An integral part of the Pathway Tools software is MetaCyc, a large, multiorganism database of metabolic pathways and enzymes that SRI and its academic collaborators manually curate. This project included two main goals: I. Enhance the MetaCyc content of bioenergy-related enzymes and pathways. II. Develop computational tools for engineering metabolic pathways that satisfy specified design goals, in particular for bioenergy-related pathways. In part I, SRI proposed to significantly expand the coverage of bioenergy-related metabolic information in MetaCyc, followed by the generation of organism-specific PGDBs for all energy-relevant organisms sequenced at the DOE Joint Genome Institute (JGI). Part I objectives included: 1: Expand the content of MetaCyc to include bioenergy-related enzymes and pathways. 2: Enhance the Pathway Tools software to enable display of complex polymer degradation processes. 3: Create new PGDBs for the energy-related organisms sequenced by JGI, update existing PGDBs with new MetaCyc content, and make these data available to JBEI via the BioCyc website. In part II, SRI proposed to develop an efficient computational tool for the engineering of metabolic pathways. Part II objectives included: 4: Develop computational tools for generating metabolic pathways that satisfy specified design goals, enabling users to specify parameters such as starting and ending compounds, and preferred or disallowed intermediate compounds

  5. NUCLEAR DATABASES FOR REACTOR APPLICATIONS.

    SciTech Connect

    PRITYCHENKO, B.; ARCILLA, R.; BURROWS, T.; HERMAN, M.W.; MUGHABGHAB, S.; OBLOZINSKY, P.; ROCHMAN, D.; SONZOGNI, A.A.; TULI, J.; WINCHELL, D.F.

    2006-06-05

    The National Nuclear Data Center (NNDC): An overview of nuclear databases, related products, nuclear data Web services and publications. The NNDC collects, evaluates, and disseminates nuclear physics data for basic research and applied nuclear technologies. The NNDC maintains and contributes to the nuclear reaction (ENDF, CSISRS) and nuclear structure databases along with several others databases (CapGam, MIRD, IRDF-2002) and provides coordination for the Cross Section Evaluation Working Group (CSEWG) and the US Nuclear Data Program (USNDP). The Center produces several publications and codes such as Atlas of Neutron Resonances, Nuclear Wallet Cards booklets and develops codes, such as nuclear reaction model code Empire.

  6. Biological Databases for Human Research

    PubMed Central

    Zou, Dong; Ma, Lina; Yu, Jun; Zhang, Zhang

    2015-01-01

    The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation. PMID:25712261

  7. Allergen databases and allergen semantics.

    PubMed

    Gendel, Steven M

    2009-08-01

    The efficacy of any specific bioinformatic analysis of the potential allergenicity of new food proteins depends directly on the nature and content of the databases that are used in the analysis. A number of different allergen-related databases have been developed, each designed to meet a different need. These databases differ in content, organization, and accessibility. These differences create barriers for users and prevent data sharing and integration. The development and application of appropriate semantic web technologies, (for example, a food allergen ontology) could help to overcome these barriers and promote the development of more advanced analytic capabilities.

  8. Biological databases for human research.

    PubMed

    Zou, Dong; Ma, Lina; Yu, Jun; Zhang, Zhang

    2015-02-01

    The completion of the Human Genome Project lays a foundation for systematically studying the human genome from evolutionary history to precision medicine against diseases. With the explosive growth of biological data, there is an increasing number of biological databases that have been developed in aid of human-related research. Here we present a collection of human-related biological databases and provide a mini-review by classifying them into different categories according to their data types. As human-related databases continue to grow not only in count but also in volume, challenges are ahead in big data storage, processing, exchange and curation. PMID:25712261

  9. International forensic automotive paint database

    NASA Astrophysics Data System (ADS)

    Bishea, Gregory A.; Buckle, Joe L.; Ryland, Scott G.

    1999-02-01

    The Technical Working Group for Materials Analysis (TWGMAT) is supporting an international forensic automotive paint database. The Federal Bureau of Investigation and the Royal Canadian Mounted Police (RCMP) are collaborating on this effort through TWGMAT. This paper outlines the support and further development of the RCMP's Automotive Paint Database, `Paint Data Query'. This cooperative agreement augments and supports a current, validated, searchable, automotive paint database that is used to identify make(s), model(s), and year(s) of questioned paint samples in hit-and-run fatalities and other associated investigations involving automotive paint.

  10. The Automatic Library Tracking Database

    SciTech Connect

    Fahey, Mark R; Jones, Nicholas A; Hadri, Bilel

    2010-01-01

    A library tracking database has been developed and put into production at the National Institute for Computational Sciences and the Oak Ridge Leadership Computing Facility (both located at Oak Ridge National Laboratory.) The purpose of the library tracking database is to track which libraries are used at link time on Cray XT5 Supercomputers. The database stores the libraries used at link time and also records the executables run in a batch job. With this data, many operationally important questions can be answered such as which libraries are most frequently used and which users are using deprecated libraries or applications. The infrastructure design and reporting mechanisms are presented along with collected production data.

  11. Database of recent tsunami deposits

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce E.

    2010-01-01

    This report describes a database of sedimentary characteristics of tsunami deposits derived from published accounts of tsunami deposit investigations conducted shortly after the occurrence of a tsunami. The database contains 228 entries, each entry containing data from up to 71 categories. It includes data from 51 publications covering 15 tsunamis distributed between 16 countries. The database encompasses a wide range of depositional settings including tropical islands, beaches, coastal plains, river banks, agricultural fields, and urban environments. It includes data from both local tsunamis and teletsunamis. The data are valuable for interpreting prehistorical, historical, and modern tsunami deposits, and for the development of criteria to identify tsunami deposits in the geologic record.

  12. New geothermal database for Utah

    USGS Publications Warehouse

    Blackett, Robert E.; ,

    1993-01-01

    The Utah Geological Survey complied a preliminary database consisting of over 800 records on thermal wells and springs in Utah with temperatures of 20??C or greater. Each record consists of 35 fields, including location of the well or spring, temperature, depth, flow-rate, and chemical analyses of water samples. Developed for applications on personal computers, the database will be useful for geochemical, statistical, and other geothermal related studies. A preliminary map of thermal wells and springs in Utah, which accompanies the database, could eventually incorporate heat-flow information, bottom-hole temperatures from oil and gas wells, traces of Quaternary faults, and locations of young volcanic centers.

  13. CGMD: An integrated database of cancer genes and markers

    PubMed Central

    Pradeepkiran, Jangampalli Adi; Sainath, Sri Bhashyam; Kramthi Kumar, Konidala; Balasubramanyam, Lokanada; Vidya Prabhakar, Kodali; Bhaskar, Matcha

    2015-01-01

    Integrating cancer genes and markers with experimental evidence might provide valuable information for the further investigation of crosstalk between tumor genes and markers in cancer biology. To achieve this objective, we developed a database known as the Cancer Gene Marker Database (CGMD), which integrates data on tumor genes and markers based on experimental evidence. The major goal of CGMD is to provide the following: 1) current systematic treatment approaches and recent advances in different cancer treatments; 2) the aggregation of different genes and markers by their molecular characteristics and pathway associations; and 3) free access to the data compiled by CGMD at http://cgmd.in/. The database consists of 309 genes and 206 markers, as well as a list of 40 different human cancers, with detailed descriptions of all characterized markers. CGMD provides complete cancer annotations and molecular descriptions of cancer genes and markers such as CpG islands, promoters, exons, PDB structures, active sites and domains. PMID:26160459

  14. The Molecular Biology Database Collection: 2008 update

    PubMed Central

    Galperin, Michael Y.

    2008-01-01

    The Nucleic Acids Research online Molecular Biology Database Collection is a public repository that lists more than 1000 databases described in this and previous Nucleic Acids Research annual database issues, as well as a selection of molecular biology databases described in other journals. All databases included in this Collection are freely available to the public. The 2008 update includes 1078 databases, 110 more than the previous one. The links to more than 80 databases have been updated and 25 obsolete databases have been removed from the list. The complete database list and summaries are available online at the Nucleic Acids Research web site, http://nar.oxfordjournals.org/. PMID:18025043

  15. The Comparative Toxicogenomics Database: update 2013.

    PubMed

    Davis, Allan Peter; Murphy, Cynthia Grondin; Johnson, Robin; Lay, Jean M; Lennon-Hopkins, Kelley; Saraceni-Richards, Cynthia; Sciaky, Daniela; King, Benjamin L; Rosenstein, Michael C; Wiegers, Thomas C; Mattingly, Carolyn J

    2013-01-01

    The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) provides information about interactions between environmental chemicals and gene products and their relationships to diseases. Chemical-gene, chemical-disease and gene-disease interactions manually curated from the literature are integrated to generate expanded networks and predict many novel associations between different data types. CTD now contains over 15 million toxicogenomic relationships. To navigate this sea of data, we added several new features, including DiseaseComps (which finds comparable diseases that share toxicogenomic profiles), statistical scoring for inferred gene-disease and pathway-chemical relationships, filtering options for several tools to refine user analysis and our new Gene Set Enricher (which provides biological annotations that are enriched for gene sets). To improve data visualization, we added a Cytoscape Web view to our ChemComps feature, included color-coded interactions and created a 'slim list' for our MEDIC disease vocabulary (allowing diseases to be grouped for meta-analysis, visualization and better data management). CTD continues to promote interoperability with external databases by providing content and cross-links to their sites. Together, this wealth of expanded chemical-gene-disease data, combined with novel ways to analyze and view content, continues to help users generate testable hypotheses about the molecular mechanisms of environmental diseases.

  16. MGFD: the maize gene families database

    PubMed Central

    Sheng, Lei; Jiang, Haiyang; Yan, Hanwei; Li, Xiaoyu; Lin, Yongxiang; Ye, Hui; Cheng, Beijiu

    2016-01-01

    Most gene families are transcription factor (TF) families, which have fundamental roles in almost all biological processes (development, growth and response to environmental factors) and have been employed to manipulate various types of metabolic, developmental and stress response pathways in plants. Maize (Zea mays) is one of the most important cereal crops in the world due its importance to human nutrition and health. Thus, identifying and annotating all the gene families in maize is an important primary step in defining their functions and understanding their roles in the regulation of diverse biological processes. In this study, we identified 96 predicted maize gene families and systematically characterized all 5826 of the genes in those families. We have also developed a comprehensive database of maize gene families (the MGFD). To further explore the functions of these gene families, we extensively annotated the genes, including such basic information as protein sequence features, gene structure, Gene Ontology classifications, phylogenetic relationships and expression profiles. The MGFD has a user-friendly web interface with multiple browse and search functions, as well as data downloading. The MGFD is freely available to users at http://mgfd.ahau.edu.cn/. Database URL: http://mgfd.ahau.edu.cn/ PMID:26896848

  17. ASDCD: antifungal synergistic drug combination database.

    PubMed

    Chen, Xing; Ren, Biao; Chen, Ming; Liu, Ming-Xi; Ren, Wei; Wang, Quan-Xin; Zhang, Li-Xin; Yan, Gui-Ying

    2014-01-01

    Finding effective drugs to treat fungal infections has important clinical significance based on high mortality rates, especially in an immunodeficient population. Traditional antifungal drugs with single targets have been reported to cause serious side effects and drug resistance. Nowadays, however, drug combinations, particularly with respect to synergistic interaction, have attracted the attention of researchers. In fact, synergistic drug combinations could simultaneously affect multiple subpopulations, targets, and diseases. Therefore, a strategy that employs synergistic antifungal drug combinations could eliminate the limitations noted above and offer the opportunity to explore this emerging bioactive chemical space. However, it is first necessary to build a powerful database in order to facilitate the analysis of drug combinations. To address this gap in our knowledge, we have built the first Antifungal Synergistic Drug Combination Database (ASDCD), including previously published synergistic antifungal drug combinations, chemical structures, targets, target-related signaling pathways, indications, and other pertinent data. Its current version includes 210 antifungal synergistic drug combinations and 1225 drug-target interactions, involving 105 individual drugs from more than 12,000 references. ASDCD is freely available at http://ASDCD.amss.ac.cn.

  18. LncRNA mediated regulation of aging pathways in Drosophila melanogaster during dietary restriction

    PubMed Central

    Tu, Jianbo; Gaur, Uma; Mao, Xueping; Fan, Xiaolan; Li, Diyan; Li, Ying; Yang, Mingyao

    2016-01-01

    Dietary restriction (DR) extends lifespan in many species which is a well-known phenomenon. Long non-coding RNAs (lncRNAs) play an important role in regulation of cell senescence and important age-related signaling pathways. Here, we profiled the lncRNA and mRNA transcriptome of fruit flies at 7 day and 42 day during DR and fully-fed conditions, respectively. In general, 102 differentially expressed lncRNAs and 1406 differentially expressed coding genes were identified. Most informatively we found a large number of differentially expressed lncRNAs and their targets enriched in GO and KEGG analysis. We discovered some new aging related signaling pathways during DR, such as hippo signaling pathway-fly, phototransduction-fly and protein processing in endoplasmic reticulum etc. Novel lncRNAs XLOC_092363 and XLOC_166557 are found to be located in 10 kb upstream sequences of hairy and ems promoters, respectively. Furthermore, tissue specificity of some novel lncRNAs had been analyzed at 7 day of DR in fly head, gut and fat body. Also the silencing of lncRNA XLOC_076307 resulted in altered expression level of its targets including Gadd45 (involved in FoxO signaling pathway). Together, the results implicated many lncRNAs closely associated with dietary restriction, which could provide a resource for lncRNA in aging and age-related disease field. PMID:27687893

  19. Computational identification of altered metabolism using gene expression and metabolic pathways.

    PubMed

    Nam, Hojung; Lee, Jinwon; Lee, Doheon

    2009-07-01

    Understanding altered metabolism is an important issue because altered metabolism is often revealed as a cause or an effect in pathogenesis. It has also been shown to be an important factor in the manipulation of an organism's metabolism in metabolic engineering. Unfortunately, it is not yet possible to measure the concentration levels of all metabolites in the genome-wide scale of a metabolic network; consequently, a method that infers the alteration of metabolism is beneficial. The present study proposes a computational method that identifies genome-wide altered metabolism by analyzing functional units of KEGG pathways. As control of a metabolic pathway is accomplished by altering the activity of at least one rate-determining step enzyme, not all gene expressions of enzymes in the pathway demonstrate significant changes even if the pathway is altered. Therefore, we measure the alteration levels of a metabolic pathway by selectively observing expression levels of significantly changed genes in a pathway. The proposed method was applied to two strains of Saccharomyces cerevisiae gene expression profiles measured in very high-gravity (VHG) fermentation. The method identified altered metabolic pathways whose properties are related to ethanol and osmotic stress responses which had been known to be observed in VHG fermentation because of the high sugar concentration in growth media and high ethanol concentration in fermentation products. With the identified altered pathways, the proposed method achieved best accuracy and sensitivity rates for the Red Star (RS) strain compared to other three related studies (gene-set enrichment analysis (GSEA), significance analysis of microarray to gene set (SAM-GS), reporter metabolite), and for the CEN.PK 113-7D (CEN) strain, the proposed method and the GSEA method showed comparably similar performances.

  20. ARN: analysis and prediction by adipogenic professional database.

    PubMed

    Huang, Yan; Wang, Li; Zan, And Lin-Sen

    2016-01-01

    Adipogenesis is the process of cell differentiation by which mesenchymal stem cells become adipocytes. Extensive research is ongoing to identify genes, their protein products, and microRNAs that correlate with fat cell development. The existing databases have focused on certain types of regulatory factors and interactions. However, there is no relationship between the results of the experimental studies on adipogenesis and these databases because of the lack of an information center. This information fragmentation hampers the identification of key regulatory genes and pathways. Thus, it is necessary to provide an information center that is quickly and easily accessible to researchers in this field. We selected and integrated data from eight external databases based on the results of text-mining, and constructed a publicly available database and web interface (URL: http://210.27.80.93/arn/ ), which contained 30873 records related to adipogenic differentiation. Then, we designed an online analysis tool to analyze the experimental data or form a scientific hypothesis about adipogenesis through Swanson's literature-based discovery process. Furthermore, we calculated the "Impact Factor" ("IF") value that reflects the importance of each node by counting the numbers of relation records, expression records, and prediction records for each node. This platform can support ongoing adipogenesis research and contribute to the discovery of key regulatory genes and pathways. PMID:27503118

  1. ARN: analysis and prediction by adipogenic professional database.

    PubMed

    Huang, Yan; Wang, Li; Zan, And Lin-Sen

    2016-08-08

    Adipogenesis is the process of cell differentiation by which mesenchymal stem cells become adipocytes. Extensive research is ongoing to identify genes, their protein products, and microRNAs that correlate with fat cell development. The existing databases have focused on certain types of regulatory factors and interactions. However, there is no relationship between the results of the experimental studies on adipogenesis and these databases because of the lack of an information center. This information fragmentation hampers the identification of key regulatory genes and pathways. Thus, it is necessary to provide an information center that is quickly and easily accessible to researchers in this field. We selected and integrated data from eight external databases based on the results of text-mining, and constructed a publicly available database and web interface (URL: http://210.27.80.93/arn/ ), which contained 30873 records related to adipogenic differentiation. Then, we designed an online analysis tool to analyze the experimental data or form a scientific hypothesis about adipogenesis through Swanson's literature-based discovery process. Furthermore, we calculated the "Impact Factor" ("IF") value that reflects the importance of each node by counting the numbers of relation records, expression records, and prediction records for each node. This platform can support ongoing adipogenesis research and contribute to the discovery of key regulatory genes and pathways.

  2. Materials Inventory Database for the Light Water Reactor Sustainability Program

    SciTech Connect

    Kazi Ahmed; Shannon M. Bragg-Sitton

    2013-08-01

    Scientific research involves the purchasing, processing, characterization, and fabrication of many sample materials. The history of such materials can become complicated over their lifetime – materials might be cut into pieces or moved to various storage locations, for example. A database with built-in functions to track these kinds of processes facilitates well-organized research. The Material Inventory Database Accounting System (MIDAS) is an easy-to-use tracking and reference system for such items. The Light Water Reactor Sustainability Program (LWRS), which seeks to advance the long-term reliability and productivity of existing nuclear reactors in the United States through multiple research pathways, proposed MIDAS as an efficient way to organize and track all items used in its research. The database software ensures traceability of all items used in research using built-in functions which can emulate actions on tracked items – fabrication, processing, splitting, and more – by performing operations on the data. MIDAS can recover and display the complete history of any item as a simple report. To ensure the database functions suitably for the organization of research, it was developed alongside a specific experiment to test accident tolerant nuclear fuel cladding under the LWRS Advanced Light Water Reactor Nuclear Fuels Pathway. MIDAS kept track of materials used in this experiment from receipt at the laboratory through all processes, test conduct and, ultimately, post-test analysis. By the end of this process, the database proved to be right tool for this program. The database software will help LWRS more efficiently conduct research experiments, from simple characterization tests to in-reactor experiments. Furthermore, MIDAS is a universal tool that any other research team could use to organize their material inventory.

  3. InterAction Database (IADB)

    Cancer.gov

    The InterAction Database includes demographic and prescription information for more than 500,000 patients in the northern and middle Netherlands and has been integrated with other systems to enhance data collection and analysis.

  4. Marine and Hydrokinetic Technology Database

    DOE Data Explorer

    DOE’s Marine and Hydrokinetic Technology Database provides up-to-date information on marine and hydrokinetic renewable energy, both in the U.S. and around the world. The database includes wave, tidal, current, and ocean thermal energy, and contains information on the various energy conversion technologies, companies active in the field, and development of projects in the water. Depending on the needs of the user, the database can present a snapshot of projects in a given region, assess the progress of a certain technology type, or provide a comprehensive view of the entire marine and hydrokinetic energy industry. Results are displayed as a list of technologies, companies, or projects. Data can be filtered by a number of criteria, including country/region, technology type, generation capacity, and technology or project stage. The database was updated in 2009 to include ocean thermal energy technologies, companies, and projects.

  5. SUPERSITES INTEGRATED RELATIONAL DATABASE (SIRD)

    EPA Science Inventory

    As part of EPA's Particulate Matter (PM) Supersites Program (Program), the University of Maryland designed and developed the Supersites Integrated Relational Database (SIRD). Measurement data in SIRD include comprehensive air quality data from the 7 Supersite program locations f...

  6. Fun Databases: My Top Ten.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1992-01-01

    Provides reviews of 10 online databases: Consumer Reports; Public Opinion Online; Encyclopedia of Associations; Official Airline Guide Adventure Atlas and Events Calendar; CENDATA; Hollywood Hotline; Fearless Taster; Soap Opera Summaries; and Human Sexuality. (LRW)

  7. LDEF meteoroid and debris database

    NASA Technical Reports Server (NTRS)

    Dardano, C. B.; See, Thomas H.; Zolensky, Michael E.

    1994-01-01

    The Long Duration Exposure Facility (LDEF) Meteoroid and Debris Special Investigation Group (M&D SIG) database is maintained at the Johnson Space Center (JSC), Houston, Texas, and consists of five data tables containing information about individual features, digitized images of selected features, and LDEF hardware (i.e., approximately 950 samples) archived at JSC. About 4000 penetrations (greater than 300 micron in diameter) and craters (greater than 500 micron in diameter) were identified and photodocumented during the disassembly of LDEF at the Kennedy Space Center (KSC), while an additional 4500 or so have subsequently been characterized at JSC. The database also contains some data that have been submitted by various PI's, yet the amount of such data is extremely limited in its extent, and investigators are encouraged to submit any and all M&D-type data to JSC for inclusion within the M&D database. Digitized stereo-image pairs are available for approximately 4500 features through the database.

  8. The NRSub database: update 1997.

    PubMed

    Perrière, G; Moszer, I; Gojobori, T

    1997-01-01

    In the context of the international project aiming at sequencing the whole genome of Bacillus subtilis we have developed NRSub, a non-redundant database of sequences from this organism. Starting from the B.subtilis sequences available in the repository collections we have removed all encountered duplications, then we have added extra annotations to the sequences (e.g. accession numbers for the genes, locations on the genetic map, codon usage index). We have also added cross-references with EMBL/GenBank/DDBJ, MEDLINE, SWISS-PROT and ENZYME databases. NRSub is distributed through anonymous FTP as a text file in EMBL format and as an ACNUC database. It is also possible to access the database through two dedicated World Wide Web servers located in France (http://acnuc.univ-lyon1.fr/nrsub/nrsub.++ +html ) and in Japan (http://ddbjs4h.genes.nig.ac.jp/ ). PMID:9016504

  9. Small Business Innovations (Integrated Database)

    NASA Technical Reports Server (NTRS)

    1992-01-01

    Because of the diversity of NASA's information systems, it was necessary to develop DAVID as a central database management system. Under a Small Business Innovation Research (SBIR) grant, Ken Wanderman and Associates, Inc. designed software tools enabling scientists to interface with DAVID and commercial database management systems, as well as artificial intelligence programs. The software has been installed at a number of data centers and is commercially available.

  10. Database Reports Over the Internet

    NASA Technical Reports Server (NTRS)

    Smith, Dean Lance

    2002-01-01

    Most of the summer was spent developing software that would permit existing test report forms to be printed over the web on a printer that is supported by Adobe Acrobat Reader. The data is stored in a DBMS (Data Base Management System). The client asks for the information from the database using an HTML (Hyper Text Markup Language) form in a web browser. JavaScript is used with the forms to assist the user and verify the integrity of the entered data. Queries to a database are made in SQL (Sequential Query Language), a widely supported standard for making queries to databases. Java servlets, programs written in the Java programming language running under the control of network server software, interrogate the database and complete a PDF form template kept in a file. The completed report is sent to the browser requesting the report. Some errors are sent to the browser in an HTML web page, others are reported to the server. Access to the databases was restricted since the data are being transported to new DBMS software that will run on new hardware. However, the SQL queries were made to Microsoft Access, a DBMS that is available on most PCs (Personal Computers). Access does support the SQL commands that were used, and a database was created with Access that contained typical data for the report forms. Some of the problems and features are discussed below.

  11. The new international GLE database

    NASA Astrophysics Data System (ADS)

    Duldig, M. L.; Watts, D. J.

    2001-08-01

    The Australian Antarctic Division has agreed to host the international GLE database. Access to the database is via a world-wide-web interface and initially covers all GLEs since the start of the 22nd solar cycle. Access restriction for recent events is controlled by password protection and these data are available only to those groups contributing data to the database. The restrictions to data will be automatically removed for events older than 2 years, in accordance with the data exchange provisions of the Antarctic Treaty. Use of the data requires acknowledgment of the database as the source of the data and acknowledgment of the specific groups that provided the data used. Furthermore, some groups that provide data to the database have specific acknowledgment requirements or wording. A new submission format has been developed that will allow easier exchange of data, although the old format will be acceptable for some time. Data download options include direct web based download and email. Data may also be viewed as listings or plots with web browsers. Search options have also been incorporated. Development of the database will be ongoing with extension to viewing and delivery options, addition of earlier data and the development of mirror sites. It is expected that two mirror sites, one in North America and one in Europe, will be developed to enable fast access for the whole cosmic ray community.

  12. Rice Glycosyltransferase (GT) Phylogenomic Database

    DOE Data Explorer

    Ronald, Pamela

    The Ronald Laboratory staff at the University of California-Davis has a primary research focus on the genes of the rice plant. They study the role that genetics plays in the way rice plants respond to their environment. They created the Rice GT Database in order to integrate functional genomic information for putative rice Glycosyltransferases (GTs). This database contains information on nearly 800 putative rice GTs (gene models) identified by sequence similarity searches based on the Carbohydrate Active enZymes (CAZy) database. The Rice GT Database provides a platform to display user-selected functional genomic data on a phylogenetic tree. This includes sequence information, mutant line information, expression data, etc. An interactive chromosomal map shows the position of all rice GTs, and links to rice annotation databases are included. The format is intended to "facilitate the comparison of closely related GTs within different families, as well as perform global comparisons between sets of related families." [From http://ricephylogenomics.ucdavis.edu/cellwalls/gt/genInfo.shtml] See also the primary paper discussing this work: Peijian Cao, Laura E. Bartley, Ki-Hong Jung and Pamela C. Ronalda. Construction of a Rice Glycosyltransferase Phylogenomic Database and Identification of Rice-Diverged Glycosyltransferases. Molecular Plant, 2008, 1(5): 858-877.

  13. Identification of the Key Genes and Pathways in Esophageal Carcinoma

    PubMed Central

    Su, Peng; Wen, Shiwang; Zhang, Yuefeng; Li, Yong; Xu, Yanzhao; Zhu, Yonggang; Lv, Huilai; Zhang, Fan; Wang, Mingbo

    2016-01-01

    Objective. Esophageal carcinoma (EC) is a frequently common malignancy of gastrointestinal cancer in the world. This study aims to screen key genes and pathways in EC and elucidate the mechanism of it. Methods. 5 microarray datasets of EC were downloaded from Gene Expression Omnibus. Differentially expressed genes (DEGs) were screened by bioinformatics analysis. Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, and protein-protein interaction (PPI) network construction were performed to obtain the biological roles of DEGs in EC. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to verify the expression level of DEGs in EC. Results. A total of 1955 genes were filtered as DEGs in EC. The upregulated genes were significantly enriched in cell cycle and the downregulated genes significantly enriched in Endocytosis. PPI network displayed CDK4 and CCT3 were hub proteins in the network. The expression level of 8 dysregulated DEGs including CDK4, CCT3, THSD4, SIM2, MYBL2, CENPF, CDCA3, and CDKN3 was validated in EC compared to adjacent nontumor tissues and the results were matched with the microarray analysis. Conclusion. The significantly DEGs including CDK4, CCT3, THSD4, and SIM2 may play key roles in tumorigenesis and development of EC involved in cell cycle and Endocytosis.

  14. De Novo Transcriptome Analysis Provides Insights into Immune Related Genes and the RIG-I-Like Receptor Signaling Pathway in the Freshwater Planarian (Dugesia japonica)

    PubMed Central

    Deng, Hongkuan; Zhang, Yichao; Sun, Xiaowen; Zhu, Guangzhong; Liu, Baohua; Zhao, Bosheng

    2016-01-01

    Background The freshwater planarian Dugesia japonica (D. japonica) possesses extraordinary ability to regenerate lost organs or body parts. Interestingly, in the process of regeneration, there is little wound infection, suggesting that D. japonica has a formidable innate immune system. The importance of immune system prompted us to search for immune-related genes and RIG-I-like receptor signaling pathways. Results Transcriptome sequencing of D. japonica was performed on an IlluminaHiSeq2000 platform. A total of 27,180 transcripts were obtained by Trinity assembler. CEGMA analysis and mapping of all trimmed reads back to the assembly result showed that our transcriptome assembly covered most of the whole transcriptome. 23,888 out of 27,180 transcripts contained ORF (open reading fragment), and were highly similar to those in Schistosoma mansoni using BLASTX analysis. 8,079 transcripts (29.7%) and 8,668 (31.9%) were annotated by Blast2GO and KEGG respectively. A DYNLRB-like gene was cloned to verify its roles in the immune response. Finally, the expression patterns of 4 genes (RIG-I, TRAF3, TRAF6, P38) in the RIG-I-like receptor signaling pathway were detected, and the results showed they are very likely to be involved in planarian immune response. Conclusion RNA-Seq analysis based on the next-generation sequencing technology was an efficient approach to discover critical genes and to understand their corresponding biological functions. Through GO and KEGG analysis, several critical and conserved signaling pathways and genes related to RIG-I-like receptor signaling pathway were identified. Four candidate genes were selected to identify their expression dynamics in the process of pathogen stimulation. These annotated transcripts of D. japonica provide a useful resource for subsequent investigation of other important pathways. PMID:26986572

  15. Electron Inelastic-Mean-Free-Path Database

    National Institute of Standards and Technology Data Gateway

    SRD 71 NIST Electron Inelastic-Mean-Free-Path Database (PC database, no charge)   This database provides values of electron inelastic mean free paths (IMFPs) for use in quantitative surface analyses by AES and XPS.

  16. Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases.

    PubMed

    Sanderson, Lacey-Anne; Ficklin, Stephen P; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A; Bett, Kirstin E; Main, Dorrie

    2013-01-01

    Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including 'Feature Map', 'Genetic', 'Publication', 'Project', 'Contact' and the 'Natural Diversity' modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. DATABASE URL: http://tripal.info/. PMID:24163125

  17. Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases

    PubMed Central

    Sanderson, Lacey-Anne; Ficklin, Stephen P.; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A.; Bett, Kirstin E.; Main, Dorrie

    2013-01-01

    Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including ‘Feature Map’, ‘Genetic’, ‘Publication’, ‘Project’, ‘Contact’ and the ‘Natural Diversity’ modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. Database URL: http://tripal.info/ PMID:24163125

  18. Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology

    PubMed Central

    Paley, Suzanne M.; Krummenacker, Markus; Latendresse, Mario; Dale, Joseph M.; Lee, Thomas J.; Kaipa, Pallavi; Gilham, Fred; Spaulding, Aaron; Popescu, Liviu; Altman, Tomer; Paulsen, Ian; Keseler, Ingrid M.; Caspi, Ron

    2010-01-01

    Pathway Tools is a production-quality software environment for creating a type of model-organism database called a Pathway/Genome Database (PGDB). A PGDB such as EcoCyc integrates the evolving understanding of the genes, proteins, metabolic network and regulatory network of an organism. This article provides an overview of Pathway Tools capabilities. The software performs multiple computational inferences including prediction of metabolic pathways, prediction of metabolic pathway hole fillers and prediction of operons. It enables interactive editing of PGDBs by DB curators. It supports web publishing of PGDBs, and provides a large number of query and visualization tools. The software also supports comparative analyses of PGDBs, and provides several systems biology analyses of PGDBs including reachability analysis of metabolic networks, and interactive tracing of metabolites through a metabolic network. More than 800 PGDBs have been created using Pathway Tools by scientists around the world, many of which are curated DBs for important model organisms. Those PGDBs can be exchanged using a peer-to-peer DB sharing system called the PGDB Registry. PMID:19955237

  19. The Chicago Thoracic Oncology Database Consortium: A Multisite Database Initiative

    PubMed Central

    Carey, George B; Tan, Yi-Hung Carol; Bokhary, Ujala; Itkonen, Michelle; Szeto, Kyle; Wallace, James; Campbell, Nicholas; Hensing, Thomas; Salgia, Ravi

    2016-01-01

    Objective: An increasing amount of clinical data is available to biomedical researchers, but specifically designed database and informatics infrastructures are needed to handle this data effectively. Multiple research groups should be able to pool and share this data in an efficient manner. The Chicago Thoracic Oncology Database Consortium (CTODC) was created to standardize data collection and facilitate the pooling and sharing of data at institutions throughout Chicago and across the world. We assessed the CTODC by conducting a proof of principle investigation on lung cancer patients who took erlotinib. This study does not look into epidermal growth factor receptor (EGFR) mutations and tyrosine kinase inhibitors, but rather it discusses the development and utilization of the database involved. Methods:  We have implemented the Thoracic Oncology Program Database Project (TOPDP) Microsoft Access, the Thoracic Oncology Research Program (TORP) Velos, and the TORP REDCap databases for translational research efforts. Standard operating procedures (SOPs) were created to document the construction and proper utilization of these databases. These SOPs have been made available freely to other institutions that have implemented their own databases patterned on these SOPs. Results: A cohort of 373 lung cancer patients who took erlotinib was identified. The EGFR mutation statuses of patients were analyzed. Out of the 70 patients that were tested, 55 had mutations while 15 did not. In terms of overall survival and duration of treatment, the cohort demonstrated that EGFR-mutated patients had a longer duration of erlotinib treatment and longer overall survival compared to their EGFR wild-type counterparts who received erlotinib. Discussion: The investigation successfully yielded data from all institutions of the CTODC. While the investigation identified challenges, such as the difficulty of data transfer and potential duplication of patient data, these issues can be resolved

  20. The care pathway: concepts and theories: an introduction.

    PubMed

    Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

    2012-01-01

    This article addresses first the definition of a (care) pathway, and then follows a description of theories since the 1950s. It ends with a discussion of theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable by databases as Pubmed. From 2005, we performed a literature search on Pubmed and other literature databases, and with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustration of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not mentioned in this paper as a reference, because the content of them changes fast, sometimes every day. PMID:23593066

  1. The care pathway: concepts and theories: an introduction.

    PubMed

    Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

    2012-01-01

    This article addresses first the definition of a (care) pathway, and then follows a description of theories since the 1950s. It ends with a discussion of theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable by databases as Pubmed. From 2005, we performed a literature search on Pubmed and other literature databases, and with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustration of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not mentioned in this paper as a reference, because the content of them changes fast, sometimes every day.

  2. The care pathway: concepts and theories: an introduction

    PubMed Central

    Schrijvers, Guus; van Hoorn, Arjan; Huiskes, Nicolette

    2012-01-01

    This article addresses first the definition of a (care) pathway, and then follows a description of theories since the 1950s. It ends with a discussion of theoretical advantages and disadvantages of care pathways for patients and professionals. The objective of this paper is to provide a theoretical base for empirical studies on care pathways. The knowledge for this chapter is based on several books on pathways, which we found by searching in the digital encyclopedia Wikipedia. Although this is not usual in scientific publications, this method was used because books are not searchable by databases as Pubmed. From 2005, we performed a literature search on Pubmed and other literature databases, and with the keywords integrated care pathway, clinical pathway, critical pathway, theory, research, and evaluation. One of the inspirational sources was the website of the European Pathway Association (EPA) and its journal International Journal of Care Pathways. The authors visited several sites for this paper. These are mentioned as illustration of a concept or theory. Most of them have English websites with more information. The URLs of these websites are not mentioned in this paper as a reference, because the content of them changes fast, sometimes every day. PMID:23593066

  3. A Chronostratigraphic Relational Database Ontology

    NASA Astrophysics Data System (ADS)

    Platon, E.; Gary, A.; Sikora, P.

    2005-12-01

    A chronostratigraphic research database was donated by British Petroleum to the Stratigraphy Group at the Energy and Geoscience Institute (EGI), University of Utah. These data consists of over 2,000 measured sections representing over three decades of research into the application of the graphic correlation method. The data are global and includes both microfossil (foraminifera, calcareous nannoplankton, spores, pollen, dinoflagellate cysts, etc) and macrofossil data. The objective of the donation was to make the research data available to the public in order to encourage additional chronostratigraphy studies, specifically regarding graphic correlation. As part of the National Science Foundation's Cyberinfrastructure for the Geosciences (GEON) initiative these data have been made available to the public at http://css.egi.utah.edu. To encourage further research using the graphic correlation method, EGI has developed a software package, StrataPlot that will soon be publicly available from the GEON website as a standalone software download. The EGI chronostratigraphy research database, although relatively large, has many data holes relative to some paleontological disciplines and geographical areas, so the challenge becomes how do we expand the data available for chronostratigrahic studies using graphic correlation. There are several public or soon-to-be public databases available to chronostratigraphic research, but they have their own data structures and modes of presentation. The heterogeneous nature of these database schemas hinders their integration and makes it difficult for the user to retrieve and consolidate potentially valuable chronostratigraphic data. The integration of these data sources would facilitate rapid and comprehensive data searches, thus helping advance studies in chronostratigraphy. The GEON project will host a number of databases within the geology domain, some of which contain biostratigraphic data. Ontologies are being developed to provide

  4. Reference ballistic imaging database performance.

    PubMed

    De Kinder, Jan; Tulleners, Frederic; Thiebaut, Hugues

    2004-03-10

    Ballistic imaging databases allow law enforcement to link recovered cartridge cases to other crime scenes and to firearms. The success of these databases has led many to propose that all firearms in circulation be entered into a reference ballistic image database (RBID). To assess the performance of an RBID, we fired 4200 cartridge cases from 600 9mm Para Sig Sauer model P226 series pistols. Each pistol fired two Remington cartridges, one of which was imaged in the RBID, and five additional cartridges, consisting of Federal, Speer, Winchester, Wolf, and CCI brands. Randomly selected samples from the second series of Remington cartridge cases and from the five additional brands were then correlated against the RBID. Of the 32 cartridges of the same make correlated against the RBID, 72% ranked in the top 10 positions. Likewise, of the 160 cartridges of the five different brands correlated against the database, 21% ranked in the top 10 positions. Generally, the ranking position increased as the size of the RBID increased. We obtained similar results when we expanded the RBID to include firearms with the same class characteristics for breech face marks, firing pin impressions, and extractor marks. The results of our six queries against the RBID indicate that a reference ballistics image database of new guns is currently fraught with too many difficulties to be an effective and efficient law enforcement tool.

  5. REDIdb: the RNA editing database.

    PubMed

    Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

    2007-01-01

    The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.

  6. YCRD: Yeast Combinatorial Regulation Database

    PubMed Central

    Wu, Wei-Sheng; Hsieh, Yen-Chen; Lai, Fu-Jou

    2016-01-01

    In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because there is no such kind of databases in the public domain, this prompts us to construct a database, called Yeast Combinatorial Regulation Database (YCRD), which deposits 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF’s experimentally validated target genes were downloaded from YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/. PMID:27392072

  7. YCRD: Yeast Combinatorial Regulation Database.

    PubMed

    Wu, Wei-Sheng; Hsieh, Yen-Chen; Lai, Fu-Jou

    2016-01-01

    In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because there is no such kind of databases in the public domain, this prompts us to construct a database, called Yeast Combinatorial Regulation Database (YCRD), which deposits 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF's experimentally validated target genes were downloaded from YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/. PMID:27392072

  8. Searching NCBI databases using Entrez.

    PubMed

    Gibney, Gretchen; Baxevanis, Andreas D

    2011-06-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An alternate protocol builds upon the first basic protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The support protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.

  9. Searching NCBI databases using Entrez.

    PubMed

    Baxevanis, Andreas D

    2008-12-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two Basic Protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An Alternate Protocol builds upon the first Basic Protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The Support Protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.

  10. Stratospheric emissions effects database development

    NASA Technical Reports Server (NTRS)

    Baughcum, Steven L.; Henderson, Stephen C.; Hertel, Peter S.; Maggiora, Debra R.; Oncina, Carlos A.

    1994-01-01

    This report describes the development of a stratospheric emissions effects database (SEED) of aircraft fuel burn and emissions from projected Year 2015 subsonic aircraft fleets and from projected fleets of high-speed civil transports (HSCT's). This report also describes the development of a similar database of emissions from Year 1990 scheduled commercial passenger airline and air cargo traffic. The objective of this work was to initiate, develop, and maintain an engineering database for use by atmospheric scientists conducting the Atmospheric Effects of Stratospheric Aircraft (AESA) modeling studies. Fuel burn and emissions of nitrogen oxides (NO(x) as NO2), carbon monoxide, and hydrocarbons (as CH4) have been calculated on a 1-degree latitude x 1-degree longitude x 1-kilometer altitude grid and delivered to NASA as electronic files. This report describes the assumptions and methodology for the calculations and summarizes the results of these calculations.

  11. National Residential Efficiency Measures Database

    DOE Data Explorer

    The National Residential Efficiency Measures Database is a publicly available, centralized resource of residential building retrofit measures and costs for the U.S. building industry. With support from the U.S. Department of Energy, NREL developed this tool to help users determine the most cost-effective retrofit measures for improving energy efficiency of existing homes. Software developers who require residential retrofit performance and cost data for applications that evaluate residential efficiency measures are the primary audience for this database. In addition, home performance contractors and manufacturers of residential materials and equipment may find this information useful. The database offers the following types of retrofit measures: 1) Appliances, 2) Domestic Hot Water, 3) Enclosure, 4) Heating, Ventilating, and Air Conditioning (HVAC), 5) Lighting, 6) Miscellaneous.

  12. DOE Global Energy Storage Database

    DOE Data Explorer

    The DOE International Energy Storage Database has more than 400 documented energy storage projects from 34 countries around the world. The database provides free, up-to-date information on grid-connected energy storage projects and relevant state and federal policies. More than 50 energy storage technologies are represented worldwide, including multiple battery technologies, compressed air energy storage, flywheels, gravel energy storage, hydrogen energy storage, pumped hydroelectric, superconducting magnetic energy storage, and thermal energy storage. The policy section of the database shows 18 federal and state policies addressing grid-connected energy storage, from rules and regulations to tariffs and other financial incentives. It is funded through DOE’s Sandia National Laboratories, and has been operating since January 2012.

  13. Stratospheric emissions effects database development

    SciTech Connect

    Baughcum, S.L.; Henderson, S.C.; Hertel, P.S.; Maggiora, D.R.; Oncina, C.A.

    1994-07-01

    This report describes the development of a stratospheric emissions effects database (SEED) of aircraft fuel burn and emissions from projected Year 2015 subsonic aircraft fleets and from projected fleets of high-speed civil transports (HSCT's). This report also describes the development of a similar database of emissions from Year 1990 scheduled commercial passenger airline and air cargo traffic. The objective of this work was to initiate, develop, and maintain an engineering database for use by atmospheric scientists conducting the Atmospheric Effects of Stratospheric Aircraft (AESA) modeling studies. Fuel burn and emissions of nitrogen oxides (NO(x) as NO2), carbon monoxide, and hydrocarbons (as CH4) have been calculated on a 1-degree latitude x 1-degree longitude x 1-kilometer altitude grid and delivered to NASA as electronic files. This report describes the assumptions and methodology for the calculations and summarizes the results of these calculations.

  14. The Giardia genome project database.

    PubMed

    McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L

    2000-08-15

    The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.

  15. Polymorphix: a sequence polymorphism database.

    PubMed

    Bazin, Eric; Duret, Laurent; Penel, Simon; Galtier, Nicolas

    2005-01-01

    Within-species sequence variation data are of special interest since they contain information about recent population/species history, and the molecular evolutionary forces currently in action in natural populations. These data, however, are presently dispersed within generalist databases, and are difficult to access. To solve this problem, we have developed Polymorphix, a database dedicated to sequence polymorphism. It contains within-species homologous sequence families built using EMBL/GenBank under suitable similarity and bibliographic criteria. Polymorphix is an ACNUC structured database allowing both simple and complex queries for population genomic studies. Alignments within families as well as phylogenetic trees can be download. When available, outgroups are included in the alignment. Polymorphix contains sequences from the nuclear, mitochondrial and chloroplastic genomes of every eukaryote species represented in EMBL. It can be accessed by a web interface (http://pbil.univ-lyon1.fr/polymorphix/query.php).

  16. The CATDAT damaging earthquakes database

    NASA Astrophysics Data System (ADS)

    Daniell, J. E.; Khazai, B.; Wenzel, F.; Vervaeck, A.

    2011-08-01

    The global CATDAT damaging earthquakes and secondary effects (tsunami, fire, landslides, liquefaction and fault rupture) database was developed to validate, remove discrepancies, and expand greatly upon existing global databases; and to better understand the trends in vulnerability, exposure, and possible future impacts of such historic earthquakes. Lack of consistency and errors in other earthquake loss databases frequently cited and used in analyses was a major shortcoming in the view of the authors which needed to be improved upon. Over 17 000 sources of information have been utilised, primarily in the last few years, to present data from over 12 200 damaging earthquakes historically, with over 7000 earthquakes since 1900 examined and validated before insertion into the database. Each validated earthquake includes seismological information, building damage, ranges of social losses to account for varying sources (deaths, injuries, homeless, and affected), and economic losses (direct, indirect, aid, and insured). Globally, a slightly increasing trend in economic damage due to earthquakes is not consistent with the greatly increasing exposure. The 1923 Great Kanto (214 billion USD damage; 2011 HNDECI-adjusted dollars) compared to the 2011 Tohoku (>300 billion USD at time of writing), 2008 Sichuan and 1995 Kobe earthquakes show the increasing concern for economic loss in urban areas as the trend should be expected to increase. Many economic and social loss values not reported in existing databases have been collected. Historical GDP (Gross Domestic Product), exchange rate, wage information, population, HDI (Human Development Index), and insurance information have been collected globally to form comparisons. This catalogue is the largest known cross-checked global historic damaging earthquake database and should have far-reaching consequences for earthquake loss estimation, socio-economic analysis, and the global reinsurance field.

  17. The new IAGOS Database Portal

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Fontaine, Alain

    2016-04-01

    IAGOS (In-service Aircraft for a Global Observing System) is a European Research Infrastructure which aims at the provision of long-term, regular and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. It contains IAGOS-core data and IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data. The IAGOS Database Portal (http://www.iagos.fr, damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data center AERIS (http://www.aeris-data.fr). The new IAGOS Database Portal has been released in December 2015. The main improvement is the interoperability implementation with international portals or other databases in order to improve IAGOS data discovery. In the frame of the IGAS project (IAGOS for the Copernicus Atmospheric Service), a data network has been setup. It is composed of three data centers: the IAGOS database in Toulouse; the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de); and the CAMS data center in Jülich (http://join.iek.fz-juelich.de). The CAMS (Copernicus Atmospheric Monitoring Service) project is a prominent user of the IGAS data network. The new portal provides improved and new services such as the download in NetCDF or NASA Ames formats, plotting tools (maps, time series, vertical profiles, etc.) and user management. Added value products are available on the portal: back trajectories, origin of air masses, co-location with satellite data, etc. The link with the CAMS data center, through JOIN (Jülich OWS Interface), allows to combine model outputs with IAGOS data for inter-comparison. Finally IAGOS metadata has been standardized (ISO 19115) and now provides complete information about data traceability and quality.

  18. EPA U.S. NATIONAL MARKAL DATABASE: DATABASE DOCUMENTATION

    EPA Science Inventory

    This document describes in detail the U.S. Energy System database developed by EPA's Integrated Strategic Assessment Work Group for use with the MARKAL model. The group is part of the Office of Research and Development and is located in the National Risk Management Research Labor...

  19. A Sandia telephone database system

    SciTech Connect

    Nelson, S.D.; Tolendino, L.F.

    1991-08-01

    Sandia National Laboratories, Albuquerque, may soon have more responsibility for the operation of its own telephone system. The processes that constitute providing telephone service can all be improved through the use of a central data information system. We studied these processes, determined the requirements for a database system, then designed the first stages of a system that meets our needs for work order handling, trouble reporting, and ISDN hardware assignments. The design was based on an extensive set of applications that have been used for five years to manage the Sandia secure data network. The system utilizes an Ingres database management system and is programmed using the Application-By-Forms tools.

  20. A comparison of biomedical databases.

    PubMed Central

    Mychko-Megrin, A Y

    1991-01-01

    Various published bibliographic and abstract services covering the period 1970-1988 were compared to analyze scope and coverage. A total of 7,281 articles and book titles (1,655 Soviet and 5,626 foreign) were selected on forty-one topics in different medical fields. The titles originated from three different samples but included all Soviet medical literature on the subjects. A distribution of biomedical serials from five databases is given by country, and twelve indices to assess the quality of biomedical databases are suggested. PMID:1884085

  1. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Masuyama, Keiichi

    CD-ROM has rapidly evolved as a new information medium with large capacity, In the U.S. it is predicted that it will become two hundred billion yen market in three years, and thus CD-ROM is strategic target of database industry. Here in Japan the movement toward its commercialization has been active since this year. Shall CD-ROM bussiness ever conquer information market as an on-disk database or electronic publication? Referring to some cases of the applications in the U.S. the author views marketability and the future trend of this new optical disk medium.

  2. Coal quality databases: Practical applications

    SciTech Connect

    Finkelman, R.B.; Gross, P.M.K.

    1999-07-01

    Domestic and worldwide coal use will be influenced by concerns about the effects of coal combustion on the local, regional and global environment. Reliable coal quality data can help decision-makers to better assess risks and determine impacts of coal constituents on technological behavior, economic byproduct recovery, and environmental and human health issues. The US Geological Survey (USGS) maintains an existing coal quality database (COALQUAL) that contains analyses of approximately 14,000 col samples from every major coal-producing basin in the US. For each sample, the database contains results of proximate and ultimate analyses; sulfur form data; and major, minor, and trace element concentrations for approximately 70 elements

  3. 3-MCPD 1-Palmitate Induced Tubular Cell Apoptosis In Vivo via JNK/p53 Pathways.

    PubMed

    Liu, Man; Huang, Guoren; Wang, Thomas T Y; Sun, Xiangjun; Yu, Liangli Lucy

    2016-05-01

    Fatty acid esters of 3-chloro-1, 2-propanediol (3-MCPD esters) are a group of processing induced food contaminants with nephrotoxicity but the molecular mechanism(s) remains unclear. This study investigated whether and how the JNK/p53 pathway may play a role in the nephrotoxic effect of 3-MCPD esters using 3-MCPD 1-palmitate (MPE) as a probe compound in Sprague Dawley rats. Microarray analysis of the kidney from the Sprague Dawley rats treated with MPE, using Gene Ontology categories and KEGG pathways, revealed that MPE altered mRNA expressions of the genes involved in the mitogen-activated protein kinase (JNK and ERK), p53, and apoptotic signal transduction pathways. The changes in the mRNA expressions were confirmed by qRT-PCR and Western blot analyses and were consistent with the induction of tubular cell apoptosis as determined by histopathological, TUNEL, and immunohistochemistry analyses in the kidneys of the Sprague Dawley rats. Additionally, p53 knockout attenuated the apoptosis, and the apoptosis-related protein bax expression and cleaved caspase-3 activation induced by MPE in the p53 knockout C57BL/6 mice, whereas JNK inhibitor SP600125 but not ERK inhibitor U0126 inhibited MPE-induced apoptosis, supporting the conclusion that JNK/p53 might play a critical role in the tubular cell apoptosis induced by MPE and other 3-MCPD fatty acid esters. PMID:27008853

  4. Identification of novel genes and pathways in carotid atheroma using integrated bioinformatic methods

    PubMed Central

    Nai, Wenqing; Threapleton, Diane; Lu, Jingbo; Zhang, Kewei; Wu, Hongyuan; Fu, You; Wang, Yuanyuan; Ou, Zejin; Shan, Lanlan; Ding, Yan; Yu, Yanlin; Dai, Meng

    2016-01-01

    Atherosclerosis is the primary cause of cardiovascular events and its molecular mechanism urgently needs to be clarified. In our study, atheromatous plaques (ATH) and macroscopically intact tissue (MIT) sampled from 32 patients were compared and an integrated series of bioinformatic microarray analyses were used to identify altered genes and pathways. Our work showed 816 genes were differentially expressed between ATH and MIT, including 443 that were up-regulated and 373 that were down-regulated in ATH tissues. GO functional-enrichment analysis for differentially expressed genes (DEGs) indicated that genes related to the “immune response” and “muscle contraction” were altered in ATHs. KEGG pathway-enrichment analysis showed that up-regulated DEGs were significantly enriched in the “FcεRI-mediated signaling pathway”, while down-regulated genes were significantly enriched in the “transforming growth factor-β signaling pathway”. Protein-protein interaction network and module analysis demonstrated that VAV1, SYK, LYN and PTPN6 may play critical roles in the network. Additionally, similar observations were seen in a validation study where SYK, LYN and PTPN6 were markedly elevated in ATH. All in all, identification of these genes and pathways not only provides new insights into the pathogenesis of atherosclerosis, but may also aid in the development of prognostic and therapeutic biomarkers for advanced atheroma. PMID:26742467

  5. Protein Analysis of Sapienic Acid-Treated Porphyromonas gingivalis Suggests Differential Regulation of Multiple Metabolic Pathways

    PubMed Central

    Dawson, Deborah V.; Blanchette, Derek R.; Drake, David R.; Wertz, Philip W.; Brogden, Kim A.

    2015-01-01

    ABSTRACT Lipids endogenous to skin and mucosal surfaces exhibit potent antimicrobial activity against Porphyromonas gingivalis, an important colonizer of the oral cavity implicated in periodontitis. Our previous work demonstrated the antimicrobial activity of the fatty acid sapienic acid (C16:1Δ6) against P. gingivalis and found that sapienic acid treatment alters both protein and lipid composition from those in controls. In this study, we further examined whole-cell protein differences between sapienic acid-treated bacteria and untreated controls, and we utilized open-source functional association and annotation programs to explore potential mechanisms for the antimicrobial activity of sapienic acid. Our analyses indicated that sapienic acid treatment induces a unique stress response in P. gingivalis resulting in differential expression of proteins involved in a variety of metabolic pathways. This network of differentially regulated proteins was enriched in protein-protein interactions (P = 2.98 × 10−8), including six KEGG pathways (P value ranges, 2.30 × 10−5 to 0.05) and four Gene Ontology (GO) molecular functions (P value ranges, 0.02 to 0.04), with multiple suggestive enriched relationships in KEGG pathways and GO molecular functions. Upregulated metabolic pathways suggest increases in energy production, lipid metabolism, iron acquisition and processing, and respiration. Combined with a suggested preferential metabolism of serine, which is necessary for fatty acid biosynthesis, these data support our previous findings that the site of sapienic acid antimicrobial activity is likely at the bacterial membrane. IMPORTANCE P. gingivalis is an important opportunistic pathogen implicated in periodontitis. Affecting nearly 50% of the population, periodontitis is treatable, but the resulting damage is irreversible and eventually progresses to tooth loss. There is a great need for natural products that can be used to treat and/or prevent the overgrowth of

  6. Online Petroleum Industry Bibliographic Databases: A Review.

    ERIC Educational Resources Information Center

    Anderson, Margaret B.

    This paper discusses the present status of the bibliographic database industry, reviews the development of online databases of interest to the petroleum industry, and considers future developments in online searching and their effect on libraries and information centers. Three groups of databases are described: (1) databases developed by the…

  7. Federal Register Document Image Database, Volume 1

    National Institute of Standards and Technology Data Gateway

    NIST Federal Register Document Image Database, Volume 1 (PC database for purchase)   NIST has produced a new document image database for evaluating document analysis and recognition technologies and information retrieval systems. NIST Special Database 25 contains page images from the 1994 Federal Register and much more.

  8. European Community Databases: Online to Europe.

    ERIC Educational Resources Information Center

    Hensley, Colin

    1989-01-01

    Describes three groups of databases sponsored by the European Communities Commission: Eurobases, a textual database of the contents of the "Official Journal" of the European Community; the European Community Host Organization (ECHO) databases, which offer multilingual information about Europe; and statistical databases. Information on access and…

  9. WMC Database Evaluation. Case Study Report

    SciTech Connect

    Palounek, Andrea P. T

    2015-10-29

    The WMC Database is ultimately envisioned to hold a collection of experimental data, design information, and information from computational models. This project was a first attempt at using the Database to access experimental data and extract information from it. This evaluation shows that the Database concept is sound and robust, and that the Database, once fully populated, should remain eminently usable for future researchers.

  10. Prokaryotic Genomes from Microbes Online Database

    DOE Data Explorer

    Alm, Eric J.; Huang, Katherine H.; Price, Morgan N.; Koche, Richard P.; Keller, Keith; Dubchak, Inna L.; Arkin, Adam P.

    To describe the potential functions of genes, MicrobesOnline includes protein family analyses (from InterPro and COG), metabolic maps (from KEGG), links to research papers (from UniProt and PubMed), and operon predictions for every genome. To examine each gene's evolutionary history, MicrobesOnline includes precomputed phylogenetic trees for all the gene families. It displays gene trees with genomic context or it compares the gene tree to the species tree. The tools provided with MicrobesOnline allow users to: compute customized motifs, sequence alignments, and phylogenetic trees change expression patterns in metabolic maps annotate genes in various ways. A browse tree tool and a genome browser are available, along with specialized search capabilities. (Specialized Interface)

  11. Chemical Shifts to Metabolic Pathways: Identifying Metabolic Pathways Directly from a Single 2D NMR Spectrum.

    PubMed

    Dubey, Abhinav; Rangarajan, Annapoorni; Pal, Debnath; Atreya, Hanudatta S

    2015-12-15

    Identifying cellular processes in terms of metabolic pathways is one of the avowed goals of metabolomics studies. Currently, this is done after relevant metabolites are identified to allow their mapping onto specific pathways. This task is daunting due to the complex nature of cellular processes and the difficulty in establishing the identity of individual metabolites. We propose here a new method: ChemSMP (Chemical Shifts to Metabolic Pathways), which facilitates rapid analysis by identifying the active metabolic pathways directly from chemical shifts obtained from a single two-dimensional (2D) [(13)C-(1)H] correlation NMR spectrum without the need for identification and assignment of individual metabolites. ChemSMP uses a novel indexing and scoring system comprised of a "uniqueness score" and a "coverage score". Our method is demonstrated on metabolic pathways data from the Small Molecule Pathway Database (SMPDB) and chemical shifts from the Human Metabolome Database (HMDB). Benchmarks show that ChemSMP has a positive prediction rate of >90% in the presence of decluttered data and can sustain the same at 60-70% even in the presence of noise, such as deletions of peaks and chemical shift deviations. The method tested on NMR data acquired for a mixture of 20 amino acids shows a success rate of 93% in correct recovery of pathways. When used on data obtained from the cell lysate of an unexplored oncogenic cell line, it revealed active metabolic pathways responsible for regulating energy homeostasis of cancer cells. Our unique tool is thus expected to significantly enhance analysis of NMR-based metabolomics data by reducing existing impediments.

  12. Identifying the Genetic Variation of Gene Expression Using Gene Sets: Application of Novel Gene Set eQTL Approach to PharmGKB and KEGG

    PubMed Central

    Abo, Ryan; Jenkins, Gregory D.; Wang, Liewei; Fridley, Brooke L.

    2012-01-01

    Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available performing all pair wise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set – expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB and KEGG defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations. PMID:22905253

  13. Pathway modulators and inhibitors.

    PubMed

    Smith, John A

    2009-07-01

    Inhibitors of specific cellular pathways are useful for investigating the roles of proteins of unknown function, and for selectively inhibiting a protein in complex pathways to uncover its relationships to other proteins in this and other interacting pathways. This appendix provides links to Web sites that describe cellular processes and pathways along with the various classes of inhibitors, numerous references, downloadable diagrams, and technical tips.

  14. NLTE4 Plasma Population Kinetics Database

    National Institute of Standards and Technology Data Gateway

    SRD 159 NLTE4 Plasma Population Kinetics Database (Web database for purchase)   This database contains benchmark results for simulation of plasma population kinetics and emission spectra. The data were contributed by the participants of the 4th Non-LTE Code Comparison Workshop who have unrestricted access to the database. The only limitation for other users is in hidden labeling of the output results. Guest users can proceed to the database entry page without entering userid and password.

  15. KID, a Kinase Inhibitor Database project.

    PubMed

    Collin, O; Meijer, L

    1999-01-01

    The Kinase Inhibitor Database is a small specialized database dedicated to the gathering of information on protein kinase inhibitors. The database is accessible through the World Wide Web system and gives access to structural and bibliographic information on protein kinase inhibitors. The data in the database will be collected and submitted by researchers working in the kinase inhibitor field. The submitted data will be checked by the curator of the database before entry.

  16. PathoPlant: a database on plant-pathogen interactions.

    PubMed

    Bülow, Lorenz; Schindler, Martin; Choi, Claudia; Hehl, Reinhard

    2004-01-01

    Pathogen recognition and signal transduction during plant pathogenesis is essential for the activation of plant defense mechanisms. To facilitate easy access to published data and to permit comparative studies of different pathogen response pathways, a database is indispensable to give a broad overview of the components and reactions so far known. PathoPlant has been developed as a relational database to display relevant components and reactions involved in signal transduction related to plant-pathogen interactions. On the organism level, the tables 'plant', 'pathogen' and 'interaction' are used to describe incompatible interactions between plants and pathogens or diseases. On the molecular level, plant pathogenesis related information is organized in PathoPlant's main tables 'molecule', 'reaction' and 'location'. Signal transduction pathways are modeled as consecutive sequences of known molecules and corresponding reactions. PathoPlant entries are linked to associated internal records as well as to entries in external databases such as SWISS-PROT, GenBank, PubMed, and TRANSFAC. PathoPlant is available as a web-based service at http://www.pathoplant.de. PMID:15752070

  17. The NASA Fireball Network Database

    NASA Technical Reports Server (NTRS)

    Moser, Danielle E.

    2011-01-01

    The NASA Meteoroid Environment Office (MEO) has been operating an automated video fireball network since late-2008. Since that time, over 1,700 multi-station fireballs have been observed. A database containing orbital data and trajectory information on all these events has recently been compiled and is currently being mined for information. Preliminary results are presented here.

  18. LDEF meteoroid and debris database

    NASA Astrophysics Data System (ADS)

    Dardano, C. B.; See, Thomas H.; Zolensky, Michael E.

    The Long Duration Exposure Facility (LDEF) Meteoroid and Debris Special Investigation Group (M&D SIG) database is maintained at the Johnson Space Center (JSC), Houston, Texas, and consists of five data tables containing information about individual features, digitized images of selected features, and LDEF hardware (i.e., approximately 950 samples) archived at JSC. About 4000 penetrations (greater than 300 micron in diameter) and craters (greater than 500 micron in diameter) were identified and photo-documented during the disassembly of LDEF at the Kennedy Space Center (KSC), while an additional 4500 or so have subsequently been characterized at JSC. The database also contains some data that have been submitted by various PI's, yet the amount of such data is extremely limited in its extent, and investigators are encouraged to submit any and all M&D-type data to JSC for inclusion within the M&D database. Digitized stereo-image pairs are available for approximately 4500 features through the database.

  19. Databases in the United Kingdom.

    ERIC Educational Resources Information Center

    Chadwyck-Healey, Charles

    This overview of the status of online databases in the United Kingdom describes online users' attitudes and practices in light of two surveys conducted in the past two years. The Online Information Centre at ASLIB sampled 325 users, and Chadwyck-Healey, Ltd., conducted a face-to-face survey of librarians in a broad cross-section of 76 libraries.…

  20. Database Transformations for Biological Applications

    SciTech Connect

    Overton, C.; Davidson, S. B.; Buneman, P.; Tannen, V.

    2001-04-11

    The goal of this project was to develop tools to facilitate data transformations between heterogeneous data sources found throughout biomedical applications. Such transformations are necessary when sharing data between different groups working on related problems as well as when querying data spread over different databases, files and software analysis packages.

  1. Interactive bibliographical database on color

    NASA Astrophysics Data System (ADS)

    Caivano, Jose L.

    2002-06-01

    The paper describes the methodology and results of a project under development, aimed at the elaboration of an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated in various opportunities, and now available at the Internet, with more than 2,000 entries. The interactive database will amplify that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but allowing also rearrangements or selections by author, subject and keywords.

  2. The Visual Double Star Database

    NASA Astrophysics Data System (ADS)

    Worley, Charles E.

    1997-03-01

    The collection of visual double star data in a systematic way began more than a century ago and has been continued regularly since that time. Thus, it forms the oldest extant database in astronomy. This contribution briefly reviews the history and highlights of the project, describes the current status and future prospects for this endeavor and comments on the current modes of data distribution.

  3. Begin: Online Database Searching Now!

    ERIC Educational Resources Information Center

    Lodish, Erica K.

    1986-01-01

    Because of the increasing importance of online databases, school library media specialists are encouraged to introduce students to online searching. Four books that would help media specialists gain a basic background are reviewed and it is noted that although they are very technical, they can be adapted to individual needs. (EM)

  4. Using Databases in History Teaching.

    ERIC Educational Resources Information Center

    Knight, P.; Timmins, G.

    1986-01-01

    Discusses advantages and limitations of database software in meeting the educational objectives of history instruction; reviews five currently available computer programs (FACTFILE, QUEST, QUARRY BANK 1851, Census Analysis, and Beta Base); highlights major considerations that arise in designing such programs; and describes their classroom use.…

  5. Berlin Emissivity Database (BED) Archive

    NASA Astrophysics Data System (ADS)

    D'Amore, M.; Helbert, J.; Maturilli, A.

    2009-03-01

    The Berlin Emissivity Database ranges from 3 to 50 µm. BED comprises several grain-sized mineral, up to high temperature, and has a modular structure, to collect in the future Raman measurement, samples pictures, thin section images and so on.

  6. The New NRL Crystallographic Database

    NASA Astrophysics Data System (ADS)

    Mehl, Michael; Curtarolo, Stefano; Hicks, David; Toher, Cormac; Levy, Ohad; Hart, Gus

    For many years the Naval Research Laboratory maintained an online graphical database of crystal structures for a wide variety of materials. This database has now been redesigned, updated and integrated with the AFLOW framework for high throughput computational materials discovery (http://materials.duke.edu/aflow.html). For each structure we provide an image showing the atomic positions; the primitive vectors of the lattice and the basis vectors of every atom in the unit cell; the space group and Wyckoff positions; Pearson symbols; common names; and Strukturbericht designations, where available. References for each structure are provided, as well as a Crystallographic Information File (CIF). The database currently includes almost 300 entries and will be continuously updated and expanded. It enables easy search of the various structures based on their underlying symmetries, either by Bravais lattice, Pearson symbol, Strukturbericht designation or commonly used prototypes. The talk will describe the features of the database, and highlight its utility for high throughput computational materials design. Work at NRL is funded by a Contract with the Duke University Department of Mechanical Engineering.

  7. Safeguarding Databases Basic Concepts Revisited.

    ERIC Educational Resources Information Center

    Cardinali, Richard

    1995-01-01

    Discusses issues of database security and integrity, including computer crime and vandalism, human error, computer viruses, employee and user access, and personnel policies. Suggests some precautions to minimize system vulnerability such as careful personnel screening, audit systems, passwords, and building and software security systems. (JKP)

  8. DED: Database of Evolutionary Distances.

    PubMed

    Veeramachaneni, Vamsi; Makalowski, Wojciech

    2005-01-01

    A large database of homologous sequence alignments with good estimates of evolutionary distances can be a valuable resource for molecular evolutionary studies and phylogenetic research in particular. We recently created a database containing 159,921 transcripts from human, mouse, rat, zebrafish and fugu species. Approximately 1,000 homology groups were identified with the help of Ensembl homology evidence. At the macro-level, the database allows us to answer queries of the form: 1. What is the average k-distance between 5' untranslated regions of human and mouse? 2. List the 10 groups with the highest K(a)/K(s) ratio between mouse and rat. 3. List all identical proteins between human and rat. Researchers interested in specific proteins can use a simple web interface to retrieve the homology groups of interest, examine all pairwise distances between members of the group and study the conservation of exon-intron gene structures using a graphical interface. The database is available at http://warta.bio.psu.edu/DED/.

  9. Open access intrapartum CTG database

    PubMed Central

    2014-01-01

    Background Cardiotocography (CTG) is a monitoring of fetal heart rate and uterine contractions. Since 1960 it is routinely used by obstetricians to assess fetal well-being. Many attempts to introduce methods of automatic signal processing and evaluation have appeared during the last 20 years, however still no significant progress similar to that in the domain of adult heart rate variability, where open access databases are available (e.g. MIT-BIH), is visible. Based on a thorough review of the relevant publications, presented in this paper, the shortcomings of the current state are obvious. A lack of common ground for clinicians and technicians in the field hinders clinically usable progress. Our open access database of digital intrapartum cardiotocographic recordings aims to change that. Description The intrapartum CTG database consists in total of 552 intrapartum recordings, which were acquired between April 2010 and August 2012 at the obstetrics ward of the University Hospital in Brno, Czech Republic. All recordings were stored in electronic form in the OB TraceVue®;system. The recordings were selected from 9164 intrapartum recordings with clinical as well as technical considerations in mind. All recordings are at most 90 minutes long and start a maximum of 90 minutes before delivery. The time relation of CTG to delivery is known as well as the length of the second stage of labor which does not exceed 30 minutes. The majority of recordings (all but 46 cesarean sections) is – on purpose – from vaginal deliveries. All recordings have available biochemical markers as well as some more general clinical features. Full description of the database and reasoning behind selection of the parameters is presented in the paper. Conclusion A new open-access CTG database is introduced which should give the research community common ground for comparison of results on reasonably large database. We anticipate that after reading the paper, the reader will understand the

  10. Exploring genetic, genomic, and phenotypic data at the rat genome database.

    PubMed

    Laulederkind, Stanley J F; Hayman, G Thomas; Wang, Shur-Jen; Lowry, Timothy F; Nigam, Rajni; Petri, Victoria; Smith, Jennifer R; Dwinell, Melinda R; Jacob, Howard J; Shimoyama, Mary

    2012-12-01

    The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, http://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes, and cellular components for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat.

  11. Dietary MicroRNA Database (DMD): An Archive Database and Analytic Tool for Food-Borne microRNAs.

    PubMed

    Chiang, Kevin; Shu, Jiang; Zempleni, Janos; Cui, Juan

    2015-01-01

    With the advent of high throughput technology, a huge amount of microRNA information has been added to the growing body of knowledge for non-coding RNAs. Here we present the Dietary MicroRNA Databases (DMD), the first repository for archiving and analyzing the published and novel microRNAs discovered in dietary resources. Currently there are fifteen types of dietary species, such as apple, grape, cow milk, and cow fat, included in the database originating from 9 plant and 5 animal species. Annotation for each entry, a mature microRNA indexed as DM0000*, covers information of the mature sequences, genome locations, hairpin structures of parental pre-microRNAs, cross-species sequence comparison, disease relevance, and the experimentally validated gene targets. Furthermore, a few functional analyses including target prediction, pathway enrichment and gene network construction have been integrated into the system, which enable users to generate functional insights through viewing the functional pathways and building protein-protein interaction networks associated with each microRNA. Another unique feature of DMD is that it provides a feature generator where a total of 411 descriptive attributes can be calculated for any given microRNAs based on their sequences and structures. DMD would be particularly useful for research groups studying microRNA regulation from a nutrition point of view. The database can be accessed at http://sbbi.unl.edu/dmd/.

  12. dbPTB: a database for preterm birth.

    PubMed

    Uzun, Alper; Laliberte, Alyse; Parker, Jeremy; Andrew, Caroline; Winterrowd, Emily; Sharma, Surendra; Istrail, Sorin; Padbury, James F

    2012-01-01

    Genome-wide association studies (GWAS) query the entire genome in a hypothesis-free, unbiased manner. Since they have the potential for identifying novel genetic variants, they have become a very popular approach to the investigation of complex diseases. Nonetheless, since the success of the GWAS approach varies widely, the identification of genetic variants for complex diseases remains a difficult problem. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. To test the feasibility of our approach, we developed a web-based aggregation tool to organize the genes, genetic variations and pathways involved in preterm birth. We used semantic data mining to extract all published articles related to preterm birth. All articles were reviewed by a team of curators. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information form a unique resource for investigators interested in preterm birth. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions. PMID:22323062

  13. An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nicholas; Sellis, Timos

    1994-01-01

    We investigated a number of design and performance issues of interoperable database management systems (DBMS's). The major results of our investigation were obtained in the areas of client-server database architectures for heterogeneous DBMS's, incremental computation models, buffer management techniques, and query optimization. We finished a prototype of an advanced client-server workstation-based DBMS which allows access to multiple heterogeneous commercial DBMS's. Experiments and simulations were then run to compare its performance with the standard client-server architectures. The focus of this research was on adaptive optimization methods of heterogeneous database systems. Adaptive buffer management accounts for the random and object-oriented access methods for which no known characterization of the access patterns exists. Adaptive query optimization means that value distributions and selectives, which play the most significant role in query plan evaluation, are continuously refined to reflect the actual values as opposed to static ones that are computed off-line. Query feedback is a concept that was first introduced to the literature by our group. We employed query feedback for both adaptive buffer management and for computing value distributions and selectivities. For adaptive buffer management, we use the page faults of prior executions to achieve more 'informed' management decisions. For the estimation of the distributions of the selectivities, we use curve-fitting techniques, such as least squares and splines, for regressing on these values.

  14. GPCR & company: databases and servers for GPCRs and interacting partners.

    PubMed

    Kowalsman, Noga; Niv, Masha Y

    2014-01-01

    G-protein-coupled receptors (GPCRs) are a large superfamily of membrane receptors that are involved in a wide range of signaling pathways. To fulfill their tasks, GPCRs interact with a variety of partners, including small molecules, lipids and proteins. They are accompanied by different proteins during all phases of their life cycle. Therefore, GPCR interactions with their partners are of great interest in basic cell-signaling research and in drug discovery.Due to the rapid development of computers and internet communication, knowledge and data can be easily shared within the worldwide research community via freely available databases and servers. These provide an abundance of biological, chemical and pharmacological information.This chapter describes the available web resources for investigating GPCR interactions. We review about 40 freely available databases and servers, and provide a few sentences about the essence and the data they supply. For simplification, the databases and servers were grouped under the following topics: general GPCR-ligand interactions; particular families of GPCRs and their ligands; GPCR oligomerization; GPCR interactions with intracellular partners; and structural information on GPCRs. In conclusion, a multitude of useful tools are currently available. Summary tables are provided to ease navigation between the numerous and partially overlapping resources. Suggestions for future enhancements of the online tools include the addition of links from general to specialized databases and enabling usage of user-supplied template for GPCR structural modeling. PMID:24158806

  15. MeioBase: a comprehensive database for meiosis.

    PubMed

    Li, Hao; Meng, Fanrui; Guo, Chunce; Wang, Yingxiang; Xie, Xiaojing; Zhu, Tiansheng; Zhou, Shuigeng; Ma, Hong; Shan, Hongyan; Kong, Hongzhi

    2014-01-01

    Meiosis is a special type of cell division process necessary for the sexual reproduction of all eukaryotes. The ever expanding meiosis research calls for an effective and specialized database that is not readily available yet. To fill this gap, we have developed a knowledge database MeioBase (http://meiosis.ibcas.ac.cn), which is comprised of two core parts, Resources and Tools. In the Resources part, a wealth of meiosis data collected by curation and manual review from published literatures and biological databases are integrated and organized into various sections, such as Cytology, Pathway, Species, Interaction, and Expression. In the Tools part, some useful tools have been integrated into MeioBase, such as Search, Download, Blast, Comparison, My Favorites, Submission, and Advice. With a simplified and efficient web interface, users are able to search against the database with gene model IDs or keywords, and batch download the data for local investigation. We believe that MeioBase can greatly facilitate the researches related to meiosis. PMID:25566299

  16. Ovarian Kaleidoscope Database: Ten Years and Beyond1

    PubMed Central

    Hsueh, Aaron J.; Rauch, Rami

    2012-01-01

    ABSTRACT Ovarian Kaleidoscope database (OKdb) is an online, searchable, public database containing text-based and DNA microarray data to facilitate research by ovarian researchers. Using key words and predetermined categories, users can search ovarian gene information based on gene function, cell type of expression, cellular localization, hormonal regulation, mutant phenotypes, chromosomal location, ligand-receptor relationship, and other criteria, either alone or in combination. For individual genes, users can access more than 10 extensive DNA microarray datasets to interrogate gene expression patterns in a development-specific and cell type-specific manner. All ligand and receptor genes expressed in the ovary are matched to facilitate investigation of paracrine/autocrine signaling. More than 3500 ovarian genes in the database are matched to 185 gene pathways in the Kyoto Encyclopedia of Genes and Genomes to allow for elucidation of gene interactions and relationships. In addition to >400 genes with infertility or subfertility phenotypes when mutated in mice or humans, the OKdb also lists ∼50 and ∼40 genes associated with polycystic ovarian syndrome and primary ovarian insufficiency, respectively. The expanding OKdb is updated weekly and allows submission of new genes by ovarian researchers to allow instant access to DNA microarray datasets for newly submitted genes. The present database is a virtual community for ovarian researchers and allows users to instantaneously provide their comments for individual gene pages based on an automated Web-discussion system. In the coming years, we will continue to add new features to serve the ovarian research community. PMID:22441797

  17. GPCR & company: databases and servers for GPCRs and interacting partners.

    PubMed

    Kowalsman, Noga; Niv, Masha Y

    2014-01-01

    G-protein-coupled receptors (GPCRs) are a large superfamily of membrane receptors that are involved in a wide range of signaling pathways. To fulfill their tasks, GPCRs interact with a variety of partners, including small molecules, lipids and proteins. They are accompanied by different proteins during all phases of their life cycle. Therefore, GPCR interactions with their partners are of great interest in basic cell-signaling research and in drug discovery.Due to the rapid development of computers and internet communication, knowledge and data can be easily shared within the worldwide research community via freely available databases and servers. These provide an abundance of biological, chemical and pharmacological information.This chapter describes the available web resources for investigating GPCR interactions. We review about 40 freely available databases and servers, and provide a few sentences about the essence and the data they supply. For simplification, the databases and servers were grouped under the following topics: general GPCR-ligand interactions; particular families of GPCRs and their ligands; GPCR oligomerization; GPCR interactions with intracellular partners; and structural information on GPCRs. In conclusion, a multitude of useful tools are currently available. Summary tables are provided to ease navigation between the numerous and partially overlapping resources. Suggestions for future enhancements of the online tools include the addition of links from general to specialized databases and enabling usage of user-supplied template for GPCR structural modeling.

  18. SASD: the Synthetic Alternative Splicing Database for identifying novel isoform from proteomics

    PubMed Central

    2013-01-01

    Background Alternative splicing is an important and widespread mechanism for generating protein diversity and regulating protein expression. High-throughput identification and analysis of alternative splicing in the protein level has more advantages than in the mRNA level. The combination of alternative splicing database and tandem mass spectrometry provides a powerful technique for identification, analysis and characterization of potential novel alternative splicing protein isoforms from proteomics. Therefore, based on the peptidomic database of human protein isoforms for proteomics experiments, our objective is to design a new alternative splicing database to 1) provide more coverage of genes, transcripts and alternative splicing, 2) exclusively focus on the alternative splicing, and 3) perform context-specific alternative splicing analysis. Results We used a three-step pipeline to create a synthetic alternative splicing database (SASD) to identify novel alternative splicing isoforms and interpret them at the context of pathway, disease, drug and organ specificity or custom gene set with maximum coverage and exclusive focus on alternative splicing. First, we extracted information on gene structures of all genes in the Ensembl Genes 71 database and incorporated the Integrated Pathway Analysis Database. Then, we compiled artificial splicing transcripts. Lastly, we translated the artificial transcripts into alternative splicing peptides. The SASD is a comprehensive database containing 56,630 genes (Ensembl gene IDs), 95,260 transcripts (Ensembl transcript IDs), and 11,919,779 Alternative Splicing peptides, and also covering about 1,956 pathways, 6,704 diseases, 5,615 drugs, and 52 organs. The database has a web-based user interface that allows users to search, display and download a single gene/transcript/protein, custom gene set, pathway, disease, drug, organ related alternative splicing. Moreover, the quality of the database was validated with comparison to other

  19. DEPOT: A Database of Environmental Parameters, Organizations and Tools

    SciTech Connect

    CARSON,SUSAN D.; HUNTER,REGINA LEE; MALCZYNSKI,LEONARD A.; POHL,PHILLIP I.; QUINTANA,ENRICO; SOUZA,CAROLINE A.; HIGLEY,KATHRYN; MURPHIE,WILLIAM

    2000-12-19

    The Database of Environmental Parameters, Organizations, and Tools (DEPOT) has been developed by the Department of Energy (DOE) as a central warehouse for access to data essential for environmental risk assessment analyses. Initial efforts have concentrated on groundwater and vadose zone transport data and bioaccumulation factors. DEPOT seeks to provide a source of referenced data that, wherever possible, includes the level of uncertainty associated with these parameters. Based on the amount of data available for a particular parameter, uncertainty is expressed as a standard deviation or a distribution function. DEPOT also provides DOE site-specific performance assessment data, pathway-specific transport data, and links to environmental regulations, disposal site waste acceptance criteria, other environmental parameter databases, and environmental risk assessment models.

  20. enviPath – The environmental contaminant biotransformation pathway resource

    PubMed Central

    Wicker, Jörg; Lorsbach, Tim; Gütlein, Martin; Schmid, Emanuel; Latino, Diogo; Kramer, Stefan; Fenner, Kathrin

    2016-01-01

    The University of Minnesota Biocatalysis/Biodegradation Database and Pathway Prediction System (UM-BBD/PPS) has been a unique resource covering microbial biotransformation pathways of primarily xenobiotic chemicals for over 15 years. This paper introduces the successor system, enviPath (The Environmental Contaminant Biotransformation Pathway Resource), which is a complete redesign and reimplementation of UM-BBD/PPS. enviPath uses the database from the UM-BBD/PPS as a basis, extends the use of this database, and allows users to include their own data to support multiple use cases. Relative reasoning is supported for the refinement of predictions and to allow its extensions in terms of previously published, but not implemented machine learning models. User access is simplified by providing a REST API that simplifies the inclusion of enviPath into existing workflows. An RDF database is used to enable simple integration with other databases. enviPath is publicly available at https://envipath.org with free and open access to its core data. PMID:26582924